Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fstab has carriage returns #1399

Closed
remline opened this issue Jul 31, 2022 · 13 comments · Fixed by #2400
Closed

fstab has carriage returns #1399

remline opened this issue Jul 31, 2022 · 13 comments · Fixed by #2400

Comments

@remline
Copy link

remline commented Jul 31, 2022

After running archinstall, the generated /etc/fstab contains carriage returns (0x0D). This causes the following message in the systemd journal when booting into the newly installed system:

systemd-fstab-generator[251]: Mount point is not a valid path, ignoring.

Expected behavior: /etc/fstab should use only line feeds (0x0A) to represent new lines.

This may be related to issue #1378.

@uhthomas
Copy link
Contributor

Yeah, I noticed this as well... vim shows ^M everywhere.

@kamiccolo
Copy link

kamiccolo commented Jan 18, 2023

short reproducer (with arch-install-scripts cloned as well):

try_fstab.py:

from archinstall.lib.general import SysCommand

print(SysCommand("arch-install-scripts/genfstab -pU /").decode())

Bare in mind that genfstab -pU / does NOT provide ^Ms.

Also, am I getting crazy or is there really something weird going on, suddenly in the middle of trace ^M gets prepended to every line of it o_0

$ python -m trace --trace try_fstab.py:

...
general.py(271): ^I^Iif len(args) >= 2 and args[1]:$
general.py(274): ^I^Iif self.exit_code != 0:$
general.py(500): ^I^Iif self.peak_output:$
general.py(504): ^I^Ireturn True$
 --- modulename: general, funcname: decode$
general.py(507): ^I^Iif self.session:$
general.py(508): ^I^I^Ireturn self.session._trace_log.decode(fmt)$
 --- modulename: _bootstrap_external, funcname: _path_stat^M$
<frozen importlib._bootstrap_external>(147): <frozen importlib._bootstrap_external>(1092): <frozen importlib._bootstrap_external>(973): <frozen importlib._bootstrap_external>(974): <frozen importlib._bootstrap_external>(975):  --- modulename: _bootstrap_external, funcname: get_data^M$
<frozen importlib._bootstrap_external>(1072): <frozen importlib._bootstrap_external>(1073): <frozen importlib._bootstrap_external>(1074): <frozen importlib._bootstrap_external>(1073): <frozen importlib._bootstrap_external>(980): <frozen importlib._bootstrap_external>(981): <frozen importlib._bootstrap_external>(979): <frozen importlib._bootstrap_external>(983): <frozen importlib._bootstrap_external>(984):  --- modulename: _bootstrap_external, funcname: _classify_pyc^M$
<frozen importlib._bootstrap_external>(601): <frozen importlib._bootstrap_external>(602): <frozen importlib._bootstrap_external>(606): <frozen importlib._bootstrap_external>(610):  --- modulename: _bootstrap_external, funcname: _unpack_uint32^M$
<frozen importlib._bootstrap_external>(86): <frozen importlib._bootstrap_external>(87): <frozen importlib._bootstrap_external>(612): <frozen importlib._bootstrap_external>(615): <frozen importlib._bootstrap_external>(985): <frozen importlib._bootstrap_external>(986): <frozen importlib._bootstrap_external>(987): <frozen importlib._bootstrap_external>(988): <frozen importlib._bootstrap_external>(989): <frozen importlib._bootstrap_external>(990): <frozen importlib._bootstrap_external>(989): <frozen importlib._bootstrap_external>(992):  --- modulename: _bootstrap_external, funcname: get_data^M$
<frozen importlib._bootstrap_external>(1072): <frozen importlib._bootstrap_external>(1073): <frozen importlib._bootstrap_external>(1074): <frozen importlib._bootstrap_external>(1073): <frozen importlib._bootstrap_external>(993): <frozen importlib._bootstrap_external>(994): <frozen importlib._bootstrap_external>(995): <frozen importlib._bootstrap_external>(993): <frozen importlib._bootstrap_external>(997): <frozen importlib._bootstrap_external>(998): <frozen importlib._bootstrap_external>(997):  --- modulename: _bootstrap_external, funcname: _validate_hash_pyc^M$
<frozen importlib._bootstrap_external>(663): <frozen importlib._bootstrap_external>(1010): <frozen importlib._bootstrap_external>(1011): <frozen importlib._bootstrap_external>(1010):  --- modulename: _bootstrap, funcname: _verbose_message^M$
<frozen importlib._bootstrap>(246): <frozen importlib._bootstrap_external>(1012): <frozen importlib._bootstrap_external>(1013): <frozen importlib._bootstrap_external>(1014): <frozen importlib._bootstrap_external>(1012):  --- modulename: _bootstrap_external, funcname: _compile_bytecode^M$
<frozen importlib._bootstrap_external>(672): <frozen importlib._bootstrap_external>(673): <frozen importlib._bootstrap_external>(674):  --- modulename: _bootstrap, funcname: _verbose_message^M$
<frozen importlib._bootstrap>(246): <frozen importlib._bootstrap_external>(675): <frozen importlib._bootstrap_external>(676): <frozen importlib._bootstrap_external>(677): <frozen importlib._bootstrap_external>(880): <frozen importlib._bootstrap_external>(883):  --- modulename: _bootstrap, funcname: _call_with_frames_removed^M$
<frozen importlib._bootstrap>(241):  --- modulename: tty, funcname: <module>^M$
tty.py(1): """Terminal utilities."""^M$
tty.py(5): from termios import *^M$
tty.py(7): __all__ = ["setraw", "setcbreak"]^M$
tty.py(10): IFLAG = 0^M$
tty.py(11): OFLAG = 1^M$
tty.py(12): CFLAG = 2^M$
tty.py(13): LFLAG = 3^M$
...

btw, for the context:

$ python --version
Python 3.10.9

@kamiccolo
Copy link

kamiccolo commented Jan 18, 2023

I'm almost certain by now it has something to do with pty.fork().

This (closest to the orginal) do reproduce the issue:

import os
import pty
import time

pid, child_fd = pty.fork()

if not pid:
    os.execv('../arch-install-scripts/genfstab', ['-pU', '/'])
else:
    time.sleep(1)
    output = os.read(child_fd, 8192)
    print(output)

And this does NOT:

import os

output = os.execv('../arch-install-scripts/genfstab', ['-pU', '/'])
print(output)

mhm, termios.ONLCR?

@codefiles
Copy link
Contributor

codefiles commented Jan 19, 2023

mhm, termios.ONLCR?

@kamiccolo, how about this?

Reproduce the issue:

import os
import pty
import time

pid, fd = pty.fork()
if pid == 0:
    os.execv('/usr/bin/printf', ['printf', 'foo\nbar\n'])
else:
    time.sleep(1)
    output = os.read(fd, 8192)
    print(output)

Output: b'foo\r\nbar\r\n'

Utilize the termios module [1] to disable the terminal output setting "translate newline to carriage return-newline" by negation of onlcr [2][3] :

import os
import pty
import termios
import time

pid, fd = pty.fork()
if pid == 0:
    os.execv('/usr/bin/printf', ['printf', 'foo\nbar\n'])
else:
    attr = termios.tcgetattr(fd)
    attr[1] &= ~termios.ONLCR
    termios.tcsetattr(fd, termios.TCSANOW, attr)
    time.sleep(1)
    output = os.read(fd, 8192)
    print(output)

Output: b'foo\nbar\n'

[1] https://docs.python.org/3/library/termios.html
[2] https://man.archlinux.org/man/stty.1#Output_settings%3A

Edit:
[3] https://man.archlinux.org/man/termios.3#ONLCR

@kamiccolo
Copy link

Woa, way nicer and clearer explanation. Been bashing my head with those different termios flags with not much of a further progress. I wonder, why did this happen in the first place? Some change in underlying vterms? Also, not sure if we should fix this in SysCommand(), because it may break some other stuff unrelated to fstab. On the other hand, replacing \r\n does not look nice as well.

@Fredolx
Copy link

Fredolx commented Feb 14, 2023

Is only fstab affected by this?

@Torxed
Copy link
Member

Torxed commented Feb 23, 2023

mhm, termios.ONLCR?

@kamiccolo, how about this?

Reproduce the issue:

import os
import pty
import time

pid, fd = pty.fork()
if pid == 0:
    os.execv('/usr/bin/printf', ['printf', 'foo\nbar\n'])
else:
    time.sleep(1)
    output = os.read(fd, 8192)
    print(output)

Output: b'foo\r\nbar\r\n'

Utilize the termios module [1] to disable the terminal output setting "translate newline to carriage return-newline" by negation of onlcr [2] :

import os
import pty
import termios
import time

pid, fd = pty.fork()
if pid == 0:
    os.execv('/usr/bin/printf', ['printf', 'foo\nbar\n'])
else:
    attr = termios.tcgetattr(fd)
    attr[1] &= ~termios.ONLCR
    termios.tcsetattr(fd, termios.TCSANOW, attr)
    time.sleep(1)
    output = os.read(fd, 8192)
    print(output)

Output: b'foo\nbar\n'

[1] https://docs.python.org/3/library/termios.html [2] https://man.archlinux.org/man/stty.1#Output_settings:

What the actual * haha, I love the pin point accuracy in the example.
I was wondering what created those as I've gone up and down the code trying to figure out where the stray \r comes from.

Thank you!

Is only fstab affected by this?

Any output relying on archinstall.SysCommand() would see these artifacts. But most readers of the output doesn't care and can use the data (for instance json.loads() accept \r\n and \n).

We could set a default flag to SysCommand() to honor *NIX line endings, and the user (we) could choose to preserve \r\n where needed (nowhere?).

But I think setting the tty attribute is the cleanest approach?

@codefiles
Copy link
Contributor

codefiles commented Feb 24, 2023

But I think setting the tty attribute is the cleanest approach?

Checking the output of the examples below will show that onlcr is enabled in both cases but carriage returns are only present in the last example. (Edit: In the first example the execution happens in the terminal the script is called in and the terminal handles the carriage returns by returning the cursor to the beginning of the line. In the second example a pseudoterminal is used and no handling of the carriage return is done.) Disabling onlcr will eliminate the carriage returns but it is a workaround.

import os

os.execv('/usr/bin/stty', ['stty', '-a'])
import os
import pty
import time

pid, fd = pty.fork()
if pid == 0:
    os.execv('/usr/bin/stty', ['stty', '-a'])
else:
    time.sleep(1)
    output = os.read(fd, 8192)
    print(output)

I do not understand why the subprocess or asyncio module is not being used to start subprocesses. The subprocess module was used until commit 1f89be3 but the commit message of "Swapping the run() command for something a little more potent" is obscure.

Edit: I made an incorrect conclusion from the examples I gave in this comment. Fixed the first example incorrectly showing os.execv() as having a return value.

@Torxed
Copy link
Member

Torxed commented Feb 24, 2023

"Swapping the run() command for something a little more potent" is obscure.

subprocess was not working reliably with machinectl, ssh or any other process that was forking their stdin handlers as well as detecting changes in running state of the executable.
There were probably other reasons too, and those could most likely be worked around with libraries and such, but at the time I felt more intimidated by building around the issues rather than creating a solution myself - without libraries.

But suggestions are always welcome :)

@codefiles
Copy link
Contributor

subprocess was not working reliably with machinectl, ssh or any other process that was forking their stdin handlers as well as detecting changes in running state of the executable.

There is not any use of machinectl or ssh in the code, in what context are they being used? When would "detecting changes in running state of the executable" be required? Could you provide an example that demonstrates the problem?

But suggestions are always welcome :)

I have none yet because I still lack an understanding of what necessitates the design choices for archinstall.SysCommand().

@codefiles
Copy link
Contributor

codefiles commented Mar 14, 2023

I understand now that the desire is for archinstall.SysCommand() to work with any command and therefore it is designed to use a pseudoterminal. A possible alternative to interacting with ssh in this manner would be asyncssh.

The only fix other than disabling onclr is to use bytes.replace(b'\r\n', b'\n').

import os
import pty

pid, fd = pty.fork()

if pid == 0:
    os.execv('/usr/bin/printf', ['printf', 'foo\nbar\nbaz\n'])
else:
    output = b''
    while true:
        try:
            data = os.read(fd, 1024)
            output += data.replace(b'\r\n', b'\n')
        except:
            break
    print(output)
    print(output.decode(), end='')

Output:

b'foo\nbar\nbaz\n'
foo
bar
baz

@kamiccolo
Copy link

I wonder, wasn't this particular issue already solved. There were a few simplifications of SysCommand and stuff already.

@codefiles
Copy link
Contributor

This issue has not been solved.

The issue was inadvertently fixed for Btrfs in commit 00b0ae7 but the last line of the fstab file will be missing a newline (\n).

gen_fstab = SysCommand(f'/usr/bin/genfstab {flags} {self.target}').decode()

def decode(self, encoding: str = 'utf-8', errors='backslashreplace', strip: bool = True) -> str:

if strip:
return val.strip()

The logic added to the SysCommand.decode() function in commit 5c903df to use str.strip() does not work well considering the output of many commands will be more than one line but that is another issue.

In this for loop if the file system is Btrfs then the fstab file, which will contain the carriage return-newlines (\r\n) at this point, is read in with file.readlines(). This function removes the carriage returns but leaves the newlines and splits up the lines of the file into a list. Then after potentially manipulating the contents of the list, it is written back to the fstab file with file.writelines().

for mod in self._disk_config.device_modifications:
for part_mod in mod.partitions:
if part_mod.fs_type != disk.FilesystemType.Btrfs:
continue
with fstab_path.open('r') as fp:
fstab = fp.readlines()
# Replace the {installation}/etc/fstab with entries
# using the compress=zstd where the mountpoint has compression set.
for index, line in enumerate(fstab):
# So first we grab the mount options by using subvol=.*? as a locator.
# And we also grab the mountpoint for the entry, for instance /var/log
subvoldef = re.findall(',.*?subvol=.*?[\t ]', line)
mountpoint = re.findall('[\t ]/.*?[\t ]', line)
if not subvoldef or not mountpoint:
continue
for sub_vol in part_mod.btrfs_subvols:
# We then locate the correct subvolume and check if it's compressed,
# and skip entries where compression is already defined
# We then sneak in the compress=zstd option if it doesn't already exist:
if sub_vol.compress and str(sub_vol.mountpoint) == Path(
mountpoint[0].strip()) and ',compress=zstd,' not in line:
fstab[index] = line.replace(subvoldef[0], f',compress=zstd{subvoldef[0]}')
break
with fstab_path.open('w') as fp:
fp.writelines(fstab)

For the intended purpose the read and write of the file only need to occur once if even at all, not iteratively by way of a double nested for loop but that is yet another issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants