Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix load_devices() lsblk #2565

Merged
merged 2 commits into from
Aug 23, 2024
Merged

Conversation

codefiles
Copy link
Contributor

@codefiles codefiles commented Jul 6, 2024

Fixes #1993

  • Get all lsblk info and search it rather than making several calls to lsblk
  • Add udev sync before lsblk call

@codefiles codefiles requested a review from Torxed as a code owner July 6, 2024 14:30
@codefiles codefiles marked this pull request as draft July 6, 2024 18:18
@svartkanin
Copy link
Collaborator

Thanks for the fix, @codefiles the PR is in draft, is there anything left?

@codefiles
Copy link
Contributor Author

After submitting I noticed that when a user has an existing btrfs partition and it is not mounted then get_btrfs_info() will call mount(). I think this patch will just move the bug to the lsblk call in mount() given that case.

Maybe using udevadm settle to synchronize with udev is needed after newDisk() and freshDisk(). Take a look at the output of udevadm monitor --udev while executing those within the python interpreter. You see udev events, right? I think they could be causing a race condition with the lsblk calls that follow. What do you think?

@svartkanin
Copy link
Collaborator

Looking at #1993 I'm wondering if archinstall was run multiple times? If a ISO gets booted and archinstall is run for the first time then this error shouldn't happen as all disks are most likely there and can be found without issues.

I think they could be causing a race condition with the lsblk calls that follow. What do you think?

I didn't have time to play around with it, but could a potential wait for a couple of seconds before continuing resolve that problem?

@svartkanin
Copy link
Collaborator

I wonder if enforcing a restart after a successful installation or at least preventing to run it multiple installations might be worth exploring?

@codefiles codefiles marked this pull request as ready for review August 22, 2024 20:44
@codefiles
Copy link
Contributor Author

codefiles commented Aug 22, 2024

This does not fix the issue it was intended to but I am fine with it being merged.

@svartkanin, in regards to the linked issue, I do not agree with your conclusions or suggestions but thanks.

@codefiles
Copy link
Contributor Author

codefiles commented Aug 23, 2024

I figured out how to reproduce the bug. I added an additional commit and with that the bug should be fixed. This is ready to be merged.

@Torxed Torxed merged commit 179ba3c into archlinux:master Aug 23, 2024
6 checks passed
@Torxed
Copy link
Member

Torxed commented Aug 23, 2024

Nicely done, was it tricky to reproduce the bug?

@codefiles
Copy link
Contributor Author

codefiles commented Aug 23, 2024

From #2536 (comment) the traceback shows this bug happens during an lsblk call in load_devices():

Traceback (most recent call last):
  File "/usr/bin/archinstall", line 5, in <module>
    from archinstall import run_as_a_module
  File "/usr/lib/python3.12/site-packages/archinstall/__init__.py", line 11, in <module>
    from .lib import disk
  File "/usr/lib/python3.12/site-packages/archinstall/lib/disk/__init__.py", line 1, in <module>
    from .device_handler import device_handler, disk_layouts
  File "/usr/lib/python3.12/site-packages/archinstall/lib/disk/device_handler.py", line 649, in <module>
    device_handler = DeviceHandler()
                     ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/archinstall/lib/disk/device_handler.py", line 38, in __init__
    self.load_devices()
  File "/usr/lib/python3.12/site-packages/archinstall/lib/disk/device_handler.py", line 74, in load_devices
    lsblk_info = get_lsblk_info(partition.path)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/archinstall/lib/disk/device_model.py", line 1136, in get_lsblk_info
    if infos := _fetch_lsblk_info(dev_path):
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/archinstall/lib/disk/device_model.py", line 1121, in _fetch_lsblk_info
    raise DiskError(f'Failed to read disk "{dev_path}" with lsblk')
archinstall.lib.exceptions.DiskError: Failed to read disk "/dev/sda1" with lsblk

The function load_devices() is called when the application launches and I had seen the bug before from my own testing. I put the function call in an infinite loop.

diff --git a/archinstall/lib/disk/device_handler.py b/archinstall/lib/disk/device_handler.py
index 8d38dbc8..6db3d94b 100644
--- a/archinstall/lib/disk/device_handler.py
+++ b/archinstall/lib/disk/device_handler.py
@@ -38,7 +38,8 @@ class DeviceHandler(object):

        def __init__(self):
                self._devices: Dict[Path, BDevice] = {}
-               self.load_devices()
+               while True:
+                       self.load_devices()

        @property
        def devices(self) -> List[BDevice]:

With this patch I am able to produce the bug with the commit previous to this merge by executing archinstall and letting this loop run until the error is raised. It takes a relatively short amount of time till the error is raised.

I then used that same process to check this pull request with a small addition. I used tail -f /var/log/archinstall/install.log | grep -E '(Device|Partition) lsblk info not found' in another tty while the application was running. This was to verify that no devices or partitions were skipped due to missing lsblk infomation since that is a feature of this pull request. Without the last commit to this pull request there were skips but after adding it there were no skips and no error raised.

#2536 and #2559 can be closed as well.

@codefiles codefiles deleted the load-devices-lsblk branch August 23, 2024 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Archinstall encounters Error when starting - "/dev/sda3: not a block device"
3 participants