PR #23218 introduced a regression for devices with the db_persist option (#23429)
Comments
I wouldn't call this a regression. No actual negative impact is known except the debug messages. See also #23218 (comment)
This is a slightly different approach than the one taken by commit 75d7b59 to fix issues systemd#12953 and systemd#23208. This patch forces PID1 to forget all devices (except those with the "db_persist" option, see below) that were known by PID1 before switching root, by pretending that the devices were in DEAD state before being serialized. Hence no artificial "plugged->dead" state transitions happen when PID1 is reexecuting after a switch root, followed by "dead->plugged" state transitions when all devices are coldplugged with the new set of udev rules from the host. As mentioned previously, devices with the "db_persist" option are exceptions to the mechanism described above. Since these devices remain in the udev DB even after the DB has been cleared, they continue to be deserialized in plugged state and remain in this state, which matches the description of the option. This should fix the regression introduced by 75d7b59. Fixes: systemd#23429 Replaces: systemd#23218
Hi Martin, it's pretty hard to say for sure that it won't have any negative impact, but the facts are that this transition never happened before and the doc clearly describes the use of "db_persist" as a way to make sure that the device unit state always remains plugged.
Yeah, the condition seems slightly offensive, I will fix it after dinner. But the basic idea of the block is that setting DEVICE_TENTATIVE with DEVICE_NOT_FOUND is super strange.
Not if you consider complex storage stacks. If a device has been "plugged" but not mounted (instead, member of an MD RAID, or an LVM PV or what not), and you mask out
Correct me if I'm wrong: the issue in #12953 was that we observed plugged→dead transitions. These transitions are fatal, as are dead→plugged transitions based on "guessing" (rather than actual udev device detection). Both these types of wrong transitions are fixed by @yuwata's patch set. What you've been observing is that |
On switching root, a device may have a persistent database. In that case, Device.enumerated_found may have the DEVICE_FOUND_UDEV flag, and it is not necessary to downgrade Device.deserialized_found and Device.deserialized_state. Otherwise, the state of the device unit may be changed plugged -> dead -> plugged, if the device has not been mounted. Fixes systemd#23429.
@mwilck I'd like to recall that it is dangerous to set DEVICE_TENTATIVE with DEVICE_NOT_FOUND. If we do so in device_coldplug(), then the device will not enter the dead state even if |
If a device has persistent database, then the |
I guess the unnecessary state transition reported in this issue triggers the following mount unit being stopped:
Here, the important point is that the devices in What= and Requires= are different.
IIUC "tentative" corresponds to "activating", and thus will be subject to systemd's general timeout handling for unit activation. IOW if a device was actually "plugged" before switching root, but not found any more in the root FS, it would switch to "dead" state when the timeout expires. Am I missing something? As noted before, the combination
But nothing bad would happen with |
On switching root, a device may have a persistent database. In that case, Device.enumerated_found may have the DEVICE_FOUND_UDEV flag, and it is not necessary to downgrade Device.deserialized_found and Device.deserialized_state. Otherwise, the state of the device unit may be changed plugged -> dead -> plugged, if the device has not been mounted. Fixes systemd#23429.
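The condition described in this commit message can be sketched roughly as follows. This is a simplified model and not the actual systemd code: the `Device` struct is reduced to the three fields named in the message, the enum values are illustrative, the helper name `device_adjust_after_switch_root()` is hypothetical, and the exact downgrade steps (plugged to tentative, then to dead when nothing references the device) are an assumption based on the discussion in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for systemd's internal types; the values are illustrative. */
typedef enum {
    DEVICE_NOT_FOUND   = 0,
    DEVICE_FOUND_UDEV  = 1 << 0, /* the device has an entry in the udev database */
    DEVICE_FOUND_MOUNT = 1 << 1, /* the device is referenced by a mount */
    DEVICE_FOUND_SWAP  = 1 << 2, /* the device is referenced by a swap */
} DeviceFound;

typedef enum { DEVICE_DEAD, DEVICE_TENTATIVE, DEVICE_PLUGGED } DeviceState;

typedef struct {
    DeviceFound enumerated_found;   /* what enumeration saw after switching root */
    DeviceFound deserialized_found; /* what was known before switching root */
    DeviceState deserialized_state;
} Device;

/* Hypothetical helper: adjust the deserialized values of one device on coldplug. */
static void device_adjust_after_switch_root(Device *d, bool switching_root) {
    assert(d != NULL);

    if (!switching_root)
        return;

    /* Persistent database entry (db_persist): udev still knows the device,
     * so the deserialized values are kept and no state transition is triggered. */
    if (d->enumerated_found & DEVICE_FOUND_UDEV)
        return;

    /* The udev database entry is gone: forget the udev bit and downgrade. */
    d->deserialized_found = (DeviceFound) (d->deserialized_found & ~DEVICE_FOUND_UDEV);
    if (d->deserialized_state == DEVICE_PLUGGED)
        d->deserialized_state = DEVICE_TENTATIVE;
    if (d->deserialized_found == DEVICE_NOT_FOUND)
        d->deserialized_state = DEVICE_DEAD; /* nobody references the device */
}
```

In this model, a device with a persistent database entry stays in `DEVICE_PLUGGED` across the switch root, while a device that was known only through udev and has lost its database entry ends up in `DEVICE_DEAD` until it is coldplugged again.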
I've found additional evidence that this is not a regression. See #23437 (comment).
I suspect https://bugzilla.redhat.com/show_bug.cgi?id=2088788, https://bugzilla.redhat.com/show_bug.cgi?id=2087225, and coreos/fedora-coreos-tracker#1200 could be related. I'll ask the reporters to test with the patch reverted. I'm doing a scratch build in koji right now.
My system failed to boot with systemd 251, and reverting #23218 fixed that.
diff (no interesting changes before switch root)
systemd 251 running in system mode (+PAM -AUDIT -SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS -FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE -TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
Detected architecture x86-64.
initrd-switch-root.service: Deactivated successfully.
Stopped initrd-switch-root.service.
systemd-journald.service: Scheduled restart job, restart counter is at 1.
Created slice system-getty.slice.
Created slice system-modprobe.slice.
Created slice system-systemd\x2dfsck.slice.
Created slice user.slice.
Set up automount proc-sys-fs-binfmt_misc.automount.
systemd-ask-password-console.path was skipped because of a failed condition check (ConditionPathExists=!/run/plymouth/pid).
Started systemd-ask-password-wall.path.
-Reached target blockdev@dev-mapper-px_root.target.
-Reached target blockdev@dev-mapper-px_swap.target.
Reached target getty.target.
Stopped target initrd-switch-root.target.
Stopped target initrd-fs.target.
Stopped target initrd-root-fs.target.
Reached target integritysetup.target.
Reached target remote-cryptsetup.target.
Reached target remote-fs.target.
Reached target slices.target.
Reached target veritysetup.target.
Listening on systemd-coredump.socket.
Listening on systemd-initctl.socket.
Listening on systemd-networkd.socket.
Listening on systemd-udevd-control.socket.
-Listening on systemd-udevd-kernel.socket.
-Activating swap dev-mapper-px_swap.swap...
dev-hugepages.mount was skipped because of a failed condition check (ConditionPathExists=/sys/kernel/mm/hugepages).
Mounting dev-mqueue.mount...
Mounting sys-kernel-debug.mount...
Mounting sys-kernel-tracing.mount...
Starting kmod-static-nodes.service...
Starting modprobe@configfs.service...
Starting modprobe@drm.service...
Starting modprobe@fuse.service...
plymouth-switch-root.service: Deactivated successfully.
Stopped plymouth-switch-root.service.
-Adding 67106812k swap on /dev/mapper/px_swap. Priority:-2 extents:1 across:67106812k SS
-Starting systemd-cryptsetup@px_home.service...
-Starting systemd-cryptsetup@storage.service...
-Starting systemd-cryptsetup@storage\x2dvm.service...
-Starting systemd-cryptsetup@storage\x2dxl.service...
+Stopping systemd-cryptsetup@px_root.service...
+Stopping systemd-cryptsetup@px_swap.service...
Stopped systemd-journald.service.
Starting systemd-journald.service...
Starting systemd-modules-load.service...
fuse: init (API version 7.36)
Starting systemd-remount-fs.service...
systemd-repart.service was skipped because all trigger condition checks failed.
-Starting systemd-udev-trigger.service...
-Activated swap dev-mapper-px_swap.swap.
Mounted dev-mqueue.mount.
Mounted sys-kernel-debug.mount.
Mounted sys-kernel-tracing.mount.
Finished kmod-static-nodes.service.
modprobe@configfs.service: Deactivated successfully.
Finished modprobe@configfs.service.
modprobe@drm.service: Deactivated successfully.
Finished modprobe@drm.service.
-EXT4-fs (dm-0): re-mounted. Quota mode: none.
modprobe@fuse.service: Deactivated successfully.
Finished modprobe@fuse.service.
+systemd-cryptsetup@px_root.service: Control process exited, code=exited, status=1/FAILURE
+systemd-cryptsetup@px_root.service: Failed with result 'exit-code'.
+Stopped systemd-cryptsetup@px_root.service.
Finished systemd-modules-load.service.
-Finished systemd-remount-fs.service.
-Reached target swap.target.
-Journal started
-Runtime Journal (/run/log/journal/40a7ddbe283140c18111fd16b04c18f6) is 8.0M, max 638.6M, 630.6M free.
-Queued start job for default target graphical.target.
-systemd-journald.service: Deactivated successfully.
-Set cipher aes, mode xts-plain64, key size 256 bits for device /dev/disk/by-partlabel/storage.
-Set cipher aes, mode xts-plain64, key size 256 bits for device /dev/disk/by-partlabel/px_home.
-Failed to find module 'ipmi-devintf'
-Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/disk/by-partlabel/storage-vm.
-Set cipher aes, mode xts-plain64, key size 512 bits for device /dev/disk/by-partlabel/storage-xl.
+Reached target blockdev@dev-mapper-px_root.target.
Mounting sys-fs-fuse-connections.mount...
Mounting sys-kernel-config.mount...
+Starting systemd-sysctl.service...
+Stopped target blockdev@dev-mapper-px_root.target.
+Mounted sys-fs-fuse-connections.mount.
+Mounted sys-kernel-config.mount.
+EXT4-fs (dm-0): re-mounted. Quota mode: none.
+Finished systemd-remount-fs.service.
systemd-firstboot.service was skipped because of a failed condition check (ConditionFirstBoot=yes).
systemd-hwdb-update.service was skipped because of a failed condition check (ConditionNeedsUpdate=/etc).
systemd-pstore.service was skipped because of a failed condition check (ConditionDirectoryNotEmpty=/sys/fs/pstore).
Starting systemd-random-seed.service...
-Starting systemd-sysctl.service...
systemd-sysusers.service was skipped because of a failed condition check (ConditionNeedsUpdate=/etc).
Starting systemd-tmpfiles-setup-dev.service...
+Finished systemd-sysctl.service.
+Journal started
+Runtime Journal (/run/log/journal/40a7ddbe283140c18111fd16b04c18f6) is 8.0M, max 638.6M, 630.6M free.
+Queued start job for default target graphical.target.
+Unnecessary job was removed for dev-disk-by\x2duuid-1c281d44\x2d991d\x2d4121\x2da855\x2d0f01d9c5512e.device.
+Unnecessary job was removed for dev-disk-by\x2duuid-c1adc4ce\x2dd27e\x2d4d9f\x2d859e\x2dcf58d725f773.device.
+systemd-journald.service: Deactivated successfully.
+Device px_root is still in use.
Started systemd-journald.service.
-Mounted sys-fs-fuse-connections.mount.
-Mounted sys-kernel-config.mount.
+Failed to deactivate: Device or resource busy
(from this point the logs diverge almost completely)
Sigh… so it seems we have an issue with LUKS, or dm-crypt in general. That matches also some of the bugs mentioned in #23429 (comment). @tanriol, can you please describe your block device stack, and provide |
dm-crypt devices have I suppose this is a fallout of the fact that systemd recognizes devices used by mounts or swaps, but no other, more complex block device dependencies. |
Nothing really fancy - multiple separate disks, each has GPT, LUKS1, filesystems.
For a failing boot? In a couple days, I think, when I have time to test that.
Okay, will check when I have time.
On switching root, a device may have a persistent database. In that case, Device.enumerated_found may have the DEVICE_FOUND_UDEV flag, and it is not necessary to downgrade Device.deserialized_found and Device.deserialized_state. Otherwise, the state of the device unit may be changed plugged -> dead -> plugged, if the device has not been mounted. Fixes systemd#23429. [mwilck: backport of a5dedd6]
On switching root, the database for the device may be cleared, or the device itself may be unplugged. The function device_enumerate() cannot distinguish the two cases. Let's explicitly check if the device is still around in device_catchup(). This mostly reverts 75d7b59. Fixes systemd#23429.
On switching root, the database for the device may be cleared, or the device itself may be unplugged. The function device_enumerate() cannot distinguish the two cases. Let's explicitly check if the device is still around in device_catchup(). This mostly reverts 75d7b59. Fixes systemd#23429. Unfortunately, this re-introduces systemd#23208.
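The distinction drawn in this commit message (database cleared versus device unplugged) can be illustrated with a small sketch. It assumes that "still around" can be approximated by checking whether the device's sysfs path exists; the real patch may well use the sd-device API instead. The enum `DevicePresence` and the helper `device_presence()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical classification of a device after switching root. */
typedef enum {
    DEVICE_GONE,       /* no udev database entry and no sysfs node: unplugged */
    DEVICE_DB_CLEARED, /* no udev database entry, but the sysfs node still exists */
    DEVICE_KNOWN,      /* udev database entry present (e.g. db_persist) */
} DevicePresence;

/* Hypothetical helper: enumeration alone only tells us whether the udev
 * database has an entry; looking at sysfs separates the two remaining cases. */
static DevicePresence device_presence(bool in_udev_db, const char *syspath) {
    assert(syspath != NULL);

    if (in_udev_db)
        return DEVICE_KNOWN;

    /* access(2) with F_OK only tests for existence of the path. */
    if (access(syspath, F_OK) == 0)
        return DEVICE_DB_CLEARED;

    return DEVICE_GONE;
}
```

Only the `DEVICE_GONE` case would justify treating the device unit as dead; in the `DEVICE_DB_CLEARED` case the device is expected to reappear in the database once udev processes it again in the new root.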
I've pushed another patch to #23489 which fixes the LUKS issue in my test setup. Analysis:dm-crypt device units generated by systemd-cryptsetup-generator The
By not setting the state of these devices to |
This should cover cases regarding devices with `OPTIONS+="db_persist"` during initrd->sysroot transition. See: * systemd#23429 * systemd#23218 * https://bugzilla.redhat.com/show_bug.cgi?id=2087225
This should cover cases regarding devices with `OPTIONS+="db_persist"` during initrd->sysroot transition. See: * systemd#23429 * systemd#23218 * systemd#23489 * https://bugzilla.redhat.com/show_bug.cgi?id=2087225
On switching root, a device may have a persistent database. In that case, Device.enumerated_found may have the DEVICE_FOUND_UDEV flag, and it is not necessary to downgrade Device.deserialized_found and Device.deserialized_state. Otherwise, the state of the device unit may be changed plugged -> dead -> plugged, if the device has not been mounted. Fixes systemd#23429. [mwilck: cherry-picked from systemd#23437]
dm-crypt device units generated by systemd-cryptsetup-generator have BindsTo= dependencies on their backend devices. The dm-crypt devices have the db_persist flag set, and thus survive the udev db cleanup while switching root. But backend devices usually don't survive. These devices are neither mounted nor used for swap, thus they will be seen as DEVICE_NOT_FOUND after switching root. The BindsTo= dependency will cause systemd to schedule a stop job for the dm-crypt device, breaking boot: [ 68.929457] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Unit is stopped because bound to inactive unit dev-disk-by\x2duuid-3bf91f73\x2d1ee8\x2d4cfc\x2d9048\x2d93ba349b786d.device. [ 68.945660] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Trying to enqueue job systemd-cryptsetup@cr_root.service/stop/replace [ 69.473459] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Installed new job systemd-cryptsetup@cr_root.service/stop as 343 Avoid this by not setting the state of the backend devices to DEVICE_DEAD. Fixes the LUKS setup issue reported in systemd#23429.
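The effect of the BindsTo= dependency described above can be modelled in a few lines. This is only a sketch under stated assumptions: the mapping of device states to generic unit states follows the remark earlier in this thread that "tentative" corresponds to "activating", and the struct `BoundUnit` and the helper `bound_unit_gets_stop_job()` are hypothetical names, not systemd functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEVICE_DEAD, DEVICE_TENTATIVE, DEVICE_PLUGGED } DeviceState;
typedef enum { UNIT_INACTIVE, UNIT_ACTIVATING, UNIT_ACTIVE } UnitActiveState;

/* How the generic unit logic sees a device unit (assumed mapping). */
static UnitActiveState device_active_state(DeviceState s) {
    switch (s) {
    case DEVICE_PLUGGED:
        return UNIT_ACTIVE;
    case DEVICE_TENTATIVE:
        return UNIT_ACTIVATING;
    case DEVICE_DEAD:
    default:
        return UNIT_INACTIVE;
    }
}

/* Hypothetical model of a running unit with BindsTo= on a backend device. */
typedef struct {
    bool running;              /* e.g. systemd-cryptsetup@cr_root.service is up */
    DeviceState backend_state; /* state of the device unit it is bound to */
} BoundUnit;

/* A bound unit is stopped only when the unit it is bound to is inactive. */
static bool bound_unit_gets_stop_job(const BoundUnit *u) {
    assert(u != NULL);
    return u->running && device_active_state(u->backend_state) == UNIT_INACTIVE;
}
```

Under this model, a backend device left in `DEVICE_TENTATIVE` does not trigger the stop job seen in the log above, whereas one forced into `DEVICE_DEAD` does.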
This should cover cases regarding devices with `OPTIONS+="db_persist"` during initrd->sysroot transition. See: * systemd/systemd#23429 * systemd/systemd#23218 * systemd/systemd#23489 * https://bugzilla.redhat.com/show_bug.cgi?id=2087225 (cherry picked from commit 1fb7f8e)
dm-crypt device units generated by systemd-cryptsetup-generator have BindsTo= dependencies on their backend devices. The dm-crypt devices have the db_persist flag set, and thus survive the udev db cleanup while switching root. But backend devices usually don't survive. These devices are neither mounted nor used for swap, thus they will be seen as DEVICE_NOT_FOUND after switching root. The BindsTo= dependency will cause systemd to schedule a stop job for the dm-crypt device, breaking boot: [ 68.929457] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Unit is stopped because bound to inactive unit dev-disk-by\x2duuid-3bf91f73\x2d1ee8\x2d4cfc\x2d9048\x2d93ba349b786d.device. [ 68.945660] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Trying to enqueue job systemd-cryptsetup@cr_root.service/stop/replace [ 69.473459] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Installed new job systemd-cryptsetup@cr_root.service/stop as 343 Avoid this by not setting the state of the backend devices to DEVICE_DEAD. Fixes the LUKS setup issue reported in systemd#23429. (cherry picked from commit cf1ac0c)
On switching root, a device may have a persistent database. In that case, Device.enumerated_found may have the DEVICE_FOUND_UDEV flag, and it is not necessary to downgrade Device.deserialized_found and Device.deserialized_state. Otherwise, the state of the device unit may be changed plugged -> dead -> plugged, if the device has not been mounted. Fixes systemd#23429. [mwilck: cherry-picked from systemd#23437] (cherry picked from commit 4fc69e8)
On switching root, a device may have a persistent database. In that case, Device.enumerated_found may have the DEVICE_FOUND_UDEV flag, and it is not necessary to downgrade Device.deserialized_found and Device.deserialized_state. Otherwise, the state of the device unit may be changed plugged -> dead -> plugged, if the device has not been mounted. Fixes systemd#23429. [mwilck: cherry-picked from systemd#23437] (cherry picked from commit 4fc69e8) (cherry picked from commit 131206d)
dm-crypt device units generated by systemd-cryptsetup-generator have BindsTo= dependencies on their backend devices. The dm-crypt devices have the db_persist flag set, and thus survive the udev db cleanup while switching root. But backend devices usually don't survive. These devices are neither mounted nor used for swap, thus they will be seen as DEVICE_NOT_FOUND after switching root. The BindsTo= dependency will cause systemd to schedule a stop job for the dm-crypt device, breaking boot: [ 68.929457] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Unit is stopped because bound to inactive unit dev-disk-by\x2duuid-3bf91f73\x2d1ee8\x2d4cfc\x2d9048\x2d93ba349b786d.device. [ 68.945660] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Trying to enqueue job systemd-cryptsetup@cr_root.service/stop/replace [ 69.473459] krypton systemd[1]: systemd-cryptsetup@cr_root.service: Installed new job systemd-cryptsetup@cr_root.service/stop as 343 Avoid this by not setting the state of the backend devices to DEVICE_DEAD. Fixes the LUKS setup issue reported in systemd#23429. (cherry picked from commit cf1ac0c) (cherry picked from commit 4f86dd2)
Devices with `OPTIONS+="db_persist"` are not supposed to create a `dead->plugged` transition right after switching root. At least that wasn't the case before, and the udev man page describes the option as a way to actually prevent it from happening.
However, PR #23218 changed this behavior. I don't currently know whether it will cause any real regression in practice, but I think it deserves a discussion before releasing v251.