systemd keeps btrfs Volume in tentative state forever… #5781
Comments
|
There is no way around it without serious redesign to remove "one filesystem - one device" paradigm. As implemented currently, only the last device that caused btrfs to become complete and suitable for mount is announced to systemd; all other devices remain with |
|
@arvidjaar that might be the case, but it has little to do with the issue at hand, afaics. @encbladexp what is the precise device name listed in /proc/self/mountinfo for the mount in question? What is the precise device string used in /etc/fstab (or /proc/cmdline or wherever you mount it from)? What does "udevadm info /dev/...." say about the device? If the device stays around in "tentative" state, this indicates that a device appears in /proc/self/mountinfo with some name and systemd can't find a matching device in /sys for it, probably because for some reason it has a different name... |
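To compare the name the kernel reports against the string in /etc/fstab, the mount source can be pulled out of /proc/self/mountinfo. The following is a minimal sketch, not something posted in the thread: the function name `mount_source` and the mount point `/srv/storage` are placeholders. It relies on the mountinfo layout where field 5 is the mount point and a lone "-" separates the optional fields from the filesystem type and source.

```shell
#!/bin/sh
# Sketch: print "fstype source" for one mount point found in a mountinfo file.
# $1 = path to a mountinfo file, $2 = mount point to look up (placeholder names).
mount_source() {
    MP="$2" awk '
        $5 == ENVIRON["MP"] {
            # Optional fields start at field 7 and end at the "-" separator;
            # the two fields after it are the filesystem type and the source.
            for (i = 7; i <= NF; i++) {
                if ($i == "-") { print $(i + 1), $(i + 2); break }
            }
        }' "$1"
}

# Usage (the mount point is a placeholder):
#   mount_source /proc/self/mountinfo /srv/storage
```

The printed source is the name to compare with the fstab entry and with the name of the .device unit.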
Do you need any further information? |
Your report was against |
Looks like storage2 is active, as it should be. If I want to depend on a multi-disk btrfs filesystem, which .device unit is the right one to depend on? Is there any chance for systemd to detect such situations in an expected manner? |
|
And |
|
so something is really strange here: what precisely is putting together the btrfs raid for you? systemd? or something in the initrd that isn't systemd? |
Of course. |
|
@arvidjaar well, there are people who use systemd on the host, but an initrd that is not systemd-based (debian?). On those setups btrfs raid might be assembled before systemd takes over... The reason I am asking: the device name showing up in /proc/self/mountinfo is different for @encbladexp than the one showing up as .device unit. And that's really weird! That's because if udev/systemd assembles the device it wouldn't do so until the final .device has shown up, and then would use that last device's name for mounting the file system. The discrepancy between the name used for the mounting and for the final .device unit is hence very, very strange, and my immediate guess would be that it's not systemd that assembles/mounts the btrfs fs initially, but something else that uses a different name for the device. |
It is the stock Arch Linux initrd with the following hooks enabled:
AFAIK udev is responsible for setting up all the device-related stuff. |
Does the arch initrd run systemd as PID 1? or is it something homegrown? |
|
The initrd uses an ash script as /sbin/init:
#!/usr/bin/ash

udevd_running=0
mount_handler=default_mount_handler
init=/sbin/init
rd_logmask=0

. /init_functions

mount_setup

# parse the kernel command line
parse_cmdline </proc/cmdline

# setup logging as early as possible
rdlogger_start

for d in ${disablehooks//,/ }; do
    [ -e "/hooks/$d" ] && chmod 644 "/hooks/$d"
done

. /config

run_hookfunctions 'run_earlyhook' 'early hook' $EARLYHOOKS

if [ -n "$earlymodules$MODULES" ]; then
    modprobe -qab ${earlymodules//,/ } $MODULES
fi

run_hookfunctions 'run_hook' 'hook' $HOOKS

# honor the old behavior of break=y as a synonym for break=premount
if [ "${break}" = "y" ] || [ "${break}" = "premount" ]; then
    echo ":: Pre-mount break requested, type 'exit' to resume operation"
    launch_interactive_shell
fi

rootdev=$(resolve_device "$root") && root=$rootdev
unset rootdev

fsck_root

# Mount root at /new_root
"$mount_handler" /new_root

run_hookfunctions 'run_latehook' 'late hook' $LATEHOOKS
run_hookfunctions 'run_cleanuphook' 'cleanup hook' $CLEANUPHOOKS

if [ "$(stat -c %D /)" = "$(stat -c %D /new_root)" ]; then
    # Nothing got mounted on /new_root. This is the end, we don't know what to do anymore
    # We fall back into a shell, but the shell has now PID 1
    # This way, manual recovery is still possible.
    err "Failed to mount the real root device."
    echo "Bailing out, you are on your own. Good luck."
    echo
    launch_interactive_shell --exec
elif [ ! -x "/new_root${init}" ]; then
    # Successfully mounted /new_root, but ${init} is missing
    # The same logic as above applies
    err "Root device mounted successfully, but ${init} does not exist."
    echo "Bailing out, you are on your own. Good luck."
    echo
    launch_interactive_shell --exec
fi

if [ "${break}" = "postmount" ]; then
    echo ":: Post-mount break requested, type 'exit' to resume operation"
    launch_interactive_shell
fi

# this should always be the last thing we do before the switch_root.
rdlogger_stop

exec env -i \
    "TERM=$TERM" \
    /usr/bin/switch_root /new_root $init "$@"
# vim: set ft=sh ts=4 sw=4 et:
Optionally it is possible to use a systemd-based init, but this is not supported on all setups. Keep in mind that my Storage btrfs is not used as the root filesystem; I can try to switch to a systemd-based initramfs if required (for testing, maybe). |
|
@encbladexp
What |
|
@encbladexp ah, so the initrd is not using systemd. Next question: do you know whether it is the initrd or the host system that mounts that btrfs raid fs? |
|
@arvidjaar => @poettering => I would think the host system mounts this, but who knows… |
|
@encbladexp Could you make available |
and
|
well, "journalctl -b" should include a message about that, in particular if you boot with "systemd.log_level=debug". Just check if that message that says "mounting /home" is before the message that tells you that the initrd transition already took place. |
|
I will take a look at my next reboot. |
|
I just reproduced it (no, I do not know how to do it on purpose). No initrd involved, plain /etc/fstab mount.
Just to record the state before it is lost on reboot :) |
|
OK, so the issue here is that btrfs shows in
Then systemd gets event from It is non-deterministic which device will be detected last during boot. E.g. on my next reboot |
|
Actually, the current multi-device handling is simply wrong. We rely on the UUID/LABEL link to point to the "correct" device, but it is unpredictable: events are processed concurrently, and the event that finishes last wins. For the third time my VM stops on reboot due to a "missing" device:
|
|
can you check if doing |
No. You have no control over rules that others may install and in another rule installed by OS vendor these devices get |
|
In any case, this is separate problem, and setting |
Uh? We ship a set of default rules that include link_priority usage already; I see no problem with setting it here too |
Uh? afaics the link_priority setting will fix the issue at hand, as it means the symlinks will always point to the one btrfs backing device that has BTRFS_READY=1 set (and hence SYSTEMD_READY=1). systemd picks up the symlink names for its .device units, and hence this should fix the issue at hand. |
|
@encbladexp any chance you can play around with link_priority in the udev rules? either bump the btrfs backing device that has ready set up or all the ones that do not have it down. Please check if that fixes your issue at hand. |
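One way to try the suggestion above is a local udev rule that raises the symlink priority of the member device marked ready. The sketch below is untested and was not posted in the thread. It assumes that `ID_BTRFS_READY` is the property set by the stock btrfs rules (the comments here call it BTRFS_READY). The priority value 100, the function name, and the rule file name in the usage note are arbitrary placeholders.

```shell
#!/bin/sh
# Sketch: print a udev rule that bumps link_priority for the btrfs member
# device that udev's btrfs builtin reported as ready.
# Assumptions: ID_BTRFS_READY is the property name; 100 is an arbitrary value.
btrfs_link_priority_rule() {
    cat <<'EOF'
SUBSYSTEM=="block", ACTION!="remove", ENV{ID_FS_TYPE}=="btrfs", ENV{ID_BTRFS_READY}=="1", OPTIONS+="link_priority=100"
EOF
}

# Usage (hypothetical file name; it has to sort after the stock btrfs rules):
#   btrfs_link_priority_rule > /etc/udev/rules.d/65-btrfs-link-priority.rules
#   udevadm control --reload && udevadm trigger
```

Whether this is enough on LUKS-backed devices, where other rules also act on the dm nodes, would need to be checked on the affected system.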
Which is absolutely irrelevant for the issue reported here. Links are already correct and point exactly to this device. |
are they? your paste suggests otherwise. first you show this:
this suggests /dev/disk/by-label/Storage points to → /dev/dm-0. And then a bit later you post this:
Which suggests it now points to /dev/dm-1. All I am saying: we should make reliable that the one SYSTEMD_READY=1 is set on is also strictly the one that has the symlinks pointing to it. |
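The condition described above can be checked by hand: resolve the by-label symlink, then see whether the device it points to carries SYSTEMD_READY=0. A minimal sketch follows, assuming `udevadm info --query=property` prints one KEY=VALUE pair per line. The function name `systemd_ready` is a placeholder. An unset property is reported as ready, since only an explicit 0 marks a device as not ready.

```shell
#!/bin/sh
# Sketch: read udev properties (KEY=VALUE lines) on stdin and report whether
# systemd would consider the device ready. Only SYSTEMD_READY=0 means "not ready".
systemd_ready() {
    if grep -qx 'SYSTEMD_READY=0'; then
        echo "not ready"
    else
        echo "ready"
    fi
}

# Usage ("Storage" is the label discussed in this thread):
#   dev=$(readlink -f /dev/disk/by-label/Storage)
#   echo "$dev"
#   udevadm info --query=property --name="$dev" | systemd_ready
```

If the symlink resolves to a member device that reports "not ready", the link and the ready flag have ended up on different devices.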
|
Question to anyone able to repro the issue: Are you using keyfiles to auto-open the devices via crypttab? If so, are you using the same keyfile for each of the devices? And if so, can you check what happens when you use separate keyfiles for each of the devices? Thanks. |
|
My Crypttab:
It's the same key, for different devices. |
|
Okay. Can you please try to copy the keyfile to a different name and use the two different files (it's okay that they have the same content) as keys for storage1 and storage2 and check whether this changes anything regarding the issue? I.e. something like
For the change to take effect it's probably easiest to reboot such that systemd can do its work opening the crypt devices. |
|
I can test this.
Did you read what I wrote earlier?
It is trivially reproduced using btrfs on two physical devices, no keyfiles involved. |
|
Is there anything I should test to get this issue fixed? On systemd 237 it looks like:
All fine now? |
|
Likely result of 0e8856d which retriggers events on previously "non-existing" block devices. |
|
Unable to reproduce this issue anymore. Seems to be fixed or at least working right now. |
Submission type
systemd version the issue has been seen with
232
Used distribution
Arch Linux
In case of bug report: Expected behaviour you didn't see
Active state of the filesystem changes to active after mounting
In case of bug report: Unexpected behaviour you saw
Active state of the filesystem is "activating (tentative)" forever, even if mounted
In case of bug report: Steps to reproduce the problem
Create a multiple device btrfs filesystem on at least two LUKS devices and
mount it by Label via fstab
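
To see whether a system is affected after following these steps, the device units stuck in the "tentative" sub-state can be listed. This is a small sketch, assuming the usual `systemctl list-units` column order (UNIT, LOAD, ACTIVE, SUB, DESCRIPTION). The function name and the sample unit name in the comment are placeholders.

```shell
#!/bin/sh
# Sketch: filter "systemctl list-units" output on stdin and print the names of
# .device units whose SUB column (4th field) is "tentative".
tentative_devices() {
    awk '$1 ~ /\.device$/ && $4 == "tentative" { print $1 }'
}

# Usage:
#   systemctl list-units --type=device --all --plain --no-legend | tentative_devices
# A line such as
#   dev-mapper-storage1.device loaded activating tentative /dev/mapper/storage1
# would be reported as dev-mapper-storage1.device.
```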