systemd keeps btrfs Volume in tentative state forever… #5781

Closed
encbladexp opened this issue Apr 22, 2017 · 38 comments

@encbladexp encbladexp commented Apr 22, 2017

Submission type

  • Bug report
  • Request for enhancement (RFE)

systemd version the issue has been seen with

232

Used distribution

Arch Linux

In case of bug report: Expected behaviour you didn't see

Active state of the filesystem changes to active after mounting

In case of bug report: Unexpected behaviour you saw

Active state of the filesystem is "activating (tentative)" forever, even if mounted

[root@server:~] # systemctl status dev-mapper-storage1.device
● dev-mapper-storage1.device - /dev/mapper/storage1
   Loaded: loaded
  Drop-In: /run/systemd/generator/dev-mapper-storage1.device.d
           └─90-device-timeout.conf
   Active: activating (tentative) since Sat 2017-04-22 09:19:02 CEST; 27min ago
   Device: /sys/devices/virtual/block/dm-0

In case of bug report: Steps to reproduce the problem

Create a multiple device btrfs filesystem on at least two LUKS devices and
mount it by Label via fstab

Contributor

@arvidjaar arvidjaar commented Apr 22, 2017

There is no way around it without serious redesign to remove "one filesystem - one device" paradigm. As implemented currently, only the last device that caused btrfs to become complete and suitable for mount is announced to systemd; all other devices remain with SYSTEMD_READY=0 forever.
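
For illustration, a minimal shell sketch of what this means for a single member device. It assumes the behaviour documented in systemd.device(5), where a device carrying SYSTEMD_READY=0 is treated as unplugged, and it ignores the check for the systemd tag. The helper name considered_plugged is hypothetical; it reads `udevadm info` output on stdin.

```shell
# Hypothetical helper: succeed unless the udev database entry read from
# stdin carries SYSTEMD_READY=0 (which systemd treats as "not plugged").
considered_plugged() {
    ! grep -qx 'E: SYSTEMD_READY=0'
}

# Example input: the two READY properties of a member device that did
# not complete the filesystem.
if printf 'E: ID_BTRFS_READY=0\nE: SYSTEMD_READY=0\n' | considered_plugged; then
    echo "plugged"
else
    echo "held back by SYSTEMD_READY=0"
fi
```

On a live system the input would come from something like `udevadm info /dev/mapper/storage1 | considered_plugged`.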

Member

@poettering poettering commented Apr 24, 2017

@arvidjaar that might be the case, but it has little to do with the issue at hand, afaics.

@encbladexp what is the precise device name listed in /proc/self/mountinfo for the mount in question? What is the precise device string used in /etc/fstab (or /proc/cmdline or where ever you mount it from)? What does "udevadm info /dev/...." say about the device?

If the device stays around in "tentative" state, then this indicates that a device appears in /proc/self/mountinfo with some name and systemd can't find a matching device in /sys for it, probably because for some reason it has a different name...
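
A minimal sketch of how the device name recorded in mountinfo can be pulled out for one mount point, so it can be compared with the name used in /etc/fstab. The helper name mount_source is hypothetical; it reads mountinfo-format lines on stdin and relies only on the documented field layout (the source follows the lone "-" separator and the filesystem type).

```shell
# Hypothetical helper: print the mount source recorded for mount point $1.
# Input is mountinfo-format text on stdin.
mount_source() {
    awk -v mp="$1" '
        $5 == mp {
            # Optional fields end at the lone "-"; the source device is
            # the second field after that separator.
            for (i = 7; i <= NF; i++)
                if ($i == "-") { print $(i + 2); exit }
        }'
}

# Sample line taken from this report.
mount_source /home <<'EOF'
93 22 0:45 /home /home rw,relatime shared:39 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=264,subvol=/home
EOF
# prints /dev/mapper/storage1
```

On a live system the call would be `mount_source /home </proc/self/mountinfo`.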

Author

@encbladexp encbladexp commented Apr 24, 2017

LABEL=Storage /mnt/storage btrfs defaults 0 2
LABEL=Storage /home btrfs defaults,subvol=home 0 2
LABEL=Storage /mnt/windowsbackup btrfs defaults,subvol=windowsbackup 0 2
LABEL=Storage /srv/nfs/dwhelper btrfs defaults,subvol=dwhelper 0 2
LABEL=Storage /srv/nfs/movies btrfs defaults,subvol=movies 0 2
LABEL=Storage /srv/nfs/music btrfs defaults,subvol=music 0 2
LABEL=Storage /srv/nfs/transfer btrfs defaults,subvol=transfer 0 2
LABEL=Storage /srv/nfs/vboximages btrfs defaults,subvol=vboximages 0 2
LABEL=Storage /srv/nfs/windows btrfs defaults,subvol=windows 0 2
LABEL=Storage /var/cache/pacman/pkg btrfs defaults,subvol=pacman 0 2

[root@server:~] # udevadm info /dev/disk/by-label/Storage
P: /devices/virtual/block/dm-2
N: dm-2
S: disk/by-id/dm-name-storage2
S: disk/by-id/dm-uuid-CRYPT-LUKS1-60525f998cd64f42a5d4fcb215992d42-storage2
S: disk/by-label/Storage
S: disk/by-uuid/4fbb24c9-8185-45fc-aa89-84ae31e4f07e
S: mapper/storage2
E: DEVLINKS=/dev/disk/by-label/Storage /dev/disk/by-uuid/4fbb24c9-8185-45fc-aa89-84ae31e4f07e /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-60525f998cd64f42a5d4fcb215992d42-storage2 /dev/disk/by-id/dm-name-storage2 /dev/mapper/storage2
E: DEVNAME=/dev/dm-2
E: DEVPATH=/devices/virtual/block/dm-2
E: DEVTYPE=disk
E: DM_ACTIVATION=1
E: DM_NAME=storage2
E: DM_SUSPENDED=0
E: DM_UDEV_PRIMARY_SOURCE_FLAG=1
E: DM_UDEV_RULES_VSN=2
E: DM_UUID=CRYPT-LUKS1-60525f998cd64f42a5d4fcb215992d42-storage2
E: ID_BTRFS_READY=1
E: ID_FS_LABEL=Storage
E: ID_FS_LABEL_ENC=Storage
E: ID_FS_TYPE=btrfs
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=4fbb24c9-8185-45fc-aa89-84ae31e4f07e
E: ID_FS_UUID_ENC=4fbb24c9-8185-45fc-aa89-84ae31e4f07e
E: ID_FS_UUID_SUB=1f952bf2-6259-491d-bb61-32f44d489966
E: ID_FS_UUID_SUB_ENC=1f952bf2-6259-491d-bb61-32f44d489966
E: MAJOR=254
E: MINOR=2
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=7815470

78 22 9:0 / /srv/vmstorage rw,relatime shared:30 - xfs /dev/md0 rw,attr2,inode64,sunit=1024,swidth=2048,noquota
91 22 0:45 / /mnt/storage rw,relatime shared:33 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=5,subvol=/
87 22 0:45 /vboximages /srv/nfs/vboximages rw,relatime shared:34 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=262,subvol=/vboximages
99 22 0:45 /windows /srv/nfs/windows rw,relatime shared:35 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=263,subvol=/windows
85 22 0:45 /transfer /srv/nfs/transfer rw,relatime shared:36 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=267,subvol=/transfer
103 22 0:45 /windowsbackup /mnt/windowsbackup rw,relatime shared:37 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=266,subvol=/windowsbackup
95 22 0:45 /dwhelper /srv/nfs/dwhelper rw,relatime shared:38 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=258,subvol=/dwhelper
93 22 0:45 /home /home rw,relatime shared:39 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=264,subvol=/home
94 22 0:45 /home /srv/nfs/home rw,relatime shared:39 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=264,subvol=/home
97 22 0:45 /movies /srv/nfs/movies rw,relatime shared:40 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=259,subvol=/movies
89 22 0:45 /pacman /var/cache/pacman/pkg rw,relatime shared:41 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=265,subvol=/pacman
90 22 0:45 /pacman /srv/nfs/pacman rw,relatime shared:41 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=265,subvol=/pacman
101 22 0:45 /music /srv/nfs/music rw,relatime shared:42 - btrfs /dev/mapper/storage1 rw,space_cache,subvolid=260,subvol=/music

Do you need any further information?

Contributor

@arvidjaar arvidjaar commented Apr 24, 2017

mapper/storage2

Your report was against /dev/mapper/storage1 and you show /dev/mapper/storage2. Please show the same information for /dev/mapper/storage1.

Author

@encbladexp encbladexp commented Apr 24, 2017

[root@server:~] # systemctl status dev-mapper-storage2.device
● dev-mapper-storage2.device - /dev/mapper/storage2
   Follow: unit currently follows state of sys-devices-virtual-block-dm\x2d2.device
   Loaded: loaded
  Drop-In: /run/systemd/generator/dev-mapper-storage2.device.d
           └─90-device-timeout.conf
   Active: active (plugged) since Sat 2017-04-22 09:19:00 CEST; 2 days ago
   Device: /sys/devices/virtual/block/dm-2

Apr 22 09:19:00 server systemd[1]: Found device /dev/mapper/storage2.

Looks like storage2 is active, as it should be. If I want to depend on a multi-disk btrfs filesystem, which .device unit is the right one to depend on? Is there any chance for systemd to detect such situations in an expected manner?

Contributor

@arvidjaar arvidjaar commented Apr 24, 2017

And udevadm info /dev/mapper/storage1 please.

Member

@poettering poettering commented Apr 24, 2017

so something is really strange here: what precisely is putting together the btrfs raid for you? systemd? or something in the initrd that isn't systemd?

Contributor

@arvidjaar arvidjaar commented Apr 24, 2017

@poettering

what precisely is putting together the btrfs raid for you? systemd?

Of course. rules/64-btrfs.rules

Member

@poettering poettering commented Apr 24, 2017

@arvidjaar well, there are people who use systemd on the host, but an initrd that is not systemd-based (debian?). On those setups btrfs raid might be assembled before systemd takes over...

The reason I am asking: the device name showing up in /proc/self/mountinfo is different for @encbladexp than the one showing up as .device unit. And that's really weird! That's because if udev/systemd assembles the device it wouldn't do so until the final .device has shown up, and then would use that last device's name for mounting the file system. The discrepancy between the name used for the mounting and for the final .device unit is hence very very strange, and my immediate guess would be that it's not systemd that assembles/mounts the btrfs fs initially, but something else that uses a different name for the device.

Author

@encbladexp encbladexp commented Apr 24, 2017

@arvidjaar:

P: /devices/virtual/block/dm-0
N: dm-0
S: disk/by-id/dm-name-storage1
S: disk/by-id/dm-uuid-CRYPT-LUKS1-9bcf5ce9e06c4154a25d660b1686a499-storage1
S: disk/by-label/Storage
S: disk/by-uuid/4fbb24c9-8185-45fc-aa89-84ae31e4f07e
S: mapper/storage1
E: DEVLINKS=/dev/disk/by-id/dm-name-storage1 /dev/disk/by-uuid/4fbb24c9-8185-45fc-aa89-84ae31e4f07e /dev/disk/by-id/dm-uuid-CRYPT-LUKS1-9bcf5ce9e06c4154a25d660b1686a499-storage1 /dev/mapper/storage1 /dev/disk/by-label/Storage
E: DEVNAME=/dev/dm-0
E: DEVPATH=/devices/virtual/block/dm-0
E: DEVTYPE=disk
E: DM_ACTIVATION=1
E: DM_NAME=storage1
E: DM_SUSPENDED=0
E: DM_UDEV_PRIMARY_SOURCE_FLAG=1
E: DM_UDEV_RULES_VSN=2
E: DM_UUID=CRYPT-LUKS1-9bcf5ce9e06c4154a25d660b1686a499-storage1
E: ID_BTRFS_READY=0
E: ID_FS_LABEL=Storage
E: ID_FS_LABEL_ENC=Storage
E: ID_FS_TYPE=btrfs
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=4fbb24c9-8185-45fc-aa89-84ae31e4f07e
E: ID_FS_UUID_ENC=4fbb24c9-8185-45fc-aa89-84ae31e4f07e
E: ID_FS_UUID_SUB=97040368-884d-4aab-8ffc-c5ca5bddf2c7
E: ID_FS_UUID_SUB_ENC=97040368-884d-4aab-8ffc-c5ca5bddf2c7
E: MAJOR=254
E: MINOR=0
E: SUBSYSTEM=block
E: SYSTEMD_READY=0
E: TAGS=:systemd:
E: USEC_INITIALIZED=7439301

It is the stock Arch Linux initrd with the following hooks enabled:

HOOKS="base udev autodetect modconf keyboard block filesystems fsck"

AFAIK udev is responsible for building the whole devices related stuff.

Member

@poettering poettering commented Apr 24, 2017

AFAIK udev is responsible for building the whole devices related stuff.

Does the arch initrd run systemd as PID 1? or is it something homegrown?

Author

@encbladexp encbladexp commented Apr 24, 2017

The initrd uses an ash Script as /sbin/init:

#!/usr/bin/ash

udevd_running=0
mount_handler=default_mount_handler
init=/sbin/init
rd_logmask=0

. /init_functions

mount_setup

# parse the kernel command line
parse_cmdline </proc/cmdline

# setup logging as early as possible
rdlogger_start

for d in ${disablehooks//,/ }; do
    [ -e "/hooks/$d" ] && chmod 644 "/hooks/$d"
done

. /config

run_hookfunctions 'run_earlyhook' 'early hook' $EARLYHOOKS

if [ -n "$earlymodules$MODULES" ]; then
    modprobe -qab ${earlymodules//,/ } $MODULES
fi

run_hookfunctions 'run_hook' 'hook' $HOOKS

# honor the old behavior of break=y as a synonym for break=premount
if [ "${break}" = "y" ] || [ "${break}" = "premount" ]; then
    echo ":: Pre-mount break requested, type 'exit' to resume operation"
    launch_interactive_shell
fi

rootdev=$(resolve_device "$root") && root=$rootdev
unset rootdev

fsck_root

# Mount root at /new_root
"$mount_handler" /new_root

run_hookfunctions 'run_latehook' 'late hook' $LATEHOOKS
run_hookfunctions 'run_cleanuphook' 'cleanup hook' $CLEANUPHOOKS

if [ "$(stat -c %D /)" = "$(stat -c %D /new_root)" ]; then
    # Nothing got mounted on /new_root. This is the end, we don't know what to do anymore
    # We fall back into a shell, but the shell has now PID 1
    # This way, manual recovery is still possible.
    err "Failed to mount the real root device."
    echo "Bailing out, you are on your own. Good luck."
    echo
    launch_interactive_shell --exec
elif [ ! -x "/new_root${init}" ]; then
    # Successfully mounted /new_root, but ${init} is missing
    # The same logic as above applies
    err "Root device mounted successfully, but ${init} does not exist."
    echo "Bailing out, you are on your own. Good luck."
    echo
    launch_interactive_shell --exec
fi

if [ "${break}" = "postmount" ]; then
    echo ":: Post-mount break requested, type 'exit' to resume operation"
    launch_interactive_shell
fi

# this should always be the last thing we do before the switch_root.
rdlogger_stop

exec env -i \
    "TERM=$TERM" \
    /usr/bin/switch_root /new_root $init "$@"

# vim: set ft=sh ts=4 sw=4 et:

Optionally, it is possible to use a systemd-based init, but this is not supported on all setups. Keep in mind that my Storage btrfs is not used as the root filesystem; I can try to switch to a systemd-based initramfs if required (for testing, maybe).

Contributor

@arvidjaar arvidjaar commented Apr 25, 2017

@encbladexp
OK, as expected

P: /devices/virtual/block/dm-0
...
S: disk/by-label/Storage
S: mapper/storage1
...
E: ID_BTRFS_READY=0
...
E: SYSTEMD_READY=0

What does ls -l /dev/disk/by-label/Storage say currently (if you rebooted since then, please compare with both /dev/mapper/storage1 and /dev/mapper/storage2)? Actually, it would be interesting to get the output for all links in DEVLINKS: are they consistent?

Member

@poettering poettering commented Apr 25, 2017

@encbladexp ah, so the initrd is not using systemd. Next question: do you know whether it is the initrd or the host system that mounts that btrfs raid fs?

Author

@encbladexp encbladexp commented Apr 25, 2017

@arvidjaar => lrwxrwxrwx 1 root root 10 22. Apr 09:19 /dev/disk/by-label/Storage -> ../../dm-2 (i reboot my homeserver only 2-3 times per year, so we have the same state until i reboot).

@poettering => I would think the host system mounts this, but who knows…

Contributor

@arvidjaar arvidjaar commented Apr 25, 2017

@encbladexp Could you make available systemctl show home.mount and systemctl status home.mount ?

Author

@encbladexp encbladexp commented Apr 25, 2017

[root@server:~] # systemctl status home.mount
● home.mount - /home
   Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
   Active: active (mounted) since Sat 2017-04-22 09:19:02 CEST; 3 days ago
    Where: /home
     What: /dev/mapper/storage1
     Docs: man:fstab(5)
           man:systemd-fstab-generator(8)
  Process: 494 ExecMount=/usr/bin/mount /dev/disk/by-label/Storage /home -t btrfs -o defaults,subvol=home (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/home.mount

Apr 22 09:19:00 server systemd[1]: Mounting /home...
Apr 22 09:19:02 server systemd[1]: Mounted /home.

and

Where=/home
What=/dev/mapper/storage1
Options=rw,relatime,space_cache,subvolid=264,subvol=/home
Type=btrfs
TimeoutUSec=1min 30s
ControlPID=0
DirectoryMode=0755
SloppyOptions=no
LazyUnmount=no
ForceUnmount=no
Result=success
UID=4294967295
GID=4294967295
ExecMount={ path=/usr/bin/mount ; argv[]=/usr/bin/mount /dev/disk/by-label/Storage /home -t btrfs -o defaults,subvol=home ; ignore_errors=no ; start_time=[Sat 2017
Slice=system.slice
ControlGroup=/system.slice/home.mount
MemoryCurrent=18446744073709551615
CPUUsageNSec=18446744073709551615
TasksCurrent=0
Delegate=no
CPUAccounting=no
CPUWeight=18446744073709551615
StartupCPUWeight=18446744073709551615
CPUShares=18446744073709551615
StartupCPUShares=18446744073709551615
CPUQuotaPerSecUSec=infinity
IOAccounting=no
IOWeight=18446744073709551615
StartupIOWeight=18446744073709551615
BlockIOAccounting=no
BlockIOWeight=18446744073709551615
StartupBlockIOWeight=18446744073709551615
MemoryAccounting=no
MemoryLow=0
MemoryHigh=18446744073709551615
MemoryMax=18446744073709551615
MemorySwapMax=18446744073709551615
MemoryLimit=18446744073709551615
DevicePolicy=auto
TasksAccounting=yes
TasksMax=4915
UMask=0022
LimitCPU=18446744073709551615
LimitCPUSoft=18446744073709551615
LimitFSIZE=18446744073709551615
LimitFSIZESoft=18446744073709551615
LimitDATA=18446744073709551615
LimitDATASoft=18446744073709551615
LimitSTACK=18446744073709551615
LimitSTACKSoft=8388608
LimitCORE=18446744073709551615
LimitCORESoft=18446744073709551615
LimitRSS=18446744073709551615
LimitRSSSoft=18446744073709551615
LimitNOFILE=4096
LimitNOFILESoft=1024
LimitAS=18446744073709551615
LimitASSoft=18446744073709551615
LimitNPROC=127517
LimitNPROCSoft=127517
LimitMEMLOCK=65536
LimitMEMLOCKSoft=65536
LimitLOCKS=18446744073709551615
LimitLOCKSSoft=18446744073709551615
LimitSIGPENDING=127517
LimitSIGPENDINGSoft=127517
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=18446744073709551615
LimitRTTIMESoft=18446744073709551615
OOMScoreAdjust=0
Nice=0
IOScheduling=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=inherit
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
SecureBits=0
CapabilityBoundingSet=18446744073709551615
AmbientCapabilities=0
DynamicUser=no
RemoveIPC=no
MountFlags=0
PrivateTmp=no
PrivateDevices=no
ProtectKernelTunables=no
ProtectKernelModules=no
ProtectControlGroups=no
PrivateNetwork=no
PrivateUsers=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=yes
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=0
RuntimeDirectoryMode=0755
MemoryDenyWriteExecute=no
RestrictRealtime=no
KillMode=control-group
KillSignal=15
SendSIGKILL=yes
SendSIGHUP=no
Id=home.mount
Names=home.mount
Requires=systemd-fsck@dev-disk-by\x5cx2dlabel-Storage.service -.mount system.slice
BindsTo=dev-disk-by\x5cx2dlabel-Storage.device
RequiredBy=local-fs.target srv-nfs-home.mount
WantedBy=dev-disk-by\x5cx2dlabel-Storage.device
Conflicts=umount.target
Before=umount.target local-fs.target srv-nfs-home.mount
After=local-fs-pre.target system.slice dev-disk-by\x5cx2dlabel-Storage.device systemd-fsck@dev-disk-by\x5cx2dlabel-Storage.service -.mount
RequiresMountsFor=/ /dev/disk/by-label/Storage
Documentation=man:fstab(5) man:systemd-fstab-generator(8)
Description=/home
LoadState=loaded
ActiveState=active
Where=/home
What=/dev/mapper/storage1
Options=rw,relatime,space_cache,subvolid=264,subvol=/home
Type=btrfs
TimeoutUSec=1min 30s
ControlPID=0
DirectoryMode=0755
SloppyOptions=no
LazyUnmount=no
ForceUnmount=no
Result=success
UID=4294967295
GID=4294967295
ExecMount={ path=/usr/bin/mount ; argv[]=/usr/bin/mount /dev/disk/by-label/Storage /home -t btrfs -o defaults,subvol=home ; ignore_errors=no ; start_time=[Sat 2017
Slice=system.slice
ControlGroup=/system.slice/home.mount
MemoryCurrent=18446744073709551615
CPUUsageNSec=18446744073709551615
TasksCurrent=0
Delegate=no
CPUAccounting=no
CPUWeight=18446744073709551615
StartupCPUWeight=18446744073709551615
CPUShares=18446744073709551615
StartupCPUShares=18446744073709551615
CPUQuotaPerSecUSec=infinity
IOAccounting=no
IOWeight=18446744073709551615
StartupIOWeight=18446744073709551615
BlockIOAccounting=no
BlockIOWeight=18446744073709551615
StartupBlockIOWeight=18446744073709551615
MemoryAccounting=no
MemoryLow=0
MemoryHigh=18446744073709551615
MemoryMax=18446744073709551615
MemorySwapMax=18446744073709551615
MemoryLimit=18446744073709551615
DevicePolicy=auto
Member

@poettering poettering commented Apr 29, 2017

@poettering => I would think the host system mounts this, but who knows…

well, "journalctl -b" should include a message about that, in particular if you boot with "systemd.log_level=debug". Just check if that message that says "mounting /home" is before the message that tells you that the initrd transition already took place.

Author

@encbladexp encbladexp commented Apr 29, 2017

I will take a look at my next reboot.

Contributor

@arvidjaar arvidjaar commented Apr 30, 2017

I just reproduced it (no, I do not know how to do it on purpose). No initrd involved, plain /etc/fstab mount.

localhost:~ # grep /thin /proc/self/mountinfo
96 59 0:63 / /thin rw,relatime shared:47 - btrfs /dev/dm-0 rw,space_cache,subvolid=5,subvol=/
localhost:~ # systemctl status /dev/dm-0
● dev-dm\x2d0.device - /dev/dm-0
   Loaded: loaded
   Active: activating (tentative) since Sun 2017-04-30 08:03:07 MSK; 5min ago
   Device: /sys/devices/virtual/block/dm-0
localhost:~ # udevadm info /dev/dm-0 | grep READY
E: ID_BTRFS_READY=0
E: SYSTEMD_READY=0
localhost:~ # systemctl --no-pager status thin.mount
● thin.mount - /thin
   Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
   Active: active (mounted) since Sun 2017-04-30 08:03:07 MSK; 6min ago
    Where: /thin
     What: /dev/dm-0
     Docs: man:fstab(5)
           man:systemd-fstab-generator(8)
  Process: 982 ExecMount=/usr/bin/mount /dev/disk/by-label/Storage /thin -t btrfs (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/thin.mount

Apr 30 08:03:07 localhost systemd[1]: Mounting /thin...
Apr 30 08:03:07 localhost systemd[1]: Mounted /thin.
localhost:~ # ll /dev/disk/by-label/
total 0
lrwxrwxrwx 1 root root 10 Apr 30 08:03 Storage -> ../../dm-1
localhost:~ # udevadm info /dev/dm-1 | grep READY
E: ID_BTRFS_READY=1
localhost:~ #

Just to record the state before it is lost on reboot :)

Contributor

@arvidjaar arvidjaar commented Apr 30, 2017

OK, so the issue here is that btrfs shows in /proc/self/mountinfo not the device name used to mount it, but the name of the device with the smallest devid (as shown e.g. by btrfs fi show /mnt):

localhost:~ # btrfs fi show /thin
Label: 'Storage'  uuid: a6f9dd05-460c-418b-83ab-ebdf81f2931a
Total devices 2 FS bytes used 640.00KiB
devid    1 size 20.00GiB used 2.01GiB path /dev/mapper/vg01-storage1
devid    2 size 10.00GiB used 2.01GiB path /dev/mapper/vg01-storage2
localhost:~ # udevadm info /dev/dm-1 | grep READY
E: ID_BTRFS_READY=1
localhost:~ # udevadm info /dev/dm-0 | grep READY
E: ID_BTRFS_READY=0
E: SYSTEMD_READY=0
localhost:~ # ll /dev/mapper/
total 0
crw------- 1 root root 10, 236 Apr 30 08:50 control
lrwxrwxrwx 1 root root       7 Apr 30 08:50 vg01-storage1 -> ../dm-0
lrwxrwxrwx 1 root root       7 Apr 30 08:50 vg01-storage2 -> ../dm-1
localhost:~ # mount /dev/mapper/vg01-storage2 /thin
localhost:~ # dmesg | tail
[   71.242526] BTRFS info (device dm-1): disk space caching is enabled
[   71.242528] BTRFS info (device dm-1): has skinny extents
localhost:~ # tail /proc/self/mountinfo
169 59 0:66 / /thin rw,relatime shared:101 - btrfs /dev/dm-0 rw,space_cache,subvolid=5,subvol=/
localhost:~ # 

Then systemd gets an event from /proc/self/mountinfo and updates What on the mount unit.

It is non-deterministic which device will be detected last during boot. E.g. on my next reboot /dev/dm-0 "exists" and /dev/dm-1 "does not exist".
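
The observation in this comment can be checked with a small sketch. It only encodes the behaviour described here (mountinfo reporting the member with the smallest devid) and is not taken from btrfs or systemd sources. The helper name lowest_devid_path is hypothetical; it reads `btrfs filesystem show` output on stdin.

```shell
# Hypothetical helper: print the path of the member device with the
# smallest devid found in `btrfs filesystem show` output on stdin.
lowest_devid_path() {
    awk '
        $1 == "devid" && (min == "" || $2 + 0 < min) {
            min = $2 + 0      # remember the smallest devid seen so far
            path = $NF        # the device path is the last field
        }
        END { if (path != "") print path }'
}

# Sample input copied from the output in this comment.
lowest_devid_path <<'EOF'
Label: 'Storage'  uuid: a6f9dd05-460c-418b-83ab-ebdf81f2931a
Total devices 2 FS bytes used 640.00KiB
devid    1 size 20.00GiB used 2.01GiB path /dev/mapper/vg01-storage1
devid    2 size 10.00GiB used 2.01GiB path /dev/mapper/vg01-storage2
EOF
# prints /dev/mapper/vg01-storage1
```

In the session above, /dev/mapper/vg01-storage1 is a link to dm-0, which matches the /dev/dm-0 shown in /proc/self/mountinfo even though the mount command named vg01-storage2.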

Contributor

@arvidjaar arvidjaar commented Apr 30, 2017

Actually, the current multi-device handling is simply wrong. We rely on the UUID/LABEL link to point to the "correct" device, but it is unpredictable: events are processed concurrently and the event that finishes last wins. For the third time my VM stops on reboot due to a "missing" device:

localhost:~ # grep thin /etc/fstab
LABEL=Storage /thin btrfs defaults 0 1
localhost:~ # ll /dev/disk/by-label
total 0
lrwxrwxrwx 1 root root 10 Apr 30 09:10 Storage -> ../../dm-0
localhost:~ # systemctl status /dev/dm-0
● dev-dm\x2d0.device
   Loaded: loaded
   Active: inactive (dead)
localhost:~ # systemctl status /dev/dm-1
● dev-dm\x2d1.device - /dev/dm-1
   Follow: unit currently follows state of sys-devices-virtual-block-dm\x2d1.device
   Loaded: loaded
   Active: active (plugged) since Sun 2017-04-30 09:10:13 MSK; 7min ago
   Device: /sys/devices/virtual/block/dm-1
bor@bor-Latitude-E5450:~/src/systemd$ 
Member

@poettering poettering commented Apr 30, 2017

can you check if doing OPTIONS+="link_priority=-100" for all devices where ID_READY=0 is set helps? That way only the one device showing up last that completes the array is also the one getting the symlink
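
A sketch of what such a rule could look like, written as a shell helper that drops a rules file into a directory. The helper name and the file name 65-btrfs-link-priority.rules are hypothetical, matching on ID_BTRFS_READY (the property shown in the udevadm outputs in this thread) is an assumption about what ID_READY refers to, and nobody in this thread reported testing it.

```shell
# Hypothetical helper: write a drop-in udev rule into directory $1.
write_link_priority_rule() {
    cat >"$1/65-btrfs-link-priority.rules" <<'EOF'
# Sketch: lower the symlink priority of btrfs members that are not ready,
# so links such as /dev/disk/by-label/* prefer the ready member.
SUBSYSTEM=="block", ENV{ID_FS_TYPE}=="btrfs", ENV{ID_BTRFS_READY}=="0", OPTIONS+="link_priority=-100"
EOF
}

# Write to a scratch directory and show the result. Installing it for
# real would mean targeting /etc/udev/rules.d and reloading udev rules.
rules_dir=$(mktemp -d)
write_link_priority_rule "$rules_dir"
cat "$rules_dir/65-btrfs-link-priority.rules"
rm -rf "$rules_dir"
```

The file name is chosen to sort after 64-btrfs.rules so that the readiness property is already set when the rule is evaluated. A rule processed later that assigns link_priority again would presumably override this value.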

Contributor

@arvidjaar arvidjaar commented Apr 30, 2017

@poettering

can you check if doing OPTIONS+="link_priority=-100" for all devices where ID_READY=0 is set helps?

No. You have no control over rules that others may install and in another rule installed by OS vendor these devices get link_priority=50.

Contributor

@arvidjaar arvidjaar commented Apr 30, 2017

@poettering

In any case, this is separate problem, and setting link_priority cannot help with original issue reported here.

Member

@poettering poettering commented May 2, 2017

No. You have no control over rules that others may install and in another rule installed by OS vendor these devices get link_priority=50.

Uh? we ship a set of default rules that include link_priority usage already, i see no problem to set it here too

Member

@poettering poettering commented May 2, 2017

In any case, this is separate problem, and setting link_priority cannot help with original issue reported here.

Uh? afaics the link_priority setting will fix the issue at hand, as it means the symlinks will always point to the one btrfs backing device that has BTRFS_READY=1 set (and hence SYSTEMD_READY=1). systemd picks up the symlink names for its .device units, and hence this should fix the issue at hand.

Member

@poettering poettering commented May 2, 2017

@encbladexp any chance you can play around with link_priority in the udev rules? either bump the btrfs backing device that has ready set up or all the ones that do not have it down. Please check if that fixes your issue at hand.

Contributor

@arvidjaar arvidjaar commented May 2, 2017

it means the symlinks will always point to the one btrfs backing device that has BTRFS_READY=1 set

Which is absolutely irrelevant for the issue reported here. Links are already correct and point exactly to this device.

Member

@poettering poettering commented May 3, 2017

Which is absolutely irrelevant for the issue reported here. Links are already correct and point exactly to this device.

are they? your paste suggests otherwise. first you show this:

● thin.mount - /thin
   Loaded: loaded (/etc/fstab; generated; vendor preset: disabled)
   Active: active (mounted) since Sun 2017-04-30 08:03:07 MSK; 6min ago
    Where: /thin
     What: /dev/dm-0
     Docs: man:fstab(5)
           man:systemd-fstab-generator(8)
  Process: 982 ExecMount=/usr/bin/mount /dev/disk/by-label/Storage /thin -t btrfs (code=exited, status=0/SUCCESS)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/thin.mount

this suggests /dev/disk/by-label/Storage points to → /dev/dm-0.

And then a bit later you post this:

localhost:~ # ll /dev/disk/by-label/
total 0
lrwxrwxrwx 1 root root 10 Apr 30 08:03 Storage -> ../../dm-1

Which suggests it now points to /dev/dm-1.

All I am saying: we should make reliable that the one SYSTEMD_READY=1 is set on is also strictly the one that has the symlinks pointing to it.


@ochilan ochilan commented May 12, 2017

Question to anyone able to repro the issue: Are you using keyfiles to auto-open the devices via crypttab? If so, are you using the same keyfile for each of the devices? And if so, can you check what happens when you use separate keyfiles for each of the devices? Thanks.

Author

@encbladexp encbladexp commented May 12, 2017

My Crypttab:

storage1 PARTLABEL=StorageOne /root/storage-encryption-key
storage2 PARTLABEL=StorageTwo /root/storage-encryption-key

It's the same key, for different devices.


@ochilan ochilan commented May 12, 2017

Okay. Can you please try to copy the keyfile to a different name and use the two different files (it's okay that they have the same content) as keys for storage1 and storage2 and check whether this changes anything regarding the issue? I.e. something like

storage1 PARTLABEL=StorageOne /root/storage-encryption-key1
storage2 PARTLABEL=StorageTwo /root/storage-encryption-key2

For the change to take effect it's probably easiest to reboot such that systemd can do its work opening the crypt devices.

Author

@encbladexp encbladexp commented May 12, 2017

I can test this.

Contributor

@arvidjaar arvidjaar commented May 13, 2017

@poettering

we should make reliable that the one SYSTEMD_READY=1 is set on is also strictly the one that has the symlinks pointing to it

Did you read what I wrote earlier? SYSTEMD_READY=1 is set on device that is link target. But device shown in /proc/self/mountinfo for btrfs has nothing to do with device used to mount it (i.e. device that got SYSTEMD_READY=1). systemd updates in-memory mount unit with information from /proc/self/mountinfo.

@ochilan

Are you using keyfiles

It is trivially reproduced using btrfs on two physical devices, no keyfiles involved.
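
To tell the two cases in this discussion apart, one can compare where the by-label link points with the source listed in /proc/self/mountinfo. A minimal sketch, assuming `readlink -f` is available (GNU coreutils or busybox). The helper name same_device is hypothetical, and the paths in the usage comment are the ones from this report.

```shell
# Hypothetical helper: succeed when both paths resolve to the same node.
same_device() {
    [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Usage on a live system (paths from this report):
#   same_device /dev/disk/by-label/Storage /dev/mapper/storage1 \
#       || echo "mount source differs from the by-label link target"

# Self-contained demonstration with stand-in files instead of /dev nodes.
demo=$(mktemp -d)
touch "$demo/dm-0" "$demo/dm-2"
ln -s dm-0 "$demo/storage1"   # stands in for the name shown in mountinfo
ln -s dm-2 "$demo/Storage"    # stands in for the by-label link
if same_device "$demo/Storage" "$demo/storage1"; then
    echo "same device"
else
    echo "different devices"
fi
rm -rf "$demo"
```

With the stand-in layout above, which mirrors the original report (link to dm-2, mount source on dm-0), the sketch reports different devices.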

Author

@encbladexp encbladexp commented Feb 9, 2018

Is there anything I should test to get this issue fixed? On systemd 237 it looks like this:

[root@server:~] # systemctl status dev-mapper-storage1.device
● dev-mapper-storage1.device - /dev/mapper/storage1
   Follow: unit currently follows state of sys-devices-virtual-block-dm\x2d2.device
   Loaded: loaded
  Drop-In: /run/systemd/generator/dev-mapper-storage1.device.d
           └─90-device-timeout.conf
   Active: active (plugged) since Sun 2018-01-28 06:52:01 CET; 1 weeks 5 days ago
   Device: /sys/devices/virtual/block/dm-2
[root@server:~] # systemctl status dev-mapper-storage2.device
● dev-mapper-storage2.device - /dev/mapper/storage2
   Follow: unit currently follows state of sys-devices-virtual-block-dm\x2d0.device
   Loaded: loaded
  Drop-In: /run/systemd/generator/dev-mapper-storage2.device.d
           └─90-device-timeout.conf
   Active: active (plugged) since Sun 2018-01-28 06:52:01 CET; 1 weeks 5 days ago
   Device: /sys/devices/virtual/block/dm-0

All fine now?

Contributor

@arvidjaar arvidjaar commented Feb 11, 2018

Likely result of 0e8856d which retriggers events on previously "non-existing" block devices.

Author

@encbladexp encbladexp commented Dec 31, 2019

Unable to reproduce this issue anymore. Seems to be fixed or at least working right now.

@encbladexp encbladexp closed this Dec 31, 2019