
Ubuntu: Not all /dev/disk/ symlinks are available when mountall runs #2472

Closed
frymaster opened this issue Jul 8, 2014 · 21 comments

@frymaster

I have a pool where the devices were added using their /dev/disk/by-id aliases. When I reboot, one device is consistently offline with the status UNAVAIL. zpool online always succeeds, and zpool scrub never finds any errors, whether run just before or just after the reboot. There are no indications from the SMART data of any problems (timeouts, reallocated sectors, etc.).

If I export the pool and reimport with -d /dev/disk instead of -d /dev/disk/by-id then I can reboot without issues.

Putting a sleep statement in mountall.conf has no effect (other than delaying boot), even with values of a minute or more.

The system's root drive is an SSD connected to an add-on card and is (almost) the last disk to be initialised - it starts initialising after the offending disk but completes before it. A previous incarnation of this system (Ubuntu 12.04, ZFS 0.6.2) had the root drive on a USB stick, which was initialised much earlier; on that system, the pool didn't mount at all, as too many devices were missing. I don't know the best way to debug before you have a disk to write to, but I was able to output the results of ls /dev/disk/by-id to the screen on that system, and the symlinks were missing even though the actual devices in /dev/ were present.

The workaround was to add wait_for_udev in the initramfs scripts, as if I were trying to have ZFS as my root filesystem (or to use /dev/ instead of /dev/disk/by-id).

So there seems to be some kind of race condition where the mountall script runs before udev has finished adding the symlinks.
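For anyone hitting the same thing, the wait_for_udev workaround amounts to a small initramfs-tools boot script run before the root pivot. A minimal sketch follows; the filename and install path are my choice, not anything prescribed by the packages:

```shell
#!/bin/sh
# Hypothetical initramfs-tools boot script, e.g. installed as
# /etc/initramfs-tools/scripts/init-premount/wait-udev (path is an assumption).
PREREQ="udev"
prereqs() { echo "$PREREQ"; }
case "$1" in
    prereqs) prereqs; exit 0 ;;
esac

. /scripts/functions
# wait_for_udev wraps "udevadm settle": it blocks (with a timeout in seconds)
# until the initramfs udev has drained its event queue, so the
# /dev/disk/by-id symlinks exist before control passes to upstart.
wait_for_udev 30
```

After adding the script, run update-initramfs -u so it gets baked into the image.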

The title of #1103 seems related, but the description is very different - for one thing, adding a delay makes no difference in my case.

root@gregor:~# dpkg -l | grep zfs
ii  dkms                                2.2.0.3-1.1ubuntu5+zfs9~trusty      all          Dynamic Kernel Module Support Framework
ii  libzfs2                             0.6.3-2~trusty                      amd64        Native ZFS filesystem library for Linux
ii  mountall                            2.53-zfs1                           amd64        filesystem mounting tool
ii  ubuntu-zfs                          8~trusty                            amd64        Native ZFS filesystem metapackage for Ubuntu.
ii  zfs-dkms                            0.6.3-2~trusty                      amd64        Native ZFS filesystem kernel modules for Linux
ii  zfsutils                            0.6.3-2~trusty                      amd64        Native ZFS management utilities for Linux
root@gregor:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04 LTS
Release:        14.04
Codename:       trusty
root@gregor:~# uname -a
Linux gregor 3.13.0-30-generic #55-Ubuntu SMP Fri Jul 4 21:40:53 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I added the following to the mountall.conf script, just before the exec mountall line:

exec >>/dev/.initramfs/myjob.log 2>&1
set -x

echo "=== Sleeping for 10 seconds"
/bin/sleep 10 || true
echo "=== Output of /dev"
/bin/ls -lh /dev/sd* || true
/bin/ls -lh /dev/disk/by-id || true

The output of this, along with the zdb output (using both /dev and /dev/disk/by-id) are at https://gist.github.com/frymaster/0f864b579943d53b9107

The devices corresponding to the pool are a, b, d, e, f and g. Note the partition symlinks for sdg are missing, but the partitions are present in /dev.

i7 920 CPU, 18GB of RAM. The RAM is non-ECC but I am not reporting a data corruption issue. The system is not running under virtualisation.

@frymaster frymaster changed the title Ubuntu: Not all /dev/disk/ symlinks are avilable when mountall runs Ubuntu: Not all /dev/disk/ symlinks are available when mountall runs Jul 9, 2014
@rdlugosz

Hi - I opened #1103 and should clarify that the sleep 30 only appeared to work for me; it didn't actually solve the issue (I think I hit a lucky streak where the drives happened to all be ready a couple of times in a row). I've edited my comment over there to prevent future confusion.

I've also run through the various troubleshooting tips listed here to no avail. This is a persistent bug, but tough to track down.

FWIW, I've found that simply running a zpool export tank immediately followed by a zpool import tank will resolve the issue... tempted to just drop that in rc.local but that just feels wrong (and dangerous). Be careful that you don't have other services start by default that may want access to your zpool mounts. I've seen issues where the pool isn't available & some process ends up creating a directory at the expected mount point. Then you'll need to clean that up before you'll be able to mount your zpool again.
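For completeness, the rc.local version of that hack might look like the sketch below (the pool name "tank" is assumed; as noted above, this is risky if anything already holds files or directories under the pool's mount points):

```shell
#!/bin/sh -e
# /etc/rc.local sketch: force a clean re-import of the pool late in boot,
# after udev has had time to create all the /dev/disk/by-id symlinks.
# Only safe if no service has started writing under the pool's mount points yet.
zpool export tank && zpool import tank
exit 0
```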

@ryao
Contributor

ryao commented Jul 11, 2014

@frymaster Adding wait_for_udev() would seem to be the right solution here. Perhaps @dajhorn could add it to the mountall package in the PPA.

@dajhorn
Contributor

dajhorn commented Jul 11, 2014

@ryao, mountall already has udev logic to do something like this near https://github.com/zfsonlinux/mountall/blob/master/ubuntu/trusty/src/mountall.c#L2973 and we iterated on similar solutions in zfsonlinux/mountall#1 (plus older tickets in the obsolete repository).

Plus, @rdlugosz reported earlier that he is using duff hardware.

Past that, patches are welcome, but getting a solution without side-effects will be a non-trivial amount of work. This kind of issue should be resolved by implementing #330.

@frymaster
Author

@dajhorn - a 10 minute sleep in the upstart script for mountall doesn't resolve this for me. I don't know how upstart and udev interact, but for whatever reason, creating those symlinks is paused while upstart's "startup" signal is being processed, so in my situation that code in mountall is of no benefit. I either have to add wait_for_udev() in the initramfs script (before upstart starts), or not mount the ZFS pool until afterwards.

My concern would be that if my pool was mounted later via this hotplug solution it might not come up before services which store data on the pool (database, virtual machines etc.)

@dajhorn
Contributor

dajhorn commented Jul 11, 2014

a 10 minute sleep in the upstart script for mountall doesn't resolve this for me

It sounds like you have a secondary problem, like a stale /etc/zfs/zpool.cache file or udev rules that didn't survive the upgrade to trusty. (Ubuntu 14.04 rearranged many things in /dev and /etc.)

If you want to pursue this issue, then please submit the materials bulleted in the FAQ.

My concern would be that if my pool was mounted later via this hotplug solution it might not come up before services which store data on the pool (database, virtual machines etc.)

It is worth noting that an upstart dependency on a mount point can be specified when a storage resource is pathologically slow:

# /etc/init/MyDatabaseServer.conf
start on (local-filesystems and mounted MOUNTPOINT=/tank/MyDatabaseFiles)

Documentation here:

@rdlugosz

In the past I was using questionable hardware, but at this point all ZFS-related drives are attached directly to the motherboard. The issue is manageable enough since I rarely reboot, but every once in a while I go on the hunt for a fix. I will say I'm somewhat glad to see another person having the same issue.

@frymaster
Author

It sounds like you have a secondary problem, like a stale /etc/zfs/zpool.cache file or udev rules that didn't survive the upgrade to trusty. (Ubuntu 14.04 rearranged many things in /dev and /etc.)

Although this is more or less the same hardware as my 12.04 setup (which exhibited the same symptoms), it was a clean install.

Any time I've been exporting the pool in order to troubleshoot, there's been no zpool.cache remaining. I've updated the top post with zdb output from both configurations, and also the output of an "ls" of the /dev and /dev/disk/by-id directories after a 10 second wait in the mountall script.

@dajhorn
Contributor

dajhorn commented Jul 12, 2014

@frymaster, please submit the materials bulleted in the FAQ, particularly the full unmodified dmesg.

@frymaster
Author

@dajhorn The only mention of the dmesg in the FAQ is in relation to getting stack traces for dealing with hung processes. The rest of the details have been provided. Can you be more specific about what information you require?

@dajhorn
Contributor

dajhorn commented Jul 13, 2014

@frymaster
Author

@dajhorn ah, sorry, I was looking at http://zfsonlinux.org/faq.html#HowCanIHelp

kern.log
dmesg

Note I still have a 10 second sleep in the mountall.conf script, hence the 10 second jump after drive detection, just before the SPL and ZFS output.

zpool status - this is after I online'd the UNAVAIL disk (hence the resilver) and then rebooted (hence the UNAVAIL again)
zfs get all

@dajhorn
Contributor

dajhorn commented Jul 13, 2014

@frymaster, this is happening because udev is not mapping the partitions on one of the disks. In particular, these device aliases seem to be missing during pool import:

  • /dev/disk/by-id/ata-TOSHIBA_DT01ACA300_Z3GHJ3DGS-part1
  • /dev/disk/by-id/ata-TOSHIBA_DT01ACA300_Z3GHJ3DGS-part9
  • /dev/disk/by-wwn/wwn-0x5000039ff4d5258b-part1
  • /dev/disk/by-wwn/wwn-0x5000039ff4d5258b-part9

Please gist these things for the Z3GHJ3DGS disk:

# smartctl --all /dev/sdg
# fdisk -l /dev/sdg
# parted /dev/sdg print

Unfortunately, you have a novel problem. This could be caused by a udev misconfiguration, a failing disk, a bad SATA connector, or something else. An easy way to shorten the troubleshooting process here is to replace this disk with a spare.

@frymaster
Author

smartctl
fdisk
parted

Two comments:

  • The device aliases are created at some point after mountall.conf runs - this is how I am able to online the drive without issues
  • In my 12.04 install using the same motherboard and pool disks, the main difference was that I was booting from a USB drive rather than an SSD plugged into a PCIe SATA card. That device got detected much earlier in the boot process, and in that setup the device aliases were missing for three drives during boot, with the result that the pool couldn't mount at all (too many disks missing)

Because of this, I'm not inclined to believe it's an issue with one specific disk or SATA port.

@dajhorn
Contributor

dajhorn commented Jul 13, 2014

@frymaster, okay, the next step is to trace udev. Edit the /etc/init/udev.conf file and change the last line to exec /lib/systemd/systemd-udevd --daemon --debug.

Reboot, and post the /var/log/udev and /var/log/kern.log files. Create a new gist for these files. Let the system run long enough so that udev creates all of the device aliases.

@frymaster
Author

https://gist.github.com/frymaster/0fb49ead9c7ef4a30d53

Let the system run long enough so that udev creates all of the device aliases.

Certainly they're created by the time rc.local runs.

@dajhorn
Contributor

dajhorn commented Jul 13, 2014

@frymaster, I'm stumped. Fixing it will require a local reproducer.

@frymaster
Author

I'm confused as to how it works at all now.

  • udev creates the symlinks
  • the udev task is run when virtual-filesystems is emitted
  • the mountall script is what emits that
  • my ls before mountall shows almost all the symlinks created

And in fact if you look at the udev log, the first mention of the partition symlinks is

UDEV  [17.065605] add      /devices/pci0000:00/0000:00:01.0/0000:01:00.0/ata9/host8/target8:0:0/8:0:0:0/block/sdh/sdh1 (block)
ACTION=add
DEVLINKS=/dev/disk/by-id/ata-SanDisk_SDSSDP064G_141251406216-part1 /dev/disk/by-id/wwn-0x5001b44beed26588-part1 /dev/disk/by-uuid/e68ec506-a886-4507-9fcb-124561df6527

...while kern.log shows the pool is being created earlier

Jul 13 20:35:05 gregor kernel: [   16.568653]  zd0: p1 p2

...as does the udev log

KERNEL[16.963564] add      /devices/virtual/block/zd0 (block)

So, at the time mountall is running, none of the symlinks have been created by the udev instance run by upstart.

*Wild-ass guessing follows. I am **not** an expert in this and it shows.*

But... doesn't initramfs have its own udev? What I think could be happening: the symlinks that do exist for me are created during the initramfs phase, before upstart runs. Then, when my boot device becomes available, initramfs hands off to upstart (stopping its instance of udev from creating the remaining symlinks), and upstart doesn't start its own udev process until too late to create the symlinks in time for ZFS to use them. Is it possible the wait code in mountall only waits for the devices themselves, not for the symlinks?

Does this make sense?
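If that theory is right, a fixed sleep can never help, but an explicit wait on udev's event queue might. A sketch of what could be tried in /etc/init/mountall.conf, just before the exec line (untested; the timeout value is arbitrary):

```shell
# Block until udevd's event queue is empty; unlike a sleep, this waits for
# symlink creation as well as device nodes. "|| true" keeps boot going
# even if the settle times out.
udevadm settle --timeout=30 || true
```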

@ryao
Contributor

ryao commented Jul 13, 2014

@dajhorn Is there any chance #2455 might help?

On Jul 13, 2014, at 4:11 PM, Darik Horn notifications@github.com wrote:

@frymaster, I'm stumped. Fixing it will require a local reproducer.



@dajhorn
Contributor

dajhorn commented Jul 13, 2014

Is there any chance #2455 might help?

@ryao, it might.

@frymaster, try putting a udevadm trigger line above the exec in the /etc/init/mountall.conf file. If that doesn't work, then try installing the zfs-initramfs package afterwards.

But... doesn't initramfs have its own udev?

Yes, it does. The systemd-udev upstart job is the second invocation, so the time stamps in the system log might not be what you expect. We wanted to see whether the DEVLINKS= were happening after the regular system started, but they were happening on time.

Having something like /dev/sdg1 appear without udev reliably creating a corresponding /dev/disk/by-id or /dev/disk/by-wwn alias for it is unusual.

@ryao
Contributor

ryao commented Jul 13, 2014

On Jul 13, 2014, at 5:38 PM, Darik Horn notifications@github.com wrote:

Is there any chance #2455 might help?

@ryao, it might.

@frymaster, try putting a udevadm trigger line above the exec in the /etc/init/mountall.conf file. If that doesn't work, then try installing the zfs-initramfs package afterwards.

@dajhorn udevadm trigger operates asynchronously, so we could race with udevd's creation of symlinks. I included a call to yield() in that patch to try to close the race. In the case of mountall.conf, I think it might be necessary to place sleep 0.1 after udevadm trigger so that we lose the race.
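Combining the two suggestions, the mountall.conf fragment would look something like this (a sketch of the idea above, not a tested fix):

```shell
# Re-run udev rules for block devices so any missing /dev/disk/by-id
# symlinks get another chance to be created...
udevadm trigger --subsystem-match=block
# ...then pause briefly so udevd (which processes the trigger
# asynchronously) finishes before mountall scans for devices.
/bin/sleep 0.1
```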

@behlendorf
Contributor

I'm closing this issue out due to age. It's gotten a bit stale, and with Ubuntu moving to systemd and shipping a version of ZoL with 16.04, this exact issue no longer seems relevant.
