
Cannot create zpool using links in /dev/disk/by-id #3708

Closed
Gregy opened this issue Aug 27, 2015 · 45 comments

@Gregy commented Aug 27, 2015

ZFS refuses to create a new pool using the links in /dev/disk/by-id. Using device nodes directly from /dev works fine. See snippet:

for disk in `ls /dev/sd[a-n]`; do dd if=/dev/zero of=$disk bs=1000M count=1; done
reboot
ls -l /dev/disk/by-id/
...
lrwxrwxrwx 1 root root 9 Aug 27 15:51 ata-ST2000VN0001-1SF174_Z4H033G5 -> ../../sdb
lrwxrwxrwx 1 root root 9 Aug 27 15:51 ata-ST2000VN0001-1SF174_Z4H047T6 -> ../../sda
...

zpool create -o ashift=12 -f mainStorage mirror /dev/disk/by-id/ata-ST2000VN0001-1SF174_Z4H047T6 /dev/disk/by-id/ata-ST2000VN0001-1SF174_Z4H033G5
cannot create 'mainStorage': one or more devices is currently unavailable

zpool create -o ashift=12 -f mainStorage mirror /dev/sda /dev/sdb
<finishes ok>

Is there some way to make zpool create more verbose?

I am running Debian Jessie with zfsutils 0.6.4-1.2-1.

dmesg | grep -E 'SPL:|ZFS:'
[   46.699468] SPL: Loaded module v0.6.4-1b
[   46.798113] ZFS: Loaded module v0.6.4-1.2-1, ZFS pool version 5000, ZFS filesystem version 5
[  165.222847] SPL: using hostid 0x00000000

Thank you

@ryao (Contributor) commented Aug 27, 2015

Thanks for the bug report. For now, try using the short names, ata-ST2000VN0001-1SF174_Z4H047T6 and ata-ST2000VN0001-1SF174_Z4H033G5.
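For reference, with the pool and disk names from the original report, that would look like (an untested sketch):

zpool create -o ashift=12 -f mainStorage mirror ata-ST2000VN0001-1SF174_Z4H047T6 ata-ST2000VN0001-1SF174_Z4H033G5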

@rlanyi commented Sep 3, 2015

The same is happening for me on Debian Jessie, even with short names, without the "/dev/disk/by-id" prefix.

I also found a workaround: create the pool with plain device names (e.g. sda, sdb), then destroy the pool and recreate it with the original by-id command. This way I could use disk IDs while creating the pool.
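A sketch of that sequence, reusing the names from the original report:

zpool create -o ashift=12 -f mainStorage mirror sda sdb
zpool destroy mainStorage
zpool create -o ashift=12 -f mainStorage mirror /dev/disk/by-id/ata-ST2000VN0001-1SF174_Z4H047T6 /dev/disk/by-id/ata-ST2000VN0001-1SF174_Z4H033G5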

@roosteng commented Sep 4, 2015

This is also happening on openSUSE 13.2 but not on openSUSE 13.1, both compiled from the latest git. It creates partitions 1 and 9 on the drives, but maybe they are not showing up fast enough in the /dev/disk/by-id/ directory?

@dracwyrm commented Sep 8, 2015

@rlanyi Instead of destroying the ZPool, you can export and then import it like this:
$ sudo zpool export tank
$ sudo zpool import -d /dev/disk/by-id tank
That will switch all /dev/sdx drives to the full ID.

Cheers.

@atikir commented Oct 8, 2015

I had a similar issue today. It worked via sdb and sdc but not with disk IDs. I realized that if I used lower case for the pool name it worked, i.e.

zpool create Storage -o autoexpand=on mirror ID1 ID2

did not work, but

zpool create storage -o autoexpand=on mirror ID1 ID2

did work.

@mdsitton

I'm having the same issue. The name of the pool probably makes no difference for me since mine was lowercase already.

Could it be an issue with symbolic links? Since that's what the by-id names are for me.

@mdsitton

What @dracwyrm suggested seems to work though so thanks!

@siberx commented Nov 7, 2015

Just ran into this bug myself today; I wasted an hour fighting to create my first zpool under ZoL using by-id names until I came across this thread and tried the /dev/sd* names instead. That worked fine, and once my migration is done I'll re-import by-id as described by @dracwyrm to clean things up. I'm running Fedora 22, if that helps.

@jbrodriguez

This also happened to me.

Applied the workaround suggested by @dracwyrm and it was ok.

I'm running under Arch Linux (2015.11.01)

@wdennis commented Dec 16, 2015

Another "one or more devices is currently unavailable" sufferer when trying to create zpool with "by-id" devnames... This on Ubuntu 15.04 (ran into it on two separate systems so far.) Again, @dracwyrm 's workaround solved the problem for me. I did have one case where after a 2nd/3rd attempt to create with by-id devname's did work however (fwiw, used the "-f" flag on create...) Didn't work for me on this last system tho.

@dmaziuk commented Jan 11, 2016

Another one here. Adding raidz1-10 (sdae, sdaf, sdag) in a 36-bay AMD Supermicro. CentOS 7, kmod-zfs-0.6.5.3-1.el7.centos.x86_64 -- also fixed by @dracwyrm's workaround.

@fractalram

Another one here..

root@debian:/dev/disk/by-id# zpool create -f mypool2 /dev/disk/by-id/scsi-35000c5008e59858f /dev/disk/by-id/scsi-35000c5008e5ba103
cannot create 'mypool2': no such pool or dataset

Works with plain device names :+1:
root@debian:/home/debian# zpool create -f mypool mirror sdi sdj
root@debian:/home/debian# zpool list
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
mypool 5.44T 224K 5.44T - 0% 0% 1.00x ONLINE -
root@debian:/home/debian# zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 240K 5.27T 96K /mypool
root@debian:/home/debian# zpool status
pool: mypool
state: ONLINE
scan: none requested
config:

NAME        STATE     READ WRITE CKSUM
mypool      ONLINE       0     0     0
  mirror-0  ONLINE       0     0     0
    sdi     ONLINE       0     0     0
    sdj     ONLINE       0     0     0

errors: No known data errors

@dmaziuk commented Feb 5, 2016

Got a different error message today:

# zpool add tank -o ashift=12 raidz1 /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z84038QC /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z84038TR /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z8403EB5 
cannot add to 'tank': no such pool or dataset

Adding sda[h-j]1 worked. I hadn't looked closely the last time, but this time it failed after creating -part1 and -part9 on the new disks; after a subsequent export/import:

# zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 3.26M in 52h58m with 0 errors on Fri Jan  8 18:19:55 2016
config:

    NAME                                          STATE     READ WRITE CKSUM
    tank                                          ONLINE       0     0     0
      raidz1-0                                    ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301EGL4           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301EGNK           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301EHNP           ONLINE       0     0     0
      raidz1-2                                    ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301EJ9F           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301EJCD           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301EKD5           ONLINE       0     0     0
      raidz1-3                                    ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301EKXR           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301EL9Z           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301F0K7           ONLINE       0     0     0
      raidz1-4                                    ONLINE       0     0     0
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E2LVNLYS  ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301DD0E           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301DE6R           ONLINE       0     0     0
      raidz1-5                                    ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301LGR9           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_S301LH9F           ONLINE       0     0     0
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E4TSKF1D  ONLINE       0     0     0
      raidz1-6                                    ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z3041V61           ONLINE       0     0     0
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E7LLAU1Z  ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z3041VR9           ONLINE       0     0     0
      raidz1-7                                    ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z304NLWM           ONLINE       0     0     0
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E5UXCTN8  ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z304NLJN           ONLINE       0     0     0
      raidz1-8                                    ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z304VZWY           ONLINE       0     0     0
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E5NS4ZH4  ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z304X42V           ONLINE       0     0     0
      raidz1-9                                    ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z3057DKN           ONLINE       0     0     0
        ata-WDC_WD40EFRX-68WT0N0_WD-WCC4E0SERR6H  ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z305838A           ONLINE       0     0     0
      raidz1-10                                   ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z3041QD6           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z3041QXP           ONLINE       0     0     0
        ata-ST4000VN000-1H4168_Z3041RG3           ONLINE       0     0     0
      raidz1-11                                   ONLINE       0     0     0
        ata-ST8000AS0002-1NA17Z_Z84038QC-part1    ONLINE       0     0     0
        ata-ST8000AS0002-1NA17Z_Z84038TR-part1    ONLINE       0     0     0
        ata-ST8000AS0002-1NA17Z_Z8403EB5-part1    ONLINE       0     0     0
    logs
      mirror-1                                    ONLINE       0     0     0
        ata-CT1000BX100SSD1_1506F002DAF8-part3    ONLINE       0     0     0
        ata-CT1000BX100SSD1_1506F002DB7A-part3    ONLINE       0     0     0
    spares
      ata-ST4000VN000-1H4168_S301EHVW             AVAIL

-- note that the latest addition, raidz1-11, got imported as -part1 devices.
(Edit: OK, that's my bad; looking at the command history now, I accidentally added partitions instead of whole devices.)

@fusionstream

Just thought I'd come in and say @dracwyrm's workaround is also working for me on Fedora 22. Thanks!

@Ralithune

This is still happening.

I'm trying to write some software that will work generically across different OS's and systems running ZFS, and using the WWN ID to specify the drives is the best way to accomplish that. Is this being looked at at all?

I'm running CentOS 7.

EDIT: It appears to actually create partitions on the disks, even though the command fails with "no such pool or dataset".

@jdmaloney

I'm seeing this as well on CentOS 7 on zpool creates, but, more critically for me, on zpool replaces. I'm managing some large JBODs and have scripts that walk the disks looking for failures and start the rebuild after the new disk is inserted. I can't get consistent success on the replace, either by hand or in the scripts, when using anything besides the raw /dev/sdxx device. Same error: "no such pool or dataset".

It's not too painful to create zpools, export, and re-import from /dev/disk/by-vdev, but I need to replace disks without having to take the system down to export and re-import.

@partoneoftwo

I just hit this problem myself, on the Proxmox 4.1 distribution of Debian, kernel 4.3, with ZFS on Linux.
The direct solution to the problem is the workaround above. I used this Arch Linux forum post to guide me.

@dmaziuk commented Mar 14, 2016

@thomasfrivold: yes, we know. The problem is downtime, there wasn't supposed to be any.

@Ralithune

And secondly, using the workaround in a utility meant to run on different systems with foreign configs is really sloppy and will be prone to failure.

We need the issue fixed. I don't have a lot of experience digging into open-source software (just writing my own convenience stuff), but I suppose I could take a crack at it.

@behlendorf (Contributor)

As mentioned above, the problem is likely that the partition links under /dev/disk/by-id/ aren't being created in time. When given a whole block device, zpool create|add|replace will partition the device and expect those partitions to be available fairly promptly. If they're not, it will fail as described. Creating those links is something udev takes care of on most platforms.

Unfortunately, I wasn't able to easily reproduce this issue, so I'm hoping someone who can will answer a few questions for me (one way to check is sketched after this list):

  1. After the command fails, are the new partitions visible in the /dev/disk/by-id directory? They will be called something like ata-ST0000VN000-123456_S1234567-part1 and ata-ST0000VN000-123456_S1234567-part9.

  2. If not, are they created after running udevadm trigger?

  3. How long does the zpool create command take to run in the failure case? By default it should wait up to 30 seconds per device for the partition links to be created by udev.
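One quick way to answer those three after a failed create (a sketch; pool name and disk IDs are placeholders):

# ls -l /dev/disk/by-id/ | grep part          # 1. are the -part1/-part9 links there?
# udevadm trigger && udevadm settle           # 2. do they show up after re-triggering udev?
# time zpool create -f tank mirror /dev/disk/by-id/<id1> /dev/disk/by-id/<id2>   # 3. how long until it fails?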

@dmaziuk commented Mar 16, 2016

I'm pretty sure it's less than 30 seconds (and it's zpool add in my case). I will be adding another set of disks to the system with this behaviour in the next month or so, I'm making a note to make a note. ;)

@jdmaloney

  1. Can check tomorrow when I get in to work, I'm not 100% sure about the partitions, though I know the base device at least is in /dev/disk/by-path

  2. Again can double check

  3. I agree with @dmaziuk it's less than 30 seconds, for me it's in the 5-10 second range

My code for replacement is in a repo here on GitHub: https://github.com/jdmaloney/zfs_utils
(zfs_fault_fixer) After placing the new drive in the chassis, I wait 30 seconds to a minute and then kick off the script, though I have found that waiting longer doesn't increase my chances of success. Making sure the access light is long since off, I kick off the script, which creates my new vdev_id.conf file, runs udevadm trigger, sleeps 5 seconds, then builds the zpool replace command and runs it (roughly the flow sketched below). I echo the PCI string to make sure my script can see it, and it does, so Linux should know the drive is there.
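In outline, the replace path of the script is roughly this (the alias and vdev names are placeholders, not the real config):

# vi /etc/zfs/vdev_id.conf        # add an alias line for the new disk's by-path entry
# udevadm trigger
# sleep 5
# zpool replace tank <failed-vdev> <new-alias>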

Can post some exact output tomorrow as I have this system available for testing.

@dmaziuk commented Mar 21, 2016

Out of order:

  3. 2 seconds
]# date ; zpool add tank -o ashift=12 raidz1 /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EX0H /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EY7V /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EZP0 ; date
Mon Mar 21 16:50:00 CDT 2016
cannot add to 'tank': no such pool or dataset
Mon Mar 21 16:50:02 CDT 2016
  1. yes
# ls /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840E*
/dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EX0H    /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EY7V    /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EZP0
/dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EX0H-part1  /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EY7V-part1  /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EZP0-part1
/dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EX0H-part9  /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EY7V-part9  /dev/disk/by-id/ata-ST8000AS0002-1NA17Z_Z840EZP0-part9

So 2) does not apply, of course.

Just for completeness:

# zpool add tank -o ashift=12 raidz1 /dev/sdam /dev/sda /dev/sdal

[root@starfish by-id]# zpool status tank
  pool: tank
...
      raidz1-12                                   ONLINE       0     0     0
        sdam                                      ONLINE       0     0     0
        sda                                       ONLINE       0     0     0
        sdal                                      ONLINE       0     0     0
...

behlendorf added this to the 0.7.0 milestone Mar 22, 2016
@dasjoe (Contributor) commented Mar 23, 2016

@behlendorf I've seen this bug before, relevant snippet from an IRC discussion:

23:02 < dasjoe> ryao: I *think* by-id/ gets populated too late after zfs puts a GPT on a disk
23:03 < dasjoe> ryao: there are countermeasures against this, but as far as I understand the code ZFS waits until links to /dev/sdX appear, not for -part1 to become available

@ilovezfs (Contributor)

@behlendorf OK, the problem is the interval between stats needs to be longer here:
https://github.com/zfsonlinux/zfs/blob/master/lib/libzfs/libzfs_pool.c#L4128

diff --git a/lib/libzfs/libzfs_pool.c b/lib/libzfs/libzfs_pool.c
index 9fc4bfc..1e946cd 100644
--- a/lib/libzfs/libzfs_pool.c
+++ b/lib/libzfs/libzfs_pool.c
@@ -4124,8 +4124,8 @@ zpool_label_disk_wait(char *path, int timeout)
     * will exist and the udev link will not, so we must wait for the
     * symlink.  Depending on the udev rules this may take a few seconds.
     */
-   for (i = 0; i < timeout; i++) {
-       usleep(1000);
+   for (i = 0; i < timeout/10; i++) {
+       usleep(10000);

        errno = 0;
        if ((stat64(path, &statbuf) == 0) && (errno == 0))

Not sure what's needed to be as robust as possible without being silly, but that change seems to have been sufficient.

@dasjoe (Contributor) commented Mar 29, 2016

@behlendorf @ilovezfs I thought the race condition was caused by https://github.com/zfsonlinux/zfs/blob/master/lib/libzfs/libzfs_pool.c#L4282 - we seem to be waiting for the disk's symlink to get created, not for the disk's first partition.
The problem with this is that the disk's symlink never gets removed, whereas the -partX links do disappear.

Interestingly the zpool create completes successfully after zapping GPT and MBR.

Here's an udevadm monitor --e during a zpool create -f TEST /dev/disk/by-id/usb-SanDisk_Extreme_AA010805141143052101-0\:0, with partitions 1 and 9 already existing:

UDEV  [440370.389410] remove   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440370.392463] remove   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440370.473817] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440370.524642] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440370.527806] remove   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440370.536155] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440370.544679] remove   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440370.617717] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440370.676901] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440370.679961] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440370.756487] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440370.806316] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440370.817451] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440370.872328] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)

Here the same, using a fresh GPT without partitions 1 and 9:

UDEV  [440661.083752] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440661.116216] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440661.126996] remove   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440661.130858] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440661.142613] remove   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440661.226062] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440661.255215] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440661.259926] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440661.335891] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440661.365770] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440661.371126] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440661.390271] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)

And here after zapping all GPT and MBR structures:

UDEV  [440781.063806] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440781.143256] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440781.178342] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440781.178877] add      /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440781.283868] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb (block)
UDEV  [440781.414088] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb9 (block)
UDEV  [440781.432523] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)
UDEV  [440781.498210] add      /devices/virtual/bdi/zfs-8 (bdi)
UDEV  [440781.527729] change   /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.2/1-1.2:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1 (block)

@behlendorf (Contributor)

@dasjoe I'm not sure I follow you entirely. We should be waiting for the partition symlink (-partX) to be created in the expected place by udev. Are you saying those partition symlinks aren't being created by udev for some reason? Just the device itself?

@dasjoe (Contributor) commented Mar 31, 2016

@behlendorf Never mind, I just realized zfs_append_partition gets called before we wait for the symlinks, so we are correctly waiting on the partition symlink.

However, it is interesting that zpool create fails for disks with a GPT even when no partitions are present. Zapping the GPT/MBR labels makes zpool create succeed, though.
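For anyone trying to reproduce that, one common way to zap the labels before retrying (destructive; /dev/sdX and the disk ID are placeholders) is:

# sgdisk --zap-all /dev/sdX       # or: wipefs -a /dev/sdX
# zpool create -f TEST /dev/disk/by-id/<disk-id>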

@dmaziuk commented Mar 31, 2016

Hmm... IIRC when I started playing with zfs, zpool create wouldn't work unless the disk had a GPT label? I mean, there must've been a reason I started creating those instead of just adding completely blank disks.

@dasjoe (Contributor) commented Mar 31, 2016

@dmaziuk as of now zpool create complains about disks lacking a GPT; you can force the creation with -f.

@dmaziuk commented Mar 31, 2016

Ah, that's what it was. Well, I'm maxed out in the chassis I've been adding disks to & it looks like I won't have an opportunity to test it with -f anytime soon. :(

@jhetrick

I think this is the same situation as above so I can test it for you :)

[root@itf]# ls /dev/disk/by-id/wwn-0x5000cca23b0e93a8*
/dev/disk/by-id/wwn-0x5000cca23b0e93a8

[root@itf]# zpool create dpool05 wwn-0x5000cca23b0e93a8
invalid vdev specification
use '-f' to override the following errors:

/dev/disk/by-vdev/wwn-0x5000cca23b0e93a8 contains a corrupt primary EFI label.
[root@itf]# zpool create -f dpool05 wwn-0x5000cca23b0e93a8
cannot create 'dpool05': no such pool or dataset

[root@itf]# ls /dev/disk/by-id/wwn-0x5000cca23b0e93a8*
/dev/disk/by-id/wwn-0x5000cca23b0e93a8 /dev/disk/by-id/wwn-0x5000cca23b0e93a8-part9
/dev/disk/by-id/wwn-0x5000cca23b0e93a8-part1

[root@itf]# zpool create -f dpool05 wwn-0x5000cca23b0e93a8
cannot create 'dpool05': no such pool or dataset

@emk2203 commented Apr 19, 2016

I am also affected by this bug. The problem is that zpool replace doesn't work even with an attempted workaround.

I am running Ubuntu 16.04LTS with the official zfs modules, at the moment 0.6.5.6-0ubuntu8.

If I try to exchange a disk in a pool (however it was imported), the offlined and removed disk is known to ZFS by ID, even if I imported the pool by /dev name.

An attempt to replace, either with the disk ID or the device name, gives the error. It also doesn't matter whether I erase the GPT and filesystem structures first or not; it just takes a few seconds longer until the error message appears. Forcing with -f has no effect either.

To prepare the disk for ZFS, it should be enough to erase the first and last 100 MB on it, no?
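For what it's worth, a sketch of doing exactly that (destructive; /dev/sdX is a placeholder), though wipefs -a or sgdisk --zap-all is usually enough to clear old labels:

# DISK=/dev/sdX
# SIZE_MB=$(( $(blockdev --getsz "$DISK") / 2048 ))   # --getsz reports 512-byte sectors, so this is MiB
# dd if=/dev/zero of="$DISK" bs=1M count=100
# dd if=/dev/zero of="$DISK" bs=1M count=100 seek=$(( SIZE_MB - 100 ))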

behlendorf added a commit to behlendorf/zfs that referenced this issue Apr 20, 2016
When ZFS partitions a block device it must wait for udev to create
both a device node and all the device symlinks.  This process takes
a variable length of time and depends on factors such as how many
links must be created, the complexity of the rules, etc.  Complicating
the situation further, it is not uncommon for udev to create and
then remove a link multiple times while processing the rules.

Given the above, the existing scheme of waiting for an expected
partition to appear by name isn't 100% reliable.  At this point
udev may still remove and recreate the link, resulting in the
kernel modules being unable to open the device.

In order to address this the zpool_label_disk_wait() function
has been updated to use libudev.  Until the registered system
device acknowledges that it is fully initialized the function
will wait.  Once fully initialized, all device links are checked
and allowed to settle for 50ms.  This makes it far more certain
that all the device nodes will exist when the kernel modules
need to open them.

For systems without libudev an alternate zpool_label_disk_wait()
was implemented which includes a settle time.  In addition, the
kernel modules were updated to include retry logic for this
ENOENT case.  Due to the improved checks in the utilities it
is unlikely this logic will be invoked; however, in the rare
event it is needed it will prevent a failure.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3708
Issue openzfs#4077
Issue openzfs#4144
Issue openzfs#4214
Issue openzfs#4517
behlendorf added a commit to behlendorf/zfs that referenced this issue Apr 20, 2016
behlendorf added a commit to behlendorf/zfs that referenced this issue Apr 22, 2016
behlendorf modified the milestones: 0.6.5.7, 0.7.0 Apr 25, 2016
@behlendorf (Contributor)

This issue has been addressed in master by commit 2d82ea8 and we'll look into backporting it for 0.6.5.7. As always, if you're in a position where you can provide additional verification of the fix applied to master, it would be appreciated. This ended up being a subtle timing issue, so the more real-world validation of the fix the better.

@ierdnah commented May 5, 2016

On Ubuntu 16.04 LTS:

root@ubuntu:# zpool create -f -m /var/backups/ backups ata-ST2000DM001-1ER164_Z4Z4BPGL
root@ubuntu:# zpool destroy backups
root@ubuntu:# zpool create -f -m /var/backups/ backups ata-ST2000DM001-1ER164_Z4Z4BPGL
cannot create 'backups': no such pool or dataset
root@ubuntu:# dd if=/dev/zero of=/dev/disk/by-id/ata-ST2000DM001-1ER164_Z4Z4BPGL count=100 bs=1M
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.544526 s, 193 MB/s
root@ubuntu:# zpool create -f -m /var/backups/ backups ata-ST2000DM001-1ER164_Z4Z4BPGL
root@ubuntu:#

nedbass pushed a commit to nedbass/zfs that referenced this issue May 6, 2016
When ZFS partitions a block device it must wait for udev to create
both a device node and all the device symlinks.  This process takes
a variable length of time and depends on factors such as how many
links must be created, the complexity of the rules, etc.  Complicating
the situation further, it is not uncommon for udev to create and
then remove a link multiple times while processing the udev rules.

Given the above, the existing scheme of waiting for an expected
partition to appear by name isn't 100% reliable.  At this point
udev may still remove and recreate the link, resulting in the
kernel modules being unable to open the device.

In order to address this the zpool_label_disk_wait() function
has been updated to use libudev.  Until the registered system
device acknowledges that it is fully initialized the function
will wait.  Once fully initialized, all device links are checked
and allowed to settle for 50ms.  This makes it far more likely
that all the device nodes will exist when the kernel modules
need to open them.

For systems without libudev an alternate zpool_label_disk_wait()
was updated to include a settle time.  In addition, the kernel
modules were updated to include retry logic for this ENOENT case.
Due to the improved checks in the utilities it is unlikely this
logic will be invoked.  However, in the rare event it is needed
it will prevent a failure.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes openzfs#4523
Closes openzfs#3708
Closes openzfs#4077
Closes openzfs#4144
Closes openzfs#4214
Closes openzfs#4517
nedbass pushed a commit to nedbass/zfs that referenced this issue May 6, 2016
@DanEmord commented Jun 5, 2016

I'm currently on Ubuntu 16.04, where the latest available ZFS is 0.6.5.6 (0.6.5.7 is available in 16.10, but I don't feel like mucking with that right now :-) ). I was able to create pools on SSDs without issue, but the HDDs were too slow and would fail with this error. On a whim, I decided to try maxing out the CPU to induce an artificial delay... and it worked! I can reliably create and destroy the pool with "stress -c 16" running in the background. YMMV.
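For anyone wanting to try the same trick, roughly (pool name and disk IDs are placeholders):

# stress -c 16 &                                  # peg the CPUs to slow the create down
# zpool create -f tank mirror /dev/disk/by-id/<id1> /dev/disk/by-id/<id2>
# kill %1                                         # stop stress afterwards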

ryao pushed a commit to ClusterHQ/zfs that referenced this issue Jun 7, 2016
@HinderSting commented Jun 18, 2016

The following command failed for me:
zpool add tank mirror /dev/disk/by-id/xxxxxxx /dev/disk/by-id/yyyyyyy

New partition links like xxxxxxx-part1 and xxxxxxx-part9 were created after the command ran.

This workaround works for me:
zpool add tank mirror /dev/disk/by-id/xxxxxxx-part1 /dev/disk/by-id/yyyyyyy-part1

PS. I'm running Ubuntu 16.04

@alejos commented Mar 1, 2018

Hello there,
I had the same problem, but it turned out there was just a stray space in my command. Check your syntax.
Hope this helps!
Alejo

@NathanaelA commented Jan 21, 2019

@rlanyi Instead of destroying the ZPool, you can export and then import it like this:
$ sudo zpool export tank
$ sudo zpool import -d /dev/disk/by-id tank
That will switch all /dev/sdx drives to the full ID.

Cheers.

This worked great; for some reason I was totally confused and thought I needed to put the actual disk ID of one of the devices on the command line so it would find the devices, like so:

$ sudo zpool import -d /dev/disk/by-id/SOME_DISK_ID tank
But that is wrong and fails. Do NOT put the disk_id in; just use the
$ sudo zpool import -d /dev/disk/by-id tank
and zpool is smart enough to figure it all out for you based on your pool name. ;-)

So if anyone else gets confused by the instructions elsewhere: just let zpool do the work, don't overcomplicate it. ;-)

@OrlandoNative

It's not just by-id. The zpool utility seems unable to directly handle anything other than "standard" device names (e.g. sdX), except for import. Once you create a pool you can export it and re-import it using just about any format you want: by-id, by-path, by-partid, etc. But if you need to replace a disk, detach a spare, or whatever, none of those subcommands work with that same naming format. You then have to export and re-import the pool specifying /dev, so it gets the "standard" device names for the disks, and then you can run the subcommands. I'm on CentOS 7 using the latest available ZFS, which is a newer release than most of what I've seen in this thread, and it still has this "annoyance". zpool should have some way of determining what naming format its vdevs are using and allow its subcommands to use that format.
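The dance described above, as a sketch (pool and device names are placeholders):

# zpool export tank
# zpool import -d /dev tank                  # back to sdX names so replace/detach work
# zpool replace tank sdX sdY
# zpool export tank
# zpool import -d /dev/disk/by-id tank       # back to by-id names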

@altimmons commented Apr 30, 2022

Disappointing to see this 2016 bug still around.
Adding by ID fails, and I've tried a thousand ways over four days: deleting MBRs, partitions, and whatever else I could think of.

What worked was adding by /dev/sdX (which we are explicitly advised NOT to do) and then using dracwyrm's seven-year-old workaround.

Really, in seven years no one has fixed this?

Specifically, I get this error:

missing link: sdbs was partitioned but /dev/disk/by-vdev/ztest8-part1 is missing

I set up aliases in /etc/zfs/vdev_id.conf, ran udevadm trigger, and then:

❯ zpool create -f ztst mirror ztest8 ztest9
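For context, alias entries in /etc/zfs/vdev_id.conf look roughly like this (the by-id paths are placeholders, not the ones actually used here):

alias ztest8  /dev/disk/by-id/ata-<model>_<serial-8>
alias ztest9  /dev/disk/by-id/ata-<model>_<serial-9>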

Using the /dev/disk/by-id/ names also fails every time, though I think the error differs slightly:

cannot label 'sdbt': failed to detect device partitions on '/dev/sdbt1': 19

System is Slackware (Unraid)
❯ zfs version
zfs-2.1.4-1
zfs-kmod-2.1.4-1
