udev: by-path device names for NVMe disks are not persistent #22692

Closed
estan opened this issue Mar 9, 2022 · 3 comments

Comments

@estan

estan commented Mar 9, 2022

systemd version the issue has been seen with

245.4-4ubuntu3.15

Used distribution

Ubuntu Server 20.04.4 LTS

Linux kernel version used (uname -a)

Linux oden 5.13.0-30-generic #33~20.04.1-Ubuntu SMP Mon Feb 7 14:25:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

CPU architecture issue was seen on

x86_64

Expected behaviour you didn't see

I expected the by-path device names to be persistent across reboots

Unexpected behaviour you saw

The by-path device names keep changing when the machine is rebooted

Steps to reproduce the problem

Plug in 10 x MZQL2960HCJR-00A07 SSDs in an AS-1114S-WN10RT server and repeatedly boot Ubuntu Server 20.04.4 LTS on it.

Additional program output to the terminal or log subsystem illustrating the issue

$ ls -1 /dev/disk/by-path/*part9
/dev/disk/by-path/pci-0000:03:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:04:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:43:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:44:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:81:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:82:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:c1:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:c2:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:c3:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:c4:00.0-nvme-1-part9
$

<reboot>

$ ls -1 /dev/disk/by-path/*part9
/dev/disk/by-path/pci-0000:01:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:02:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:41:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:42:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:81:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:82:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:c1:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:c2:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:c3:00.0-nvme-1-part9
/dev/disk/by-path/pci-0000:c4:00.0-nvme-1-part9
$

Notice how the device names for four of the disks changed during reboot.
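
One way to confirm exactly which links moved is to save the listing before and after a reboot and compare the two files. The sketch below is only an illustration: the function name `changed_links` and the file names are hypothetical, and it assumes bash (for process substitution) plus the standard `sort` and `comm` utilities.

```shell
#!/usr/bin/env bash
# Print the names that appear in only one of two saved listings.
# changed_links is a hypothetical helper, not part of udev or systemd.
changed_links() {
    # comm -3 suppresses lines common to both inputs; inputs must be sorted.
    # Lines found only in the second file are printed with a leading tab.
    comm -3 <(sort "$1") <(sort "$2")
}

# Usage (file names are placeholders):
#   ls -1 /dev/disk/by-path/*part9 > before.txt
#   <reboot>
#   ls -1 /dev/disk/by-path/*part9 > after.txt
#   changed_links before.txt after.txt
```

Applied to the two listings above, this should print eight lines: the four old names and the four names that replaced them.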

@poettering
Member

It seems that your system renumbered the PCI bus. There's nothing we can do about that. It's really a bug in your firmware: the reported stable ids are not as stable as they should be.

We can't make something stable that breaks the expectations on stability of "stable" identifiers reported by firmware.

Sorry, but this is not actionable for us. Please contact your system vendor about this instead. Or use better identifiers, e.g. the /dev/disk/by-id/nvme-eui* links.

Sorry if that's disappointing.
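
For reference, a minimal sketch of listing the by-id links mentioned above together with the device nodes they currently resolve to. `show_links` is a hypothetical helper name, and the sketch assumes a `readlink` that supports `-f` (as in GNU coreutils). The directory is a parameter only so the function can be pointed at another location.

```shell
#!/usr/bin/env bash
# Print each nvme-eui* symlink in a directory with the node it points to.
# show_links is a hypothetical helper, not part of udev or systemd.
show_links() {
    local dir=${1:-/dev/disk/by-id}
    local link
    for link in "$dir"/nvme-eui*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
    done
}

# Usage:
#   show_links                  # defaults to /dev/disk/by-id
#   show_links /some/other/dir  # placeholder path
```

These links are derived from an identifier reported by the drive itself rather than from its PCI address, so they should not move when the bus is renumbered.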

@estan
Author

estan commented Mar 9, 2022

No worries, I was actually unsure where the bug was - udev, kernel, firmware, Supermicro. But you seem confident that udev just works with what it is given (nothing fancy going on), so I'll turn elsewhere :) Sorry for the noise!

@estan
Author

estan commented Aug 11, 2022

Just to follow up: Supermicro acknowledged there was a problem with the backplane and CPLD firmware in the server which causes the enumeration to be nondeterministic. We did an RMA and got a new server with newer CPLD firmware and a rev 2.0 of the backplane, which works properly.
