Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cope with block devices with no MAJOR in udev #86

Merged
merged 4 commits into from Apr 11, 2020

Conversation

mwhudson
Copy link
Collaborator

@mwhudson mwhudson commented Apr 7, 2020

https://bugs.launchpad.net/subiquity/+bug/1868109 reports a crash
stemming from a block device that has no MAJOR in udev. I'm not sure
what that represents, but crashing probably isn't the correct response.
Ignore such devices instead.

https://bugs.launchpad.net/subiquity/+bug/1868109 reports a crash
stemming from a block device that has no MAJOR in udev. I'm not sure
what that represents, but crashing probably isn't the correct response.
Ignore such devices instead.
@@ -34,7 +34,7 @@ def probe(context=None):
for device in context.list_devices(subsystem='block'):
# Ignore block major=1 (ramdisk) and major=7 (loopback)
# these won't ever be used in recreating storage on target systems.
if device['MAJOR'] not in ["1", "7"]:
if device.get('MAJOR') not in ["1", "7", None]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be clearer if we spelled out what we're doing here; these are two separate checks and I'd prefer if they weren't conflated:

if 'MAJOR' in device and device['MAJOR'] not in ["1", "7"]:

Should we be updating the comment above?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI. The "device" that triggered this bug, seems to be the one below (extract from udevadm info -e):

 P: /devices/pci0000:17/0000:17:00.0/0000:18:00.0/nvme/nvme0/nvme0c0n1
 L: 0
 E: DEVPATH=/devices/pci0000:17/0000:17:00.0/0000:18:00.0/nvme/nvme0/nvme0c0n1
 E: SUBSYSTEM=block
 E: DEVTYPE=disk
 E: USEC_INITIALIZED=762755939
 E: MPATH_SBIN_PATH=/sbin
 E: DM_MULTIPATH_DEVICE_PATH=0
 E: ID_SERIAL_SHORT=S4CCNE0M300015
 E: ID_WWN=eui.344343304d3000150025384500000004
 E: ID_MODEL=SAMSUNG MZPLL3T2HAJQ-00005
 E: ID_REVISION=GPJA0B3Q
 E: ID_SERIAL=SAMSUNG MZPLL3T2HAJQ-00005_S4CCNE0M300015
 E: ID_PATH=pci-0000:18:00.0-nvme-1
 E: ID_PATH_TAG=pci-0000_18_00_0-nvme-1
 E: TAGS=:systemd:

@xnox
Copy link
Contributor

xnox commented Apr 7, 2020

I somehow thing this might be a kernel bug

on power system

pprint.pprint(list(dev.properties.items()))
[('DEVPATH',
  '/devices/pci0030:00/0030:00:00.0/0030:01:00.0/nvme/nvme0/nvme0c0n1'),
 ('DEVTYPE', 'disk'),
 ('DM_MULTIPATH_DEVICE_PATH', '0'),
 ('ID_MODEL', 'SAMSUNG MZPLL1T6HEHP-00003'),
 ('ID_PATH', 'pci-0030:01:00.0-nvme-1'),
 ('ID_PATH_TAG', 'pci-0030_01_00_0-nvme-1'),
 ('ID_REVISION', 'GPNA5B3Q'),
 ('ID_SERIAL', 'SAMSUNG MZPLL1T6HEHP-00003_S3HBNA0JA00255'),
 ('ID_SERIAL_SHORT', 'S3HBNA0JA00255'),
 ('ID_WWN', 'eui.334842304aa002550025384100000004'),
 ('MPATH_SBIN_PATH', '/sbin'),
 ('SUBSYSTEM', 'block'),
 ('TAGS', ':systemd:'),
 ('USEC_INITIALIZED', '51204511')]

But amd64

list(dev.properties.items())                                                                                                                                                                              
Out[17]: 
[('DEVLINKS',
  '/dev/disk/by-path/pci-0000:04:00.0-nvme-1 /dev/disk/by-id/nvme-eui.00000000000000006479a71d90513837 /dev/disk/by-id/nvme-Sabrent_27FF079419DE00131387'),
 ('DEVNAME', '/dev/nvme0n1'),
 ('DEVPATH',
  '/devices/pci0000:00/0000:00:1d.0/0000:04:00.0/nvme/nvme0/nvme0n1'),
 ('DEVTYPE', 'disk'),
 ('DM_MULTIPATH_DEVICE_PATH', '0'),
 ('ID_MODEL', 'Sabrent'),
 ('ID_PART_TABLE_TYPE', 'gpt'),
 ('ID_PART_TABLE_UUID', '006790e4-480b-4c80-bdf8-684a4a3da748'),
 ('ID_PATH', 'pci-0000:04:00.0-nvme-1'),
 ('ID_PATH_TAG', 'pci-0000_04_00_0-nvme-1'),
 ('ID_REVISION', 'ECFM12.2'),
 ('ID_SERIAL', 'Sabrent_27FF079419DE00131387'),
 ('ID_SERIAL_SHORT', '27FF079419DE00131387'),
 ('ID_WWN', 'eui.00000000000000006479a71d90513837'),
 ('MAJOR', '259'),
 ('MINOR', '0'),
 ('MPATH_SBIN_PATH', '/sbin'),
 ('SUBSYSTEM', 'block'),
 ('TAGS', ':systemd:'),
 ('USEC_INITIALIZED', '1450288')]

Not sure why my nvme drive is called nvme0n1 on amd64, but nvme0c0n1 on ppc, will check udev rules. But it seems to me, it is buggy to have a device with a major.

also nvme0c0n1 does not exist on disk....

# stat /dev/nvme0*
  File: /dev/nvme0
  Size: 0         	Blocks: 0          IO Block: 65536  character special file
Device: 6h/6d	Inode: 192         Links: 1     Device type: f2,0
Access: (0600/crw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-04-07 15:18:28.473229295 +0000
Modify: 2020-04-07 15:18:28.473229295 +0000
Change: 2020-04-07 15:18:28.473229295 +0000
 Birth: -
  File: /dev/nvme0n1
  Size: 0         	Blocks: 0          IO Block: 65536  block special file
Device: 6h/6d	Inode: 498         Links: 1     Device type: 103,1
Access: (0660/brw-rw----)  Uid: (    0/    root)   Gid: (    6/    disk)
Access: 2020-04-07 15:18:28.733214919 +0000
Modify: 2020-04-07 15:18:28.733214919 +0000
Change: 2020-04-07 15:18:28.733214919 +0000
 Birth: -

Things look very odd.

@xnox
Copy link
Contributor

xnox commented Apr 7, 2020

/*
 * If multipathing is enabled we need to always use the subsystem instance
 * number for numbering our devices to avoid conflicts between subsystems that
 * have multiple controllers and thus use the multipath-aware subsystem node
 * and those that have a single controller and use the controller node
 * directly.
 */
void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
                        struct nvme_ctrl *ctrl, int *flags)
{
        if (!multipath) {
                sprintf(disk_name, "nvme%dn%d", ctrl->instance, ns->head->instance);
        } else if (ns->head->disk) {
                sprintf(disk_name, "nvme%dc%dn%d", ctrl->subsys->instance,
                                ctrl->instance, ns->head->instance);
                *flags = GENHD_FL_HIDDEN;
        } else {
                sprintf(disk_name, "nvme%dn%d", ctrl->subsys->instance,
                                ns->head->instance);
        }
}

Ok so it has multipath enabled.

@xnox
Copy link
Contributor

xnox commented Apr 7, 2020

pprint.pprint(nvmes)
[Device('/sys/devices/pci0030:00/0030:00:00.0/0030:01:00.0/nvme/nvme0'),
 Device('/sys/devices/pci0030:00/0030:00:00.0/0030:01:00.0/nvme/nvme0/nvme0c0n1'),
 Device('/sys/devices/virtual/nvme-subsystem/nvme-subsys0'),
 Device('/sys/devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1'),
 Device('/sys/devices/virtual/workqueue/nvme-delete-wq'),
 Device('/sys/devices/virtual/workqueue/nvme-reset-wq'),
 Device('/sys/devices/virtual/workqueue/nvme-wq')]

It is as if, we only see multipath nvme device, but not the virtual nvme subsystem block namespace (nvme0n1).

@xnox
Copy link
Contributor

xnox commented Apr 7, 2020

>>> print(dev)
Device('/sys/devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1')
>>> dev['MAJOR']
'259'

So as if we need to request pyudev to give us virtual nvme drives too....

@xnox
Copy link
Contributor

xnox commented Apr 7, 2020

# multipath -ll
mpathb (0x5000c5008cef1633) dm-3 ATA,ST1000NX0313
size=932G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 1:0:0:0 sdb       8:16 active ready running
mpatha (0x5000c5008cef1495) dm-0 ATA,ST1000NX0313
size=932G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 0:0:0:0 sda       8:0  active ready running
eui.334842304aa002550025384100000004 [nvme]:nvme0n1 NVMe,SAMSUNG MZPLL1T6HEHP-00003,GPNA5B3Q
size=3125627568 features='n/a' hwhandler='n/a' wp=rw
`-+- policy='n/a' prio=n/a status=n/a
  `- 0:0:1   nvme0c0n1 0:0  n/a   n/a   live   

Managed to configure multipath on that nmve, but it seems like it is single path only. I do wonder if we need to support assembling nvme multipath devices somehow.

Copy link
Contributor

@xnox xnox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A subsystem='block' device nvme0c0n1 without a MAJOR number in this case is multipath control node, and is not a block device at all. It does not have /dev/nvme* node, and it does not have DEVLINKS. And is not something we can install anything, and thus it should be ingored, and not returned by the function at all.

Given above processing by multipathd maybe nvme0c0n1 is a single path, to nvme0n blockdevice. I.e. we should only care about nvme0n1.

if 'MAJOR' not in device:
    continue

should be added before checking that MAJOR is not in 1,7.

@mwhudson
Copy link
Collaborator Author

mwhudson commented Apr 7, 2020

OK I updated to be more clear and less cute.

@raharper
Copy link
Contributor

raharper commented Apr 7, 2020

A subsystem='block' device nvme0c0n1 without a MAJOR number in this case is multipath control node, and is not a block device at all.

That feels like a bug. I'm not a fan of this "partially hiding stuff" in the kernel. Why isn't the control node a char device since you can't actually open or interact with it as a block device?

Previous discuss a while back on the curtin side for more background.

https://bugs.launchpad.net/curtin/+bug/1830913

Comment on lines 34 to 40
for device in context.list_devices(subsystem='block'):
if "MAJOR" not in device:
# Shouldn't happen but apparently does! (LP: #1868109)
continue
# Ignore block major=1 (ramdisk) and major=7 (loopback)
# these won't ever be used in recreating storage on target systems.
if device['MAJOR'] not in ["1", "7"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should have filtering helper which takes a context and yields devices if they pass these two if checks; then both files can call the same helper to enumerate the devices they need to operate on?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That feels like a bug. I'm not a fan of this "partially hiding stuff" in the kernel. Why isn't the control node a char device since you can't actually open or interact with it as a block device?

Agreed but I'm even less of a fan of block probe failures :)

Linux block devices, not even once.

Previous discuss a while back on the curtin side for more background.

https://bugs.launchpad.net/curtin/+bug/1830913

I thought I'd seen some discussion of this sort of thing before.

Copy link
Contributor

@raharper raharper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unittest needs fixing:

diff --git a/probert/tests/test_lvm.py b/probert/tests/test_lvm.py
index 548bd74..c14f375 100644
--- a/probert/tests/test_lvm.py
+++ b/probert/tests/test_lvm.py
@@ -264,12 +264,13 @@ class TestLvm(testtools.TestCase):
     @mock.patch('probert.lvm.read_sys_block_size_bytes')
     @mock.patch('probert.lvm.activate_volgroups')
     @mock.patch('probert.lvm.lvm_scan')
-    @mock.patch('probert.lvm.pyudev.Context.list_devices')
+    @mock.patch('probert.lvm.sane_block_devices')
     @mock.patch('probert.lvm.probe_vgs_report')
-    def test_probe(self, m_vgs, m_pyudev, m_scan, m_activate, m_size, m_run):
+    def test_probe(self, m_vgs, m_blockdevs, m_scan, m_activate, m_size,
+                   m_run):
         size = 1000
         m_size.return_value = size
-        m_pyudev.return_value = CONTEXT
+        m_blockdevs.return_value = CONTEXT
         m_vgs.return_value = VGS_REPORT

         expected_result = {
@@ -297,13 +298,13 @@ class TestLvm(testtools.TestCase):
     @mock.patch('probert.lvm.read_sys_block_size_bytes')
     @mock.patch('probert.lvm.activate_volgroups')
     @mock.patch('probert.lvm.lvm_scan')
-    @mock.patch('probert.lvm.pyudev.Context.list_devices')
+    @mock.patch('probert.lvm.sane_block_devices')
     @mock.patch('probert.lvm.probe_vgs_report')
-    def test_probe_skip_dupes(self, m_vgs, m_pyudev, m_scan, m_activate,
+    def test_probe_skip_dupes(self, m_vgs, m_blockdevs, m_scan, m_activate,
                               m_size, m_run):
         size = 1000
         m_size.return_value = size
-        m_pyudev.return_value = CONTEXT_DUPES
+        m_blockdevs.return_value = CONTEXT_DUPES
         m_vgs.return_value = VGS_REPORT_DUPES

         expected_result = {

@mwhudson mwhudson merged commit 4793e16 into canonical:master Apr 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants