multipathd starts after filesystem mounted #35

Closed
optix2000 opened this issue Jul 4, 2022 · 25 comments

Comments

@optix2000

Version: multipath-tools v0.8.5 (11/09, 2020)
OS: Debian Bullseye

I'm getting No SAS end device for 'end_device-0:0' when trying to set up multipath for drives connected to a SAS expander.

Possibly related: multipath -u /dev/sd* always outputs DM_MULTIPATH_DEVICE_PATH="0", and the devices always get mounted before multipath gets to them, even though the multipath rules sort before the bcache rules in udev (60-multipath.rules < 69-bcache.rules).
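
For reference, udev reads rules files in lexical order of their file names, regardless of the directory they live in, which is why 60-multipath.rules is expected to be evaluated before 69-bcache.rules. A minimal sketch of that ordering follows; the directory paths are only illustrative and `rules_order` is a hypothetical helper name:

```sh
#!/bin/sh
# rules_order is a hypothetical helper: it prints the given rules files in the
# order udev would read them, i.e. sorted by file name, ignoring the directory.
rules_order() {
    for f in "$@"; do
        basename "$f"
    done | LC_ALL=C sort
}

# Illustrative paths; the actual directories may differ on a given system.
rules_order /usr/lib/udev/rules.d/69-bcache.rules /usr/lib/udev/rules.d/60-multipath.rules
```

Note that this ordering only applies to the rules files that are actually present in the environment where udev runs.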

Not sure what to look at from here. Any advice would be appreciated!

Thanks!

Jul 04 03:05:31 systemd[1]: Starting Device-Mapper Multipath Device Controller...
Jul 04 03:05:31 multipathd[1389]: --------start up--------
Jul 04 03:05:31 multipathd[1389]: read /etc/multipath.conf
Jul 04 03:05:31 multipathd[1389]: failed to increase buffer size
Jul 04 03:05:31 multipathd[1389]: path checkers start up
Jul 04 03:05:31 multipathd[1389]: sda: No SAS end device for 'end_device-0:0'
Jul 04 03:05:31 multipathd[1389]: sdj: No SAS end device for 'end_device-0:1'
Jul 04 03:05:31 multipathd[1389]: 35000cca03e0a22e8: addmap [0 5860533168 multipath 0 0 2 1 service-time 0 1 1 8:0 1 service-time 0 1 1 8:144 1]
Jul 04 03:05:31 multipathd[1389]: libdevmapper: ioctl/libdm-iface.c(1927): device-mapper: reload ioctl on 35000cca03e0a22e8 (253:0) failed: Device or resource busy
Jul 04 03:05:31 multipathd[1389]: dm_addmap: libdm task=0 error: Success
Jul 04 03:05:31 multipathd[1389]: 35000cca03e0a22e8: ignoring map
Jul 04 03:05:31 multipathd[1389]: sdb: No SAS end device for 'end_device-0:0'
Jul 04 03:05:31 multipathd[1389]: sdh: No SAS end device for 'end_device-0:1'
Jul 04 03:05:31 multipathd[1389]: 35000cca01aa99ec8: addmap [0 5860533168 multipath 0 0 2 1 service-time 0 1 1 8:16 1 service-time 0 1 1 8:112 1]
Jul 04 03:05:31 multipathd[1389]: libdevmapper: ioctl/libdm-iface.c(1927): device-mapper: reload ioctl on 35000cca01aa99ec8 (253:0) failed: Device or resource busy
Jul 04 03:05:31 multipathd[1389]: dm_addmap: libdm task=0 error: Success
Jul 04 03:05:31 multipathd[1389]: 35000cca01aa99ec8: ignoring map
...
% sudo udevadm info /sys/block/sda
P: /devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:0/expander-0:0/port-0:0:0/end_device-0:0:0/target0:0:0/0:0:0:0/block/sda
N: sda
L: 0
S: disk/by-id/scsi-SHITACHI_HUS723030ALS640_YVG5KU7K
S: disk/by-id/scsi-35000cca03e0a22e8
S: disk/by-path/pci-0000:02:00.0-sas-exp0x5006048009de2fbf-phy13-lun-0
S: disk/by-id/wwn-0x5000cca03e0a22e8
S: disk/by-uuid/dc660e4c-2cd8-442b-a7ec-5712a20d36ab
E: DEVPATH=/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:0/expander-0:0/port-0:0:0/end_device-0:0:0/target0:0:0/0:0:0:0/block/sda
E: DEVNAME=/dev/sda
E: DEVTYPE=disk
E: MAJOR=8
E: MINOR=0
E: SUBSYSTEM=block
E: USEC_INITIALIZED=10212315
E: SCSI_TPGS=0
E: SCSI_TYPE=disk
E: SCSI_VENDOR=HITACHI
E: SCSI_VENDOR_ENC=HITACHI\x20
E: SCSI_MODEL=HUS723030ALS640
E: SCSI_MODEL_ENC=HUS723030ALS640\x20
E: SCSI_REVISION=M2D4
E: ID_SCSI=1
E: ID_SCSI_INQUIRY=1
E: SCSI_IDENT_SERIAL=YVG5KU7K
E: SCSI_IDENT_LUN_NAA_REG=5000cca03e0a22e8
E: SCSI_IDENT_PORT_NAA_REG=5000cca03e0a22ea
E: SCSI_IDENT_PORT_RELATIVE=2
E: SCSI_IDENT_TARGET_NAA_REG=5000cca03e0a22eb
E: SCSI_IDENT_TARGET_NAME=naa.5000CCA03E0A22EB
E: ID_VENDOR=HITACHI
E: ID_VENDOR_ENC=HITACHI\x20
E: ID_MODEL=HUS723030ALS640
E: ID_MODEL_ENC=HUS723030ALS640\x20
E: ID_REVISION=M2D4
E: ID_TYPE=disk
E: ID_WWN_WITH_EXTENSION=0x5000cca03e0a22e8
E: ID_WWN=0x5000cca03e0a22e8
E: ID_BUS=scsi
E: ID_SERIAL=35000cca03e0a22e8
E: ID_SERIAL_SHORT=5000cca03e0a22e8
E: ID_SCSI_SERIAL=YVG5KU7K
E: MPATH_SBIN_PATH=/sbin
E: DM_MULTIPATH_DEVICE_PATH=0
E: ID_PATH=pci-0000:02:00.0-sas-exp0x5006048009de2fbf-phy13-lun-0
E: ID_PATH_TAG=pci-0000_02_00_0-sas-exp0x5006048009de2fbf-phy13-lun-0
E: ID_FS_UUID=dc660e4c-2cd8-442b-a7ec-5712a20d36ab
E: ID_FS_UUID_ENC=dc660e4c-2cd8-442b-a7ec-5712a20d36ab
E: ID_FS_TYPE=bcache
E: ID_FS_USAGE=other
E: DEVLINKS=/dev/disk/by-id/scsi-SHITACHI_HUS723030ALS640_YVG5KU7K /dev/disk/by-id/scsi-35000cca03e0a22e8 /dev/disk/by-path/pci-0000:02:00.0-sas-exp0x5006048009de2fbf-phy13-lun-0 /dev/disk/by-id/wwn-0x5000cca03e0a22e8 /dev/disk/by-uuid/dc660e4c-2cd8-442b-a7ec-5712a20d36ab
E: TAGS=:systemd:
E: CURRENT_TAGS=:systemd:
@mwilck
Contributor

mwilck commented Jul 4, 2022

Please test with the latest code. We had some fixes for SAS expanders in 0.8.6.

@optix2000
Author

I backported 0.8.8, which gets rid of the No SAS end device error, but multipath still isn't set up before the filesystem is mounted.

% sudo multipath -u /dev/sda
DM_MULTIPATH_DEVICE_PATH="0"

I'll try compiling 0.9.0 this week and see if it changes anything.

optix2000 changed the title from "No SAS end device for 'end_device-0:0'" to "multipathd starts after filesystem mounted" on Jul 4, 2022
@mwilck
Contributor

mwilck commented Jul 4, 2022

Please set verbosity 3 in multipath.conf.

@optix2000
Author

mpath.log

@mwilck
Contributor

mwilck commented Jul 5, 2022

Please confirm which version you were using.

@mwilck
Contributor

mwilck commented Jul 5, 2022

Jul 04 14:16:48 multipathd[919]: open '/sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:1/expander-0:1/port-0:1:3/end_device-0:1:3/target0:0:8/0:0:8:0/inquiry'
Jul 04 14:16:48 multipathd[919]: overflow while reading from /sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:1/expander-0:1/port-0:1:3/end_device-0:1:3/target0:0:8/0:0:8:0/inquiry

Can you show me the content of this sysfs file?

@mwilck
Contributor

mwilck commented Jul 5, 2022

We also need the udev logs. The actual problem occurs in the udev rules, when evaluating "multipath -u". After setting verbosity 3, you should already have verbose logs from multipath in your journal; please provide the journal (journalctl -b -o short-monotonic). You may also want to enable udev logging (udev.log-priority=debug).

@mwilck
Contributor

mwilck commented Jul 5, 2022

Jul 04 14:16:48 multipathd[919]: overflow while reading from /sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:1/expander-0:1/port-0:1:3/end_device-0:1:3/target0:0:8/0:0:8:0/inquiry

Hm, this would prevent ALUA support from being detected properly, but it shouldn't cause the path to be misdetected.
We need the udev logs. What's your find_multipaths setting? (try multipath -T)

@optix2000
Author

> Please confirm which version you were using.

0.8.8

> Jul 04 14:16:48 multipathd[919]: open '/sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:1/expander-0:1/port-0:1:3/end_device-0:1:3/target0:0:8/0:0:8:0/inquiry'
> Jul 04 14:16:48 multipathd[919]: overflow while reading from /sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:1/expander-0:1/port-0:1:3/end_device-0:1:3/target0:0:8/0:0:8:0/inquiry
>
> Can you show me the content of this sysfs file?

% cat /sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:1/expander-0:1/port-0:1:3/end_device-0:1:3/target0:0:8/0:0:8:0/inquiry
�����HITACHI HUS723030ALS640 M2D4YVG5KU7KCopyright Hitachi Corp. 2003. All rights reserved.

> We also need the udev logs. The actual problem occurs in the udev rules, when evaluating "multipath -u". After setting verbosity 3, you should already have verbose logs from multipath in your journal; please provide the journal (journalctl -b -o short-monotonic). You may also want to enable udev logging (udev.log-priority=debug).

The logs from earlier are from journalctl -b -u multipathd after a reboot.

Let me get you full logs with udev logs in a bit.

@optix2000
Author

optix2000 commented Jul 5, 2022

Full boot log with udev debugging

journalctl -b -o short-monotonic

@optix2000
Author

optix2000 commented Jul 5, 2022

> Jul 04 14:16:48 multipathd[919]: overflow while reading from /sys/devices/pci0000:00/0000:00:02.0/0000:02:00.0/host0/port-0:1/expander-0:1/port-0:1:3/end_device-0:1:3/target0:0:8/0:0:8:0/inquiry
>
> Hm, this would prevent ALUA support from being detected properly, but it shouldn't cause the path to be misdetected. We need the udev logs. What's your find_multipaths setting? (try multipath -T)

% sudo multipath -T | grep find_multipaths
        find_multipaths "strict"

@mwilck
Contributor

mwilck commented Jul 5, 2022

The problem is bcache. It seems to set up devices early on without checking for multipath. This seems to happen before user space (systemd) is even started.

@mwilck
Contributor

mwilck commented Jul 5, 2022

Do you have bcache compiled into the kernel image? Probing bcache devices before user space starts up (systemd, udev, multipathd) can't work with multipath.

@optix2000
Author

> Do you have bcache compiled into the kernel image? Probing bcache devices before user space starts up (systemd, udev, multipathd) can't work with multipath.

% cat /lib/modules/$(uname -r)/modules.builtin | grep bcache
%

I don't believe so.

> The problem is bcache. It seems to set up devices early on without checking for multipath. This seems to happen before user space (systemd) is even started.

That makes sense. I dug into it a bit more. I'm not very familiar with how udev works in the initrd, but it looks like the multipath udev rules are missing from the initrd while the bcache rules are present.

The weird part is I don't see any udev logs that early, so I'm not sure if udev is loading the module or something else is.

% lsinitramfs /boot/initrd.img-5.18.0-0.bpo.1-amd64 | grep -e 'multipath' -e 'bcache'
usr/lib/modules/5.18.0-0.bpo.1-amd64/kernel/drivers/md/bcache
usr/lib/modules/5.18.0-0.bpo.1-amd64/kernel/drivers/md/bcache/bcache.ko
usr/lib/modules/5.18.0-0.bpo.1-amd64/kernel/fs/mbcache.ko
usr/lib/udev/bcache-register
usr/lib/udev/probe-bcache
usr/lib/udev/rules.d/69-bcache.rules
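
The same check can be scripted. A minimal sketch, assuming the listing comes from lsinitramfs as above; `check_initrd_rules` is a hypothetical helper name, and it only looks for udev rules files whose names contain multipath or bcache:

```sh
#!/bin/sh
# check_initrd_rules (hypothetical helper) reads an initramfs file listing on
# stdin and reports, for multipath and bcache, whether a udev rules file with
# that word in its name is part of the listing.
check_initrd_rules() {
    listing=$(cat)
    for name in multipath bcache; do
        if printf '%s\n' "$listing" | grep -q "udev/rules\.d/.*${name}.*\.rules\$"; then
            echo "$name: rules present"
        else
            echo "$name: rules missing"
        fi
    done
}

# Usage:
#   lsinitramfs /boot/initrd.img-"$(uname -r)" | check_initrd_rules
```

For the listing above, this would report the multipath rules as missing and the bcache rules as present.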

@optix2000
Author

So I tried removing the bcache udev rule in the initrd and multipath now works as expected.

However it's kind of hacky to have to rewrite the initrd by hand every time I do a kernel update. Is there a standard way to load multipath in the initrd?

@mwilck
Contributor

mwilck commented Jul 6, 2022

Ah, this is Debian. I'm not too familiar with how initramfs works on Debian (I'm a SUSE guy). I suppose you need to seek help in the distro forums. By all means, you should make sure that multipath support is included in the initrd. It's certainly possible to configure this, but I am not familiar enough with the Debian environment to tell you how. It seems that (on your system at least) systemd isn't used in the initramfs. So there must be some other logic to make sure modules are loaded in the correct order.

> However it's kind of hacky to have to rewrite the initrd by han

Usually, the tools to create the initramfs have some logic to determine the modules needed to set up the storage stack. The logic seems to have failed for your multipath + bcache case.

@xosevp
Contributor

xosevp commented Jul 6, 2022

Could you share your /etc/multipath.conf file?

@xosevp
Contributor

xosevp commented Jul 6, 2022

> So I tried removing the bcache udev rule in the initrd and multipath now works as expected.
>
> However it's kind of hacky to have to rewrite the initrd by hand every time I do a kernel update. Is there a standard way to load multipath in the initrd?

Add scsi_dh_alua, scsi_dh_emc, scsi_dh_rdac and dm-multipath, one per line, to /etc/initramfs-tools/modules.
Then run update-initramfs -v -u and reboot.

You can list the contents of the resulting initramfs with lsinitramfs -l.
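
The module-list step could be scripted along these lines. This is a sketch only: `add_modules` is a hypothetical helper, the module names and file path are the ones given above, and it needs root privileges to modify the real file. It only covers the module list, not anything else that may be needed in the initramfs.

```sh
#!/bin/sh
# add_modules (hypothetical helper) appends each module name to the given
# file, one per line, skipping names that are already listed there.
add_modules() {
    file=$1
    shift
    for mod in "$@"; do
        grep -qxF -- "$mod" "$file" 2>/dev/null || printf '%s\n' "$mod" >> "$file"
    done
}

# Usage (as root), followed by rebuilding the initramfs:
#   add_modules /etc/initramfs-tools/modules scsi_dh_alua scsi_dh_emc scsi_dh_rdac dm-multipath
#   update-initramfs -v -u
```

Because existing entries are skipped, running the helper more than once does not duplicate lines in the file.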

@mwilck
Contributor

mwilck commented Jul 6, 2022

That will load the kernel modules, but won't start multipathd in the initramfs, which would also be required in order to avoid the issue the OP described.

Or am I overlooking anything?

@optix2000
Author

optix2000 commented Jul 7, 2022

> Ah, this is Debian. I'm not too familiar with how initramfs works on Debian (I'm a SUSE guy). I suppose you need to seek help in the distro forums. By all means, you should make sure that multipath support is included in the initrd. It's certainly possible to configure this, but I am not familiar enough with the Debian environment to tell you how. It seems that (on your system at least) systemd isn't used in the initramfs. So there must be some other logic to make sure modules are loaded in the correct order.
>
> > However it's kind of hacky to have to rewrite the initrd by han
>
> Usually, the tools to create the initramfs have some logic to determine the modules needed to set up the storage stack. The logic seems to have failed for your multipath + bcache case.

Ah, that makes sense. For some reason I assumed initramfs-tools wasn't Debian only.
Is there some documentation or example of what's necessary for multipath to run? Perhaps in SUSE? I can write up a hook in Debian to copy the necessary files.

That being said, is it even possible to run multipathd as a one-shot to set up multipath in the initrd and then have the systemd-launched multipathd take over management? (e.g. is there some state that needs to be written while the fs is read-only?)

> Could you share your /etc/multipath.conf file?

There's no config. Everything is using the default built-ins.

@mwilck
Contributor

mwilck commented Jul 7, 2022

> Ah, that makes sense. For some reason I assumed initramfs-tools wasn't Debian only. Is there some documentation or example of what's necessary for multipath to run? Perhaps in SUSE? I can write up a hook in Debian to copy the necessary files.

SUSE and Red Hat use dracut. That's what I'm familiar with. But I have no experience with bcache and multipath under dracut, either.

> That being said, is it even possible to run multipathd as a one-shot to set up multipath in the initrd and then have the systemd-launched multipathd take over management? (e.g. is there some state that needs to be written while the fs is read-only?)

That sounds pretty backward to me. It's what we used to do ~15 years ago. The problem is: when do you call multipath? It's usually called from udev rules when the devices are probed, following some logic that decides which devices should be part of multipath and which shouldn't. Space and time are insufficient here to explain all the details. These days, device activation should be handled by systemd and udev. But again, I really don't know how this is supposed to work under Debian.

@xosevp
Contributor

xosevp commented Jul 7, 2022

> That will load the kernel modules, but won't start multipathd in the initramfs, which would also be required in order to avoid the issue the OP described.
>
> Or am I overlooking anything?

Initramfs generation scripts "cleverly" leave out multipath files and modules unless there is at least one active multipath device in the system, so they have to be forced to include them. But you are right, and this is only valid for dracut.
In Debian, if the system boots (root) from a multipath device, multipath-tools-boot must be installed. After that, the initramfs must be recreated.

PS: multipath-tools-boot should be installed ONLY if the system is booting (root) from a multipath device, otherwise it MUST be removed.

@xosevp
Contributor

xosevp commented Jul 7, 2022

> Ah, that makes sense. For some reason I assumed initramfs-tools wasn't Debian only. Is there some documentation or example of what's necessary for multipath to run? Perhaps in SUSE? I can write up a hook in Debian to copy the necessary files.

The Debian installer should do this automatically.
Debian docs are inside their package: https://salsa.debian.org/linux-blocks-team/multipath-tools/-/blob/master/debian/multipath-tools.README.Debian

You can test more recent live distros:
https://eu.edge.kernel.org/fedora/releases/36/Server/x86_64/iso/Fedora-Server-dvd-x86_64-36-1.5.iso
https://download.opensuse.org/distribution/leap/15.4/live/openSUSE-Leap-15.4-GNOME-Live-x86_64-Media.iso

I do not know if "openSUSE Leap" runs multipathd by default, but "Fedora Server" does.

@mwilck
Contributor

mwilck commented Jul 7, 2022

> I do not know if "openSUSE Leap" runs multipathd by default, but "Fedora Server" does.

multipath-tools is installed by default, but the service is only active after installation if multipath devices were present during installation. In this case the user will see a prompt asking whether multipath should be enabled, and if she chooses "yes", the service will be active after installation in both the initramfs and the installed system. But this is getting OT, I fear.

@optix2000
Author

I think it's pretty clear this isn't an issue with multipath-tools.

Hopefully this helps others who are running into the same issue.

Thanks for diving into this!
