OpenStack Cinder can report an incorrect device path for volume attachments #33128
Comments
|
What the above shows is:
As the Cinder/Nova teams haven't fixed this long-standing issue in many years, and defer to libvirt (which provides no guarantee, and considers the device name no more than an "ordering hint" [1]), we should work around the issue by inspecting the drive's serial number, as represented in /dev/disk/by-id. Parts of Kubernetes already do this [2], while others do not [3]. [1]: http://libvirt.org/formatdomain.html#elementsDisks - "target" section:
|
cc @anguslees @mikedanese @kubernetes/sig-openstack |
From some hunting around, it appears the above virtio-$trunc_uuid strategy might only work on kvm, just in case we mistakenly think this is easy to solve ... |
To add to Angus's point, the general pattern also applies to OpenStack with ESXi; the difference is that you search for "/dev/disk/by-id/wwn-0x{TRUNCATED CINDER UUID}". I suspect this general pattern of truncated serial numbers applies more often than it fails.
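A minimal sketch of the lookup discussed above, assuming the volume ID shows up as a truncated serial under /dev/disk/by-id. The function name `find_device` and the `patterns` parameter are hypothetical. The 20-character cut for virtio reflects the virtio-blk serial length limit; the truncation length for the ESXi `wwn-0x` form is not stated in this thread, so it is left for the caller to supply.

```python
import os


def find_device(volume_id, by_id_dir="/dev/disk/by-id", patterns=(("virtio-", 20),)):
    """Return the real device path for a Cinder volume ID, or None.

    patterns is a sequence of (prefix, length) pairs: the link name tried is
    prefix + the first `length` characters of the volume ID. The default
    covers KVM/virtio only; other hypervisors need their own entry.
    """
    for prefix, length in patterns:
        link = os.path.join(by_id_dir, prefix + volume_id[:length])
        if os.path.islink(link):
            # Resolve the symlink (e.g. ../../vdd) to an absolute device path.
            return os.path.realpath(link)
    return None
```

On a KVM guest this would be called as `find_device(volume_id)`; returning None rather than guessing leaves the fallback decision to the caller.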
|
See issue kubernetes#33128. We can't rely on the device name provided by Cinder, and thus must perform detection based on the drive serial number (i.e. its Cinder ID) on the kubelet itself. This patch reworks the cinder volume attacher (the parts executed in kube-controller-manager) to return the volume ID, rather than the device name as advertised by Cinder. We then rework the cinder volume attacher (this time, the parts executed in kubelet) to accept this ID and call the pre-existing GetDevicePath method, which will perform the discovery correctly. This is a break in the Attacher interface, which explicitly calls for the Attach method to return a device name.
I've put together a patch to our 1.3 branch (linked above) that seems to work reasonably well. It's virtio (aka KVM) only, but the code was already KVM only anyway. I'll add ESXi support separately, as that's a smaller fix to existing code. |
Heh - left out the question. Thoughts on this approach? Should I go ahead and port to master + submit? |
Dumping the results of my research here for posterity:
I don't like any of the options, but I conclude that looking for the two |
Ok - I'll forward port my 1.3 patch to master, add wwn for ESX, and fall back to whatever Cinder tells us the device is. |
@anguslees if you have a Rackspace Cloud account, could you attach a volume, let me know the ID, and show me these?
|
Ubuntu 14.04 (kernel 3.13.0-79-generic) on Rackspace / PVHVM:
% ls -lah /dev/disk/by-id
ls: cannot access /dev/disk/by-id: No such file or directory
% ls -lah /dev/disk/by-uuid
lrwxrwxrwx 1 root root 11 Jun 7 05:24 234299c3-9cad-4a86-9a83-d74600cd8ed1 -> ../../xvdb1
lrwxrwxrwx 1 root root 11 Jun 2 05:24 49265908-09ae-45ed-b585-b856e321e175 -> ../../xvdc1
lrwxrwxrwx 1 root root 11 Jun 2 05:27 96aba36f-f22d-4475-b1fd-5bd700e1c2fb -> ../../xvda1
% ls -lah /dev/disk/by-label
ls: cannot access /dev/disk/by-label: No such file or directory
% sudo lshw -class disk -class storage
*-ide
description: IDE interface
product: 82371SB PIIX3 IDE [Natoma/Triton II]
vendor: Intel Corporation
physical id: 1.1
bus info: pci@0000:00:01.1
version: 00
width: 32 bits
clock: 33MHz
capabilities: ide bus_master
configuration: driver=ata_piix latency=64
resources: irq:0 ioport:1f0(size=8) ioport:3f6 ioport:170(size=8) ioport:376 ioport:c420(size=16)
*-scsi
description: SCSI storage controller
product: Xen Platform Device
vendor: XenSource, Inc.
physical id: 3
bus info: pci@0000:00:03.0
version: 01
width: 32 bits
clock: 33MHz
capabilities: scsi bus_master
configuration: driver=xen-platform-pci latency=0
resources: irq:30 ioport:c000(size=256) memory:f2000000-f2ffffff
% sudo udevadm info -q all -n /dev/xvdb
P: /devices/vbd-832/block/xvdb
N: xvdb
E: DEVNAME=/dev/xvdb
E: DEVPATH=/devices/vbd-832/block/xvdb
E: DEVTYPE=disk
E: ID_PART_TABLE_TYPE=dos
E: MAJOR=202
E: MINOR=16
E: SUBSYSTEM=block
E: USEC_INITIALIZED=1924951422
% sudo udevadm info -q all -n /dev/xvdb1
P: /devices/vbd-832/block/xvdb/xvdb1
N: xvdb1
S: disk/by-uuid/234299c3-9cad-4a86-9a83-d74600cd8ed1
E: DEVLINKS=/dev/disk/by-uuid/234299c3-9cad-4a86-9a83-d74600cd8ed1
E: DEVNAME=/dev/xvdb1
E: DEVPATH=/devices/vbd-832/block/xvdb/xvdb1
E: DEVTYPE=partition
E: ID_FS_TYPE=ext4
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=234299c3-9cad-4a86-9a83-d74600cd8ed1
E: ID_FS_UUID_ENC=234299c3-9cad-4a86-9a83-d74600cd8ed1
E: ID_FS_VERSION=1.0
E: ID_PART_ENTRY_DISK=202:16
E: ID_PART_ENTRY_NUMBER=1
E: ID_PART_ENTRY_OFFSET=63
E: ID_PART_ENTRY_SCHEME=dos
E: ID_PART_ENTRY_SIZE=209715137
E: ID_PART_ENTRY_TYPE=0x83
E: ID_PART_TABLE_TYPE=dos
E: MAJOR=202
E: MINOR=17
E: SUBSYSTEM=block
E: USEC_INITIALIZED=1983969339 |
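The udevadm output above for /dev/xvdb lists no serial property, which is consistent with /dev/disk/by-id being absent on this Xen guest, since udev typically builds by-id links from serial properties. A small sketch of how such output could be checked programmatically; `parse_udev_properties` is a hypothetical helper, and the sample is an abbreviated copy of the output above.

```python
def parse_udev_properties(output):
    """Collect the 'E: KEY=VALUE' lines of `udevadm info -q all` into a dict."""
    props = {}
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("E: ") and "=" in line:
            key, _, value = line[3:].partition("=")
            props[key] = value
    return props


# Abbreviated from the /dev/xvdb output in this comment.
SAMPLE = """P: /devices/vbd-832/block/xvdb
N: xvdb
E: DEVNAME=/dev/xvdb
E: DEVTYPE=disk
E: MAJOR=202
E: MINOR=16
E: SUBSYSTEM=block
"""

props = parse_udev_properties(SAMPLE)
print("ID_SERIAL" in props)  # False
```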
This has been unused since 542f2dc, and relies on deviceName, which can no longer be relied upon (see issue kubernetes#33128). This needs to be removed now, as part of kubernetes#33128, because the code can't be updated to attempt device detection and fall back to the Cinder-provided deviceName: detection "fails" when the device is gone, and if Cinder has reported a deviceName that another volume is using in reality, then this will block forever (or until the other, unrelated, volume has been detached).
See issue kubernetes#33128. We can't rely on the device name provided by Cinder, and thus must perform detection based on the drive serial number (i.e. its Cinder ID) on the kubelet itself. This patch reworks the cinder volume attacher to ignore the supplied deviceName, and instead defer to the pre-existing GetDevicePath method to discover the device path based on its serial number and /dev/disk/by-id mapping. This new behavior is controlled by a config option, as falling back to the Cinder value when we can't discover a device would risk devices not showing up, falling back to Cinder's guess, and detecting the wrong disk as attached.
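The trade-off described in this commit message can be sketched as follows. This is an illustration only, not the patch itself: `resolve_device_path` and the `trust_cinder_path` flag are hypothetical names standing in for the config option mentioned above.

```python
def resolve_device_path(discovered, cinder_reported, trust_cinder_path=False):
    """Choose which device path to use for an attached volume.

    discovered: path found via /dev/disk/by-id, or None if nothing matched.
    cinder_reported: device name returned by the Cinder API (may be wrong).
    trust_cinder_path: hypothetical flag; when False we never use Cinder's guess.
    """
    if discovered is not None:
        # Serial-based discovery succeeded, so Cinder's value is irrelevant.
        return discovered
    if trust_cinder_path:
        # Risky: the reported name may belong to a different volume.
        return cinder_reported
    raise LookupError("device not discovered and Cinder's device name is not trusted")
```

Defaulting the flag to False in this sketch means a missing by-id link surfaces as an error instead of silently selecting a possibly wrong disk.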
FYI - I've reached out to someone on the Rackspace Block Storage team to try to get some definitive answers on whether and how this issue affects Rackspace. |
Automatic merge from submit-queue. Don't rely on device name provided by Cinder. See issue #33128. We can't rely on the device name provided by Cinder, and thus must perform detection based on the drive serial number (i.e. its Cinder ID) on the kubelet itself. This patch reworks the cinder volume attacher to ignore the supplied deviceName, and instead defer to the pre-existing GetDevicePath method to discover the device path based on its serial number and /dev/disk/by-id mapping. This new behavior is controlled by a config option, as falling back to the Cinder value when we can't discover a device would risk devices not showing up, falling back to Cinder's guess, and detecting the wrong disk as attached.
Automatic merge from submit-queue. Remove unused WaitForDetach from Detacher interface and plugins. See issue #33128 and PR #33270. We can't rely on the device name provided by OpenStack Cinder, and thus must perform detection based on the drive serial number (i.e. its Cinder ID) on the kubelet itself. This needs to be removed now, as part of #33128, because the code can't be updated to attempt device detection and fall back to the Cinder-provided deviceName: detection "fails" when the device is gone, and if Cinder has reported a deviceName that another volume is using in reality, then this will block forever (or until the other, unrelated, volume has been detached).
PR for this has merged. |
PV Watchdog automating manual procedures of cisco SOP regarding: kubernetes/cloud-provider-openstack#150 kubernetes/kubernetes#33128
- watches on events for pods
- deletes a pod that:
  - has relevant cinder emptyPath event
  - is in Pending phase
  - hasn't been deleted in past 60 sec
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"3+", GitVersion:"v1.3.7-hpe.1", GitCommit:"16372fb71140a39a119c87559662e14b5ec0366a", GitTreeState:"clean", BuildDate:"2016-09-03T00:47:27Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3+", GitVersion:"v1.3.7-hpe.1.1+ccaa249c8adedd", GitCommit:"ccaa249c8adeddaca4bd0fd5472714e45091060d", GitTreeState:"clean", BuildDate:"2016-09-20T17:19:59Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Environment:
OpenStack Liberty, Nova with KVM, Cinder with LVM
NAME="Ubuntu"
VERSION="14.04.5 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.5 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
uname -a: Linux hcp-kubernetes-master-14be63a4-7f33-11e6-a329-fa163ed6dc83 4.4.0-38-generic #57~14.04.1-Ubuntu SMP Tue Sep 6 17:20:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
N/A
N/A
What happened:
When creating and attaching multiple Cinder volumes to a Nova instance, Cinder can report an incorrect device name via its API. For example, Cinder may report /dev/vdc while in reality the volume is attached as /dev/vdd. This leads to Kubernetes failing to mount the device.
What you expected to happen:
Kubernetes should avoid using the known-broken data supplied by Cinder, and detect the device path based on the cinder volume ID, supplied to the instance as the drive serial number.
How to reproduce it (as minimally and precisely as possible):
A reliable reproduction is unknown. The gist of it is: boot a Nova instance and create 20 volumes, attach and detach the volumes many times, then inspect the device name reported by the Cinder API alongside the actual device name.
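The comparison step in the reproduction above could be sketched like this, assuming you have already collected two mappings from volume ID to device path: one from the Cinder API and one from inspecting the instance (for example via /dev/disk/by-id). The function name `find_mismatches` and the volume IDs in the usage example are made up for illustration.

```python
def find_mismatches(reported, actual):
    """Return {volume_id: (reported_dev, actual_dev)} where the two disagree.

    reported: volume ID -> device name according to the Cinder API.
    actual:   volume ID -> device path observed inside the instance.
    Volumes missing from `actual` are skipped, since they may be detached.
    """
    mismatches = {}
    for volume_id, reported_dev in reported.items():
        actual_dev = actual.get(volume_id)
        if actual_dev is not None and actual_dev != reported_dev:
            mismatches[volume_id] = (reported_dev, actual_dev)
    return mismatches


# Example with made-up volume IDs: the second volume is misreported.
reported = {"vol-a": "/dev/vdb", "vol-b": "/dev/vdc"}
actual = {"vol-a": "/dev/vdb", "vol-b": "/dev/vdd"}
print(find_mismatches(reported, actual))
```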
Anything else do we need to know:
More to follow in a comment, to keep the noise in the overall issue down.