OpenStack Cinder can report an incorrect device path for volume attachments #33128

Closed
kiall opened this issue Sep 20, 2016 · 13 comments
Labels
area/provider/openstack Issues or PRs related to openstack provider

Comments


kiall commented Sep 20, 2016

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"3+", GitVersion:"v1.3.7-hpe.1", GitCommit:"16372fb71140a39a119c87559662e14b5ec0366a", GitTreeState:"clean", BuildDate:"2016-09-03T00:47:27Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3+", GitVersion:"v1.3.7-hpe.1.1+ccaa249c8adedd", GitCommit:"ccaa249c8adeddaca4bd0fd5472714e45091060d", GitTreeState:"clean", BuildDate:"2016-09-20T17:19:59Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration:

OpenStack Liberty, Nova with KVM, Cinder with LVM

  • OS (e.g. from /etc/os-release):

NAME="Ubuntu"
VERSION="14.04.5 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.5 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"

  • Kernel (e.g. uname -a):

Linux hcp-kubernetes-master-14be63a4-7f33-11e6-a329-fa163ed6dc83 4.4.0-38-generic #57~14.04.1-Ubuntu SMP Tue Sep 6 17:20:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:

N/A

  • Others:

N/A

What happened:

When multiple Cinder volumes are created and attached to a Nova instance, Cinder can report an incorrect device name via its API. For example, Cinder may report /dev/vdc while the volume is actually attached at /dev/vdd. Kubernetes then fails to mount the device.

What you expected to happen:

Kubernetes should avoid using the known-broken device name supplied by Cinder and instead detect the device path from the Cinder volume ID, which is supplied to the instance as the drive serial number.
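
To illustrate the lookup described above, here is a minimal sketch (not the Kubernetes implementation). It assumes a KVM guest with virtio disks, where udev typically creates a `/dev/disk/by-id/virtio-<serial>` symlink and the virtio serial is limited to 20 characters, so only the first 20 characters of the volume ID appear in the link name. The function name `find_device_by_volume_id` and the `by_id_dir` parameter are hypothetical.

```python
import os

# Assumption: virtio-blk serial numbers are truncated to 20 bytes, so the
# by-id symlink carries only the first 20 characters of the Cinder volume ID.
VIRTIO_SERIAL_LENGTH = 20


def find_device_by_volume_id(volume_id, by_id_dir="/dev/disk/by-id"):
    """Return the real device path for a Cinder volume ID, or None.

    Looks for a udev symlink named virtio-<truncated volume ID> and
    resolves it, instead of trusting the device name reported by Cinder.
    """
    link = os.path.join(by_id_dir, "virtio-" + volume_id[:VIRTIO_SERIAL_LENGTH])
    if os.path.exists(link):
        # Resolve the symlink to the underlying node, e.g. /dev/vdd.
        return os.path.realpath(link)
    return None


if __name__ == "__main__":
    # Volume ID taken from this issue; the result depends on the host.
    print(find_device_by_volume_id("0b8c4534-50fc-436e-9398-3c1c8e380b6d"))
```

Other hypervisors or disk buses may expose the serial under a different link name, so the prefix here should be treated as environment-specific.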

How to reproduce it (as minimally and precisely as possible):

No reliable reproduction is known. The gist: boot a Nova instance and create 20 volumes, attach and detach the volumes many times, then compare the device name reported by the Cinder API with the actual device name inside the instance.
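
The final comparison step could be sketched as follows. This only covers the comparison itself; how the two mappings are collected (Cinder API on one side, inspection inside the instance on the other) is left out. The function name `find_mismatches` and the placeholder volume IDs are hypothetical, and the device names mirror the illustrative /dev/vdc versus /dev/vdd example above rather than measured data.

```python
def find_mismatches(reported, actual):
    """Compare Cinder-reported device names with the devices seen in the guest.

    Both arguments map volume ID -> device path. Returns a dict of
    volume ID -> (reported device, actual device) for every disagreement;
    a volume missing from `actual` is reported with None.
    """
    mismatches = {}
    for volume_id, reported_dev in reported.items():
        actual_dev = actual.get(volume_id)
        if actual_dev != reported_dev:
            mismatches[volume_id] = (reported_dev, actual_dev)
    return mismatches


if __name__ == "__main__":
    # Illustrative values only, not taken from a real run.
    reported = {"vol-a": "/dev/vdc", "vol-b": "/dev/vdb"}
    actual = {"vol-a": "/dev/vdd", "vol-b": "/dev/vdb"}
    for vol, (rep, act) in find_mismatches(reported, actual).items():
        print(vol, "reported", rep, "actual", act)
```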

Anything else we need to know:

More to follow in a comment, to keep the noise in the main issue description down.


kiall commented Sep 20, 2016

$ cinder list
+--------------------------------------+--------+--------------------------------------------------------------------+------+-------------+----------+-------------+--------------------------------------+
|                  ID                  | Status |                                Name                                | Size | Volume Type | Bootable | Multiattach |             Attached to              |
+--------------------------------------+--------+--------------------------------------------------------------------+------+-------------+----------+-------------+--------------------------------------+
| 0b8c4534-50fc-436e-9398-3c1c8e380b6d | in-use |                             pgsql-data                             |  10  |      -      |  false   |    False    | aadca8e9-7863-43f2-b7db-bcca7d9e8566 |
| 65953b61-0af1-42f6-af9a-bd5b0a55acec | in-use |                             pgsql-data                             |  10  |      -      |  false   |    False    | aadca8e9-7863-43f2-b7db-bcca7d9e8566 |
| 942154e3-cabf-4d11-8fcc-654d5c5f352a | in-use |                             pgsql-data                             |  10  |      -      |  false   |    False    | aadca8e9-7863-43f2-b7db-bcca7d9e8566 |
| a2c6e5ae-9b27-4989-92c5-c71240b2a83f | in-use |                            pgpool-data                             |  1   |      -      |  false   |    False    | aadca8e9-7863-43f2-b7db-bcca7d9e8566 |
+--------------------------------------+--------+--------------------------------------------------------------------+------+-------------+----------+-------------+--------------------------------------+


$ cinder show 0b8c4534-50fc-436e-9398-3c1c8e380b6d
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                Property               |                                                                                                                                  Value                                                                                                                                  |
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|              attachments              | [{u'server_id': u'aadca8e9-7863-43f2-b7db-bcca7d9e8566', u'attachment_id': u'16354013-752c-4141-8c6a-e02dceb7c795', u'host_name': None, u'volume_id': u'0b8c4534-50fc-436e-9398-3c1c8e380b6d', u'device': u'/dev/vde', u'id': u'0b8c4534-50fc-436e-9398-3c1c8e380b6d'}] |
|           availability_zone           |                                                                                                                                   nova                                                                                                                                  |
|                bootable               |                                                                                                                                  false                                                                                                                                  |
|          consistencygroup_id          |                                                                                                                                   None                                                                                                                                  |
|               created_at              |                                                                                                                        2016-09-20T13:36:49.000000                                                                                                                       |
|              description              |                                                                                                                                   None                                                                                                                                  |
|               encrypted               |                                                                                                                                  False                                                                                                                                  |
|                   id                  |                                                                                                                   0b8c4534-50fc-436e-9398-3c1c8e380b6d                                                                                                                  |
|                metadata               |                                                                             {u'readonly': u'False', u'KubernetesCluster': u'12a05a2b-7f33-11e6-a329-fa163ed6dc83', u'attached_mode': u'rw'}                                                                             |
|              multiattach              |                                                                                                                                  False                                                                                                                                  |
|                  name                 |                                                                                                                                pgsql-data                                                                                                                               |
|      os-vol-tenant-attr:tenant_id     |                                                                                                                     4769565249f8452397e184fb9bf82582                                                                                                                    |
|   os-volume-replication:driver_data   |                                                                                                                                   None                                                                                                                                  |
| os-volume-replication:extended_status |                                                                                                                                   None                                                                                                                                  |
|           replication_status          |                                                                                                                                 disabled                                                                                                                                |
|                  size                 |                                                                                                                                    10                                                                                                                                   |
|              snapshot_id              |                                                                                                                                   None                                                                                                                                  |
|              source_volid             |                                                                                                                                   None                                                                                                                                  |
|                 status                |                                                                                                                                  in-use                                                                                                                                 |
|                user_id                |                                                                                                                     1d2ee82c96a54fdfb917e811341b54db                                                                                                                    |
|              volume_type              |                                                                                                                                   None                                                                                                                                  |
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
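
The `attachments` value in the output above is printed as a Python literal, so the Cinder-reported device can be pulled out mechanically. A small sketch, assuming the cell text has been copied out of the table as a string (the function name `reported_devices` is hypothetical):

```python
import ast


def reported_devices(attachments_text):
    """Map volume ID -> device as reported in a `cinder show` attachments cell.

    The cell is a Python list-of-dicts literal (with u'' string prefixes and
    None), which ast.literal_eval can parse without executing any code.
    """
    attachments = ast.literal_eval(attachments_text)
    return {a["volume_id"]: a["device"] for a in attachments}


if __name__ == "__main__":
    cell = (
        "[{u'server_id': u'aadca8e9-7863-43f2-b7db-bcca7d9e8566', "
        "u'attachment_id': u'16354013-752c-4141-8c6a-e02dceb7c795', "
        "u'host_name': None, "
        "u'volume_id': u'0b8c4534-50fc-436e-9398-3c1c8e380b6d', "
        "u'device': u'/dev/vde', "
        "u'id': u'0b8c4534-50fc-436e-9398-3c1c8e380b6d'}]"
    )
    print(reported_devices(cell))
```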


$ cinder show 65953b61-0af1-42f6-af9a-bd5b0a55acec
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                Property               |                                                                                                                                  Value                                                                                                                                  |
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|              attachments              | [{u'server_id': u'aadca8e9-7863-43f2-b7db-bcca7d9e8566', u'attachment_id': u'3ad954fc-0d08-41f6-98c0-470d1ff211fd', u'host_name': None, u'volume_id': u'65953b61-0af1-42f6-af9a-bd5b0a55acec', u'device': u'/dev/vdd', u'id': u'65953b61-0af1-42f6-af9a-bd5b0a55acec'}] |
|           availability_zone           |                                                                                                                                   nova                                                                                                                                  |
|                bootable               |                                                                                                                                  false                                                                                                                                  |
|          consistencygroup_id          |                                                                                                                                   None                                                                                                                                  |
|               created_at              |                                                                                                                        2016-09-20T13:37:01.000000                                                                                                                       |
|              description              |                                                                                                                                   None                                                                                                                                  |
|               encrypted               |                                                                                                                                  False                                                                                                                                  |
|                   id                  |                                                                                                                   65953b61-0af1-42f6-af9a-bd5b0a55acec                                                                                                                  |
|                metadata               |                                                                             {u'readonly': u'False', u'KubernetesCluster': u'12a05a2b-7f33-11e6-a329-fa163ed6dc83', u'attached_mode': u'rw'}                                                                             |
|              multiattach              |                                                                                                                                  False                                                                                                                                  |
|                  name                 |                                                                                                                                pgsql-data                                                                                                                               |
|      os-vol-tenant-attr:tenant_id     |                                                                                                                     4769565249f8452397e184fb9bf82582                                                                                                                    |
|   os-volume-replication:driver_data   |                                                                                                                                   None                                                                                                                                  |
| os-volume-replication:extended_status |                                                                                                                                   None                                                                                                                                  |
|           replication_status          |                                                                                                                                 disabled                                                                                                                                |
|                  size                 |                                                                                                                                    10                                                                                                                                   |
|              snapshot_id              |                                                                                                                                   None                                                                                                                                  |
|              source_volid             |                                                                                                                                   None                                                                                                                                  |
|                 status                |                                                                                                                                  in-use                                                                                                                                 |
|                user_id                |                                                                                                                     1d2ee82c96a54fdfb917e811341b54db                                                                                                                    |
|              volume_type              |                                                                                                                                   None                                                                                                                                  |
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


$ cinder show 942154e3-cabf-4d11-8fcc-654d5c5f352a
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                Property               |                                                                                                                                  Value                                                                                                                                  |
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|              attachments              | [{u'server_id': u'aadca8e9-7863-43f2-b7db-bcca7d9e8566', u'attachment_id': u'd030d97b-a3cc-434b-be77-b9463b89f560', u'host_name': None, u'volume_id': u'942154e3-cabf-4d11-8fcc-654d5c5f352a', u'device': u'/dev/vdb', u'id': u'942154e3-cabf-4d11-8fcc-654d5c5f352a'}] |
|           availability_zone           |                                                                                                                                   nova                                                                                                                                  |
|                bootable               |                                                                                                                                  false                                                                                                                                  |
|          consistencygroup_id          |                                                                                                                                   None                                                                                                                                  |
|               created_at              |                                                                                                                        2016-09-20T13:36:37.000000                                                                                                                       |
|              description              |                                                                                                                                   None                                                                                                                                  |
|               encrypted               |                                                                                                                                  False                                                                                                                                  |
|                   id                  |                                                                                                                   942154e3-cabf-4d11-8fcc-654d5c5f352a                                                                                                                  |
|                metadata               |                                                                             {u'readonly': u'False', u'KubernetesCluster': u'12a05a2b-7f33-11e6-a329-fa163ed6dc83', u'attached_mode': u'rw'}                                                                             |
|              multiattach              |                                                                                                                                  False                                                                                                                                  |
|                  name                 |                                                                                                                                pgsql-data                                                                                                                               |
|      os-vol-tenant-attr:tenant_id     |                                                                                                                     4769565249f8452397e184fb9bf82582                                                                                                                    |
|   os-volume-replication:driver_data   |                                                                                                                                   None                                                                                                                                  |
| os-volume-replication:extended_status |                                                                                                                                   None                                                                                                                                  |
|           replication_status          |                                                                                                                                 disabled                                                                                                                                |
|                  size                 |                                                                                                                                    10                                                                                                                                   |
|              snapshot_id              |                                                                                                                                   None                                                                                                                                  |
|              source_volid             |                                                                                                                                   None                                                                                                                                  |
|                 status                |                                                                                                                                  in-use                                                                                                                                 |
|                user_id                |                                                                                                                     1d2ee82c96a54fdfb917e811341b54db                                                                                                                    |
|              volume_type              |                                                                                                                                   None                                                                                                                                  |
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


$ cinder show a2c6e5ae-9b27-4989-92c5-c71240b2a83f
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                Property               |                                                                                                                                  Value                                                                                                                                  |
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|              attachments              | [{u'server_id': u'aadca8e9-7863-43f2-b7db-bcca7d9e8566', u'attachment_id': u'2d744ec0-f946-4ca7-a284-0682129ea0dd', u'host_name': None, u'volume_id': u'a2c6e5ae-9b27-4989-92c5-c71240b2a83f', u'device': u'/dev/vdc', u'id': u'a2c6e5ae-9b27-4989-92c5-c71240b2a83f'}] |
|           availability_zone           |                                                                                                                                   nova                                                                                                                                  |
|                bootable               |                                                                                                                                  false                                                                                                                                  |
|          consistencygroup_id          |                                                                                                                                   None                                                                                                                                  |
|               created_at              |                                                                                                                        2016-09-20T13:37:13.000000                                                                                                                       |
|              description              |                                                                                                                                   None                                                                                                                                  |
|               encrypted               |                                                                                                                                  False                                                                                                                                  |
|                   id                  |                                                                                                                   a2c6e5ae-9b27-4989-92c5-c71240b2a83f                                                                                                                  |
|                metadata               |                                                                             {u'readonly': u'False', u'KubernetesCluster': u'12a05a2b-7f33-11e6-a329-fa163ed6dc83', u'attached_mode': u'rw'}                                                                             |
|              multiattach              |                                                                                                                                  False                                                                                                                                  |
|                  name                 |                                                                                                                               pgpool-data                                                                                                                               |
|      os-vol-tenant-attr:tenant_id     |                                                                                                                     4769565249f8452397e184fb9bf82582                                                                                                                    |
|   os-volume-replication:driver_data   |                                                                                                                                   None                                                                                                                                  |
| os-volume-replication:extended_status |                                                                                                                                   None                                                                                                                                  |
|           replication_status          |                                                                                                                                 disabled                                                                                                                                |
|                  size                 |                                                                                                                                    1                                                                                                                                    |
|              snapshot_id              |                                                                                                                                   None                                                                                                                                  |
|              source_volid             |                                                                                                                                   None                                                                                                                                  |
|                 status                |                                                                                                                                  in-use                                                                                                                                 |
|                user_id                |                                                                                                                     1d2ee82c96a54fdfb917e811341b54db                                                                                                                    |
|              volume_type              |                                                                                                                                   None                                                                                                                                  |
+---------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ find .
.
./by-id
./by-id/virtio-a2c6e5ae-9b27-4989-9
./by-id/virtio-0b8c4534-50fc-436e-9
./by-id/virtio-65953b61-0af1-42f6-a
./by-id/virtio-942154e3-cabf-4d11-8
./by-uuid
./by-uuid/81968278-0e6a-478f-b093-d61d5b3b8f31
./by-uuid/89619621-6412-46f6-9689-8f9407ba91ea
./by-uuid/92d61a84-fa23-4b1f-9171-fc75bb58bf26
./by-uuid/6c12f5c2-cc09-45b7-a6d9-0064a76afbd8
./by-uuid/d4302c1c-a752-4e5c-a739-eb8c311cd3f0
./by-label
./by-label/cloudimg-rootfs


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ ls /dev/vd*
/dev/vda  /dev/vda1  /dev/vdb  /dev/vdc  /dev/vde  /dev/vdf


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ dmesg | grep vd | grep -v intel_powerclamp
[ 8907.419544] Buffer I/O error on dev vdd, logical block 1081344, lost sync page write
[ 8907.420989] JBD2: Error -5 detected when updating journal superblock for vdd-8.
[ 8907.422222] Aborting journal on device vdd-8.
[ 8907.423283] Buffer I/O error on dev vdd, logical block 1081344, lost sync page write
[ 8907.424489] JBD2: Error -5 detected when updating journal superblock for vdd-8.
[ 9436.785297] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: (null)
[ 9436.827118] EXT4-fs (vdb): re-mounted. Opts: (null)
[ 9440.369851] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: (null)
[ 9440.386677] EXT4-fs (vdc): re-mounted. Opts: (null)
[ 9467.046560] EXT4-fs (vde): VFS: Can't find ext4 filesystem
[ 9473.026512] EXT4-fs (vde): mounted filesystem with ordered data mode. Opts: (null)
[ 9473.045747] EXT4-fs (vde): re-mounted. Opts: (null)
[11823.350893] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: (null)
[11823.355428] EXT4-fs (vdb): re-mounted. Opts: (null)
[11833.317758] EXT4-fs (vde): mounted filesystem with ordered data mode. Opts: (null)
[11833.322239] EXT4-fs (vde): re-mounted. Opts: (null)
[11877.435588] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: (null)
[11877.442831] EXT4-fs (vdc): re-mounted. Opts: (null)


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo lshw -class disk -class storage
  *-ide                   
       description: IDE interface
       product: 82371SB PIIX3 IDE [Natoma/Triton II]
       vendor: Intel Corporation
       physical id: 1.1
       bus info: pci@0000:00:01.1
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: ide bus_master
       configuration: driver=ata_piix latency=0
       resources: irq:0 ioport:1f0(size=8) ioport:3f6 ioport:170(size=8) ioport:376 ioport:c0a0(size=16)
  *-scsi:0
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc
       physical id: 4
       bus info: pci@0000:00:04.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:11 ioport:c000(size=64) memory:febd2000-febd2fff
  *-scsi:1
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc
       physical id: e
       bus info: pci@0000:00:0e.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:11 ioport:1000(size=64) memory:c0000000-c0000fff
  *-scsi:2
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc
       physical id: f
       bus info: pci@0000:00:0f.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:10 ioport:1040(size=64) memory:c0001000-c0001fff
  *-scsi:3
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc
       physical id: 10
       bus info: pci@0000:00:10.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:11 ioport:1080(size=64) memory:c0002000-c0002fff
  *-scsi:4
       description: SCSI storage controller
       product: Virtio block device
       vendor: Red Hat, Inc
       physical id: 11
       bus info: pci@0000:00:11.0
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: scsi msix bus_master cap_list
       configuration: driver=virtio-pci latency=0
       resources: irq:10 ioport:10c0(size=64) memory:c0003000-c0003fff


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vda | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:04.0/virtio1/block/vda


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vdb | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:0e.0/virtio3/block/vdb


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vdc | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:0f.0/virtio4/block/vdc


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vde | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:10.0/virtio5/block/vde


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vdf | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:11.0/virtio6/block/vdf


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/sys/devices/pci0000:00$ sudo udevadm info -q all -n /dev/disk/by-id/virtio-0b8c4534-50fc-436e-9 | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:10.0/virtio5/block/vde


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/sys/devices/pci0000:00$ sudo udevadm info -q all -n /dev/disk/by-id/virtio-65953b61-0af1-42f6-a | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:0f.0/virtio4/block/vdc


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/sys/devices/pci0000:00$ sudo udevadm info -q all -n /dev/disk/by-id/virtio-942154e3-cabf-4d11-8 | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:0e.0/virtio3/block/vdb


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/sys/devices/pci0000:00$ sudo udevadm info -q all -n /dev/disk/by-id/virtio-a2c6e5ae-9b27-4989-9 | grep DEVPATH
E: DEVPATH=/devices/pci0000:00/0000:00:11.0/virtio6/block/vdf


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vdb
P: /devices/pci0000:00/0000:00:0e.0/virtio3/block/vdb
N: vdb
S: disk/by-id/virtio-942154e3-cabf-4d11-8
S: disk/by-uuid/6c12f5c2-cc09-45b7-a6d9-0064a76afbd8
E: DEVLINKS=/dev/disk/by-id/virtio-942154e3-cabf-4d11-8 /dev/disk/by-uuid/6c12f5c2-cc09-45b7-a6d9-0064a76afbd8
E: DEVNAME=/dev/vdb
E: DEVPATH=/devices/pci0000:00/0000:00:0e.0/virtio3/block/vdb
E: DEVTYPE=disk
E: ID_FS_TYPE=ext4
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=6c12f5c2-cc09-45b7-a6d9-0064a76afbd8
E: ID_FS_UUID_ENC=6c12f5c2-cc09-45b7-a6d9-0064a76afbd8
E: ID_FS_VERSION=1.0
E: ID_SERIAL=942154e3-cabf-4d11-8
E: MAJOR=253
E: MINOR=16
E: SUBSYSTEM=block
E: USEC_INITIALIZED=811686198


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vdc
P: /devices/pci0000:00/0000:00:0f.0/virtio4/block/vdc
N: vdc
S: disk/by-id/virtio-65953b61-0af1-42f6-a
S: disk/by-uuid/92d61a84-fa23-4b1f-9171-fc75bb58bf26
E: DEVLINKS=/dev/disk/by-id/virtio-65953b61-0af1-42f6-a /dev/disk/by-uuid/92d61a84-fa23-4b1f-9171-fc75bb58bf26
E: DEVNAME=/dev/vdc
E: DEVPATH=/devices/pci0000:00/0000:00:0f.0/virtio4/block/vdc
E: DEVTYPE=disk
E: ID_FS_TYPE=ext4
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=92d61a84-fa23-4b1f-9171-fc75bb58bf26
E: ID_FS_UUID_ENC=92d61a84-fa23-4b1f-9171-fc75bb58bf26
E: ID_FS_VERSION=1.0
E: ID_SERIAL=65953b61-0af1-42f6-a
E: MAJOR=253
E: MINOR=32
E: SUBSYSTEM=block
E: USEC_INITIALIZED=818208253


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vde
P: /devices/pci0000:00/0000:00:10.0/virtio5/block/vde
N: vde
S: disk/by-id/virtio-0b8c4534-50fc-436e-9
S: disk/by-uuid/89619621-6412-46f6-9689-8f9407ba91ea
E: DEVLINKS=/dev/disk/by-id/virtio-0b8c4534-50fc-436e-9 /dev/disk/by-uuid/89619621-6412-46f6-9689-8f9407ba91ea
E: DEVNAME=/dev/vde
E: DEVPATH=/devices/pci0000:00/0000:00:10.0/virtio5/block/vde
E: DEVTYPE=disk
E: ID_FS_TYPE=ext4
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=89619621-6412-46f6-9689-8f9407ba91ea
E: ID_FS_UUID_ENC=89619621-6412-46f6-9689-8f9407ba91ea
E: ID_FS_VERSION=1.0
E: ID_SERIAL=0b8c4534-50fc-436e-9
E: MAJOR=253
E: MINOR=64
E: SUBSYSTEM=block
E: USEC_INITIALIZED=824261879


ubuntu@hcp-kubernetes-node-3c6c7cfe-7f34-11e6-a329-fa163ed6dc83:/dev/disk$ sudo udevadm info -q all -n /dev/vdf
P: /devices/pci0000:00/0000:00:11.0/virtio6/block/vdf
N: vdf
S: disk/by-id/virtio-a2c6e5ae-9b27-4989-9
S: disk/by-uuid/81968278-0e6a-478f-b093-d61d5b3b8f31
E: DEVLINKS=/dev/disk/by-id/virtio-a2c6e5ae-9b27-4989-9 /dev/disk/by-uuid/81968278-0e6a-478f-b093-d61d5b3b8f31
E: DEVNAME=/dev/vdf
E: DEVPATH=/devices/pci0000:00/0000:00:11.0/virtio6/block/vdf
E: DEVTYPE=disk
E: ID_FS_TYPE=ext4
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=81968278-0e6a-478f-b093-d61d5b3b8f31
E: ID_FS_UUID_ENC=81968278-0e6a-478f-b093-d61d5b3b8f31
E: ID_FS_VERSION=1.0
E: ID_SERIAL=a2c6e5ae-9b27-4989-9
E: MAJOR=253
E: MINOR=80
E: SUBSYSTEM=block
E: USEC_INITIALIZED=865454133

@kiall
Author

kiall commented Sep 20, 2016

What the above shows is:

  • 4 volumes, 1 instance.
  • Cinder reports volumes attached as vdb, vdc, vdd and vde.
  • Guest OS has: vdb, vdc, vde, vdf (in addition to vda, the root disk)
    • vdb and vde are correct on both sides.
    • Cinder's vdc is the guest's vdf
    • Cinder's vdd is the guest's vdc
  • All devices show under /dev/disk/by-id/, using a name following "virtio-{TRUNCATED CINDER UUID}"
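Since the by-id links carry the truncated Cinder UUID, the real device node can be found by following the link rather than trusting the name Cinder reports. A minimal Go sketch, assuming udev has created the `virtio-` links shown above; the helper name `resolveByID` and its `byIDDir` parameter are hypothetical and exist only so the sketch can be tried against a scratch directory:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveByID follows the udev symlink for a virtio serial number and returns
// the device node it points to. byIDDir is normally /dev/disk/by-id.
// (Hypothetical helper, not taken from the Kubernetes source.)
func resolveByID(byIDDir, serial string) (string, error) {
	link := filepath.Join(byIDDir, "virtio-"+serial)
	return filepath.EvalSymlinks(link)
}

func main() {
	// Truncated Cinder UUID copied from the find output above.
	dev, err := resolveByID("/dev/disk/by-id", "a2c6e5ae-9b27-4989-9")
	if err != nil {
		fmt.Fprintln(os.Stderr, "volume not visible in guest:", err)
		return
	}
	fmt.Println(dev)
}
```

On the node above, the udevadm output shows this link pointing at vdf, even though Cinder reports a different name for the same volume.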

As the Cinder/Nova teams haven't fixed this long-standing issue in many years, and defer to libvirt (which provides no guarantee, and considers the device name no more than an "ordering hint" [1]), we should work around the issue by inspecting the drive's serial number, as represented in /dev/disk/by-id.

Parts of Kubernetes already do this [2], while others do not [3].

[1]: http://libvirt.org/formatdomain.html#elementsDisks - "target" section:

The target element controls the bus / device under which the disk is exposed to the guest OS. The dev attribute indicates the "logical" device name. The actual device name specified is not guaranteed to map to the device name in the guest OS. Treat it as a device ordering hint.

@kiall
Author

kiall commented Sep 20, 2016

cc @anguslees @mikedanese @kubernetes/sig-openstack

@anguslees
Member

From some hunting around, it appears the above virtio-$trunc_uuid strategy might only work on kvm, just in case we mistakenly think this is easy to solve ...

@kiall
Author

kiall commented Sep 21, 2016

To add to Angus's point, the general pattern also applies to OpenStack with ESXi - the difference being you search for "/dev/disk/by-id/wwn-0x{TRUNCATED CINDER UUID}". I suspect this general pattern of a truncated serial number applies more often than it fails.

fmt.Sprintf("/dev/disk/by-id/virtio-%s", volumeID[:20])
vs
fmt.Sprintf("/dev/disk/by-id/wwn-0x%s", strings.Replace(volumeID, "-", "", -1))
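To make the two patterns above concrete, here is a small self-contained Go sketch that builds both candidate paths for one volume ID. The function name `candidateDevicePaths` is hypothetical, and the length guard is an addition so that short IDs do not panic on the slice:

```go
package main

import (
	"fmt"
	"strings"
)

// candidateDevicePaths returns the /dev/disk/by-id paths under which a Cinder
// volume might appear in the guest. (Hypothetical helper; the two patterns
// are the ones quoted in the comment above.)
func candidateDevicePaths(volumeID string) []string {
	// KVM/virtio exposes only the first 20 characters of the UUID as the
	// serial number. Guard the slice so shorter IDs do not panic.
	serial := volumeID
	if len(serial) > 20 {
		serial = serial[:20]
	}
	return []string{
		fmt.Sprintf("/dev/disk/by-id/virtio-%s", serial),
		fmt.Sprintf("/dev/disk/by-id/wwn-0x%s", strings.Replace(volumeID, "-", "", -1)),
	}
}

func main() {
	for _, p := range candidateDevicePaths("a2c6e5ae-9b27-4989-92c5-c71240b2a83f") {
		fmt.Println(p)
	}
}
```

For the pgpool-data volume above, the first candidate matches the `virtio-a2c6e5ae-9b27-4989-9` entry seen under /dev/disk/by-id.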

kiall pushed a commit to hpcloud/kubernetes that referenced this issue Sep 21, 2016
See issue kubernetes#33128

We can't rely on the device name provided by Cinder, and thus must perform
detection based on the drive serial number (aka its Cinder ID) on the
kubelet itself.

This patch re-works the cinder volume attacher (the parts executed in
kube-controller-manager) to return the volume ID, rather than the device
name as advertised by Cinder. We then rework the cinder volume attacher
(this time, the parts executed in kubelet) to accept this ID, and call the
pre-existing GetDevicePath method, which will perform the discovery
correctly.

This is a break in the Attacher interface, which explicitly calls for the
Attach method to return a device name.
@kiall
Author

kiall commented Sep 21, 2016

I've put together a patch to our 1.3 branch (linked above) that seems to work reasonably well. It's virtio (aka KVM) only, but the code was already KVM-only anyway. I'll add ESXi support separately, as that's a smaller fix to existing code.

@kiall
Author

kiall commented Sep 21, 2016

Heh - left out the question. Thoughts on this approach? Should I go ahead and port to master + submit?

@anguslees
Member

anguslees commented Sep 22, 2016

Dumping the results of my research here for posterity:

  • The (2012) nova bug that discussed/implemented a lot of this originally was 1004328. Many of the relevant libvirt/kvm changes are linked from there.
  • KVM/virtio exposes the first 20 chars of UUID as virtio serial number (as seen above)
  • ESXi appears to expose it as the device WWN (according to above. I note this abuses the WWN format slightly)
  • It seems Xen might not expose serial/wwn in any form (I certainly failed to find anything on Rackspace/PVHVM) - only some concept of "device order" as currently encoded in the device field (and see the current bug for its unreliability).
  • Obviously /dev/disk/by-id/ only exists if udev is installed and configured. Finding the serial number ourselves seems to differ across each of the virtual buses and kernel versions, however.

I don't like any of the options, but I conclude that looking for the two /dev/disk/by-id paths @kiall mentions above and then just trying the device with our fingers crossed is our only reasonable solution.
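For the case in the list above where udev is missing, the serial number would have to be read directly. A rough Go sketch for the virtio case only, assuming the kernel's virtio-blk driver exposes a `serial` attribute at /sys/block/<dev>/serial; other buses and older kernels may not, which is exactly the caveat above. The helper name `findBySysfsSerial` and its `sysBlock` parameter are hypothetical:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// findBySysfsSerial scans <sysBlock>/*/serial for a serial number that is a
// prefix of volumeID and returns the matching kernel device name (e.g. "vdf").
// sysBlock is normally /sys/block. (Hypothetical helper, virtio-only sketch.)
func findBySysfsSerial(sysBlock, volumeID string) (string, error) {
	matches, err := filepath.Glob(filepath.Join(sysBlock, "*", "serial"))
	if err != nil {
		return "", err
	}
	for _, serialFile := range matches {
		raw, err := os.ReadFile(serialFile)
		if err != nil {
			continue // attribute unreadable; skip this device
		}
		serial := strings.TrimSpace(string(raw))
		// The serial is a truncated UUID, so compare it as a prefix.
		if serial != "" && strings.HasPrefix(volumeID, serial) {
			return filepath.Base(filepath.Dir(serialFile)), nil
		}
	}
	return "", fmt.Errorf("no block device with a serial matching %q", volumeID)
}

func main() {
	dev, err := findBySysfsSerial("/sys/block", "a2c6e5ae-9b27-4989-92c5-c71240b2a83f")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("/dev/" + dev)
}
```

This would not help on the Xen guests mentioned above if no serial is exposed there at all.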

@kiall
Author

kiall commented Sep 22, 2016

Ok - I'll forward-port my 1.3 patch to master, add wwn for ESX, and fall back to whatever Cinder tells us the device is.

@kiall
Author

kiall commented Sep 22, 2016

@anguslees if you have a Rackspace Cloud account, could you attach a volume, let me know the ID, and show me these?

ls -lah /dev/disk/by-id
ls -lah /dev/disk/by-uuid
ls -lah /dev/disk/by-label
sudo lshw -class disk -class storage
sudo udevadm info -q all -n /dev/<DEVICE>

@anguslees
Member

Ubuntu 14.04 (kernel 3.13.0-79-generic) on Rackspace / PVHVM:
/dev/xvdb is an attached volume.

% ls -lah /dev/disk/by-id
ls: cannot access /dev/disk/by-id: No such file or directory
% ls -lah /dev/disk/by-uuid
lrwxrwxrwx 1 root root 11 Jun  7 05:24 234299c3-9cad-4a86-9a83-d74600cd8ed1 -> ../../xvdb1
lrwxrwxrwx 1 root root 11 Jun  2 05:24 49265908-09ae-45ed-b585-b856e321e175 -> ../../xvdc1
lrwxrwxrwx 1 root root 11 Jun  2 05:27 96aba36f-f22d-4475-b1fd-5bd700e1c2fb -> ../../xvda1
% ls -lah /dev/disk/by-label
ls: cannot access /dev/disk/by-label: No such file or directory
% sudo lshw -class disk -class storage
  *-ide                   
       description: IDE interface
       product: 82371SB PIIX3 IDE [Natoma/Triton II]
       vendor: Intel Corporation
       physical id: 1.1
       bus info: pci@0000:00:01.1
       version: 00
       width: 32 bits
       clock: 33MHz
       capabilities: ide bus_master
       configuration: driver=ata_piix latency=64
       resources: irq:0 ioport:1f0(size=8) ioport:3f6 ioport:170(size=8) ioport:376 ioport:c420(size=16)
  *-scsi
       description: SCSI storage controller
       product: Xen Platform Device
       vendor: XenSource, Inc.
       physical id: 3
       bus info: pci@0000:00:03.0
       version: 01
       width: 32 bits
       clock: 33MHz
       capabilities: scsi bus_master
       configuration: driver=xen-platform-pci latency=0
       resources: irq:30 ioport:c000(size=256) memory:f2000000-f2ffffff
% sudo udevadm info -q all -n /dev/xvdb
P: /devices/vbd-832/block/xvdb
N: xvdb
E: DEVNAME=/dev/xvdb
E: DEVPATH=/devices/vbd-832/block/xvdb
E: DEVTYPE=disk
E: ID_PART_TABLE_TYPE=dos
E: MAJOR=202
E: MINOR=16
E: SUBSYSTEM=block
E: USEC_INITIALIZED=1924951422
% sudo udevadm info -q all -n /dev/xvdb1
P: /devices/vbd-832/block/xvdb/xvdb1
N: xvdb1
S: disk/by-uuid/234299c3-9cad-4a86-9a83-d74600cd8ed1
E: DEVLINKS=/dev/disk/by-uuid/234299c3-9cad-4a86-9a83-d74600cd8ed1
E: DEVNAME=/dev/xvdb1
E: DEVPATH=/devices/vbd-832/block/xvdb/xvdb1
E: DEVTYPE=partition
E: ID_FS_TYPE=ext4
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=234299c3-9cad-4a86-9a83-d74600cd8ed1
E: ID_FS_UUID_ENC=234299c3-9cad-4a86-9a83-d74600cd8ed1
E: ID_FS_VERSION=1.0
E: ID_PART_ENTRY_DISK=202:16
E: ID_PART_ENTRY_NUMBER=1
E: ID_PART_ENTRY_OFFSET=63
E: ID_PART_ENTRY_SCHEME=dos
E: ID_PART_ENTRY_SIZE=209715137
E: ID_PART_ENTRY_TYPE=0x83
E: ID_PART_TABLE_TYPE=dos
E: MAJOR=202
E: MINOR=17
E: SUBSYSTEM=block
E: USEC_INITIALIZED=1983969339

kiall pushed a commit to hpcloud/kubernetes that referenced this issue Sep 22, 2016
This has been unused since 542f2dc, and relies on deviceName, which
can no longer be relied upon (see issue kubernetes#33128).

This needs to be removed now, as part of kubernetes#33128, as the code can't be
updated to attempt device detection and fall back to the Cinder
provided deviceName, as detection "fails" when the device is gone, and
if Cinder has reported a deviceName that another volume has used in
reality, then this will block forever (or until the other, unrelated,
volume has been detached).
kiall pushed a commit to hpcloud/kubernetes that referenced this issue Sep 22, 2016
See issue kubernetes#33128

We can't rely on the device name provided by Cinder, and thus must perform
detection based on the drive serial number (aka its Cinder ID) on the
kubelet itself.

This patch re-works the cinder volume attacher to ignore the supplied
deviceName, and instead defer to the pre-existing GetDevicePath method to
discover the device path based on its serial number and /dev/disk/by-id
mapping.

This new behavior is controlled by a config option, as falling back
to the Cinder value when we can't discover a device would risk devices
not showing up, falling back to Cinder's guess, and detecting the wrong
disk as attached.
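The trade-off described in this commit message can be sketched in Go as follows. This is an illustration under stated assumptions, not the code from the patch: the function `devicePath`, the flag `trustCinderName` and the error `errNotFound` are hypothetical names, and the real option name is not given here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNotFound is returned when no candidate path exists and the caller has
// not opted in to trusting the Cinder-reported device name. (Hypothetical.)
var errNotFound = errors.New("volume device not found under /dev/disk/by-id")

// devicePath returns the first candidate path that exists. Only when
// trustCinderName is true does it fall back to the name Cinder reported,
// because that name may belong to a different volume in the guest.
func devicePath(candidates []string, cinderName string, trustCinderName bool) (string, error) {
	for _, p := range candidates {
		if _, err := os.Stat(p); err == nil {
			return p, nil
		}
	}
	if trustCinderName && cinderName != "" {
		return cinderName, nil
	}
	return "", errNotFound
}

func main() {
	candidates := []string{"/dev/disk/by-id/virtio-a2c6e5ae-9b27-4989-9"}
	p, err := devicePath(candidates, "/dev/vdc", false)
	fmt.Println(p, err)
}
```

With the flag off, a volume that has not yet appeared yields an error the caller can retry on, instead of silently returning a device that may hold another volume's data.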
dagnello pushed a commit to dagnello/kubernetes that referenced this issue Sep 22, 2016
dagnello pushed a commit to dagnello/kubernetes that referenced this issue Sep 22, 2016
@kiall
Author

kiall commented Sep 29, 2016

FYI - I've reached out to someone on the Rackspace Block Storage team to try to get some definitive answers on whether and how this issue affects Rackspace.

kiall pushed a commit to hpcloud/kubernetes that referenced this issue Oct 26, 2016
kiall pushed a commit to hpcloud/kubernetes that referenced this issue Nov 2, 2016
kiall pushed a commit to hpcloud/kubernetes that referenced this issue Nov 3, 2016
k8s-github-robot pushed a commit that referenced this issue Nov 3, 2016
Automatic merge from submit-queue

Don't rely on device name provided by Cinder

See issue #33128

We can't rely on the device name provided by Cinder, and thus must perform
detection based on the drive serial number (aka its Cinder ID) on the
kubelet itself.

This patch re-works the cinder volume attacher to ignore the supplied
deviceName, and instead defer to the pre-existing GetDevicePath method to
discover the device path based on its serial number and /dev/disk/by-id
mapping.

This new behavior is controlled by a config option, as falling back
to the Cinder value when we can't discover a device would risk devices
not showing up, falling back to Cinder's guess, and detecting the wrong
disk as attached.
k8s-github-robot pushed a commit that referenced this issue Nov 6, 2016
Automatic merge from submit-queue

Remove unused WaitForDetach from Detacher interface and plugins

See issue #33128 and PR #33270

We can't rely on the device name provided by OpenStack Cinder, and thus
must perform detection based on the drive serial number (aka its Cinder ID)
on the kubelet itself.

This needs to be removed now, as part of #33128, as the code can't be
updated to attempt device detection and fall back to the Cinder
provided deviceName, as detection "fails" when the device is gone, and
if Cinder has reported a deviceName that another volume has used in
reality, then this will block forever (or until the other, unrelated,
volume has been detached).
@dims added the area/provider/openstack label and removed the area/kubectl label Nov 15, 2016
@kiall
Author

kiall commented Dec 2, 2016

PR for this has merged.

@kiall closed this as completed Dec 2, 2016
dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018
dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018
dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018
Automatic merge from submit-queue

Don't rely on device name provided by Cinder

dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018
…tfordetach

Automatic merge from submit-queue

Remove unused WaitForDetach from Detacher interface and plugins

wozniakjan pushed a commit to cisco-sso/pvwatch that referenced this issue Jan 9, 2019
PV Watchdog automating manual procedures of cisco SOP regarding:
kubernetes/cloud-provider-openstack#150
kubernetes/kubernetes#33128

- watches on events for pods
- deletes a pod
    - that has relevant cinder emptyPath event
    - is in Pending phase
    - hasn't been deleted in past 60 sec