New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isDeviceOpened check for cinder could use invalid device path #68479

Closed
gnufied opened this Issue Sep 10, 2018 · 8 comments

Comments

Projects
None yet
3 participants
@gnufied
Member

gnufied commented Sep 10, 2018

We fixed invalid device path in bunch of places via - #33270 but isDeviceOpen call still could use invalid device path.

That check can sometimes use invalid path and hence can cause UnmountDevice operation to fails even though device has already been unmounted.

/sig storage
/sig openstack

/assign

@gnufied

This comment has been minimized.

Show comment
Hide comment
@gnufied

gnufied Sep 14, 2018

Member

Actually this does not affect version 1.10 onwards. Not sure if 1.9 is still supported. I think I will close this for now.

/close

Member

gnufied commented Sep 14, 2018

Actually this does not affect version 1.10 onwards. Not sure if 1.9 is still supported. I think I will close this for now.

/close

@gnufied gnufied closed this Sep 14, 2018

@gnufied

This comment has been minimized.

Show comment
Hide comment
@gnufied

gnufied Sep 14, 2018

Member

I just noticed that 1.9 is still supported and hence it is probably worth fixing there for now.

Member

gnufied commented Sep 14, 2018

I just noticed that 1.9 is still supported and hence it is probably worth fixing there for now.

@gnufied gnufied reopened this Sep 14, 2018

gnufied added a commit to gnufied/kubernetes that referenced this issue Sep 14, 2018

Make sure that we update actual state of world with real device path
Many volume plugins rely on value returned by WaitForAttach call
as real device path. But isDeviceOpen check used value present in node object
which can be wrong in many cases.

Fixes kubernetes#68479
@msau42

This comment has been minimized.

Show comment
Hide comment
@msau42

msau42 Sep 14, 2018

Member

How come the check can "sometimes" use invalid device path?

Member

msau42 commented Sep 14, 2018

How come the check can "sometimes" use invalid device path?

@gnufied

This comment has been minimized.

Show comment
Hide comment
@gnufied

gnufied Sep 14, 2018

Member

@msau42 So on Cinder for example - Attach call would return something like /dev/sdb as attach device path. But if you see the linked PR(#33270), WaitForAttach will do a /dev/disk/by-id and use that device path for global mount. In general it is always a good idea to use device path returned by WaitForAttach as single source of truth.

This applies to something like EBS with NVMe too where device is discovered via WaitForAttach call (but nvme disks aren't directly affected by this bug because returned device path on EBS is bogus).

Member

gnufied commented Sep 14, 2018

@msau42 So on Cinder for example - Attach call would return something like /dev/sdb as attach device path. But if you see the linked PR(#33270), WaitForAttach will do a /dev/disk/by-id and use that device path for global mount. In general it is always a good idea to use device path returned by WaitForAttach as single source of truth.

This applies to something like EBS with NVMe too where device is discovered via WaitForAttach call (but nvme disks aren't directly affected by this bug because returned device path on EBS is bogus).

@gnufied

This comment has been minimized.

Show comment
Hide comment
@gnufied

gnufied Sep 14, 2018

Member

So on EBS with NVMe for example - isDeviceOpen check will always fail (but still return success) because it will use something like /dev/xvdba as device path which does not actually exist on the node.

Member

gnufied commented Sep 14, 2018

So on EBS with NVMe for example - isDeviceOpen check will always fail (but still return success) because it will use something like /dev/xvdba as device path which does not actually exist on the node.

@msau42

This comment has been minimized.

Show comment
Hide comment
@msau42

msau42 Sep 14, 2018

Member

You said the issue only happens "sometimes". Can you clarify why it only happens occasionally?

Member

msau42 commented Sep 14, 2018

You said the issue only happens "sometimes". Can you clarify why it only happens occasionally?

@gnufied

This comment has been minimized.

Show comment
Hide comment
@gnufied

gnufied Sep 14, 2018

Member

@msau42 doesn't the linked PR #33270 make it clear?

We can't rely on the device name provided by Cinder, and thus must perform
detection based on the drive serial number (aka It's cinder ID) on the
kubelet itself.

As for why - Cinder sometimes returns wrong device path on AttachDevice call, I don't know. But it happens and we have worked around it.

Member

gnufied commented Sep 14, 2018

@msau42 doesn't the linked PR #33270 make it clear?

We can't rely on the device name provided by Cinder, and thus must perform
detection based on the drive serial number (aka It's cinder ID) on the
kubelet itself.

As for why - Cinder sometimes returns wrong device path on AttachDevice call, I don't know. But it happens and we have worked around it.

@gnufied

This comment has been minimized.

Show comment
Hide comment
@gnufied

gnufied Sep 28, 2018

Member

Fixed by #68674

Member

gnufied commented Sep 28, 2018

Fixed by #68674

@gnufied gnufied closed this Sep 28, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment