Vsphere Cloud Provider: failed to detach volume from shutdown node #75342

Open · ksandermann opened this issue Mar 13, 2019 · 7 comments · 4 participants

ksandermann commented Mar 13, 2019

What happened:
I have a pod with a PV attached to it running on node1.
When I shut down node1 to simulate node failure, Kubernetes detects the unhealthy node within the configured timeframe and tries to re-schedule the pod on node2 after the --pod-eviction-timeout.
When the pod starts on node2, the PV cannot be attached because it is still attached to the shut-down node1:

  Warning  FailedAttachVolume      6m               attachdetach-controller  Multi-Attach error for volume "pvc-db44144b-457f-11e9-a7b0-005056af6878" Volume is already exclusively attached to one node and can't be attached to another

Also, the pod on the shutdown node does not get deleted.
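
A quick way to confirm from the Kubernetes side that the volume is still considered attached to the powered-off node (a sketch; the node and pod names are placeholders):

  # Placeholder names; the attach/detach controller records attachments in the
  # node status, so the stale attachment is visible there.
  kubectl get node node1 -o jsonpath='{.status.volumesAttached}'

  # The Multi-Attach warning above comes from the replacement pod's events:
  kubectl describe pod <new-pod-name>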

What you expected to happen:
As documented, the disk should be detached from the shutdown node and attached to the new node that the pod is scheduled on.

How to reproduce it (as minimally and precisely as possible):

  1. Schedule a single pod using a PVC on node1
  2. Shut down node1
  3. Watch the FailedAttachVolume event on the new pod on node2 (see the sketch after this list)
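
For reference, a minimal sketch of the kind of setup that triggers this (test-pvc, test-app and the vsphere-standard StorageClass are placeholder names, not my exact manifests; the StorageClass is assumed to point at the vSphere provisioner):

  # Save as repro.yaml and run: kubectl apply -f repro.yaml
  apiVersion: v1
  kind: List
  items:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-pvc
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: vsphere-standard
      resources:
        requests:
          storage: 1Gi
  - apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: test-app
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: test-app
      template:
        metadata:
          labels:
            app: test-app
        spec:
          containers:
          - name: app
            image: busybox
            command: ["sh", "-c", "sleep 3600"]
            volumeMounts:
            - name: data
              mountPath: /data
          volumes:
          - name: data
            persistentVolumeClaim:
              claimName: test-pvc
  # After applying: note which node the pod lands on (kubectl get pods -o wide),
  # power that VM off in vCenter, wait for the pod-eviction timeout, and check
  # the events on the replacement pod with kubectl describe pod.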

Anything else we need to know?:
Also, kube-controller-manager does not log anything about this failure.
Detaching the volume and attaching it to another node works fine as long as all nodes are healthy.
Force-deletion of the pod on the shutdown node also does nothing.
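
The force-deletion I tried is roughly the following (the pod name is a placeholder):

  # Placeholder pod name; this removes the pod object from the API server,
  # but the vSphere disk stays attached to the powered-off VM.
  kubectl delete pod <pod-name> --grace-period=0 --force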

Environment:

  • Kubernetes version (use kubectl version): 1.12.5
  • Cloud provider or hardware configuration: vsphere
  • OS (e.g: cat /etc/os-release): Ubuntu 16.04
  • Kernel (e.g. uname -a): Linux k8s-dev-master3 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: Kubespray 2.8.3 using kubeadm
  • Others: VSphere 6.5

Thanks in advance! :)

ksandermann (Author) commented Mar 13, 2019

/sig vmware

k8s-ci-robot added sig/vmware and removed needs-sig labels Mar 13, 2019

ksandermann (Author) commented Mar 13, 2019

@kubernetes/sig-vmware-bugs

k8s-ci-robot (Contributor) commented Mar 13, 2019

@ksandermann: Reiterating the mentions to trigger a notification:
@kubernetes/sig-vmware-bugs

In response to this:

@kubernetes/sig-vmware-bugs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

LinAnt commented Mar 14, 2019

It is even worse if you drain a node for an upgrade and then delete the VM: any disks that are still attached are deleted along with the VM. This is not a recent issue; it has behaved like this since 1.8.x or 1.9.x.
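
By drain I mean the usual pre-upgrade sequence, roughly (the node name is a placeholder):

  # Placeholder node name; deleting the VM in vCenter afterwards also deletes
  # any Kubernetes disks that were never detached from it.
  kubectl drain node1 --ignore-daemonsets --delete-local-data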

yastij (Member) commented Mar 14, 2019

There's a KEP open for this: kubernetes/enhancements#719

/priority important-soon

ksandermann (Author) commented Mar 15, 2019

@yastij I see, thanks for the reference!
Is there any estimated time for this to actually get through?
I see that the PR has been stale for 10 days now

yastij (Member) commented Mar 15, 2019

The design is still under discussion; this will land in 1.15.
