
OpenStack Cinder volumes not detached from downed VM when pod is rescheduled to another node #33288

Closed
Rotwang opened this issue Sep 22, 2016 · 2 comments · Fixed by #39055 · May be fixed by Rotwang/kubernetes#1
Labels
area/provider/openstack Issues or PRs related to openstack provider

Comments


Rotwang commented Sep 22, 2016

Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.0.1062+1070a518301f0b", GitCommit:"1070a518301f0bfb6f6c8832feff0bba0c391a22", GitTreeState:"clean", BuildDate:"2016-09-20T11:34:18Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Also tried with 1.3.6

Environment:

What happened:
If a compute instance with an attached volume is downed (a node, in Kubernetes terms), Kubernetes never tries to detach that volume. The end result is that k8s keeps trying to attach the volume in a loop but never succeeds, because it is still attached to the downed node. I've waited more than an hour for the volume to be detached (it didn't help :c)

What you expected to happen:
I expect the controller-manager or kubelet to detach the volume before trying to attach it to a new compute instance.

How to reproduce it (as minimally and precisely as possible):
Bring up a cluster with two nodes on OpenStack. Schedule a pod with a PVC. Shut down (from the command line on the operating system) the node with the attached volume. The pod gets rescheduled to another node, but the volume stays attached to the downed node.

Anything else we need to know:
I've looked through reconcile.go in both the kubelet and the controller-manager; it seems to me that once a node is downed it is no longer on the list of managed nodes (nodesManaged in desired_state_of_world.go), so its volume is never going to be detached. I've also tried disabling --enable-controller-attach-detach on the kubelet, but it doesn't try to detach the volume either.
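
Roughly, the effect looks like the sketch below (a minimal, self-contained illustration with made-up names, not the actual reconciler code): once the downed node drops out of the set of managed nodes, its stale attachment is simply never considered for detach.

```go
package main

import "fmt"

// Minimal illustration of the behaviour described above.
// All names here are hypothetical stand-ins, not the real
// attach/detach reconciler types from desired_state_of_world.go.
type attachment struct {
	volume string
	node   string
}

func main() {
	// Actual state: the Cinder volume is still attached to node-1, which is downed.
	actual := []attachment{{volume: "cinder-vol-1", node: "node-1"}}

	// Desired state only tracks nodes that are still managed; node-1 has been dropped.
	managedNodes := map[string]bool{"node-2": true}

	// A reconcile pass that only considers managed nodes never revisits node-1,
	// so the stale attachment is never queued for detach.
	for _, att := range actual {
		if !managedNodes[att.node] {
			fmt.Printf("%s on %s is invisible to the reconciler and is never detached\n",
				att.volume, att.node)
			continue
		}
		fmt.Printf("would reconcile %s on %s\n", att.volume, att.node)
	}
}
```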


Rotwang commented Sep 23, 2016

Additionally, after bringing the compute instance (the one holding the required volume) back up, so that 'kubectl get nodes' lists it as 'Ready', the volume is still not detached (even after 3 hours). I tried with both the controller-manager and the kubelet (--enable-controller-attach-detach=False). After the volume is manually detached from the node (the node which doesn't run the pod claiming this volume), the controller-manager attaches it to the proper node.

dims added the area/provider/openstack label and removed the area/kubelet label on Nov 15, 2016
anguslees added a commit to anguslees/kubernetes that referenced this issue Dec 21, 2016
k8s-github-robot pushed a commit that referenced this issue Dec 28, 2016
Automatic merge from submit-queue (batch tested with PRs 39152, 39142, 39055)

openstack: Forcibly detach an attached cinder volume before attaching elsewhere

Fixes #33288



**What this PR does / why we need it**:
Without this fix, we can't preemptively reschedule pods with persistent volumes to other hosts (for rebalancing or hardware failure recovery).

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #33288

**Special notes for your reviewer**:
(This is a resurrection/cleanup of PR #33734, originally authored by @Rotwang)

**Release note**:
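
In rough terms, the change described in the commit message above makes the attach path detach a stale attachment before re-attaching the volume elsewhere. The sketch below illustrates that idea with made-up types; it is not the real OpenStack provider code from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-in for the cloud provider; the type and method names are
// illustrative only, not the real Kubernetes OpenStack provider API from PR #39055.
type cloud struct {
	attachedTo map[string]string // volumeID -> instanceID
}

var errAttachedElsewhere = errors.New("volume is attached to another instance")

// AttachDisk detaches the volume from a stale instance (e.g. a downed node)
// before attaching it to the requested instance.
func (c *cloud) AttachDisk(instanceID, volumeID string) error {
	if owner, ok := c.attachedTo[volumeID]; ok && owner != instanceID {
		// Without this detach step the attach would fail forever,
		// which is the behaviour reported in this issue.
		if err := c.DetachDisk(owner, volumeID); err != nil {
			return err
		}
	}
	c.attachedTo[volumeID] = instanceID
	return nil
}

// DetachDisk removes the attachment record for the given instance.
func (c *cloud) DetachDisk(instanceID, volumeID string) error {
	if c.attachedTo[volumeID] != instanceID {
		return errAttachedElsewhere
	}
	delete(c.attachedTo, volumeID)
	return nil
}

func main() {
	c := &cloud{attachedTo: map[string]string{"cinder-vol-1": "downed-node"}}
	if err := c.AttachDisk("healthy-node", "cinder-vol-1"); err != nil {
		fmt.Println("attach failed:", err)
		return
	}
	fmt.Println("cinder-vol-1 is now attached to", c.attachedTo["cinder-vol-1"])
}
```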
DukeXar pushed a commit to DukeXar/kubernetes that referenced this issue Jan 7, 2017
@slawekww

In the scenario where a pod is force-deleted with the kubectl delete pod command, the pod is rescheduled to another node. Without the forcible-detach fix, that scenario ends in failure because the Cinder volume is still attached to another k8s node. It is very annoying, especially when the pod is hosting a database on a persistent volume.
Will this fix be part of the official Kubernetes 1.4 and/or 1.5 release?

berryjam pushed a commit to berryjam/kubernetes that referenced this issue Aug 18, 2017
dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018
dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018