Attach/detach controller does not recover from missed pod deletion #34242
Comments
Jan, yes, this is an issue that we haven't addressed. Basically, when the master restarts, the controller loses its in-memory record of which volumes are attached. Please let me know any suggestions/comments. Thank you!
We could tag AWS EBS and Cinder volumes on attach with the name(s) of the pod(s) that use them, and un-tag them on detach. I am not sure about GCE; there is PD.Description, where we put some JSON when dynamically creating the volume. Perhaps we could update that JSON on attach/detach. I don't know anything about Ceph RBD, which is getting attach/detach support soon. In addition, on AWS we assign devices /dev/xvdb[a-z], /dev/xvdc[a-z] and so on to Kubernetes volumes, leaving /dev/xvd[a-z] and /dev/xvda[a-z] to the "system" (see the sketch below). Still, the safest thing would be to save attach/detach information somewhere in the API server, either as a separate object or somewhere inside Node.Spec or Status.
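As an illustration of that device-naming convention only, here is a minimal Go sketch; the helper name is hypothetical and this is not the actual cloud-provider code:

```go
package main

import (
	"fmt"
	"regexp"
)

// Per the convention above, Kubernetes assigns /dev/xvdb[a-z], /dev/xvdc[a-z]
// and so on to the volumes it attaches, leaving /dev/xvd[a-z] and
// /dev/xvda[a-z] to the system. The pattern below encodes exactly that:
// "xvd", one letter in b-z, then one letter in a-z.
var kubeDevice = regexp.MustCompile(`^/dev/xvd[b-z][a-z]$`)

// isKubernetesManaged is a hypothetical helper built on that convention.
func isKubernetesManaged(devicePath string) bool {
	return kubeDevice.MatchString(devicePath)
}

func main() {
	for _, dev := range []string{"/dev/xvda", "/dev/xvdaa", "/dev/xvdba", "/dev/xvdf"} {
		fmt.Printf("%-12s kubernetes-managed: %v\n", dev, isKubernetesManaged(dev))
	}
}
```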
Returning to this bug with the newest kube-controller-manager and kubelet (almost 1.5), I noticed that if I restart the controller-manager, the node retains enough information about attached volumes:
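```yaml
status:
  volumesAttached:
  - devicePath: /dev/xvdba
    name: kubernetes.io/aws-ebs/aws://us-east-1d/vol-4fc15dde
```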
Could it be enough to detach these volumes when the controller restarts? I know there is a window where the volume is attached but the node status is not written yet; still, it would help in most cases.
That's a good point. We might recover some information from the node object and put it back into the actual state when the controller restarts. But again, we need to design this carefully to avoid race conditions. Will follow up with a proposal later this week. Thanks!
Jing
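To make that concrete: a minimal sketch of such a recovery pass, assuming the k8s.io/api/core/v1 types are available. The cache type and helpers below are simplified, hypothetical stand-ins for the controller's actual state of world, not the real controller code:

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// attachedVolumeCache is a simplified stand-in for the attach/detach
// controller's "actual state of world" (the real cache lives in
// pkg/controller/volume/attachdetach/cache).
type attachedVolumeCache struct {
	// volume name -> node name -> device path
	attached map[v1.UniqueVolumeName]map[string]string
}

func newAttachedVolumeCache() *attachedVolumeCache {
	return &attachedVolumeCache{attached: map[v1.UniqueVolumeName]map[string]string{}}
}

func (c *attachedVolumeCache) markAttached(vol v1.UniqueVolumeName, node, devicePath string) {
	if c.attached[vol] == nil {
		c.attached[vol] = map[string]string{}
	}
	c.attached[vol][node] = devicePath
}

// recoverFromNodeStatus repopulates the cache from node.Status.VolumesAttached
// after a controller restart, instead of relying only on pod sync. Volumes
// recovered this way but no longer referenced by any pod then become detach
// candidates for the reconciler.
func recoverFromNodeStatus(c *attachedVolumeCache, nodes []v1.Node) {
	for _, node := range nodes {
		for _, av := range node.Status.VolumesAttached {
			c.markAttached(av.Name, node.Name, av.DevicePath)
		}
	}
}

func main() {
	node := v1.Node{}
	node.Name = "node-1"
	node.Status.VolumesAttached = []v1.AttachedVolume{{
		Name:       "kubernetes.io/aws-ebs/aws://us-east-1d/vol-4fc15dde",
		DevicePath: "/dev/xvdba",
	}}

	cache := newAttachedVolumeCache()
	recoverFromNodeStatus(cache, []v1.Node{node})
	fmt.Printf("recovered: %v\n", cache.attached)
}
```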
> @jsafrane @jingxu97 would #37727 help? (@rootfs)

@rootfs, #37727 is addressing a different issue, I think. When a node restarts, the kubelet might delete the old node object and create a new one. Because of this, the list of attached volumes will be wiped from the node object, so the kubelet on the node will not be able to retrieve this information from the API server and will wait forever for the volume to be attached (even though it is in fact already attached).
The problem Jan mentioned is caused by the controller manager restarting: if pods are deleted in the meantime, the list of volumes currently attached to the node will be lost. It would help if we added logic to recover this information from the node object when the controller manager restarts. #37727 reminds me that if, at that moment, the kubelet also restarts and deletes the old node object, the attached-volumes information will be gone too and cannot be recovered.
@jingxu97 here is my thought: when the controller master restarts, if it first gets the node status (and thus the attached volumes) before syncing pods, wouldn't the attached volume still be there by the time the pod is to be deleted?
@rootfs, pod deletion does not affect the node status. Yes, the information about the attached volume will still be available, and we can add logic to recover it (currently we don't have this logic and rely only on pod sync to recover the volume information). But if the node restarts at the same time, the information about the attached volumes will be gone because the whole node object is deleted (we plan to revisit the logic about deleting the node object too).
Spoke with @jingxu97 offline. I'm ok with using …
When the attach/detach controller crashes and a pod with an attached PV is deleted afterwards, the controller will never detach the pod's attached volumes. To prevent this, the controller should try to recover the state from the node status and figure out which volumes to detach. This requires some changes in the volume providers too: the only information available from the nodes is the volume name and the device path. The controller needs to find the correct volume plugin and reconstruct the volume spec just from the name. This required a small change in the volume plugin interface as well.
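For illustration, a hedged sketch of the parsing step this implies: recovering a plugin name and a plugin-specific volume name from the unique volume name stored in the node status. The helper below is hypothetical, assumes the "&lt;plugin name&gt;/&lt;volume name&gt;" form visible in the node status shown earlier, and is not the actual plugin-interface change:

```go
package main

import (
	"fmt"
	"strings"
)

// splitUniqueVolumeName is a hypothetical helper that splits a unique volume
// name such as "kubernetes.io/aws-ebs/aws://us-east-1d/vol-4fc15dde" into the
// plugin name ("kubernetes.io/aws-ebs") and the plugin-specific volume name.
// Only these two pieces plus the device path are available from
// node.Status.VolumesAttached, so a spec reconstructed this way is
// necessarily partial: fields not needed for detach can stay as dummies.
func splitUniqueVolumeName(unique string) (pluginName, volumeName string, err error) {
	// Plugin names are the first two slash-separated components; everything
	// after them names the volume.
	parts := strings.SplitN(unique, "/", 3)
	if len(parts) < 3 {
		return "", "", fmt.Errorf("malformed unique volume name %q", unique)
	}
	return parts[0] + "/" + parts[1], parts[2], nil
}

func main() {
	plugin, vol, err := splitUniqueVolumeName(
		"kubernetes.io/aws-ebs/aws://us-east-1d/vol-4fc15dde")
	if err != nil {
		panic(err)
	}
	fmt.Println("plugin:", plugin) // kubernetes.io/aws-ebs
	fmt.Println("volume:", vol)    // aws://us-east-1d/vol-4fc15dde
}
```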
Hello. I tested this on AWS and it looks to be working as expected: the unused volume gets detached even when the pod was deleted during the controller-manager downtime.
@tsmetana thank you for helping with this. I think when a pod is deleted from the API server, some information, such as the volume spec, is not recoverable. But it might be OK to put in some dummy information as long as the information needed for detach is correct.
Does this also affect an HA controller-manager with leader election? I observe a similar issue when leaders swap (the new leader doesn't detach some volumes correctly).
I don't think it is, because what I observe is that the pod remains running during the leader-election swap. For some reason, when the pod is then deleted hours later, the new volume controller master doesn't detach the volume. From what I gather, the volume manager should be able to handle this case.
That's way too large a change to merge into 1.6. We can consider it for post-1.6.
Too late for v1.6, moving to the v1.7 milestone. If this is incorrect, please correct it. /cc @kubernetes/release-team
Automatic merge from submit-queue (batch tested with PRs 44722, 44704, 44681, 44494, 39732)

Fix issue #34242: Attach/detach should recover from a crash. Fixes #34242. cc: @jsafrane @jingxu97
Files changed by "Fix issue #34242: Attach/detach should recover from a crash":
- pkg/controller/volume/attachdetach/BUILD
- pkg/controller/volume/attachdetach/attach_detach_controller.go
- pkg/controller/volume/attachdetach/attach_detach_controller_test.go
- pkg/controller/volume/attachdetach/cache/actual_state_of_world.go
- pkg/controller/volume/attachdetach/cache/actual_state_of_world_test.go
- pkg/controller/volume/attachdetach/reconciler/reconciler.go
- pkg/controller/volume/attachdetach/reconciler/reconciler_test.go
- pkg/controller/volume/attachdetach/testing/testvolumespec.go
- pkg/volume/plugins.go
- pkg/volume/util/operationexecutor/operation_executor.go
- pkg/volume/util/operationexecutor/operation_generator.go
- pkg/volume/util/volumehelper/volumehelper.go
#39732 merged for 1.7
We run OpenShift in a master-slave setup and our master crashes once in a while (for unrelated reasons). When a new master starts, it does not detach volumes that should be detached.
Steps to reproduce on AWS with standard Kubernetes:
1. Create a pod that uses an AWS EBS persistent volume and wait for the volume to be attached.
2. Stop the controller-manager.
3. Delete the pod while the controller-manager is down.
4. Start the controller-manager again.

Result: the volume stays attached forever (or at least for the next 30 minutes).
It should also be reproducible on GCE. Shouldn't there be a periodic sync that ensures the controller finds deleted pods? This comment looks scary: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/attachdetach/attach_detach_controller.go#L76
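For what it's worth, the kind of periodic pass being asked about might look roughly like the following sketch, with hypothetical names and heavily simplified state (the real reconciler lives under pkg/controller/volume/attachdetach/reconciler):

```go
package main

import (
	"fmt"
	"time"
)

// syncOnce compares a simplified "desired" state (volumes some pod still
// wants attached) against a simplified "attached" state (volumes actually
// attached) and invokes detach for anything attached but no longer desired,
// e.g. because a pod deletion was missed while the controller was down.
func syncOnce(desired, attached map[string]bool, detach func(vol string)) {
	for vol := range attached {
		if !desired[vol] {
			detach(vol)
		}
	}
}

// runPeriodicSync runs syncOnce on a ticker; a loop like this is one way to
// guarantee missed pod deletions are eventually noticed.
func runPeriodicSync(interval time.Duration, desired, attached map[string]bool, detach func(string)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		syncOnce(desired, attached, detach)
	}
}

func main() {
	// The pod referencing the volume was deleted, so nothing desires it...
	desired := map[string]bool{}
	// ...but the volume is still attached from before the controller restart.
	attached := map[string]bool{
		"kubernetes.io/aws-ebs/aws://us-east-1d/vol-4fc15dde": true,
	}
	syncOnce(desired, attached, func(vol string) {
		fmt.Println("detaching orphaned volume:", vol)
		delete(attached, vol)
	})
}
```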
Affected version: kubernetes-1.3.8
@saad-ali @jingxu97 @kubernetes/sig-storage