Wrong AWS volume can be mounted #29324
I may have just run into the same problem without doing anything funny. I have 2 separate pods that mount 2 separate AWS EBS volumes. Currently only 1 volume shows up as mounted on the instance, but in AWS EC2 both volumes are attached to the instance. Both pods have the same data volume mounted inside. Kubernetes 1.3.4
@atombender, @sstarcher Thank you for posting the issues. Could you please share the kubelet log if it is still available? Thanks! Jing
@atombender, thanks for the log. Could you also please share your deployment spec with us? Thanks!
I don't have it right here, but I basically just use an AWS volume in the pod declaration (not a persistent volume + claim) and a single volume mount. I will send you the actual file later when I am back at a screen.
@atombender thanks for your response. Also, the log file you posted only has some error messages. Do you still have the full kubelet log on the instance? Thanks!
That is the full log. (I have kept the logs all the way from when I started using K8s.) I believe I extracted the right time range. FWIW, I don't think K8s logged much in terms of errors (though you will see some in the KCM log). In fact, that's kind of the point -- it attached the wrong thing, without erroring.
@sstarcher I followed your steps but with GCE disks, and it seems to mount correctly. Could you please share the kubelet logs so that I can do some debugging? Thanks!
@atombender From the KCM log, I only see one attach and one detach. But it seems that you created two deployments, so there should be two attach logs. From the kubelet log, they are all error messages (unable to mount, unable to clean up), but there are no info logs.
I may have done the other deploy the day before. I will get you logs going farther back in time.
@jingxu97 I don't have the same exact setup anymore. I'll attempt to reproduce on Monday if possible.
@justinsb I was trying to reproduce the error @sstarcher described for AWS volumes. I tried the same steps on GCE disks and it is working fine, but I don't have the AWS volumes to test. If you have the setup, could you please try it in your environment when you have time? The steps @sstarcher described are as follows. Thank you very much! Kubernetes 1.3.4
I think I know what is happening here...
Looking at simple ways to fix this. Probably making the map of in-flight mounts just be global state. It shouldn't leak, but even if it does under rare circumstances, it will be better than this bug.
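For illustration, a minimal sketch of what that global-state approach could look like: a package-level, mutex-guarded map from device name to the volume being attached there, shared by all cloud-provider instances instead of being recreated (and lost) per instance. All names here are hypothetical, not the actual Kubernetes code.

```go
package awsvolumes

import "sync"

var (
	attachingMu sync.Mutex
	// device name (e.g. "/dev/xvdbb") -> EBS volume ID currently being attached
	attachingDevices = map[string]string{}
)

// reserveDevice records that volumeID is being attached at device, failing if
// the device name is already reserved for a different volume.
func reserveDevice(device, volumeID string) bool {
	attachingMu.Lock()
	defer attachingMu.Unlock()
	if existing, ok := attachingDevices[device]; ok && existing != volumeID {
		return false // device name already claimed by another attach
	}
	attachingDevices[device] = volumeID
	return true
}

// releaseDevice frees the device name once the attach completes or fails.
func releaseDevice(device string) {
	attachingMu.Lock()
	defer attachingMu.Unlock()
	delete(attachingDevices, device)
}
```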
Just put up an experimental fix; I haven't even tested that it compiles yet. But I thought I would get the ball rolling on how I think we should fix this!
When a volume is attached, it is possible that the actual state already has this volume object (e.g., the volume is attached to multiple nodes, or the volume was detached and attached again). We need to update the device path in such a situation; otherwise the device path would be stale information and cause kubelet to mount the wrong device. This PR partially fixes issue kubernetes#29324
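A minimal sketch of the idea in that fix, with simplified stand-in types rather than the real actual-state-of-world code: the device path is refreshed unconditionally whenever a volume is marked attached, so a detach-and-reattach at a different device cannot leave a stale path behind.

```go
package cache

import "sync"

type attachedVolume struct {
	devicePath string
	nodes      map[string]bool
}

type actualStateOfWorld struct {
	mu      sync.Mutex
	volumes map[string]*attachedVolume // volume name -> attachment state
}

func newActualStateOfWorld() *actualStateOfWorld {
	return &actualStateOfWorld{volumes: map[string]*attachedVolume{}}
}

func (asw *actualStateOfWorld) MarkVolumeAsAttached(volName, node, devicePath string) {
	asw.mu.Lock()
	defer asw.mu.Unlock()
	vol, ok := asw.volumes[volName]
	if !ok {
		vol = &attachedVolume{nodes: map[string]bool{}}
		asw.volumes[volName] = vol
	}
	// The fix: always refresh the device path, because the volume may have
	// been detached and re-attached (possibly at a different device), or
	// attached to multiple nodes since we last recorded it.
	vol.devicePath = devicePath
	vol.nodes[node] = true
}
```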
Automatic merge from submit-queue: Fix issue in updating device path when volume is attached multiple times. When a volume is attached, it is possible that the actual state already has this volume object (e.g., the volume is attached to multiple nodes, or the volume was detached and attached again). We need to update the device path in such a situation; otherwise the device path would be stale information and cause kubelet to mount the wrong device. This PR partially fixes issue #29324
Hi @argusua @spacepluk @svanderbijl and all other AWS users, the fix #32242 is already merged. I would really appreciate it if you could test against the master version (or the head of the tree) to see whether the problem is fixed on your cluster. There is still a small chance of failure if you reboot the node instead of terminating it and replacing it with a new name. I am working on this; it requires a bigger fix in PR #33760. Please let me know if you have any questions. Thank you!
Thank you, @jingxu97
FYI, this issue also affects PetSets in the 1.4.0 release. When a pod is killed or the minion running it dies, the volume does not detach so that it can re-attach to the newly assigned minion instance. However, the pod will start after timing out on the attach of the PV. This seems like a bug with PetSets: if the PV fails to mount, it should not continue. Should this be a separate issue or included here?
@tnine, for the case where a minion dies and is restarted/recreated, I am trying to address the issue of the volume manager failing to detach and attach in #33760. Please let me know whether that PR addresses your concern. As for a pod being killed (with the nodes running fine), attach/detach should work as expected. Please let me know if you experience any issues. Thanks!
Automatic merge from submit-queue: Fix race condition in updating attached volume between master and node. This PR tries to fix issue kubernetes#29324. The cause of this issue is a race condition that happens when marking volumes as attached for node status. This PR tries to clean up the logic of when and where to mark volumes as attached/detached. Basically the workflow is as follows: 1. When a volume is attached successfully, the volume and node info is added into nodesToUpdateStatusFor to mark the volume as attached to the node. 2. When a detach request comes in, it checks whether it is safe to detach now. If the check passes, the volume is removed from volumesToReportAsAttached to indicate that it is no longer considered attached. Afterwards, the reconciler tries to update node status and trigger the detach operation. If any of these operations fails, the volume is added back to the volumesToReportAsAttached list to show that it is still attached. These steps should make sure that kubelet gets the right (though possibly outdated) information about which volumes are attached. It also guarantees that if a detach operation is pending, kubelet will not trigger any mount operations. (cherry picked from commit 6a9a93d)
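A rough sketch of that reporting protocol, with illustrative names (only volumesToReportAsAttached is taken from the commit message above); it is not the actual attach/detach controller code:

```go
package reconciler

import "sync"

type attachDetachState struct {
	mu sync.Mutex
	// node name -> set of volumes to report as attached in that node's status
	volumesToReportAsAttached map[string]map[string]bool
}

func newAttachDetachState() *attachDetachState {
	return &attachDetachState{volumesToReportAsAttached: map[string]map[string]bool{}}
}

// markAttached: step 1. After a successful attach, start reporting the
// volume as attached in the node's status.
func (s *attachDetachState) markAttached(node, vol string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.volumesToReportAsAttached[node] == nil {
		s.volumesToReportAsAttached[node] = map[string]bool{}
	}
	s.volumesToReportAsAttached[node][vol] = true
}

// requestDetach: step 2. Once it is safe to detach, stop reporting the
// volume before the detach is issued, so kubelet never mounts a volume
// whose detach is pending.
func (s *attachDetachState) requestDetach(node, vol string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.volumesToReportAsAttached[node], vol)
}

// detachFailed: if the node status update or the detach itself fails, report
// the volume as attached again rather than leaving kubelet with wrong info.
func (s *attachDetachState) detachFailed(node, vol string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.volumesToReportAsAttached[node] == nil {
		s.volumesToReportAsAttached[node] = map[string]bool{}
	}
	s.volumesToReportAsAttached[node][vol] = true
}
```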
The problem is that attachments are now done on the master, and we are only caching the attachment map persistently for the local instance. So there is now a race, because the attachment map is cleared every time. Issue kubernetes#29324
In light of issue kubernetes#29324, double-check that the volume was attached correctly where we expect it before returning. Issue kubernetes#29324
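That double check could look something like the following sketch, which asks EC2 where the volume actually landed and compares it with the requested device. It uses the AWS SDK for Go v1; the volume ID and device name are placeholders, and this is not the actual cloud-provider code.

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))

	const volumeID = "vol-0123456789abcdef0" // placeholder
	const wantDevice = "/dev/xvdbb"          // device we asked EC2 to use

	out, err := svc.DescribeVolumes(&ec2.DescribeVolumesInput{
		VolumeIds: []*string{aws.String(volumeID)},
	})
	if err != nil {
		log.Fatal(err)
	}
	// Verify every reported attachment is at the device we requested.
	for _, att := range out.Volumes[0].Attachments {
		if got := aws.StringValue(att.Device); got != wantDevice {
			log.Fatalf("volume %s attached at %s, expected %s", volumeID, got, wantDevice)
		}
		fmt.Printf("volume %s attached at %s as expected\n", volumeID, wantDevice)
	}
}
```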
Is this issue already resolved?
I see. Thank you for the information! I'm looking forward to the next release.
We upgraded to 1.4.7 and are still seeing delays of about 2 minutes when pods change the host they're running on:
@justinsb we have recreated this twice in a row. We are going to get logs for you.
Issues go stale after 30d of inactivity. Prevent issues from auto-closing with an `/lifecycle frozen` comment. If this issue is safe to close now, please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
Background: K8s often has problems attaching/detaching AWS volumes: if I do a `kubectl replace` on a deployment, I frequently observe K8s not correctly detaching the volume, and the new deployment will be forever stuck in a crash loop because it times out getting the new volume. Seems like a race condition. The steps I always take when this happens are to delete the deployment, manually force-detach via the EC2 API, and manually `umount` the mounts; otherwise they never seem to be cleaned up.
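As an aside, that manual force-detach step can be scripted. A minimal sketch using the AWS SDK for Go v1 (the volume ID is a placeholder), equivalent to `aws ec2 detach-volume --force`:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))

	// Force-detach, as you would from the EC2 console when a volume is stuck.
	_, err := svc.DetachVolume(&ec2.DetachVolumeInput{
		VolumeId: aws.String("vol-0123456789abcdef0"), // placeholder
		Force:    aws.Bool(true),
	})
	if err != nil {
		log.Fatal(err)
	}
}
```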
So today I experienced this again, and I did the following: I ran `umount` on the wrong volume. I accidentally unmounted the volume of a different pod because I didn't read the `mount` output correctly. Oops.

What happened was that the new deployment was mounted with the wrong device:
This is the right volume ID, but it has mounted the wrong device. According to EC2, the (right) volume is attached to `/dev/xvdbb`. `/dev/xvdba` is the volume I accidentally unmounted by mistake:

So it's mounting the wrong volume into the pod.
Of course I was being naughty, but it seems to me that this should never happen, since EC2 has already allocated `xvdbb`?

Here's the entire KCM output from this fiasco.