Wrong AWS volume can be mounted #29324
I may have just run into the same problem without doing anything funny. I have 2 separate pods that mount 2 separate AWS EBS volumes. Currently only 1 volume shows up as mounted on the instance, but in AWS EC2 both volumes are attached to the instance. Both pods have the same data volume mounted inside. Kubernetes 1.3.4
@atombender, @sstarcher Thank you for posting the issues. Could you please share the kubelet log if it is still available? Thanks! Jing
@atombender, thanks for the log. Could you also please share your deployment spec with us? Thanks!
I don't have it right here, but I basically just use an AWS volume in the pod declaration (not a persistent volume + claim) and a single volume mount. I will send you the actual file later when I am back at a screen.
@atombender thanks for your response. Also, the log file you posted only has some error messages. Do you still have the full kubelet log on the instance? Thanks!
That is the full log. (I have kept the logs all the way from when I started using K8s.) I believe I extracted the right time range. FWIW, I don't think K8s logged much in terms of errors (though you will see some in the KCM log). In fact, that's kind of the point -- it attached the wrong thing, without erroring.
@sstarcher I followed your steps but with GCE disks, and it seems to mount correctly. Could you please share the kubelet logs so that I can do some debugging? Thanks!
@atombender From the KCM log, I only see one attach and one detach. But it seems that you created two deployments, so there should be two attach logs. From the kubelet log, they are all error messages (unable to mount, unable to clean up), but there are no info logs.
I may have done the other deploy the day before. I will get you logs going farther back in time.
@jingxu97 I don't have the same exact setup anymore. I'll attempt to reproduce on Monday if possible.
@justinsb I was trying to reproduce the error @sstarcher described for AWS volumes. I tried the same steps on GCE disks and it is working fine, but I don't have the AWS volumes to test. If you have the setup, could you please try it in your environment when you have time? The steps @sstarcher described are as follows. Thank you very much! Kubernetes 1.3.4
I think I know what is happening here...
Looking at simple ways to fix this. Probably making the map of in-flight mounts just be global state. It shouldn't leak, but even if it does under rare circumstances, it will be better than this bug.
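For illustration, a minimal sketch of what that global-state approach could look like: a package-level, mutex-guarded map from device name to the volume being attached there, shared by all cloud-provider instances instead of being recreated (and lost) per instance. All names here are hypothetical, not the actual Kubernetes code.

```go
package awsvolumes

import "sync"

var (
	attachingMu sync.Mutex
	// device name (e.g. "/dev/xvdbb") -> EBS volume ID currently being attached
	attachingDevices = map[string]string{}
)

// reserveDevice records that volumeID is being attached at device, failing if
// the device name is already reserved for a different volume.
func reserveDevice(device, volumeID string) bool {
	attachingMu.Lock()
	defer attachingMu.Unlock()
	if existing, ok := attachingDevices[device]; ok && existing != volumeID {
		return false // device name already claimed by another attach
	}
	attachingDevices[device] = volumeID
	return true
}

// releaseDevice frees the device name once the attach completes or fails.
func releaseDevice(device string) {
	attachingMu.Lock()
	defer attachingMu.Unlock()
	delete(attachingDevices, device)
}
```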
Just put up an experimental fix; I haven't even tested that it compiles yet. But I thought I would get the ball rolling on how I think we should fix this!
When a volume is attached, it is possible that the actual state already has this volume object (e.g., the volume is attached to multiple nodes, or the volume was detached and attached again). We need to update the device path in such a situation; otherwise the device path would be stale information and cause kubelet to mount the wrong device. This PR partially fixes issue kubernetes#29324
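A minimal sketch of the idea in that fix, with simplified stand-in types rather than the real actual-state-of-world code: the device path is refreshed unconditionally whenever a volume is marked attached, so a detach-and-reattach at a different device cannot leave a stale path behind.

```go
package cache

import "sync"

type attachedVolume struct {
	devicePath string
	nodes      map[string]bool
}

type actualStateOfWorld struct {
	mu      sync.Mutex
	volumes map[string]*attachedVolume // volume name -> attachment state
}

func newActualStateOfWorld() *actualStateOfWorld {
	return &actualStateOfWorld{volumes: map[string]*attachedVolume{}}
}

func (asw *actualStateOfWorld) MarkVolumeAsAttached(volName, node, devicePath string) {
	asw.mu.Lock()
	defer asw.mu.Unlock()
	vol, ok := asw.volumes[volName]
	if !ok {
		vol = &attachedVolume{nodes: map[string]bool{}}
		asw.volumes[volName] = vol
	}
	// The fix: always refresh the device path, because the volume may have
	// been detached and re-attached (possibly at a different device), or
	// attached to multiple nodes since we last recorded it.
	vol.devicePath = devicePath
	vol.nodes[node] = true
}
```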
Automatic merge from submit-queue: Fix issue in updating device path when volume is attached multiple times. When a volume is attached, it is possible that the actual state already has this volume object (e.g., the volume is attached to multiple nodes, or the volume was detached and attached again). We need to update the device path in such a situation; otherwise the device path would be stale information and cause kubelet to mount the wrong device. This PR partially fixes issue #29324
Hi @argusua @spacepluk @svanderbijl and all other AWS users, the fix #32242 is already merged. I would really appreciate it if you could test against the master version (or the head of the tree) to see whether the problem is fixed on your cluster. There is still a small chance of failure if you reboot the node instead of terminating it and replacing it with a new name. I am working on this; it requires a bigger fix in PR #33760. Please let me know if you have any questions. Thank you!
Thank you, @jingxu97
FYI, this issue also affects PetSets in the 1.4.0 release. When a pod is killed or the minion running it dies, the volume does not detach so that it can re-attach to the newly assigned minion instance. However, the pod will start after timing out on the attach of the PV. This seems like a bug with PetSets: if the PV fails to mount, it should not continue. Should this be a separate issue or included here?
@tnine, for the case where a minion dies and is restarted/recreated, I am trying to address the issue of the volume manager failing to detach and attach in #33760. Please let me know whether that PR addresses your concern. As for a pod being killed (with the nodes running fine), attach/detach should work as expected. Please let me know if you experience any issues. Thanks!
Automatic merge from submit-queue: Fix race condition in updating attached volume between master and node. This PR tries to fix issue kubernetes#29324. The cause of this issue is a race condition that happens when marking volumes as attached for node status. This PR tries to clean up the logic of when and where to mark volumes as attached/detached. Basically the workflow is as follows: 1. When a volume is attached successfully, the volume and node info is added into nodesToUpdateStatusFor to mark the volume as attached to the node. 2. When a detach request comes in, it checks whether it is safe to detach now. If the check passes, the volume is removed from volumesToReportAsAttached to indicate that it is no longer considered attached. Afterwards, the reconciler tries to update node status and trigger the detach operation. If any of these operations fails, the volume is added back to the volumesToReportAsAttached list to show that it is still attached. These steps should make sure that kubelet gets the right (though possibly outdated) information about which volumes are attached. It also guarantees that if a detach operation is pending, kubelet will not trigger any mount operations. (cherry picked from commit 6a9a93d)
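A rough sketch of that reporting protocol, with illustrative names (only volumesToReportAsAttached is taken from the commit message above); it is not the actual attach/detach controller code:

```go
package reconciler

import "sync"

type attachDetachState struct {
	mu sync.Mutex
	// node name -> set of volumes to report as attached in that node's status
	volumesToReportAsAttached map[string]map[string]bool
}

func newAttachDetachState() *attachDetachState {
	return &attachDetachState{volumesToReportAsAttached: map[string]map[string]bool{}}
}

// markAttached: step 1. After a successful attach, start reporting the
// volume as attached in the node's status.
func (s *attachDetachState) markAttached(node, vol string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.volumesToReportAsAttached[node] == nil {
		s.volumesToReportAsAttached[node] = map[string]bool{}
	}
	s.volumesToReportAsAttached[node][vol] = true
}

// requestDetach: step 2. Once it is safe to detach, stop reporting the
// volume before the detach is issued, so kubelet never mounts a volume
// whose detach is pending.
func (s *attachDetachState) requestDetach(node, vol string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.volumesToReportAsAttached[node], vol)
}

// detachFailed: if the node status update or the detach itself fails, report
// the volume as attached again rather than leaving kubelet with wrong info.
func (s *attachDetachState) detachFailed(node, vol string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.volumesToReportAsAttached[node] == nil {
		s.volumesToReportAsAttached[node] = map[string]bool{}
	}
	s.volumesToReportAsAttached[node][vol] = true
}
```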
The problem is that attachments are now done on the master, and we are only caching the attachment map persistently for the local instance. So there is now a race, because the attachment map is cleared every time. Issue kubernetes#29324
In light of issue kubernetes#29324, double-check that the volume was attached correctly where we expect it before returning. Issue kubernetes#29324
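That double check could look something like the following sketch, which asks EC2 where the volume actually landed and compares it with the requested device. It uses the AWS SDK for Go v1; the volume ID and device name are placeholders, and this is not the actual cloud-provider code.

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))

	const volumeID = "vol-0123456789abcdef0" // placeholder
	const wantDevice = "/dev/xvdbb"          // device we asked EC2 to use

	out, err := svc.DescribeVolumes(&ec2.DescribeVolumesInput{
		VolumeIds: []*string{aws.String(volumeID)},
	})
	if err != nil {
		log.Fatal(err)
	}
	// Verify every reported attachment is at the device we requested.
	for _, att := range out.Volumes[0].Attachments {
		if got := aws.StringValue(att.Device); got != wantDevice {
			log.Fatalf("volume %s attached at %s, expected %s", volumeID, got, wantDevice)
		}
		fmt.Printf("volume %s attached at %s as expected\n", volumeID, wantDevice)
	}
}
```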
Is this issue already resolved?
I see. Thank you for the information! I'm looking forward to the next release.
We upgraded to 1.4.7 and are still seeing delays of about 2 minutes when pods change the host they're running on:
@justinsb we have recreated this twice in a row. We are going to get logs for you.
Issues go stale after 30d of inactivity. Prevent issues from auto-closing with an `/lifecycle frozen` comment. If this issue is safe to close now, please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
Background: K8s often has problems attaching/detaching AWS volumes: if I do a `kubectl replace` on a deployment, I frequently observe K8s not correctly detaching the volume, and the new deployment will be forever stuck in a crash loop because it times out getting the new volume. Seems like a race condition. The steps I always take when this happens are to delete the deployment, manually force-detach via the EC2 API, and manually `umount` the mounts; otherwise they never seem to be cleaned up.
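As an aside, that manual force-detach step can be scripted. A minimal sketch using the AWS SDK for Go v1 (the volume ID is a placeholder), equivalent to `aws ec2 detach-volume --force`:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))

	// Force-detach, as you would from the EC2 console when a volume is stuck.
	_, err := svc.DetachVolume(&ec2.DetachVolumeInput{
		VolumeId: aws.String("vol-0123456789abcdef0"), // placeholder
		Force:    aws.Bool(true),
	})
	if err != nil {
		log.Fatal(err)
	}
}
```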
So today I experienced this again, and I did the following: I ran `umount` on the wrong volume. I accidentally unmounted the volume of a different pod because I didn't read the `mount` output correctly. Oops.

What happened was that the new deployment was mounted with the wrong device:
This is the right volume ID, but it has mounted the wrong device. According to EC2, the (right) volume is attached to `/dev/xvdbb`. `/dev/xvdba` is the volume I accidentally unmounted by mistake:

So it's mounting the wrong volume into the pod.
Of course I was being naughty, but it seems to me that this should never happen, since EC2 has already allocated `xvdbb`?

Here's the entire KCM output from this fiasco.