Wrong AWS volume can be mounted #29324

Closed · atombender opened this issue Jul 20, 2016 · 89 comments

@atombender (Contributor) commented Jul 20, 2016

Background: K8s often has problems attaching/detaching AWS volumes: if I do a kubectl replace on a deployment, I frequently observe K8s failing to detach the volume correctly, and the new deployment gets stuck forever in a crash loop because it times out waiting for the new volume. It seems like a race condition. The steps I always take when this happens are to delete the deployment, manually force-detach via the EC2 API, and manually unmount the mounts; otherwise they never seem to get cleaned up.
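(For reference, the manual force-detach step looks roughly like this with the aws-sdk-go v1 EC2 client; a minimal sketch for illustration, with the region and volume ID taken from this report:)

package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	// Equivalent of `aws ec2 detach-volume --volume-id vol-... --force`,
	// the manual recovery step described above.
	sess := session.Must(session.NewSession(&aws.Config{
		Region: aws.String("us-east-1"),
	}))
	svc := ec2.New(sess)

	out, err := svc.DetachVolume(&ec2.DetachVolumeInput{
		VolumeId: aws.String("vol-0e3bf8343563ee5aa"),
		Force:    aws.Bool(true),
	})
	if err != nil {
		log.Fatalf("force-detach failed: %v", err)
	}
	fmt.Printf("detach state: %s\n", aws.StringValue(out.State))
}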

So today I experienced this again, and I did the following:

  1. Deleted the deployment.
  2. Manually detached the volume via the EC2 API, since it was not being released.
  3. Manually unmounted the wrong volume: I accidentally unmounted a different pod's volume because I misread the mount output. Oops.
  4. Created the deployment again.
  5. Noticed my mistake.

What happened was that the new deployment was mounted with the wrong device:

$ mount | grep xvdb
/dev/xvdba on /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/vol-0e3bf8343563ee5aa type ext4 (rw,relatime,data=ordered)
/dev/xvdba on /var/lib/kubelet/pods/a8f54bf6-4ebd-11e6-89fc-065b459e106f/volumes/kubernetes.io~aws-ebs/data type ext4 (rw,relatime,data=ordered)

The mount path shows the right volume ID, but the wrong device has been mounted there. According to EC2, the (right) volume is attached at /dev/xvdbb:

$ aws ec2 describe-volumes --volume-ids vol-0e3bf8343563ee5aa
{
    "Volumes": [
        {
            "AvailabilityZone": "us-east-1e", 
            "Attachments": [
                {
                    "AttachTime": "2016-07-20T21:05:17.000Z", 
                    "InstanceId": "i-06d03923e9e5a6329", 
                    "VolumeId": "vol-0e3bf8343563ee5aa", 
                    "State": "attached", 
                    "DeleteOnTermination": false, 
                    "Device": "/dev/xvdbb"
                }
            ], 
            "Tags": [],
            "Encrypted": false, 
            "VolumeType": "gp2", 
            "VolumeId": "vol-0e3bf8343563ee5aa", 
            "State": "in-use", 
            "Iops": 100, 
            "SnapshotId": "", 
            "CreateTime": "2016-07-19T00:25:00.168Z", 
            "Size": 30
        }
    ]
}

/dev/xvdba is the device of the volume I accidentally unmounted earlier:

            // ...
            "Attachments": [
                {
                    "AttachTime": "2016-07-19T17:56:27.000Z", 
                    "InstanceId": "i-06d03923e9e5a6329", 
                    "VolumeId": "vol-0e0a13f1b3a4a6dbc", 
                    "State": "attached", 
                    "DeleteOnTermination": false, 
                    "Device": "/dev/xvdba"
                }
            ], 

So it's mounting the wrong volume into the pod.

Of course I was being naughty, but it seems to me that this should never happen, since EC2 has already allocated xvdbb?

Here's the entire KCM output from this fiasco.

@apelisse apelisse added sig/storage Categorizes an issue or PR as relevant to SIG Storage. team/cluster labels Jul 21, 2016
@sstarcher commented:

I may have just run into the same problem without doing anything unusual. I have two separate pods that mount two separate AWS EBS volumes. Currently only one volume shows up as mounted on the instance, but in AWS EC2 both volumes are attached to the instance. Both pods have the same data volume mounted inside.

Kubernetes 1.3.4

  • Pod A has a persistent volume claim and mounts EBS volume A
  • Pod B has a direct awsElasticBlockStore statement in the pod spec for EBS volume B
  • Volumes A and B are both attached to the EC2 instance
  • Volume A is properly mounted on the host
  • Pod A and Pod B both have volume A mounted inside them. This is incorrect for Pod B.
  • Deleting Pod A detaches EBS volume A, after which EBS volume B was mounted

@matchstick matchstick assigned justinsb, saad-ali and jingxu97 and unassigned saad-ali Aug 12, 2016
@matchstick matchstick added this to the v1.3 milestone Aug 12, 2016
@jingxu97 (Contributor) commented:

@atombender, @sstarcher Thank you for posting the issues. Could you please share the kubelet log if it is still available? Thanks!

Jing

@atombender (Contributor, Author) commented:

@jingxu97 (Contributor) commented:

@atombender, thanks for the log. Could you also please share your deployment spec with us? Thanks!

@atombender (Contributor, Author) commented:

I don't have it right here, but I basically just use an AWS volume in the pod declaration (not a persistent vol + claim) and a single volume mount.

I will send you the actual file later when I'm back in front of a screen.
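(For illustration, such a pod-level EBS volume looks roughly like this expressed with the k8s.io/api/core/v1 Go types; a sketch in which the pod name and image are hypothetical, while the volume ID is the one from this report:)

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	pod := v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "data-pod"}, // hypothetical name
		Spec: v1.PodSpec{
			Containers: []v1.Container{{
				Name:  "app",
				Image: "example/app:latest", // hypothetical image
				VolumeMounts: []v1.VolumeMount{{
					Name:      "data",
					MountPath: "/data",
				}},
			}},
			Volumes: []v1.Volume{{
				Name: "data",
				// Direct awsElasticBlockStore volume; no PV/PVC involved.
				VolumeSource: v1.VolumeSource{
					AWSElasticBlockStore: &v1.AWSElasticBlockStoreVolumeSource{
						VolumeID: "vol-0e3bf8343563ee5aa",
						FSType:   "ext4",
					},
				},
			}},
		},
	}
	fmt.Println(pod.Spec.Volumes[0].Name)
}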


@jingxu97 (Contributor) commented:

@atombender thanks for your response. The log file you posted only has some error messages. Do you still have the full kubelet log on the instance? Thanks!

@atombender (Contributor, Author) commented:

That is the full log. (I have kept the logs all the way from when I started using K8s.) I believe I extracted the right time range.

FWIW, I don't think K8s logged much in terms of errors (though you will see some in the KCM log). In fact, that's kind of the point -- it attached the wrong thing, without erroring.


@jingxu97 (Contributor) commented:

@sstarcher I followed your steps but with GCE disks, and it seems to mount correctly. Could you please share the kubelet logs so that I can do some debugging? Thanks!

@jingxu97 (Contributor) commented:

@atombender From the KCM log, I only see one attach and one detach. But it seems that you created two deployments, so there should be two attach logs. The kubelet log contains only error messages (unable to mount, unable to clean up), with no info-level logs.

@atombender (Contributor, Author) commented:

I may have done the other deploy the day before. I will get you logs going farther back in time.


@sstarcher commented:

@jingxu97 I don't have the same exact setup anymore. I'll attempt to reproduce on Monday if possible.

@jingxu97 (Contributor) commented:

@justinsb I was trying to reproduce the error @sstarcher described for AWS volumes. I tried the same steps on GCE disks and they work fine, but I don't have AWS volumes to test with. If you have the setup, could you please try it in your environment when you have time? The steps @sstarcher described are as follows. Thank you very much!

Kubernetes 1.3.4

  • Pod A has a persistent volume claim and mounts EBS volume A
  • Pod B has a direct awsElasticBlockStore statement in the pod spec for EBS volume B
  • Volumes A and B are both attached to the EC2 instance
  • Volume A is properly mounted on the host
  • Pod A and Pod B both have volume A mounted inside them. This is incorrect for Pod B.
  • Deleting Pod A detaches EBS volume A, after which EBS volume B was mounted

@justinsb justinsb added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Aug 22, 2016
@justinsb (Member) commented:

I think I know what is happening here...

  • A user reported that this was happening with plain awsElasticBlockStore volumes as well. They provided me an extract of their logs. (Thanks Fairfax!)
  • In the logs, we saw ~5 volumes being assigned to a single instance at the same time, and all 5 of them were allocated the same device (within a very small time interval).
  • This is supposed to be impossible: we keep an in-memory map of attachments in flight precisely to avoid this, so that we can keep track of device names before they hit the AWS API (we have to choose a device name for AWS to attach the volume to, but I didn't particularly want to hold a lock while making the attachment).
  • BUT... we keep this map only for the local instance, because mounts used to happen locally; we don't keep it on the master, because otherwise we'd have to worry about how to garbage collect it.
  • Of course, in 1.3 that no longer holds: attach operations now happen on the master :-(

Looking at simple ways to fix this. Probably making the map of in-flight attachments global state. It shouldn't leak, but even if it does under rare circumstances, it will be better than this bug.
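(A minimal sketch of that direction, a process-wide reservation map for in-flight device names; purely illustrative, not the actual Kubernetes code:)

package attach

import (
	"errors"
	"sync"
)

// deviceAllocator reserves EBS device names that have been promised to
// in-flight attach calls but are not yet visible via the EC2 API. Kept as
// process-wide (global) state so concurrent attaches for the same instance
// can never pick the same device name.
type deviceAllocator struct {
	mu       sync.Mutex
	inFlight map[string]map[string]bool // instance ID -> reserved device names
}

// pick chooses a free device name for instanceID, given the device names
// EC2 already reports as attached.
func (d *deviceAllocator) pick(instanceID string, attached map[string]bool) (string, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.inFlight == nil {
		d.inFlight = map[string]map[string]bool{}
	}
	if d.inFlight[instanceID] == nil {
		d.inFlight[instanceID] = map[string]bool{}
	}
	for c := 'b'; c <= 'z'; c++ {
		dev := "/dev/xvdb" + string(c)
		if !attached[dev] && !d.inFlight[instanceID][dev] {
			d.inFlight[instanceID][dev] = true // reserve before calling AttachVolume
			return dev, nil
		}
	}
	return "", errors.New("no free device names for " + instanceID)
}

// release drops a reservation once EC2 reports the attachment, or once the
// attach call has failed.
func (d *deviceAllocator) release(instanceID, dev string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	delete(d.inFlight[instanceID], dev)
}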

@justinsb (Member) commented:

Just put up an experimental fix; I haven't even tested that it compiles yet. But I thought I would get the ball rolling on how I think we should fix this!

jingxu97 added a commit to jingxu97/kubernetes that referenced this issue Sep 30, 2016
When a volume is attached, it is possible that the actual state already
has this volume object (e.g., the volume is attached to multiple nodes,
or the volume was detached and attached again). We need to update the
device path in that situation; otherwise the device path would be stale
and cause kubelet to mount the wrong device.

This PR partially fixes issue kubernetes#29324
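(The idea in the commit above, sketched in simplified form; illustrative only, the real change lives in the attach/detach controller's actual state of the world:)

package cache

import "sync"

// actualState is a simplified stand-in for the controller's "actual state
// of the world" cache.
type actualState struct {
	mu sync.Mutex
	// devicePaths: volume name -> node name -> device path.
	devicePaths map[string]map[string]string
}

// MarkVolumeAttached records an attachment. The key point from the commit
// above: if the volume is already tracked (attached to multiple nodes, or
// detached and re-attached), the device path is refreshed instead of being
// left stale, so kubelet will not mount the wrong device.
func (asw *actualState) MarkVolumeAttached(volume, node, devicePath string) {
	asw.mu.Lock()
	defer asw.mu.Unlock()
	if asw.devicePaths == nil {
		asw.devicePaths = map[string]map[string]string{}
	}
	if asw.devicePaths[volume] == nil {
		asw.devicePaths[volume] = map[string]string{}
	}
	asw.devicePaths[volume][node] = devicePath // always overwrite
}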
k8s-github-robot pushed a commit that referenced this issue Oct 4, 2016
Automatic merge from submit-queue

Fix issue in updating device path when volume is attached multiple times

This PR partially fixes issue #29324
@jingxu97 (Contributor) commented Oct 4, 2016

Hi @argusua @spacepluk @svanderbijl and all other AWS users, the fix #32242 is already merged. I would really appreciate it if you could test against the master version (the head of the tree) to see whether the problem is fixed on your cluster.

There is still a small chance of failure if you reboot a node instead of terminating it and replacing it with a new name. I am working on this; it requires a bigger fix in PR #33760.

Please let me know if you have any questions. Thank you!

@argusua commented Oct 4, 2016

Thank you, @jingxu97
Thank you, @jingxu97. Unfortunately I'm on vacation and will only be back in two weeks. I hope somebody else can help with testing. Sorry.

@tnine commented Oct 5, 2016

FYI, this issue also affects PetSets in the 1.4.0 release. When a pod is killed or the minion running it dies, the volume is not detached and re-attached to the newly assigned minion instance. However, the pod will start anyway after timing out on the attach of the PV. This seems like a bug with PetSets: if the PV fails to mount, it should not continue. Should this be a separate issue or included here?

@jingxu97 (Contributor) commented Oct 5, 2016

@tnine, for the case where a minion dies and is restarted/recreated, I am trying to address the volume manager's failure to detach and attach in #33760. Please let me know whether that PR addresses your concern.

As for a pod being killed (while the nodes are running fine), attach/detach should work as expected. Please let me know about any issues you have experienced. Thanks!

shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this issue Dec 1, 2016
Automatic merge from submit-queue

Fix race condition in updating attached volume between master and node

This PR tries to fix issue kubernetes#29324. The cause of this issue is a
race condition when marking volumes as attached in the node status. This
PR tries to clean up the logic of when and where to mark volumes as
attached/detached. Basically the workflow is as follows:
1. When a volume is attached successfully, the volume and node info is
added into nodesToUpdateStatusFor to mark the volume as attached to the
node.
2. When a detach request comes in, it checks whether it is safe to
detach now. If the check passes, the volume is removed from
volumesToReportAsAttached to indicate that it is no longer considered
attached. Afterwards, the reconciler tries to update the node status and
trigger the detach operation. If any of these operations fails, the
volume is added back to the volumesToReportAsAttached list, showing that
it is still attached.

These steps should make sure that kubelet gets the right (though possibly
outdated) information about which volumes are attached. It also
guarantees that if a detach operation is pending, kubelet will not
trigger any mount operations.
(cherry picked from commit 6a9a93d)
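(A rough sketch of that bookkeeping; the names follow the commit message, but the code is illustrative rather than the real volume manager:)

package cache

import "sync"

// reportedAttachments tracks, per node, the volumes the node's status
// should list as attached (the volumesToReportAsAttached idea above).
type reportedAttachments struct {
	mu       sync.Mutex
	attached map[string]map[string]bool // node name -> volume names
}

// markAttached records a successful attach (step 1).
func (r *reportedAttachments) markAttached(node, volume string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.attached == nil {
		r.attached = map[string]map[string]bool{}
	}
	if r.attached[node] == nil {
		r.attached[node] = map[string]bool{}
	}
	r.attached[node][volume] = true
}

// prepareDetach removes the volume from the report once it is safe to
// detach (step 2), so kubelet stops treating it as attached and will not
// start new mounts while the detach is pending.
func (r *reportedAttachments) prepareDetach(node, volume string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.attached[node], volume)
}

// detachFailed adds the volume back if the node-status update or the
// detach operation fails, so it is reported as attached again.
func (r *reportedAttachments) detachFailed(node, volume string) {
	r.markAttached(node, volume)
}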
shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this issue Dec 1, 2016
The problem is that attachments are now done on the master, and we are
only caching the attachment map persistently for the local instance.  So
there is now a race, because the attachment map is cleared every time.

Issue kubernetes#29324
shyamjvs pushed a commit to shyamjvs/kubernetes that referenced this issue Dec 1, 2016
In light of issue kubernetes#29324, double-check that the volume was
attached where we expect it before returning.

Issue kubernetes#29324
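(That post-attach verification could look roughly like this with the aws-sdk-go v1 EC2 client; a sketch under those assumptions, not the actual cloud-provider code:)

package attach

import (
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
)

// verifyAttachment polls DescribeVolumes until EC2 reports the volume as
// attached to the expected instance at the expected device, or times out.
func verifyAttachment(svc *ec2.EC2, volumeID, instanceID, device string) error {
	deadline := time.Now().Add(2 * time.Minute)
	for time.Now().Before(deadline) {
		out, err := svc.DescribeVolumes(&ec2.DescribeVolumesInput{
			VolumeIds: []*string{aws.String(volumeID)},
		})
		if err != nil {
			return err
		}
		for _, vol := range out.Volumes {
			for _, att := range vol.Attachments {
				if aws.StringValue(att.State) == "attached" {
					if aws.StringValue(att.InstanceId) != instanceID ||
						aws.StringValue(att.Device) != device {
						return fmt.Errorf("volume %s attached at %s on %s, expected %s on %s",
							volumeID, aws.StringValue(att.Device),
							aws.StringValue(att.InstanceId), device, instanceID)
					}
					return nil // attached exactly where we expected
				}
			}
		}
		time.Sleep(5 * time.Second)
	}
	return fmt.Errorf("timed out waiting for %s to attach", volumeID)
}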
@rdtr commented Dec 3, 2016

Is this issue already resolved? I've been facing a very similar issue with v1.4.6+coreos0 for a Prometheus pod.

@mikesplain commented:

@rdtr It's not the same issue as this one, but #36840 may help you; it should be released this coming week on the 1.4 branch.

@rdtr commented Dec 5, 2016

I see. Thank you for the information! I'm looking forward to the next release.

@hanikesn commented:

We upgraded to 1.4.7 and are still seeing delays of about 2 minutes when pods change the host they're running on:

  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  4m		4m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned grafana-11524727-6vfm2 to ip-10-128-16-121.eu-central-1.compute.internal
  4m		3m		8	{controller-manager }			Warning		FailedMount	Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
		status code: 400, request id: 
  2m		2m	1	{kubelet ip-10-128-16-121.eu-central-1.compute.internal}					Warning	FailedMount	Unable to mount volumes for pod "grafana-11524727-6vfm2_kube-system(bf839d9d-c158-11e6-8c8f-066e0d14e725)": timeout expired waiting for volumes to attach/mount for pod "grafana-11524727-6vfm2"/"kube-system". list of unattached/unmounted volumes=[data-volume]
  2m		2m	1	{kubelet ip-10-128-16-121.eu-central-1.compute.internal}					Warning	FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "grafana-11524727-6vfm2"/"kube-system". list of unattached/unmounted volumes=[data-volume]
  41s		41s	1	{kubelet ip-10-128-16-121.eu-central-1.compute.internal}	spec.containers{grafana}	Normal	Pulling		pulling image "grafana/grafana:4.0.2"
  31s		31s	1	{kubelet ip-10-128-16-121.eu-central-1.compute.internal}	spec.containers{grafana}	Normal	Pulled		Successfully pulled image "grafana/grafana:4.0.2"
  31s		31s	1	{kubelet ip-10-128-16-121.eu-central-1.compute.internal}	spec.containers{grafana}	Normal	Created		Created container with docker id 18e5dc26b0cb; Secu
Dec 13 17:25:13 master1 kube-controller-manager[10866]: I1213 17:25:13.530908   10866 operation_executor.go:620] AttachVolume.Attach succeeded for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" (spec.Name: "grafana") from node "ip-10-128-16-121.eu-central-1.compute.internal".
Dec 13 17:25:01 master1 kube-controller-manager[10866]: I1213 17:25:01.978733   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:25:01 master1 kube-controller-manager[10866]: I1213 17:25:01.935105   10866 operation_executor.go:700] DetachVolume.Detach succeeded for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" (spec.Name: "grafana") from node "ip-10-128-16-122.eu-central-1.compute.internal".
Dec 13 17:24:51 master1 kube-controller-manager[10866]: I1213 17:24:51.317749   10866 operation_executor.go:754] Verified volume is safe to detach for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" (spec.Name: "grafana") from node "ip-10-128-16-122.eu-central-1.compute.internal".
Dec 13 17:24:51 master1 kube-controller-manager[10866]: I1213 17:24:51.311087   10866 reconciler.go:167] Started DetachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" from node "ip-10-128-16-122.eu-central-1.compute.internal"
Dec 13 17:23:47 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:23:47 master1 kube-controller-manager[10866]: I1213 17:23:47.282409   10866 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"grafana-11524727-6vfm2", UID:"bf839d9d-c158-11e6-8c8f-066e0d14e725", APIVersion:"v1", ResourceVersion:"12517206", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:23:47 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:23:47 master1 kube-controller-manager[10866]: E1213 17:23:47.282382   10866 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8\"" failed. No retries permitted until 2016-12-13 17:24:51.282357403 +0000 UTC (durationBeforeRetry 1m4s). Error: Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:23:47 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:23:47 master1 kube-controller-manager[10866]: E1213 17:23:47.282300   10866 attacher.go:72] Error attaching volume "vol-0dacaeb7988eaecb8": Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:23:46 master1 kube-controller-manager[10866]: I1213 17:23:46.844133   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:23:14 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:23:14 master1 kube-controller-manager[10866]: I1213 17:23:14.806039   10866 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"grafana-11524727-6vfm2", UID:"bf839d9d-c158-11e6-8c8f-066e0d14e725", APIVersion:"v1", ResourceVersion:"12517206", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:23:14 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:23:14 master1 kube-controller-manager[10866]: E1213 17:23:14.805591   10866 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8\"" failed. No retries permitted until 2016-12-13 17:23:46.805566071 +0000 UTC (durationBeforeRetry 32s). Error: Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:23:14 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:23:14 master1 kube-controller-manager[10866]: E1213 17:23:14.805510   10866 attacher.go:72] Error attaching volume "vol-0dacaeb7988eaecb8": Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:23:14 master1 kube-controller-manager[10866]: I1213 17:23:14.238093   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:22:58 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:58 master1 kube-controller-manager[10866]: I1213 17:22:58.162915   10866 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"grafana-11524727-6vfm2", UID:"bf839d9d-c158-11e6-8c8f-066e0d14e725", APIVersion:"v1", ResourceVersion:"12517206", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:58 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:58 master1 kube-controller-manager[10866]: E1213 17:22:58.162406   10866 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8\"" failed. No retries permitted until 2016-12-13 17:23:14.162380778 +0000 UTC (durationBeforeRetry 16s). Error: Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:58 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:58 master1 kube-controller-manager[10866]: E1213 17:22:58.162324   10866 attacher.go:72] Error attaching volume "vol-0dacaeb7988eaecb8": Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:57 master1 kube-controller-manager[10866]: I1213 17:22:57.694447   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:22:49 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:49 master1 kube-controller-manager[10866]: I1213 17:22:49.684951   10866 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"grafana-11524727-6vfm2", UID:"bf839d9d-c158-11e6-8c8f-066e0d14e725", APIVersion:"v1", ResourceVersion:"12517206", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:49 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:49 master1 kube-controller-manager[10866]: E1213 17:22:49.684778   10866 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8\"" failed. No retries permitted until 2016-12-13 17:22:57.684751979 +0000 UTC (durationBeforeRetry 8s). Error: Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:49 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:49 master1 kube-controller-manager[10866]: E1213 17:22:49.684697   10866 attacher.go:72] Error attaching volume "vol-0dacaeb7988eaecb8": Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:49 master1 kube-controller-manager[10866]: I1213 17:22:49.255749   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:22:45 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:45 master1 kube-controller-manager[10866]: I1213 17:22:45.188568   10866 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"grafana-11524727-6vfm2", UID:"bf839d9d-c158-11e6-8c8f-066e0d14e725", APIVersion:"v1", ResourceVersion:"12517206", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:45 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:45 master1 kube-controller-manager[10866]: E1213 17:22:45.188109   10866 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8\"" failed. No retries permitted until 2016-12-13 17:22:49.188084737 +0000 UTC (durationBeforeRetry 4s). Error: Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:45 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:45 master1 kube-controller-manager[10866]: E1213 17:22:45.188031   10866 attacher.go:72] Error attaching volume "vol-0dacaeb7988eaecb8": Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:44 master1 kube-controller-manager[10866]: I1213 17:22:44.744739   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:22:42 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:42 master1 kube-controller-manager[10866]: I1213 17:22:42.730993   10866 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"grafana-11524727-6vfm2", UID:"bf839d9d-c158-11e6-8c8f-066e0d14e725", APIVersion:"v1", ResourceVersion:"12517206", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:42 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:42 master1 kube-controller-manager[10866]: E1213 17:22:42.730470   10866 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8\"" failed. No retries permitted until 2016-12-13 17:22:44.730445015 +0000 UTC (durationBeforeRetry 2s). Error: Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:42 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:42 master1 kube-controller-manager[10866]: E1213 17:22:42.730392   10866 attacher.go:72] Error attaching volume "vol-0dacaeb7988eaecb8": Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:42 master1 kube-controller-manager[10866]: I1213 17:22:42.238802   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:22:41 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:41 master1 kube-controller-manager[10866]: I1213 17:22:41.196712   10866 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"grafana-11524727-6vfm2", UID:"bf839d9d-c158-11e6-8c8f-066e0d14e725", APIVersion:"v1", ResourceVersion:"12517206", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:41 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:41 master1 kube-controller-manager[10866]: E1213 17:22:41.196686   10866 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8\"" failed. No retries permitted until 2016-12-13 17:22:42.196661757 +0000 UTC (durationBeforeRetry 1s). Error: Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:41 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:41 master1 kube-controller-manager[10866]: E1213 17:22:41.196609   10866 attacher.go:72] Error attaching volume "vol-0dacaeb7988eaecb8": Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:40 master1 kube-controller-manager[10866]: I1213 17:22:40.735292   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:22:40 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:40 master1 kube-controller-manager[10866]: I1213 17:22:40.147534   10866 event.go:217] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"grafana-11524727-6vfm2", UID:"bf839d9d-c158-11e6-8c8f-066e0d14e725", APIVersion:"v1", ResourceVersion:"12517206", FieldPath:""}): type: 'Warning' reason: 'FailedMount' Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:40 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:40 master1 kube-controller-manager[10866]: E1213 17:22:40.147477   10866 nestedpendingoperations.go:262] Operation for "\"kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8\"" failed. No retries permitted until 2016-12-13 17:22:40.647450881 +0000 UTC (durationBeforeRetry 500ms). Error: Failed to attach volume "grafana" on node "ip-10-128-16-121.eu-central-1.compute.internal" with: Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:40 master1 kube-controller-manager[10866]: status code: 400, request id:
Dec 13 17:22:40 master1 kube-controller-manager[10866]: E1213 17:22:40.147394   10866 attacher.go:72] Error attaching volume "vol-0dacaeb7988eaecb8": Error attaching EBS volume: VolumeInUse: vol-0dacaeb7988eaecb8 is already attached to an instance
Dec 13 17:22:39 master1 kube-controller-manager[10866]: I1213 17:22:39.632706   10866 reconciler.go:202] Started AttachVolume for volume "kubernetes.io/aws-ebs/vol-0dacaeb7988eaecb8" to node "ip-10-128-16-121.eu-central-1.compute.internal"
Dec 13 17:22:39 master1 kube-controller-manager[10866]: I1213 17:22:39.588878   10866 event.go:217] Event(api.ObjectReference{Kind:"ReplicaSet", Namespace:"kube-system", Name:"grafana-11524727", UID:"57a37ad9-be23-11e6-b98a-069d86b508b9", APIVersion:"extensions", ResourceVersion:"12081091", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: grafana-11524727-6vfm2
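(Note the durationBeforeRetry values doubling from 500ms up to 1m4s: that exponential backoff is what turns a stuck detach into the roughly two-minute delay above. A minimal sketch of the pattern, illustrative rather than the actual nestedpendingoperations code:)

package retry

import (
	"log"
	"time"
)

// retryWithBackoff retries op with exponentially increasing delays,
// mirroring the 500ms, 1s, 2s, ... 1m4s sequence visible in the KCM log.
func retryWithBackoff(op func() error, maxDelay time.Duration) error {
	delay := 500 * time.Millisecond
	for {
		err := op()
		if err == nil {
			return nil
		}
		if delay > maxDelay {
			return err
		}
		log.Printf("operation failed, no retries permitted until %s: %v",
			time.Now().Add(delay).Format(time.RFC3339), err)
		time.Sleep(delay)
		delay *= 2 // double the wait after every failure
	}
}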

@chrislovecnm (Contributor) commented:

@justinsb we have recreated this twice in a row. We are going to get logs for you.

@fejta-bot commented:

Issues go stale after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 19, 2017
@jingxu97 (Contributor) commented:

The original issue in this post, "Wrong AWS volume can be mounted," is already solved. The other issue mentioned, related to detach delays, is addressed in #52735 and #53059. Closing this one.
