Deployments with GCE PD fail with "...is already being used by..." #48968
Comments
This report mentioned Deployments along with GCE PDs. This can get tricky because in some cases it can result in multiple pods (scheduled to different nodes) referencing the same (read-write-once) volume, which causes the second pod to fail to start. To prevent this from happening, the general recommendations for using Deployments with GCE PDs are to keep a single replica and use the "Recreate" update strategy (see the sketch at the end of this comment).
However, the reporter mentioned they used the "Recreate" strategy, which means that there must be a bug here. To help us debug, if you run into this issue, please:
Let's figure this out! CC @kubernetes/sig-storage-bugs |
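For illustration, a minimal sketch of those recommended settings; everything here (names, image) is hypothetical, and the exact apiVersion depends on your cluster version:

```yaml
# Single replica + Recreate: the old pod is deleted before the replacement
# is scheduled, giving the controller a chance to detach the read-write-once
# disk before any other node tries to attach it.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:3.2
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: redis-data  # PVC bound to a gce-pd volume
```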
The initial report mentions that the node was not healthy, so the pod got rescheduled. Does that trigger the update strategy (Recreate or RollingUpdate)? |
We've run into this a lot too, and we weren't using Deployments. We were simply spawning pods that had a PD attached. After the pods were killed, the disks didn't seem to be detached from the underlying VM automatically, sometimes for hours. When the pods came back up on a different machine, we often got this exact same error (and it sometimes cleared up in a few minutes, often not). We wrote a script that just watches for this issue (grepping describe in a loop, ugh) and runs the appropriate gcloud detach command, and that made the problem 'go away' for us. We were on GKE. |
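For anyone else working around this, the manual detach described above is roughly the following (disk, instance, and zone names are placeholders):

```sh
# Find which instance the disk is still attached to...
gcloud compute disks describe my-disk --zone us-central1-a --format='value(users)'
# ...and detach it so the new node can attach it.
gcloud compute instances detach-disk my-node --disk my-disk --zone us-central1-a
```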
We also see this issue happening. The most annoying thing about GCE is that it does "silent" migration of VMs, after which PVs are not released. So when Kubernetes wants to start a pod on that node, it fails with a "resource already in use" type of error. cc: @sadlil |
Hello, I'm the original reddit poster. A couple more details. The Deployment in question (sans sensitive info):
The PVC:
The last time I observed the behavior in question, it was induced by a faulty job producing a high number of failing pods, which generated disk pressure. Redis was rescheduled to a different node and the rest is history. |
@yuvipanda Thank you for reporting the issue. Would it be possible to share some controller logs from when the issue happened? Thanks! |
@yuvipanda can you elaborate on "pod gets killed"? Previously, for pods that were terminated but not deleted from the API server, we never used to perform volume detach, but this changed in #45191. |
@jingxu97 we were running GKE, and I couldn't find a way to get logs of controllers, unfortunately. @gnufied we spawn one pod per user, and then the user can execute arbitrary code inside the pod (via a Jupyter Notebook). We have a script that watches for users who are inactive and performs a delete (via the k8s API) of their pods. We found that for a long (and intermittent?) time after that, the volumes would still be attached to the node on which the user's pod had been running. So when the user's pod was re-created (maybe they became active again), we needed to attach the same volume back to this pod, and this would fail with this message. |
@jingxu97 reached out to me, and I provided a repro case (https://gist.github.com/yuvipanda/0b6aa32192c35b960e91698e1c14690c) |
@wawastein thanks for reporting the error. You mentioned a job producing a high number of faulty pods. Does each pod use a different PVC, so that it creates a new volume? Thanks! |
@jingxu97 hi, no, the job pods do not use or create any volumes. The problem was a misconfiguration: containers couldn't connect to the DB and failed time and time again. |
I can't reproduce this on 1.7, btw. I can on 1.6. |
@yuvipanda I tried both 1.6.4 and 1.7 clusters with your repro steps, but could not reproduce the errors. |
Here we go again. |
@wawastein could you please share more information about this issue, in what condition this error occurred? If possible, could you also share the project, zone and cluster name information so that we could check the master log? |
@jingxu97 so the timeline was as follows:
Project: superlocal-149713 |
@wawastein I checked the master log and found the following related information:
From the log we can see that around 13:52:53 the PVC is first attached to node gke-staging-cluster-2-pool-1-bb497dff-9mms. Later, around 14:00:38, the reconciler tries to attach this PVC to a different node and fails, since it is still attached to the old node. The problem here is that the old pod is not deleted, so the volume is still attached, and the new pod tries to attach it and fails. I haven't looked at the deployment controller in detail (see the deployment doc https://kubernetes.io/docs/concepts/workloads/controllers/deployment/), but if the new pod fails to start because of a volume problem and the controller does not kill the old pod, it will be stuck. I also notice that in your master controller log there are lots of errors like the following; not sure whether they are relevant or not.
I will check with the workload team to see whether there is a problem during deployment updates. |
@jingxu97 the last log messages confused me too; I thought it might be because most of the time those pods are idle and don't use resources. The HPA ones are related to deleted deployments. The bound PVC one, however, is at least counter-intuitive. If a new pod is scheduled to a new node, maybe there ought to be a check for any detachable volumes of the old pod first. |
@wawastein the scheduler and the volume manager are completely separate. When the scheduler schedules the pod, it does not know whether the volume used by the pod is still attached to some node or not. But normally the old pod should be deleted at some point, so the volume will be detached from the old node, even though that might happen a little after the new pod is created. |
@wawastein Could you check the comment here to see whether following the steps could solve your problem? #48968 (comment) |
I've had this same problem pop up when attempting to schedule single-replica deployments after VM migrations of my nodes on GCP. Eventually the scheduler gives up trying to reschedule the pod and I have to delete it manually to resolve the issue. I'm going to try recreating the deployments with strategy: Recreate and see if that helps, but I'll report back in this issue if it happens again and I can provide a timestamp, node, project, etc. UPDATE: It looks like, even when using the new strategy, the issue can still occur. Anecdotally though, after changing the strategy, the disk was released much faster than previously and the pod quickly recovered. I'll need a greater sample size before I can say that with any authority, though. |
I'm noticing this during lots of deploys. Maybe there's a timeout when switching disks between machines, or some other sort of sync loop? |
@jingxu97 I'll try it on the new deployment. But I can't really force a reschedule to test; I'll just sit there and hope it doesn't happen. |
/assign
Looking into recovery from different failure scenarios with recommended settings to make sure recreated pods are schedulable. |
I am facing a similar issue with StatefulSets. I have attached the PV through a PVC in GKE, but when the replicas are more than 1, the remaining pods go into the waiting CrashLoopBackOff state. Any progress on this issue? |
Hi @jakirpatel, CrashLoopBackOff generally means that the volume is successfully mounted, but your container is crashing. You'll need to check the logs of your container to see why it's crashing. |
Can you post the events you see with kubectl describe pod <pod-name>? |
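For reference, the usual triage commands (pod name is a placeholder):

```sh
# The events section shows whether this is an attach/mount failure
# or a crashing container.
kubectl describe pod my-pod
# For CrashLoopBackOff, the previous container's logs usually say why it died.
kubectl logs my-pod --previous
```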
I have looked into different failure scenarios with our recommended settings and have the following conclusions:
Note: This may not be an exhaustive list of failure scenarios. Please contact me if you are experiencing issues with a different scenario.
Note: All scenarios were tested with "replicas: 1" and a Deployment with a PVC referencing a gce-pd (which only supports single-node attach). |
@davidz627 This seems to be in line with what I've experienced as well. The multi-attach error happened when I was updating the StatefulSet, which caused pods to be reshuffled across nodes. |
@davidz627 can you also try this experiment with statefulsets? Theoretically, you should not see multi-attach errors with StatefulSets because the pod must be completely deleted before a new pod with the same name can be recreated. @mofirouz if you could paste your full pod events showing the CrashLoopBackOff, that would be helpful in triaging. |
Also tested with StatefulSets with 3 replicas, using volumeClaimTemplates with gce-pd. Here are the results: |
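For context, a minimal sketch of a StatefulSet of that shape — three replicas, each getting its own gce-pd PVC from a volumeClaimTemplate (names, image, and storage class are assumptions):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx
        volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  # Each replica (web-0, web-1, web-2) gets its own PVC from this template.
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard  # GKE's default gce-pd-backed class
      resources:
        requests:
          storage: 1Gi
```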
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
/remove-lifecycle rotten
Hi, we experienced the same problem with Kubernetes 1.9.6:
Kubernetes version (use kubectl version): Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:13:31Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider or hardware configuration: GCE
Some more details: we use a StatefulSet for our pod, configured with the RollingUpdate strategy. The pod was restarted from node1 to node2 and got stuck in the ContainerCreating state due to:
controller-manager logs:
After checking for volumesAttached and volumesInUse for both nodes we saw:
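For anyone reproducing this, the attachment state being compared here is visible on the Node objects themselves (node name is a placeholder):

```sh
# volumesAttached: what the attach/detach controller believes is attached.
kubectl get node node1 -o jsonpath='{.status.volumesAttached}'
# volumesInUse: what the kubelet reports as mounted by pods on that node.
kubectl get node node1 -o jsonpath='{.status.volumesInUse}'
```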
By draining node 2 and having the pod restart on node 1 (which happened by accident, since we have more than 2 nodes in our cluster), the problem was fixed. Due to the requirement of a manual fix there is some downtime, and this is a critical pod for our cluster. |
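The drain workaround they describe amounts to the following (node name is a placeholder):

```sh
# Evict all pods from the node holding the stale attachment so the pod
# reschedules elsewhere; DaemonSet pods cannot be evicted and are skipped.
kubectl drain node2 --ignore-daemonsets
# Re-enable scheduling once the pod is healthy on another node.
kubectl uncordon node2
```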
@plkokanov to confirm, you saw this problem when you issued a rolling update on your statefulset? |
@plkokanov would you mind sharing the full controller-manager log with us so we can help triage? Seeing VolumeInUse on node 2 is normal, because the controller tries to mount gcp-dynam-pvc-id on node 2. The strange thing is why the volume is still attached to node 1. |
@msau42 we saw it after the pod was restarted. The reason for the restart was most likely an unsuccessful liveness probe. I'll make sure to post the exact reason as soon as I manage to reproduce it. |
This error is still occurring for me, running GKE 1.9.6-1 with a pod using a gce-pd PVC. The odd thing is that this error pops up immediately on creation of the pod and its associated PVC, after deleting the old pod and PVC. My guess is that the old one has not been deleted in GCP, even though it's gone from Kubernetes and from the GCE console. This is a really big problem and is making it impossible to run any stateful apps in my cluster. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@pv93, sorry for the late reply. Could you please give us more details about your issue? You mentioned you deleted the old pod and PVC. Did you use the "--force" and "--grace-period=0" options (a sketch of the command follows below)? Otherwise, the new pod or PVC (if using the same name as the old one) cannot be created without the old one being cleaned up and deleted. For the PVC: unless you change the reclaim policy, by default, if the PVC is deleted, the volume should also be deleted. Your new PVC should create a new volume. |
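For completeness, the force-delete being asked about looks like this (pod name is a placeholder); it skips graceful termination, so use it with care:

```sh
kubectl delete pod my-pod --grace-period=0 --force
```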
@jingxu97 it seems like this issue has stopped popping up as much with Kubernetes 1.10. Once a pod is rescheduled, GKE is much faster at detaching the volume from the old node and attaching it to the new one. So this isn't a problem for me anymore. Thanks anyway. |
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
From a user report on reddit:
Other reports here:
What you expected to happen:
Volume should attach to new node without issue.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version):
- Kernel (e.g. uname -a):