Pod volume mounting failing even after PV is bound and attached to pod #49926

Closed
amolsh opened this issue Aug 1, 2017 · 77 comments
Assignees
Labels
area/provider/aws: Issues or PRs related to aws provider
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
sig/storage: Categorizes an issue or PR as relevant to SIG Storage.

Comments

@amolsh

amolsh commented Aug 1, 2017

kubectl version:
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.4+coreos.0", GitCommit:"97c11b097b1a2b194f1eddca8ce5468fcc83331c", GitTreeState:"clean", BuildDate:"2017-03-08T23:54:21Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

yml file:

---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: gcr.io/google_containers/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
      annotations:
         volume.beta.kubernetes.io/storage-class: default
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

List of bound PVs:

[screenshot]

[screenshot]

ERROR:
[screenshot]

relevant kubelet logs:
[screenshot]

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 1, 2017
@xiangpengzhao
Contributor

/sig storage

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Aug 1, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 1, 2017
@huangjiasingle

@amolsh can you get the PV? Your storage uses a StorageClass, so creating the PVC should automatically provision a PV. Can you check whether the PV exists?

@amolsh
Author

amolsh commented Aug 1, 2017

Yes, I can get it:
[screenshot]

I also forgot to mention that my cluster is on AWS.
I am getting the above error for every deployment I have tried; none of them is able to mount the attached EBS PV into the pod.

@huangjiasingle

@amolsh did you create a StorageClass named default? Can you show the StorageClass? I think the cause may be a missing StorageClass named default. Do you have any other YAML?
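As a quick check, the existing StorageClasses (and which one is marked default) can be listed with kubectl; a minimal sketch, where default is the class name the volumeClaimTemplates annotation above points at:

kubectl get storageclass
# Inspect the class referenced by the PVC annotation:
kubectl get storageclass default -o yaml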

@amolsh
Author

amolsh commented Aug 2, 2017

@huangjiasingle This is my storage class

apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: default
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
  labels:
    kubernetes.io/cluster-service: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2

[screenshot]

@jingxu97
Contributor

jingxu97 commented Aug 2, 2017

@amolsh do you have the master log we can take a look at? You can email it to us if you prefer. The kubelet log shows that the volume is not attached to the node yet.
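A hedged sketch of checking, from the API side, whether the attach-detach controller believes the volume is attached (the node name below is the redacted one from this thread, and the PVC name follows the template-pod naming convention for the StatefulSet above):

# Volumes the controller has recorded as attached to the node
kubectl get node ip-100-x-x-x.us-west-2.compute.internal -o jsonpath='{.status.volumesAttached}'
# Binding state and events for the first replica's claim
kubectl describe pvc www-web-0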

@amolsh
Author

amolsh commented Aug 3, 2017

The master kube-controller-manager is not generating any logs related to the above StatefulSet (it logged nothing when I created the StatefulSet). I also couldn't find anything in the api-server logs. One more thing I forgot to mention: my cluster runs on CoreOS machines.

@amolsh
Author

amolsh commented Aug 7, 2017

@huangjiasingle
In the logs it gives the following error; the volume is not able to attach because of an authorization issue. Is it related to the AWS IAM policy?

Failed to attach volume "pvc-fff84c66-7b35-11e7-b125-02f3f42ec6aa" on node "ip-100-x-x-x.us-west-2.compute.internal" with: Error attaching EBS volume "vol-00cbb836374d1b37b" to instance "i-03d4ba6b17ab9cf5f": UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: 4nedpChQKhsxXs......
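For reference, a hedged sketch of granting the EC2 actions the attach path needs on the instance role that performs the attach (the role and policy names here are hypothetical; which role needs it depends on how the cluster was provisioned):

cat > ebs-attach-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AttachVolume",
        "ec2:DetachVolume",
        "ec2:DescribeVolumes",
        "ec2:DescribeInstances"
      ],
      "Resource": "*"
    }
  ]
}
EOF
# Hypothetical role name; attach to the instance role of the node/master doing the attach.
aws iam put-role-policy \
  --role-name my-k8s-node-role \
  --policy-name k8s-ebs-attach \
  --policy-document file://ebs-attach-policy.json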

@amolsh
Author

amolsh commented Aug 7, 2017

I updated the IAM policy and added ec2:AttachVolume and ec2:DetachVolume, which resolved the authorization issue. But now it is giving another error, even though the volume is available and gets attached to the instance:

Failed to attach volume "pvc-79e4c457-7b57-11e7-a96e-0698361089de" on node "ip-100-x-x-x.us-west-2.compute.internal" with: Error attaching EBS volume "vol-00da392489f12395f" to instance "i-02bbbc571c95b69fd": IncorrectState: vol-00da392489f12395f is not 'available'.

@amolsh
Author

amolsh commented Aug 8, 2017

Now the volume attachment is successful, but I am still getting an error:

[screenshot]

master kube-controller logs:


I0808 10:56:41.980229       1 event.go:217] Event(api.ObjectReference{Kind:"StatefulSet", Namespace:"default", Name:"web", UID:"42d0dd3d-7c28-11e7-a194-0acdd643d1f4", APIVersion:"apps", ResourceVersion:"32766795", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' pet: web-0
I0808 10:56:42.010206       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:56:42.028325       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:56:42.045299       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:56:42.074867       1 reconciler.go:213] Started AttachVolume for volume "kubernetes.io/aws-ebs/aws://us-west-2c/vol-041e0445484858feb" to node "ip-100-x-x-216.us-west-2.compute.internal"

I0808 10:56:52.643138       1 operation_executor.go:620] AttachVolume.Attach succeeded for volume "kubernetes.io/aws-ebs/aws://us-west-2c/vol-041e0445484858feb" (spec.Name: "pvc-40b565bc-7c26-11e7-b125-02f3f42ec6aa") from node "ip-100-x-x-216.us-west-2.compute.internal".
I0808 10:56:58.393077       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:57:28.393299       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:57:58.393600       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:58:28.393739       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:58:58.394266       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:59:28.393965       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0
I0808 10:59:58.394304       1 pet_set.go:332] StatefulSet web blocked from scaling on pod web-0

@msau42
Member

msau42 commented Nov 9, 2017

@amolsh are you still seeing this issue?

@amolsh
Author

amolsh commented Nov 9, 2017 via email

@msau42
Member

msau42 commented Nov 9, 2017

This seems like a possible issue related to EBS volumes. Can @kubernetes/sig-aws-misc help out?

@mattcamp

I'm also having this issue: "AttachVolume.Attach failed for volume "pvc-5aa1db99-d04b-11e7-96e1-0a328f684a08" : Error attaching EBS volume "vol-026ef055a9d715c14" to instance "i-0e6f2c5d8edd415e4": IncorrectState: vol-026ef055a9d715c14 is not 'available'. status code: 400"

Using a single-master, 3-node cluster in AWS (deployed via kops), trying to deploy influxdb via helm.

It worked the first time I deployed it, but after deleting the deployment (to change the collectd config) I can't get it to work. It doesn't seem to matter which node; none of them work.

Grafana is deployed on the same cluster just fine (but I have only deployed it once).

helm install stable/influxdb --name influxdb --set persistence.enabled=true --set config.collectd.enabled=true --namespace poke

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.2", GitCommit:"922a86cfcd65915a9b2f69f3f193b8907d741d9c", GitTreeState:"clean", BuildDate:"2017-07-21T08:23:22Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"} Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.10", GitCommit:"bebdeb749f1fa3da9e1312c4b08e439c404b3136", GitTreeState:"clean", BuildDate:"2017-11-03T16:31:49Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

kubectl.log
node.log

@gnufied
Member

gnufied commented Nov 27, 2017

The "IncorrectState: vol-026ef055a9d715c14 is not 'available'. status code: 400" error is transient: it happens when you try to attach a volume immediately after creating it, goes away within a second or so, and the controller keeps retrying the attach.

@mattcamp are you seeing the "not available" error more than once in the controller logs? Looking at the kubectl output I see the following message:

  3m            3m      1       kubelet, ip-172-20-36-162.eu-west-1.compute.internal                                            Normal  SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "pvc-5aa1db99-d04b-11e7-96e1-0a328f684a08"

This indicates the attach succeeded and the volume was attached to the node. I think something else happened when the container was started:

3m            4s      21      kubelet, ip-172-20-36-162.eu-west-1.compute.internal    spec.containers{influxdb-influxdb}      Warning BackOff                 Back-off restarting failed container

You may want to look into it.
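A minimal sketch of digging into that restart loop (the pod name is a placeholder; the container name is taken from the event above):

# Recent events and container state transitions for the failing pod
kubectl describe pod <influxdb-pod>
# Logs from the previous, crashed container instance
kubectl logs <influxdb-pod> -c influxdb-influxdb --previous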

@neverfox

neverfox commented Nov 30, 2017

Ran into "IncorrectState: ... is not 'available'. status code: 400" when trying a stateful set in 1.8 that works great in 1.7. It does retry but keeps failing and the pod stays in a CreateContainerConfigError state (something that's new to me).

@feffi

feffi commented Jan 7, 2018

Hi, I got the same (?) error when running helm install:

$ kubectl get storageclass
NAME            PROVISIONER             AGE
default         kubernetes.io/aws-ebs   4d
gp2 (default)   kubernetes.io/aws-ebs   4d
$ kubectl get persistentvolume
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                          STORAGECLASS   REASON    AGE
pvc-8b23869c-f196-11e7-b93f-020c644f020c   8Gi        RWO            Delete           Bound     default/sonarqube-postgresql   gp2                      3d
pvc-c4707186-f3f0-11e7-b93f-020c644f020c   20Gi       RWO            Delete           Bound     default/git-minio              gp2                      21m
pvc-c4710474-f3f0-11e7-b93f-020c644f020c   10Gi       RWO            Delete           Bound     default/git-postgresql         gp2                      21m
pvc-c471bf13-f3f0-11e7-b93f-020c644f020c   10Gi       RWO            Delete           Bound     default/git-redis              gp2                      21m
pvc-c47273d1-f3f0-11e7-b93f-020c644f020c   10Gi       RWO            Delete           Bound     default/git-registry-data      gp2                      21m
pvc-c4732d10-f3f0-11e7-b93f-020c644f020c   10Gi       RWO            Delete           Bound     default/git-gitlab-data        gp2                      21m
$ kubectl get persistentvolumeclaim
NAME                   STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
git-gitlab-data        Bound     pvc-c4732d10-f3f0-11e7-b93f-020c644f020c   10Gi       RWO            gp2            21m
git-minio              Bound     pvc-c4707186-f3f0-11e7-b93f-020c644f020c   20Gi       RWO            gp2            21m
git-postgresql         Bound     pvc-c4710474-f3f0-11e7-b93f-020c644f020c   10Gi       RWO            gp2            21m
git-redis              Bound     pvc-c471bf13-f3f0-11e7-b93f-020c644f020c   10Gi       RWO            gp2            21m
git-registry-data      Bound     pvc-c47273d1-f3f0-11e7-b93f-020c644f020c   10Gi       RWO            gp2            21m
sonarqube-postgresql   Bound     pvc-8b23869c-f196-11e7-b93f-020c644f020c   8Gi        RWO            gp2            3d
Events:
  Type     Reason                 Age                 From                                                    Message
  ----     ------                 ----                ----                                                    -------
  Warning  FailedScheduling       24m (x2 over 24m)   default-scheduler                                       PersistentVolumeClaim is not bound: "git-registry-data" (repeated 3 times)
  Normal   Scheduled              24m                 default-scheduler                                       Successfully assigned git-registry-7b784bb55d-lvhbk to ip-172-20-34-53.eu-central-1.compute.internal
  Warning  FailedMount            24m                 attachdetach                                            AttachVolume.Attach failed for volume "pvc-c47273d1-f3f0-11e7-b93f-020c644f020c" : Error attaching EBS volume "vol-0e7150ac4806512a8" to instance "i-0ee36c5c18824a89e": "IncorrectState: vol-0e7150ac4806512a8 is not 'available'.\n\tstatus code: 400, request id: c8cc4773-af8a-402c-b08b-fb280b4fb3da"
  Warning  FailedMount            23m                 attachdetach                                            AttachVolume.Attach failed for volume "pvc-c47273d1-f3f0-11e7-b93f-020c644f020c" : Error attaching EBS volume "vol-0e7150ac4806512a8" to instance "i-0ee36c5c18824a89e": "IncorrectState: vol-0e7150ac4806512a8 is not 'available'.\n\tstatus code: 400, request id: 1e0a0209-147c-458f-9fbf-7c45ef885e7c"
  Normal   SuccessfulMountVolume  23m                 kubelet, ip-172-20-34-53.eu-central-1.compute.internal  MountVolume.SetUp succeeded for volume "default-token-z2zbm"
  Normal   SuccessfulMountVolume  23m                 kubelet, ip-172-20-34-53.eu-central-1.compute.internal  MountVolume.SetUp succeeded for volume "pvc-c47273d1-f3f0-11e7-b93f-020c644f020c"
  Warning  FailedMount            12m (x5 over 21m)   kubelet, ip-172-20-34-53.eu-central-1.compute.internal  Unable to mount volumes for pod "git-registry-7b784bb55d-lvhbk_default(c4a17555-f3f0-11e7-b93f-020c644f020c)": timeout expired waiting for volumes to attach/mount for pod "default"/"git-registry-7b784bb55d-lvhbk". list of unattached/unmounted volumes=[certs]
  Warning  FailedMount            11m (x14 over 23m)  kubelet, ip-172-20-34-53.eu-central-1.compute.internal  MountVolume.SetUp failed for volume "certs" : secrets "registry-server-tls" not found
  Warning  FailedSync             3m (x9 over 21m)    kubelet, ip-172-20-34-53.eu-central-1.compute.internal  Error syncing pod

@msau42
Member

msau42 commented Jan 8, 2018

@feffi the mount error you got is pretty self-explanatory:

MountVolume.SetUp failed for volume "certs" : secrets "registry-server-tls" not found

This is different from the OP's error, which is related to EBS volumes.

@macropin

It's worth noting that a message such as Warning FailedScheduling 24m (x2 over 24m) default-scheduler PersistentVolumeClaim is not bound: "git-registry-data" (repeated 3 times) can appear to indicate a PV/PVC issue, when in fact the pod is restarting due to an error and the message is repeated when the container is rescheduled.

@hugohenley

Same issue here!

@jolcese

jolcese commented Jan 31, 2018

Same problem when deploying a StatefulSet with kops on AWS.
I'm trying to deploy Cassandra with 6 replicas; sometimes it fails on the first pod and sometimes on the 2nd...

PersistentVolumeClaim is not bound: "cassandra-data-pvc-cassandra-1" (repeated 9 times)
AttachVolume.Attach failed for volume "pvc-dd201a20-06dd-11e8-9f82-06210fc81b88" : Error attaching EBS volume "vol-0a752df5ddef30f64" to instance "i-039b684aca16fc74e": "IncorrectState: vol-0a752df5ddef30f64 is not 'available'.\n\tstatus code: 400, request id: e57f9b75-e57b-4455-843c-9f9b61eb83f1"
Back-off restarting failed container
Error syncing pod

versions:

jolcese@jolcese-osx:~/src/kubernetes/kops$ kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T10:09:24Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.6", GitCommit:"6260bb08c46c31eea6cb538b34a9ceb3e406689c", GitTreeState:"clean", BuildDate:"2017-12-21T06:23:29Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
jolcese@jolcese-osx:~/src/kubernetes/kops$ kops  version
Version 1.8.0
jolcese@jolcese-osx:~/src/kubernetes/kops$

@lvicentesanchez

Same error here... I create a StorageClass, PV, PVC and a StatefulSet in one go and I get the same error. The EBS volume is actually attached to the node, so I'm not sure what's going on.

@lvicentesanchez

I have tried removing the pod, and it still fails once recreated.

@lvicentesanchez

lvicentesanchez commented Feb 19, 2018

I'm using Kubernetes 1.8.8, deployed using kops 1.8.1, with RBAC enabled.

Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-07T12:22:21Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.8", GitCommit:"2f73858c9e6ede659d6828fe5a1862a48034a0fd", GitTreeState:"clean", BuildDate:"2018-02-09T21:23:25Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64

@lvicentesanchez

lvicentesanchez commented Feb 19, 2018

Some additional information. If I remove everything using kubectl delete -f mantifest/influxdb.yaml, the EBS volume goes to the available status. If I then create it again, I get a timeout while trying to mount the volume, but the EBS volume is 'in use'. So the first time I get an error because of the 'available' status, and after that the volume can't be attached even though it is 'in use' by the target node.

Unable to mount volumes for pod "monitoring-influxdb-0_kube-system(e039f409-15a5-11e8-8142-0a3a1b5232ac)": timeout expired waiting for volumes to attach/mount for pod "kube-system"/"monitoring-influxdb-0". list of unattached/unmounted volumes=[influxdb-persistent-storage]

@omerh

omerh commented Dec 26, 2018

I am also having issues with NVMe instances, running Kubernetes 1.10.x. In this case I tried using m5.large with a PVC.
This is the StatefulSet that reproduces it:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: k8s-rmq
spec:
  serviceName: "k8s-rmq"
  replicas: 1
  selector:
    matchLabels:
      app: k8s-rmq
  template:
    metadata:
      labels:
        app: k8s-rmq
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      nodeSelector:
        kops.k8s.io/instancegroup: nodes
      terminationGracePeriodSeconds: 30
      containers:
      - name: k8s-rmq
        imagePullPolicy: IfNotPresent
        image: rabbitmq:3.7.8-management-alpine
        ports:
        - containerPort: 5672
          name: amqp
        - containerPort: 15672
          name: management
        envFrom:
            - configMapRef:
                name: k8s-dev-aws         
        env:
          - name: RABBITMQ_DEFAULT_USER
            value: example
          - name: RABBITMQ_DEFAULT_PASS
            value: example
        resources:
          limits:
            cpu: "800m"
            memory: "1Gi"
          requests:
            cpu: "100m"
            memory: "128Mi"
        livenessProbe:
          tcpSocket:
            port: 5672
          initialDelaySeconds: 20
          timeoutSeconds: 5
          periodSeconds: 30
          failureThreshold: 2
          successThreshold: 1
        readinessProbe:
          tcpSocket:
            port: 5672
          initialDelaySeconds: 20
          timeoutSeconds: 5
          periodSeconds: 30
          failureThreshold: 2
          successThreshold: 1
        volumeMounts:
        - name: rmqvol
          mountPath: /var/lib/rabbitmq
  volumeClaimTemplates:
  - metadata:
      name: rmqvol
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 20Gi

These are the storage classes:

NAME            PROVISIONER             AGE
default         kubernetes.io/aws-ebs   337d
gp2 (default)   kubernetes.io/aws-ebs   337d

The EBS volume is created and attached to the instance, but the kubelet fails to mount the disk into the pod:

1m          1m           1       k8s-rmq-0.1573f1a3938f660f                                     Pod    \
                                                         Warning   FailedMount               \
     kubelet, ip-172-20-57-150.eu-west-1.compute.internal  \
    Unable to mount volumes for pod "k8s-rmq-0_default(1c45f9a0-0932-11e9-b1e7-0ac8a16a5f0c)": timeout expired waiting for volumes to attach or mount for pod "default"/"k8s-rmq-0". list of unmounted volumes=[rmqvol default-token-wp48g]. list of unattached volumes=[rmqvol default-token-wp48g]

Cluster provisioned using Kops 1.10

kubectl version

Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.0", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"clean", BuildDate:"2018-12-03T21:04:45Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.11", GitCommit:"637c7e288581ee40ab4ca210618a89a555b6e7e9", GitTreeState:"clean", BuildDate:"2018-11-26T14:25:46Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

When checking the mounts on the m5.large node, there is no disk mount for the NVMe drive.

When switching to m4.large, the mounts include:
/dev/xvdcu 20G 49M 19G 1% /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/eu-west-1a/vol-0b73b3a1bf15aac39

The node image in Kops is: kope.io/k8s-1.10-debian-jessie-amd64-hvm-ebs-2018-08-17

And on the same note, a different use case: when launching a new cluster using kops with masters on NVMe instances like m5.large, host startup fails to mount the etcd volume and hangs with protokube:1.10.0 looping on:

I1226 20:23:59.194888    1721 aws_volume.go:320] nvme path not found "/rootfs/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0fdeaa59d34bd2ab1"

After installing nvme-cli I can see that the volume exists:

root@ip-10-101-35-149:~# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     vol0913974dacc67c490 Amazon Elastic Block Store               1           0.00   B /  68.72  GB    512   B +  0 B   1.0
/dev/nvme1n1     vol0fdeaa59d34bd2ab1 Amazon Elastic Block Store               1           0.00   B /  21.47  GB    512   B +  0 B   1.0
/dev/nvme2n1     vol0a587f2950331bf7b Amazon Elastic Block Store               1           0.00   B /  21.47  GB    512   B +  0 B   1.0

But /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol0fdeaa59d34bd2ab1 does not exist.

The /dev/disk/by-id mapping doesn't exist; there is only /dev/disk/by-uuid/.

So basically I cannot use any NVMe-based instance for masters, or for nodes that have an EBS PVC.
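A rough workaround sketch, run on the node, that recreates the by-id symlinks the lookup above expects from the nvme list output; a udev rule doing the same thing persistently would be the durable fix, so treat this as illustrative only:

#!/bin/bash
# Map each NVMe namespace to its EBS volume ID (the serial number reported by nvme list)
# and create the /dev/disk/by-id symlink that kubelet/protokube look for.
mkdir -p /dev/disk/by-id
nvme list | awk '$1 ~ /^\/dev\/nvme/ {print $1, $2}' | while read -r dev serial; do
  link="/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_${serial}"
  [ -e "$link" ] || ln -s "$dev" "$link"
done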

@chey

chey commented Jan 31, 2019

Cloud: AWS
OS: RedHat 7.6
kube version: v1.12.5

Having a similar problem with storage myself. When draining a node that has an EBS volume attached to a Pod and/or Deployment, the storage doesn't move: it releases from the original node but never makes it to the new node.

Like others, switching to t2.xlarge instances fixed this for me.

@gabordk

gabordk commented Feb 12, 2019

Same problem here, AWS, k8s version 1.13.0

@brynmathias

Same problem as well: AWS EKS, k8s version 1.11.0.
I see the issue on c5.9xlarge machines and on p2.xlarge instances.

I have a feeling it might be due to the maximum number of EBS attachments per EC2 node.
I do not know whether the attachment limit is being hit, or whether attachments are not being properly removed.
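A small sketch for checking that theory, i.e. how many EBS volumes are currently attached to a given node (the node name and instance ID are placeholders):

# Kubernetes view: volumes the attach/detach controller reports on the node
kubectl get node <node-name> -o jsonpath='{range .status.volumesAttached[*]}{.name}{"\n"}{end}' | wc -l
# AWS view: EBS volumes attached to the underlying instance
aws ec2 describe-volumes \
  --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
  --query 'length(Volumes)'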

@maflaven

maflaven commented Apr 24, 2019

I've been seeing a similar issue on EKS with k8s version 1.11. Our support agent suggested the following:


It's very likely that the Kubernetes scheduler was choosing worker nodes in an Availability Zone (AZ) where the volume is not available. This can happen when the node the scheduler selects for the pod is not in the availability zone where the PersistentVolume (i.e. the EBS volume) exists, for example because there aren't sufficient CPU and/or memory resources on the nodes in that zone. The scheduler then chooses a node in another zone and fails to schedule the pod with this error.

This was a known issue in Kubernetes[1][2] and has been fixed by enabling the "VolumeScheduling"[2] feature in the scheduler.

Another workaround is to create the volumes manually and update the PVCs, but both options make the cluster less available/dynamic.

References:
[1] kubernetes/enhancements#490
[2] #34583
[3] https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/
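For example, a hedged sketch of a StorageClass that uses the delayed binding mode tied to that feature, so the volume is provisioned in the AZ of the node the pod actually lands on (the class name is illustrative; volumeBindingMode needs a reasonably recent Kubernetes, roughly 1.10+ with VolumeScheduling enabled and 1.12+ for GA):

kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-delayed            # illustrative name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
volumeBindingMode: WaitForFirstConsumer
EOF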

@msau42
Member

msau42 commented Apr 24, 2019

/assign @leakingtapan

@gabordk

gabordk commented Apr 29, 2019

Unfortunately this has nothing to do with AWS Availability Zones or VolumeScheduling. AZ-related problems are a popular explanation nowadays, so people tend to conflate that problem with this one, but a quick look at the Availability Zones makes it clear there is no connection.

Today's testing results:
Kubernetes: 1.13.0
Instance: m5.4xlarge
EBS: gp2

  • Both the worker node and the EBS volume are in the same AZ (us-east-1c) (VolumeScheduling is now enabled in our StorageClass; previously it wasn't; nothing changed).
  • EBS volume is successfully created and mounted on the worker node
  • kubectl describe pv reports the volume as "Bound", events are empty
  • kubectl describe pvc reports as "Bound", events are empty
  • pod events:
    - MountVolume.WaitForAttach failed for volume "pvc-xxxxx" : could not find attached AWS Volume "aws://us-east-1c/vol-xxxxx". Timeout waiting for mount paths to be created
    - Unable to mount volumes for pod "xyz": timeout expired waiting for volumes to attach or mount for pod

@gabordk

gabordk commented Apr 29, 2019

After some debugging I found that my problem is the following; all the symptoms match:
coreos/bugs#2371

@hustshawn

@gabordk I got the same issue. Any solution or progress on this?

@k8s-ci-robot k8s-ci-robot added area/provider/aws Issues or PRs related to aws provider and removed sig/aws labels Aug 6, 2019
@dfang
Contributor

dfang commented Aug 10, 2019

Same issue here.

On microk8s with the default storage addon enabled, installing helm-consul hits this issue: all PVs and PVCs are Bound, but the consul-server pods still get "pod has unbound immediate PersistentVolumeClaims". Recreating these pods didn't help.

@jhoblitt

I believe I'm seeing the same issue on AWS with EKS 1.3.8 (1.3 "eks.2"):

  Warning  FailedMount             56s (x11 over 23m)  kubelet, ip-192-168-124-20.ec2.internal  Unable to mount volumes for pod "jenkins-6758665c4c-gg5tl_jenkins(f6440463-ca87-11e9-a31c-0a4da4f89c32)": timeout expired waiting for volumes to attach or mount for pod "jenkins"/"jenkins-6758665c4c-gg5tl". list of unmounted volumes=[jenkins-home]. list of unattached volumes=[plugins tmp jenkins-config plugin-dir secrets-dir jenkins-home sc-config-volume jenkins-token-pn7mq]

@ishantanu

@jhoblitt Did you try EKS 1.2.x? I faced some issues with StatefulSets with PVCs on EKS 1.3.x, but everything ran just fine on EKS 1.2.x.

@mogaal

mogaal commented Sep 4, 2019

@jhoblitt I was facing the same issue until 10 minutes ago. I realised there was a problem/bug with the kubernetes-plugin I was using. I solved it by upgrading the Jenkins kubernetes-plugin to 1.18.3.

@jhoblitt

jhoblitt commented Sep 4, 2019

@ishantanu I don't believe I was seeing this problem with 1.2, but it's been a while since I've tested with that version.

@mogaal This problem is present outside of pods managed by Jenkins.

@s2504s

s2504s commented Sep 12, 2019

@mogaal
Thank you!! I've upgraded the Jenkins kubernetes-plugin and it works!!!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 11, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 10, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sudip-moengage

/reopen

@k8s-ci-robot
Contributor

@sudip-moengage: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
