[nfs-client-provisioner]PVC pending state #754

diewuq opened this issue May 7, 2018 · 29 comments
Labels: area/nfs-client, lifecycle/rotten


diewuq commented May 7, 2018


Currently I'm trying to get the nfs-client-provisioner running on k8s v1.10.1, using the YAML you provide for testing with RBAC. When I deploy test-claim.yaml, the PVC stays Pending with the following event:

Normal ExternalProvisioning 2m (x143 over 37m) persistentvolume-controller waiting for a volume to be created, either by external provisioner "fuserim.pri/ifs" or manually created by system administrator

I checked the nfs-client-provisioner pod, and it started successfully:
root@k8s-master1:/data/nfs_file# kubectl logs nfs-client-provisioner-5bff76dd6c-j2mpg
I0507 07:22:12.743160 1 controller.go:407] Starting provisioner controller 5c858a43-51c7-11e8-84df-0a580af40124!
I also looked at #174, but the link " " has expired. I checked kube-controller-manager.yaml, and --controllers is set as below:

  • --controllers=*,bootstrapsigner,tokencleaner

I don't know whether this is the issue. Can anyone help? Thanks in advance.
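
One thing worth comparing when the PVC waits on an external provisioner: the name in the event message comes from the StorageClass, and the provisioner pod only acts on claims whose class names it exactly. Below is a minimal sketch of the two places that must agree. The class name and the provisioner name `example.com/nfs` are hypothetical placeholders, and it assumes the Deployment passes the name through a PROVISIONER_NAME environment variable.

```yaml
# StorageClass: the "provisioner" field is the name reported in the PVC event.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage        # hypothetical class name
provisioner: example.com/nfs       # hypothetical provisioner name
---
# Excerpt from the provisioner Deployment's container spec (not a complete manifest).
env:
  - name: PROVISIONER_NAME         # assumed env var that carries the name
    value: example.com/nfs         # must be identical to the StorageClass provisioner
```

If the two strings differ by even one character, the provisioner starts cleanly, logs nothing about the claim, and the PVC stays Pending.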

rtrive commented May 9, 2018

Hi, me too, I'm stuck here: the PVC stays in the Pending state. I tried both the quick start and the normal mode.

Same issue for me. I also deployed the openebs provisioner, and it works well.

wongma7 commented Jun 15, 2018

Is that the full log of the provisioner? If not, can you provide it?

/area nfs-client

I have the same issue. The log (-v 4) shows nothing interesting:

I0625 13:00:02.214122 1 controller.go:492] Starting provisioner controller ac4f591d-7877-11e8-8e50-0a58c0a80305!
I0625 13:00:02.214962 1 reflector.go:202] Starting reflector *v1.PersistentVolumeClaim (15s) from
I0625 13:00:02.215061 1 reflector.go:240] Listing and watching *v1.PersistentVolumeClaim from
I0625 13:00:02.215163 1 reflector.go:202] Starting reflector *v1.PersistentVolume (15s) from
I0625 13:00:02.215192 1 reflector.go:240] Listing and watching *v1.PersistentVolume from
I0625 13:00:02.216170 1 reflector.go:202] Starting reflector *v1.StorageClass (15s) from
I0625 13:00:02.216273 1 reflector.go:240] Listing and watching *v1.StorageClass from
I0625 13:00:17.221821 1 reflector.go:286] forcing resync
I0625 13:00:17.229086 1 reflector.go:286] forcing resync

mostlyAtNight commented Jun 28, 2018

Hello - I have the same issue (PVC stuck in Pending state) - but I'm using the EFS provisioner.

Logs (for the pod) do not show much:

W0628 09:37:15.045679       1 efs-provisioner.go:86] couldn't confirm that the EFS file system exists: AccessDeniedException: User: arn:aws:sts::433835555346:assumed-role/eks-sandpit-01-worker-nodes-NodeInstanceRole-AZUYA99QG92C/i-06832ac3659b22ca0 is not authorized to perform: elasticfilesystem:DescribeFileSystems on the specified resource
	status code: 403, request id: d757ebaf-7ab6-11e8-901a-91e9a34a399e
I0628 09:37:15.047517       1 controller.go:492] Starting provisioner controller d75a58ef-7ab6-11e8-9ea8-66d5a8a809a1! 

How can I get further information to help debugging this issue?

One other thing: I'm running on Amazon EKS (quite new) and had to install nfs-utils on the nodes as it was not installed by default. Perhaps the issue is related and there is some other missing software there?

Kind regards, Pete

Strangely, it's now working as expected. Things I changed:

  • Set up everything (ConfigMap, Deployment, StorageClass) via a single manifest file
    • I modified the manifest file to contain the serviceAccount patch mentioned in the docs
  • Did the above after deleting any leftovers from previous attempts (with the exception of the RBAC permissions modifications)
  • Applied the manifest file

The PVCs now bind properly and are not stuck in the Pending state - perhaps it's something to do with the container not liking starting out in a different serviceAccount and then being subsequently patched?


wongma7 commented Jun 28, 2018

If it were a serviceAccount misconfiguration I would expect to see many permission denied errors in the log. So I am still perplexed!

n0thing2333 commented Jun 29, 2018

Stuck here too, @mostlyAtNight.
I'm also using EKS+EFS. Can you elaborate on how you solved this problem? (The service account part)

Stuck for me too. Describe output for the PVC:

# kubectl describe pvc efs
Name:          efs
Namespace:     default
StorageClass:  aws-efs
Status:        Pending
Labels:        <none>
Finalizers:    []
Access Modes:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  ExternalProvisioning  1m (x62 over 16m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "" or manually created by system administrator

It looks like the functionality described in the documentation ("When you create a claim that asks for the class, a volume will be automatically created.") is no longer working.
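
For reference, a minimal sketch of a claim that "asks for the class". The claim name `efs`, the namespace and the class `aws-efs` are taken from the describe output above; the access mode and the requested size are assumptions, and it uses the `storageClassName` field rather than the older beta annotation.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs
  namespace: default
spec:
  storageClassName: aws-efs   # must match the StorageClass name exactly
  accessModes:
    - ReadWriteMany           # assumption: shared NFS-style access
  resources:
    requests:
      storage: 1Mi            # assumption: a size is required by the API
```

A claim like this only gets a volume if a running provisioner is registered under the name given in that StorageClass's `provisioner` field.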

matthieudolci commented Jul 7, 2018

We had the same issue with EFS on EKS, using the AWS EKS AMI for the nodes. We solved it by using amazon-efs-utils instead of nfs-utils.

I have the same PVC pending issue:

Name:          efs
Namespace:     default
StorageClass:  aws-efs
Status:        Pending
Finalizers:    []
Access Modes:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  ExternalProvisioning  1m (x61 over 16m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "" or manually created by system administrator

pictolearn commented Jul 10, 2018

I came across the same issue using KOPS. I have followed whatever was specified in the docs and also used the following as a reference, but my pod file looks different, with an nginx instance trying to mount the EFS cluster.

$ kubectl describe deployment  efs-provisioner
Name:               efs-provisioner
Namespace:          default
CreationTimestamp:  Tue, 10 Jul 2018 20:42:17 +0530
Labels:             app=efs-provisioner
Selector:           app=efs-provisioner
Replicas:           1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:       Recreate
MinReadySeconds:    0
Pod Template:
  Labels:  app=efs-provisioner
    Port:       <none>
    Host Port:  <none>
      FILE_SYSTEM_ID:    <set to the key '' of config map 'efs-provisioner'>    Optional: false
      AWS_REGION:        <set to the key 'aws.region' of config map 'efs-provisioner'>        Optional: false
      PROVISIONER_NAME:  <set to the key '' of config map 'efs-provisioner'>  Optional: false
      /persistentvolumes from pv-volume (rw)
    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    Path:      /pvs
    ReadOnly:  false
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   efs-provisioner-7f764f8cc4 (1/1 replicas created)
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  2m    deployment-controller  Scaled up replica set efs-provisioner-7f764f8cc4 to 1

$ kubectl describe pvc efs
Name:          efs
Namespace:     default
StorageClass:  aws-efs
Status:        Pending
Labels:        <none>
Finalizers:    []
Access Modes:  
  Type    Reason                Age   From                         Message
  ----    ------                ----  ----                         -------
  Normal  ExternalProvisioning  10s   persistentvolume-controller  waiting for a volume to be created, either by external provisioner "" or manually created by system administrator

Here is the sample nginx YAML I used to test:

apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        volumeMounts:
        - name: efs
          mountPath: /usr/share/nginx/html
      volumes:
      - name: efs
        persistentVolumeClaim:
          claimName: efs
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T22:29:25Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:13:31Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

$ kops version
Version 1.9.1

wongma7 commented Jul 10, 2018

Can you confirm the provisioner log looks the same as above (i.e. it is basically empty) #754 (comment) ? What if you try a newer/older image? (What version are you using now?)

I resolved the problem by creating the PV before the PVC.
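
A rough sketch of what a manually created PV for this could look like, assuming an NFS-backed volume. The PV name, server address, capacity and reclaim policy are placeholders, not values from this thread. Note that this bypasses the dynamic provisioner rather than fixing it.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-manual                 # hypothetical name
spec:
  capacity:
    storage: 1Mi                   # assumption: must be >= the claim's request
  accessModes:
    - ReadWriteMany                # assumption: must include the claim's access mode
  storageClassName: aws-efs        # same class the pending claim asks for
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.com        # placeholder: your NFS/EFS endpoint
    path: /
```

A claim can bind to this volume only if the class, access modes and capacity are compatible.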

pictolearn commented Jul 11, 2018

I tried the same thing, creating a PV before the PVC, and it did not work. I'm not sure it has anything to do with the order. I suspect it could be something with KOPS, but it shouldn't matter. I have ensured that the worker nodes and the master nodes have nfs-common installed.

For those getting

waiting for a volume to be created, either by external provisioner "" or manually created by system administrator

If you're using rbac (which it seems kops does by default), make sure you've created a serviceAccount. Look at the deployment.yaml rather than the manifest.yaml.

You'll also need rbac.yaml. The readme describes this but doesn't mention the service account.

Sigh. This took me all day to figure out. I hope this helps someone save some time:

The issue, for me, was the default service account. It didn't have access to get endpoints. Mind you, this is a fresh EKS cluster install, not running anything special yet.

You can confirm this by exec'ing into your efs-provisioner pod with a shell (you will have to install bash since it's alpine, run "apk add --no-cache bash" and then shell in with /bin/bash).

Once you're in, run "/efs-provisioner". You may start seeing this output:

error retrieving resource lock default/ endpoints "" is forbidden: User "system:serviceaccount:default:default" cannot get endpoints in the namespace "default"

If you do, you have the same issue as I did.

The quick fix is to give the cluster-admin role to the default service account. Of course, depending on your environment and security, you may need a more elaborate fix. If you elect to go the easy way, you can simply apply this:

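kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: default-admin-rbac # or whatever name you prefer
subjects:
  - kind: ServiceAccount
    name: default
    namespace: default
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io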

After I did this, the output from the provisioner immediately became successful and started creating volumes.
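
If cluster-admin is too broad for your environment, a narrower option is a namespaced Role that grants only the endpoints access the error message complains about. This is a minimal sketch, not a tested replacement: the Role name is hypothetical, the verb list is an assumption about what leader election needs, and the provisioner still needs its usual PV/PVC/StorageClass permissions from rbac.yaml, which the cluster-admin binding would otherwise have covered.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: leader-locking-endpoints   # hypothetical name
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]  # assumed verb set
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: leader-locking-endpoints
  namespace: default
subjects:
  - kind: ServiceAccount
    name: default                  # the account named in the "forbidden" error
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-endpoints
  apiGroup: rbac.authorization.k8s.io
```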

mustafaakin commented Sep 17, 2018

There is also RBAC configuration in the repo for the EFS provisioner:

geerlingguy commented Dec 21, 2018

@ParaSwarm - Thank you so much for posting the detailed guide. Note that for me, I ran apk add bash (without the --no-cache flag) to make installing bash work without any fancy quoting or other steps.

I am running the provisioner in a namespace (efs-provisioner) instead of in the default namespace, so I am seeing:

E1221 22:10:38.183781      79 leaderelection.go:252] error retrieving resource lock efs-provisioner/ endpoints "" is forbidden: User "system:serviceaccount:efs-provisioner:default" cannot get endpoints in the namespace "efs-provisioner"

I am using the RBAC config from the external-storage/aws/efs/deploy/rbac.yaml file (with the namespace replaced where needed), but I'm still getting the above errors. I'm trying to debug what is wrong with the RBAC setup...

This is in an EKS cluster, running 1.11.5.

geerlingguy commented Dec 21, 2018

So turns out the issue for me was related to #953 — there are some missing rules in the RBAC ClusterRole which prevented the efs-provisioner service account from being able to do what it needed to do.

Adding the following lines to the rules in the ClusterRole fixed it for me:

  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]


I also added a ServiceAccount for the provisioner:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: efs-provisioner
  namespace: default

And attached that serviceAccount to my pod spec for the efs-provisioner Deployment:

      serviceAccount: efs-provisioner
      containers:
        - name: efs-provisioner

Finally, I had to put the Role and RoleBinding into the namespace as well.

@geerlingguy This worked for me. Thank you a ton!

weisjohn commented Feb 4, 2019

@geerlingguy also worked for me, thanks for the great explanation!

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jun 4, 2019

I think it's an issue with your RBAC configuration. If you specify namespace parameters in your RBAC YAML file, you can refer to this file: only specify the namespace parameter in the subjects section of the ClusterRoleBinding/RoleBinding. Also, don't forget to add the endpoints permission to your ClusterRole resource.
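
A minimal sketch of that layout, assuming the provisioner runs as an `efs-provisioner` service account in an `efs-provisioner` namespace, as in the comments above. The ClusterRole and binding names are hypothetical, and only the endpoints rule is shown; the remaining rules from your rbac.yaml are omitted, not unnecessary.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: efs-provisioner-runner     # hypothetical name; cluster-scoped, so no namespace
rules:
  # ...keep the existing rules from your rbac.yaml here...
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: run-efs-provisioner        # hypothetical name; cluster-scoped, so no namespace
subjects:
  - kind: ServiceAccount
    name: efs-provisioner
    namespace: efs-provisioner     # the namespace goes here, on the subject
roleRef:
  kind: ClusterRole
  name: efs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
```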

wiltonfelix commented Jun 23, 2019

@liozzazhang thanks, that fixed my problem!

