This repository has been archived by the owner on Oct 21, 2020. It is now read-only.

[nfs-client-provisioner]PVC pending state #754

Closed
diewuq opened this issue May 7, 2018 · 29 comments
Labels: area/nfs-client, lifecycle/rotten
@diewuq

diewuq commented May 7, 2018

Hi,

Currently I'm trying to get the nfs-client-provisioner running on k8s v1.10.1, using the provided YAML for testing with RBAC. When I deploy test-claim.yaml, the PVC is always Pending with the following event:

Normal ExternalProvisioning 2m (x143 over 37m) persistentvolume-controller waiting for a volume to be created, either by external provisioner "fuserim.pri/ifs" or manually created by system administrator

I checked the nfs-client-provisioner pod; it started successfully:
root@k8s-master1:/data/nfs_file# kubectl logs nfs-client-provisioner-5bff76dd6c-j2mpg
I0507 07:22:12.743160 1 controller.go:407] Starting provisioner controller 5c858a43-51c7-11e8-84df-0a580af40124!
I also referred to #174; the link "https://kubernetes.io/docs/admin/kube-controller-manager/" there is dead, and I checked kube-controller-manager.yaml. --controllers is set as below:

  • --controllers=*,bootstrapsigner,tokencleaner

I don't know whether this is the issue. Can anyone help? Thanks in advance.
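For anyone debugging the same symptom, a quick first check (a sketch; the class and deployment names below come from the upstream nfs-client example and this report, so substitute your own): the provisioner name in the StorageClass must match the PROVISIONER_NAME the pod was started with character for character, or the claim waits forever. The event above mentions "fuserim.pri/ifs"; compare it letter by letter with the deployment's env var.

# StorageClass side: the provisioner field (the upstream example class is named managed-nfs-storage)
kubectl get storageclass managed-nfs-storage -o jsonpath='{.provisioner}'

# Deployment side: the PROVISIONER_NAME env var the pod actually runs with
kubectl get deploy nfs-client-provisioner -o jsonpath='{.spec.template.spec.containers[0].env}'

# Watch the claim's events while it is Pending
kubectl describe pvc test-claim

# Full provisioner log
kubectl logs deployment/nfs-client-provisioner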

@rtrive

rtrive commented May 9, 2018

Hi, me too, I'm stuck here; the PVC is in Pending state. I tried both the quick start and the normal mode.

@dragon9783

Same issue for me. I also deployed the OpenEBS provisioner, and it works well.

@wongma7
Contributor

wongma7 commented Jun 15, 2018

Is that the full log of the provisioner? If not, can you provide it?

/area nfs-client

@jrfeenst

I have the same issue. The log (-v 4) shows nothing interesting:

I0625 13:00:02.214122 1 controller.go:492] Starting provisioner controller ac4f591d-7877-11e8-8e50-0a58c0a80305!
I0625 13:00:02.214962 1 reflector.go:202] Starting reflector *v1.PersistentVolumeClaim (15s) from github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:496
I0625 13:00:02.215061 1 reflector.go:240] Listing and watching *v1.PersistentVolumeClaim from github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:496
I0625 13:00:02.215163 1 reflector.go:202] Starting reflector *v1.PersistentVolume (15s) from github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:497
I0625 13:00:02.215192 1 reflector.go:240] Listing and watching *v1.PersistentVolume from github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:497
I0625 13:00:02.216170 1 reflector.go:202] Starting reflector *v1.StorageClass (15s) from github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:498
I0625 13:00:02.216273 1 reflector.go:240] Listing and watching *v1.StorageClass from github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:498
I0625 13:00:17.221821 1 reflector.go:286] github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:498: forcing resync
I0625 13:00:17.229086 1 reflector.go:286] github.com/kubernetes-incubator/external-storage/lib/controller/controller.go:496: forcing resync

@mostlyAtNight

mostlyAtNight commented Jun 28, 2018

Hello - I have the same issue (PVC stuck in Pending state) - but I'm using the EFS provisioner.

Logs (for the pod) do not show much:

W0628 09:37:15.045679       1 efs-provisioner.go:86] couldn't confirm that the EFS file system exists: AccessDeniedException: User: arn:aws:sts::433835555346:assumed-role/eks-sandpit-01-worker-nodes-NodeInstanceRole-AZUYA99QG92C/i-06832ac3659b22ca0 is not authorized to perform: elasticfilesystem:DescribeFileSystems on the specified resource
	status code: 403, request id: d757ebaf-7ab6-11e8-901a-91e9a34a399e
I0628 09:37:15.047517       1 controller.go:492] Starting provisioner controller d75a58ef-7ab6-11e8-9ea8-66d5a8a809a1! 

How can I get further information to help debug this issue?

One other thing: I'm running on Amazon EKS (quite new) and had to install nfs-utils on the nodes, as it was not installed by default. Perhaps the issue is related and some other software is missing there too?

Kind regards, Pete
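The AccessDeniedException in the log above points at missing IAM permissions on the node instance role rather than anything inside the cluster. A minimal sketch of granting just that call (the role name is copied from the error message; the efs-describe.json file and policy name are illustrative, and you may prefer a managed policy or tighter resource scoping):

# Policy allowing the single read-only EFS call the provisioner complains about
cat > efs-describe.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["elasticfilesystem:DescribeFileSystems"],
      "Resource": "*"
    }
  ]
}
EOF

# Attach it inline to the worker-node instance role named in the error
aws iam put-role-policy \
  --role-name eks-sandpit-01-worker-nodes-NodeInstanceRole-AZUYA99QG92C \
  --policy-name efs-provisioner-describe \
  --policy-document file://efs-describe.json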

@mostlyAtNight

Strangely, it's now working as expected. Things I changed:

  • Set up everything (ConfigMap, Deployment, StorageClass) via a single manifest file
    • I modified the manifest file to contain the serviceAccount patch mentioned in the docs
  • Did the above after deleting any leftovers from previous attempts (with the exception of the RBAC permissions modifications)
  • Applied the manifest file

The PVCs now bind properly and are no longer stuck in the Pending state. Perhaps it's something to do with the container not liking starting out under a different serviceAccount and then being patched afterwards?

...
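For reference, the serviceAccount patch mentioned above is along these lines (a sketch; the account name efs-provisioner follows the upstream example, and baking serviceAccount directly into the Deployment manifest, as done here, avoids patching after the fact):

kubectl patch deployment efs-provisioner \
  -p '{"spec":{"template":{"spec":{"serviceAccount":"efs-provisioner"}}}}'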

@wongma7
Contributor

wongma7 commented Jun 28, 2018

If it were a serviceAccount misconfiguration I would expect to see many permission denied errors in the log. So I am still perplexed!

@n0thing2333

n0thing2333 commented Jun 29, 2018

Stuck here too, @mostlyAtNight. I'm also using EKS + EFS. Can you elaborate on how you solved this problem (the service account part)?

@nagapavan

Stuck for me too. kubectl describe output for the PVC:

# kubectl describe pvc efs
Name:          efs
Namespace:     default
StorageClass:  aws-efs
Status:        Pending
Volume:
Labels:        <none>
Annotations:   volume.beta.kubernetes.io/storage-class=aws-efs
               volume.beta.kubernetes.io/storage-provisioner=example.com/aws-efs
Finalizers:    []
Capacity:
Access Modes:
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  ExternalProvisioning  1m (x62 over 16m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "example.com/aws-efs" or manually created by system administrator

It looks like the documented behavior ("When you create a claim that asks for the class, a volume will be automatically created.") is no longer working.
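For comparison, a minimal class/claim pairing in the style of the upstream EFS example (a sketch; the names match the describe output above, but these are not necessarily the exact manifests in use here):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: aws-efs
provisioner: example.com/aws-efs
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: efs
  annotations:
    volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi

Automatic creation only happens if a provisioner registered under exactly example.com/aws-efs is running and healthy; otherwise the claim sits in Pending emitting the event above.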

@matthieudolci

matthieudolci commented Jul 7, 2018

We had the same issue with EFS and EKS, using the AWS EKS AMI for the nodes. We solved it by using amazon-efs-utils instead of nfs-utils.
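On the Amazon Linux based EKS AMI that is roughly the following, run on each worker node (a sketch, assuming yum-based nodes):

# amazon-efs-utils provides the EFS mount helper (it pulls in its own NFS dependencies)
sudo yum install -y amazon-efs-utils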

@Youhana-Hana

I have the same PVC pending issue:

Name:          efs
Namespace:     default
StorageClass:  aws-efs
Status:        Pending
Volume:
Labels:        <none>
Annotations:   kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{"volume.beta.kubernetes.io/storage-class":"aws-efs"},"name":"efs","namespa..
               volume.beta.kubernetes.io/storage-class=aws-efs
               volume.beta.kubernetes.io/storage-provisioner=example.com/aws-efs
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  ExternalProvisioning  1m (x61 over 16m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "example.com/aws-efs" or manually created by system administrator

@pictolearn

pictolearn commented Jul 10, 2018

I came across the same issue using kops. I followed what the docs specify and also used https://medium.com/@while1eq1/using-amazon-efs-in-a-multiaz-kubernetes-setup-57922e032776 as a reference, but my pod file looks different, with an nginx instance trying to mount the EFS volume.

$ kubectl describe deployment  efs-provisioner
Name:               efs-provisioner
Namespace:          default
CreationTimestamp:  Tue, 10 Jul 2018 20:42:17 +0530
Labels:             app=efs-provisioner
Annotations:        deployment.kubernetes.io/revision=1
                    kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata":{"annotations":{},"name":"efs-provisioner","namespace":"default"},"spec":{"replicas":...
Selector:           app=efs-provisioner
Replicas:           1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:       Recreate
MinReadySeconds:    0
Pod Template:
  Labels:  app=efs-provisioner
  Containers:
   efs-provisioner:
    Image:      quay.io/external_storage/efs-provisioner:latest
    Port:       <none>
    Host Port:  <none>
    Environment:
      FILE_SYSTEM_ID:    <set to the key 'file.system.id' of config map 'efs-provisioner'>    Optional: false
      AWS_REGION:        <set to the key 'aws.region' of config map 'efs-provisioner'>        Optional: false
      PROVISIONER_NAME:  <set to the key 'provisioner.name' of config map 'efs-provisioner'>  Optional: false
    Mounts:
      /persistentvolumes from pv-volume (rw)
  Volumes:
   pv-volume:
    Type:      NFS (an NFS mount that lasts the lifetime of a pod)
    Server:    fs-b27d34fa.efs.us-east-1.amazonaws.com
    Path:      /pvs
    ReadOnly:  false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   efs-provisioner-7f764f8cc4 (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  2m    deployment-controller  Scaled up replica set efs-provisioner-7f764f8cc4 to 1


$ kubectl describe pvc efs
Name:          efs
Namespace:     default
StorageClass:  aws-efs
Status:        Pending
Volume:        
Labels:        <none>
Annotations:   volume.beta.kubernetes.io/storage-class=aws-efs
               volume.beta.kubernetes.io/storage-provisioner=example.com/aws-efs
Finalizers:    []
Capacity:      
Access Modes:  
Events:
  Type    Reason                Age   From                         Message
  ----    ------                ----  ----                         -------
  Normal  ExternalProvisioning  10s   persistentvolume-controller  waiting for a volume to be created, either by external provisioner "example.com/aws-efs" or manually created by system administrator

Here is a sample nginx YAML which I used to test:

apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        volumeMounts:
        - name: efs
          mountPath: /usr/share/nginx/html
      volumes:
        - name: efs
          persistentVolumeClaim:
            claimName: efs
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T22:29:25Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:13:31Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

$ kops version
Version 1.9.1

@wongma7
Contributor

wongma7 commented Jul 10, 2018

Can you confirm the provisioner log looks the same as above, i.e. basically empty (#754 (comment))? What if you try a newer/older image? https://quay.io/repository/external_storage/efs-provisioner?tab=tags (What version are you using now?)

@nagapavan

I resolved the problem by creating the PV before the PVC.
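That amounts to static provisioning: pre-creating a PersistentVolume for the claim to bind to instead of waiting on the external provisioner. A sketch (the server name is borrowed from the deployment output above; the PV name, path, and size are illustrative, and the class, access modes, and size must match the claim for binding to succeed):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 1Mi
  accessModes:
    - ReadWriteMany
  storageClassName: aws-efs
  nfs:
    server: fs-b27d34fa.efs.us-east-1.amazonaws.com
    path: /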

@pictolearn

pictolearn commented Jul 11, 2018

I tried the same thing, creating a PV before the PVC; it did not work, so I'm not sure it has anything to do with the order. I suspect it could be something with kops, but it shouldn't matter. I have ensured that the worker nodes and the master nodes have nfs-common installed on them.

@tonybranfort

For those getting

waiting for a volume to be created, either by external provisioner "example.com/aws-efs" or manually created by system administrator

If you're using RBAC (which it seems kops enables by default), make sure you've created a serviceAccount. Look at deployment.yaml rather than manifest.yaml.

You'll also need rbac.yaml. The README describes this but doesn't mention the service account.

@ParaSwarm

Sigh. This took me all day to figure out. I hope this helps someone save some time:

The issue, for me, was the default service account. It didn't have access to get endpoints. Mind you, this is a fresh EKS cluster install, not running anything special yet.

You can confirm this by exec'ing into your efs-provisioner pod with a shell. The image is Alpine-based, so shell in with /bin/sh first; if you want bash, run "apk add --no-cache bash" and then switch to /bin/bash.

Once you're in, run "/efs-provisioner". You may start seeing this output:

error retrieving resource lock default/example.com-aws-efs: endpoints "example.com-aws-efs" is forbidden: User "system:serviceaccount:default:default" cannot get endpoints in the namespace "default"

If you do, you have the same issue as I did.
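Step by step, that check looks roughly like this (the pod name suffix is illustrative; Alpine ships /bin/sh, so bash is optional):

# Find the provisioner pod
kubectl get pods -l app=efs-provisioner

# Shell in
kubectl exec -it efs-provisioner-7f764f8cc4-xxxxx -- /bin/sh

# Inside the container: run the provisioner by hand and watch its output
/efs-provisioner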

The quick fix is to give the cluster-admin role to the default service account. Of course, depending on your environment and security, you may need a more elaborate fix. If you elect to go the easy way, you can simply apply this:

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: default-admin-rbac # or whatever
subjects:
  - kind: ServiceAccount
    name: default
    namespace: default
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

After I did this, the output from the provisioner immediately became successful and started creating volumes.
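You can also test the permission without exec'ing into the pod at all, via kubectl's built-in authorization check:

# Should print "yes" once the binding is in place
kubectl auth can-i get endpoints --as=system:serviceaccount:default:default -n default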

@mustafaakin

mustafaakin commented Sep 17, 2018

There is also RBAC configuration in the repo for the EFS provisioner: https://github.com/kubernetes-incubator/external-storage/blob/master/aws/efs/deploy/rbac.yaml

@geerlingguy
Contributor

geerlingguy commented Dec 21, 2018

@ParaSwarm - Thank you so much for posting the detailed guide. Note that for me, I ran apk add bash (without the --no-cache flag) to make installing bash work without any fancy quoting or other steps.

I am running the provisioner in a namespace (efs-provisioner) instead of in the default namespace, so I am seeing:

E1221 22:10:38.183781      79 leaderelection.go:252] error retrieving resource lock efs-provisioner/acquia.com-aws-efs: endpoints "acquia.com-aws-efs" is forbidden: User "system:serviceaccount:efs-provisioner:default" cannot get endpoints in the namespace "efs-provisioner"

I am using the RBAC config from the external-storage/aws/efs/deploy/rbac.yaml file (with the namespace replaced where needed), but I'm still getting the above errors. I'm trying to debug what is wrong with the RBAC setup...

This is in an EKS cluster, running 1.11.5.

@geerlingguy
Contributor

geerlingguy commented Dec 21, 2018

So it turns out the issue for me was related to #953: there are some missing rules in the RBAC ClusterRole, which prevented the efs-provisioner service account from being able to do what it needed to do.

Adding the following lines to the rules in the ClusterRole fixed it for me:

  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]

(see https://github.com/helm/charts/pull/9127/files#diff-1f3ae64e932358240df168628073a894R25)

I also added a ServiceAccount for the provisioner:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: efs-provisioner
  namespace: default

And attached that serviceAccount to my pod spec for the efs-provisioner Deployment:

...
    spec:
      serviceAccount: efs-provisioner
      containers:
        - name: efs-provisioner
          image: quay.io/external_storage/efs-provisioner:latest
...

Finally, I had to put the Role and RoleBinding into the namespace as well.
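For anyone following along in a non-default namespace, that last step looks roughly like this (a sketch; the Role/RoleBinding names mirror the pattern of the upstream rbac.yaml and are illustrative; the leader-election lock lives in the provisioner's own namespace, hence namespaced objects):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: leader-locking-efs-provisioner
  namespace: efs-provisioner
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: leader-locking-efs-provisioner
  namespace: efs-provisioner
subjects:
  - kind: ServiceAccount
    name: efs-provisioner
    namespace: efs-provisioner
roleRef:
  kind: Role
  name: leader-locking-efs-provisioner
  apiGroup: rbac.authorization.k8s.io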

@regan-karlewicz

@geerlingguy This worked for me. Thank you a ton!

@weisjohn

weisjohn commented Feb 4, 2019

@geerlingguy also worked for me, thanks for the great explanation!

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jun 4, 2019
@liozzazhang

I think it's an RBAC configuration issue. If you specify namespace parameters in your RBAC YAML file, refer to this file: https://github.com/kubernetes-incubator/external-storage/blob/master/nfs/deploy/kubernetes/rbac.yaml. Specify the namespace parameter only in the ClusterRoleBinding/RoleBinding subjects section. Also, don't forget to add the endpoints permission to your ClusterRole resource.
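Roughly like the following, in other words (a sketch; the ClusterRole and binding names follow the upstream example's naming pattern and are illustrative). The ClusterRole itself is cluster-scoped and takes no namespace; only the ServiceAccount subject does:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: run-efs-provisioner
subjects:
  - kind: ServiceAccount
    name: efs-provisioner
    # the only place a namespace belongs here
    namespace: default
roleRef:
  kind: ClusterRole
  name: efs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io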


@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

pranav-patil referenced this issue in pranav-patil/spring-kubernetes-microservices Sep 2, 2019
@marcelolima

@liozzazhang thanks, that fixed my problem!
