Pvc status is pending #4251

Closed
Jamiejoin opened this issue Nov 5, 2019 · 40 comments
Labels
ceph

Comments

@Jamiejoin

Jamiejoin commented Nov 5, 2019

I am using Rook 1.1.6, Kubernetes 1.13.7, and CentOS Linux release 7.5.1804. My CephFS PVC stays in the Pending state.
Below is my specific configuration information:

Filesystem configuration information

[10.0.0.1:root@bj-oam-k8smaster-test-0001:/opt/soft/ceph]# kubectl get cephfilesystem -n rook-ceph 
NAME   ACTIVEMDS   AGE
myfs   2           32m
[10.0.0.1:root@bj-oam-k8smaster-test-0001:/opt/soft/ceph]# more filesystem.yaml 
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 2
  dataPools:
    - replicated:
        size: 2
  preservePoolsOnDelete: true
  metadataServer:
    activeCount: 2
    activeStandby: true

Storageclass configuration information

[10.0.0.1:root@bj-oam-k8smaster-test-0001:/opt/soft/ceph]# kubectl get storageclass
NAME              PROVISIONER                     AGE
rook-cephfs       rook-ceph.cephfs.csi.ceph.com   5m42s
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs
  pool: myfs-data0
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph

reclaimPolicy: Delete

I tried to create a PVC, but it stays in the Pending state. Below is my PVC information:

[10.9.0.1:root@bj-oam-k8smaster-test-0001:/opt/soft/ceph/ceph-test]# kubectl get pvc
NAME             STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
my-claim         Pending                                                                            rook-cephfs       96m
[10.9.0.1:root@bj-oam-k8smaster-test-0001:/opt/soft/ceph/ceph-test]# more pvc.yaml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  storageClassName: rook-cephfs
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi

My Ceph cluster status:

[root@bj-oam-k8snode-test-0012 /]#  ceph status
  cluster:
    id:     6f9fa07f-b2fa-43c9-bd1b-09421d583c06
    health: HEALTH_WARN
            too few PGs per OSD (24 < min 30)
            clock skew detected on mon.b, mon.c
 
  services:
    mon: 3 daemons, quorum a,b,c (age 4d)
    mgr: a(active, since 6h)
    mds: myfs:2 {0=myfs-a=up:active,1=myfs-c=up:active} 2 up:standby-replay
    osd: 3 osds: 3 up (since 4d), 3 in (since 4d)
 
  data:
    pools:   3 pools, 24 pgs
    objects: 323 objects, 775 MiB
    usage:   5.2 GiB used, 2.6 TiB / 2.6 TiB avail
    pgs:     24 active+clean
 
  io:
    client:   2.5 KiB/s rd, 4 op/s rd, 0 op/s wr

Where did I go wrong? What can I do to find the problem? Thank you very much.
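
A quick way to surface the reason a claim is stuck is to describe the PVC and look at its events; the provisioning error is usually reported there. A minimal sketch, assuming the claim lives in the default namespace:

# show the claim's status and recorded events
kubectl describe pvc my-claim
# list recent events mentioning the claim
kubectl get events --sort-by=.metadata.creationTimestamp | grep -i my-claim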

@Madhu-1
Member

Madhu-1 commented Nov 5, 2019

@xiaoqge please check the csi-cephfs-provisioner logs for more details on why it's failing.
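
For reference, one way to pull those logs; the pod name csi-cephfsplugin-provisioner-0 appears later in this thread, and --all-containers avoids having to guess the sidecar container name (adjust names to your deployment):

# find the CephFS provisioner pod(s)
kubectl -n rook-ceph get pod | grep csi-cephfsplugin-provisioner
# dump logs from all containers in the provisioner pod
kubectl -n rook-ceph logs csi-cephfsplugin-provisioner-0 --all-containers=true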

Madhu-1 removed the feature label Nov 5, 2019
@Jamiejoin
Author

@Madhu-1 Thank you very much. I want to try redeploying from a fresh installation. While deleting the PVC, I checked the csi-cephfs-provisioner log as you suggested, and I see the following error messages:

[10.9.0.1:root@bj-oam-k8smaster-test-0001:/opt/soft/ceph]# kubectl delete pvc rbd-pvc      
persistentvolumeclaim "rbd-pvc" deleted

I1105 10:18:23.581412       1 controller.go:979] Final error received, removing PVC fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c from claims in progress
W1105 10:18:23.581458       1 controller.go:886] Retrying syncing claim "fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c", failure 8
I1105 10:18:23.581457       1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"my-claim", UID:"fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c", APIVersion:"v1", ResourceVersion:"53175462", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "rook-cephfs": error getting secret rook-csi-cephfs-provisioner in namespace rook-ceph: secrets "rook-csi-cephfs-provisioner" not found
E1105 10:18:23.581484       1 controller.go:908] error syncing claim "fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c": failed to provision volume with StorageClass "rook-cephfs": error getting secret rook-csi-cephfs-provisioner in namespace rook-ceph: secrets "rook-csi-cephfs-provisioner" not found
I1105 10:19:50.276311       1 controller.go:979] Final error received, removing PVC 6a303d22-fbc2-11e9-bf71-246e965dfe0c from claims in progress
I1105 10:19:50.276327       1 controller.go:902] Provisioning succeeded, removing PVC 6a303d22-fbc2-11e9-bf71-246e965dfe0c from claims in progress
I1105 10:20:15.420012       1 controller.go:979] Final error received, removing PVC 6a303d22-fbc2-11e9-bf71-246e965dfe0c from claims in progress
I1105 10:20:15.420031       1 controller.go:902] Provisioning succeeded, removing PVC 6a303d22-fbc2-11e9-bf71-246e965dfe0c from claims in progress
I1105 10:20:31.581688       1 controller.go:1196] provision "default/my-claim" class "rook-cephfs": started
I1105 10:20:31.587211       1 controller.go:471] CreateVolumeRequest {Name:pvc-fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:MULTI_NODE_MULTI_WRITER >  mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[clusterID:rook-ceph csi.storage.k8s.io/node-stage-secret-name:rook-csi-cephfs-node csi.storage.k8s.io/node-stage-secret-namespace:rook-ceph csi.storage.k8s.io/provisioner-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/provisioner-secret-namespace:rook-ceph fsName:myfs pool:myfs-data0] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1105 10:20:31.587307       1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"my-claim", UID:"fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c", APIVersion:"v1", ResourceVersion:"53175462", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/my-claim"
I1105 10:20:31.589065       1 controller.go:979] Final error received, removing PVC fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c from claims in progress
W1105 10:20:31.589080       1 controller.go:886] Retrying syncing claim "fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c", failure 9
E1105 10:20:31.589097       1 controller.go:908] error syncing claim "fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c": failed to provision volume with StorageClass "rook-cephfs": error getting secret rook-csi-cephfs-provisioner in namespace rook-ceph: secrets "rook-csi-cephfs-provisioner" not found
I1105 10:20:31.589116       1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"my-claim", UID:"fb3655cf-ffa3-11e9-bcbe-ecf4bbd8d22c", APIVersion:"v1", ResourceVersion:"53175462", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "rook-cephfs": error getting secret rook-csi-cephfs-provisioner in namespace rook-ceph: secrets "rook-csi-cephfs-provisioner" not found

Does this error mean a secret is missing? Is there a document I can follow to fix it?

@Madhu-1
Member

Madhu-1 commented Nov 5, 2019

Rook should create this secret. Can you please provide the operator logs so we can check what's happening?
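
A minimal sketch for grabbing those logs, assuming the operator was deployed with the standard name rook-ceph-operator in the rook-ceph namespace:

# locate the operator pod
kubectl -n rook-ceph get pod | grep rook-ceph-operator
# tail the operator logs via its deployment
kubectl -n rook-ceph logs deploy/rook-ceph-operator --tail=200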

@Madhu-1
Member

Madhu-1 commented Nov 5, 2019

And also tell us which Rook version you are using.

@Jamiejoin
Author

Jamiejoin commented Nov 5, 2019

@Madhu-1 I am using the current rook-ceph master branch. I also tried creating the PVC in a specific namespace, but the same error still appears in the log.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
  namespace: rook-ceph
spec:
  storageClassName: rook-cephfs
  accessModes: ["ReadWriteMany","ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi

[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/ceph-test]# kubectl apply -f pvc.yaml  
persistentvolumeclaim/test-claim created

[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/ceph-test]# kubectl get pvc -n rook-ceph
NAME         STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-claim   Pending                                      rook-cephfs    11s

[10.9.0.1:root@bj-oam-k8smaster-test-0001:/opt/soft/ceph]# kubectl get secret -n rook-ceph |grep rook-csi-cephfs-provisioner
rook-csi-cephfs-provisioner-sa-token-6xql7   kubernetes.io/service-account-token   3      35m

The following is the error log:

I1105 10:56:56.458482       1 controller.go:1196] provision "rook-ceph/test-claim" class "rook-cephfs": started
I1105 10:56:56.463764       1 controller.go:471] CreateVolumeRequest {Name:pvc-f2220e75-ffba-11e9-be08-ecf4bbd8c964 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:MULTI_NODE_MULTI_WRITER >  mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[clusterID:rook-ceph csi.storage.k8s.io/node-stage-secret-name:rook-csi-cephfs-node csi.storage.k8s.io/node-stage-secret-namespace:rook-ceph csi.storage.k8s.io/provisioner-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/provisioner-secret-namespace:rook-ceph fsName:myfs pool:myfs-data0] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1105 10:56:56.463909       1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"rook-ceph", Name:"test-claim", UID:"f2220e75-ffba-11e9-be08-ecf4bbd8c964", APIVersion:"v1", ResourceVersion:"53253063", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "rook-ceph/test-claim"
I1105 10:56:56.465391       1 controller.go:979] Final error received, removing PVC f2220e75-ffba-11e9-be08-ecf4bbd8c964 from claims in progress
W1105 10:56:56.465403       1 controller.go:886] Retrying syncing claim "f2220e75-ffba-11e9-be08-ecf4bbd8c964", failure 5
E1105 10:56:56.465495       1 controller.go:908] error syncing claim "f2220e75-ffba-11e9-be08-ecf4bbd8c964": failed to provision volume with StorageClass "rook-cephfs": error getting secret rook-csi-cephfs-provisioner in namespace rook-ceph: secrets "rook-csi-cephfs-provisioner" not found
I1105 10:56:56.465495       1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"rook-ceph", Name:"test-claim", UID:"f2220e75-ffba-11e9-be08-ecf4bbd8c964", APIVersion:"v1", ResourceVersion:"53253063", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "rook-cephfs": error getting secret rook-csi-cephfs-provisioner in namespace rook-ceph: secrets "rook-csi-cephfs-provisioner" not found
I1105 10:57:11.963081       1 reflector.go:370] sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:803: Watch close - *v1.PersistentVolume total 0 items received

Thank you again.

@Madhu-1
Member

Madhu-1 commented Nov 5, 2019

@xiaoqge please provide the rook operator pod logs.

@Jamiejoin
Author

Jamiejoin commented Nov 5, 2019

@Madhu-1 Here are some of the latest rook operator pod logs:

2019-11-05 10:16:07.820641 I | operator: starting the controller-runtime manager
2019-11-05 10:16:07.989595 I | op-agent: getting flexvolume dir path from FLEXVOLUME_DIR_PATH env var
2019-11-05 10:16:07.989616 I | op-agent: discovered flexvolume dir path from source env var. value: /var/lib/kubelet/volumeplugins
2019-11-05 10:16:07.989622 I | op-agent: no agent mount security mode given, defaulting to 'Any' mode
2019-11-05 10:16:07.992922 I | op-agent: rook-ceph-agent daemonset started
I1105 10:16:08.817793       7 controller.go:818] Started provisioner controller rook.io/block_rook-ceph-operator-674cc8fdd-qb7ct_48441715-ffb5-11e9-a3a5-ee7fe49c3ba1!
I1105 10:16:09.217938       7 controller.go:818] Started provisioner controller ceph.rook.io/block_rook-ceph-operator-674cc8fdd-qb7ct_4843ffbd-ffb5-11e9-a3a5-ee7fe49c3ba1!
2019-11-05 10:16:10.187341 I | operator: successfully started Ceph CSI driver(s)
2019-11-05 10:16:16.187559 I | op-cluster: detecting the ceph image version for image bj-jd-oam-k8snode-test-0002/library/ceph:v14.2.4-20190917...
2019-11-05 10:16:18.742244 I | op-cluster: Detected ceph image version: 14.2.4 nautilus
2019-11-05 10:16:18.743682 E | cephconfig: clusterInfo: <nil>
2019-11-05 10:16:18.743696 I | op-cluster: CephCluster rook-ceph status: Creating. 
2019-11-05 10:16:18.754391 I | op-mon: start running mons
2019-11-05 10:16:18.755966 I | exec: Running command: ceph-authtool --create-keyring /var/lib/rook/rook-ceph/mon.keyring --gen-key -n mon. --cap mon 'allow *'
2019-11-05 10:16:18.803658 I | exec: Running command: ceph-authtool --create-keyring /var/lib/rook/rook-ceph/client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mgr 'allow *' --cap mds 'allow'
2019-11-05 10:16:18.861437 I | op-mon: creating mon secrets for a new cluster
2019-11-05 10:16:18.867640 I | op-mon: created csi secret for cluster rook-ceph
2019-11-05 10:16:18.871863 I | op-mon: saved mon endpoints to config map map[data: maxMonId:-1 mapping:{"node":{}} csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":[]}]]
2019-11-05 10:16:19.132683 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-11-05 10:16:19.132838 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-11-05 10:16:21.132628 I | op-mon: targeting the mon count 3
2019-11-05 10:16:21.137010 I | op-mon: sched-mon: created canary deployment rook-ceph-mon-a-canary
2019-11-05 10:16:21.532534 I | op-mon: sched-mon: canary monitor deployment rook-ceph-mon-a-canary scheduled to bj-jd-oam-k8snode-test-0014
2019-11-05 10:16:21.532547 I | op-mon: assignmon: mon a assigned to node bj-jd-oam-k8snode-test-0014
2019-11-05 10:16:21.536542 I | op-mon: sched-mon: created canary deployment rook-ceph-mon-b-canary
2019-11-05 10:16:21.932437 I | op-mon: sched-mon: canary monitor deployment rook-ceph-mon-b-canary scheduled to bj-jd-oam-k8snode-test-0013
2019-11-05 10:16:21.932453 I | op-mon: assignmon: mon b assigned to node bj-jd-oam-k8snode-test-0013
2019-11-05 10:16:21.936609 I | op-mon: sched-mon: created canary deployment rook-ceph-mon-c-canary
2019-11-05 10:16:22.332840 I | op-mon: sched-mon: canary monitor deployment rook-ceph-mon-c-canary scheduled to bj-jd-oam-k8snode-test-0012
2019-11-05 10:16:22.332856 I | op-mon: assignmon: mon c assigned to node bj-jd-oam-k8snode-test-0012
2019-11-05 10:16:22.332863 I | op-mon: assignmon: cleaning up canary deployment rook-ceph-mon-a-canary and canary pvc 
2019-11-05 10:16:22.332869 I | op-k8sutil: removing deployment rook-ceph-mon-a-canary if it exists
2019-11-05 10:16:22.337239 I | op-k8sutil: Removed deployment rook-ceph-mon-a-canary
2019-11-05 10:16:22.339195 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:24.341934 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:26.344795 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:28.347664 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:30.350550 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:32.353851 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:34.356568 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:36.359258 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:38.362049 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:40.364942 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:42.367924 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:44.370656 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:46.373293 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:48.376257 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:50.378882 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:52.381476 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:54.384090 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:56.387153 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:16:58.390228 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:00.392982 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:02.395661 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:04.398291 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:06.400973 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:08.403412 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:10.405922 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:12.408478 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:14.411221 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:16.413998 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:18.416817 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:20.419855 I | op-k8sutil: rook-ceph-mon-a-canary still found. waiting...
2019-11-05 10:17:22.420006 I | op-mon: assignmon: error deleting canary deployment rook-ceph-mon-a-canary. gave up waiting for rook-ceph-mon-a-canary pods to be terminated
2019-11-05 10:17:22.420025 I | op-mon: assignmon: cleaning up canary deployment rook-ceph-mon-b-canary and canary pvc 
2019-11-05 10:17:22.420048 I | op-k8sutil: removing deployment rook-ceph-mon-b-canary if it exists
2019-11-05 10:17:22.425099 I | op-k8sutil: Removed deployment rook-ceph-mon-b-canary
2019-11-05 10:17:22.427085 I | op-k8sutil: rook-ceph-mon-b-canary still found. waiting...
2019-11-05 10:17:24.430132 I | op-k8sutil: rook-ceph-mon-b-canary still found. waiting...
2019-11-05 10:17:26.432303 I | op-k8sutil: confirmed rook-ceph-mon-b-canary does not exist
2019-11-05 10:17:26.432321 I | op-mon: assignmon: cleaning up canary deployment rook-ceph-mon-c-canary and canary pvc 
2019-11-05 10:17:26.432327 I | op-k8sutil: removing deployment rook-ceph-mon-c-canary if it exists
2019-11-05 10:17:26.436235 I | op-k8sutil: Removed deployment rook-ceph-mon-c-canary
2019-11-05 10:17:26.438079 I | op-k8sutil: rook-ceph-mon-c-canary still found. waiting...
2019-11-05 10:17:28.440997 I | op-k8sutil: rook-ceph-mon-c-canary still found. waiting...
2019-11-05 10:17:30.442985 I | op-k8sutil: confirmed rook-ceph-mon-c-canary does not exist
2019-11-05 10:17:30.443002 I | op-mon: creating mon a
2019-11-05 10:17:30.449655 I | op-mon: mon a endpoint are [v2:10.9.61.114:3300,v1:10.9.61.114:6789]
2019-11-05 10:17:30.456046 I | op-mon: saved mon endpoints to config map map[data:a=10.9.61.114:6789 maxMonId:2 mapping:{"node":{"a":{"Name":"bj-jd-oam-k8snode-test-0014,"Hostname":"bj-jd-oam-k8snode-test-0014","Address":"10.9.0.23"},"b":{"Name":"bj-jd-oam-k8snode-test-0013","Hostname":"bj-jd-oam-k8snode-test-0013","Address":"10.9.0.22"},"c":{"Name":"bj-jd-oam-k8snode-test-0012","Hostname":"bj-jd-oam-k8snode-test-0012","Address":"10.9.0.21"}}} csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.9.61.114:6789"]}]]
2019-11-05 10:17:30.460320 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-11-05 10:17:30.460506 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-11-05 10:17:30.465223 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-11-05 10:17:30.465387 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-11-05 10:17:30.471428 I | op-mon: waiting for mon quorum with [a]
2019-11-05 10:17:30.477952 I | op-mon: mon a is not yet running
2019-11-05 10:17:30.477978 I | op-mon: mons running: []
2019-11-05 10:17:30.478225 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/357990957
2019-11-05 10:17:43.837549 I | op-mon: Monitors in quorum: [a]
2019-11-05 10:17:43.837568 I | op-mon: mons created: 1
2019-11-05 10:17:43.837703 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/148957608
2019-11-05 10:17:44.167943 I | op-mon: waiting for mon quorum with [a]
2019-11-05 10:17:44.173961 I | op-mon: mons running: [a]
2019-11-05 10:17:44.174098 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/178330855
2019-11-05 10:17:44.489024 I | op-mon: Monitors in quorum: [a]
2019-11-05 10:17:44.489136 I | exec: Running command: ceph config set global mon_allow_pool_delete true --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/148629786
2019-11-05 10:17:44.813111 I | exec: Running command: ceph config set global bluestore_warn_on_legacy_statfs false --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/971262897
2019-11-05 10:17:45.133534 I | exec: Running command: ceph config set global rbd_default_features 3 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/481343836
2019-11-05 10:17:45.458856 I | op-mon: creating mon b
2019-11-05 10:17:45.472388 I | op-mon: mon a endpoint are [v2:10.9.61.114:3300,v1:10.9.61.114:6789]
2019-11-05 10:17:45.477299 I | op-mon: mon b endpoint are [v2:10.9.60.209:3300,v1:10.9.60.209:6789]
2019-11-05 10:17:45.483787 I | op-mon: saved mon endpoints to config map map[data:a=10.9.61.114:6789,b=10.9.60.209:6789 maxMonId:2 mapping:{"node":{"a":{"Name":"bj-jd-oam-k8snode-test-0014","Hostname":"bj-jd-oam-k8snode-test-0014","Address":"10.9.0.23"},"b":{"Name":"bj-jd-oam-k8snode-test-0013","Hostname":"bj-jd-oam-k8snode-test-0013","Address":"10.9.0.22"},"c":{"Name":"bj-jd-oam-k8snode-test-0012","Hostname":"bj-jd-oam-k8snode-test-0012","Address":"10.9.0.21"}}} csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.9.61.114:6789","10.9.60.209:6789"]}]]
2019-11-05 10:17:45.488559 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-11-05 10:17:45.488690 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-11-05 10:17:45.769903 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-11-05 10:17:45.770018 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-11-05 10:17:45.773013 I | op-mon: deployment for mon rook-ceph-mon-a already exists. updating if needed
2019-11-05 10:17:45.775442 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-11-05 10:17:45.775453 I | op-k8sutil: updating deployment rook-ceph-mon-a
2019-11-05 10:17:47.785339 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mon-a
2019-11-05 10:17:47.785354 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-11-05 10:17:47.785379 I | op-mon: waiting for mon quorum with [a b]
2019-11-05 10:17:47.796714 I | op-mon: mon b is not yet running
2019-11-05 10:17:47.796726 I | op-mon: mons running: [a]
2019-11-05 10:17:47.796833 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/101857035
2019-11-05 10:17:48.120299 I | op-mon: Monitors in quorum: [a]
2019-11-05 10:17:48.126588 I | op-mon: waiting for mon quorum with [a b]
2019-11-05 10:17:48.142736 I | op-mon: mon b is not yet running
2019-11-05 10:17:48.142749 I | op-mon: mons running: [a]
2019-11-05 10:17:48.142860 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/135111918
2019-11-05 10:17:48.486895 I | op-mon: Monitors in quorum: [a]
2019-11-05 10:17:48.486914 I | op-mon: mons created: 2
2019-11-05 10:17:48.486974 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/806692725
2019-11-05 10:17:48.842311 I | op-mon: waiting for mon quorum with [a b]
2019-11-05 10:17:48.853665 I | op-mon: mon b is not yet running
2019-11-05 10:17:48.853676 I | op-mon: mons running: [a]
2019-11-05 10:17:53.865478 I | op-mon: mons running: [a b]
2019-11-05 10:17:53.865637 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/121399888
2019-11-05 10:17:56.600083 I | op-mon: Monitors in quorum: [a b]
2019-11-05 10:17:56.600270 I | exec: Running command: ceph config set global mon_allow_pool_delete true --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/289203055
2019-11-05 10:17:56.909251 I | exec: Running command: ceph config set global bluestore_warn_on_legacy_statfs false --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/708059650
2019-11-05 10:17:57.230863 I | exec: Running command: ceph config set global rbd_default_features 3 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/235659641
2019-11-05 10:17:57.544702 I | op-mon: creating mon c
2019-11-05 10:17:57.558333 I | op-mon: mon a endpoint are [v2:10.9.61.114:3300,v1:10.9.61.114:6789]
2019-11-05 10:17:57.570878 I | op-mon: mon b endpoint are [v2:10.9.60.209:3300,v1:10.9.60.209:6789]
2019-11-05 10:17:57.575327 I | op-mon: mon c endpoint are [v2:10.9.63.150:3300,v1:10.9.63.150:6789]
2019-11-05 10:17:57.603788 I | op-mon: saved mon endpoints to config map map[csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.9.61.114:6789","10.9.60.209:6789","10.9.63.150:6789"]}] data:c=10.9.63.150:6789,a=10.9.61.114:6789,b=10.9.60.209:6789 maxMonId:2 mapping:{"node":{"a":{"Name":"bj-jd-oam-k8snode-test-0014","Hostname":"bj-jd-oam-k8snode-test-0014,"Address":"10.9.0.23"},"b":{"Name":"bj-jd-oam-k8snode-test-0013","Hostname":"bj-jd-oam-k8snode-test-0013","Address":"10.9.0.22"},"c":{"Name":"bj-jd-oam-k8snode-test-0012","Hostname":"bj-jd-oam-k8snode-test-0012","Address":"10.9.0.21"}}}]
2019-11-05 10:17:58.004457 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-11-05 10:17:58.004609 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-11-05 10:17:58.404133 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-11-05 10:17:58.404240 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-11-05 10:17:58.406756 I | op-mon: deployment for mon rook-ceph-mon-a already exists. updating if needed
2019-11-05 10:17:58.409212 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-11-05 10:17:58.409224 I | op-k8sutil: updating deployment rook-ceph-mon-a
2019-11-05 10:18:00.418612 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mon-a
2019-11-05 10:18:00.418628 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-11-05 10:18:00.418638 I | op-mon: waiting for mon quorum with [a b c]
2019-11-05 10:18:00.436261 I | op-mon: mon c is not yet running
2019-11-05 10:18:00.436273 I | op-mon: mons running: [a b]
2019-11-05 10:18:00.436388 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/223230596
2019-11-05 10:18:00.757252 I | op-mon: Monitors in quorum: [a b]
2019-11-05 10:18:00.760601 I | op-mon: deployment for mon rook-ceph-mon-b already exists. updating if needed
2019-11-05 10:18:00.762957 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-11-05 10:18:00.762970 I | op-k8sutil: updating deployment rook-ceph-mon-b
2019-11-05 10:18:02.772859 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mon-b
2019-11-05 10:18:02.772875 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-11-05 10:18:02.772886 I | op-mon: waiting for mon quorum with [a b c]
2019-11-05 10:18:02.790944 I | op-mon: mon c is not yet running
2019-11-05 10:18:02.790957 I | op-mon: mons running: [a b]
2019-11-05 10:18:02.791084 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/164100115
2019-11-05 10:18:03.114001 I | op-mon: Monitors in quorum: [a b]
2019-11-05 10:18:03.122006 I | op-mon: waiting for mon quorum with [a b c]
2019-11-05 10:18:03.141060 I | op-mon: mon c is not yet running
2019-11-05 10:18:03.141073 I | op-mon: mons running: [a b]
2019-11-05 10:18:03.141231 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/096734806
2019-11-05 10:18:03.471045 I | op-mon: Monitors in quorum: [a b]
2019-11-05 10:18:03.471064 I | op-mon: mons created: 3
2019-11-05 10:18:03.471129 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/861935549
2019-11-05 10:18:03.806295 I | op-mon: waiting for mon quorum with [a b c]
2019-11-05 10:18:03.824195 I | op-mon: mon c is not yet running
2019-11-05 10:18:03.824211 I | op-mon: mons running: [a b]
2019-11-05 10:18:08.842641 I | op-mon: mons running: [a b c]
2019-11-05 10:18:08.842816 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/141745656
2019-11-05 10:18:11.469392 W | op-mon: monitor c is not in quorum list
2019-11-05 10:18:16.488967 I | op-mon: mons running: [a b c]
2019-11-05 10:18:16.489179 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/677053687
2019-11-05 10:18:16.814128 I | op-mon: Monitors in quorum: [a b c]
2019-11-05 10:18:16.814284 I | exec: Running command: ceph config set global mon_allow_pool_delete true --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/988091882
2019-11-05 10:18:17.135630 I | exec: Running command: ceph config set global bluestore_warn_on_legacy_statfs false --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/301895233
2019-11-05 10:18:17.439510 I | exec: Running command: ceph config set global rbd_default_features 3 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/002981548
W1105 10:21:33.812659       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53233563 (53236626)
W1105 10:26:53.815887       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53236855 (53239085)
W1105 10:34:22.817567       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53239304 (53242486)
W1105 10:42:58.818711       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53242714 (53246452)
W1105 10:51:41.820368       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53246703 (53250523)
W1105 10:58:56.822504       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53250754 (53253911)
W1105 11:06:35.824518       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53254135 (53257420)
W1105 11:14:09.861684       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53257652 (53260874)
W1105 11:20:31.864338       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53261100 (53263777)
W1105 11:25:51.867480       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53264010 (53266206)
W1105 11:32:54.868970       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53266444 (53269438)
W1105 11:41:10.871101       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53269686 (53273207)
W1105 11:49:02.873525       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53273444 (53276807)
W1105 11:55:47.875993       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53277039 (53279884)
W1105 12:05:04.876589       7 reflector.go:289] github.com/rook/rook/pkg/operator/ceph/cluster/controller.go:173: watch of *v1.ConfigMap ended with: too old resource version: 53280122 (53284125)

@Madhu-1
Member

Madhu-1 commented Nov 5, 2019

@xiaoqge I don't see any error logs related to secret creation; make sure your rook image is up to date. If it is, can you please provide the list of all secrets in the rook-ceph namespace? (A sketch of the commands follows below.)

@leseb PTAL
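
A sketch of the check being asked for, with the namespace from the standard Rook install:

# list every secret in the rook-ceph namespace
kubectl -n rook-ceph get secret
# specifically confirm whether the CSI provisioner secret exists
kubectl -n rook-ceph get secret rook-csi-cephfs-provisioner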

@leseb
Member

leseb commented Nov 5, 2019

You're clearly not using rook master; the log message "created csi secret for cluster" is old and predates my refactor in #4086. Please exec into the operator, run "rook version", and paste the output. Thanks.
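
A sketch of that version check; substitute your actual operator pod name (one is visible in the operator log above):

# find the operator pod name
kubectl -n rook-ceph get pod | grep rook-ceph-operator
# exec into it and print the Rook version
kubectl -n rook-ceph exec -it <rook-ceph-operator-pod> -- rook version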

@Jamiejoin
Author

Jamiejoin commented Nov 6, 2019

@leseb Thanks. To make sure my procedure was not at fault, I downloaded the latest rook 1.1.6 release and reinstalled the deployment, but the problem still exists.

Below is the specific information:

[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# kubectl describe pvc cephfs-pvc
Name:          cephfs-pvc
Namespace:     default
StorageClass:  csi-cephfs
Status:        Pending
Volume:        
Labels:        <none>
Annotations:   kubectl.kubernetes.io/last-applied-configuration:
                 {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"cephfs-pvc","namespace":"default"},"spec":{"accessM...
               volume.beta.kubernetes.io/storage-provisioner: rook-ceph.cephfs.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Events:
  Type       Reason                Age                   From                                                                                               Message
  ----       ------                ----                  ----                                                                                               -------
  Normal     Provisioning          4m4s                  rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-0_2e0face1-0076-11ea-b92e-d2d56a58f28f  External provisioner is provisioning volume for claim "default/cephfs-pvc"
  Normal     ExternalProvisioning  3m3s (x7 over 3m55s)  persistentvolume-controller                                                                        waiting for a volume to be created, either by external provisioner "rook-ceph.cephfs.csi.ceph.com" or manually created by system administrator
  Warning    ProvisioningFailed    94s                   rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-0_2e0face1-0076-11ea-b92e-d2d56a58f28f  failed to provision volume with StorageClass "csi-cephfs": rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Warning    ProvisioningFailed    18s (x8 over 2m48s)   persistentvolume-controller                                                                        storageclass.storage.k8s.io "csi-cephfs" not found
Mounted By:  <none>
[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# kubectl get secret -n rook-ceph
NAME                                         TYPE                                  DATA   AGE
default-token-lrccb                          kubernetes.io/service-account-token   3      40m
rook-ceph-admin-keyring                      kubernetes.io/rook                    1      40m
rook-ceph-cmd-reporter-token-468ws           kubernetes.io/service-account-token   3      40m
rook-ceph-config                             kubernetes.io/rook                    2      40m
rook-ceph-dashboard-password                 kubernetes.io/rook                    1      35m
rook-ceph-mds-myfs-a-keyring                 kubernetes.io/rook                    1      10m
rook-ceph-mds-myfs-b-keyring                 kubernetes.io/rook                    1      10m
rook-ceph-mgr-a-keyring                      kubernetes.io/rook                    1      35m
rook-ceph-mgr-token-vtjch                    kubernetes.io/service-account-token   3      40m
rook-ceph-mon                                kubernetes.io/rook                    4      40m
rook-ceph-mons-keyring                       kubernetes.io/rook                    1      40m
rook-ceph-osd-token-528hf                    kubernetes.io/service-account-token   3      40m
rook-ceph-system-token-5cjd9                 kubernetes.io/service-account-token   3      40m
rook-csi-cephfs-node                         kubernetes.io/rook                    2      35m
rook-csi-cephfs-plugin-sa-token-s55dn        kubernetes.io/service-account-token   3      40m
rook-csi-cephfs-provisioner                  kubernetes.io/rook                    2      35m
rook-csi-cephfs-provisioner-sa-token-26znz   kubernetes.io/service-account-token   3      40m
rook-csi-rbd-node                            kubernetes.io/rook                    2      35m
rook-csi-rbd-plugin-sa-token-z72ht           kubernetes.io/service-account-token   3      40m
rook-csi-rbd-provisioner                     kubernetes.io/rook                    2      35m
rook-csi-rbd-provisioner-sa-token-bcrdv      kubernetes.io/service-account-token   3      40m

[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# more storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs

  pool: myfs-data0

  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
mountOptions:
[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# more pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: csi-cephfs

How can I solve this problem, thank you for your help.

@Madhu-1
Member

Madhu-1 commented Nov 6, 2019

The logs say the StorageClass was not found; please check:

Warning ProvisioningFailed 18s (x8 over 2m48s) persistentvolume-controller storageclass.storage.k8s.io "csi-cephfs" not found
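
A way to confirm and, if needed, re-create the StorageClass from the example manifest the reporter is already using (path relative to the rook-1.1.6 source tree):

# does the StorageClass exist?
kubectl get storageclass csi-cephfs
# if not, apply the CephFS example StorageClass
kubectl apply -f cluster/examples/kubernetes/ceph/csi/cephfs/storageclass.yaml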

@Jamiejoin
Author

Thanks. I have now created the storageclass. Below is the error that is currently displayed:

[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# kubectl get storageclass 
NAME         PROVISIONER                     AGE
csi-cephfs   rook-ceph.cephfs.csi.ceph.com   4m52s
[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# kubectl get pvc
NAME         STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cephfs-pvc   Pending                                      csi-cephfs     4m41s
[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# kubectl describe pvc cephfs-pvc 
Name:          cephfs-pvc
Namespace:     default
StorageClass:  csi-cephfs
Status:        Pending
Volume:        
Labels:        <none>
Annotations:   kubectl.kubernetes.io/last-applied-configuration:
                 {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"cephfs-pvc","namespace":"default"},"spec":{"accessM...
               volume.beta.kubernetes.io/storage-provisioner: rook-ceph.cephfs.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Events:
  Type       Reason                Age                  From                                                                                               Message
  ----       ------                ----                 ----                                                                                               -------
  Warning    ProvisioningFailed    2m25s                rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-0_2e0face1-0076-11ea-b92e-d2d56a58f28f  failed to provision volume with StorageClass "csi-cephfs": rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Normal     Provisioning          17s (x9 over 4m55s)  rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-0_2e0face1-0076-11ea-b92e-d2d56a58f28f  External provisioner is provisioning volume for claim "default/cephfs-pvc"
  Warning    ProvisioningFailed    17s (x8 over 2m24s)  rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-0_2e0face1-0076-11ea-b92e-d2d56a58f28f  failed to provision volume with StorageClass "csi-cephfs": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
  Normal     ExternalProvisioning  1s (x21 over 4m46s)  persistentvolume-controller                                                                        waiting for a volume to be created, either by external provisioner "rook-ceph.cephfs.csi.ceph.com" or manually created by system administrator
Mounted By:  <none>

Below is the csi-provisioner error log

I1106 10:12:55.556113       1 controller.go:1196] provision "default/cephfs-pvc" class "csi-cephfs": started
I1106 10:12:55.561609       1 controller.go:471] CreateVolumeRequest {Name:pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c CapacityRange:required_bytes:107374182400  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[clusterID:rook-ceph csi.storage.k8s.io/node-stage-secret-name:rook-csi-cephfs-node csi.storage.k8s.io/node-stage-secret-namespace:rook-ceph csi.storage.k8s.io/provisioner-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/provisioner-secret-namespace:rook-ceph fsName:myfs pool:myfs-data0] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1106 10:12:55.561687       1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"cephfs-pvc", UID:"0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c", APIVersion:"v1", ResourceVersion:"53848426", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/cephfs-pvc"
I1106 10:12:55.563272       1 connection.go:180] GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:12:55.563284       1 connection.go:181] GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:12:55.567802       1 connection.go:183] GRPC response: {}
I1106 10:12:55.568253       1 connection.go:184] GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
I1106 10:12:55.568298       1 controller.go:979] Final error received, removing PVC 0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c from claims in progress
W1106 10:12:55.568323       1 controller.go:886] Retrying syncing claim "0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c", failure 9
E1106 10:12:55.568339       1 controller.go:908] error syncing claim "0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c": failed to provision volume with StorageClass "csi-cephfs": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
I1106 10:12:55.568395       1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"cephfs-pvc", UID:"0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c", APIVersion:"v1", ResourceVersion:"53848426", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists

@Madhu-1
Member

Madhu-1 commented Nov 6, 2019

Please provide the cephfs-plugin container logs; I suspect some commands are stuck. If possible, exec into the cephfs-plugin container and check the currently running processes (a sketch is below).
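
One way to do both, assuming the pod and container names shown later in this thread (csi-cephfsplugin-provisioner-0, container csi-cephfsplugin); ps may not be present in every image, so this is only a sketch:

# driver-side logs from the provisioner pod
kubectl -n rook-ceph logs csi-cephfsplugin-provisioner-0 -c csi-cephfsplugin
# check for a hung "ceph fs subvolumegroup create ..." process inside the container
kubectl -n rook-ceph exec -it csi-cephfsplugin-provisioner-0 -c csi-cephfsplugin -- ps -ef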

@Jamiejoin
Author

Jamiejoin commented Nov 6, 2019

@Madhu-1 Thank you.
Below is the log of csi-cephfsplugin-provisioner-0:

[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# kubectl logs -f  csi-cephfsplugin-provisioner-0 csi-cephfsplugin -n rook-ceph    
I1106 09:16:55.920292       1 cephcsi.go:103] Driver version: v1.2.1 and Git version: ecac134c38c97ac04b2e80395ecbb6231c0ec250
I1106 09:16:55.920885       1 cachepersister.go:45] cache-perister: using kubernetes configmap as metadata cache persister
I1106 09:16:55.934041       1 cephcsi.go:130] Initial PID limit is set to -1
I1106 09:16:55.934140       1 cephcsi.go:139] Reconfigured PID limit to -1 (max)
I1106 09:16:55.934193       1 cephcsi.go:158] Starting driver type: cephfs with name: rook-ceph.cephfs.csi.ceph.com
I1106 09:16:55.935206       1 util.go:48] cephfs: EXEC uname [-r]
I1106 09:16:55.936044       1 volumemounter.go:77] kernel version < 4.17 might not support quota feature, hence not loading kernel client
I1106 09:16:55.947205       1 volumemounter.go:82] loaded mounter: fuse
I1106 09:16:55.947438       1 mountcache.go:59] mount-cache: name: rook-ceph.cephfs.csi.ceph.com, version: v1.2.1, mountCacheDir:
I1106 09:16:55.947456       1 driver.go:86] Enabling controller service capability: CREATE_DELETE_VOLUME
I1106 09:16:55.947464       1 driver.go:98] Enabling volume access mode: MULTI_NODE_MULTI_WRITER
I1106 09:16:55.947740       1 server.go:118] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi-provisioner.sock", Net:"unix"}
I1106 09:16:56.449534       1 utils.go:119] ID: 1 GRPC call: /csi.v1.Identity/Probe
I1106 09:16:56.449550       1 utils.go:120] ID: 1 GRPC request: {}
I1106 09:16:56.450485       1 utils.go:125] ID: 1 GRPC response: {}
I1106 09:16:56.452639       1 utils.go:119] ID: 2 GRPC call: /csi.v1.Identity/GetPluginInfo
I1106 09:16:56.452648       1 utils.go:120] ID: 2 GRPC request: {}
I1106 09:16:56.453106       1 identityserver-default.go:37] ID: 2 Using default GetPluginInfo
I1106 09:16:56.453116       1 utils.go:125] ID: 2 GRPC response: {"name":"rook-ceph.cephfs.csi.ceph.com","vendor_version":"v1.2.1"}
I1106 09:16:56.456217       1 utils.go:119] ID: 3 GRPC call: /csi.v1.Identity/GetPluginCapabilities
I1106 09:16:56.456228       1 utils.go:120] ID: 3 GRPC request: {}
I1106 09:16:56.456821       1 utils.go:125] ID: 3 GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}}]}
I1106 09:16:56.459661       1 utils.go:119] ID: 4 GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I1106 09:16:56.459672       1 utils.go:120] ID: 4 GRPC request: {}
I1106 09:16:56.460177       1 controllerserver-default.go:63] ID: 4 Using default ControllerGetCapabilities
I1106 09:16:56.460200       1 utils.go:125] ID: 4 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I1106 09:16:56.701534       1 utils.go:119] ID: 5 GRPC call: /csi.v1.Identity/Probe
I1106 09:16:56.701547       1 utils.go:120] ID: 5 GRPC request: {}
I1106 09:16:56.702060       1 utils.go:125] ID: 5 GRPC response: {}

I1106 09:16:56.703833       1 utils.go:119] ID: 6 GRPC call: /csi.v1.Identity/GetPluginInfo
I1106 09:16:56.703843       1 utils.go:120] ID: 6 GRPC request: {}
I1106 09:16:56.704751       1 identityserver-default.go:37] ID: 6 Using default GetPluginInfo
I1106 09:16:56.704800       1 utils.go:125] ID: 6 GRPC response: {"name":"rook-ceph.cephfs.csi.ceph.com","vendor_version":"v1.2.1"}
I1106 09:16:56.708492       1 utils.go:119] ID: 7 GRPC call: /csi.v1.Identity/GetPluginCapabilities
I1106 09:16:56.708504       1 utils.go:120] ID: 7 GRPC request: {}
I1106 09:16:56.709004       1 utils.go:125] ID: 7 GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}}]}
I1106 09:16:56.711588       1 utils.go:119] ID: 8 GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I1106 09:16:56.711597       1 utils.go:120] ID: 8 GRPC request: {}
I1106 09:16:56.712052       1 controllerserver-default.go:63] ID: 8 Using default ControllerGetCapabilities
I1106 09:16:56.712078       1 utils.go:125] ID: 8 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I1106 09:17:56.148428       1 utils.go:119] ID: 9 GRPC call: /csi.v1.Identity/Probe
I1106 09:17:56.148440       1 utils.go:120] ID: 9 GRPC request: {}
I1106 09:17:56.149002       1 utils.go:125] ID: 9 GRPC response: {}
I1106 09:18:56.148490       1 utils.go:119] ID: 10 GRPC call: /csi.v1.Identity/Probe
I1106 09:18:56.148504       1 utils.go:120] ID: 10 GRPC request: {}
I1106 09:18:56.149130       1 utils.go:125] ID: 10 GRPC response: {}
I1106 09:19:56.148232       1 utils.go:119] ID: 11 GRPC call: /csi.v1.Identity/Probe
I1106 09:19:56.148262       1 utils.go:120] ID: 11 GRPC request: {}
I1106 09:19:56.148816       1 utils.go:125] ID: 11 GRPC response: {}
I1106 09:20:56.148058       1 utils.go:119] ID: 12 GRPC call: /csi.v1.Identity/Probe
I1106 09:20:56.148071       1 utils.go:120] ID: 12 GRPC request: {}
I1106 09:20:56.148636       1 utils.go:125] ID: 12 GRPC response: {}
I1106 09:21:56.148191       1 utils.go:119] ID: 13 GRPC call: /csi.v1.Identity/Probe
I1106 09:21:56.148220       1 utils.go:120] ID: 13 GRPC request: {}
I1106 09:21:56.148654       1 utils.go:125] ID: 13 GRPC response: {}
I1106 09:22:56.148212       1 utils.go:119] ID: 14 GRPC call: /csi.v1.Identity/Probe
I1106 09:22:56.148278       1 utils.go:120] ID: 14 GRPC request: {}
I1106 09:22:56.148832       1 utils.go:125] ID: 14 GRPC response: {}
I1106 09:23:56.148207       1 utils.go:119] ID: 15 GRPC call: /csi.v1.Identity/Probe
I1106 09:23:56.148220       1 utils.go:120] ID: 15 GRPC request: {}
I1106 09:23:56.148728       1 utils.go:125] ID: 15 GRPC response: {}
I1106 09:24:56.148072       1 utils.go:119] ID: 16 GRPC call: /csi.v1.Identity/Probe
I1106 09:24:56.148087       1 utils.go:120] ID: 16 GRPC request: {}
I1106 09:24:56.148643       1 utils.go:125] ID: 16 GRPC response: {}
I1106 09:25:56.148091       1 utils.go:119] ID: 17 GRPC call: /csi.v1.Identity/Probe
I1106 09:25:56.148105       1 utils.go:120] ID: 17 GRPC request: {}
I1106 09:48:56.148717       1 utils.go:125] ID: 40 GRPC response: {}
......
I1106 09:49:17.833398       1 utils.go:119] ID: 41 GRPC call: /csi.v1.Controller/CreateVolume
I1106 09:49:17.833413       1 utils.go:120] ID: 41 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-b848c474-007a-11ea-bf71-246e965dfe0c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 09:49:17.835576       1 util.go:48] ID: 41 cephfs: EXEC ceph [-m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs get myfs --format=json]
I1106 09:49:18.234000       1 util.go:48] ID: 41 cephfs: EXEC ceph [-m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs ls --format=json]
I1106 09:49:18.975394       1 fsjournal.go:137] ID: 41 Generated Volume ID (0001-0009-rook-ceph-0000000000000001-b3949eee-007a-11ea-923b-d2d56a58f28f) and subvolume name (csi-vol-b3949eee-007a-11ea-923b-d2d56a58f28f) for request name (pvc-b848c474-007a-11ea-bf71-246e965dfe0c)
I1106 09:49:18.975501       1 util.go:48] ID: 41 cephfs: EXEC ceph [fs subvolumegroup create myfs csi -m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 -c /etc/ceph/ceph.conf -n client.csi-cephfs-provisioner --keyfile=***stripped***]
I1106 09:49:56.148582       1 utils.go:119] ID: 42 GRPC call: /csi.v1.Identity/Probe
I1106 09:49:56.148605       1 utils.go:120] ID: 42 GRPC request: {}
I1106 09:49:56.149408       1 utils.go:125] ID: 42 GRPC response: {}
I1106 09:50:56.148231       1 utils.go:119] ID: 43 GRPC call: /csi.v1.Identity/Probe
I1106 09:50:56.148246       1 utils.go:120] ID: 43 GRPC request: {}
I1106 09:50:56.148833       1 utils.go:125] ID: 43 GRPC response: {}
I1106 09:51:56.148160       1 utils.go:119] ID: 44 GRPC call: /csi.v1.Identity/Probe
I1106 09:51:56.148173       1 utils.go:120] ID: 44 GRPC request: {}
I1106 09:51:56.148725       1 utils.go:125] ID: 44 GRPC response: {}
I1106 09:52:56.148142       1 utils.go:119] ID: 45 GRPC call: /csi.v1.Identity/Probe
I1106 09:52:56.148155       1 utils.go:120] ID: 45 GRPC request: {}
I1106 09:52:56.148634       1 utils.go:125] ID: 45 GRPC response: {}
.....
I1106 10:06:09.969552       1 utils.go:119] ID: 59 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:06:09.969568       1 utils.go:120] ID: 59 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:06:09.971422       1 util.go:48] ID: 59 cephfs: EXEC ceph [-m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs get myfs --format=json]
I1106 10:06:10.356923       1 util.go:48] ID: 59 cephfs: EXEC ceph [-m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs ls --format=json]
I1106 10:06:11.073464       1 fsjournal.go:137] ID: 59 Generated Volume ID (0001-0009-rook-ceph-0000000000000001-0ed64f8b-007d-11ea-923b-d2d56a58f28f) and subvolume name (csi-vol-0ed64f8b-007d-11ea-923b-d2d56a58f28f) for request name (pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c)
I1106 10:06:11.073604       1 util.go:48] ID: 59 cephfs: EXEC ceph [fs subvolumegroup create myfs csi -m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 -c /etc/ceph/ceph.conf -n client.csi-cephfs-provisioner --keyfile=***stripped***]
I1106 10:06:56.148119       1 utils.go:119] ID: 60 GRPC call: /csi.v1.Identity/Probe
I1106 10:06:56.148132       1 utils.go:120] ID: 60 GRPC request: {}
I1106 10:06:56.148677       1 utils.go:125] ID: 60 GRPC response: {}
I1106 10:07:39.951272       1 utils.go:119] ID: 61 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:07:39.951287       1 utils.go:120] ID: 61 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-e809e53d-007a-11ea-bf71-246e965dfe0c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:07:39.952983       1 util.go:48] ID: 61 cephfs: EXEC ceph [-m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs get myfs --format=json]
I1106 10:07:40.339689       1 util.go:48] ID: 61 cephfs: EXEC ceph [-m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf fs ls --format=json]
I1106 10:07:41.065818       1 fsjournal.go:137] ID: 61 Generated Volume ID (0001-0009-rook-ceph-0000000000000001-447a5be0-007d-11ea-923b-d2d56a58f28f) and subvolume name (csi-vol-447a5be0-007d-11ea-923b-d2d56a58f28f) for request name (pvc-e809e53d-007a-11ea-bf71-246e965dfe0c)
I1106 10:07:41.065910       1 util.go:48] ID: 61 cephfs: EXEC ceph [fs subvolumegroup create myfs csi -m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 -c /etc/ceph/ceph.conf -n client.csi-cephfs-provisioner --keyfile=***stripped***]
I1106 10:08:41.484611       1 utils.go:119] ID: 64 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:08:41.484626       1 utils.go:120] ID: 64 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:08:41.486329       1 controllerserver.go:82] ID: 64 an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
E1106 10:08:41.486371       1 utils.go:123] ID: 64 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
I1106 10:08:43.493665       1 utils.go:119] ID: 65 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:08:43.493680       1 utils.go:120] ID: 65 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:08:43.495510       1 controllerserver.go:82] ID: 65 an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
E1106 10:08:43.495524       1 utils.go:123] ID: 65 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
I1106 10:08:47.504567       1 utils.go:119] ID: 66 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:08:47.504584       1 utils.go:120] ID: 66 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:08:47.506527       1 controllerserver.go:82] ID: 66 an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
E1106 10:08:47.506542       1 utils.go:123] ID: 66 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
I1106 10:08:55.516937       1 utils.go:119] ID: 67 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:08:55.516952       1 utils.go:120] ID: 67 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:08:55.518644       1 controllerserver.go:82] ID: 67 an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
E1106 10:08:55.518677       1 utils.go:123] ID: 67 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
I1106 10:08:56.148273       1 utils.go:119] ID: 68 GRPC call: /csi.v1.Identity/Probe
I1106 10:08:56.148285       1 utils.go:120] ID: 68 GRPC request: {}
I1106 10:08:56.148739       1 utils.go:125] ID: 68 GRPC response: {}
I1106 10:09:11.529568       1 utils.go:119] ID: 69 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:09:11.529587       1 utils.go:120] ID: 69 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:09:11.532217       1 controllerserver.go:82] ID: 69 an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
E1106 10:09:11.532232       1 utils.go:123] ID: 69 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
I1106 10:09:43.542983       1 utils.go:119] ID: 70 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:09:43.542998       1 utils.go:120] ID: 70 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:09:43.544794       1 controllerserver.go:82] ID: 70 an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
E1106 10:09:43.544813       1 utils.go:123] ID: 70 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-0d1ae31c-007d-11ea-bcbe-ecf4bbd8d22c already exists
I1106 10:09:56.148045       1 utils.go:119] ID: 71 GRPC call: /csi.v1.Identity/Probe
I1106 10:09:56.148057       1 utils.go:120] ID: 71 GRPC request: {}
I1106 10:09:56.148574       1 utils.go:125] ID: 71 GRPC response: {}
I1106 10:10:10.457748       1 utils.go:119] ID: 72 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:10:10.457794       1 utils.go:120] ID: 72 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-e809e53d-007a-11ea-bf71-246e965dfe0c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:10:10.461669       1 controllerserver.go:82] ID: 72 an operation with the given Volume ID pvc-e809e53d-007a-11ea-bf71-246e965dfe0c already exists
E1106 10:10:10.461686       1 utils.go:123] ID: 72 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-e809e53d-007a-11ea-bf71-246e965dfe0c already exists
I1106 10:10:11.468363       1 utils.go:119] ID: 73 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:10:11.468392       1 utils.go:120] ID: 73 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-e809e53d-007a-11ea-bf71-246e965dfe0c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:10:11.470080       1 controllerserver.go:82] ID: 73 an operation with the given Volume ID pvc-e809e53d-007a-11ea-bf71-246e965dfe0c already exists
E1106 10:10:11.470098       1 utils.go:123] ID: 73 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-e809e53d-007a-11ea-bf71-246e965dfe0c already exists
I1106 10:10:13.478268       1 utils.go:119] ID: 74 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:10:13.478303       1 utils.go:120] ID: 74 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-e809e53d-007a-11ea-bf71-246e965dfe0c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:10:13.480012       1 controllerserver.go:82] ID: 74 an operation with the given Volume ID pvc-e809e53d-007a-11ea-bf71-246e965dfe0c already exists
E1106 10:10:13.480027       1 utils.go:123] ID: 74 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-e809e53d-007a-11ea-bf71-246e965dfe0c already exists
I1106 10:10:17.487376       1 utils.go:119] ID: 75 GRPC call: /csi.v1.Controller/CreateVolume
I1106 10:10:17.487391       1 utils.go:120] ID: 75 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-e809e53d-007a-11ea-bf71-246e965dfe0c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1106 10:10:17.489525       1 controllerserver.go:82] ID: 75 an operation with the given Volume ID pvc-e809e53d-007a-11ea-bf71-246e965dfe0c already exists
E1106 10:10:17.489554       1 utils.go:123] ID: 75 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-e809e53d-007a-11ea-bf71-246e965dfe0c already exists

Below is the node log


{"log":"I1106 09:16:51.450687       1 cephcsi.go:103] Driver version: v1.2.1 and Git version: ecac134c38c97ac04b2e80395ecbb6231c0ec250\n","stream":"stderr","ti
me":"2019-11-06T09:16:51.450817115Z"}
{"log":"I1106 09:16:51.450798       1 cachepersister.go:45] cache-perister: using kubernetes configmap as metadata cache persister\n","stream":"stderr","time":
"2019-11-06T09:16:51.450881997Z"}
{"log":"I1106 09:16:51.453178       1 cephcsi.go:130] Initial PID limit is set to -1\n","stream":"stderr","time":"2019-11-06T09:16:51.453276544Z"}
{"log":"I1106 09:16:51.453375       1 cephcsi.go:139] Reconfigured PID limit to -1 (max)\n","stream":"stderr","time":"2019-11-06T09:16:51.45342785Z"}
{"log":"I1106 09:16:51.453385       1 cephcsi.go:158] Starting driver type: cephfs with name: rook-ceph.cephfs.csi.ceph.com\n","stream":"stderr","time":"2019-1
1-06T09:16:51.453441505Z"}
{"log":"I1106 09:16:51.454122       1 util.go:48] cephfs: EXEC uname [-r]\n","stream":"stderr","time":"2019-11-06T09:16:51.454241406Z"}
{"log":"I1106 09:16:51.454847       1 volumemounter.go:77] kernel version \u003c 4.17 might not support quota feature, hence not loading kernel client\n","stre
am":"stderr","time":"2019-11-06T09:16:51.455036738Z"}
{"log":"I1106 09:16:51.466103       1 volumemounter.go:82] loaded mounter: fuse\n","stream":"stderr","time":"2019-11-06T09:16:51.466177242Z"}
{"log":"I1106 09:16:51.466340       1 mountcache.go:59] mount-cache: name: rook-ceph.cephfs.csi.ceph.com, version: v1.2.1, mountCacheDir: /mount-cache-dir\n","
stream":"stderr","time":"2019-11-06T09:16:51.466388924Z"}
{"log":"I1106 09:16:51.466454       1 mountcache.go:99] mount-cache: successfully remounted 0 volumes\n","stream":"stderr","time":"2019-11-06T09:16:51.46649556
1Z"}
{"log":"I1106 09:16:51.466837       1 server.go:118] Listening for connections on address: \u0026net.UnixAddr{Name:\"//csi/csi.sock\", Net:\"unix\"}\n","stream
":"stderr","time":"2019-11-06T09:16:51.466904864Z"}
{"log":"I1106 09:16:52.186937       1 utils.go:119] ID: 1 GRPC call: /csi.v1.Identity/GetPluginInfo\n","stream":"stderr","time":"2019-11-06T09:16:52.187044133Z
"}
{"log":"I1106 09:16:52.186982       1 utils.go:120] ID: 1 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:16:52.187868349Z"}
{"log":"I1106 09:16:52.187837       1 identityserver-default.go:37] ID: 1 Using default GetPluginInfo\n","stream":"stderr","time":"2019-11-06T09:16:52.18788370
2Z"}
{"log":"I1106 09:16:52.187852       1 utils.go:125] ID: 1 GRPC response: {\"name\":\"rook-ceph.cephfs.csi.ceph.com\",\"vendor_version\":\"v1.2.1\"}\n","stream"
:"stderr","time":"2019-11-06T09:16:52.188374606Z"}
{"log":"I1106 09:16:52.187852       1 utils.go:125] ID: 1 GRPC response: {\"name\":\"rook-ceph.cephfs.csi.ceph.com\",\"vendor_version\":\"v1.2.1\"}\n","stream"
:"stderr","time":"2019-11-06T09:16:52.188374606Z"}
{"log":"I1106 09:16:52.196424       1 utils.go:119] ID: 2 GRPC call: /csi.v1.Node/NodeGetInfo\n","stream":"stderr","time":"2019-11-06T09:16:52.196507561Z"}
{"log":"I1106 09:16:52.196435       1 utils.go:120] ID: 2 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:16:52.198217786Z"}
{"log":"I1106 09:16:52.198176       1 nodeserver-default.go:58] ID: 2 Using default NodeGetInfo\n","stream":"stderr","time":"2019-11-06T09:16:52.198229808Z"}
{"log":"I1106 09:16:52.198191       1 utils.go:125] ID: 2 GRPC response: {\"node_id\":\"bj-jd-oam-k8snode-test-0012\"}\n","stream":"stderr","time
":"2019-11-06T09:16:52.198849967Z"}
{"log":"I1106 09:17:51.665550       1 utils.go:119] ID: 3 GRPC call: /csi.v1.Identity/Probe\n","stream":"stderr","time":"2019-11-06T09:17:51.665690952Z"}
{"log":"I1106 09:17:51.665562       1 utils.go:120] ID: 3 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:17:51.666165872Z"}
{"log":"I1106 09:17:51.666111       1 utils.go:125] ID: 3 GRPC response: {}\n","stream":"stderr","time":"2019-11-06T09:17:51.666577041Z"}
{"log":"I1106 09:18:51.665357       1 utils.go:119] ID: 4 GRPC call: /csi.v1.Identity/Probe\n","stream":"stderr","time":"2019-11-06T09:18:51.665454695Z"}
{"log":"I1106 09:18:51.665371       1 utils.go:120] ID: 4 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:18:51.666074468Z"}
{"log":"I1106 09:18:51.665994       1 utils.go:125] ID: 4 GRPC response: {}\n","stream":"stderr","time":"2019-11-06T09:18:51.666471412Z"}
{"log":"I1106 09:19:51.665258       1 utils.go:119] ID: 5 GRPC call: /csi.v1.Identity/Probe\n","stream":"stderr","time":"2019-11-06T09:19:51.665335087Z"}
{"log":"I1106 09:19:51.665270       1 utils.go:120] ID: 5 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:19:51.665845881Z"}
{"log":"I1106 09:19:51.665812       1 utils.go:125] ID: 5 GRPC response: {}\n","stream":"stderr","time":"2019-11-06T09:19:51.666294915Z"}
{"log":"I1106 09:20:51.665351       1 utils.go:119] ID: 6 GRPC call: /csi.v1.Identity/Probe\n","stream":"stderr","time":"2019-11-06T09:20:51.665405584Z"}
{"log":"I1106 09:20:51.665366       1 utils.go:120] ID: 6 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:20:51.666012182Z"}
{"log":"I1106 09:20:51.665935       1 utils.go:125] ID: 6 GRPC response: {}\n","stream":"stderr","time":"2019-11-06T09:20:51.666482737Z"}
{"log":"I1106 09:21:51.665332       1 utils.go:119] ID: 7 GRPC call: /csi.v1.Identity/Probe\n","stream":"stderr","time":"2019-11-06T09:21:51.665383899Z"}
{"log":"I1106 09:21:51.665345       1 utils.go:120] ID: 7 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:21:51.666032724Z"}
{"log":"I1106 09:21:51.665942       1 utils.go:125] ID: 7 GRPC response: {}\n","stream":"stderr","time":"2019-11-06T09:21:51.666426379Z"}
{"log":"I1106 09:22:51.665454       1 utils.go:119] ID: 8 GRPC call: /csi.v1.Identity/Probe\n","stream":"stderr","time":"2019-11-06T09:22:51.665582104Z"}
{"log":"I1106 09:22:51.665491       1 utils.go:120] ID: 8 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:22:51.666164068Z"}
{"log":"I1106 09:22:51.666086       1 utils.go:125] ID: 8 GRPC response: {}\n","stream":"stderr","time":"2019-11-06T09:22:51.666683948Z"}
{"log":"I1106 09:23:51.665358       1 utils.go:119] ID: 9 GRPC call: /csi.v1.Identity/Probe\n","stream":"stderr","time":"2019-11-06T09:23:51.665472153Z"}
{"log":"I1106 09:23:51.665394       1 utils.go:120] ID: 9 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:23:51.666058423Z"}
{"log":"I1106 09:23:51.665973       1 utils.go:125] ID: 9 GRPC response: {}\n","stream":"stderr","time":"2019-11-06T09:23:51.666492808Z"}
{"log":"I1106 09:24:51.665365       1 utils.go:119] ID: 10 GRPC call: /csi.v1.Identity/Probe\n","stream":"stderr","time":"2019-11-06T09:24:51.665481312Z"}
{"log":"I1106 09:24:51.665385       1 utils.go:120] ID: 10 GRPC request: {}\n","stream":"stderr","time":"2019-11-06T09:24:51.666021264Z"}

@Madhu-1
Copy link
Member

Madhu-1 commented Nov 6, 2019

I don't see any logs indicating where the command is stuck. Can you run the "ps -aux|grep rados" command in the cephfs-plugin container?
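
For reference, one way to run that check through kubectl rather than docker exec (a rough sketch, assuming the pod label app=csi-cephfsplugin and the container name csi-cephfsplugin from the default Rook CSI manifests):

# find the cephfs plugin pod running on the affected node
kubectl -n rook-ceph get pods -l app=csi-cephfsplugin -o wide
# list the processes inside the plugin container, then look for rados/ceph entries
kubectl -n rook-ceph exec -it <csi-cephfsplugin-pod> -c csi-cephfsplugin -- ps aux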

@Jamiejoin
Copy link
Author

Jamiejoin commented Nov 6, 2019

@Madhu-1 Thank you. Below is the result of running the "ps -aux|grep rados" command in the csi-cephfsplugin container on the k8s node. I did not see any rados process.

[10.9.0.21:root@bj-jd-oam-k8snode-test-0012:/opt]# docker ps |grep cephfsplugin_csi-cephfsplugin-
a22942fa27cf        4ed9912287b5                                                     "/usr/local/bin/ceph…"    About an hour ago   Up About an hour                        k8s_csi-cephfsplugin_csi-cephfsplugin-p9jkw_rook-ceph_315c7ab6-0076-11ea-bcbe-ecf4bbd8d22c_0
[10.9.0.21:root@bj-jd-oam-k8snode-test-0012:/opt]# docker exec -it a22942fa27cf  /bin/bash       
bash: warning: setlocale: LC_CTYPE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_COLLATE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_MESSAGES: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_NUMERIC: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory
[root@bj-jd-oam-k8snode-test-0012 /]# ps -aux|grep rados
root        131  0.0  0.0   9092   668 pts/1    S+   10:34   0:00 grep --color=auto rados

@Madhu-1
Copy link
Member

Madhu-1 commented Nov 6, 2019

Have you created the filesystem?

The log provided above in #4251 (comment) is not complete, because the request logging starts from ID: 70. We need the logs from the beginning to check what's going on.
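
For completeness, one way to pull the provisioner log from the very beginning (a sketch; the label and container name assume the default Rook CSI provisioner spec):

kubectl -n rook-ceph get pods -l app=csi-cephfsplugin-provisioner
kubectl -n rook-ceph logs <csi-cephfsplugin-provisioner-pod> -c csi-cephfsplugin > provisioner.log
# if the container has restarted, the previous instance's log can matter too
kubectl -n rook-ceph logs <csi-cephfsplugin-provisioner-pod> -c csi-cephfsplugin --previous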

@Jamiejoin
Copy link
Author

Jamiejoin commented Nov 6, 2019

@Madhu-1

  1. I created CephFS by following https://github.com/rook/rook/blob/master/Documentation/ceph-filesystem.md

  2. I re-posted the Pvc status is pending #4251 (comment) log, omitting some of the duplicated log content.

@Madhu-1
Copy link
Member

Madhu-1 commented Nov 6, 2019

From the logs:
I1106 10:07:41.065910 1 util.go:48] ID: 61 cephfs: EXEC ceph [fs subvolumegroup create myfs csi -m 10.9.63.15:6789,10.9.63.153:6789,10.9.60.115:6789 -c /etc/ceph/ceph.conf -n client.csi-cephfs-provisioner --keyfile=***stripped***]
the command is stuck.
@ajarr any idea?

@xiaoqge please provide the ceph version

@Jamiejoin
Copy link
Author

The version is as follows:

[root@bj-jd-oam-k8snode-test-0012 /]# ceph version
ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)

@ajarr ajarr added the ceph main ceph tag label Nov 6, 2019
@kevinfaveri
Copy link

kevinfaveri commented Nov 7, 2019

Same here... PVC stays in pending and throws this error:

Warning  ProvisioningFailed  2m19s  rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-78d9994b5d-ssjtg_54376741-010c-11ea-95e4-d2289ae4f24a  failed to provision volume with StorageClass "fs-app-homolog": rpc error: code = DeadlineExceeded desc = context deadline exceeded

@galexrt
Copy link
Member

galexrt commented Nov 7, 2019

@kevinfaguiar @xiaoqge please post the output of the rook version command from inside the rook-ceph-operator Pod. Also, which Kubernetes version is the cluster running?

@xiaoqge please use the documentation for the Rook version you are running instead of the master branch documentation, e.g., https://rook.io/docs/rook/v1.1/ and/or the documentation and manifests from the release-1.1 branch in the repo for Rook 1.1.
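
For example, the requested information can be gathered with something like this (a sketch assuming the default app=rook-ceph-operator label):

kubectl version --short
kubectl -n rook-ceph get pod -l app=rook-ceph-operator
kubectl -n rook-ceph exec -it <rook-ceph-operator-pod> -- rook version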

@Jamiejoin
Copy link
Author

Jamiejoin commented Nov 7, 2019

@galexrt Thank you

I am using kubernetes 1.13.7 on CentOS Linux release 7.5.1804, and the Rook version is 1.1.6.

[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.7-dirty", GitCommit:"4683545293d792934a7a7e12f2cc47d20b2dd01b", GitTreeState:"dirty", BuildDate:"2019-06-11T02:51:57Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# 
[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.6/cluster/examples/kubernetes/ceph/csi/cephfs]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core) 
[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/root]# kubectl exec -it rook-ceph-operator-7975f875fc-k2rz6 /bin/bash -n rook-ceph
bash: warning: setlocale: LC_CTYPE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_COLLATE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_MESSAGES: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_NUMERIC: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory
[root@rook-ceph-operator-7975f875fc-k2rz6 /]# rook version
rook: v1.1.6
go: go1.11

I compared https://rook.io/docs/rook/v1.1/ceph-filesystem.html with the master documentation and the content is no different.

@kevinfaveri
Copy link

kevinfaveri commented Nov 7, 2019

rook version

My Rook version is:

rook: v1.1.6
go: go1.11

My Kubernetes version is (I'm using AWS EKS):

Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.7-eks-e9b1d0", GitCommit:"e9b1d0551216e1e8ace5ee4ca50161df34325ec2", GitTreeState:"clean", BuildDate:"2019-09-21T08:33:01Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}

Also, some of the pods are stuck in a Pending state instead of Running... could that be affecting PVC provisioning?

rook-ceph-osd-0-57f6c57b5b-qm8rk                0/1     Pending     0          8h
rook-ceph-osd-1-7f64675d8-vcs89                 0/1     Pending     0          8h
rook-ceph-osd-2-787ffb5b6c-rn2x9                0/1     Pending     0          8h

As for the events on the rook-ceph-osd-{ID} pods that are in the Pending state:

Events:
  Type     Reason            Age                      From               Message
  ----     ------            ----                     ----               -------
  Warning  FailedScheduling  3m48s (x324 over 7h59m)  default-scheduler  0/6 nodes are available: 1 Insufficient pods, 5 node(s) didn't match node selector, 6 Insufficient cpu, 6 Insufficient memory.

How much CPU + Memory is needed for Rook? (There are some minimum values, right? Like 2Gi of memory...)


For cluster creation I'm using cluster-on-pvc, which is basically this file: https://github.com/rook/rook/blob/release-1.1/cluster/examples/kubernetes/ceph/cluster-on-pvc.yaml
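
In case it helps to narrow down the Insufficient cpu/memory scheduling failures, here is one way to compare what the OSD pods request against what the nodes can offer (a sketch; the pod name is taken from the listing above):

# resource requests/limits of one of the pending OSD pods
kubectl -n rook-ceph get pod rook-ceph-osd-0-57f6c57b5b-qm8rk -o jsonpath='{.spec.containers[0].resources}'
# what the scheduler sees as allocatable on each node
kubectl describe nodes | grep -A 6 Allocatable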

@Jamiejoin
Copy link
Author

@Madhu-1 Is there any good solution?

@Jamiejoin
Copy link
Author

Jamiejoin commented Nov 12, 2019

@kevinfaguiar You need to get your OSDs running normally first; otherwise the PVC will keep waiting. Your problem is different from the one I encountered, so yours is not a bug.
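
For example, the OSD state can be confirmed from the Rook toolbox with the standard Ceph commands:

ceph osd stat
ceph osd tree
ceph -s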

@Jamiejoin
Copy link
Author

What could be the cause of this problem? I can try to help verify it, but I don't have any ideas at the moment. Thank you very much.

@Madhu-1
Copy link
Member

Madhu-1 commented Nov 12, 2019

@xiaoqge it looks like a cephfs core issue. Can you provide the ps -ef|grep ceph output from the cephcsi-plugin container? @ramana can you please check?

@Jamiejoin
Copy link
Author

Jamiejoin commented Nov 12, 2019

Thank you. I entered the k8s_csi-cephfsplugin_csi-cephfsplugin-provisioner container and ran ps; the results are as follows:

[root@csi-cephfsplugin-provisioner-0 /]# ps -ef|grep ceph
root          1      0  0 03:11 ?        00:00:02 /usr/local/bin/cephcsi --nodeid=bj-jd-oam-k8snode-test-0013 --type=cephfs --endpoint=unix:///csi/csi-provisioner.sock --v=5 --controllerserver=true --drivername=rook-cep.cephfs.csi.ceph.com --metadatastorage=k8s_configmap --pidlimit=-1 --metricsport=9091 --metricspath=/metrics --enablegrpcmetrics=true
root        296      1  0 03:30 ?        00:00:08 /usr/bin/python2.7 /usr/bin/ceph fs subvolumegroup create myfs csi -m 10.9.62.215:6789,10.9.60.180:6789,10.9.61.73:6789 -c /etc/ceph/ceph.conf -n client.admin --keyfile=/tmp/csi/keys/keyfile-673916667
root        469      1  0 04:02 ?        00:00:07 /usr/bin/python2.7 /usr/bin/ceph fs subvolumegroup create myfs csi -m 10.9.62.215:6789,10.9.60.180:6789,10.9.61.73:6789 -c /etc/ceph/ceph.conf -n client.admin --keyfile=/tmp/csi/keys/keyfile-091148850
root        548    530  0 07:19 pts/1    00:00:00 grep --color=auto ceph

@Madhu-1
Copy link
Member

Madhu-1 commented Nov 12, 2019

@xiaoqge thanks for sharing the output. I can confirm the ceph fs subvolumegroup create commands are stuck.

Are you able to create a subvolume from the toolbox pod? If yes, can you please check the same from the cephfs-plugin container?
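
For reference, a manual test from the toolbox could look roughly like this (a sketch; test-subvol is only an example name and the size is in bytes):

ceph fs subvolumegroup create myfs csi
ceph fs subvolume create myfs test-subvol --size 1073741824 --group_name csi
ceph fs subvolume getpath myfs test-subvol --group_name csi
ceph fs subvolume rm myfs test-subvol --group_name csi

If the same commands succeed from the toolbox but hang inside the cephfs-plugin container, that would suggest a problem specific to that pod (network or credentials) rather than the cluster itself.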

@Jamiejoin
Copy link
Author

Jamiejoin commented Nov 12, 2019

Thank you, the conclusion is correct.

Below is the result of my commands:

[10.9.0.1:root@bj-jd-oam-k8smaster-test-0001:/opt/soft/ceph/rook-1.1.2]# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
[root@bj-jd-oam-k8snode-test-0014 /]# ceph osd pool create cephfs_data 20 
pool 'cephfs_data' created
[root@bj-jd-oam-k8snode-test-0014 /]# ceph osd pool create cephfs_metadata 100
pool 'cephfs_metadata' created
[root@bj-jd-oam-k8snode-test-0014 /]# ceph fs flag set enable_multiple true
Warning! This feature is experimental. It may cause problems up to and including data loss. Consult the documentation at ceph.com, and if unsure, do not proceed. Add --yes-i-really-mean-it if you are certain.
[root@bj-jd-oam-k8snode-test-0014 /]# ceph fs new cephfs cephfs_metadata cephfs_data
new fs with metadata pool 5 and data pool 4
[root@bj-jd-oam-k8snode-test-0014 /]# ceph fs ls
name: myfs, metadata pool: myfs-metadata, data pools: [myfs-data0 ]
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]

Below is the cephfs-plugin log. I did not see any log entries related to the filesystem I just created from the toolbox.

I1112 07:38:17.675400       1 utils.go:120] ID: 339 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-3d2bce9e-0501-11ea-bf71-246e965dfe0c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1112 07:38:17.677159       1 controllerserver.go:82] ID: 339 an operation with the given Volume ID pvc-3d2bce9e-0501-11ea-bf71-246e965dfe0c already exists
E1112 07:38:17.677175       1 utils.go:123] ID: 339 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-3d2bce9e-0501-11ea-bf71-246e965dfe0c already exists
I1112 07:38:29.700722       1 utils.go:119] ID: 340 GRPC call: /csi.v1.Identity/Probe
I1112 07:38:29.700735       1 utils.go:120] ID: 340 GRPC request: {}
I1112 07:38:29.701362       1 utils.go:125] ID: 340 GRPC response: {}
I1112 07:39:29.700701       1 utils.go:119] ID: 341 GRPC call: /csi.v1.Identity/Probe
I1112 07:39:29.700714       1 utils.go:120] ID: 341 GRPC request: {}
I1112 07:39:29.701266       1 utils.go:125] ID: 341 GRPC response: {}
I1112 07:40:29.700744       1 utils.go:119] ID: 342 GRPC call: /csi.v1.Identity/Probe
I1112 07:40:29.700755       1 utils.go:120] ID: 342 GRPC request: {}
I1112 07:40:29.701329       1 utils.go:125] ID: 342 GRPC response: {}
I1112 07:41:29.700654       1 utils.go:119] ID: 343 GRPC call: /csi.v1.Identity/Probe
I1112 07:41:29.700667       1 utils.go:120] ID: 343 GRPC request: {}
I1112 07:41:29.701197       1 utils.go:125] ID: 343 GRPC response: {}
I1112 07:42:29.700747       1 utils.go:119] ID: 344 GRPC call: /csi.v1.Identity/Probe
I1112 07:42:29.700760       1 utils.go:120] ID: 344 GRPC request: {}
I1112 07:42:29.701296       1 utils.go:125] ID: 344 GRPC response: {}
I1112 07:43:17.687599       1 utils.go:119] ID: 345 GRPC call: /csi.v1.Controller/CreateVolume
I1112 07:43:17.687623       1 utils.go:120] ID: 345 GRPC request: {"capacity_range":{"required_bytes":107374182400},"name":"pvc-3d2bce9e-0501-11ea-bf71-246e965dfe0c","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1112 07:43:17.689461       1 controllerserver.go:82] ID: 345 an operation with the given Volume ID pvc-3d2bce9e-0501-11ea-bf71-246e965dfe0c already exists
E1112 07:43:17.689495       1 utils.go:123] ID: 345 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-3d2bce9e-0501-11ea-bf71-246e965dfe0c already exists
I1112 07:43:29.700736       1 utils.go:119] ID: 346 GRPC call: /csi.v1.Identity/Probe
I1112 07:43:29.700749       1 utils.go:120] ID: 346 GRPC request: {}
I1112 07:43:29.701379       1 utils.go:125] ID: 346 GRPC response: {}

@Jamiejoin
Copy link
Author

@Madhu-1 Is there any way I can narrow down this problem? Is my usage incorrect, or is this a bug?

@MichaelZhao9824
Copy link

Got the same issue when using release-1.1. Is there any way to fix it?

@ajarr
Copy link
Contributor

ajarr commented Nov 20, 2019

@joscollin take a look

@satwikk
Copy link

satwikk commented Nov 20, 2019

I have deployed rook-ceph v1.1 on CentOS 7 with mainline kernel 5.3.11,

and I cannot successfully create a PVC/PV with csi-cephfs.

Please find the csi-cephfsplugin-provisioner logs below:

I1120 14:10:06.100226       1 leaderelection.go:282] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1120 14:10:10.931630       1 connection.go:183] GRPC response: {}
I1120 14:10:10.932455       1 connection.go:184] GRPC error: rpc error: code = DeadlineExceeded desc = context deadline exceeded
I1120 14:10:10.932548       1 controller.go:1023] Temporary error received, adding PVC 1763e959-23aa-4983-b360-9800d7790911 to claims in progress
W1120 14:10:10.932571       1 controller.go:887] Retrying syncing claim "1763e959-23aa-4983-b360-9800d7790911", failure 0
E1120 14:10:10.932604       1 controller.go:910] error syncing claim "1763e959-23aa-4983-b360-9800d7790911": failed to provision volume with StorageClass "csi-cephfs": rpc error: code = DeadlineExceeded desc = context deadline exceeded
I1120 14:10:10.932800       1 event.go:255] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"cephfs-pvc", UID:"1763e959-23aa-4983-b360-9800d7790911", APIVersion:"v1", ResourceVersion:"20649", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs": rpc error: code = DeadlineExceeded desc = context deadline exceeded
I1120 14:10:11.277049       1 leaderelection.go:282] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1120 14:10:11.433098       1 controller.go:1199] provision "default/cephfs-pvc" class "csi-cephfs": started
I1120 14:10:11.437819       1 controller.go:494] CreateVolumeRequest {Name:pvc-1763e959-23aa-4983-b360-9800d7790911 CapacityRange:required_bytes:1073741824  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[clusterID:rook-ceph csi.storage.k8s.io/node-stage-secret-name:rook-csi-cephfs-node csi.storage.k8s.io/node-stage-secret-namespace:rook-ceph csi.storage.k8s.io/provisioner-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/provisioner-secret-namespace:rook-ceph fsName:myfs pool:myfs-data0] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1120 14:10:11.438065       1 event.go:255] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"cephfs-pvc", UID:"1763e959-23aa-4983-b360-9800d7790911", APIVersion:"v1", ResourceVersion:"20649", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/cephfs-pvc"
I1120 14:10:11.441631       1 connection.go:180] GRPC call: /csi.v1.Controller/CreateVolume
I1120 14:10:11.441668       1 connection.go:181] GRPC request: {"capacity_range":{"required_bytes":1073741824},"name":"pvc-1763e959-23aa-4983-b360-9800d7790911","parameters":{"clusterID":"rook-ceph","fsName":"myfs","pool":"myfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}
I1120 14:10:11.449382       1 connection.go:183] GRPC response: {}
I1120 14:10:11.450293       1 connection.go:184] GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID pvc-1763e959-23aa-4983-b360-9800d7790911 already exists
I1120 14:10:11.450401       1 controller.go:1023] Temporary error received, adding PVC 1763e959-23aa-4983-b360-9800d7790911 to claims in progress
W1120 14:10:11.450428       1 controller.go:887] Retrying syncing claim "1763e959-23aa-4983-b360-9800d7790911", failure 1
E1120 14:10:11.450463       1 controller.go:910] error syncing claim "1763e959-23aa-4983-b360-9800d7790911": failed to provision volume with StorageClass "csi-cephfs": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-1763e959-23aa-4983-b360-9800d7790911 already exists
I1120 14:10:11.450632       1 event.go:255] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"cephfs-pvc", UID:"1763e959-23aa-4983-b360-9800d7790911", APIVersion:"v1", ResourceVersion:"20649", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-cephfs": rpc error: code = Aborted desc = an operation with the given Volume ID pvc-1763e959-23aa-4983-b360-9800d7790911 already exists

@satwikk
Copy link

satwikk commented Nov 20, 2019

For your reference, the cephfs secrets are present under the rook-ceph namespace:

kr get secrets | grep cephfs
rook-csi-cephfs-node                         kubernetes.io/rook                    2      45m
rook-csi-cephfs-plugin-sa-token-xn5k9        kubernetes.io/service-account-token   3      89m
rook-csi-cephfs-provisioner                  kubernetes.io/rook                    2      45m
rook-csi-cephfs-provisioner-sa-token-kd4df   kubernetes.io/service-account-token   3      89m
kr get secret/rook-csi-cephfs-provisioner -o jsonpath='{.data.adminKey}' | base64 --decode
AQBAQdVdC+BQBBAAbPJn7oNx6Fl9s4HMw1nxfA==
ceph auth get client.csi-cephfs-provisioner
exported keyring for client.csi-cephfs-provisioner
[client.csi-cephfs-provisioner]
        key = AQBAQdVdC+BQBBAAbPJn7oNx6Fl9s4HMw1nxfA==
        caps mgr = "allow rw"
        caps mon = "allow r"
        caps osd = "allow rw tag cephfs metadata=*"

@ajarr
Copy link
Contributor

ajarr commented Nov 22, 2019

@xiaoqge the root cause is that your Ceph cluster isn't healthy.

You need to fix that:

 cluster:
    id:     6f9fa07f-b2fa-43c9-bd1b-09421d583c06
    health: HEALTH_WARN
            too few PGs per OSD (24 < min 30)
            clock skew detected on mon.b, mon.c

Please fix the "too few PGs per OSD" issue.

Thanks @ShyamsundarR for this analysis.
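
For example, the PG warning can usually be cleared by raising pg_num on the pools or by letting the autoscaler manage it (a sketch; the pg_num value here is only illustrative, size it for your OSD count):

ceph osd pool ls detail
ceph osd pool set myfs-data0 pg_num 32
ceph osd pool set myfs-data0 pgp_num 32
# or, on Nautilus, enable the autoscaler
ceph mgr module enable pg_autoscaler
ceph osd pool set myfs-data0 pg_autoscale_mode on

The clock skew warning is normally addressed by making sure NTP/chrony is running and synchronized on the mon nodes.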

@ajarr
Copy link
Contributor

ajarr commented Nov 22, 2019


@satwikk, please share your Ceph cluster status and Ceph versions from within the Rook toolbox pod:

# ceph versions
# ceph status
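
For convenience, these can be run through the toolbox pod the same way as earlier in this thread, e.g.:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph versions
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph status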

@Jamiejoin
Copy link
Author

I cannot solve this problem for now and have given up on it. Thank you for your help.

@arminfelder
Copy link

arminfelder commented Jan 17, 2021

@xiaoqge I had the same issue; it turned out to be caused by #5938.
