Unable to create more than 2 working CephNFS resources #8231

Closed
jhoblitt opened this issue Jul 1, 2021 · 0 comments · Fixed by #8250
jhoblitt commented Jul 1, 2021

Using a test cluster of 3 nodes with 4 OSDs configured per node, I am able to create at least 4 working CephFilesystem resources, all of which show as working in the dashboard, with their data/metadata pools created successfully. All filesystems are created from the same basic template:

---
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: scratch
  namespace: rook-ceph
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPools:
    - failureDomain: host
      replicated:
        size: 3
  metadataServer:
    activeCount: 3
    activeStandby: true
  preserveFilesystemOnDelete: false

However, when I attempt to create one CephNFS resource per CephFilesystem, with each rados.pool pointing at the corresponding filesystem's first data pool (<name>-data0), only the first two created work. The 3rd and subsequent resources always fail and no ganesha pods are ever created for those CephNFS instances. If I reduce the number of CephNFS resources to 0 or 1, I am again able to create new, working CephNFS instances regardless of the instance name. The template I'm using for CephNFS resources is:

---
apiVersion: ceph.rook.io/v1
kind: CephNFS
metadata:
  name: scratch
  namespace: rook-ceph
spec:
  rados:
    pool: scratch-data0
    namespace: nfs-ns
  server:
    active: 3
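
The additional CephNFS instances differ only in metadata.name and the rados.pool they point at. For example, the failing "project" instance discussed below would look roughly like the sketch here (the rados.namespace value is assumed to match the template above; server.active of 3 matches the "from 6 to 3" scale-down messages in the operator logs):

---
apiVersion: ceph.rook.io/v1
kind: CephNFS
metadata:
  name: project
  namespace: rook-ceph
spec:
  rados:
    pool: project-data0
    namespace: nfs-ns  # assumed to be the same RADOS namespace as in the template above
  server:
    active: 3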

Attempting to debug a failed instance, in this case "project", I see the operator logs repeating the same messages over and over about trying to scale down the deployment:

2021-07-01 16:33:12.181582 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:33:13.404965 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:33:13.404984 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:33:13.412457 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:44:08.777902 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:44:10.085936 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:44:10.085952 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:44:10.092222 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 17:00:50.102039 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 17:00:51.297503 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 17:00:51.297521 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 17:00:51.302562 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
-bash-4.2$ kubectl -n rook-ceph get deploy | grep project
rook-ceph-mds-project-a             1/1     1            1           40m
rook-ceph-mds-project-b             1/1     1            1           40m
rook-ceph-mds-project-c             1/1     1            1           40m
rook-ceph-mds-project-d             1/1     1            1           40m
rook-ceph-mds-project-e             1/1     1            1           40m
rook-ceph-mds-project-f             1/1     1            1           40m
-bash-4.2$ kubectl -n rook-ceph get pod | grep project
rook-ceph-mds-project-a-59ff5646fc-chsj8             1/1     Running     0          40m
rook-ceph-mds-project-b-588fb6fb45-4p2x8             1/1     Running     0          40m
rook-ceph-mds-project-c-5d9f988777-tfklj             1/1     Running     0          40m
rook-ceph-mds-project-d-55df79ff77-nsg5p             1/1     Running     0          40m
rook-ceph-mds-project-e-864474f685-qmpq9             1/1     Running     0          40m
rook-ceph-mds-project-f-54bcb66965-bdxzf             1/1     Running     0          40m

What jumps out at me is that the deployments for the NFS (ganesha) pods aren't even being created; the only deployments matching "project" are the MDS ones. Note that the scale-down messages claim 6 instances even though spec.server.active is 3, which happens to match the six rook-ceph-mds-project-* deployments above. I am guessing this is an issue with the operator and the CephNFS CRD rather than a Ceph limitation.

How to reproduce it (minimal and precise):
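
A minimal sequence based on the templates above (resource names are illustrative):

  1. Create three or more CephFilesystem resources from the CephFilesystem template above, each with a different metadata.name.
  2. For each filesystem, create a matching CephNFS resource from the CephNFS template above, with rados.pool pointing at that filesystem's first data pool (<name>-data0).
  3. The first two CephNFS resources come up with working ganesha pods; any created after that never get rook-ceph-nfs-* deployments, and the operator loops on the scale-down errors shown above.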

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary
---
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v16.2.4
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  mon:
    count: 3
    allowMultiplePerNode: false
  dashboard:
    enabled: true
    ssl: true  # tls between ingress and svc... doesn't seem to work unless enabled
  crashCollector:
    disable: false
  storage:
    useAllNodes: false
    useAllDevices: false
    config:
      osdsPerDevice: "4"
    nodes:
    - name: pillan01
      devices:
      - name: /dev/disk/by-id/nvme-Samsung_SSD_983_DCT_1.92TB_S48BNG0MB01685F
    - name: pillan02
      devices:
      - name: /dev/disk/by-id/nvme-Samsung_SSD_983_DCT_1.92TB_S48BNG0MB01695D
    - name: pillan03
      devices:
      - name: /dev/disk/by-id/nvme-Samsung_SSD_983_DCT_1.92TB_S48BNG0MB01690H
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role
              operator: In
              values:
              - storage-node
      tolerations:
      - key: role
        operator: Equal
        value: storage-node
        effect: NoSchedule
  disruptionManagement:
    managePodBudgets: true
  cleanupPolicy:
    # unset only when all data should be destroyed
    #confirmation: "yes-really-destroy-data"
    method: quick
    dataSource: zero
    iteration: 1
  monitoring:
    enabled: true
    rulesNamespace: rook-ceph
  • Operator's logs, if necessary
2021-07-01 15:10:30.382173 I | ceph-cluster-controller: done reconciling ceph cluster in namespace "rook-ceph"
2021-07-01 15:10:30.401054 I | op-k8sutil: Reporting Event rook-ceph:rook-ceph Normal:ReconcileSucceeded:cluster has been configured successfully
2021-07-01 16:19:23.918083 I | ceph-spec: adding finalizer "cephfilesystem.ceph.rook.io" on "scratch"
2021-07-01 16:19:23.927815 I | ceph-spec: adding finalizer "cephnfs.ceph.rook.io" on "scratch"
2021-07-01 16:19:23.950401 E | ceph-nfs-controller: failed to set nfs "scratch" status to "Created". failed to update object "scratch" status: Operation cannot be fulfilled on cephnfses.ceph.rook.io "scratch": the object has been modified; please apply your changes to the latest version and try again
2021-07-01 16:19:23.952567 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:23.954336 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:27.577730 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "scratch" arguments: pool "scratch-data0" not found: failed to get pool scratch-data0 details. Error ENOENT: unrecognized pool 'scratch-data0'
. : Error ENOENT: unrecognized pool 'scratch-data0'
. 
2021-07-01 16:19:27.674091 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:30.082778 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "scratch" arguments: pool "scratch-data0" not found: failed to get pool scratch-data0 details. Error ENOENT: unrecognized pool 'scratch-data0'
. : Error ENOENT: unrecognized pool 'scratch-data0'
. 
2021-07-01 16:19:30.100016 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:32.294382 I | ceph-file-controller: creating filesystem "scratch"
2021-07-01 16:19:32.490629 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "scratch" arguments: pool "scratch-data0" not found: failed to get pool scratch-data0 details. Error ENOENT: unrecognized pool 'scratch-data0'
. : Error ENOENT: unrecognized pool 'scratch-data0'
. 
2021-07-01 16:19:32.516123 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:34.776982 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "scratch" arguments: pool "scratch-data0" not found: failed to get pool scratch-data0 details. Error ENOENT: unrecognized pool 'scratch-data0'
. : Error ENOENT: unrecognized pool 'scratch-data0'
. 
2021-07-01 16:19:34.820994 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:37.822670 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "scratch" arguments: pool "scratch-data0" not found: failed to get pool scratch-data0 details. Error ENOENT: unrecognized pool 'scratch-data0'
. : Error ENOENT: unrecognized pool 'scratch-data0'
. 
2021-07-01 16:19:37.909326 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:38.177961 I | cephclient: setting pool property "compression_mode" to "none" on pool "scratch-metadata"
2021-07-01 16:19:39.590722 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "scratch" arguments: pool "scratch-data0" not found: failed to get pool scratch-data0 details. Error ENOENT: unrecognized pool 'scratch-data0'
. : Error ENOENT: unrecognized pool 'scratch-data0'
. 
2021-07-01 16:19:39.756775 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:40.282484 I | cephclient: creating replicated pool scratch-metadata succeeded
2021-07-01 16:19:41.482554 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "scratch" arguments: pool "scratch-data0" not found: failed to get pool scratch-data0 details. Error ENOENT: unrecognized pool 'scratch-data0'
. : Error ENOENT: unrecognized pool 'scratch-data0'
. 
2021-07-01 16:19:41.807457 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:43.512449 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "scratch" arguments: pool "scratch-data0" not found: failed to get pool scratch-data0 details. Error ENOENT: unrecognized pool 'scratch-data0'
. : Error ENOENT: unrecognized pool 'scratch-data0'
. 
2021-07-01 16:19:44.157856 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:19:45.901887 I | ceph-nfs-controller: updating ceph nfs "scratch"
2021-07-01 16:19:46.005777 I | cephclient: getting or creating ceph auth key "client.nfs-ganesha.scratch.a"
2021-07-01 16:19:46.485502 I | cephclient: setting pool property "compression_mode" to "none" on pool "scratch-data0"
2021-07-01 16:19:46.905553 I | ceph-nfs-controller: ceph nfs deployment "rook-ceph-nfs-scratch-a" started
2021-07-01 16:19:46.920610 I | ceph-nfs-controller: ceph nfs service running at 10.43.3.201:2049
2021-07-01 16:19:46.920627 I | ceph-nfs-controller: adding ganesha "a" to grace db
2021-07-01 16:19:47.192146 I | cephclient: getting or creating ceph auth key "client.nfs-ganesha.scratch.b"
2021-07-01 16:19:48.094911 I | ceph-nfs-controller: ceph nfs deployment "rook-ceph-nfs-scratch-b" started
2021-07-01 16:19:48.111108 I | ceph-nfs-controller: ceph nfs service running at 10.43.236.90:2049
2021-07-01 16:19:48.111123 I | ceph-nfs-controller: adding ganesha "b" to grace db
2021-07-01 16:19:48.192148 I | cephclient: getting or creating ceph auth key "client.nfs-ganesha.scratch.c"
2021-07-01 16:19:48.585099 I | cephclient: creating replicated pool scratch-data0 succeeded
2021-07-01 16:19:48.585126 I | cephclient: creating filesystem "scratch" with metadata pool "scratch-metadata" and data pools [scratch-data0]
2021-07-01 16:19:49.400413 I | ceph-nfs-controller: ceph nfs deployment "rook-ceph-nfs-scratch-c" started
2021-07-01 16:19:49.419312 I | ceph-nfs-controller: ceph nfs service running at 10.43.146.137:2049
2021-07-01 16:19:49.419332 I | ceph-nfs-controller: adding ganesha "c" to grace db
2021-07-01 16:19:51.645272 I | ceph-file-controller: created filesystem "scratch" on 1 data pool(s) and metadata pool "scratch-metadata"
2021-07-01 16:19:52.278289 I | cephclient: setting allow_standby_replay for filesystem "scratch"
2021-07-01 16:19:54.722207 I | ceph-file-controller: start running mdses for filesystem "scratch"
2021-07-01 16:19:55.375638 I | cephclient: getting or creating ceph auth key "mds.scratch-a"
2021-07-01 16:19:56.065960 I | op-mds: setting mds config flags
2021-07-01 16:19:56.065983 I | op-config: setting "mds.scratch-a"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:19:56.624498 I | op-config: successfully set "mds.scratch-a"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:19:56.635807 I | cephclient: getting or creating ceph auth key "mds.scratch-b"
2021-07-01 16:19:56.826201 E | ceph-crashcollector-controller: node reconcile failed on op "unchanged": Operation cannot be fulfilled on deployments.apps "rook-ceph-crashcollector-pillan02": the object has been modified; please apply your changes to the latest version and try again
2021-07-01 16:19:57.355305 I | op-mds: setting mds config flags
2021-07-01 16:19:57.355326 I | op-config: setting "mds.scratch-b"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:19:57.925724 I | op-config: successfully set "mds.scratch-b"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:19:57.936219 I | cephclient: getting or creating ceph auth key "mds.scratch-c"
2021-07-01 16:19:58.648489 I | op-mds: setting mds config flags
2021-07-01 16:19:58.648506 I | op-config: setting "mds.scratch-c"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:19:59.180176 I | op-config: successfully set "mds.scratch-c"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:19:59.190779 I | cephclient: getting or creating ceph auth key "mds.scratch-d"
2021-07-01 16:19:59.890816 I | op-mds: setting mds config flags
2021-07-01 16:19:59.890835 I | op-config: setting "mds.scratch-d"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:20:00.441548 I | op-config: successfully set "mds.scratch-d"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:20:00.459033 I | cephclient: getting or creating ceph auth key "mds.scratch-e"
2021-07-01 16:20:01.149203 I | op-mds: setting mds config flags
2021-07-01 16:20:01.149225 I | op-config: setting "mds.scratch-e"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:20:01.728626 I | op-config: successfully set "mds.scratch-e"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:20:01.739705 I | cephclient: getting or creating ceph auth key "mds.scratch-f"
2021-07-01 16:20:02.443900 I | op-mds: setting mds config flags
2021-07-01 16:20:02.443917 I | op-config: setting "mds.scratch-f"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:20:02.948964 I | op-config: successfully set "mds.scratch-f"="mds_join_fs"="scratch" option to the mon configuration database
2021-07-01 16:21:42.794997 I | ceph-spec: adding finalizer "cephfilesystem.ceph.rook.io" on "project"
2021-07-01 16:21:42.805085 I | ceph-spec: adding finalizer "cephnfs.ceph.rook.io" on "project"
2021-07-01 16:21:42.816769 E | ceph-file-controller: failed to set filesystem "project" status to "Created". failed to update object "project" status: Operation cannot be fulfilled on cephfilesystems.ceph.rook.io "project": the object has been modified; please apply your changes to the latest version and try again
2021-07-01 16:21:42.824713 E | ceph-nfs-controller: failed to set nfs "project" status to "Created". failed to update object "project" status: Operation cannot be fulfilled on cephnfses.ceph.rook.io "project": the object has been modified; please apply your changes to the latest version and try again
2021-07-01 16:21:42.827122 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:21:42.829200 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:21:46.474410 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "project" arguments: pool "project-data0" not found: failed to get pool project-data0 details. Error ENOENT: unrecognized pool 'project-data0'
. : Error ENOENT: unrecognized pool 'project-data0'
. 
2021-07-01 16:21:46.495833 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:21:48.987998 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "project" arguments: pool "project-data0" not found: failed to get pool project-data0 details. Error ENOENT: unrecognized pool 'project-data0'
. : Error ENOENT: unrecognized pool 'project-data0'
. 
2021-07-01 16:21:49.004540 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:21:51.177055 I | ceph-file-controller: creating filesystem "project"
2021-07-01 16:21:51.477715 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "project" arguments: pool "project-data0" not found: failed to get pool project-data0 details. Error ENOENT: unrecognized pool 'project-data0'
. : Error ENOENT: unrecognized pool 'project-data0'
. 
2021-07-01 16:21:51.503041 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:21:53.081321 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "project" arguments: pool "project-data0" not found: failed to get pool project-data0 details. Error ENOENT: unrecognized pool 'project-data0'
. : Error ENOENT: unrecognized pool 'project-data0'
. 
2021-07-01 16:21:53.127035 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:21:56.387574 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "project" arguments: pool "project-data0" not found: failed to get pool project-data0 details. Error ENOENT: unrecognized pool 'project-data0'
. : Error ENOENT: unrecognized pool 'project-data0'
. 
2021-07-01 16:21:56.472899 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:21:57.186868 I | cephclient: setting pool property "compression_mode" to "none" on pool "project-metadata"
2021-07-01 16:21:58.194773 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "project" arguments: pool "project-data0" not found: failed to get pool project-data0 details. Error ENOENT: unrecognized pool 'project-data0'
. : Error ENOENT: unrecognized pool 'project-data0'
. 
2021-07-01 16:21:58.359896 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:21:59.282441 I | cephclient: creating replicated pool project-metadata succeeded
2021-07-01 16:21:59.782500 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "project" arguments: pool "project-data0" not found: failed to get pool project-data0 details. Error ENOENT: unrecognized pool 'project-data0'
. : Error ENOENT: unrecognized pool 'project-data0'
. 
2021-07-01 16:22:00.108563 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:22:01.876485 E | ceph-nfs-controller: failed to reconcile invalid ceph nfs "project" arguments: pool "project-data0" not found: failed to get pool project-data0 details. Error ENOENT: unrecognized pool 'project-data0'
. : Error ENOENT: unrecognized pool 'project-data0'
. 
2021-07-01 16:22:02.521964 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:22:03.486347 I | cephclient: setting pool property "compression_mode" to "none" on pool "project-data0"
2021-07-01 16:22:04.791651 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:22:04.791667 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:22:04.805309 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:22:05.442609 I | cephclient: creating replicated pool project-data0 succeeded
2021-07-01 16:22:05.442637 I | cephclient: creating filesystem "project" with metadata pool "project-metadata" and data pools [project-data0]
2021-07-01 16:22:06.090141 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:22:07.786211 I | ceph-file-controller: created filesystem "project" on 1 data pool(s) and metadata pool "project-metadata"
2021-07-01 16:22:08.000733 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:22:08.000750 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:22:08.006479 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:22:08.576699 I | cephclient: setting allow_standby_replay for filesystem "project"
2021-07-01 16:22:10.573451 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:22:10.682985 I | ceph-file-controller: start running mdses for filesystem "project"
2021-07-01 16:22:11.991039 I | cephclient: getting or creating ceph auth key "mds.project-a"
2021-07-01 16:22:12.900225 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:22:12.900241 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:22:12.906539 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:22:13.199323 I | op-mds: setting mds config flags
2021-07-01 16:22:13.199345 I | op-config: setting "mds.project-a"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:13.734971 I | op-config: successfully set "mds.project-a"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:13.745552 I | cephclient: getting or creating ceph auth key "mds.project-b"
2021-07-01 16:22:14.445336 I | op-mds: setting mds config flags
2021-07-01 16:22:14.445353 I | op-config: setting "mds.project-b"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:14.986207 I | op-config: successfully set "mds.project-b"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:14.996729 I | cephclient: getting or creating ceph auth key "mds.project-c"
2021-07-01 16:22:15.736658 I | op-mds: setting mds config flags
2021-07-01 16:22:15.736682 I | op-config: setting "mds.project-c"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:16.244163 I | op-config: successfully set "mds.project-c"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:16.255065 I | cephclient: getting or creating ceph auth key "mds.project-d"
2021-07-01 16:22:16.944503 I | op-mds: setting mds config flags
2021-07-01 16:22:16.944526 I | op-config: setting "mds.project-d"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:17.489055 I | op-config: successfully set "mds.project-d"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:17.499472 I | cephclient: getting or creating ceph auth key "mds.project-e"
2021-07-01 16:22:18.030727 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:22:18.294463 I | op-mds: setting mds config flags
2021-07-01 16:22:18.294481 I | op-config: setting "mds.project-e"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:19.485315 I | op-config: successfully set "mds.project-e"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:19.497330 I | cephclient: getting or creating ceph auth key "mds.project-f"
2021-07-01 16:22:20.393645 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:22:20.393661 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:22:20.399315 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:22:20.687353 I | op-mds: setting mds config flags
2021-07-01 16:22:20.687372 I | op-config: setting "mds.project-f"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:21.232191 I | op-config: successfully set "mds.project-f"="mds_join_fs"="project" option to the mon configuration database
2021-07-01 16:22:30.646500 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:22:31.826078 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:22:31.826098 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:22:31.831111 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:22:52.316351 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:22:53.494495 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:22:53.494514 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:22:53.502906 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:23:34.468132 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:23:35.693922 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:23:35.693938 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:23:35.701007 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:24:57.627424 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:24:59.376441 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:24:59.376456 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:24:59.382778 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:27:43.227217 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:27:44.488701 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:27:44.488715 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:27:44.494993 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:33:12.181582 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:33:13.404965 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:33:13.404984 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:33:13.412457 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 16:44:08.777902 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 16:44:10.085936 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 16:44:10.085952 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 16:44:10.092222 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
2021-07-01 17:00:50.102039 I | op-mon: parsing mon endpoints: b=10.43.115.151:6789,c=10.43.174.94:6789,a=10.43.141.110:6789
2021-07-01 17:00:51.297503 I | ceph-nfs-controller: scaling down ceph nfs "project" from 6 to 3
2021-07-01 17:00:51.297521 I | ceph-nfs-controller: removing deployment "rook-ceph-nfs-project-f"
2021-07-01 17:00:51.302562 E | ceph-nfs-controller: failed to reconcile failed to create ceph nfs deployments: failed to scale down ceph nfs "project": failed to delete ceph nfs deployment: deployments.apps "rook-ceph-nfs-project-f" not found
  • Crashing pod(s) logs, if necessary

N/A

Environment:

  • OS (e.g. from /etc/os-release):
-bash-4.2$ cat /etc/redhat-release 
CentOS Linux release 7.9.2009 (Core)
  • Kernel (e.g. uname -a):
    Linux pillan01.xxx 3.10.0-1160.25.1.el7.x86_64 #1 SMP Wed Apr 28 21:49:45 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Cloud provider or hardware configuration:
# dmidecode | grep "Product Name"
	Product Name: AS -1114S-WN10RT
	Product Name: H12SSW-NTR
  • Rook version (use rook version inside of a Rook Pod):
-bash-4.2$ kubectl exec -ti -n rook-ceph rook-ceph-operator-68f5994dd4-td75v -- rook version
rook: v1.6.5
go: go1.16.3
  • Storage backend version (e.g. for ceph do ceph -v):
-bash-4.2$ kubectl exec -ti -n rook-ceph rook-ceph-operator-68f5994dd4-td75v -- ceph -v
ceph version 16.2.2 (e8f22dde28889481f4dda2beb8a07788204821d3) pacific (stable)
  • Kubernetes version (use kubectl version):
-bash-4.2$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:18:45Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.6", GitCommit:"8a62859e515889f07e3e3be6a1080413f17cf2c3", GitTreeState:"clean", BuildDate:"2021-04-15T03:19:55Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift):
    rke 1.2.8
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
-bash-4.2$ kubectl exec -ti -n rook-ceph rook-ceph-tools-7f4fcd8448-kknp7 -- ceph health
HEALTH_OK