rook rgw failed to apply for pvc #14219

Closed

kubecto opened this issue May 16, 2024 · 7 comments

kubecto commented May 16, 2024

Is this a bug report or feature request?

  • Bug Report
rook status

# kubectl get po -n rook-ceph
NAME                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-bdmdw                            2/2     Running     0          21h
csi-cephfsplugin-nv5nt                            2/2     Running     0          21h
csi-cephfsplugin-provisioner-54b6c886c7-kxv88     5/5     Running     0          21h
csi-cephfsplugin-provisioner-54b6c886c7-qh5wh     5/5     Running     0          21h
csi-cephfsplugin-t68sg                            2/2     Running     0          21h
csi-rbdplugin-pmg5v                               2/2     Running     0          21h
csi-rbdplugin-provisioner-5685d999c4-d7b4r        5/5     Running     0          21h
csi-rbdplugin-provisioner-5685d999c4-mk4cq        5/5     Running     0          21h
csi-rbdplugin-q5skn                               2/2     Running     0          21h
csi-rbdplugin-s2g2w                               2/2     Running     0          21h
rook-ceph-crashcollector-node1-677774f9ff-99jwc   1/1     Running     0          21h
rook-ceph-crashcollector-node2-6df895c49f-rbhxs   1/1     Running     0          21h
rook-ceph-crashcollector-node3-55548c4d64-5sthg   1/1     Running     0          21h
rook-ceph-mgr-a-5cdf84ffdf-tdr4s                  3/3     Running     0          21h
rook-ceph-mgr-b-775f469564-5l2lb                  3/3     Running     0          21h
rook-ceph-mon-a-5bbbbff57f-dnzs8                  2/2     Running     0          21h
rook-ceph-mon-b-864f57d4b9-qvxhk                  2/2     Running     0          21h
rook-ceph-mon-c-64df454c4c-xtqbr                  2/2     Running     0          21h
rook-ceph-operator-6fc6c6d985-2stlc               1/1     Running     0          21h
rook-ceph-osd-0-869c56c74d-v9425                  2/2     Running     0          21h
rook-ceph-osd-1-7f5f4b446b-99qbh                  2/2     Running     0          21h
rook-ceph-osd-2-5c6c5f8fb9-2tp2w                  2/2     Running     0          21h
rook-ceph-osd-prepare-node1-9tcgl                 0/1     Completed   0          55m
rook-ceph-osd-prepare-node2-fm6rx                 0/1     Completed   0          55m
rook-ceph-osd-prepare-node3-8d77m                 0/1     Completed   0          55m
rook-ceph-rgw-my-store-a-78dccf54cd-v7mxv         2/2     Running     0          21h
rook-ceph-tools-5b77b8f655-mbbgp                  1/1     Running     0          20h
rook-ceph-tools-operator-image-5d66b99dc-ljbx7    1/1     Running     0          4h23m

ceph status

# kubectl exec -it rook-ceph-tools-operator-image-5d66b99dc-ljbx7 -n rook-ceph bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
[rook@rook-ceph-tools-operator-image-5d66b99dc-ljbx7 /]$ ceph -s
  cluster:
    id:     ac71f3f4-50c9-4562-ae10-a98ae78870c7
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 21h)
    mgr: a(active, since 21h), standbys: b
    osd: 3 osds: 3 up (since 21h), 3 in (since 21h)
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    pools:   9 pools, 89 pgs
    objects: 423 objects, 481 KiB
    usage:   317 MiB used, 48 GiB / 48 GiB avail
    pgs:     89 active+clean

  io:
    client:   5.2 KiB/s rd, 597 B/s wr, 5 op/s rd, 1 op/s wr

Following

https://rook.io/docs/rook/v1.10/Storage-Configuration/Object-Storage-RGW/ceph-object-bucket-claim/

I created a storage class, but the PVC stays in a pending state.

# cat storage.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-bucket
provisioner: rook-ceph.ceph.rook.io/bucket
parameters:
  objectStoreName: my-store
  objectStoreNamespace: rook-ceph
  bucketName: ceph-bucket
reclaimPolicy: Delete
---
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-bucket
  namespace: rook-ceph
spec:
  bucketName: ceph-bucket
  generateBucketName: photo-booth
  storageClassName: rook-ceph-bucket
  additionalConfig:
    maxObjects: "1000"
    maxSize: "2G"

kubectl get sc
NAME               PROVISIONER                     RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
openebs-device     openebs.io/local                Delete          WaitForFirstConsumer   false                  43h
openebs-hostpath   openebs.io/local                Delete          WaitForFirstConsumer   false                  43h
rook-ceph-bucket   rook-ceph.ceph.rook.io/bucket   Delete          Immediate              false                  28m

my pod

cat sts.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-statefulset
spec:
  serviceName: "nginx-service"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      tolerations:
      - key: "node-role.kubernetes.io/storage-node"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginx-storage
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: nginx-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: rook-ceph-bucket
      resources:
        requests:
          storage: 1G

describe pvc

# kubectl describe pvc nginx-storage-nginx-statefulset-0
Name:          nginx-storage-nginx-statefulset-0
Namespace:     default
StorageClass:  rook-ceph-bucket
Status:        Pending
Volume:
Labels:        app=nginx
Annotations:   volume.beta.kubernetes.io/storage-provisioner: rook-ceph.ceph.rook.io/bucket
               volume.kubernetes.io/storage-provisioner: rook-ceph.ceph.rook.io/bucket
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       nginx-statefulset-0
Events:
  Type    Reason                Age                 From                         Message
  ----    ------                ----                ----                         -------
  Normal  ExternalProvisioning  38s (x42 over 10m)  persistentvolume-controller  Waiting for a volume to be created either by the external provisioner 'rook-ceph.ceph.rook.io/bucket' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.

describe po

kubectl get po
NAME                  READY   STATUS    RESTARTS   AGE
nginx-statefulset-0   0/1     Pending   0          10m

  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  39s (x3 over 10m)  default-scheduler  0/4 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling..

rook-ceph-rgw logs

# kubectl logs rook-ceph-rgw-my-store-a-78dccf54cd-v7mxv -n rook-ceph -f
======
debug 2024-05-16T07:09:13.968+0000 7fc8d9a21700  1 beast: 0x7fc86783a730: 10.102.28.61 - - [16/May/2024:07:09:13.968 +0000] "GET /swift/healthcheck HTTP/1.1" 200 0 - "kube-probe/1.28" - latency=0.000000000s
debug 2024-05-16T07:09:23.968+0000 7fc8ea242700  1 ====== starting new request req=0x7fc86783a730 =====
debug 2024-05-16T07:09:23.968+0000 7fc8ea242700  1 ====== req done req=0x7fc86783a730 op status=0 http_status=200 latency=0.000000000s ======
debug 2024-05-16T07:09:23.968+0000 7fc8ea242700  1 beast: 0x7fc86783a730: 10.102.28.61 - - [16/May/2024:07:09:23.968 +0000] "GET /swift/healthcheck HTTP/1.1" 200 0 - "kube-probe/1.28" - latency=0.000000000s
debug 2024-05-16T07:09:30.881+0000 7fc99a3c3700  0 rgw UsageLogger: WARNING: RGWRados::log_usage(): user name empty (bucket=), skipping
debug 2024-05-16T07:09:33.967+0000 7fc87f16c700  1 ====== starting new request req=0x7fc86783a730 =====
debug 2024-05-16T07:09:33.968+0000 7fc87f16c700  1 ====== req done req=0x7fc86783a730 op status=0 http_status=200 latency=0.001000005s ======
debug 2024-05-16T07:09:33.968+0000 7fc87f16c700  1 beast: 0x7fc86783a730: 10.102.28.61 - - [16/May/2024:07:09:33.967 +0000] "GET /swift/healthcheck HTTP/1.1" 200 0 - "kube-probe/1.28" - latency=0.001000005s

operator log

# kubectl logs rook-ceph-operator-6fc6c6d985-2stlc -n rook-ceph
I0516 07:08:07.703753       1 controller.go:217]  "msg"="reconciling claim" "key"="rook-ceph/ceph-bucket"
I0516 07:08:07.703783       1 helpers.go:107]  "msg"="getting claim for key" "key"="rook-ceph/ceph-bucket"
I0516 07:08:07.706808       1 helpers.go:213]  "msg"="getting ObjectBucketClaim's StorageClass" "key"="rook-ceph/ceph-bucket"
I0516 07:08:07.709206       1 helpers.go:218]  "msg"="got StorageClass" "key"="rook-ceph/ceph-bucket" "name"="rook-ceph-bucket"
I0516 07:08:07.709256       1 controller.go:270]  "msg"="syncing obc creation" "key"="rook-ceph/ceph-bucket"
I0516 07:08:07.709283       1 controller.go:552]  "msg"="updating OBC metadata" "key"="rook-ceph/ceph-bucket"
I0516 07:08:07.709301       1 resourcehandlers.go:277]  "msg"="updating" "key"="rook-ceph/ceph-bucket" "obc"="rook-ceph/ceph-bucket"
I0516 07:08:07.715011       1 controller.go:341]  "msg"="granting access to" "bucket"="ceph-bucket" "key"="rook-ceph/ceph-bucket"
2024-05-16 07:08:07.715020 I | op-bucket-prov: initializing and setting CreateOrGrant services
2024-05-16 07:08:07.715034 I | op-bucket-prov: getting storage class "rook-ceph-bucket"
2024-05-16 07:08:07.999607 I | op-bucket-prov: Grant: allowing access to bucket "ceph-bucket" for OBC "ceph-bucket"
2024-05-16 07:08:07.999630 I | op-bucket-prov: Checking for existing bucket "ceph-bucket"
E0516 07:08:08.003638       1 controller.go:204] error syncing 'rook-ceph/ceph-bucket': provisioner returned empty object bucket, requeuing
2024-05-16 07:08:13.086219 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:08:23.087203 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:08:33.087689 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:08:43.088878 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:08:53.089564 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:09:03.089777 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:09:13.090830 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:09:23.091353 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:09:33.091673 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:09:43.091874 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:09:53.092900 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:10:03.093341 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:10:13.093482 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:10:23.094258 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry
2024-05-16 07:10:33.095410 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry

Deviation from expected behavior:
I checked the operator log and it shows that no bucket was created. Does the bucket need to be created in advance? Won't the CR create it automatically?

Expected behavior:
Normal use of the PVC.

How to reproduce it (minimal and precise):

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary

Logs to submit:

  • Operator's logs, if necessary

  • Crashing pod(s) logs, if necessary

    To get logs, use kubectl -n <namespace> logs <pod name>
When pasting logs, always surround them with backticks or use the insert code button from the GitHub UI.
    Read GitHub documentation if you need help.

Cluster Status to submit:

  • Output of kubectl commands, if necessary

    To get the health of the cluster, use kubectl rook-ceph health
    To get the status of the cluster, use kubectl rook-ceph ceph status
    For more details, see the Rook kubectl Plugin

Environment:

  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Cloud provider or hardware configuration:
  • Rook version (use rook version inside of a Rook Pod): rook-1.10.12
  • Storage backend version (e.g. for ceph do ceph -v): ceph version 17.2.5
  • Kubernetes version (use kubectl version): 1.28.6
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift):
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
kubecto added the bug label May 16, 2024

BlaineEXE (Member) commented May 16, 2024

StorageClasses used for OBCs cannot be mounted to pods as Kubernetes PVCs. OBCs are consumed using the S3 credentials present in the generated Secret.

Please refer to the Ceph object bucket claim documentation for more details about how to create and consume bucket claims.
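
For illustration, here is a minimal sketch of how an OBC is typically consumed (names taken from the OBC in this issue). Once the claim is provisioned, the controller generates a ConfigMap and a Secret with the same name as the OBC, and a pod pulls both in via envFrom:

apiVersion: v1
kind: Pod
metadata:
  name: obc-consumer          # hypothetical consumer pod
  namespace: rook-ceph
spec:
  containers:
  - name: app
    image: amazon/aws-cli:latest
    command: ["sleep", "infinity"]
    envFrom:
    - configMapRef:
        name: ceph-bucket     # provides BUCKET_HOST, BUCKET_NAME, BUCKET_PORT
    - secretRef:
        name: ceph-bucket     # provides AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY

The application then talks to the bucket over S3 using those environment variables; nothing is mounted as a volume.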

BlaineEXE closed this as not planned May 16, 2024
BlaineEXE reopened this May 16, 2024

BlaineEXE commented May 16, 2024
Accidentally closed, sorry.

In the OBC, I see these 2 values, which are not compatible with each other. Use bucketName to specify a particular bucket name, or generateBucketName to generate a bucket name with the given prefix.

  bucketName: ceph-bucket
  generateBucketName: photo-booth

Use of both of these fields may be confusing the OBC controller, leading to this message:

debug 2024-05-16T07:09:30.881+0000 7fc99a3c3700  0 rgw UsageLogger: WARNING: RGWRados::log_usage(): user name empty (bucket=), skipping

Based on this operator log, it seems that the bucket ceph-bucket already exists in the object store. Depending on how the bucket was created, the OBC controller may be unable to connect to it.

2024-05-16 07:08:07.999607 I | op-bucket-prov: Grant: allowing access to bucket "ceph-bucket" for OBC "ceph-bucket"

As for these log messages, it's possible (but unlikely) that trying to consume the OBC storageclass via PVC may have affected the OBC controller's ability to provision the bucket.

2024-05-16 07:08:13.086219 I | ceph-bucket-notification: ObjectBucketClaim "rook-ceph/ceph-bucket" resource did not create the bucket yet. will retry

I would suggest deleting the OBC, and then using radosgw-admin or the RGW Admin Ops API to list all users and buckets, and then delete anything that might be left over from this OBC's stuck state. Then re-create the OBC using only one of bucketName/generateBucketName to see if provisioning succeeds.
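
As a sketch of that cleanup, run from the toolbox pod in this issue (any toolbox pod works):

# list leftover users and buckets
kubectl exec -it rook-ceph-tools-5b77b8f655-mbbgp -n rook-ceph -- radosgw-admin user list
kubectl exec -it rook-ceph-tools-5b77b8f655-mbbgp -n rook-ceph -- radosgw-admin bucket list

# remove a leftover bucket and its objects (destructive; double-check first)
kubectl exec -it rook-ceph-tools-5b77b8f655-mbbgp -n rook-ceph -- radosgw-admin bucket rm --bucket=ceph-bucket --purge-objects

The re-created OBC would then keep only one of the two naming fields, for example:

apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-bucket
  namespace: rook-ceph
spec:
  generateBucketName: photo-booth   # either this or bucketName, never both
  storageClassName: rook-ceph-bucket
  additionalConfig:
    maxObjects: "1000"
    maxSize: "2G"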

I should also note that Rook v1.10 is now unsupported. If there is a bug here, we won't be making any code changes to resolve it. Please upgrade to v1.13 or v1.14 for support.

kubecto (Author) commented May 17, 2024

Does Rook not support using RGW to provision PVCs? I am still confused. Will PVC support for RGW be added in the future? Is RGW in Rook currently only used to create buckets?

BlaineEXE (Member) commented

RGW provides object storage via an S3 interface. PVCs are for block and file storage only, so there is no way to add PVC support for RGW.

Block storage and single-user file storage are provided by Ceph natively. CephFilesystem allows for shared-user file storage, similar to how RGW (CephObjectStore) provides S3 object storage.
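
If the goal of the StatefulSet above is simply a mountable volume, a block StorageClass is the right tool. A minimal sketch of the fixed volumeClaimTemplates, assuming the rook-ceph-block StorageClass from the Rook example manifests (backed by a CephBlockPool) has been created:

  volumeClaimTemplates:
  - metadata:
      name: nginx-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: rook-ceph-block   # RBD-backed block class, not the bucket class
      resources:
        requests:
          storage: 1Gi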

Rook integrated with the OBC project many years back to allow users to have a PVC-like experience with object storage.

kubecto (Author) commented May 20, 2024

https://github.com/yandex-cloud/k8s-csi-s3

This project combines CSI with S3 so that buckets can be mounted and used as PVCs, and it can also be backed by Ceph storage underneath. Why not take this project as a reference and implement the same thing?

BlaineEXE (Member) commented May 20, 2024

We weren't aware of that project until now. If you or any other users want to use that project to mount S3 storage to pods, you are welcome to, but the Rook project doesn't have time to vet every possible integration.

Currently, Rook uses the OBC project for bucket claims. Rook integrated with it early, but the OBC project was largely abandoned after the v1alpha1 release. Rook is now stuck maintaining support for it until we can replace it with COSI, which is our long-term strategy.

The Container Object Storage Interface (COSI) Kubernetes Enhancement Project is outlined here: https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1979-object-storage-support

Since COSI is on its way to becoming a K8s standard, and is starting to see wider industry involvement, we don't see any reason for Rook to pivot to the yandex project. We believe COSI is the future of self-service object storage on Kubernetes.

Neither OBCs nor COSI support mounting buckets as a filesystem into pods. COSI considered this early in its development but opted not to do so since nearly all object storage systems are mounted and consumed via HTTP-based APIs.
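
For readers curious what COSI usage looks like, here is a rough sketch of a claim against the v1alpha1 API (field names come from the KEP's reference implementation and may still change; the bucket class name is hypothetical):

apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketClaim
metadata:
  name: my-bucket-claim                  # hypothetical example
  namespace: default
spec:
  bucketClassName: sample-bucket-class   # hypothetical BucketClass
  protocols:
  - s3

As with OBCs, credentials reach the workload through a companion resource (a BucketAccess and its generated Secret), not through a filesystem mount.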

kubecto (Author) commented May 21, 2024

OK, thank you for your answer. I believe Rook will keep getting better.

kubecto closed this as completed May 21, 2024