rgw: OBC is not removed if object store CR is in deleted state #10702

Open
degorenko opened this issue Aug 11, 2022 · 15 comments

@degorenko
Contributor

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:
CephObjectStore is not removed while ObjectBucketClaim resources exist

Expected behavior:
CephObjectStore is removed

How to reproduce it (minimal and precise):
Create some ObjectBucketClaim resources, then delete the CephObjectStore.

2022-08-11 09:04:02.019100 I | ceph-object-controller: CephObjectStore "rook-ceph/openstack-store" will not be deleted until all dependents are removed: buckets in the object store (could be from ObjectBucketClaims or COSI Buckets): [test-bucket-08e2b560-21c9-4787-93bf-c2cf913a26fe]
2022-08-11 09:04:02.224611 E | ceph-object-controller: failed to reconcile CephObjectStore "rook-ceph/openstack-store". CephObjectStore "rook-ceph/openstack-store" will not be deleted until all dependents are removed: buckets in the object store (could be from ObjectBucketClaims or COSI Buckets): [test-bucket-08e2b560-21c9-4787-93bf-c2cf913a26fe]

Related ObjectBucketClaim resource is:

kubectl get objectbucketclaim -n rook-ceph -o yaml test-bucket
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  creationTimestamp: "2022-08-11T08:30:46Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2022-08-11T08:57:45Z"
  finalizers:
  - objectbucket.io/finalizer
  generation: 4
  labels:
    bucket-provisioner: rook-ceph.ceph.rook.io-bucket
  managedFields:
  - apiVersion: objectbucket.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:additionalConfig:
          .: {}
          f:maxObjects: {}
          f:maxSize: {}
        f:generateBucketName: {}
        f:storageClassName: {}
    manager: ceph-controller
    operation: Update
    time: "2022-08-11T08:30:46Z"
  name: test-bucket
  namespace: rook-ceph
  resourceVersion: "6309906"
  uid: 844fb858-816c-4d56-b0c2-f52ab409bc44
spec:
  additionalConfig:
    maxObjects: "10"
    maxSize: 2G
  bucketName: test-bucket-08e2b560-21c9-4787-93bf-c2cf913a26fe
  generateBucketName: test-bucket
  objectBucketName: obc-rook-ceph-test-bucket
  storageClassName: rgw-storage-class
status:
  phase: Bound

Rook version is 1.8.5

@degorenko degorenko added the bug label Aug 11, 2022
@travisn
Member

travisn commented Aug 11, 2022

This is by design to prevent accidental data loss. You can remove the finalizer on the CR to force removal. Does that work for you?

@degorenko
Contributor Author

Remove the finalizer on which CR? The objectbucketclaim? If so, that does not work; I tried. Rook waits until the real bucket is removed (I did that with radosgw-admin bucket rm ...), and only then did it continue the deletion process. Removing the finalizer on the object store will not work either.

@BlaineEXE
Member

Rook intentionally doesn't delete object stores when buckets exist (including OBCs) to prevent data loss. It should be possible to remove the finalizer on the CephObjectStore to have the resource removed by Kubernetes.

@degorenko
Contributor Author

But just removing the finalizer from the CephObjectStore will not correctly remove the object store from Ceph itself, which matters if the Ceph cluster is going to keep being used, but without RGW.

@BlaineEXE
Member

That's correct. To safely remove the store, Rook requires the user to remove all buckets. This prevents data loss when users accidentally delete a resource or the whole namespace, giving them time to back up their cluster data.
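
For illustration only, the guard described above could be sketched roughly like this. This is not Rook's actual code: `blockedByDependents` is a hypothetical helper, and the error text is only loosely modeled on the operator log quoted in the report.

```go
package main

import (
	"fmt"
	"strings"
)

// blockedByDependents is a hypothetical helper (not part of Rook).
// It returns an error while any bucket remains in the object store,
// so the caller keeps the store's finalizer in place and retries later.
func blockedByDependents(store string, buckets []string) error {
	if len(buckets) == 0 {
		// No dependents left: deletion of the store may proceed.
		return nil
	}
	return fmt.Errorf("CephObjectStore %q will not be deleted until all dependents are removed: buckets in the object store: [%s]",
		store, strings.Join(buckets, " "))
}

func main() {
	store := "rook-ceph/openstack-store"

	// One bucket still exists, so deletion is blocked.
	fmt.Println(blockedByDependents(store, []string{"test-bucket-08e2b560-21c9-4787-93bf-c2cf913a26fe"}))

	// Once the bucket list is empty, the guard no longer blocks.
	fmt.Println(blockedByDependents(store, nil))
}
```

The point of a guard like this is that the store's finalizer stays until the bucket list is empty, which is why the store deletion in the report cannot finish while the bucket behind the OBC still exists.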

@degorenko
Contributor Author

Yes, my point is that when someone forgets to remove a user or bucket (in my case a bucket), it is impossible to properly continue the removal process:

  • started the object store removal;
  • error: bucket present;
  • I tried to remove that bucket (the objectbucketclaim resource) via the k8s API;
  • it hangs, and the Rook operator can't handle it properly.

I'm concerned about this situation. If the objectbucketclaim resource has a deletion timestamp, why can't Rook proceed and remove it?

@travisn
Member

travisn commented Aug 11, 2022

Ah, so you're seeing that rook won't delete the obc if the object store is already in a deleted state. In that case, agreed that rook should continue with the obc deletion.

@BlaineEXE
Member

Agreed. The OBC should still attempt to delete, even if the object store is in deleting state.

@degorenko
Contributor Author

Exactly. Rook just keeps looping over the check that the bucket is still present in RGW, and does not check that the objectbucketclaim resource now has a deletion timestamp. That's the case.
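
A minimal sketch of the ordering being asked for here, assuming a simplified model of the claim. The `claim` type, `nextStep`, and the step names are all hypothetical and are not taken from Rook or the bucket provisioner library; the only idea shown is that a set deletion timestamp is checked before the state of the object store.

```go
package main

import "fmt"

// claim is a hypothetical, trimmed-down stand-in for an ObjectBucketClaim.
type claim struct {
	Name string
	// DeletionTimestamp mirrors metadata.deletionTimestamp as a string;
	// it is empty until the claim has been marked for deletion.
	DeletionTimestamp string
}

// Illustrative step names (assumptions, not Rook identifiers).
const (
	stepDeleteBucket = "delete bucket, then remove finalizer"
	stepRequeue      = "requeue: store is deleting, do not provision"
	stepProvision    = "provision bucket"
)

// nextStep decides what to do with a claim. The deletion timestamp is
// checked first, so a claim marked for deletion is cleaned up even while
// the object store it references is itself being deleted.
func nextStep(c claim, storeDeleting bool) string {
	if c.DeletionTimestamp != "" {
		return stepDeleteBucket
	}
	if storeDeleting {
		return stepRequeue
	}
	return stepProvision
}

func main() {
	// Values taken from the ObjectBucketClaim shown in the report.
	c := claim{Name: "test-bucket", DeletionTimestamp: "2022-08-11T08:57:45Z"}
	fmt.Println(nextStep(c, true))
}
```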

@travisn travisn changed the title from "rgw: objectstore is not removed if bucket resource present" to "rgw: OBC is not removed if object store CR is in deleted state" Aug 11, 2022
@BlaineEXE
Member

The object store controller isn't aware of OBCs specifically. It only checks buckets.

But if the OBC controller isn't deleting an OBC because the store is deleting, that is an issue. Are you seeing an issue with the OBC controller not removing a bucket?

@degorenko
Contributor Author

Yep, for some reason it won't delete the OBC when the object store is also deleting.

@thotz
Contributor

thotz commented Sep 20, 2022

@degorenko have you deleted the bucket from the backend before deleting the OBC? In that case the OBC deletion may get stuck.

The same happens if the object store CR is deleted before removing the OBC (it checks for the object store repeatedly).

@travisn and @BlaineEXE do we need to handle the above situations gracefully for OBC CR deletion scenarios?

@BlaineEXE
Member

I believe there are 2 distinct cases you outlined above. To make sure we are on the same page, I think the behavior we want is as follows:

The underlying code that does deletion of buckets for OBCs should be idempotent. If the bucket doesn't exist, then the delete operation should be a success.

If the CephObjectStore that an OBC references gets deleted forcibly, then we should allow the OBCs that reference it to be deleted gracefully.
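
The two cases above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `backend` interface, the sentinel errors, and `fakeBackend` are hypothetical and do not correspond to Rook's or RGW's real APIs. The sketch only shows the idea of treating "bucket not found" and "store not found" as a successful delete.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors; a real backend reports these differently.
var (
	errBucketNotFound = errors.New("bucket not found")
	errStoreNotFound  = errors.New("object store not found")
)

// backend is a hypothetical abstraction over the object store.
type backend interface {
	RemoveBucket(name string) error
}

// deleteBucketIdempotent reports success when the bucket is already gone
// (case 1) or when the whole store is gone (case 2), so a finalizer on the
// claim can be removed. Any other error is returned for a retry.
func deleteBucketIdempotent(b backend, name string) error {
	err := b.RemoveBucket(name)
	if err == nil || errors.Is(err, errBucketNotFound) || errors.Is(err, errStoreNotFound) {
		return nil
	}
	return fmt.Errorf("failed to delete bucket %q: %w", name, err)
}

// fakeBackend is an in-memory stand-in used only for this example.
type fakeBackend struct {
	buckets   map[string]bool
	storeGone bool
}

func (f *fakeBackend) RemoveBucket(name string) error {
	if f.storeGone {
		return errStoreNotFound
	}
	if !f.buckets[name] {
		return errBucketNotFound
	}
	delete(f.buckets, name)
	return nil
}

func main() {
	f := &fakeBackend{buckets: map[string]bool{"test-bucket": true}}
	fmt.Println(deleteBucketIdempotent(f, "test-bucket")) // first delete removes the bucket
	fmt.Println(deleteBucketIdempotent(f, "test-bucket")) // repeat delete still succeeds

	gone := &fakeBackend{storeGone: true}
	fmt.Println(deleteBucketIdempotent(gone, "test-bucket")) // store already removed
}
```

Whether a missing store should always count as success is a design decision for the maintainers; the sketch simply follows the behavior proposed in this comment.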

Does that sound right to you @thotz ?

@thotz
Contributor

thotz commented Sep 22, 2022

@BlaineEXE: Yeah, those are exactly the cases I am referring to.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
