Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: fix radosgw-admin bucket rm with --purge-objects and --bypass-gc #18922

Merged
merged 1 commit into from Nov 17, 2017

Conversation

agutikov
Copy link

Call RGWRados::delete_bucket() from rgw_remove_bucket_bypass_gc()
instead of partial copy of code of RGWRados::delete_bucket().
Fixes: http://tracker.ceph.com/issues/22122

Due to rgw_user(const std::string& s) was called incorrect version of rgw_bucket_sync_user_stats().
Fixes: http://tracker.ceph.com/issues/19959

if (ret < 0) {
return ret;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did this call need to move up? i think we want it to come after the list_objects/rgw_remove_object loop in case we pass delete_children=false, so we can return -ENOTEMPTY without mutating the bucket

}

ret = rgw_bucket_sync_user_stats(store, bucket.tenant, info);
ret = rgw_bucket_sync_user_stats(store, bucket.tenant, bucket.name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, i see that bucket.tenant was getting converted to rgw_user for the (user_id, bucket_info) overload. but the (tenant_name, bucket_name) overload has to call get_bucket_info() again to look up the info that we already have here. it looks like we can take a shortcut:

  ret = rgw_bucket_sync_user_stats(store, info.owner, info);

src/vstart.sh Outdated
@@ -459,6 +459,8 @@ $extra_conf
rgw crypt s3 kms encryption keys = testkey-1=YmluCmJvb3N0CmJvb3N0LWJ1aWxkCmNlcGguY29uZgo= testkey-2=aWIKTWFrZWZpbGUKbWFuCm91dApzcmMKVGVzdGluZwo=
rgw crypt require ssl = false
rgw lc debug interval = 10
rgw gc processor period = 10
rgw gc obj min wait = 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd prefer not to include these changes

@cbodley cbodley changed the title Fix 22122 rgw: fix radosgw-admin bucket rm with --purge-objects and --bypass-gc Nov 14, 2017
@agutikov agutikov force-pushed the fix-22122 branch 2 times, most recently from d268188 to 835ad93 Compare November 14, 2017 16:49
@agutikov
Copy link
Author

Please, do not merge.

@agutikov
Copy link
Author

Now seems working as expected.

Btw, it's not very obvious that --rgw-gc-obj-min-wait should be passed to radosgw-admin bucket rm when not using option --bypass-gc In other case default value is 2 hours and seems that gc not working or seems for example that async senging chain to gc not finished.
Really, can we be sure that task started by gc_aio_operate will be finished before radosgw-admin exited?

  1. #0 RGWRados::gc_aio_operate (this=0x555555e5aff0, oid="gc.19", op=0x7fffffff6810) at /home/aleksei/ceph/src/rgw/rgw_rados.cc:12583
  2. #1 0x00005555558f9845 in RGWGC::send_chain (this=0x555555f134a0, chain=..., tag="357b9f06-d5c4-406e-98e8-32c5ce7cef74.4119.52", sync=<optimized out>) at /home/aleksei/ceph/src/rgw/rgw_gc.cc:68
  3. #2 0x00005555558059a3 in RGWRados::Object::complete_atomic_modification (this=0x7fffffff76b0) at /home/aleksei/ceph/src/rgw/rgw_rados.cc:8441
  4. #3 0x000055555583d9be in RGWRados::Object::Delete::delete_obj (this=0x7fffffff7570) at /home/aleksei/ceph/src/rgw/rgw_rados.cc:8864
  5. #4 0x000055555583e570 in RGWRados::delete_obj (this=this@entry=0x555555e5aff0, obj_ctx=..., bucket_info=..., obj=..., versioning_status=0, bilog_flags=bilog_flags@entry=0, expiration_time=..., zones_trace=0x0) at /home/aleksei/ceph/src/rgw/rgw_rados.cc:8906
  6. #5 0x00005555556e0143 in rgw_remove_object (store=store@entry=0x555555e5aff0, bucket_info=..., bucket=..., key=...) at /home/aleksei/ceph/src/rgw/rgw_bucket.cc:513
  7. #6 0x00005555556e4911 in rgw_remove_bucket (store=0x555555e5aff0, bucket=..., delete_children=delete_children@entry=true) at /home/aleksei/ceph/src/rgw/rgw_bucket.cc:560
  8. #7 0x00005555556e6d20 in RGWBucket::remove (this=this@entry=0x7fffffff93d0, op_state=..., bypass_gc=bypass_gc@entry=false, keep_index_consistent=keep_index_consistent@entry=true, err_msg=err_msg@entry=0x7fffffff92a0) at /home/aleksei/ceph/src/rgw/rgw_bucket.cc:938
  9. #8 0x00005555556e71ca in RGWBucketAdminOp::remove_bucket (store=0x555555e5aff0, op_state=..., bypass_gc=<optimized out>, keep_index_consistent=<optimized out>) at /home/aleksei/ceph/src/rgw/rgw_bucket.cc:1366
  10. #9 0x00005555556905ef in main (argc=<optimized out>, argv=<optimized out>) at /home/aleksei/ceph/src/rgw/rgw_admin.cc:5864

ret = abort_bucket_multiparts(store, cct, info, prefix, delimiter);
if (ret < 0) {
return ret;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool, this covers the ENOTEMPTY case. but i'm still not sure why abort_bucket_multiparts() needs to happen before object removal now - some clarifying comments would help here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, It will not help with concurrent delete bucket and uploads either multipart or simple.
store->delete_bucket calls check_bucket_empty
but even this will not guarantee that no garbage (from running uploads) will left after bucket been deleted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right. this reminds me of some earlier discussions in #15630 (review)

@cbodley
Copy link
Contributor

cbodley commented Nov 15, 2017

Really, can we be sure that task started by gc_aio_operate will be finished before radosgw-admin exited?

that's a good point. we don't even know whether the rados client has sent that osd op before shutdown. it looks like RGWGC::finalize() will need a way to wait for all completions before returning. i opened http://tracker.ceph.com/issues/22133 to track this one

Call RGWRados::delete_bucket() from rgw_remove_bucket_bypass_gc()
instead of partial copy of code of RGWRados::delete_bucket().

Fix updating user stats after radosgw-admin bucket rm --purge-objects and --bypass-gc
Due to rgw_user(const std::string& s) was called incorrect version of rgw_bucket_sync_user_stats().

Fixes: http://tracker.ceph.com/issues/22122
Fixes: http://tracker.ceph.com/issues/19959
Signed-off-by: Aleksei Gutikov <aleksey.gutikov@synesis.ru>
@agutikov
Copy link
Author

Please, remove DNM label

@cbodley cbodley removed the DNM label Nov 16, 2017
@yuriw
Copy link
Contributor

yuriw commented Nov 16, 2017

1 similar comment
@yuriw
Copy link
Contributor

yuriw commented Nov 16, 2017

@yuriw yuriw merged commit 383dd57 into ceph:master Nov 17, 2017
@agutikov agutikov deleted the fix-22122 branch December 13, 2017 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants