cls/rgw: index cancelation still cleans up remove_objs #43854
with this fix, the bucket stats correctly show a single 32M object using the reproducer from https://tracker.ceph.com/issues/53199, even with 36 concurrent multipart uploads (30 of which were canceled).
nice!
oops, my previous review was not the approve that I intended
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
Signed-off-by: Casey Bodley <cbodley@redhat.com>
when multipart uploads complete their final bucket index transaction, they pass the list of part objects in 'remove_objs' for bulk removal - the part objects, along with their bucket stats, get replaced by the head object

but if CompleteMultipart races with another upload, the head object write will fail with ECANCELED and the bucket index transaction gets canceled with CLS_RGW_OP_CANCEL. these canceled uploads still need to clean up their 'remove_objs', but cancelation was returning too early. as a result, these bucket index entries get orphaned and leave the bucket stats inconsistent

this commit reworks rgw_bucket_complete_op() so that CLS_RGW_OP_CANCEL is handled the same way as OP_ADD and OP_DEL, so it always runs the loop to clean up 'remove_objs'

Fixes: https://tracker.ceph.com/issues/53199

Signed-off-by: Casey Bodley <cbodley@redhat.com>
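for readers unfamiliar with the index code, the control flow described above can be sketched with a toy in-memory model. this is not the cls_rgw implementation: `BucketIndex`, `Entry`, `Op`, `prepare()` and `complete()` are hypothetical names invented for illustration, and the real code deals with OSD objects, tags, versioning and logging that are all left out here. the only point it tries to show is that the cancel case falls through to the same 'remove_objs' loop as add and delete instead of returning early.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real op codes (CLS_RGW_OP_ADD/DEL/CANCEL).
enum class Op { Add, Del, Cancel };

// Hypothetical, heavily simplified bucket index entry.
struct Entry {
  uint64_t size = 0;
  bool exists = false;  // false while only a pending transaction exists
  int pending = 0;      // prepared-but-not-completed transactions
};

// Toy model of a bucket index with its aggregate stats.
struct BucketIndex {
  std::map<std::string, Entry> entries;
  uint64_t num_objects = 0;
  uint64_t total_size = 0;

  void account(uint64_t size) { ++num_objects; total_size += size; }
  void unaccount(uint64_t size) { --num_objects; total_size -= size; }

  // First half of an index transaction: record a pending op on the key.
  void prepare(const std::string& key) { ++entries[key].pending; }

  // Second half: apply or cancel the op, then clean up remove_objs.
  void complete(Op op, const std::string& key, uint64_t size,
                const std::vector<std::string>& remove_objs) {
    auto it = entries.find(key);
    if (it != entries.end()) {
      Entry& e = it->second;
      if (e.pending > 0) --e.pending;
      switch (op) {
      case Op::Add:
        if (e.exists) unaccount(e.size);  // overwrite replaces old stats
        e.size = size;
        e.exists = true;
        account(size);
        break;
      case Op::Del:
        if (e.exists) unaccount(e.size);
        e.exists = false;
        if (e.pending == 0) entries.erase(it);
        break;
      case Op::Cancel:
        // Drop a placeholder entry that nobody else is waiting on.
        if (!e.exists && e.pending == 0) entries.erase(it);
        // The buggy behavior corresponds to returning here.
        break;
      }
    }
    // Runs for Add, Del and Cancel alike: remove the listed entries
    // and subtract their stats so nothing is orphaned.
    for (const auto& k : remove_objs) {
      auto r = entries.find(k);
      if (r == entries.end()) continue;
      if (r->second.exists) unaccount(r->second.size);
      entries.erase(r);
    }
  }
};
```

in this model, if two uploads of the same object each have two 16-byte parts, and one completes with `Op::Add` (size 32) while the other is canceled with `Op::Cancel`, the index ends with one entry and a total size of 32, provided both calls pass their own part list as `remove_objs`. changing the `Cancel` case to return before the loop would leave the canceled upload's two part entries, and their 32 bytes, in the stats.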
whenever an index transaction uses remove_objs for complete(), it also needs to pass them for cancel() to avoid leaking index entries Signed-off-by: Casey Bodley <cbodley@redhat.com>
rebased over #43103
jenkins test api