rgw/multisite: handle shard_id correctly in data sync when bucket num_shards is 0 #51085
Conversation
the change to init_index() looks correct..
src/rgw/services/svc_bi_rados.cc (outdated diff):

-  get_bucket_index_objects(bucket_oid_base, idx_layout.layout.normal.num_shards,
+  get_bucket_index_objects(bucket_oid_base, rgw::num_shards(idx_layout.layout.normal),
                            idx_layout.gen, bucket_objs, shard_id);
   if (bucket_instance_ids) {
-    get_bucket_instance_ids(bucket_info, idx_layout.layout.normal.num_shards,
+    get_bucket_instance_ids(bucket_info, rgw::num_shards(idx_layout.layout.normal),
..but won't this break access to existing buckets that had num_shards=0?
yeah I am still testing this.
@cbodley for a simple case of a bucket with num_shards 0 (without this fix), an incremental sync causes radosgw to crash. But you are right that the older buckets will still have the same shard object names, and an incremental sync can still result in the crash. Looking further into what's causing it.

seems to be crashing when CheckBucketShardStatusIsIncremental() is called, specifically at:

huh, actually no. it happens right when we enter incremental sync at https://github.com/ceph/ceph/blob/main/src/rgw/driver/rados/rgw_data_sync.cc#L5864

so it comes down to the way we parse a bucket instance without the shard id delimiter
yeah, i think it makes sense for data sync to treat -1 as 1 shard
…case

For buckets that have num_shards set to 0, the bucket instance will not have a shard_id delimiter. When this bucket instance is parsed, we end up assigning a value of -1 to shard_id, which is invalid in data sync. This change ensures that we represent the shard_id correctly by giving it a valid number.

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
force-pushed from ec576b0 to f11b3dc
@cbodley please review.

@cbodley could you review this one please?

jenkins test api
@smanjara can you please make sure these fixes make it to reef? |
https://tracker.ceph.com/issues/59489