Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: Multiple Data Pool Support for a Bucket #19890

Closed
wants to merge 8 commits into from

Conversation

Jeegn-Chen
Copy link
Contributor

@Jeegn-Chen
Copy link
Contributor Author

@yehudasa @mattbenjamin @robbat2 @fangyuxiangGL
Discussion on https://etherpad.net/p/multiple-data-pool-support-for-a-bucket through ceph-devel mailing list may be somewhat abstract.
I worked out an initial implementation for atomic upload (multi-part upload and obj copy are to be addressed).
Could you take a look and comment?

@liewegas liewegas changed the title [WIP] Multiple Data Pool Support for a Bucket rgw: [WIP] Multiple Data Pool Support for a Bucket Jan 10, 2018
@ZVampirEM77
Copy link
Contributor

@Jeegn-Chen /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/radosgw-admin/help.t: failed

@ZVampirEM77
Copy link
Contributor

@Jeegn-Chen the command description should also be added into src/test/cli/radosgw-admin/help.t

@@ -10608,6 +10674,9 @@ int RGWRados::get_obj_iterate_cb(RGWObjectCtx *ctx, RGWObjState *astate,
if (!len)
return 0;
}
effective_io_ctx.dup(d->io_ctx);
}else {
effective_io_ctx.dup(d->tail_io_ctx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jeegn-Chen

dup here is dangerous for you trigger aio_operate, effective_io_ctx is on the stack, its private io_ctx_impl will be released when function return. You can use effective_io_ctx = d->tail_io_ctx;, which share io_ctx_impl with d->tail_io_ctx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch. thx

Copy link
Contributor

@mattbenjamin mattbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

inline ostream& operator<<(ostream& out, const RGWBucketDataLayoutType &data_layout_type)
{
switch (data_layout_type) {
case RGWDLType_Normal:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think "normal" and "splitted" are the right names for these cases, if only because for the latter you'd write "split." Maybe single-pool and split-pool? There still hasn't been organized discussion in RGW standups, that would be helpful.

@@ -2111,6 +2129,39 @@ struct rgw_cache_entry_info {
rgw_cache_entry_info() : gen(0) {}
};

struct rgw_data_placement_volatile_config {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a typical RGW name for a class/struct type. Could we rename this? More concretely, what makes it volatile, and how does that relate to similar info in RGW?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To tell the truth, I'm also worry about the proper name for this struct. I just use it for now and prefer to change if a proper one is found.

“volatile” here means unlike index_pool, data_pool and data_extra_pool of a bucket, each object in a bucket may have different rgw_data_placement_volatile_config.
And if we change current_tail_pools and upload the same object again, rgw_data_placement_volatile_config of the same object will be changed.

Any recommendation?

}

bool empty() const {
return data_layout_type == RGWDLType_Normal;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would put boolean expressions in parenthesis, but then, I like style(9)

@Jeegn-Chen Jeegn-Chen force-pushed the wip-bucket-multi-pool branch 3 times, most recently from d761802 to 9a263a1 Compare January 12, 2018 04:12
@Jeegn-Chen Jeegn-Chen changed the title rgw: [WIP] Multiple Data Pool Support for a Bucket rgw: Multiple Data Pool Support for a Bucket Jan 16, 2018
@fangyuxiangGL
Copy link
Contributor

fangyuxiangGL commented Jan 17, 2018

@Jeegn-Chen

I remembered one issue while I did #16325, which was related to multipart upload but I'm not sure if it exist here, you'd better verify it.

Given that we upload an object using multipart, and for one part of it, there are 3 rados objects, which we named as: multipart-object, shadow-1, shadow-2. At last, in RGWPutObjProcessor_Multipart::do_complete, please pay attention to codes:

  RGWRados::Object op_target(store, s->bucket_info, obj_ctx, head_obj);
  op_target.set_versioning_disabled(true);
  RGWRados::Object::Write head_obj_op(&op_target);

  head_obj_op.meta.set_mtime = set_mtime;
  head_obj_op.meta.mtime = mtime;
  head_obj_op.meta.owner = s->owner.get_id();
  head_obj_op.meta.delete_at = delete_at;
  head_obj_op.meta.zones_trace = zones_trace;
  head_obj_op.meta.modify_tail = true;

  int r = head_obj_op.write_meta(obj_len, accounted_size, attrs);
  if (r < 0)
    return r;

we must make sure that write_meta is called at the proper rados object, which should just be the multipart-object mentioned above.

@fangyuxiangGL
Copy link
Contributor

ok, seems you have introduced head_obj_op.meta.track_multipart to avoid it!

if (current_tail_pool) {
info.current_tail_pool = *current_tail_pool;
}
if (tail_pools) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Jeegn-Chen
Seems we have no way to remove anything from data_tail_pools.
how about call info.data_tail_pools.clear(); here?

@stale
Copy link

stale bot commented Oct 17, 2018

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
If you are a maintainer or core committer, please follow-up on this issue to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Oct 17, 2018
@stale
Copy link

stale bot commented Apr 22, 2019

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@stale stale bot closed this Apr 22, 2019
@cugwangyang
Copy link

The following URL can not be connected!Why was this request closed?Thank you
https://etherpad.net/p/multiple-data-pool-support-for-a-bucket

@cugwangyang
Copy link

http://tracker.ceph.com/issues/22565
https://etherpad.net/p/multiple-data-pool-support-for-a-bucket

Why was this request closed?There are big bugs in this request? Thank you!

@tchaikov
Copy link
Contributor

@cugwangyang it was closed by the bot. i am reopening it in hope it can have more attention from rgw maintainers.

@tchaikov tchaikov reopened this Feb 24, 2021
@stale stale bot removed the stale label Feb 24, 2021
@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@github-actions github-actions bot added the tests label Feb 24, 2021
@cugwangyang
Copy link

This URL can not be connected. Can you give me another URL to share your mind?
https://etherpad.net/p/multiple-data-pool-support-for-a-bucket

@stale
Copy link

stale bot commented Jul 21, 2021

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Jul 21, 2021
@github-actions
Copy link

github-actions bot commented Jul 7, 2022

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Jul 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants