
rgw/rgw_op.cc: support async md5 calculation #42435

Closed
wants to merge 1 commit

Conversation

yanghonggang
Contributor

The current PutObject procedure is:

  • LOOP
    • read data from client
    • calculate md5
    • async write
  • drain write ops
  • update index and write header

If we change the MD5 calculation from a sync op to an async op, the async write can start earlier, which reduces put latency.

In my test environment, put latency decreased from 91 ms to 82 ms (4 MB S3 objects, measured with hsbench). Of course, this assumes the RADOS cluster performance and the RGW node's CPU are not the bottleneck.

In order to make the MD5 calculation asynchronous, a copy of each user data chunk is kept until the calculation finishes. I don't know if there is a smarter way to handle this.
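
For illustration, a minimal self-contained sketch of this chaining idea (a hypothetical AsyncMD5 type hashing with OpenSSL's MD5_* calls; the actual patch reuses rgw's existing hash object and throttles in-flight updates):

// Illustrative only: each chunk is copied and hashed on a background task,
// and tasks are chained through `last` so updates happen in order while the
// caller goes on to issue its async write.
#include <future>
#include <string>
#include <openssl/md5.h>

struct AsyncMD5 {
  MD5_CTX ctx;
  std::future<void> last;   // completion of the most recently queued update

  AsyncMD5() { MD5_Init(&ctx); }

  void update(std::string chunk) {           // chunk is copied; freed after hashing
    last = std::async(std::launch::async,
        [this, prev = std::move(last), c = std::move(chunk)]() mutable {
          if (prev.valid())
            prev.get();                      // wait for the previous chunk's update
          MD5_Update(&ctx, c.data(), c.size());
        });
  }

  void finalize(unsigned char out[MD5_DIGEST_LENGTH]) {
    if (last.valid())
      last.get();                            // drain pending updates
    MD5_Final(out, &ctx);
  }
};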

Any suggestions would be greatly appreciated.

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


Signed-off-by: Yang Honggang <yanghonggang_yewu@cmss.chinamobile.com>
@mattbenjamin
Contributor

hi @yanghonggang! @mdw-at-linuxbox has thought about this a bit, I'm adding him as a reviewer

@alimaredia alimaredia self-requested a review July 21, 2021 19:18
@mkogan1 mkogan1 requested review from ofriedma and mkogan1 July 22, 2021 16:04
Contributor

@cbodley cbodley left a comment


very cool, thanks!

@@ -3958,7 +3995,15 @@ void RGWPutObj::execute(optional_yield y)
}

if (need_calc_md5) {
hash.Update((const unsigned char *)data.c_str(), data.length());
int data_len = data.length();
char* buf = new char[data_len];
Contributor


no need to allocate and copy the buffer, you can just pass a copy of bufferlist data through hash.Update() and its lambda
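
A hedged sketch of what that might look like, reusing the Update()/ongoing_ops/last names from the snippet quoted below; this is not the PR's final code:

// Sketch only: capture a bufferlist copy in the hashing task instead of
// new[]/memcpy; copying a bufferlist just bumps refcounts on its underlying
// buffers, which keeps the chunk alive until the task has hashed it.
void Update(bufferlist data) {
  ongoing_ops.get(1);                        // throttle in-flight hash tasks
  last = std::async(std::launch::async,
      [this, prev = std::move(last), bl = std::move(data)]() mutable {
        if (prev.valid())
          prev.get();                        // keep updates in submission order
        hash.Update((const unsigned char *)bl.c_str(), bl.length());
        ongoing_ops.put(1);
      });
}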

Contributor Author


ok

void Update(char *data, size_t len) {
  ongoing_ops.get(1);

  last = std::async([&](std::future<void>&& last, char* data,
Contributor


this pattern with std::async() and std::future looks simple and elegant, but i do have some concerns here

first is that rgw requests can run asynchronously as coroutines in a boost::asio::io_context, and we want to avoid blocking on a condition variable (either in Throttle::get() or std::future::get()) - instead, we should suspend the coroutine so this thread can resume work on something else until the result is ready. you can find some examples of this in RGWReshardWait::wait(optional_yield y) and RGWHTTPClient::wait(optional_yield y)

this overload of std::async() doesn't take a launch policy, and

> Behaves as if called with policy being std::launch::async | std::launch::deferred. In other words, f might be executed in another thread or it might be run synchronously when the resulting std::future is queried for a value.

as far as i know, std::launch::async doesn't give us any guarantees about the number or lifetime of its background threads, or the order of execution for these tasks. if it does limit the number of threads and allows out-of-order execution, this pattern could lead to deadlocks because each task blocks on the result of the previous task with std::future::get()
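
For reference, a small standalone illustration of the quoted policy difference (do_hash is just a placeholder for hashing one chunk):

#include <future>
#include <string>

static void do_hash(const std::string&) { /* stand-in for hashing one chunk */ }

int main() {
  std::string chunk(4 << 20, 'x');
  // default policy is async|deferred: the task may not run until get() is
  // called, i.e. synchronously on the waiting thread
  auto f1 = std::async(do_hash, chunk);
  // explicit std::launch::async forces another thread, but the standard makes
  // no promises about how many threads back these tasks or when they start
  auto f2 = std::async(std::launch::async, do_hash, chunk);
  f1.get();
  f2.get();
}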

Contributor Author


> you can find some examples of this in RGWReshardWait::wait(optional_yield y) and RGWHTTPClient::wait(optional_yield y)

ok, thank you.

> if it does limit the number of threads and allows out-of-order execution, this pattern could lead to deadlocks because each task blocks on the result of the previous task with std::future::get()

@cbodley
I don't know under which conditions this would lead to deadlocks. Can you give an example?
thank you.


Contributor


> if it does limit the number of threads and allows out-of-order execution, this pattern could lead to deadlocks because each task blocks on the result of the previous task with std::future::get()

> I don't know under which conditions this would lead to deadlocks. Can you give an example?

taking this example to an extreme, consider an implementation that uses a thread pool with a single thread. we upload an object, and AsyncMD5 creates a sequence of tasks A->B->C->D. if this thread pool allows out-of-order execution, then it may execute B before A. task B will block waiting for the result of A, but task A can never run because there's only the single thread

as the number of threads increases, deadlock becomes far less likely. but i think we're better off handling the threading manually to guarantee that it won't. for example, with a scheduler that's aware of these dependencies, and doesn't schedule a task until it's ready to run. we should also be able to combine the two separate locks (Throttle and std::future) into one
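
One possible shape for such a dependency-aware scheduler, purely as a sketch and not taken from this PR or from rgw's existing executors: post each update onto a boost::asio strand over a small thread pool, so ordering comes from the executor rather than from each task blocking on the previous future.

#include <boost/asio/post.hpp>
#include <boost/asio/strand.hpp>
#include <boost/asio/thread_pool.hpp>
#include <string>
#include <openssl/md5.h>

struct OrderedHasher {
  boost::asio::thread_pool pool{2};
  boost::asio::strand<boost::asio::thread_pool::executor_type>
      strand{pool.get_executor()};
  MD5_CTX ctx;

  OrderedHasher() { MD5_Init(&ctx); }

  void queue_update(std::string chunk) {
    // handlers posted to a strand run one at a time, in the order they were
    // queued, so no update ever has to block waiting on an earlier one
    boost::asio::post(strand, [this, c = std::move(chunk)] {
      MD5_Update(&ctx, c.data(), c.size());
    });
  }

  void finalize(unsigned char out[MD5_DIGEST_LENGTH]) {
    pool.join();                 // drain queued updates
    MD5_Final(out, &ctx);
  }
};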

ultimately i think we're either going to need SIMD or large batches to see real wins here, to make up for the added overhead of thread synchronization

Contributor Author


> ultimately i think we're either going to need SIMD or large batches to see real wins here, to make up for the added overhead of thread synchronization

I'm with you on that. Thank you for your suggestions.

@mkogan1
Contributor

mkogan1 commented Aug 2, 2021

in a vstart test the performance seems to be lower and the CPU usage higher with this PR
screenshots for comparison:
without this PR - top, with this PR - bottom:
[screenshot]
without this PR - left, with this PR - right:
[screenshot]

@yanghonggang
Contributor Author

@mkogan1 thank you for your response. It seems that the performance is S3-object-size dependent (your -z is 4K).
I can share my test results (S3 object size 4 MB, duration 500 s). My Ceph version is 14.2.8.

without this PR:
[screenshot]

with this PR:
[screenshot]

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@stale

stale bot commented Jan 9, 2022

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Jan 9, 2022
@djgalloway djgalloway changed the base branch from master to main July 2, 2022 00:00
@github-actions github-actions bot removed the stale label Jul 28, 2022

@github-actions github-actions bot added the stale label Nov 28, 2022
@github-actions

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Dec 28, 2022
@alimaredia alimaredia reopened this Jan 6, 2023
@github-actions github-actions bot removed the stale label Jan 6, 2023

@github-actions github-actions bot added the stale label Mar 7, 2023

@github-actions github-actions bot closed this Apr 6, 2023
@cbodley cbodley mentioned this pull request Jun 12, 2023