rgw/rgw_op.cc: support async md5 calculation #42435
Conversation
Signed-off-by: Yang Honggang <yanghonggang_yewu@cmss.chinamobile.com>
hi @yanghonggang! @mdw-at-linuxbox has thought about this a bit, I'm adding him as a reviewer
very cool, thanks!
@@ -3958,7 +3995,15 @@ void RGWPutObj::execute(optional_yield y)
  }

  if (need_calc_md5) {
    hash.Update((const unsigned char *)data.c_str(), data.length());
    int data_len = data.length();
    char* buf = new char[data_len];
no need to allocate and copy the buffer, you can just pass a copy of the bufferlist data through hash.Update() and its lambda
ok
void Update(char *data, size_t len) {
  ongoing_ops.get(1);

  last = std::async([&](std::future<void>&& last, char* data,
this pattern with std::async() and std::future looks simple and elegant, but i do have some concerns here

first is that rgw requests can run asynchronously as coroutines in a boost::asio::io_context, and we want to avoid blocking on a condition variable (either in Throttle::get() or std::future::get()) - instead, we should suspend the coroutine so this thread can resume work on something else until the result is ready. you can find some examples of this in RGWReshardWait::wait(optional_yield y) and RGWHTTPClient::wait(optional_yield y)

this overload of std::async() doesn't take a launch policy, and:

    Behaves as if called with policy being std::launch::async | std::launch::deferred. In other words, f might be executed in another thread or it might be run synchronously when the resulting std::future is queried for a value.

as far as i know, std::launch::async doesn't give us any guarantees about the number or lifetime of its background threads, or the order of execution for these tasks. if it does limit the number of threads and allows out-of-order execution, this pattern could lead to deadlocks because each task blocks on the result of the previous task with std::future::get()
you can find some examples of this in RGWReshardWait::wait(optional_yield y) and RGWHTTPClient::wait(optional_yield y)
ok, thank you.
if it does limit the number of threads and allows out-of-order execution, this pattern could lead to deadlocks because each task blocks on the result of the previous task with std::future::get()
@cbodley
I don't know under which conditions this will lead to deadlocks. Can you give an example?
thank you.
if it does limit the number of threads and allows out-of-order execution, this pattern could lead to deadlocks because each task blocks on the result of the previous task with std::future::get()
I don't know under which condition this will lead to deadlocks. Can you give an example?
taking this example to an extreme, consider an implementation that uses a thread pool with a single thread. we upload an object, and AsyncMD5 creates a sequence of tasks A->B->C->D. if this thread pool allows out-of-order execution, then it may execute B before A. task B will block waiting for the result of A, but task A can never run because there's only the single thread
as the number of threads increases, deadlock becomes far less likely. but i think we're better off handling the threading manually to guarantee that it won't. for example, with a scheduler that's aware of these dependencies, and doesn't schedule a task until it's ready to run. we should also be able to combine the two separate locks (Throttle and std::future) into one
ultimately i think we're either going to need SIMD or large batches to see real wins here, to make up for the added overhead of thread synchronization
ultimately i think we're either going to need SIMD or large batches to see real wins here, to make up for the added overhead of thread synchronization
I'm with you on that. Thank you for your suggestions.
@mkogan1 thank you for your response. It seems that the performance is s3 object size dependent (your -z is 4K).
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!
The current putobject procedure is:
If we change the md5 calculation from a sync op to an async op, the async write can start earlier, which will reduce put latency.
In my test environment, put latency decreased from 91ms to 82ms (4M s3 object; the test tool is hsbench). Of course, the rados cluster performance and the rgw node's CPU should not be a bottleneck.
In order to change the md5 calculation to an async op, one copy of the user data chunk is kept until the calculation is finished. I don't know if there is a smart way to handle this.
Any suggestions would be greatly appreciated.
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox