New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: Implement asynchronous recovery sleep #15212

Merged
merged 4 commits into from Jun 1, 2017

Conversation

Projects
None yet
4 participants
@neha-ojha
Member

neha-ojha commented May 22, 2017

Rather than blocking the main op queue, do an asynchronous recovery sleep.

Signed-off-by: Neha Ojha nojha@redhat.com

osd: Implement asynchronous recovery sleep
Signed-off-by: Neha Ojha <nojha@redhat.com>
<< ", re-queuing recovery" << dendl;
service.recovery_needs_sleep = false;
pg->recovery_queued = false;
pg->queue_recovery();

This comment has been minimized.

@jdurgin

jdurgin May 22, 2017

Member

At this point this PGQueuable already went through and was accounted for in recovery_ops_reserved when it was originally enqueued, but pg->queue_recovery() will end up calling OSDService::queue_for_recovery()
->OSDService::_maybe_queue_recovery() which will try to account it in recovery_ops_reserved again.

It seems like we can avoid taking the pg lock and changing any pg state here, and re-add the pg directly via OSDService::_queue_for_recovery() under the recovery_lock.

This comment has been minimized.

@neha-ojha

neha-ojha May 23, 2017

Member

I'll try making these changes and let you know how it goes.

dout(20) << __func__ << " slept for " << t << dendl;
pg->lock();
handle.reset_tp_timeout();
if (cct->_conf->osd_recovery_sleep > 0 && service.recovery_needs_sleep) {

This comment has been minimized.

@jdurgin

jdurgin May 22, 2017

Member

I think your requeuing algorithm makes sense - could you add a comment here explaining it for future code readers?

This comment has been minimized.

@neha-ojha
@jdurgin

This comment has been minimized.

Member

jdurgin commented May 22, 2017

right now nothing in the docs mentions this option - it should be added to doc/rados/configuration/osd-config-ref.rst

I'm thinking the default for this setting should be a small value to (a) get tested with all the existing suites and (b) impact client i/o less during default recovery. Something like 0.01 perhaps? This is one of those settings where different defaults for ssds vs hdds would be ideal.

neha-ojha added some commits May 25, 2017

doc: add description about osd_recovery_sleep
Signed-off-by: Neha Ojha <nojha@redhat.com>
osd: change default value of osd_recovery_sleep
Signed-off-by: Neha Ojha <nojha@redhat.com>
osd: re-queue recovery without pg lock
Signed-off-by: Neha Ojha <nojha@redhat.com>

@jdurgin jdurgin added the needs-qa label May 25, 2017

@liewegas liewegas merged commit 8bcf3e5 into ceph:master Jun 1, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment