
jewel: systemd/ceph-disk: reduce ceph-disk flock contention #12210

Merged

Conversation

ddiss
Contributor

@ddiss ddiss commented Nov 29, 2016

"ceph-disk trigger" invocation is currently performed in a mutually
exclusive fashion, with each call first taking an flock on the path
/var/lock/ceph-disk. On systems with a lot of osds, this leads to a
large amount of lock contention during boot-up, and can cause some
service instances to trip the 120 second timeout.

Take an flock on a device specific path instead of /var/lock/ceph-disk,
so that concurrent "ceph-disk trigger" invocations are permitted for
independent osds. This greatly reduces lock contention and consequently
the chance of service timeout. Per-device concurrency restrictions
required for http://tracker.ceph.com/issues/13160 are maintained.

Fixes: http://tracker.ceph.com/issues/18060

Signed-off-by: David Disseldorp <ddiss@suse.de>
(cherry picked from commit 8a62cbc)
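The change described above amounts to keying the lock on a per-device path instead of one global path. A minimal Python sketch of the idea, not ceph-disk's actual code — the `device_lock_path` naming scheme and the `lock_dir` parameter are illustrative assumptions:

```python
import fcntl
import os
import re


def device_lock_path(dev, lock_dir="/var/lock"):
    """Map a device node to its own lock file, e.g.
    /dev/sdb1 -> /var/lock/ceph-disk-dev-sdb1 (hypothetical naming)."""
    safe = re.sub(r"[^\w]+", "-", dev.lstrip("/"))
    return os.path.join(lock_dir, "ceph-disk-" + safe)


def trigger_locked(dev, action, lock_dir="/var/lock"):
    """Run `action` while holding an exclusive flock on the
    device-specific lock file: triggers for different devices proceed
    concurrently, while two triggers for the same device still
    serialize (the per-device restriction required by issue 13160)."""
    fd = os.open(device_lock_path(dev, lock_dir),
                 os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks only on same-device contention
        return action()
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Under the old scheme every instance opened the single /var/lock/ceph-disk path, so all OSD activations on the host queued behind one lock and could exceed the 120-second service timeout; here each device queues only behind itself.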

@ddiss ddiss changed the title systemd/ceph-disk: reduce ceph-disk flock contention Backport: systemd/ceph-disk: reduce ceph-disk flock contention Nov 29, 2016
@ghost ghost added bug-fix core labels Nov 29, 2016
@ghost ghost self-assigned this Nov 29, 2016
@ghost ghost added this to the jewel milestone Nov 29, 2016
@tchaikov tchaikov changed the title Backport: systemd/ceph-disk: reduce ceph-disk flock contention jewel: systemd/ceph-disk: reduce ceph-disk flock contention Nov 30, 2016
ghost pushed a commit that referenced this pull request Dec 5, 2016
… flock contention

Reviewed-by: Loic Dachary <ldachary@redhat.com>
@ghost ghost changed the base branch from jewel-next to jewel December 21, 2016 23:27
@ddiss
Contributor Author

ddiss commented Jan 9, 2017

Ping - anything I can do to help move this along?

@smithfarm
Contributor

@dachary I guess all this needs is a ceph-disk suite? I'll schedule one.

@smithfarm
Contributor

smithfarm commented Jan 18, 2017

This PR was included in this [1] integration branch which already passed a ceph-disk suite at [2].

[1] http://tracker.ceph.com/issues/17851#note-17
[2] http://tracker.ceph.com/issues/17851#note-18

@ghost ghost merged commit 25a9e5f into ceph:jewel Jan 18, 2017
@vasukulkarni
Contributor

There is a systemd test in smoke which can be used as well, FYI.

@smithfarm
Contributor

@vasukulkarni Thanks for the heads-up, but I couldn't find it in the jewel branch.
