jewel: systemd/ceph-disk: reduce ceph-disk flock contention #12210

ddiss · 2016-11-29T10:16:56Z

"ceph-disk trigger" invocation is currently performed in a mutually
exclusive fashion, with each call first taking an flock on the path
/var/lock/ceph-disk. On systems with a lot of osds, this leads to a
large amount of lock contention during boot-up, and can cause some
service instances to trip the 120 second timeout.

Take an flock on a device specific path instead of /var/lock/ceph-disk,
so that concurrent "ceph-disk trigger" invocations are permitted for
independent osds. This greatly reduces lock contention and consequently
the chance of service timeout. Per-device concurrency restrictions
required for http://tracker.ceph.com/issues/13160 are maintained.

Fixes: http://tracker.ceph.com/issues/18060

Signed-off-by: David Disseldorp <ddiss@suse.de>
(cherry picked from commit 8a62cbc)
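
As a rough illustration of the per-device locking idea described above (not the actual patch), the sketch below takes a non-blocking exclusive flock on a lock file derived from the device name. The helper names `lock_path_for`, `try_lock` and `unlock`, the `ceph-disk-<basename>` naming scheme, and the use of a temporary directory in place of /var/lock are all assumptions made for this example.

```python
import fcntl
import os
import tempfile


def lock_path_for(device, lock_dir):
    # Hypothetical naming scheme: one lock file per device basename,
    # e.g. <lock_dir>/ceph-disk-sdb1 for /dev/sdb1.
    return os.path.join(lock_dir, "ceph-disk-" + os.path.basename(device))


def try_lock(device, lock_dir):
    """Try to take an exclusive flock for `device` without blocking.

    Returns the open file descriptor holding the lock, or None if
    another holder already has the lock for the same device.
    """
    fd = os.open(lock_path_for(device, lock_dir), os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def unlock(fd):
    # Release the flock and close the descriptor.
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


# Small demonstration using a throwaway directory instead of /var/lock.
demo_dir = tempfile.mkdtemp()
first = try_lock("/dev/sdb1", demo_dir)
other = try_lock("/dev/sdc1", demo_dir)   # different device, separate lock file
again = try_lock("/dev/sdb1", demo_dir)   # same device, lock already held
print(first is not None, other is not None, again is None)
unlock(first)
unlock(other)
```

With a single shared lock file, the second and third calls would both have to wait on the first; with per-device paths, only the call for the same device is refused, which is the contention reduction the commit message describes.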

ddiss · 2017-01-09T10:32:10Z

Ping - anything I can do to help move this along?

smithfarm · 2017-01-17T20:28:54Z

@dachary I guess all this needs is a ceph-disk suite? I'll schedule one.

smithfarm · 2017-01-18T13:02:28Z

This PR was included in this [1] integration branch which already passed a ceph-disk suite at [2].

[1] http://tracker.ceph.com/issues/17851#note-17
[2] http://tracker.ceph.com/issues/17851#note-18

vasukulkarni · 2017-01-18T20:47:16Z

There is a systemd test in smoke which can be used as well, FYI.

smithfarm · 2017-01-20T11:13:27Z

@vasukulkarni Thanks for the heads-up, but I couldn't find it in the jewel branch.

ddiss changed the title ~~systemd/ceph-disk: reduce ceph-disk flock contention~~ Backport: systemd/ceph-disk: reduce ceph-disk flock contention Nov 29, 2016

ghost added bug-fix core labels Nov 29, 2016

ghost self-assigned this Nov 29, 2016

ghost added this to the jewel milestone Nov 29, 2016

tchaikov changed the title ~~Backport: systemd/ceph-disk: reduce ceph-disk flock contention~~ jewel: systemd/ceph-disk: reduce ceph-disk flock contention Nov 30, 2016

ghost pushed a commit that referenced this pull request Dec 5, 2016

Merge pull request #12210: jewel: systemd/ceph-disk: reduce ceph-disk…

8246c53

… flock contention Reviewed-by: Loic Dachary <ldachary@redhat.com>

ghost changed the base branch from jewel-next to jewel December 21, 2016 23:27

ghost merged commit 25a9e5f into ceph:jewel Jan 18, 2017

This pull request was closed.
