jewel: ceph-disk: systemd unit timesout too quickly #17133

Merged
tchaikov merged 1 commit into ceph:jewel from smithfarm:wip-21035-jewel on Sep 7, 2017

5 participants
@smithfarm
Contributor

smithfarm commented Aug 22, 2017

ceph-disk: set the default systemd unit timeout to 3h
There needs to be a timeout to prevent ceph-disk from hanging
forever. But there is no good reason to set it to a value that is less
than a few hours.
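
As an illustration of the idea only (not the actual unit file change), a minimal Python sketch of running a command under a generous default timeout. The names `DEFAULT_TIMEOUT_SECONDS` and `run_with_timeout` are hypothetical, and the exit code 124 simply mirrors the convention of the coreutils `timeout` command:

```python
import subprocess
import sys

# Hypothetical default for this sketch: 3 hours, in seconds.
DEFAULT_TIMEOUT_SECONDS = 3 * 60 * 60


def run_with_timeout(argv, timeout=DEFAULT_TIMEOUT_SECONDS):
    """Run argv and return its exit code, or 124 if the timeout expires."""
    try:
        # subprocess.run kills the child and waits for it when the timeout expires.
        return subprocess.run(argv, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return 124


if __name__ == "__main__":
    # A command that finishes immediately is unaffected by the long default.
    print(run_with_timeout([sys.executable, "-c", "pass"]))
```

The timeout only guards against a command that hangs forever, so a large value costs nothing in the normal case.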

Each OSD activation needs to happen in sequence and not in parallel,
which is why there is a global activation lock.

It would be possible, when an OSD is using a device that is not
otherwise used by another OSD (i.e. they do not share an SSD journal
device etc.), to run all activations in parallel. It would however
require a more extensive modification of ceph-disk to avoid any chances
of races.
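
To make the serialization concrete, here is a rough sketch of a global lock around each activation, assuming a POSIX system with `fcntl.flock`. It is not ceph-disk's code: `LOCK_PATH`, `activation_lock` and `activate_all` are hypothetical names, and the `activate` callable stands in for whatever activates a single OSD:

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock file location, chosen for illustration only.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "activate-sketch.lock")


@contextmanager
def activation_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock on path for the duration of the block."""
    with open(path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def activate_all(devices, activate):
    """Call activate(dev) for each device, one at a time, under the global lock."""
    results = []
    for dev in devices:
        with activation_lock():
            results.append(activate(dev))
    return results
```

Because every activation waits on the same lock, the last OSD on a host can only start after all earlier ones have finished, which is why a short per-unit timeout is easy to exceed on hosts with many OSDs.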

Fixes: http://tracker.ceph.com/issues/20229

Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit a9eb52e)

@smithfarm smithfarm self-assigned this Aug 22, 2017

@smithfarm smithfarm added this to the jewel milestone Aug 22, 2017

@smithfarm smithfarm added bug fix core tools and removed core labels Aug 22, 2017


@smithfarm smithfarm changed the title from jewel: ceph-disk systemd unit timesout too quickly to jewel: ceph-disk: systemd unit timesout too quickly Aug 22, 2017

@smithfarm smithfarm requested review from asheplyakov and ddiss Aug 22, 2017


@ddiss

ddiss approved these changes Aug 22, 2017

Contributor

smithfarm commented Aug 22, 2017

Jenkins re-test this please (timeout on unittest_pglog.log)

Contributor

smithfarm commented Aug 22, 2017

Passed a ceph-disk suite at http://tracker.ceph.com/issues/20613#note-13

@smithfarm smithfarm requested review from tchaikov and badone Aug 22, 2017

@tchaikov tchaikov merged commit b014f39 into ceph:jewel Sep 7, 2017

4 checks passed

Docs: build check: OK - docs built
Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
make check: make check succeeded
Contributor

tchaikov commented Sep 7, 2017

@smithfarm as the systemd scripts are not exercised by the qa suites, i am merging it!

@smithfarm smithfarm deleted the smithfarm:wip-21035-jewel branch Sep 7, 2017

Member

vasukulkarni commented Sep 7, 2017

@tchaikov @smithfarm FYI we have 2 systemd tests in smoke which we can run for systemd changes ('--suite smoke + --filter systemd') http://pulpito.ceph.com/vasu-2017-09-07_01:18:41-smoke-master-distro-basic-vps/
