New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jewel: osd: unlock sdata_op_ordering_lock with sdata_lock hold to avoid missing wakeup signal #15947

Merged
merged 1 commit into from Aug 29, 2017

Conversation

Projects
None yet
6 participants
@asheplyakov

asheplyakov commented Jun 27, 2017

Alexey Sheplyakov
jewel: osd: unlock sdata_op_ordering_lock with sdata_lock hold to avo…
…id missing wakeup signal

Based on commit bc68338. The OSD code has
been refactored a lot since Jewel, hence cherry-picking that patch introduces
a lot of unrelated changes, and is much more difficult than reusing the idea.

Fixes: http://tracker.ceph.com/issues/20428

Signed-off-by: Alexey Sheplyakov <asheplyakov@mirantis.com>

@asheplyakov asheplyakov requested review from jdurgin and xiexingguo Jun 27, 2017

@xiexingguo

This comment has been minimized.

Show comment
Hide comment
@xiexingguo

xiexingguo Jun 27, 2017

Member

lgtm

Member

xiexingguo commented Jun 27, 2017

lgtm

@tchaikov tchaikov added this to the jewel milestone Jun 27, 2017

@@ -8727,9 +8727,9 @@ void OSD::ShardedOpWQ::_process(uint32_t thread_index, heartbeat_handle_d *hb )
assert(NULL != sdata);
sdata->sdata_op_ordering_lock.Lock();
if (sdata->pqueue->empty()) {
sdata->sdata_op_ordering_lock.Unlock();
osd->cct->get_heartbeat_map()->reset_timeout(hb, 4, 0);
sdata->sdata_lock.Lock();

This comment has been minimized.

@liu-chunmei

liu-chunmei Jul 14, 2017

Contributor

these two locks have different functions, and if sdata->pqueue is empty, the earlier the better to release sdata_op_oerdering_lock, Don't know why just release sdata_op_oerdering_lock after hold stada_lock will solve "missing wakeup signal"

@liu-chunmei

liu-chunmei Jul 14, 2017

Contributor

these two locks have different functions, and if sdata->pqueue is empty, the earlier the better to release sdata_op_oerdering_lock, Don't know why just release sdata_op_oerdering_lock after hold stada_lock will solve "missing wakeup signal"

This comment has been minimized.

@asheplyakov

asheplyakov Jul 15, 2017

Basically because releasing sdata_op_ordering lock without sdata_lock hold allows messenger worker thread (which tries to insert an op) to race with osd_tp thread (which is about to start waiting for the wakeup signal). See the commit message of the original commit for more details: bc68338

@asheplyakov

asheplyakov Jul 15, 2017

Basically because releasing sdata_op_ordering lock without sdata_lock hold allows messenger worker thread (which tries to insert an op) to race with osd_tp thread (which is about to start waiting for the wakeup signal). See the commit message of the original commit for more details: bc68338

This comment has been minimized.

@liu-chunmei

liu-chunmei Jul 17, 2017

Contributor

The explain is very clear, thanks!

@liu-chunmei

liu-chunmei Jul 17, 2017

Contributor

The explain is very clear, thanks!

@smithfarm

This comment has been minimized.

Show comment
Hide comment
@smithfarm

smithfarm Aug 29, 2017

Contributor

This passed the following relevant suites:

Contributor

smithfarm commented Aug 29, 2017

This passed the following relevant suites:

@smithfarm smithfarm merged commit 2a8e0f6 into ceph:jewel Aug 29, 2017

3 of 4 checks passed

Docs: build check Docs: building
Details
Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment