New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: unlock sdata_op_ordering_lock with sdata_lock hold to avoid miss… #15891

Merged
merged 1 commit into from Jun 25, 2017

Conversation

Projects
None yet
7 participants
@minggr

minggr commented Jun 23, 2017

…ing wakeup signal

We are running mysql on top of rbd. sysbench qps occasionally drops to zero
with the INSERT benchmark.

Debug code captured >2s latency between PG::queue_op() and OSD::dequeue_op().
We finally found out that the latency came from below code in OSD::ShardedOpWQ::_process(),

sdata->sdata_cond.WaitInterval(sdata->sdata_lock,
utime_t(osd->cct->_conf->threadpool_empty_queue_max_wait, 0));

"threadpool_empty_queue_max_wait" is 2s by default.

Normally, it should not sleep for 2s since the comming IO requests will wakeup it.
But there is a small timing window that it missed the wakeup signal actually.
For example,

 msgr-worker-0 thread                    tp_osd_tp thread
 OSD::ShardedOpWQ::_enqueue              OSD::ShardedOpWQ::_process
 ---------------------------             ---------------------------

T1: sdata_op_ordering_lock.Lock()
T2: sdata_op_ordering_lock.Lock()
"queue empty"
sdata_op_ordering_lock.Unlock()
"insert op"
sdata_op_ordering_lock.Unlock()

T3: sdata_lock.Lock()
T4: sdata_lock.Lock()
"send wakeup signal"
sdata_lock.Unock()
// here the wakeup signal has no effect actually
// becuase it has not slept yet.

                                         // then, it sleeps.
                                         WaitInterval(2s)

This patch unlocks sdata_op_ordering_lock with sdata_lock hold in OSD::ShardedOpWQ::_process(),
then the timeline becomes,

 msgr-worker-0 thread                    tp_osd_tp thread
 OSD::ShardedOpWQ::_enqueue              OSD::ShardedOpWQ::_process
 ---------------------------             ---------------------------

T1: sdata_op_ordering_lock.Lock()
T2: sdata_op_ordering_lock.Lock()
"queue empty"
sdata_lock.Lock()
T3: sdata_op_ordering_lock.Unlock()
"insert op"
sdata_op_ordering_lock.Unlock()
sdata_lock.Lock()

T4: WaitInterval(2s) -> it actually unlocks sdata_lock
"send wakeup signal"
sdata_lock.Unock()
//got signal, wakeup immediately

With this one line change, we can avoid occasional high latency.

Signed-off-by: Ming Lin ming.lin@alibaba-inc.com

osd: unlock sdata_op_ordering_lock with sdata_lock hold to avoid miss…
…ing wakeup signal

We are running mysql on top of rbd. sysbench qps occasionally drops to zero
with the INSERT benchmark.

Debug code captured >2s latency between PG::queue_op() and OSD::dequeue_op().
We finally found out that the latency came from below code in OSD::ShardedOpWQ::_process(),

sdata->sdata_cond.WaitInterval(sdata->sdata_lock,
      utime_t(osd->cct->_conf->threadpool_empty_queue_max_wait, 0));

"threadpool_empty_queue_max_wait" is 2s by default.

Normally, it should not sleep for 2s since the comming IO requests will wakeup it.
But there is a small timing window that it missed the wakeup signal actually.
For example,

     msgr-worker-0 thread                    tp_osd_tp thread
     OSD::ShardedOpWQ::_enqueue              OSD::ShardedOpWQ::_process
     ---------------------------             ---------------------------
T1:                                          sdata_op_ordering_lock.Lock()
T2:  sdata_op_ordering_lock.Lock()
                                             "queue empty"
                                             sdata_op_ordering_lock.Unlock()
     "insert op"
     sdata_op_ordering_lock.Unlock()

T3:  sdata_lock.Lock()
T4:                                          sdata_lock.Lock()
     "send wakeup signal"
     sdata_lock.Unock()
                                             // here the wakeup signal has no effect actually
                                             // becuase it has not slept yet.

                                             // then, it sleeps.
                                             WaitInterval(2s)

This patch unlocks sdata_op_ordering_lock with sdata_lock hold in OSD::ShardedOpWQ::_process(),
then the timeline becomes,

     msgr-worker-0 thread                    tp_osd_tp thread
     OSD::ShardedOpWQ::_enqueue              OSD::ShardedOpWQ::_process
     ---------------------------             ---------------------------
T1:                                          sdata_op_ordering_lock.Lock()
T2:  sdata_op_ordering_lock.Lock()
                                             "queue empty"
                                             sdata_lock.Lock()
T3:                                          sdata_op_ordering_lock.Unlock()
     "insert op"
     sdata_op_ordering_lock.Unlock()
     sdata_lock.Lock()

T4:                                          WaitInterval(2s) -> it actually unlocks sdata_lock
     "send wakeup signal"
     sdata_lock.Unock()
                                             //got signal, wakeup immediately

With this one line change, we can avoid occasional high latency.

Signed-off-by: Ming Lin <ming.lin@alibaba-inc.com>
@minggr

This comment has been minimized.

minggr commented Jun 23, 2017

Seems the commit log was messed by the pull request, please see below for the clear commit log.
minggr@bc68338

@xiexingguo

This comment has been minimized.

Member

xiexingguo commented Jun 24, 2017

👍

@liupan1111

This comment has been minimized.

Contributor

liupan1111 commented Jun 24, 2017

good catch!

@liewegas liewegas merged commit 4e6af85 into ceph:master Jun 25, 2017

3 of 4 checks passed

arm64 make check arm64 make check failed
Details
Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
@mslovy

This comment has been minimized.

Contributor

mslovy commented Jun 27, 2017

👍, shall we backport this to Jewel?

@asheplyakov

This comment has been minimized.

asheplyakov commented Jun 27, 2017

I've filed a ticket regarding this problem: http://tracker.ceph.com/issues/20427
In the future it would be nice to file tickets before merging the fix

@mslovy

This comment has been minimized.

Contributor

mslovy commented Jun 27, 2017

great

@liewegas

This comment has been minimized.

Member

liewegas commented Jun 27, 2017

@xiexingguo

This comment has been minimized.

Member

xiexingguo commented Jun 27, 2017

You'll need to look at the jewel version.. the wq code was reworked for luminous so the same bug might not be in jewel.

According to http://tracker.ceph.com/issues/20427
It's originally tested and found in Jewel.

@mslovy

This comment has been minimized.

Contributor

mslovy commented Jun 27, 2017

exactly as xingguo mentioned. I still remember wq code is refacted in the hammer's release cycle and add shardOpWQ. So it may introduce at that time.

zealoussnow added a commit to zealoussnow/ceph that referenced this pull request Oct 23, 2017

osd: merge PR ceph#15891
see detail ceph#15891

Signed-off-by: Leo Zhang <nguzcf@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment