-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os/bluestore: fix deferred_queue locking #38934
Conversation
ceph#30027 introduced a gap in osr protection (in _deferred_queue()) which could cause improper deferred_pending value while processing osr from _deferred_aio_finish(). As a result both segmentation fault in _deferred_aio_finish() or deadlock could occur. Fixes: https://tracker.ceph.com/issues/48776 Signed-off-by: Igor Fedotov <ifedotov@suse.com>
@ifed01 . Sorry, i don't know the root cause from the git log. Can you give more explain? Thanks ! |
@majianpeng - current implementation of BlueStore::_deferred_queue() has an unprotected gap when OSR's deferred_pending is reset and preserved in tmp local var. This might cause concurring _deferred_aio_finish() behave differently and trigger an improper removal from deferred_queue (e.g. https://tracker.ceph.com/issues/48876): if (!osr->deferred_pending) {
dout(20) << __func__ << " dequeueing" << dendl;
{
deferred_lock.lock();
auto q = deferred_queue.iterator_to(*osr);
deferred_queue.erase(q);
deferred_lock.unlock();
}
osr->deferred_lock.unlock();
} else { Another appearance (which I haven't investigated deeply but it dissolves after my patch) is an occasionally hanging ceph_test_objectstore's IORemount/2 test case: https://tracker.ceph.com/issues/48776 Both tickets are quite easily reproducible by multiple running of the above mentioned ceph_test_objectstore's IORemount/2 test case |
@ifed01 . w/ your pr. In __deffered_queue func, txc->osr->deferred_lock.lock();
{
if (!txc->osr->deferred_pending) {
tmp = new DeferredBatch(cct, txc->osr.get());
txc->osr->deferred_lock.unlock();
// >>For this case, it still don't insert osr into deferred_queue. And after unlock, _deffered_aio_finish
// will call deferred_queue.erase(). So i think for this case we can't unlock. and
// for deferred_pending != NULL, it can unlock.
need_lock_again = true;
} else {
tmp = txc->osr->deferred_pending;
}
} |
@ifed01 . For the hung problem is :
|
Agree that we should keep lock for deferred_pending == null case but IMO for deferred_pending we should hold it as well. |
I can anecdotally confirm that this PR makes the OSD much more stable. It was routinely crashing for me in testing before I applied it. Since then, no OSD crashes. |
I agree your point. |
@ifed01 could you please create a pacific backport PR for this? |
#30027 introduced a gap in osr
protection (in _deferred_queue()) which could cause improper deferred_pending value while
processing osr from _deferred_aio_finish().
As a result both segmentation fault in _deferred_aio_finish() or deadlock could occur.
Fixes: https://tracker.ceph.com/issues/48776
Signed-off-by: Igor Fedotov ifedotov@suse.com
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox