New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rgw_file: alternate fix deadlock on lru eviction #20034
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
untested, but the approach is sane and it preserves the latch in the common case
@yaozongyou @cbodley I'll try to get proper testing finished shortly, thanks for your help! |
I have been testing this fix in my environment, still hangs, here is the backtrace: (gdb) Thread 135
[Switching to thread 135 (Thread 0x7f3cfcf7a700 (LWP 1742))]
#6 0x00007f3d7f0a1110 in rgw::RGWFileHandle::reclaim (this=0x7f3d5e0db200)
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:1014
1014 fs->fh_cache.remove(fh.fh_hk.object, this, FHCache::FLAG_LOCK);
(gdb) bt
#0 0x00007f3d8fae51bd in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f3d8fae0d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007f3d8fae0c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f3d7f0b2df3 in __gthread_mutex_lock (__mutex=0x7f3d82045748)
at /data/home/richardyao/bin/gcc72/include/c++/7.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748
#4 std::mutex::lock (this=this@entry=0x7f3d82045748)
at /data/home/richardyao/bin/gcc72/include/c++/7.2.0/bits/std_mutex.h:103
#5 0x00007f3d7f0ccd5d in cohort::lru::TreeX<rgw::RGWFileHandle, boost::intrusive::rbtree<rgw::RGWFileHandle, boost::intrusive::compare<rgw::RGWFileHandle::FhLT>, boost::intrusive::member_hook<rgw::RGWFileHandle, boost::intrusive::set_member_hook<boost::intrusive::link_mode<(boost::intrusive::link_mode_type)1>, void, void, void>, &rgw::RGWFileHandle::fh_hook>, void, void, void, void>, rgw::RGWFileHandle::FhLT, rgw::RGWFileHandle::FhEQ, rgw::fh_key, std::mutex>::remove (this=0x7f3d820cee10,
hk=5070069916126884224, v=v@entry=0x7f3d5e0db200, flags=flags@entry=1)
at /data/home/richardyao/workspace/github/ceph/src/common/cohort_lru.h:447
#6 0x00007f3d7f0a1110 in rgw::RGWFileHandle::reclaim (this=0x7f3d5e0db200)
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:1014
#7 0x00007f3d7f0d1db2 in evict_block (this=<optimized out>)
at /data/home/richardyao/workspace/github/ceph/src/common/cohort_lru.h:147
#8 insert (flags=<synthetic pointer>, edge=cohort::lru::MRU, fac=<synthetic pointer>, this=<optimized out>)
at /data/home/richardyao/workspace/github/ceph/src/common/cohort_lru.h:239
#9 rgw::RGWLibFS::lookup_fh (this=this@entry=0x7f3d820cec00, parent=parent@entry=0x7f3d820cec20,
name=name@entry=0x7f3d73036058 "61", flags=flags@entry=68)
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.h:1065
#10 0x00007f3d7f0a5ec6 in rgw::RGWLibFS::create (this=0x7f3d820cec00, parent=0x7f3d820cec20, name=0x7f3d73036058 "61",
st=0x7f3cfcf75a80, mask=7, flags=0) at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:645
#11 0x00007f3d7f0a62a8 in rgw_create (rgw_fs=<optimized out>, parent_fh=<optimized out>, name=<optimized out>,
st=<optimized out>, mask=<optimized out>, fh=0x7f3cfcf75998, posix_flags=193, flags=0)
---Type <return> to continue, or q <return> to quit---
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:1667
#12 0x00007f3d7fdcb4c3 in rgw_fsal_open2 (obj_hdl=0x7f3d7ce0b200, state=0x7f3d73048d40, openflags=2,
createmode=FSAL_EXCLUSIVE, name=0x7f3d73036058 "61", attrib_set=0x7f3cfcf766e0, verifier=0x7f3cfcf767c8 "\241^\353-:\t",
new_obj=0x7f3cfcf75d00, attrs_out=0x7f3cfcf75c10, caller_perm_check=0x7f3cfcf75daf)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/FSAL/FSAL_RGW/handle.c:1060
#13 0x0000000000535a48 in mdcache_open2 (obj_hdl=0x7f3d7cd7e738, state=0x7f3d73048d40, openflags=2,
createmode=FSAL_EXCLUSIVE, name=0x7f3d73036058 "61", attrs_in=0x7f3cfcf766e0, verifier=0x7f3cfcf767c8 "\241^\353-:\t",
new_obj=0x7f3cfcf767e0, attrs_out=0x0, caller_perm_check=0x7f3cfcf75daf)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_file.c:672
#14 0x000000000042fda3 in open2_by_name (in_obj=0x7f3d7cd7e738, state=0x7f3d73048d40, openflags=2,
createmode=FSAL_EXCLUSIVE, name=0x7f3d73036058 "61", attr=0x7f3cfcf766e0, verifier=0x7f3cfcf767c8 "\241^\353-:\t",
obj=0x7f3cfcf767e0, attrs_out=0x0) at /data/home/richardyao/workspace/github/nfs-ganesha/src/FSAL/fsal_helper.c:406
#15 0x0000000000432a8c in fsal_open2 (in_obj=0x7f3d7cd7e738, state=0x7f3d73048d40, openflags=2, createmode=FSAL_EXCLUSIVE,
name=0x7f3d73036058 "61", attr=0x7f3cfcf766e0, verifier=0x7f3cfcf767c8 "\241^\353-:\t", obj=0x7f3cfcf767e0,
attrs_out=0x0) at /data/home/richardyao/workspace/github/nfs-ganesha/src/FSAL/fsal_helper.c:1788
#16 0x000000000047054d in open4_ex (arg=0x7f3cb918e918, data=0x7f3cfcf76950, res_OPEN4=0x7f3d73090128,
clientid=0x7f3cb8019100, owner=0x7f3cb201d200, file_state=0x7f3cfcf76898, new_state=0x7f3cfcf76897)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/Protocols/NFS/nfs4_op_open.c:1440
#17 0x000000000047191a in nfs4_op_open (op=0x7f3cb918e910, data=0x7f3cfcf76950, resp=0x7f3d73090120)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/Protocols/NFS/nfs4_op_open.c:1845
#18 0x000000000045c791 in nfs4_Compound (arg=0x7f3cb928b630, req=0x7f3cb928ae28, res=0x7f3cafc0e280)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:743
#19 0x000000000044b19d in nfs_rpc_execute (reqdata=0x7f3cb928ae00)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1290
#20 0x000000000044b9b7 in worker_run (ctx=0x7f3d3ffca680)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1562
---Type <return> to continue, or q <return> to quit---
#21 0x0000000000504bc5 in fridgethr_start_routine (arg=0x7f3d3ffca680)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/support/fridgethr.c:550
#22 0x00007f3d8fadedc5 in start_thread () from /lib64/libpthread.so.0
#23 0x00007f3d8ec9474d in clone () from /lib64/libc.so.6
(gdb)
(gdb) f 6
#6 0x00007f3d7f0a1110 in rgw::RGWFileHandle::reclaim (this=0x7f3d5e0db200)
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:1014
1014 fs->fh_cache.remove(fh.fh_hk.object, this, FHCache::FLAG_LOCK);
(gdb) print fs->fh_cache.part[2].lock
$7 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 2, __count = 0, __owner = 1742, __nusers = 1, __kind = 0,
__spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\000\000\000\000\316\006\000\000\001", '\000' <repeats 26 times>,
__align = 2}}, <No data fields>}
(gdb) |
fs->fh_cache.remove(fh.fh_hk.object, this, FHCache::FLAG_LOCK); | ||
/* in this case, we are being called from a context which holds | ||
* the partition lock */ | ||
fs->fh_cache.remove(fh.fh_hk.object, this, FHCache::FLAG_NONE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may be we should keep this as it is, and instead change RGWFileHandle::reclaim:
https://github.com/ceph/ceph/pull/20034/files#diff-b03934e90d5d464122f42ce5e12cca65R1014
The stack trace seems to have flag_lock in the call to fh_cache.remove. It
should be flag_none, yes?
I'll reread tomorrow. Thanks for testing.
Matt
…On Jan 20, 2018 1:37 AM, "Yao Zongyou" ***@***.***> wrote:
I have been testing this fix in my environment, still hangs, here is the
backtrace:
(gdb) Thread 135
[Switching to thread 135 (Thread 0x7f3cfcf7a700 (LWP 1742))]#6 0x00007f3d7f0a1110 in rgw::RGWFileHandle::reclaim (this=0x7f3d5e0db200)
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:1014
1014 fs->fh_cache.remove(fh.fh_hk.object, this, FHCache::FLAG_LOCK);
(gdb) bt#0 0x00007f3d8fae51bd in __lll_lock_wait () from /lib64/libpthread.so.0#1 0x00007f3d8fae0d02 in _L_lock_791 () from /lib64/libpthread.so.0#2 0x00007f3d8fae0c08 in pthread_mutex_lock () from /lib64/libpthread.so.0#3 0x00007f3d7f0b2df3 in __gthread_mutex_lock (__mutex=0x7f3d82045748)
at /data/home/richardyao/bin/gcc72/include/c++/7.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h:748#4 std::mutex::lock ***@***.***=0x7f3d82045748)
at /data/home/richardyao/bin/gcc72/include/c++/7.2.0/bits/std_mutex.h:103#5 0x00007f3d7f0ccd5d in cohort::lru::TreeX<rgw::RGWFileHandle, boost::intrusive::rbtree<rgw::RGWFileHandle, boost::intrusive::compare<rgw::RGWFileHandle::FhLT>, boost::intrusive::member_hook<rgw::RGWFileHandle, boost::intrusive::set_member_hook<boost::intrusive::link_mode<(boost::intrusive::link_mode_type)1>, void, void, void>, &rgw::RGWFileHandle::fh_hook>, void, void, void, void>, rgw::RGWFileHandle::FhLT, rgw::RGWFileHandle::FhEQ, rgw::fh_key, std::mutex>::remove (this=0x7f3d820cee10,
hk=5070069916126884224, ***@***.***=0x7f3d5e0db200, ***@***.***=1)
at /data/home/richardyao/workspace/github/ceph/src/common/cohort_lru.h:447#6 0x00007f3d7f0a1110 in rgw::RGWFileHandle::reclaim (this=0x7f3d5e0db200)
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:1014#7 0x00007f3d7f0d1db2 in evict_block (this=<optimized out>)
at /data/home/richardyao/workspace/github/ceph/src/common/cohort_lru.h:147#8 insert (flags=<synthetic pointer>, edge=cohort::lru::MRU, fac=<synthetic pointer>, this=<optimized out>)
at /data/home/richardyao/workspace/github/ceph/src/common/cohort_lru.h:239#9 rgw::RGWLibFS::lookup_fh ***@***.***=0x7f3d820cec00, ***@***.***=0x7f3d820cec20,
***@***.***=0x7f3d73036058 "61", ***@***.***=68)
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.h:1065#10 0x00007f3d7f0a5ec6 in rgw::RGWLibFS::create (this=0x7f3d820cec00, parent=0x7f3d820cec20, name=0x7f3d73036058 "61",
st=0x7f3cfcf75a80, mask=7, flags=0) at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:645#11 0x00007f3d7f0a62a8 in rgw_create (rgw_fs=<optimized out>, parent_fh=<optimized out>, name=<optimized out>,
st=<optimized out>, mask=<optimized out>, fh=0x7f3cfcf75998, posix_flags=193, flags=0)
---Type <return> to continue, or q <return> to quit---
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:1667#12 0x00007f3d7fdcb4c3 in rgw_fsal_open2 (obj_hdl=0x7f3d7ce0b200, state=0x7f3d73048d40, openflags=2,
createmode=FSAL_EXCLUSIVE, name=0x7f3d73036058 "61", attrib_set=0x7f3cfcf766e0, verifier=0x7f3cfcf767c8 "\241^\353-:\t",
new_obj=0x7f3cfcf75d00, attrs_out=0x7f3cfcf75c10, caller_perm_check=0x7f3cfcf75daf)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/FSAL/FSAL_RGW/handle.c:1060#13 0x0000000000535a48 in mdcache_open2 (obj_hdl=0x7f3d7cd7e738, state=0x7f3d73048d40, openflags=2,
createmode=FSAL_EXCLUSIVE, name=0x7f3d73036058 "61", attrs_in=0x7f3cfcf766e0, verifier=0x7f3cfcf767c8 "\241^\353-:\t",
new_obj=0x7f3cfcf767e0, attrs_out=0x0, caller_perm_check=0x7f3cfcf75daf)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_file.c:672#14 0x000000000042fda3 in open2_by_name (in_obj=0x7f3d7cd7e738, state=0x7f3d73048d40, openflags=2,
createmode=FSAL_EXCLUSIVE, name=0x7f3d73036058 "61", attr=0x7f3cfcf766e0, verifier=0x7f3cfcf767c8 "\241^\353-:\t",
obj=0x7f3cfcf767e0, attrs_out=0x0) at /data/home/richardyao/workspace/github/nfs-ganesha/src/FSAL/fsal_helper.c:406#15 0x0000000000432a8c in fsal_open2 (in_obj=0x7f3d7cd7e738, state=0x7f3d73048d40, openflags=2, createmode=FSAL_EXCLUSIVE,
name=0x7f3d73036058 "61", attr=0x7f3cfcf766e0, verifier=0x7f3cfcf767c8 "\241^\353-:\t", obj=0x7f3cfcf767e0,
attrs_out=0x0) at /data/home/richardyao/workspace/github/nfs-ganesha/src/FSAL/fsal_helper.c:1788#16 0x000000000047054d in open4_ex (arg=0x7f3cb918e918, data=0x7f3cfcf76950, res_OPEN4=0x7f3d73090128,
clientid=0x7f3cb8019100, owner=0x7f3cb201d200, file_state=0x7f3cfcf76898, new_state=0x7f3cfcf76897)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/Protocols/NFS/nfs4_op_open.c:1440#17 0x000000000047191a in nfs4_op_open (op=0x7f3cb918e910, data=0x7f3cfcf76950, resp=0x7f3d73090120)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/Protocols/NFS/nfs4_op_open.c:1845#18 0x000000000045c791 in nfs4_Compound (arg=0x7f3cb928b630, req=0x7f3cb928ae28, res=0x7f3cafc0e280)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/Protocols/NFS/nfs4_Compound.c:743#19 0x000000000044b19d in nfs_rpc_execute (reqdata=0x7f3cb928ae00)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1290#20 0x000000000044b9b7 in worker_run (ctx=0x7f3d3ffca680)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1562
---Type <return> to continue, or q <return> to quit---#21 0x0000000000504bc5 in fridgethr_start_routine (arg=0x7f3d3ffca680)
at /data/home/richardyao/workspace/github/nfs-ganesha/src/support/fridgethr.c:550#22 0x00007f3d8fadedc5 in start_thread () from /lib64/libpthread.so.0#23 0x00007f3d8ec9474d in clone () from /lib64/libc.so.6
(gdb)
(gdb) f 6#6 0x00007f3d7f0a1110 in rgw::RGWFileHandle::reclaim (this=0x7f3d5e0db200)
at /data/home/richardyao/workspace/github/ceph/src/rgw/rgw_file.cc:1014
1014 fs->fh_cache.remove(fh.fh_hk.object, this, FHCache::FLAG_LOCK);
(gdb) print fs->fh_cache.part[2].lock$7 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 2, __count = 0, __owner = 1742, __nusers = 1, __kind = 0,
__spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\000\000\000\000\316\006\000\000\001", '\000' <repeats 26 times>,
__align = 2}}, <No data fields>}
(gdb)
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#20034 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AASY56vjcXkpbHTCuE9YdU9VREwmQNHDks5tMYmXgaJpZM4RlACQ>
.
|
We should change the flag in RGWFileHandle::reclaim instead of RGWFileHandle::~RGWFileHandle()
|
Ah. That would do it.
Matt
…On Sat, Jan 20, 2018 at 1:56 AM, Yao Zongyou ***@***.***> wrote:
We should change the flag in RGWFileHandle::reclaim instead of
RGWFileHandle::~RGWFileHandle()
The stack trace seems to have flag_lock in the call to fh_cache.remove. It
should be flag_none, yes?
I'll reread tomorrow. Thanks for testing.
Matt
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#20034 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AASY57yJ3Qjubw0sX2rEIkkrGirYdMUaks5tMY4mgaJpZM4RlACQ>
.
--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103
http://www.redhat.com/en/technologies/storage
tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309
|
3cc7007
to
cae39c9
Compare
@yaozongyou update; It looks to me as if the cond-unlink belongs in the RGWFileHandle dtor, but NOT in the reclaim method; that method predates the factory method which I now have calling the RGWFileHandle dtor--which is more symmetrical; reclaim() should just indicate if we think an object must not be reclaimed; i.e., it seems clear that reclaim() and ~RGWFileHandle both attempted the conditional unlink--presumably in the reclaim path, we were always unlinking in reclaim first, but suggesting we don't anymore? |
cae39c9
to
7f27d3a
Compare
@yaozongyou actually, I think you're right that having unlink in reclaim() is preferable. updated |
This change is an alternate fix for two problems found and fixed by Yao Zongyou <yaozongyou@vip.qq.com>. The deadlock can be avoided just by not taking it in the recycle case, which invariantly holds the lock. The invalidation of the insert iterator by the recyle-path unlink we'd like to handle as a condition in order to preserve the cached insertion point optimization we get in the common case. (The original behavior was, indeed, incorrect.) Based on feedback from Yao, removed the RGWFileHandle dtor version of the unlink check, which I think happened twice. Signed-off-by: Matt Benjamin <mbenjamin@redhat.com>
7f27d3a
to
3cf0880
Compare
tested, it works greatly well in my environment. |
59d0740
to
3cf0880
Compare
(accidentally pushed additional commits, reverted to top commit 3cf0880) |
@mattbenjamin I'm guessing this is the fix for https://tracker.ceph.com/issues/22736 ? |
@smithfarm right, sorry to leave that off |
Fixes: https://tracker.ceph.com/issues/22736
This change is an alternate fix for two problems found and fixed
by Yao Zongyou yaozongyou@vip.qq.com.
The deadlock can be avoided just by not taking it in the recycle
case, which invariantly holds the lock.
The invalidation of the insert iterator by the recyle derived dtor
we'd like to handle as a condition in order to preserve the cached
insertion point optimization we get in the common case. (The
original behavior was, indeed, incorrect.)
Signed-off-by: Matt Benjamin mbenjamin@redhat.com