FileStore: Race condition during object delete is fixed #2510
*outfd = fdcache.lookup(oid);
if (*outfd) {
  ((*index).index)->access_lock.put_write();
  return 0;
shouldn't this be only if need_lock ?
Yes, my bad. Thanks for catching that.
Fixing it.
It would be nice to have unit tests covering these. In test/filestore/TestFileStore.cc, maybe?
Loic, yes, I will write some tests for this. Basically, you want the tests to exercise the different lfn_open() calls, right?
@somnathr they are triggered on make check, indeed. The most difficult part (IMHO) is creating a lightweight environment in which these tests can run in a meaningful way.
Loic,
There was a race condition (hence OSD crash) between lfn_unlink
and lfn_open. The reason was FDCache lookup was called without
taking index lock from lfn_open. Lookup will increase reference
count and thus Clear will not be able to delete those FDs. FDs
will be leaked. The assert within FDCache clear was hitting
because of this.
Fixes: ceph#9480
Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
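The locking rule from the commit message and the need_lock review comments can be sketched in a small single-threaded model. This is only an illustration under stated assumptions, not the FileStore code: `ToyFDCache`, `ToyIndexLock`, `ToyStore`, `toy_open` and `toy_unlink` are hypothetical names, the lock is a plain counter rather than a real read/write lock, and `clear()` returns false where the real FDCache clear is described as hitting an assert. It does not reproduce the race itself. It only shows that a reference handed out by lookup blocks clear, and that the lock is released on a cache hit only when this call took it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a cached file descriptor handle.
using FDRef = std::shared_ptr<int>;

// Toy cache: lookup() hands out a shared reference, so the reference
// count of an entry grows while any caller still holds the result.
class ToyFDCache {
  std::map<std::string, FDRef> entries;
public:
  void add(const std::string &oid, int fd) {
    entries[oid] = std::make_shared<int>(fd);
  }
  FDRef lookup(const std::string &oid) {
    auto it = entries.find(oid);
    return it == entries.end() ? FDRef() : it->second;
  }
  // Returns false when a caller still holds a reference to the entry
  // (assumption: the real code asserts in this situation).
  bool clear(const std::string &oid) {
    auto it = entries.find(oid);
    if (it == entries.end()) return true;
    if (it->second.use_count() > 1) return false;
    entries.erase(it);
    return true;
  }
};

// Counter standing in for the index lock so the state can be inspected.
struct ToyIndexLock {
  int writers = 0;
  void get_write() { ++writers; }
  void put_write() { assert(writers > 0); --writers; }
};

struct ToyStore {
  ToyFDCache fdcache;
  ToyIndexLock access_lock;

  // Look up under the index lock; release it only if this call took it.
  int toy_open(const std::string &oid, bool need_lock, FDRef *outfd) {
    if (need_lock) access_lock.get_write();
    *outfd = fdcache.lookup(oid);
    if (*outfd) {
      if (need_lock) access_lock.put_write();
      return 0;
    }
    // Cache miss: real code would open the file; the sketch just fails.
    if (need_lock) access_lock.put_write();
    return -1;
  }

  // Delete path: clear the cache entry while holding the index lock.
  bool toy_unlink(const std::string &oid) {
    access_lock.get_write();
    bool cleared = fdcache.clear(oid);
    access_lock.put_write();
    return cleared;
  }
};
```

With need_lock set to false the caller is assumed to hold the lock already, which is why an unconditional put_write() on the cache-hit path would unbalance it.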
295448a to 86a4bed
Loic,
You can find examples of how to create a coll_t for test purposes in other tests.
Hmm, the tests use the 'meta' container to read/write, so I guess I can use that.
if (!replaying) {
  *outfd = fdcache.lookup(oid);
  if (*outfd) {
    if (need_lock) {
if (*outfd && need_lock) ?
my bad, missed the return
wip-sam-testing
FileStore: Race condition during object delete is fixed
Reviewed-by: Samuel Just <sam.just@inktank.com>
This appears to be correct and passed tests. Merging.