Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileStore: Race condition during object delete is fixed #2510

Merged
merged 1 commit into from Sep 29, 2014

Conversation

somnathr
Copy link

There was a race condition (hence OSD crash) between lfn_unlink
and lfn_open. The reason was FDCache lookup was called without
taking index lock from lfn_open. Lookup will increase reference
count and thus Clear will not be able to delete those FDs. FDs
will be leaked. The assert within FDCache clear was hitting
because of this.

Fixes: #9480

Signed-off-by: Somnath Roy somnath.roy@sandisk.com

*outfd = fdcache.lookup(oid);
if (*outfd) {
((*index).index)->access_lock.put_write();
return 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be only if need_lock ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my bad.. Thanks for catching..
Fixing it..

@ghost
Copy link

ghost commented Sep 16, 2014

It would be nice to have unit tests covering these. In test/filestore/TestFileStore.cc maybe ?

@somnathr somnathr mentioned this pull request Sep 16, 2014
@somnathr
Copy link
Author

Loic, yes, I will write some tests on this. Basically, you want to incorporate different lfn_open() calls, right ?
Also, I am not that familiar about these tests. Are these tests getting triggered on 'make check' ?
If not, could you give some insight how to trigger (and test) my newly written tests ?

@ghost
Copy link

ghost commented Sep 16, 2014

@somnathr they are triggered on make check, indeed. The most difficult part (IMHO) is to create a light weight environment for these tests to run in a meaningfull way.

@somnathr
Copy link
Author

Loic,
I went over it but not sure how I will be calling function like lfn_open() from this level as it needs a collection id. Object id we can generate as we are doing in LFNIndex test. Also, I think testing the fuction like lfn_open() should be the responsibility of the upper level tests that are doing read/write/clone etc.
Let me know your thought on this.

There was a race condition (hence OSD crash) between lfn_unlink
and lfn_open. The reason was FDCache lookup was called without
taking index lock from lfn_open. Lookup will increase reference
count and thus Clear will not be able to delete those FDs. FDs
will be leaked. The assert within FDCache clear was hitting
because of this.

Fixes: ceph#9480

Signed-off-by: Somnath Roy <somnath.roy@sandisk.com>
@somnathr
Copy link
Author

Loic,
Addressed your comment on need_lock guard.

@ghost
Copy link

ghost commented Sep 16, 2014

You can find examples on how to create coll_t for test purposes in other tests.

@somnathr
Copy link
Author

Hmm, tests are using 'meta' container to read/write. So, I guess I can use that..

if (!replaying) {
*outfd = fdcache.lookup(oid);
if (*outfd) {
if (need_lock) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (*outfd && need_lock) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad, missed the return

@athanatos
Copy link
Contributor

wip-sam-testing

athanatos pushed a commit that referenced this pull request Sep 29, 2014
FileStore: Race condition during object delete is fixed

Reviewed-by: Samuel Just <sam.just@inktank.com>
@athanatos athanatos merged commit ffda34c into ceph:giant Sep 29, 2014
@athanatos
Copy link
Contributor

This appears to be correct and passed tests. Merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants