os/ObjectStore: properly clear object map when replaying OP_REMOVE #11388

Merged: 2 commits into ceph:master from ukernel:wip-17177, Oct 24, 2016

@ukernel (Member) commented Oct 10, 2016

To remove an object, FileStore needs to unlink the corresponding object
file from the filesystem and remove the corresponding object keys from
DBObjectMap. When replaying an OP_REMOVE operation, it's possible that
the operation completed only partially: the object file has been
deleted, but the object keys in DBObjectMap have not.

The fix is to force-clear the object keys if the object file does not exist.

Fixes: http://tracker.ceph.com/issues/17177
Signed-off-by: Yan, Zheng <zyan@redhat.com>

@tchaikov self-assigned this Oct 10, 2016

fdcache.clear(o);
return 0;
} else if (hardlink == 1) {
if (hardlink == 0 || hardlink == 1) {

@tchaikov (Contributor) commented Oct 10, 2016

We always clear the omap first and then unlink the object file. So if hardlink is 1 here, we should have cleared the omap already, am I right?

@ukernel (Member, Author) commented Oct 14, 2016

Why? We get the object file's hardlink count before clearing the omap and unlinking the object file. hardlink == 1 is the most common case.

@liewegas (Member) commented Oct 14, 2016

So, to clarify, the situation is:

  • clear omap
  • unlink
  • unlink persists to disk, but omap does not
  • crash
  • replay sees hardlink 0 and doesn't clear omap

?

@ukernel (Member, Author) commented Oct 15, 2016

yes
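
The sequence confirmed above can be sketched as a toy model. This is only an illustration under stated assumptions: `DiskState`, `crash_after_unlink`, and `replay_remove` are hypothetical names, the two conditions are taken from the diff lines quoted in this thread (`hardlink == 1` before the patch, `hardlink == 0 || hardlink == 1` after), and leaving the omap alone when more than one link exists is an assumption of the sketch.

```cpp
#include <cassert>

// Toy model of what is durable on disk; not Ceph's real types.
struct DiskState {
    int hardlink;        // link count of the object file (0 = file is gone)
    bool omap_has_keys;  // object keys still present in the object map
};

// Models the first four steps: the omap clear is issued but never reaches
// disk, the unlink does reach disk, and then the OSD crashes.
DiskState crash_after_unlink(DiskState s) {
    assert(s.hardlink == 1);  // the common case discussed in this thread
    s.hardlink = 0;           // unlink persisted
    // The omap clear was lost, so omap_has_keys is left unchanged.
    return s;
}

// Models replaying OP_REMOVE against the on-disk state.
// `patched` selects the condition introduced by this PR.
DiskState replay_remove(DiskState s, bool patched) {
    const bool clear_omap = patched ? (s.hardlink == 0 || s.hardlink == 1)
                                    : (s.hardlink == 1);
    if (clear_omap) {
        s.omap_has_keys = false;
    }
    if (s.hardlink > 0) {
        s.hardlink -= 1;  // unlink the file if it is still there
    }
    return s;
}
```

In this model, replaying against the crashed state leaves stale keys behind with the old condition and clears them with the new one.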

@ukernel force-pushed the ukernel:wip-17177 branch from f60b2f9 to f56288e Oct 10, 2016

ukernel added some commits Oct 10, 2016

os/ObjectStore: properly clear object map when replaying OP_REMOVE
os/ObjectStore: properly clone object map when replaying OP_COLL_MOVE_RENAME

FileStore::_close_replay_guard does not sync the object map. If the OSD
crashes while executing FileStore::_collection_move_rename, it's
possible that the replay guard is set, but the object map update
gets lost. When recovering, the OSD checks the replay guard and does
nothing.

The fix is to sync the object map in FileStore::_close_replay_guard()

Signed-off-by: Yan, Zheng <zyan@redhat.com>
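
The ordering problem described in this commit message can be sketched with a small model. It is a simplification under stated assumptions: `Durable`, `run_then_crash`, and `recover` are hypothetical names, and each field stands for state that survives a crash rather than any real FileStore structure.

```cpp
#include <cassert>

// Toy model of crash-surviving state; not Ceph's real types.
struct Durable {
    bool guard_closed = false;  // replay guard recorded for this operation
    bool omap_cloned = false;   // object map update made by the rename
};

// Runs the move/rename up to closing the replay guard, then "crashes".
// `sync_omap_on_close` models the patched _close_replay_guard().
Durable run_then_crash(bool sync_omap_on_close) {
    Durable disk;
    const bool omap_cloned_in_memory = true;  // applied, but not yet synced
    if (sync_omap_on_close) {
        disk.omap_cloned = omap_cloned_in_memory;  // sync before closing
    }
    disk.guard_closed = true;  // the guard itself reaches disk
    return disk;               // crash: anything unsynced is lost
}

// On recovery, the operation is re-applied only if the guard is not closed.
Durable recover(Durable disk) {
    if (!disk.guard_closed) {
        disk.omap_cloned = true;
        disk.guard_closed = true;
    }
    assert(disk.guard_closed);
    return disk;
}
```

Without the sync, recovery sees a closed guard and skips the operation, so the object map update stays lost; with the sync, the update is already durable by the time the guard is closed.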

@ukernel force-pushed the ukernel:wip-17177 branch from f56288e to c66e466 Oct 13, 2016

@ukernel (Member, Author) commented Oct 14, 2016

@liewegas added the needs-qa label Oct 15, 2016

@liewegas (Member) commented Oct 15, 2016

lgtm!

@athanatos (Contributor) commented Oct 17, 2016

lgtm! (Whoa, good catch!)

@tchaikov (Contributor) commented Oct 18, 2016

lgtm also.

@badone (Contributor) commented Oct 18, 2016

Thanks, @ukernel. Hopefully we can get this merged soon.

@yuriw merged commit 73a1b45 into ceph:master Oct 24, 2016

2 checks passed:

  • Signed-off-by: all commits in this PR are signed
  • default: Build finished.

@ukernel deleted the ukernel:wip-17177 branch Jan 12, 2017
