mds: miscellaneous multimds fixes #13698

Merged
merged 17 commits into ceph:master on Apr 11, 2017

2 participants
@ukernel
Member

ukernel commented Feb 28, 2017

No description provided.

@ukernel ukernel changed the title from [DNM]mds: miscellaneous multimds fixes to mds: miscellaneous multimds fixes Mar 23, 2017

ukernel added some commits Feb 21, 2017

mds: properly record dirty sessionmap in log segment
A rename may dirty the sessionmap. If the sessionmap gets dirtied, its
version should be recorded in the corresponding log segment. Otherwise,
the sessionmap doesn't get flushed properly when log segments are trimmed.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
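
A minimal sketch of the idea in Python, not the actual MDS code. All class, field, and function names here are hypothetical stand-ins. It models a log segment that remembers the sessionmap version it depends on, so that trimming the segment forces the sessionmap to be flushed first.

```python
class SessionMap:
    """Toy model: 'version' is the in-memory version, 'committed' the last flushed one."""

    def __init__(self):
        self.version = 0
        self.committed = 0

    def mark_dirty(self):
        """Bump the in-memory version and return it."""
        self.version += 1
        return self.version

    def flush(self):
        """Persist the current in-memory version."""
        self.committed = self.version


class LogSegment:
    def __init__(self):
        # Highest sessionmap version dirtied by events in this segment (0 = none).
        self.sessionmapv = 0


def journal_rename(segment, sessionmap, dirties_sessionmap):
    """Journal a rename; record the sessionmap version if the rename dirtied it."""
    if dirties_sessionmap:
        segment.sessionmapv = max(segment.sessionmapv, sessionmap.mark_dirty())


def try_trim(segment, sessionmap):
    """A segment may be trimmed only once the sessionmap version it recorded is flushed."""
    if segment.sessionmapv > sessionmap.committed:
        sessionmap.flush()
    return segment.sessionmapv <= sessionmap.committed
```

In this model, if `journal_rename` skipped the recording step, `try_trim` would see `sessionmapv == 0` and let the segment go without flushing, which is the failure the commit message describes.
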
client: hold reference for newly updated snaprealm
Client::update_snap_trace() may create new snaprealms and then update
them. When Client::update_snap_trace() returns, the newly created
snaprealms get freed immediately. This is wrong because callers of
Client::update_snap_trace() expect Client::get_snap_realm() to return
the updated snaprealm.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
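
The reference-counting problem can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not the real client code: the `nref` counter, the `(ino, seq)` trace format, and the list return value are all hypothetical. The point is only that the update function keeps a reference on each realm it touched and hands those references to the caller.

```python
class SnapRealm:
    def __init__(self, ino):
        self.ino = ino
        self.nref = 0   # reference count (hypothetical field name)
        self.seq = 0


class Client:
    def __init__(self):
        self.snap_realms = {}

    def get_snap_realm(self, ino):
        """Look up or create the realm and take a reference on it."""
        realm = self.snap_realms.get(ino)
        if realm is None:
            realm = self.snap_realms[ino] = SnapRealm(ino)
        realm.nref += 1
        return realm

    def put_snap_realm(self, realm):
        """Drop a reference; the realm is freed when the count reaches zero."""
        realm.nref -= 1
        if realm.nref == 0:
            del self.snap_realms[realm.ino]

    def update_snap_trace(self, trace):
        """Apply (ino, seq) updates and return the realms with references still held.

        The caller owns the returned references and must put them when done.
        """
        held = []
        for ino, seq in trace:
            realm = self.get_snap_realm(ino)   # takes a reference
            realm.seq = max(realm.seq, seq)
            held.append(realm)                 # do NOT put here: that would free it
        return held
```

Because the references survive the call, a later `get_snap_realm()` finds the updated realm instead of creating a fresh, empty one.
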
mds: don't ask peer to traverse old backtrace after encountering error
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: fix mds gets stuck in clientreplay state
When a client request in the clientreplay queue finishes, we should call
MDSRank::queue_one_replay(). Otherwise the mds gets stuck in the
clientreplay state. There are several cases in which a client request in
the clientreplay queue finishes but MDSRank::queue_one_replay() does not
get called.

To make the code clearer, add a flag to MClientRequest to indicate
whether it's in the clientreplay queue.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
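
A rough Python model of the flag-based approach, assuming a single completion path. Only the name `queue_one_replay` comes from the commit message; the `Rank` and `Request` classes and the `queued_in_replay` flag are hypothetical.

```python
from collections import deque


class Request:
    def __init__(self, name):
        self.name = name
        self.queued_in_replay = False   # hypothetical flag on the request


class Rank:
    def __init__(self, replay_requests):
        self.replay_queue = deque(replay_requests)
        for req in self.replay_queue:
            req.queued_in_replay = True
        self.state = "clientreplay"
        self.finished = []

    def queue_one_replay(self):
        """Dispatch the next replayed request, or leave clientreplay if none remain."""
        if self.replay_queue:
            self.dispatch(self.replay_queue.popleft())
        else:
            self.state = "active"

    def dispatch(self, req):
        # In this toy model every request completes immediately.
        self.request_finish(req)

    def request_finish(self, req):
        """Single completion path: the flag decides whether to advance the replay queue."""
        self.finished.append(req.name)
        if req.queued_in_replay:
            req.queued_in_replay = False
            self.queue_one_replay()
```

Tying the follow-up call to a flag on the request, rather than to each individual completion site, means no completion path can forget to advance the queue.
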
mds: optimize state check of peer mds
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: don't break order of inter-dependent requests during mds recovers
If a recovering mds had replicated an object when it failed and a
scatterlock in the object was in MIX state, it's possible that the
recovering mds needs to take a wrlock on the scatterlock when it
replays unsafe requests. The survivor mds should delay taking rdlocks
on the scatterlock for new requests. Otherwise a new request may get
processed before the unsafe requests are replayed. For example:

The recovering mds is the auth mds of a dirfrag, and the survivor mds
is the auth mds of the corresponding inode. When running 'rm -rf' on
the directory, the rmdir request should get processed after the
recovering mds replays the unsafe unlink requests.

To handle this corner case, add a flag to ScatterLock to indicate
whether it was in MIX state when the recovering mds failed. If the flag
is set, delay taking rdlocks on the scatterlock until the recovering
mds becomes active.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
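
The delay logic described above can be sketched in Python as follows. This is an illustrative model only: the flag name `need_recover`, the string lock states, and the three handler functions are assumptions, not the MDS implementation.

```python
class ScatterLock:
    def __init__(self, state="SYNC"):
        self.state = state
        self.need_recover = False   # hypothetical flag name
        self.waiters = []           # requests delayed until the peer is active
        self.num_rdlocks = 0


def handle_mds_failure(lock, failed_mds_was_replica):
    """On peer failure, remember that the lock was in MIX while the peer held a replica."""
    if failed_mds_was_replica and lock.state == "MIX":
        lock.need_recover = True


def rdlock_start(lock, request):
    """Try to take an rdlock; park the request while the flag is set."""
    if lock.need_recover:
        lock.waiters.append(request)
        return False
    lock.num_rdlocks += 1
    return True


def handle_mds_active(lock):
    """The recovering peer is active again: clear the flag and return requests to retry."""
    lock.need_recover = False
    waiters, lock.waiters = lock.waiters, []
    return waiters
```

In the 'rm -rf' example, the rmdir would be parked by `rdlock_start` and retried only after `handle_mds_active` runs, by which point the recovering mds has replayed its unsafe unlinks.
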
mds: handle race between stray reintegration and rmdir
It's possible that stray reintegration tries to move a stray into a
removed directory.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: properly put PIN_IMPORTBOUND when import aborts
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: proper trim stopping mds's mdsdir inode
The previous code does not work when only the mdsdir inode is replicated
(the mdsdir dirfrag is not replicated).

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: trim stopping mds' log after exporting strays and subtrees
Exporting strays/subtrees needs to submit new log entries, so it is
better to trim the log after exporting strays/subtrees.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: handle target mds failure when stopping mds exports strays
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: close client sessions after stopping mds exports all subtrees
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: speed-up subtree exporting during mds shutdown
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
mds: move inode to proper snaprealm during rename
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
client: re-send request to other mds if target mds stops
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
client: avoid choosing stopped mds as request target
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
@jcsp

Contributor

jcsp commented Apr 10, 2017

Looks like there are a couple of new commits here, so will re-test.

@jcsp

Contributor

jcsp commented Apr 10, 2017

@ukernel please can you put any followups in a separate PR so that I can get this one tested and merged

@ukernel

Member

ukernel commented Apr 10, 2017

Which commits are not tested? I can remove them.

@jcsp

Contributor

jcsp commented Apr 10, 2017

It's the last three commits (i.e. everything up to and including 6e5b2fd is okay)

@ukernel

Member

ukernel commented Apr 11, 2017

new commits removed

@jcsp jcsp merged commit 93f1b90 into ceph:master Apr 11, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Unmodifed Submodules submodules for project are unmodified
default Build finished.