mds: Client syncfs is slow (waits for next MDS tick) #15544
Conversation
src/client/Client.cc

```diff
@@ -5799,6 +5817,7 @@ void Client::unmount()
   while (!mds_requests.empty()) {
     ldout(cct, 10) << "waiting on " << mds_requests.size() << " requests" << dendl;
+    flush_mdlog_sync();
```
We should only call flush_mdlog_sync() once when there are pending requests or caps to flush. mount_cond can get signaled prematurely, and calling flush_mdlog_sync() each time mount_cond gets signaled creates unnecessary overhead on the MDS. Besides, we should call flush_mdlog_sync() after flushing dirty caps.
1. We are inside the loop `while (!mds_requests.empty())`, so doesn't that mean there is at least one pending request?
2. client_lock is still held here until the next statement, so I don't understand why mount_cond would get signaled prematurely.
3. If we don't call flush_mdlog_sync() here, inside the while loop, then the next statement (mount_cond.Wait(client_lock)) will still have to wait for the next MDS tick, right?
4. You mentioned flushing dirty caps; flush_caps_sync() is what flushes dirty caps, right? And it is already called later in Client::unmount().

Maybe I misunderstood your comment?
handle_client_reply() signals mount_cond. If there are N pending requests, flush_mdlog_sync() gets called N times.
My point is to reduce the number of messages/requests that trigger mdlog->flush to as few as possible.
Yeah, you are right, flush_mdlog_sync() in the while loop would be triggered N times.
I should call it once before the while loop and add a check in flush_mdlog_sync().
This has passed tests and is ready to merge. @taodd, please could you clean up the commit messages? They should be something like this:
Have a look at other recent commits in the repository for examples.
Fixes: http://tracker.ceph.com/issues/20129 Signed-off-by: dongdong tao <tdd21151186@gmail.com>
Updated the commit message.
@ukernel @taodd anyone have any thoughts about how to handle upgrades? I should have asked this before merging really :-), but we definitely need an answer before we release luminous -- MDSs currently abort if they see an unexpected CEPH_SESSION_* in a MClientSession, so the client probably needs a way to avoid sending it to older MDSs.
Yes, we should. How do we identify older MDSs?
Turns out we already had a handy luminous feature bit to check; I've created a PR to handle upgrades: #15805
tracker url: http://tracker.ceph.com/issues/20129
Signed-off-by: dongdong tao tdd21151186@gmail.com