
mds: Client syncfs is slow (waits for next MDS tick) #15544

Merged: 2 commits, Jun 19, 2017

Conversation

@taodd (Contributor) commented Jun 7, 2017

tracker url: http://tracker.ceph.com/issues/20129
Signed-off-by: dongdong tao <tdd21151186@gmail.com>

@taodd taodd changed the title Fix issue: Client syncfs is slow (waits for next MDS tick) mds: Client syncfs is slow (waits for next MDS tick) Jun 7, 2017
@jcsp jcsp added bug-fix cephfs Ceph File System labels Jun 7, 2017
@jcsp jcsp requested a review from ukernel June 7, 2017 10:24
@@ -5799,6 +5817,7 @@ void Client::unmount()

while (!mds_requests.empty()) {
ldout(cct, 10) << "waiting on " << mds_requests.size() << " requests" << dendl;
flush_mdlog_sync();

Contributor:

We should only call flush_mdlog_sync() once, when there are pending requests or flushing caps. mount_cond can get signaled prematurely, and calling flush_mdlog_sync() each time mount_cond gets signaled creates unnecessary overhead on the MDS. Besides, we should call flush_mdlog_sync() after flushing dirty caps.

Contributor Author:

1. We are inside the loop "while (!mds_requests.empty())", so I think this means there is at least 1 pending request?
2. client_lock is still held here until the next statement, so I don't understand why mount_cond would get signaled prematurely.
3. If we don't add flush_mdlog_sync() here in this while loop, then the next statement (mount_cond.Wait(client_lock)) will still have to wait for the next MDS tick, right?

4. You said flushing dirty caps; flush_caps_sync() is used to flush dirty caps, right? And it is called later in Client::unmount.
Maybe I misunderstand your comment?

Contributor:

handle_client_reply() signals mount_cond. If there are N pending requests, flush_mdlog_sync() gets called N times.

Contributor:

My point is to reduce the messages/requests that trigger mdlog->flush to as few as possible.

Contributor Author:

Yeah, you are right, flush_mdlog_sync() in the while loop would be triggered N times. I should call it before the while loop and add a check in flush_mdlog_sync().
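
A minimal standalone sketch of that plan, assuming the fix simply hoists the call ahead of the wait loop and adds an early-out guard inside flush_mdlog_sync(); FakeClient, the std::mutex/std::condition_variable stand-ins and the empty-check guard are simplifications for illustration, not the actual Ceph implementation:

// Sketch only: call flush_mdlog_sync() once before the wait loop instead of
// on every mount_cond wakeup. FakeClient and the std:: synchronization types
// stand in for Ceph's Client, Mutex and Cond.
#include <condition_variable>
#include <cstdint>
#include <iostream>
#include <map>
#include <mutex>

struct FakeClient {
  std::mutex client_lock;                  // stands in for Client::client_lock
  std::condition_variable mount_cond;      // stands in for Client::mount_cond
  std::map<uint64_t, int> mds_requests;    // pending MDS requests (simplified)

  // Ask the MDS to flush its journal; skip the message when there is nothing
  // outstanding, so callers can invoke it unconditionally (the "add check in
  // flush_mdlog_sync" part of the plan, assumed here to be an empty check).
  void flush_mdlog_sync() {
    if (mds_requests.empty())
      return;
    std::cout << "sending one flush-mdlog request per MDS session\n";
  }

  void unmount() {
    std::unique_lock<std::mutex> l(client_lock);

    // Hoisted out of the loop: a reply completing one of N requests still
    // signals mount_cond, but no longer re-sends the flush request N times.
    flush_mdlog_sync();

    while (!mds_requests.empty()) {
      std::cout << "waiting on " << mds_requests.size() << " requests\n";
      mount_cond.wait(l);                  // woken by handle_client_reply()
    }
  }
};

int main() {
  FakeClient c;
  c.unmount();   // with no pending requests this returns immediately
}

With the call hoisted out of the loop, each wakeup only re-checks mds_requests instead of sending another flush to the MDS.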

@jcsp (Contributor) commented Jun 19, 2017

This has passed tests and is ready to merge. @taodd, please could you clean up the commit messages? They should be something like this:

client: signal MDS to flush log when doing a syncfs

<...more description text...>

Fixes: http://tracker.ceph.com/issues/20129
Signed-off-by: dongdong tao <tdd21151186@gmail.com>

Have a look at other recent commits in the repository for examples.

@taodd (Contributor Author) commented Jun 19, 2017

updated the commit message

@jcsp jcsp merged commit 18de794 into ceph:master Jun 19, 2017
@jcsp (Contributor) commented Jun 19, 2017

@ukernel @taodd does anyone have thoughts about how to handle upgrades? I should have asked this before merging, really :-), but we definitely need an answer before we release luminous -- MDSs currently abort if they see an unexpected CEPH_SESSION_* op in an MClientSession, so the client probably needs a way to avoid sending it to older MDSs.

@taodd (Contributor Author) commented Jun 19, 2017

Yes, we should. How do we identify older MDSs?

@jcsp (Contributor) commented Jun 21, 2017

Turns out we already had a handy luminous feature bit to check; I've created a PR to handle upgrades: #15805.
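
For illustration, a sketch of the kind of client-side guard being discussed, assuming the client checks a peer feature bit before sending the new flush-mdlog session op; FakeSession, has_feature(), FEATURE_SERVER_LUMINOUS and its bit value are hypothetical stand-ins rather than the real Ceph API, and the actual upgrade handling is in #15805:

// Sketch only: send the new session op only to MDSs that advertise a
// new-enough feature bit; otherwise fall back to waiting for the MDS tick.
#include <cstdint>
#include <iostream>

enum class SessionOp { REQUEST_FLUSH_MDLOG };   // models CEPH_SESSION_REQUEST_FLUSH_MDLOG

struct FakeSession {
  uint64_t peer_features = 0;                   // feature bits advertised by the MDS

  bool has_feature(uint64_t bit) const { return (peer_features & bit) != 0; }

  void send_session_msg(SessionOp) {
    std::cout << "sent flush-mdlog request\n";
  }
};

// Hypothetical stand-in for the luminous feature bit; the value is arbitrary.
constexpr uint64_t FEATURE_SERVER_LUMINOUS = 1ull << 38;

void flush_mdlog(FakeSession &s) {
  if (s.has_feature(FEATURE_SERVER_LUMINOUS)) {
    // New MDS: understands the session op, ask it to flush its log now.
    s.send_session_msg(SessionOp::REQUEST_FLUSH_MDLOG);
  } else {
    // An old MDS would abort on an unknown CEPH_SESSION_* op, so send nothing
    // and let the periodic MDS tick flush the log as before.
    std::cout << "old MDS, skipping flush-mdlog request\n";
  }
}

int main() {
  FakeSession old_mds, new_mds;
  new_mds.peer_features = FEATURE_SERVER_LUMINOUS;
  flush_mdlog(old_mds);   // falls back to the old behaviour
  flush_mdlog(new_mds);   // sends the new session request
}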
