mds: Client syncfs is slow (waits for next MDS tick) #15544

Merged
jcsp merged 2 commits into ceph:master from taodd:master on Jun 19, 2017

3 participants
@taodd
Member

taodd commented Jun 7, 2017

tracker url: http://tracker.ceph.com/issues/20129
Signed-off-by: dongdong tao <tdd21151186@gmail.com>

@taodd taodd changed the title from "Fix issue: Client syncfs is slow (waits for next MDS tick)" to "mds: Client syncfs is slow (waits for next MDS tick)" Jun 7, 2017

@jcsp jcsp requested a review from ukernel Jun 7, 2017

@@ -5799,6 +5817,7 @@ void Client::unmount()

  while (!mds_requests.empty()) {
    ldout(cct, 10) << "waiting on " << mds_requests.size() << " requests" << dendl;
    flush_mdlog_sync();

@ukernel

ukernel Jun 7, 2017

Member

We should call flush_mdlog_sync() only once, and only when there are pending requests or flushing caps. mount_cond can be signaled prematurely, so calling flush_mdlog_sync() each time mount_cond is signaled creates unnecessary overhead on the MDS. Besides, we should call flush_mdlog_sync() after flushing dirty caps.

@taodd

taodd Jun 8, 2017

Author Member

1. We are inside the loop "while (!mds_requests.empty())", so I think this means there is at least one pending request?
2. client_lock is still held here until the next statement, so I don't understand why mount_cond would be signaled prematurely.
3. If we don't add flush_mdlog_sync() here, in this while loop, then the next statement (mount_cond.Wait(client_lock)) will still have to wait for the next MDS tick, right?
4. You mentioned flushing dirty caps. flush_caps_sync() is used to flush dirty caps, right? And it is called later in the function "Client::unmount".

Maybe I misunderstood your comment?

@ukernel

ukernel Jun 8, 2017

Member

handle_client_reply() signals mount_cond. If there are N pending requests, flush_mdlog_sync() gets called N times.

@ukernel

ukernel Jun 8, 2017

Member

My point is to keep the number of messages/requests that trigger mdlog->flush as low as possible.
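
To make the cost concrete, here is a minimal, single-threaded sketch of the loop as it stands in the diff. It is not Ceph code: `FakeClient`, `wait_for_reply()` and `flushes_inside_loop()` are hypothetical names, and the model assumes that every wake-up of mount_cond corresponds to exactly one reply completing one request.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the client state discussed above; not Ceph code.
struct FakeClient {
  std::size_t pending;             // models mds_requests.size()
  std::size_t flush_messages = 0;  // flush requests sent to the MDS

  explicit FakeClient(std::size_t n) : pending(n) {}

  // Models asking the MDS to flush its log.
  void flush_mdlog_sync() { ++flush_messages; }

  // Models one wake-up of mount_cond caused by a single reply:
  // exactly one pending request completes (an assumption of this sketch).
  void wait_for_reply() {
    assert(pending > 0);
    --pending;
  }
};

// Counts the flush messages sent when the flush sits inside the wait loop.
std::size_t flushes_inside_loop(std::size_t n) {
  FakeClient c(n);
  while (c.pending > 0) {
    c.flush_mdlog_sync();  // runs again after every wake-up
    c.wait_for_reply();
  }
  return c.flush_messages;
}
```

Under these assumptions, `flushes_inside_loop(n)` returns `n`: the number of flush messages grows with the number of pending requests, which is the overhead being objected to.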

@taodd

taodd Jun 8, 2017

Author Member

Yeah, you are right: flush_mdlog_sync() in the while loop would be triggered N times.
I should call it before the while loop and add a check in flush_mdlog_sync().
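
A rough sketch of that plan, again as a hypothetical model rather than the merged code: the flush moves in front of the wait loop, and flush_mdlog_sync() itself returns early when nothing is pending. The exact check added in the PR is not shown in this conversation, so the "no pending requests" test below is an assumption, as are the names `FakeClient`, `wait_for_requests()` and `flushes_for()`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the client; not Ceph code.
class FakeClient {
 public:
  explicit FakeClient(std::size_t pending) : pending_(pending) {}

  // Guarded flush: does nothing when no request is waiting on the MDS.
  // (Assumed form of the check; the real condition may differ.)
  void flush_mdlog_sync() {
    if (pending_ == 0)
      return;
    ++flush_messages_;
  }

  // Models the wait loop: flush once up front, then wait for each reply.
  void wait_for_requests() {
    flush_mdlog_sync();
    while (pending_ > 0)
      --pending_;  // one reply arrives per wake-up in this model
    assert(pending_ == 0);
  }

  std::size_t flush_messages() const { return flush_messages_; }

 private:
  std::size_t pending_;             // models mds_requests.size()
  std::size_t flush_messages_ = 0;  // flush requests sent to the MDS
};

// Convenience wrapper: flush messages sent while draining n pending requests.
std::size_t flushes_for(std::size_t n) {
  FakeClient c(n);
  c.wait_for_requests();
  return c.flush_messages();
}
```

In this model `flushes_for(5)` is 1 and `flushes_for(0)` is 0, so the MDS sees at most one flush request per wait, and callers can invoke flush_mdlog_sync() unconditionally.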

@ukernel

ukernel approved these changes Jun 8, 2017

@jcsp

Contributor

jcsp commented Jun 19, 2017

This has passed tests and is ready to merge. @taodd, could you please clean up the commit messages? They should look something like this:

client: signal MDS to flush log when doing a syncfs

<...more description text...>

Fixes: http://tracker.ceph.com/issues/20129
Signed-off-by: dongdong tao <tdd21151186@gmail.com>

Have a look at other recent commits in the repository for examples.

@taodd taodd force-pushed the taodd:master branch from 2468008 to 3fa5dcb Jun 19, 2017

taodd added some commits Jun 7, 2017

Client: signal MDS to flush log when doing a syncfs
Fixes: http://tracker.ceph.com/issues/20129
Signed-off-by: dongdong tao <tdd21151186@gmail.com>
client: signal MDS to flush log when doing a syncfs
Fixes: http://tracker.ceph.com/issues/20129
Signed-off-by: dongdong tao <tdd21151186@gmail.com>

@taodd taodd force-pushed the taodd:master branch from 3fa5dcb to 616e763 Jun 19, 2017

@taodd

Member Author

taodd commented Jun 19, 2017

Updated the commit message.

@jcsp jcsp merged commit 18de794 into ceph:master Jun 19, 2017

2 of 4 checks passed

arm64 make check: arm64 make check started
make check: running make check
Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
@jcsp

Contributor

jcsp commented Jun 19, 2017

@ukernel @taodd does anyone have any thoughts about how to handle upgrades? I really should have asked this before merging :-), but we definitely need an answer before we release luminous. MDSs currently abort if they see an unexpected CEPH_SESSION_* in an MClientSession, so the client probably needs a way to avoid sending it to older MDSs.

@taodd

Member Author

taodd commented Jun 19, 2017

Yes, we should. How do we identify older MDSs?

@jcsp

Contributor

jcsp commented Jun 21, 2017

It turns out we already had a handy luminous feature bit to check. I've created a PR to handle upgrades: #15805
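
The general shape of such a feature-bit check can be sketched as follows. This is an illustration only, not the change in #15805: the constant `SKETCH_FEATURE_LUMINOUS`, its bit position, and both function names are hypothetical, and the sketch assumes the client knows the feature mask advertised by each MDS.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit value chosen for this sketch only; the real feature
// bits are defined in the Ceph source tree.
constexpr std::uint64_t SKETCH_FEATURE_LUMINOUS = 1ULL << 3;

// True when every bit in `feature` is set in the peer's advertised mask.
bool peer_has_feature(std::uint64_t peer_features, std::uint64_t feature) {
  assert(feature != 0);
  return (peer_features & feature) == feature;
}

// Decide whether the new session op may be sent to this MDS. An older MDS
// that lacks the bit never sees the unexpected op, so it cannot abort on it.
bool can_request_mdlog_flush(std::uint64_t mds_features) {
  return peer_has_feature(mds_features, SKETCH_FEATURE_LUMINOUS);
}
```

With a gate like this, a client talking to an older MDS would simply skip the flush request and fall back to waiting for the next MDS tick, which is the pre-PR behaviour.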
