mds: Kill C_SaferCond in evict_sessions() #9971

Merged
merged 1 commit on Jul 6, 2016

fullerdj
Contributor

MDSRankDispatcher::evict_sessions waits on a C_SaferCond for
kill_session to complete on each of its victims. Change the
command handling flow to pass command messages all the way down
to MDSRankDispatcher. Extract the MDSDaemon's reply path into a
static function callable from a new context in the MDSRankDispatcher.

See: http://tracker.ceph.com/issues/16288
Signed-off-by: Douglas Fuller <dfuller@redhat.com>
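
As a rough illustration of the change described above, the sketch below replaces a blocking wait with a completion that fires once every per-session kill has finished, and sends the command reply from that completion. It is not the Ceph implementation: `Completion`, `Reply`, `Gather`, and `evict_async` are hypothetical stand-ins for Ceph's context and reply machinery, and the deferred callbacks simply simulate `kill_session` finishing later.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a one-shot completion callback.
using Completion = std::function<void(int)>;

// Hypothetical record of the command reply (stand-in for the reply path).
struct Reply {
  bool sent = false;
  int r = 0;
  std::string outs;
};

// Runs on_all_done exactly once, after n sub-operations have completed,
// so the caller never has to block on a condition variable.
class Gather {
public:
  Gather(std::size_t n, Completion on_all_done)
    : pending_(n), on_all_done_(std::move(on_all_done)) {}

  void complete_one(int r) {
    assert(pending_ > 0);
    if (r < 0 && result_ == 0)
      result_ = r;                 // remember the first error
    if (--pending_ == 0)
      on_all_done_(result_);       // last completion sends the reply
  }

private:
  std::size_t pending_;
  int result_ = 0;
  Completion on_all_done_;
};

// Queues one deferred "kill" per victim and returns immediately.
// The returned callbacks simulate each kill finishing at a later time.
std::vector<std::function<void()>> evict_async(const std::vector<int> &victims,
                                               Reply &reply) {
  std::vector<std::function<void()>> deferred;
  if (victims.empty()) {
    reply.sent = true;             // nothing to wait for
    return deferred;
  }
  auto gather = std::make_shared<Gather>(victims.size(), [&reply](int r) {
    reply.sent = true;
    reply.r = r;
    reply.outs = (r == 0) ? "" : "eviction failed";
  });
  for (std::size_t i = 0; i < victims.size(); ++i)
    deferred.push_back([gather] { gather->complete_one(0); });
  return deferred;
}
```

In this sketch the command handler returns as soon as the kills are queued, and the reply goes out only when the last deferred callback runs.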

@gregsfortytwo
Member

NB: the make check failure is in the bluestore unit tests.

@@ -587,6 +587,33 @@ void MDSDaemon::tick()
   }
 }

+void MDSDaemon::send_command_reply(MCommand *m, MDSRank *mds_rank,
+                                   int r, bufferlist outbl,
+                                   std::string outs)
Member

The string should probably be a const reference?
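
A small sketch of what this suggestion is about, using hypothetical helper functions rather than the actual `send_command_reply` signature: a `std::string` parameter taken by value is a separate object copied from the caller's string, while a `const std::string &` parameter refers to the caller's string in place.

```cpp
#include <string>

// Hypothetical helpers for illustration only; not Ceph code.

// By value: the callee works on its own copy of the caller's string,
// so the parameter can never be the caller's object.
bool is_callers_object_by_value(std::string outs, const std::string *caller) {
  return &outs == caller;
}

// By const reference: the callee reads the caller's string directly,
// with no copy made.
bool is_callers_object_by_cref(const std::string &outs,
                               const std::string *caller) {
  return &outs == caller;
}
```

Taking the parameter by value can still be reasonable when the callee needs to keep its own copy and the caller can move into it; for a parameter that is only read, a const reference avoids the copy.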

@gregsfortytwo
Member

Looks good to me; just a few nits. Since we test commands in the nightlies this should have plenty of coverage, but you did test it locally, right?

@fullerdj
Contributor Author

Comments addressed. Should I run this through a suite or something?

@gregsfortytwo
Member

There's probably something available under vstart_runner that tests the commands. If that passes we can mark it as needs-qa and let @jcsp or somebody bundle it up in a testing branch.
(Or you can schedule one if you like, especially if there's an empty queue!)

@jcsp
Contributor

jcsp commented Jun 30, 2016

I am seeing a hang inside evict sessions when running TestVolumeClient:
http://qa-proxy.ceph.com/teuthology/jspray-2016-06-29_18:28:44-fs-wip-jcsp-testing-20160629-distro-basic-mira/283892/teuthology.log

@fullerdj can you reproduce locally with vstart_runner?

@fullerdj
Contributor Author

Yes, I've been working on that.

fullerdj force-pushed the wip-djf-16288 branch 2 times, most recently from 71eda15 to 1da3d6a, on June 30, 2016 17:55
@fullerdj
Contributor Author

This resolves the hang I was seeing, but you may have encountered a different case. Yours looks nearly identical to http://tracker.ceph.com/issues/16288, which I have never been able to reproduce.

jcsp merged commit 835fdca into ceph:master on Jul 6, 2016
Labels
bug-fix cephfs Ceph File System