Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd-nbd: fix kernel deadlock during teuthology testing #10985

Merged
merged 4 commits into from
Sep 9, 2016

Conversation

dillaman
Copy link

@dillaman dillaman commented Sep 5, 2016

Fixes: http://tracker.ceph.com/issues/16921
Signed-off-by: Jason Dillaman dillaman@redhat.com

@dillaman
Copy link
Author

dillaman commented Sep 5, 2016

@trociny trociny self-assigned this Sep 6, 2016
trociny pushed a commit that referenced this pull request Sep 6, 2016
rbd-nbd: fix kernel deadlock during teuthology testing #10985
@@ -614,6 +614,8 @@ static int do_map()

server.start();
ioctl(nbd, NBD_DO_IT);
ioctl(nbd, NBD_CLEAR_QUE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dillaman NBD_CLEAR_QUE looks like deprecated and nop.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just put these in here to match qemu-nbd's implementation.

@trociny
Copy link
Contributor

trociny commented Sep 7, 2016

@dillaman Still observing hangs running this on teuthology:

http://pulpito.ceph.com/trociny-2016-09-07_07:30:28-rbd-wip-mgolub-testing---basic-mira/
http://pulpito.ceph.com/trociny-2016-09-07_07:29:56-rbd-wip-mgolub-testing---basic-vps/

Also, there were failures:

2016-09-07T07:44:18.420 INFO:teuthology.orchestra.run.vpm021.stdout:rbd_resize(65375232) failed
2016-09-07T07:44:18.420 INFO:teuthology.orchestra.run.vpm021.stdout:do_clone: ops->resize: Device or resource busy

http://pulpito.ceph.com/trociny-2016-09-07_07:31:00-rbd-wip-mgolub-testing---basic-vps/
Though it might be unrelated.

@dillaman
Copy link
Author

dillaman commented Sep 7, 2016

@trociny Definitely a different issue:

[ 7241.278129] block nbd0: Other side returned error (22)
[ 7241.283380] blk_update_request: I/O error, dev nbd0, sector 456576

This implies that rbd-nbd didn't receive an update notification:

$ rbd info pool_client.0/image_client.0-clone15
rbd image 'image_client.0-clone15':
size 176 MB in 45 objects

$ sudo blockdev --getsize64 /dev/nbd0
185068544

The last recorded fsx resize should have made the image 118,824,448 bytes:

2016-09-07T07:56:14.708 INFO:teuthology.orchestra.run.vpm107.stdout:1460 trunc from 0xd404c00 to 0x7151e00

The backtrace from the fsx process shows its in the middle of a write-induced resize, so there must be a race condition between resizing the image and a synchronous read/write operation noticing the change.

@dillaman
Copy link
Author

dillaman commented Sep 7, 2016

The resize is hanging because rbd-nbd (the exclusive lock owner) isn't responding -- and it isn't responding most likely because its librbd thread is deadlocked. It looks like you have gdb attached to it so I cannot see what its doing.

@trociny
Copy link
Contributor

trociny commented Sep 7, 2016

Sorry, yes I tried attaching with gdb but it hanged and letft the terminal opened. I have kiled my session.

@dillaman
Copy link
Author

dillaman commented Sep 7, 2016

@trociny Ok -- same thing is happening to me. I am thoroughly convinced that the nbd block driver has lots of edge conditions. Looking at the deadlock listed on mira023 (which is running a 4.8-rc5 kernel instead of stock), it is showing a deadlock in the nbd block driver shut down logic and looking at the code it is definitely a problem and I don't immediately see a way around it.

@dillaman
Copy link
Author

dillaman commented Sep 9, 2016

@trociny It appears to be stable now -- I updated the tests to not colocate nbd with the OSDs and I fixed the resize failure.

ceph/ceph-qa-suite#1170

@dillaman
Copy link
Author

dillaman commented Sep 9, 2016

trociny pushed a commit that referenced this pull request Sep 9, 2016
rbd-nbd: fix kernel deadlock during teuthology testing #10985
@@ -121,6 +121,9 @@ class ResizeRequest : public Request<ImageCtxT> {
Context *send_append_op_event();
Context *handle_append_op_event(int *result);

void send_flush_cache();
Context *handle_flush_cache(int *result);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dillaman Update the diagram?

Jason Dillaman added 4 commits September 9, 2016 08:21
Fixes: http://tracker.ceph.com/issues/16921
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Any potential writeback outside the extents of a shrunk image
would result in orphaned objects.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
@dillaman
Copy link
Author

dillaman commented Sep 9, 2016

@trociny Pushed updates -- the unrelated blacklisting test failure is fixed by PR #11034 and the dynamic_features failure is covered by open ticket http://tracker.ceph.com/issues/17227

@trociny
Copy link
Contributor

trociny commented Sep 9, 2016

lgtm

@trociny trociny merged commit 253cdda into ceph:master Sep 9, 2016
@dillaman dillaman deleted the wip-16921 branch September 9, 2016 13:34
@@ -727,6 +729,7 @@ static int rbd_nbd(int argc, const char *argv[])
env_to_vec(args);
global_init(NULL, args, CEPH_ENTITY_TYPE_CLIENT, CODE_ENVIRONMENT_DAEMON,
CINIT_FLAG_UNPRIVILEGED_DAEMON_DEFAULTS);
g_ceph_context->_conf->set_val_or_die("pid_file", "");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Jason (and all reading this), I'm working on starting rbd-nbd with systemd. And it is best to let rbd-nbd write a pid file so that systemd can reliably determine the main process. But this config is disabled here. See this comment for more. Do you have any suggestions? thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not go directly and add notify/watchdog support? That would make more sense I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion here. I remember I was not very happy when it was added and we had some discussion. On the other hand for usual cases when the rbd-nbd processes are controlled by rbd map/unmap, which does not need the pid files, it is good to have the pid file disabled by default, to avoid pid file collisions.

So I think a solution could be to set the pidfile only if it is explicitly specified via command line arguments. Though I don't think it is a good place to discuss this issue. I suggest to open a tracker ticket, describing the issue, and the provide a PR with the proposed solution, and we may discuss it there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants