Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msg/simple: wait dispatch_queue until all pipes closed #9930

Merged
merged 1 commit into from Aug 29, 2016

Conversation

yuyuyu101
Copy link
Member

@yuyuyu101 yuyuyu101 commented Jun 25, 2016

Now we use dispatch_queue.wait to wait for SimpleMessenger shutdown,
but we need to ensure DispatchQueue can process event after Accepter down,
Otherwise accepter may continue to accept new connection which may queue
new item. so we can't rely on DispatchQueue now.

Introduce stop_cond and stop flag to indicate this function like
AsyncMessenger did

Signed-off-by: Haomai Wang haomai@xsky.com

@gregsfortytwo
Copy link
Member

If we don't stop the dispatch_queue until after all the pipes are closed...can't the DIspatcher try to re-open Connections in order to deliver message responses?
If we're worried about accepting new connections, maybe we need to re-order the Accepter shutdown, but that doesn't mean stopping the DispatchQueue way at the end of the process.

@yuyuyu101
Copy link
Member Author

@gregsfortytwo good comment, updated!

@gregsfortytwo
Copy link
Member

Okay, this is probably good now.

@tchaikov
Copy link
Contributor

tchaikov commented Jul 2, 2016

2016-07-01T23:28:37.185 INFO:tasks.ceph.mon.b.mira089.stderr:/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-11.0.0-252-gae36581/src/msg/simple/SimpleMessenger.cc: 230: FAILED assert(!cleared)
2016-07-01T23:28:37.190 INFO:tasks.ceph.mon.b.mira089.stderr: ceph version v11.0.0-252-gae36581 (ae3658114102a34fd78b7ba68640558df3c111fd)
2016-07-01T23:28:37.193 INFO:tasks.ceph.mon.b.mira089.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x1378e27]
2016-07-01T23:28:37.197 INFO:tasks.ceph.mon.b.mira089.stderr: 2: (SimpleMessenger::reaper()+0x35d) [0x1470187]
2016-07-01T23:28:37.201 INFO:tasks.ceph.mon.b.mira089.stderr: 3: (SimpleMessenger::wait()+0xacb) [0x1473671]
2016-07-01T23:28:37.203 INFO:tasks.ceph.mon.b.mira089.stderr: 4: (main()+0x4872) [0x109a4df]
2016-07-01T23:28:37.205 INFO:tasks.ceph.mon.b.mira089.stderr: 5: (__libc_start_main()+0xf5) [0x7f51e6c18ec5]
2016-07-01T23:28:37.207 INFO:tasks.ceph.mon.b.mira089.stderr: 6: ceph-mon() [0x10937f9]

see

dropped from the test branch for now

@yuriw
Copy link
Contributor

yuriw commented Jul 15, 2016

@yuyuyu101 @tchaikov I took it for testing as see lots of

11.0.0/src/msg/simple/SimpleMessenger.cc: 230: FAILED assert(!cleared)
2016-07-14T18:12:58.818 INFO:tasks.ceph.mon.b.smithi018.stderr: ceph version v11.0.0-633-g061bf04 (061bf04)
2016-07-14T18:12:58.823 INFO:tasks.ceph.mon.b.smithi018.stderr: 1: (ceph::ceph_assert_fail(char const, char const, int, char const*)+0x85) [0x8517e5]
2016-07-14T18:12:58.827 INFO:tasks.ceph.mon.b.smithi018.stderr: 2: (SimpleMessenger::reaper()+0x6ff) [0x8d033f]
2016-07-14T18:12:58.829 INFO:tasks.ceph.mon.b.smithi018.stderr: 3: (SimpleMessenger::wait()+0x49f) [0x8d0b7f]
2016-07-14T18:12:58.845 INFO:tasks.ceph.mon.b.smithi018.stderr: 4: (main()+0x268b) [0x5e756b]
2016-07-14T18:12:58.859 INFO:tasks.ceph.mon.b.smithi018.stderr: 5: (__libc_start_main()+0xf5) [0x7fda39754b15]
2016-07-14T18:12:58.867 INFO:tasks.ceph.mon.b.smithi018.stderr: 6: ceph-mon() [0x648a85]
2016-07-14T18:12:58.874 INFO:tasks.ceph.mon.b.smithi018.stderr: NOTE: a copy of the executable, or objdump -rdS <executable> is needed to interpret this.

in run http://pulpito.ceph.com/yuriw-2016-07-14_17:00:02-rados-wip-yuri-testing2_2016_7_14-distro-basic-smithi/
removed testing and needs-qa tags

Now we use dispatch_queue.wait to wait for SimpleMessenger shutdown,
but we need to ensure DispatchQueue can process event after Accepter down,
Otherwise accepter may continue to accept new connection which may queue
new item. so we can't rely on DispatchQueue now.

Introduce stop_cond and stop flag to indicate this function like
AsyncMessenger did

Fixes: http://tracker.ceph.com/issues/16472
Signed-off-by: Haomai Wang <haomai@xsky.com>
@yuyuyu101
Copy link
Member Author

retest this please

@yuyuyu101 yuyuyu101 deleted the wip-16472 branch August 30, 2016 02:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants