Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: dbus: Prevent lrmd from hanging on dbus calls #1201

Merged
merged 1 commit into from Dec 15, 2016

Conversation

gao-yan
Copy link
Member

@gao-yan gao-yan commented Dec 14, 2016

With "service" class of resources, by chance, lrmd hangs on futex()
syscall:

root@node2:~ # cat /proc/2503/stack
[] futex_wait_queue_me+0xc0/0x130
[] futex_wait+0x163/0x250
[] do_futex+0xe0/0x540
[] SyS_futex+0x6e/0x140
[] entry_SYSCALL_64_fastpath+0x12/0x6d
[] 0xffffffffffffffff

Cluster no longer behaves and cannot recover from the situation.

According to the backtrace, it seems due to the reentrancy of
dbus_connection_dispatch():

(gdb) bt
#0 0x00007f07f7d2e0af in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f07f6c29925 in _dbus_connection_acquire_dispatch (connection=0x13411f0) at dbus-connection.c:4142
#2 0x00007f07f6c2b3bc in dbus_connection_dispatch (connection=connection@entry=0x13411f0) at dbus-connection.c:4577
#3 0x00007f07f8d88e50 in pcmk_dbus_connection_dispatch (connection=connection@entry=0x13411f0, new_status=new_status@entry=DBUS_DISPATCH_DATA_REMAINS, data=data@entry=0x0) at dbus.c:410
#4 0x00007f07f6c29b70 in _dbus_connection_update_dispatch_status_and_unlock (connection=0x13411f0, new_status=DBUS_DISPATCH_DATA_REMAINS) at dbus-connection.c:4346
#5 0x00007f07f6c29f79 in check_for_reply_and_update_dispatch_unlocked (connection=connection@entry=0x13411f0, pending=pending@entry=0x135a8b0) at dbus-connection.c:2355
#6 0x00007f07f6c2a08b in _dbus_connection_block_pending_call (pending=0x135a8b0) at dbus-connection.c:2461
#7 0x00007f07f6c396ba in dbus_pending_call_block (pending=) at dbus-pending-call.c:741
#8 0x00007f07f8d8929c in pcmk_dbus_send_recv (msg=msg@entry=0x1340940, connection=0x13411f0, error=error@entry=0x7ffc5d148fc0, timeout=-1) at dbus.c:141
#9 0x00007f07f8d8d2d7 in systemd_unit_by_name (arg_name=arg_name@entry=0x133dcb0 "service", op=op@entry=0x0) at systemd.c:296
#10 0x00007f07f8d8d45b in systemd_unit_exists (name=name@entry=0x133dcb0 "service") at systemd.c:416
#11 0x00007f07f8d83dc5 in resources_find_service_class (agent=0x133dcb0 "service") at services.c:88
#12 0x0000000000405b05 in action_complete (action=0x134e0b0) at lrmd.c:876
#13 0x00007f07f8d867e3 in operation_finalize (op=0x134e0b0) at services_linux.c:257
#14 0x00007f07f8d899d8 in pcmk_dbus_lookup_result (reply=reply@entry=0x135cc80, data=data@entry=0x1355e30) at dbus.c:289
#15 0x00007f07f8d89ba4 in pcmk_dbus_lookup_cb (pending=, user_data=0x1355e30) at dbus.c:334
#16 0x00007f07f6c28032 in complete_pending_call_and_unlock (connection=0x13411f0, pending=0x135a2c0, message=) at dbus-connection.c:2331
#17 0x00007f07f6c2b401 in dbus_connection_dispatch (connection=connection@entry=0x13411f0) at dbus-connection.c:4626
#18 0x00007f07f8d88e50 in pcmk_dbus_connection_dispatch (connection=connection@entry=0x13411f0, new_status=new_status@entry=DBUS_DISPATCH_DATA_REMAINS, data=data@entry=0x0) at dbus.c:410
#19 0x00007f07f6c29b70 in _dbus_connection_update_dispatch_status_and_unlock (connection=0x13411f0, new_status=DBUS_DISPATCH_DATA_REMAINS) at dbus-connection.c:4346
#20 0x00007f07f6c29ca6 in _dbus_connection_handle_watch (watch=, condition=1, data=0x13411f0) at dbus-connection.c:1520
#21 0x00007f07f6c40f2a in dbus_watch_handle (watch=watch@entry=0x133d6a0, flags=flags@entry=1) at dbus-watch.c:722
#22 0x00007f07f8d887da in pcmk_dbus_watch_dispatch (userdata=0x133d6a0) at dbus.c:448
#23 0x00007f07f8fcfef7 in mainloop_gio_callback (gio=, condition=G_IO_IN, data=0x133f210) at mainloop.c:673
#24 0x00007f07f82a0015 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#25 0x00007f07f82a0388 in ?? () from /usr/lib64/libglib-2.0.so.0
#26 0x00007f07f82a064a in g_main_loop_run () from /usr/lib64/libglib-2.0.so.0
#27 0x0000000000402c0e in main (argc=, argv=0x7ffc5d149818) at main.c:476

As described in:
https://dbus.freedesktop.org/doc/api/html/group__DBusConnection.html#ga55ff88cd22c0672441c7deffbfb68fbf

, dbus_connection_dispatch() MUST NOT BE CALLED from inside the
DBusDispatchStatusFunction. It seems that pcmk_dbus_watch_dispatch() is
an appropriate place to do it instead.

@gao-yan
Copy link
Member Author

gao-yan commented Dec 14, 2016

I just realized there's the relevant discussion on this in #1140. Not sure if you already have any progress on this. In case not, this is an idea on the potential solution. I'll do more test on it.

@kgaillot
Copy link
Contributor

I just realized there's the relevant discussion on this in #1140. Not sure if you already have any progress on this. In case not, this is an idea on the potential solution. I'll do more test on it.

I was about to mention that comment. I've been unable to spend time on my systemd project (I just finished another big project, and was going to pick it up again soon), so this is a very helpful start. :)


while (dbus_connection_get_dispatch_status(connection) == DBUS_DISPATCH_DATA_REMAINS) {
crm_trace("Dispatching for connection %p", connection);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the unlikely case that we get DBUS_DISPATCH_NEED_MEMORY here, we'll end up freeing the list entry even though we didn't dispatch it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's supposed to change back to DBUS_DISPATCH_DATA_REMAINS once it's ready again. So it's always a ready-to-dispatch list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but if it returns NEED_MEMORY, the loop terminates, and the entire list is freed -- so we'll never check it again, even if enough memory gets freed to do so.

Copy link
Member Author

@gao-yan gao-yan Dec 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list is freed because it's no longer ready to dispatch ;) As their documentation says, the DBusDispatchStatusFunction will "be invoked when the dispatch status changes". So if there's enough memory later, the dispatch status should be changed back to DBUS_DISPATCH_DATA_REMAINS. In our case, pcmk_dbus_connection_dispatch_status() is supposed to be invoked again, which will add the connection back into the list again. Probably I'm misunderstanding it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't think of that. Makes sense.

if (new_status == DBUS_DISPATCH_DATA_REMAINS){
dbus_connection_dispatch(connection);
conn_dispatches = g_list_append(conn_dispatches, connection);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

g_list_append() has really bad performance, so unless the order is important, we should always prefer g_list_prepend()

@@ -507,6 +523,8 @@ pcmk_dbus_watch_dispatch(gpointer userdata)
crm_err("DBus encountered OOM while attempting to dispatch %p (%s)",
client, dbus_watch_flags_to_string(flags));
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should have an "else" here -- if we don't have enough memory to do dbus_watch_handle(), we probably won't have enough to dispatch either, and might as well wait for the next time. although, it's less than ideal already, that if we get oom, we wait until there's more data on the fd before trying again. of course, if we get oom, we'll probably segfault or fast-exit before long anyway, so maybe it's not worth worrying about

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

With "service" class of resources, by chance, lrmd hangs on futex()
syscall:

root@node2:~ # cat /proc/2503/stack
[<ffffffff810fa0c0>] futex_wait_queue_me+0xc0/0x130
[<ffffffff810faf23>] futex_wait+0x163/0x250
[<ffffffff810fc870>] do_futex+0xe0/0x540
[<ffffffff810fcd3e>] SyS_futex+0x6e/0x140
[<ffffffff815e142e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[<ffffffffffffffff>] 0xffffffffffffffff

Cluster no longer behaves and cannot recover from the situation.

According to the backtrace, it seems due to the reentrancy of
dbus_connection_dispatch():

 (gdb) bt
 #0  0x00007f07f7d2e0af in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
 ClusterLabs#1  0x00007f07f6c29925 in _dbus_connection_acquire_dispatch (connection=0x13411f0) at dbus-connection.c:4142
 #2  0x00007f07f6c2b3bc in dbus_connection_dispatch (connection=connection@entry=0x13411f0) at dbus-connection.c:4577
 #3  0x00007f07f8d88e50 in pcmk_dbus_connection_dispatch (connection=connection@entry=0x13411f0, new_status=new_status@entry=DBUS_DISPATCH_DATA_REMAINS, data=data@entry=0x0) at dbus.c:410
 ClusterLabs#4  0x00007f07f6c29b70 in _dbus_connection_update_dispatch_status_and_unlock (connection=0x13411f0, new_status=DBUS_DISPATCH_DATA_REMAINS) at dbus-connection.c:4346
 ClusterLabs#5  0x00007f07f6c29f79 in check_for_reply_and_update_dispatch_unlocked (connection=connection@entry=0x13411f0, pending=pending@entry=0x135a8b0) at dbus-connection.c:2355
 ClusterLabs#6  0x00007f07f6c2a08b in _dbus_connection_block_pending_call (pending=0x135a8b0) at dbus-connection.c:2461
 ClusterLabs#7  0x00007f07f6c396ba in dbus_pending_call_block (pending=<optimized out>) at dbus-pending-call.c:741
 ClusterLabs#8  0x00007f07f8d8929c in pcmk_dbus_send_recv (msg=msg@entry=0x1340940, connection=0x13411f0, error=error@entry=0x7ffc5d148fc0, timeout=-1) at dbus.c:141
 ClusterLabs#9  0x00007f07f8d8d2d7 in systemd_unit_by_name (arg_name=arg_name@entry=0x133dcb0 "service", op=op@entry=0x0) at systemd.c:296
 ClusterLabs#10 0x00007f07f8d8d45b in systemd_unit_exists (name=name@entry=0x133dcb0 "service") at systemd.c:416
 ClusterLabs#11 0x00007f07f8d83dc5 in resources_find_service_class (agent=0x133dcb0 "service") at services.c:88
 ClusterLabs#12 0x0000000000405b05 in action_complete (action=0x134e0b0) at lrmd.c:876
 ClusterLabs#13 0x00007f07f8d867e3 in operation_finalize (op=0x134e0b0) at services_linux.c:257
 ClusterLabs#14 0x00007f07f8d899d8 in pcmk_dbus_lookup_result (reply=reply@entry=0x135cc80, data=data@entry=0x1355e30) at dbus.c:289
 ClusterLabs#15 0x00007f07f8d89ba4 in pcmk_dbus_lookup_cb (pending=<optimized out>, user_data=0x1355e30) at dbus.c:334
 ClusterLabs#16 0x00007f07f6c28032 in complete_pending_call_and_unlock (connection=0x13411f0, pending=0x135a2c0, message=<optimized out>) at dbus-connection.c:2331
 ClusterLabs#17 0x00007f07f6c2b401 in dbus_connection_dispatch (connection=connection@entry=0x13411f0) at dbus-connection.c:4626
 ClusterLabs#18 0x00007f07f8d88e50 in pcmk_dbus_connection_dispatch (connection=connection@entry=0x13411f0, new_status=new_status@entry=DBUS_DISPATCH_DATA_REMAINS, data=data@entry=0x0) at dbus.c:410
 ClusterLabs#19 0x00007f07f6c29b70 in _dbus_connection_update_dispatch_status_and_unlock (connection=0x13411f0, new_status=DBUS_DISPATCH_DATA_REMAINS) at dbus-connection.c:4346
 ClusterLabs#20 0x00007f07f6c29ca6 in _dbus_connection_handle_watch (watch=<optimized out>, condition=1, data=0x13411f0) at dbus-connection.c:1520
 ClusterLabs#21 0x00007f07f6c40f2a in dbus_watch_handle (watch=watch@entry=0x133d6a0, flags=flags@entry=1) at dbus-watch.c:722
 ClusterLabs#22 0x00007f07f8d887da in pcmk_dbus_watch_dispatch (userdata=0x133d6a0) at dbus.c:448
 ClusterLabs#23 0x00007f07f8fcfef7 in mainloop_gio_callback (gio=<optimized out>, condition=G_IO_IN, data=0x133f210) at mainloop.c:673
 ClusterLabs#24 0x00007f07f82a0015 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
 ClusterLabs#25 0x00007f07f82a0388 in ?? () from /usr/lib64/libglib-2.0.so.0
 ClusterLabs#26 0x00007f07f82a064a in g_main_loop_run () from /usr/lib64/libglib-2.0.so.0
 ClusterLabs#27 0x0000000000402c0e in main (argc=<optimized out>, argv=0x7ffc5d149818) at main.c:476

As described in:
https://dbus.freedesktop.org/doc/api/html/group__DBusConnection.html#ga55ff88cd22c0672441c7deffbfb68fbf

, dbus_connection_dispatch() MUST NOT BE CALLED from inside the
DBusDispatchStatusFunction. It seems that pcmk_dbus_watch_dispatch() is
an appropriate place to do it instead.
@kgaillot kgaillot merged commit 7b95089 into ClusterLabs:master Dec 15, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants