Skip to content

Commit

Permalink
iothread: workaround glib bug which hangs qmp-test
Browse files Browse the repository at this point in the history
Free the AIO context earlier than the GMainContext (if we have) to
workaround a glib2 bug that GSource context pointer is not cleared even
if the context has already been destroyed (while it should).

The patch itself only changed the order to destroy the objects, no
functional change at all. Without this workaround, we can encounter
qmp-test hang with oob (and possibly any other use case when iothread is
used with GMainContexts):

  #0  0x00007f35ffe45334 in __lll_lock_wait () from /lib64/libpthread.so.0
  #1  0x00007f35ffe405d8 in _L_lock_854 () from /lib64/libpthread.so.0
  #2  0x00007f35ffe404a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
  qemu#3  0x00007f35fc5b9c9d in g_source_unref_internal (source=0x24f0600, context=0x7f35f0000960, have_lock=0) at gmain.c:1685
  qemu#4  0x0000000000aa6672 in aio_context_unref (ctx=0x24f0600) at /root/qemu/util/async.c:497
  qemu#5  0x000000000065851c in iothread_instance_finalize (obj=0x24f0380) at /root/qemu/iothread.c:129
  qemu#6  0x0000000000962d79 in object_deinit (obj=0x24f0380, type=0x242e960) at /root/qemu/qom/object.c:462
  qemu#7  0x0000000000962e0d in object_finalize (data=0x24f0380) at /root/qemu/qom/object.c:476
  qemu#8  0x0000000000964146 in object_unref (obj=0x24f0380) at /root/qemu/qom/object.c:924
  qemu#9  0x0000000000965880 in object_finalize_child_property (obj=0x24ec640, name=0x24efca0 "mon_iothread", opaque=0x24f0380) at /root/qemu/qom/object.c:1436
  qemu#10 0x0000000000962c33 in object_property_del_child (obj=0x24ec640, child=0x24f0380, errp=0x0) at /root/qemu/qom/object.c:436
  qemu#11 0x0000000000962d26 in object_unparent (obj=0x24f0380) at /root/qemu/qom/object.c:455
  qemu#12 0x0000000000658f00 in iothread_destroy (iothread=0x24f0380) at /root/qemu/iothread.c:365
  qemu#13 0x00000000004c67a8 in monitor_cleanup () at /root/qemu/monitor.c:4663
  qemu#14 0x0000000000669e27 in main (argc=16, argv=0x7ffc8b1ae2f8, envp=0x7ffc8b1ae380) at /root/qemu/vl.c:4749

The glib2 bug is fixed in commit 26056558b ("gmain: allow
g_source_get_context() on destroyed sources", 2012-07-30), so the first
good version is glib2 2.33.10. But we still support building with
glib as old as 2.28, so we need the workaround.

Let's make sure we destroy the GSources first before its owner context
until we drop support for glib older than 2.33.10.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180409083956.1780-1-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
  • Loading branch information
xzpeter authored and ebblake committed Apr 10, 2018
1 parent c398851 commit 1554434
Showing 1 changed file with 14 additions and 4 deletions.
18 changes: 14 additions & 4 deletions iothread.c
Expand Up @@ -117,16 +117,26 @@ static void iothread_instance_finalize(Object *obj)
IOThread *iothread = IOTHREAD(obj);

iothread_stop(iothread);
/*
* Before glib2 2.33.10, there is a glib2 bug that GSource context
* pointer may not be cleared even if the context has already been
* destroyed (while it should). Here let's free the AIO context
* earlier to bypass that glib bug.
*
* We can remove this comment after the minimum supported glib2
* version boosts to 2.33.10. Before that, let's free the
* GSources first before destroying any GMainContext.
*/
if (iothread->ctx) {
aio_context_unref(iothread->ctx);
iothread->ctx = NULL;
}
if (iothread->worker_context) {
g_main_context_unref(iothread->worker_context);
iothread->worker_context = NULL;
}
qemu_cond_destroy(&iothread->init_done_cond);
qemu_mutex_destroy(&iothread->init_done_lock);
if (!iothread->ctx) {
return;
}
aio_context_unref(iothread->ctx);
}

static void iothread_complete(UserCreatable *obj, Error **errp)
Expand Down

0 comments on commit 1554434

Please sign in to comment.