fix memop size calculation for i386 #9

ndesh26 · 2018-03-06T17:44:06Z

the memop size is encoded in mop variable and not in idx

ndesh26 · 2018-03-06T17:57:22Z

I am not sure if the second commit is the best way to handle the problem

add callbacks for the helper functions defined in target-i386/mem_helper.c as they do not lead callbacks

Spotted by ASAN: QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7ff8a9b0ca38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7ff8a8ea7f75 in g_malloc0 ../glib/gmem.c:124 #2 0x55fef3d99129 in error_setv /home/elmarco/src/qemu/util/error.c:59 #3 0x55fef3d99738 in error_setg_internal /home/elmarco/src/qemu/util/error.c:95 #4 0x55fef323acb2 in load_elf_hdr /home/elmarco/src/qemu/hw/core/loader.c:393 #5 0x55fef2d15776 in arm_load_elf /home/elmarco/src/qemu/hw/arm/boot.c:830 #6 0x55fef2d16d39 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1022 #7 0x55fef3dc634d in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40 #8 0x55fef2fc3182 in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716 #9 0x55fef2fcbbd1 in main /home/elmarco/src/qemu/vl.c:4679 #10 0x7ff89dfed009 in __libc_start_main (/lib64/libc.so.6+0x21009) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Spotted thanks to ASAN: QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/migration-test -p /x86_64/migration/bad_dest ==30302==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f60efba1a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f60eef3cf75 in g_malloc0 ../glib/gmem.c:124 #2 0x55ca9094702c in error_copy /home/elmarco/src/qemu/util/error.c:203 #3 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139 #4 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150 #5 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411 #6 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81 #7 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85 #8 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142 #9 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88 #10 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552 qemu#11 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182 qemu#12 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847 qemu#13 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 qemu#14 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 qemu#15 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 qemu#16 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942 qemu#17 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724 qemu#18 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009) Indirect leak of 45 byte(s) in 1 object(s) allocated from: #0 0x7f60efba1850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f60eef3cf0c in g_malloc ../glib/gmem.c:94 #2 0x7f60eef3d1cf in g_malloc_n ../glib/gmem.c:331 #3 0x7f60eef596eb in g_strdup ../glib/gstrfuncs.c:363 #4 0x55ca90947085 in error_copy /home/elmarco/src/qemu/util/error.c:204 #5 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139 #6 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150 #7 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411 #8 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81 #9 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85 #10 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142 qemu#11 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88 qemu#12 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552 qemu#13 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182 qemu#14 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847 qemu#15 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 qemu#16 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 qemu#17 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 qemu#18 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942 qemu#19 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724 qemu#20 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180306170959.3921-1-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

This fixes leaks found by ASAN such as: GTESTER tests/test-blockjob ================================================================= ==31442==ERROR: LeakSanitizer: detected memory leaks Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x7f88483cba38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f8845e1bd77 in g_malloc0 ../glib/gmem.c:129 #2 0x7f8845e1c04b in g_malloc0_n ../glib/gmem.c:360 #3 0x5584d2732498 in block_job_txn_new /home/elmarco/src/qemu/blockjob.c:172 #4 0x5584d2739b28 in block_job_create /home/elmarco/src/qemu/blockjob.c:973 #5 0x5584d270ae31 in mk_job /home/elmarco/src/qemu/tests/test-blockjob.c:34 #6 0x5584d270b1c1 in do_test_id /home/elmarco/src/qemu/tests/test-blockjob.c:57 #7 0x5584d270b65c in test_job_ids /home/elmarco/src/qemu/tests/test-blockjob.c:118 #8 0x7f8845e40b69 in test_case_run ../glib/gtestutils.c:2255 #9 0x7f8845e40f29 in g_test_run_suite_internal ../glib/gtestutils.c:2339 #10 0x7f8845e40fd2 in g_test_run_suite_internal ../glib/gtestutils.c:2351 qemu#11 0x7f8845e411e9 in g_test_run_suite ../glib/gtestutils.c:2426 qemu#12 0x7f8845e3fe72 in g_test_run ../glib/gtestutils.c:1692 qemu#13 0x5584d270d6e2 in main /home/elmarco/src/qemu/tests/test-blockjob.c:377 qemu#14 0x7f8843641f29 in __libc_start_main (/lib64/libc.so.6+0x20f29) Add an assert to make sure that the job doesn't have associated txn before free(). [Jeff Cody: N.B., used updated patch provided by John Snow] Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>

Free the AIO context earlier than the GMainContext (if we have) to workaround a glib2 bug that GSource context pointer is not cleared even if the context has already been destroyed (while it should). The patch itself only changed the order to destroy the objects, no functional change at all. Without this workaround, we can encounter qmp-test hang with oob (and possibly any other use case when iothread is used with GMainContexts): #0 0x00007f35ffe45334 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f35ffe405d8 in _L_lock_854 () from /lib64/libpthread.so.0 #2 0x00007f35ffe404a7 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00007f35fc5b9c9d in g_source_unref_internal (source=0x24f0600, context=0x7f35f0000960, have_lock=0) at gmain.c:1685 #4 0x0000000000aa6672 in aio_context_unref (ctx=0x24f0600) at /root/qemu/util/async.c:497 #5 0x000000000065851c in iothread_instance_finalize (obj=0x24f0380) at /root/qemu/iothread.c:129 #6 0x0000000000962d79 in object_deinit (obj=0x24f0380, type=0x242e960) at /root/qemu/qom/object.c:462 #7 0x0000000000962e0d in object_finalize (data=0x24f0380) at /root/qemu/qom/object.c:476 #8 0x0000000000964146 in object_unref (obj=0x24f0380) at /root/qemu/qom/object.c:924 #9 0x0000000000965880 in object_finalize_child_property (obj=0x24ec640, name=0x24efca0 "mon_iothread", opaque=0x24f0380) at /root/qemu/qom/object.c:1436 #10 0x0000000000962c33 in object_property_del_child (obj=0x24ec640, child=0x24f0380, errp=0x0) at /root/qemu/qom/object.c:436 qemu#11 0x0000000000962d26 in object_unparent (obj=0x24f0380) at /root/qemu/qom/object.c:455 qemu#12 0x0000000000658f00 in iothread_destroy (iothread=0x24f0380) at /root/qemu/iothread.c:365 qemu#13 0x00000000004c67a8 in monitor_cleanup () at /root/qemu/monitor.c:4663 qemu#14 0x0000000000669e27 in main (argc=16, argv=0x7ffc8b1ae2f8, envp=0x7ffc8b1ae380) at /root/qemu/vl.c:4749 The glib2 bug is fixed in commit 26056558b ("gmain: allow g_source_get_context() on destroyed sources", 2012-07-30), so the first good version is glib2 2.33.10. But we still support building with glib as old as 2.28, so we need the workaround. Let's make sure we destroy the GSources first before its owner context until we drop support for glib older than 2.33.10. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180409083956.1780-1-peterx@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>

Eric Auger reported the problem days ago that OOB broke ARM when running with libvirt: http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html The problem was that the monitor dispatcher bottom half was bound to qemu_aio_context now, which could be polled unexpectedly in block code. We should keep the dispatchers run in iohandler_ctx just like what we did before the Out-Of-Band series (chardev uses qio, and qio binds everything with iohandler_ctx). If without this change, QMP dispatcher might be run even before reaching main loop in block IO path, for example, in a stack like (the ARM case, "cont" command handler run even during machine init phase): #0 qmp_cont () #1 0x00000000006bd210 in qmp_marshal_cont () #2 0x0000000000ac05c4 in do_qmp_dispatch () #3 0x0000000000ac07a0 in qmp_dispatch () #4 0x0000000000472d60 in monitor_qmp_dispatch_one () #5 0x000000000047302c in monitor_qmp_bh_dispatcher () #6 0x0000000000acf374 in aio_bh_call () #7 0x0000000000acf428 in aio_bh_poll () #8 0x0000000000ad5110 in aio_poll () #9 0x0000000000a08ab8 in blk_prw () #10 0x0000000000a091c4 in blk_pread () qemu#11 0x0000000000734f94 in pflash_cfi01_realize () qemu#12 0x000000000075a3a4 in device_set_realized () qemu#13 0x00000000009a26cc in property_set_bool () qemu#14 0x00000000009a0a40 in object_property_set () qemu#15 0x00000000009a3a08 in object_property_set_qobject () qemu#16 0x00000000009a0c8c in object_property_set_bool () qemu#17 0x0000000000758f94 in qdev_init_nofail () qemu#18 0x000000000058e190 in create_one_flash () qemu#19 0x000000000058e2f4 in create_flash () qemu#20 0x00000000005902f0 in machvirt_init () qemu#21 0x00000000007635cc in machine_run_board_init () qemu#22 0x00000000006b135c in main () Actually the problem is more severe than that. After we switched to the qemu AIO handler it means the monitor dispatcher code can even be called with nested aio_poll(), then it can be an explicit aio_poll() inside another main loop aio_poll() which could be racy too; breaking code like TPM and 9p that use nested event loops. Switch to use the iohandler_ctx for monitor dispatchers. My sincere thanks to Eric Auger who offered great help during both debugging and verifying the problem. The ARM test was carried out by applying this patch upon QEMU 2.12.0-rc0 and problem is gone after the patch. A quick test of mine shows that after this patch applied we can pass all raw iotests even with OOB on by default. CC: Eric Blake <eblake@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> Reported-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180410044942.17059-1-peterx@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>

…nnect When cancel migration during RDMA precopy, the source qemu main thread hangs sometime. The backtrace is: (gdb) bt #0 0x00007f249eabd43d in write () from /lib64/libpthread.so.0 #1 0x00007f24a1ce98e4 in rdma_get_cm_event (channel=0x4675d10, event=0x7ffe2f643dd0) at src/cma.c:2189 #2 0x00000000007b6166 in qemu_rdma_cleanup (rdma=0x6784000) at migration/rdma.c:2296 #3 0x00000000007b7cae in qio_channel_rdma_close (ioc=0x3bfcc30, errp=0x0) at migration/rdma.c:2999 #4 0x00000000008db60e in qio_channel_close (ioc=0x3bfcc30, errp=0x0) at io/channel.c:273 #5 0x00000000007a8765 in channel_close (opaque=0x3bfcc30) at migration/qemu-file-channel.c:98 #6 0x00000000007a71f9 in qemu_fclose (f=0x527c000) at migration/qemu-file.c:334 #7 0x0000000000795b96 in migrate_fd_cleanup (opaque=0x3b46280) at migration/migration.c:1162 #8 0x000000000093a71b in aio_bh_call (bh=0x3db7a20) at util/async.c:90 #9 0x000000000093a7b2 in aio_bh_poll (ctx=0x3b121c0) at util/async.c:118 #10 0x000000000093f2ad in aio_dispatch (ctx=0x3b121c0) at util/aio-posix.c:436 qemu#11 0x000000000093ab41 in aio_ctx_dispatch (source=0x3b121c0, callback=0x0, user_data=0x0) at util/async.c:261 qemu#12 0x00007f249f73c7aa in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 qemu#13 0x000000000093dc5e in glib_pollfds_poll () at util/main-loop.c:215 qemu#14 0x000000000093dd4e in os_host_main_loop_wait (timeout=28000000) at util/main-loop.c:263 qemu#15 0x000000000093de05 in main_loop_wait (nonblocking=0) at util/main-loop.c:522 qemu#16 0x00000000005bc6a5 in main_loop () at vl.c:1944 qemu#17 0x00000000005c39b5 in main (argc=56, argv=0x7ffe2f6443f8, envp=0x3ad0030) at vl.c:4752 It does not get the RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect sometime. According to IB Spec once active side send DREQ message, it should wait for DREP message and only once it arrived it should trigger a DISCONNECT event. DREP message can be dropped due to network issues. For that case the spec defines a DREP_timeout state in the CM state machine, if the DREP is dropped we should get a timeout and a TIMEWAIT_EXIT event will be trigger. Unfortunately the current kernel CM implementation doesn't include the DREP_timeout state and in above scenario we will not get DISCONNECT or TIMEWAIT_EXIT events. So it should not invoke rdma_get_cm_event which may hang forever, and the event channel is also destroyed in qemu_rdma_cleanup. Signed-off-by: Lidong Chen <lidongchen@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

instead of destroying and recreating window, fixes segfault caused by handle_keyup trying to access no more existing window when using Ctrl-Alt-U to restore window "un-scaled" dimensions Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7ffff7f92b80 (LWP 3711)] handle_keyup (ev=0x7fffffffd010) at ui/sdl2.c:416 416 scon->ignore_hotkeys = false; (gdb) bt #0 handle_keyup (ev=0x7fffffffd010) at ui/sdl2.c:416 #1 sdl2_poll_events (scon=0x100fee5a8) at ui/sdl2.c:608 #2 0x0000000100585bf2 in dpy_refresh (s=0x101ad3e00) at ui/console.c:1658 #3 gui_update (opaque=0x101ad3e00) at ui/console.c:205 #4 0x0000000100690f2c in timerlist_run_timers (timer_list=0x100ede130) at util/qemu-timer.c:536 #5 0x0000000100691177 in qemu_clock_run_timers (type=QEMU_CLOCK_REALTIME) at util/qemu-timer.c:547 #6 qemu_clock_run_all_timers () at util/qemu-timer.c:674 #7 0x0000000100691651 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:503 #8 0x00000001003d650f in main_loop () at vl.c:1848 #9 0x0000000100289681 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4605 Signed-off-by: Amadeusz Sławiński <amade@asmblr.net> Message-id: 20180613172707.31530-1-amade@asmblr.net Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

With a Spice port chardev, it is possible to reenter monitor_qapi_event_queue() (when the client disconnects for example). This will dead-lock on monitor_lock. Instead, use some TLS variables to check for recursion and queue the events. Fixes: (gdb) bt #0 0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0 #1 0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0 #2 0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66 #3 0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645 #4 0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149 #5 0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235 #6 0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316 #7 0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197 #8 0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197 #9 0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414 #10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388 qemu#11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347 qemu#12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0 qemu#13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341 qemu#14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305 qemu#15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305 qemu#16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353 qemu#17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199 qemu#18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112 qemu#19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147 qemu#20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42 qemu#21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425 qemu#22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468 qemu#23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517 qemu#24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538 qemu#25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624 qemu#26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649 qemu#27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58 qemu#28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822 qemu#29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862 qemu#30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644 Note that error report is now moved to the first caller, which may receive an error for a recursed event. This is probably fine (95% of callers use &error_abort, the rest have NULL error and ignore it) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180731150144.14022-1-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [*_no_recurse renamed to *_no_reenter, local variables reordered] Signed-off-by: Markus Armbruster <armbru@redhat.com>

Spotted by ASAN, during make check... Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f8e27262c48 in malloc (/lib64/libasan.so.5+0xeec48) #1 0x7f8e26a5f3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) #2 0x555ab67078a8 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:67 #3 0x555ab67071e4 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:24 #4 0x555ab6713fbf in qstring_from_escaped_str /home/elmarco/src/qq/qobject/json-parser.c:144 #5 0x555ab671738c in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:506 #6 0x555ab67179c3 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:569 #7 0x555ab6715123 in parse_pair /home/elmarco/src/qq/qobject/json-parser.c:306 #8 0x555ab6715483 in parse_object /home/elmarco/src/qq/qobject/json-parser.c:357 #9 0x555ab671798b in parse_value /home/elmarco/src/qq/qobject/json-parser.c:561 #10 0x555ab6717a6b in json_parser_parse_err /home/elmarco/src/qq/qobject/json-parser.c:592 qemu#11 0x555ab4fd4dcf in handle_qmp_command /home/elmarco/src/qq/monitor.c:4257 qemu#12 0x555ab6712c4d in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105 qemu#13 0x555ab67e01e2 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323 qemu#14 0x555ab67e0af6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373 qemu#15 0x555ab6713010 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124 qemu#16 0x555ab4fd58ec in monitor_qmp_read /home/elmarco/src/qq/monitor.c:4337 qemu#17 0x555ab6559df2 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175 qemu#18 0x555ab6559e95 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187 qemu#19 0x555ab6560127 in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66 qemu#20 0x555ab65d9c73 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84 qemu#21 0x7f8e26a598ac in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4c8ac) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180809114417.28718-4-marcandre.lureau@redhat.com> [Screwed up in commit b273145] Cc: qemu-stable@nongnu.org Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>

if qio_channel_rdma_readv return QIO_CHANNEL_ERR_BLOCK, the destination qemu crash. The backtrace is: (gdb) bt #0 0x0000000000000000 in ?? () #1 0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080, io_read=0x8db841 <qio_channel_restart_read>, io_write=0x0, opaque=0x38111e0) at io/channel.c: #2 0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438 #3 0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47 #4 0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327 at migration/qemu-file-channel.c:83 #5 0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299 #6 0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562 #7 0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575 #8 0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655 #9 0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126 #10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366 qemu#11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1 qemu#12 0x00007f9c0db56d40 in ?? () from /lib64/libc.so.6 qemu#13 0x00007f96fe858760 in ?? () qemu#14 0x0000000000000000 in ?? () RDMA QIOChannel not implement io_set_aio_fd_handler. so qio_channel_set_aio_fd_handler will access NULL pointer. Signed-off-by: Lidong Chen <lidongchen@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

when qio_channel_read return QIO_CHANNEL_ERR_BLOCK, the source qemu crash. The backtrace is: (gdb) bt #0 0x00007fb20aba91d7 in raise () from /lib64/libc.so.6 #1 0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6 #2 0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6 #4 0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460 #5 0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768) at migration/qemu-file-channel.c:83 #6 0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299 #7 0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562 #8 0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575 #9 0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647 #10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794 qemu#11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504 qemu#12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0 qemu#13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6 This patch fixed by invoke qio_channel_yield only when qemu_in_coroutine(). Signed-off-by: Lidong Chen <lidongchen@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

Because RDMA QIOChannel not implement shutdown function, If the to_dst_file was set error, the return path thread will wait forever. and the migration thread will wait return path thread exit. the backtrace of return path thread is: (gdb) bt #0 0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6 #1 0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000) at qemu-timer.c:325 #2 0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000) at migration/rdma.c:1501 #3 0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000, byte_len=0x7ef7091d0640) at migration/rdma.c:1580 #4 0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000, head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726 #5 0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720, expecting=3) at migration/rdma.c:1903 #6 0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8, size=32768) at migration/rdma.c:2714 #7 0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232 #8 0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0) at migration/qemu-file.c:502 #9 0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515 #10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591 qemu#11 0x00000000006a46d3 in source_return_path_thread ( opaque=0xd826a0 <current_migration.37100>) at migration/migration.c:1331 qemu#12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0 qemu#13 0x00007f372a77635d in clone () from /lib64/libc.so.6 the backtrace of migration thread is: (gdb) bt #0 0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0 #1 0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 <current_migration.37100+88>) at util/qemu-thread-posix.c:504 #2 0x00000000006a4bc5 in await_return_path_close_on_source ( ms=0xd826a0 <current_migration.37100>) at migration/migration.c:1460 #3 0x00000000006a53e4 in migration_completion (s=0xd826a0 <current_migration.37100>, current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980) at migration/migration.c:1695 #4 0x00000000006a5c54 in migration_thread (opaque=0xd826a0 <current_migration.37100>) at migration/migration.c:1837 #5 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0 #6 0x00007f372a77635d in clone () from /lib64/libc.so.6 Signed-off-by: Lidong Chen <lidongchen@tencent.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

tests/cdrom-test -p /x86_64/cdrom/boot/megasas Produces the following ASAN leak. ==25700==ERROR: LeakSanitizer: detected memory leaks Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7f06f8faac48 in malloc (/lib64/libasan.so.5+0xeec48) #1 0x7f06f87a73c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) #2 0x55a729f17738 in pci_dma_sglist_init /home/elmarco/src/qq/include/hw/pci/pci.h:818 #3 0x55a729f2a706 in megasas_map_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:698 #4 0x55a729f39421 in megasas_handle_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:1574 #5 0x55a729f3f70d in megasas_handle_frame /home/elmarco/src/qq/hw/scsi/megasas.c:1955 #6 0x55a729f40939 in megasas_mmio_write /home/elmarco/src/qq/hw/scsi/megasas.c:2119 #7 0x55a729f41102 in megasas_port_write /home/elmarco/src/qq/hw/scsi/megasas.c:2170 #8 0x55a729220e60 in memory_region_write_accessor /home/elmarco/src/qq/memory.c:527 #9 0x55a7292212b3 in access_with_adjusted_size /home/elmarco/src/qq/memory.c:594 #10 0x55a72922cf70 in memory_region_dispatch_write /home/elmarco/src/qq/memory.c:1473 qemu#11 0x55a7290f5907 in flatview_write_continue /home/elmarco/src/qq/exec.c:3255 qemu#12 0x55a7290f5ceb in flatview_write /home/elmarco/src/qq/exec.c:3294 qemu#13 0x55a7290f6457 in address_space_write /home/elmarco/src/qq/exec.c:3384 qemu#14 0x55a7290f64a8 in address_space_rw /home/elmarco/src/qq/exec.c:3395 qemu#15 0x55a72929ecb0 in kvm_handle_io /home/elmarco/src/qq/accel/kvm/kvm-all.c:1729 qemu#16 0x55a7292a0db5 in kvm_cpu_exec /home/elmarco/src/qq/accel/kvm/kvm-all.c:1969 qemu#17 0x55a7291c4212 in qemu_kvm_cpu_thread_fn /home/elmarco/src/qq/cpus.c:1215 qemu#18 0x55a72a966a6c in qemu_thread_start /home/elmarco/src/qq/util/qemu-thread-posix.c:504 qemu#19 0x7f06ed486593 in start_thread (/lib64/libpthread.so.0+0x7593) Move the qemu_sglist_destroy() from megasas_complete_command() to megasas_unmap_frame(), so map/unmap are balanced. Signed-off-by: Marc-AndrÃ© Lureau <marcandre.lureau@redhat.com> Message-Id: <20180814141247.32336-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

Spotted by ASAN doing some manual testing: Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f5fcdc75e50 in calloc (/lib64/libasan.so.5+0xeee50) #1 0x7f5fcd47241d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d) #2 0x55f989be92ce in timer_new /home/elmarco/src/qq/include/qemu/timer.h:561 #3 0x55f989be92ff in timer_new_ms /home/elmarco/src/qq/include/qemu/timer.h:630 #4 0x55f989c0219d in hmp_migrate /home/elmarco/src/qq/hmp.c:2038 #5 0x55f98955927b in handle_hmp_command /home/elmarco/src/qq/monitor.c:3498 #6 0x55f98955fb8c in monitor_command_cb /home/elmarco/src/qq/monitor.c:4371 #7 0x55f98ad40f11 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:393 #8 0x55f98955fa4f in monitor_read /home/elmarco/src/qq/monitor.c:4354 #9 0x55f98aae30d7 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175 #10 0x55f98aae317a in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187 qemu#11 0x55f98aae940c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66 qemu#12 0x55f98ab63018 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84 qemu#13 0x7f5fcd46c8ac in g_main_dispatch gmain.c:3177 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180901134652.25884-1-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

In qemu_laio_process_completions_and_submit, the AioContext is acquired before the ioq_submit iteration and after qemu_laio_process_completions, but the latter is not thread safe either. This change avoids a number of random crashes when the Main Thread and an IO Thread collide processing completions for the same AioContext. This is an example of such crash: - The IO Thread is trying to acquire the AioContext at aio_co_enter, which evidences that it didn't lock it before: Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)): #0 0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0 #2 0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0) at ../nptl/pthread_mutex_lock.c:114 #3 0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66 #4 0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493 #5 0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478 #6 0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true) at util/aio-posix.c:693 qemu#11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64 qemu#12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308 qemu#13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 - The Main Thread is also processing completions from the same AioContext, and crashes due to failed assertion at util/iov.c:78: Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)): #0 0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90 #2 0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:92 #3 0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101 #4 0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78 #5 0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410 #6 0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497 qemu#11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574 qemu#12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false) at util/aio-posix.c:604 qemu#13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273 qemu#14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390 qemu#15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590 qemu#16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774 qemu#17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401 qemu#18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449 qemu#19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080) at block/commit.c:92 qemu#20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973 qemu#21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90 qemu#22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118 qemu#23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436 qemu#24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 qemu#25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201 qemu#26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854 qemu#27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215 qemu#28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238 qemu#29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497 qemu#30 0x00005631fbd81412 in main_loop () at vl.c:1866 qemu#31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4647 - A closer examination shows that s->io_q.in_flight appears to have gone backwards: (gdb) frame 7 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 222 qemu_laio_process_completion(laiocb); (gdb) p s $2 = (LinuxAioState *) 0x7fdfc0297670 (gdb) p *s $3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0, in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0, sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241} (gdb) p/x s->io_q.in_flight $4 = 0xfffffff0 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Spotted by ASAN while running: $ tests/migration-test -p /x86_64/migration/postcopy/recovery ================================================================= ==18034==ERROR: LeakSanitizer: detected memory leaks Direct leak of 33864 byte(s) in 1 object(s) allocated from: #0 0x7f3da7f31e50 in calloc (/lib64/libasan.so.5+0xeee50) #1 0x7f3da644441d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d) #2 0x55af9db15440 in qemu_fopen_channel_input /home/elmarco/src/qemu/migration/qemu-file-channel.c:183 #3 0x55af9db15413 in channel_get_output_return_path /home/elmarco/src/qemu/migration/qemu-file-channel.c:159 #4 0x55af9db0d4ac in qemu_file_get_return_path /home/elmarco/src/qemu/migration/qemu-file.c:78 #5 0x55af9dad5e4f in open_return_path_on_source /home/elmarco/src/qemu/migration/migration.c:2295 #6 0x55af9dadb3bf in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3111 #7 0x55af9dae1bf3 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:91 #8 0x55af9daddeca in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108 #9 0x55af9e13d3db in qio_task_complete /home/elmarco/src/qemu/io/task.c:158 #10 0x55af9e13ca03 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:89 qemu#11 0x7f3da643b1ca in g_idle_dispatch gmain.c:5535 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180925092245.29565-1-marcandre.lureau@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Recently, the test case has started failing because some job related functions want to drop the AioContext lock even though it hasn't been taken: (gdb) bt #0 0x00007f51c067c9fb in raise () from /lib64/libc.so.6 #1 0x00007f51c067e77d in abort () from /lib64/libc.so.6 #2 0x0000558c9d5dde7b in error_exit (err=<optimized out>, msg=msg@entry=0x558c9d6fe120 <__func__.18373> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x0000558c9d6b5263 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x558c9f3999a0, file=file@entry=0x558c9d6fd36f "util/async.c", line=line@entry=516) at util/qemu-thread-posix.c:96 #4 0x0000558c9d6b0565 in aio_context_release (ctx=ctx@entry=0x558c9f399940) at util/async.c:516 #5 0x0000558c9d5eb3da in job_completed_txn_abort (job=0x558c9f68e640) at job.c:738 #6 0x0000558c9d5eb227 in job_finish_sync (job=0x558c9f68e640, finish=finish@entry=0x558c9d5eb8d0 <job_cancel_err>, errp=errp@entry=0x0) at job.c:986 #7 0x0000558c9d5eb8ee in job_cancel_sync (job=<optimized out>) at job.c:941 #8 0x0000558c9d64d853 in replication_close (bs=<optimized out>) at block/replication.c:148 #9 0x0000558c9d5e5c9f in bdrv_close (bs=0x558c9f41b020) at block.c:3420 #10 bdrv_delete (bs=0x558c9f41b020) at block.c:3629 qemu#11 bdrv_unref (bs=0x558c9f41b020) at block.c:4685 qemu#12 0x0000558c9d62a3f3 in blk_remove_bs (blk=blk@entry=0x558c9f42a7c0) at block/block-backend.c:783 qemu#13 0x0000558c9d62a667 in blk_delete (blk=0x558c9f42a7c0) at block/block-backend.c:402 qemu#14 blk_unref (blk=0x558c9f42a7c0) at block/block-backend.c:457 qemu#15 0x0000558c9d5dfcea in test_secondary_stop () at tests/test-replication.c:478 qemu#16 0x00007f51c1f13178 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0 qemu#17 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0 qemu#18 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0 qemu#19 0x00007f51c1f13552 in g_test_run_suite () from /lib64/libglib-2.0.so.0 qemu#20 0x00007f51c1f13571 in g_test_run () from /lib64/libglib-2.0.so.0 qemu#21 0x0000558c9d5de31f in main (argc=<optimized out>, argv=<optimized out>) at tests/test-replication.c:581 It is yet unclear whether this should really be considered a bug in the test case or whether blk_unref() should work for callers that haven't taken the AioContext lock, but in order to fix the build tests quickly, just take the AioContext lock around blk_unref(). Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Spotted by ASAN: ================================================================= ==11893==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1120 byte(s) in 28 object(s) allocated from: #0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48) #1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) #2 0x559e708b56a4 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:66 #3 0x559e708b4fe0 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:23 #4 0x559e708bda7d in parse_string /home/elmarco/src/qq/qobject/json-parser.c:143 #5 0x559e708c1009 in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:484 #6 0x559e708c1627 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:547 #7 0x559e708c1c67 in json_parser_parse /home/elmarco/src/qq/qobject/json-parser.c:573 #8 0x559e708bc0ff in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:92 #9 0x559e708d1655 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:292 #10 0x559e708d1fe1 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:339 qemu#11 0x559e708bc856 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:121 qemu#12 0x559e708b8b4b in qobject_from_jsonv /home/elmarco/src/qq/qobject/qjson.c:69 qemu#13 0x559e708b8d02 in qobject_from_json /home/elmarco/src/qq/qobject/qjson.c:83 qemu#14 0x559e708a74ae in from_json_str /home/elmarco/src/qq/tests/check-qjson.c:30 qemu#15 0x559e708a9f83 in utf8_string /home/elmarco/src/qq/tests/check-qjson.c:781 qemu#16 0x7fd05101bc49 in test_case_run gtestutils.c:2255 qemu#17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180901211917.10372-1-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Let start from the beginning: Commit b9e413d (in 2.9) "block: explicitly acquire aiocontext in aio callbacks that need it" added pairs of aio_context_acquire/release to mirror_write_complete and mirror_read_complete, when they were aio callbacks for blk_aio_* calls. Then, commit 2e1990b (in 3.0) "block/mirror: Convert to coroutines" dropped these blk_aio_* calls, than mirror_write_complete and mirror_read_complete are not callbacks more, and don't need additional aiocontext acquiring. Furthermore, mirror_read_complete calls blk_co_pwritev inside these pair of aio_context_acquire/release, which leads to the following dead-lock with mirror: (gdb) info thr Id Target Id Frame 3 Thread (LWP 145412) "qemu-system-x86" syscall () 2 Thread (LWP 145416) "qemu-system-x86" __lll_lock_wait () * 1 Thread (LWP 145411) "qemu-system-x86" __lll_lock_wait () (gdb) bt #0 __lll_lock_wait () #1 _L_lock_812 () #2 __GI___pthread_mutex_lock #3 qemu_mutex_lock_impl (mutex=0x561032dce420 <qemu_global_mutex>, file=0x5610327d8654 "util/main-loop.c", line=236) at util/qemu-thread-posix.c:66 #4 qemu_mutex_lock_iothread_impl #5 os_host_main_loop_wait (timeout=480116000) at util/main-loop.c:236 #6 main_loop_wait (nonblocking=0) at util/main-loop.c:497 #7 main_loop () at vl.c:1892 #8 main Printing contents of qemu_global_mutex, I see that "__owner = 145416", so, thr1 is main loop, and now it wants BQL, which is owned by thr2. (gdb) thr 2 (gdb) bt #0 __lll_lock_wait () #1 _L_lock_870 () #2 __GI___pthread_mutex_lock #3 qemu_mutex_lock_impl (mutex=0x561034d25dc0, ... #4 aio_context_acquire (ctx=0x561034d25d60) #5 dma_blk_cb #6 dma_blk_io #7 dma_blk_read #8 ide_dma_cb #9 bmdma_cmd_writeb #10 bmdma_write qemu#11 memory_region_write_accessor qemu#12 access_with_adjusted_size qemu#15 flatview_write qemu#16 address_space_write qemu#17 address_space_rw qemu#18 kvm_handle_io qemu#19 kvm_cpu_exec qemu#20 qemu_kvm_cpu_thread_fn qemu#21 qemu_thread_start qemu#22 start_thread qemu#23 clone () Printing mutex in fr 2, I see "__owner = 145411", so thr2 wants aio context mutex, which is owned by thr1. Classic dead-lock. Then, let's check that aio context is hold by mirror coroutine: just print coroutine stack of first tracked request in mirror job target: (gdb) [...] (gdb) qemu coroutine 0x561035dd0860 #0 qemu_coroutine_switch #1 qemu_coroutine_yield #2 qemu_co_mutex_lock_slowpath #3 qemu_co_mutex_lock #4 qcow2_co_pwritev #5 bdrv_driver_pwritev #6 bdrv_aligned_pwritev #7 bdrv_co_pwritev #8 blk_co_pwritev #9 mirror_read_complete () at block/mirror.c:232 #10 mirror_co_read () at block/mirror.c:370 qemu#11 coroutine_trampoline qemu#12 __start_context Yes it is mirror_read_complete calling blk_co_pwritev after acquiring aio context. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

When a monitor is connected to a Spice chardev, the monitor cleanup can dead-lock: #0 0x00007f43446637fd in __lll_lock_wait () at /lib64/libpthread.so.0 #1 0x00007f434465ccf4 in pthread_mutex_lock () at /lib64/libpthread.so.0 #2 0x0000556dd79f22ba in qemu_mutex_lock_impl (mutex=0x556dd81c9220 <monitor_lock>, file=0x556dd7ae3648 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66 #3 0x0000556dd7431bd5 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x556dd9abc850, errp=0x7fffb7bbddd8) at /home/elmarco/src/qq/monitor.c:645 #4 0x0000556dd79d476b in qapi_event_send_spice_disconnected (server=0x556dd98ee760, client=0x556ddaaa8560, errp=0x556dd82180d0 <error_abort>) at qapi/qapi-events-ui.c:149 #5 0x0000556dd7870fc1 in channel_event (event=3, info=0x556ddad1b590) at /home/elmarco/src/qq/ui/spice-core.c:235 #6 0x00007f434560a6bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x556ddad1b590) at reds.c:316 #7 0x00007f43455f393b in main_dispatcher_self_handle_channel_event (info=0x556ddad1b590, event=3, self=0x556dd9a7d8c0) at main-dispatcher.c:197 #8 0x00007f43455f393b in main_dispatcher_channel_event (self=0x556dd9a7d8c0, event=event@entry=3, info=0x556ddad1b590) at main-dispatcher.c:197 #9 0x00007f4345612833 in red_stream_push_channel_event (s=s@entry=0x556ddae2ef40, event=event@entry=3) at red-stream.c:414 #10 0x00007f434561286b in red_stream_free (s=0x556ddae2ef40) at red-stream.c:388 qemu#11 0x00007f43455f9ddc in red_channel_client_finalize (object=0x556dd9bb21a0) at red-channel-client.c:347 qemu#12 0x00007f434b5f9fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0 qemu#13 0x00007f43455fc212 in red_channel_client_push (rcc=0x556dd9bb21a0) at red-channel-client.c:1341 qemu#14 0x0000556dd76081ba in spice_port_set_fe_open (chr=0x556dd9925e20, fe_open=0) at /home/elmarco/src/qq/chardev/spice.c:241 qemu#15 0x0000556dd796d74a in qemu_chr_fe_set_open (be=0x556dd9a37c00, fe_open=0) at /home/elmarco/src/qq/chardev/char-fe.c:340 qemu#16 0x0000556dd796d4d9 in qemu_chr_fe_set_handlers (b=0x556dd9a37c00, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=true) at /home/elmarco/src/qq/chardev/char-fe.c:280 qemu#17 0x0000556dd796d359 in qemu_chr_fe_deinit (b=0x556dd9a37c00, del=false) at /home/elmarco/src/qq/chardev/char-fe.c:233 qemu#18 0x0000556dd7432240 in monitor_data_destroy (mon=0x556dd9a37c00) at /home/elmarco/src/qq/monitor.c:786 qemu#19 0x0000556dd743b968 in monitor_cleanup () at /home/elmarco/src/qq/monitor.c:4683 qemu#20 0x0000556dd75ce776 in main (argc=3, argv=0x7fffb7bbe458, envp=0x7fffb7bbe478) at /home/elmarco/src/qq/vl.c:4660 Because spice code tries to emit a "disconnected" signal on the monitors. Fix this dead-lock by releasing the monitor lock for flush/destroy. monitor_lock protects mon_list, monitor_qapi_event_state and monitor_destroyed. monitor_flush() and monitor_data_destroy() don't access any of those variables. monitor_cleanup()'s loop is safe because it uses QTAILQ_FOREACH_SAFE(), and no further monitor can be added after calling monitor_cleanup() thanks to monitor_destroyed check in monitor_list_append(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20181205203737.9011-8-marcandre.lureau@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>

One multifd channel will shutdown all the other multifd's IOChannel when it fails to receive an IOChannel. In this senario, if some multifds had not received its IOChannel yet, it would try to shutdown its IOChannel which could cause nullptr access at qio_channel_shutdown. Here is the coredump stack: #0 object_get_class (obj=obj@entry=0x0) at qom/object.c:908 #1 0x00005563fdbb8f4a in qio_channel_shutdown (ioc=0x0, how=QIO_CHANNEL_SHUTDOWN_BOTH, errp=0x0) at io/channel.c:355 #2 0x00005563fd7b4c5f in multifd_recv_terminate_threads (err=<optimized out>) at migration/ram.c:1280 #3 0x00005563fd7bc019 in multifd_recv_new_channel (ioc=ioc@entry=0x556400255610, errp=errp@entry=0x7ffec07dce00) at migration/ram.c:1478 #4 0x00005563fda82177 in migration_ioc_process_incoming (ioc=ioc@entry=0x556400255610, errp=errp@entry=0x7ffec07dce30) at migration/migration.c:605 #5 0x00005563fda8567d in migration_channel_process_incoming (ioc=0x556400255610) at migration/channel.c:44 #6 0x00005563fda83ee0 in socket_accept_incoming_migration (listener=0x5563fff6b920, cioc=0x556400255610, opaque=<optimized out>) at migration/socket.c:166 #7 0x00005563fdbc25cd in qio_net_listener_channel_func (ioc=<optimized out>, condition=<optimized out>, opaque=<optimized out>) at io/net-listener.c:54 #8 0x00007f895b6fe9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #9 0x00005563fdc18136 in glib_pollfds_poll () at util/main-loop.c:218 #10 0x00005563fdc181b5 in os_host_main_loop_wait (timeout=1000000000) at util/main-loop.c:241 qemu#11 0x00005563fdc183a2 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:517 qemu#12 0x00005563fd8edb37 in main_loop () at vl.c:1791 qemu#13 0x00005563fd74fd45 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4473 To fix it up, let's check p->c before calling qio_channel_shutdown. Signed-off-by: Jiahui Cen <cenjiahui@huawei.com> Signed-off-by: Ying Fang <fangying1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

…threads One multifd will lock all the other multifds' IOChannel mutex to inform them to quit by setting p->quit or shutting down p->c. In this senario, if some multifds had already been terminated and multifd_load_cleanup/multifd_save_cleanup had destroyed their mutex, it could cause destroyed mutex access when trying lock their mutex. Here is the coredump stack: #0 0x00007f81a2794437 in raise () from /usr/lib64/libc.so.6 #1 0x00007f81a2795b28 in abort () from /usr/lib64/libc.so.6 #2 0x00007f81a278d1b6 in __assert_fail_base () from /usr/lib64/libc.so.6 #3 0x00007f81a278d262 in __assert_fail () from /usr/lib64/libc.so.6 #4 0x000055eb1bfadbd3 in qemu_mutex_lock_impl (mutex=0x55eb1e2d1988, file=<optimized out>, line=<optimized out>) at util/qemu-thread-posix.c:64 #5 0x000055eb1bb4564a in multifd_send_terminate_threads (err=<optimized out>) at migration/ram.c:1015 #6 0x000055eb1bb4bb7f in multifd_send_thread (opaque=0x55eb1e2d19f8) at migration/ram.c:1171 #7 0x000055eb1bfad628 in qemu_thread_start (args=0x55eb1e170450) at util/qemu-thread-posix.c:502 #8 0x00007f81a2b36df5 in start_thread () from /usr/lib64/libpthread.so.0 #9 0x00007f81a286048d in clone () from /usr/lib64/libc.so.6 To fix it up, let's destroy the mutex after all the other multifd threads had been terminated. Signed-off-by: Jiahui Cen <cenjiahui@huawei.com> Signed-off-by: Ying Fang <fangying1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

All paths that lead to bdrv_backup_top_drop(), except for the call from backup_clean(), imply that the BDS AioContext has already been acquired, so doing it there too can potentially lead to QEMU hanging on AIO_WAIT_WHILE(). An easy way to trigger this situation is by issuing a two actions transaction, with a proper and a bogus blockdev-backup, so the second one will trigger a rollback. This will trigger a hang with an stack trace like this one: #0 0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39 #1 0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77 #2 0x000055e743386e09 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336 #3 0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true) at util/aio-posix.c:669 #4 0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878 #5 0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017 #6 0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262 #7 0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644 #8 0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273 #9 0x000055e74331461f in backup_job_create (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478 #10 0x000055e74315bc52 in do_backup_common (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0) at blockdev.c:3580 qemu#11 0x000055e74315c37c in do_blockdev_backup (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0) at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492 qemu#12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018) at blockdev.c:1885 qemu#13 0x000055e743160152 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340 qemu#14 0x000055e743287ff5 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8) at qapi/qapi-commands-transaction.c:44 qemu#15 0x000055e74333de6c in do_qmp_dispatch (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132 qemu#16 0x000055e74333de6c in qmp_dispatch (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175 qemu#17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>) at monitor/qmp.c:145 qemu#18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234 qemu#19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117 qemu#20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117 qemu#21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459 qemu#22 0x000055e743385742 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260 qemu#23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176 qemu#24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829 qemu#25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219 qemu#26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 qemu#27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 qemu#28 0x000055e74316a3c1 in main_loop () at vl.c:1828 qemu#29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504 Fix this by not acquiring the AioContext there, and ensuring all paths leading to it have it already acquired (backup_clean()). RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Dirty map addition and removal functions are not acquiring to BDS AioContext, while they may call to code that expects it to be acquired. This may trigger a crash with a stack trace like this one: #0 0x00007f0ef146370f in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007f0ef144db25 in __GI_abort () at abort.c:79 #2 0x0000565022294dce in error_exit (err=<optimized out>, msg=msg@entry=0x56502243a730 <__func__.16350> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36 #3 0x00005650222950ba in qemu_mutex_unlock_impl (mutex=mutex@entry=0x5650244b0240, file=file@entry=0x565022439adf "util/async.c", line=line@entry=526) at util/qemu-thread-posix.c:108 #4 0x0000565022290029 in aio_context_release (ctx=ctx@entry=0x5650244b01e0) at util/async.c:526 #5 0x000056502221cd08 in bdrv_can_store_new_dirty_bitmap (bs=bs@entry=0x5650244dc820, name=name@entry=0x56502481d360 "bitmap1", granularity=granularity@entry=65536, errp=errp@entry=0x7fff22831718) at block/dirty-bitmap.c:542 #6 0x000056502206ae53 in qmp_block_dirty_bitmap_add (errp=0x7fff22831718, disabled=false, has_disabled=<optimized out>, persistent=<optimized out>, has_persistent=true, granularity=65536, has_granularity=<optimized out>, name=0x56502481d360 "bitmap1", node=<optimized out>) at blockdev.c:2894 #7 0x000056502206ae53 in qmp_block_dirty_bitmap_add (node=<optimized out>, name=0x56502481d360 "bitmap1", has_granularity=<optimized out>, granularity=<optimized out>, has_persistent=true, persistent=<optimized out>, has_disabled=false, disabled=false, errp=0x7fff22831718) at blockdev.c:2856 #8 0x00005650221847a3 in qmp_marshal_block_dirty_bitmap_add (args=<optimized out>, ret=<optimized out>, errp=0x7fff22831798) at qapi/qapi-commands-block-core.c:651 #9 0x0000565022247e6c in do_qmp_dispatch (errp=0x7fff22831790, allow_oob=<optimized out>, request=<optimized out>, cmds=0x565022b32d60 <qmp_commands>) at qapi/qmp-dispatch.c:132 #10 0x0000565022247e6c in qmp_dispatch (cmds=0x565022b32d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175 qemu#11 0x0000565022166061 in monitor_qmp_dispatch (mon=0x56502450faa0, req=<optimized out>) at monitor/qmp.c:145 qemu#12 0x00005650221666fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234 qemu#13 0x000056502228f866 in aio_bh_call (bh=0x56502440eae0) at util/async.c:117 qemu#14 0x000056502228f866 in aio_bh_poll (ctx=ctx@entry=0x56502440d7a0) at util/async.c:117 qemu#15 0x0000565022292c54 in aio_dispatch (ctx=0x56502440d7a0) at util/aio-posix.c:459 qemu#16 0x000056502228f742 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260 qemu#17 0x00007f0ef5ce667d in g_main_dispatch (context=0x56502449aa40) at gmain.c:3176 qemu#18 0x00007f0ef5ce667d in g_main_context_dispatch (context=context@entry=0x56502449aa40) at gmain.c:3829 qemu#19 0x0000565022291d08 in glib_pollfds_poll () at util/main-loop.c:219 qemu#20 0x0000565022291d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 qemu#21 0x0000565022291d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 qemu#22 0x00005650220743c1 in main_loop () at vl.c:1828 qemu#23 0x0000565021f20a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504 Fix this by acquiring the AioContext at qmp_block_dirty_bitmap_add() and qmp_block_dirty_bitmap_add(). RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782175 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

external_snapshot_abort() calls to bdrv_set_backing_hd(), which returns state->old_bs to the main AioContext, as it's intended to be used then the BDS is going to be released. As that's not the case when aborting an external snapshot, return it to the AioContext it was before the call. This issue can be triggered by issuing a transaction with two actions, a proper blockdev-snapshot-sync and a bogus one, so the second will trigger a transaction abort. This results in a crash with an stack trace like this one: #0 0x00007fa1048b28df in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007fa10489ccf5 in __GI_abort () at abort.c:79 #2 0x00007fa10489cbc9 in __assert_fail_base (fmt=0x7fa104a03300 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=0x557224014d30 "block.c", line=2240, function=<optimized out>) at assert.c:92 #3 0x00007fa1048aae96 in __GI___assert_fail (assertion=assertion@entry=0x5572240b44d8 "bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs)", file=file@entry=0x557224014d30 "block.c", line=line@entry=2240, function=function@entry=0x5572240b5d60 <__PRETTY_FUNCTION__.31620> "bdrv_replace_child_noperm") at assert.c:101 #4 0x0000557223e631f8 in bdrv_replace_child_noperm (child=0x557225b9c980, new_bs=new_bs@entry=0x557225c42e40) at block.c:2240 #5 0x0000557223e68be7 in bdrv_replace_node (from=0x557226951a60, to=0x557225c42e40, errp=0x5572247d6138 <error_abort>) at block.c:4196 #6 0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1731 #7 0x0000557223d069c4 in external_snapshot_abort (common=0x557225d7e170) at blockdev.c:1717 #8 0x0000557223d09013 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=0x557225cc7d70, errp=errp@entry=0x7ffe704c0c98) at blockdev.c:2360 #9 0x0000557223e32085 in qmp_marshal_transaction (args=<optimized out>, ret=<optimized out>, errp=0x7ffe704c0d08) at qapi/qapi-commands-transaction.c:44 #10 0x0000557223ee798c in do_qmp_dispatch (errp=0x7ffe704c0d00, allow_oob=<optimized out>, request=<optimized out>, cmds=0x5572247d3cc0 <qmp_commands>) at qapi/qmp-dispatch.c:132 qemu#11 0x0000557223ee798c in qmp_dispatch (cmds=0x5572247d3cc0 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>) at qapi/qmp-dispatch.c:175 qemu#12 0x0000557223e06141 in monitor_qmp_dispatch (mon=0x557225c69ff0, req=<optimized out>) at monitor/qmp.c:120 qemu#13 0x0000557223e0678a in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:209 qemu#14 0x0000557223f2f366 in aio_bh_call (bh=0x557225b9dc60) at util/async.c:117 qemu#15 0x0000557223f2f366 in aio_bh_poll (ctx=ctx@entry=0x557225b9c840) at util/async.c:117 qemu#16 0x0000557223f32754 in aio_dispatch (ctx=0x557225b9c840) at util/aio-posix.c:459 qemu#17 0x0000557223f2f242 in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260 qemu#18 0x00007fa10913467d in g_main_dispatch (context=0x557225c28e80) at gmain.c:3176 qemu#19 0x00007fa10913467d in g_main_context_dispatch (context=context@entry=0x557225c28e80) at gmain.c:3829 qemu#20 0x0000557223f31808 in glib_pollfds_poll () at util/main-loop.c:219 qemu#21 0x0000557223f31808 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 qemu#22 0x0000557223f31808 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518 qemu#23 0x0000557223d13201 in main_loop () at vl.c:1828 qemu#24 0x0000557223bbfb82 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4504 RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1779036 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

If the multifd_send_threads is not created when migration is failed, multifd_save_cleanup would be called twice. In this senario, the multifd_send_state is accessed after it has been released, the result is that the source VM is crashing down. Here is the coredump stack: Program received signal SIGSEGV, Segmentation fault. 0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012 1012 MultiFDSendParams *p = &multifd_send_state->params[i]; #0 0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012 #1 0x00005629333ab8a9 in multifd_save_cleanup () at migration/ram.c:1028 #2 0x00005629333abaea in multifd_new_send_channel_async (task=0x562935450e70, opaque=<optimized out>) at migration/ram.c:1202 #3 0x000056293373a562 in qio_task_complete (task=task@entry=0x562935450e70) at io/task.c:196 #4 0x000056293373a6e0 in qio_task_thread_result (opaque=0x562935450e70) at io/task.c:111 #5 0x00007f475d4d75a7 in g_idle_dispatch () from /usr/lib64/libglib-2.0.so.0 #6 0x00007f475d4da9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #7 0x0000562933785b33 in glib_pollfds_poll () at util/main-loop.c:219 #8 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 #9 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518 #10 0x00005629334c5acf in main_loop () at vl.c:1810 qemu#11 0x000056293334d7bb in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4471 If the multifd_send_threads is not created when migration is failed. In this senario, we don't call multifd_save_cleanup in multifd_new_send_channel_async. Signed-off-by: Zhimin Feng <fengzhimin1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>

'crypto_opts' forgot to free in qcow2_close(), this patch fix the bellow leak stack: Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x7f0edd81f970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970) #1 0x7f0edc6d149d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d) #2 0x55d7eaede63d in qobject_input_start_struct /mnt/sdb/qemu-new/qemu_test/qemu/qapi/qobject-input-visitor.c:295 #3 0x55d7eaed78b8 in visit_start_struct /mnt/sdb/qemu-new/qemu_test/qemu/qapi/qapi-visit-core.c:49 #4 0x55d7eaf5140b in visit_type_QCryptoBlockOpenOptions qapi/qapi-visit-crypto.c:290 #5 0x55d7eae43af3 in block_crypto_open_opts_init /mnt/sdb/qemu-new/qemu_test/qemu/block/crypto.c:163 #6 0x55d7eacd2924 in qcow2_update_options_prepare /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1148 #7 0x55d7eacd33f7 in qcow2_update_options /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1232 #8 0x55d7eacd9680 in qcow2_do_open /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1512 #9 0x55d7eacdc55e in qcow2_open_entry /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1792 #10 0x55d7eacdc8fe in qcow2_open /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:1819 qemu#11 0x55d7eac3742d in bdrv_open_driver /mnt/sdb/qemu-new/qemu_test/qemu/block.c:1317 qemu#12 0x55d7eac3e990 in bdrv_open_common /mnt/sdb/qemu-new/qemu_test/qemu/block.c:1575 qemu#13 0x55d7eac4442c in bdrv_open_inherit /mnt/sdb/qemu-new/qemu_test/qemu/block.c:3126 qemu#14 0x55d7eac45c3f in bdrv_open /mnt/sdb/qemu-new/qemu_test/qemu/block.c:3219 qemu#15 0x55d7ead8e8a4 in blk_new_open /mnt/sdb/qemu-new/qemu_test/qemu/block/block-backend.c:397 qemu#16 0x55d7eacde74c in qcow2_co_create /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:3534 qemu#17 0x55d7eacdfa6d in qcow2_co_create_opts /mnt/sdb/qemu-new/qemu_test/qemu/block/qcow2.c:3668 qemu#18 0x55d7eac1c678 in bdrv_create_co_entry /mnt/sdb/qemu-new/qemu_test/qemu/block.c:485 qemu#19 0x55d7eb0024d2 in coroutine_trampoline /mnt/sdb/qemu-new/qemu_test/qemu/util/coroutine-ucontext.c:115 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200227012950.12256-2-pannengyuan@huawei.com> Signed-off-by: Max Reitz <mreitz@redhat.com>

There is a use-after-free possible: bdrv_unref_child() leaves bs->backing freed but not NULL. bdrv_attach_child may produce nested polling loop due to drain, than access of freed pointer is possible. I've produced the following crash on 30 iotest with modified code. It does not reproduce on master, but still seems possible: #0 __strcmp_avx2 () at /lib64/libc.so.6 #1 bdrv_backing_overridden (bs=0x55c9d3cc2060) at block.c:6350 #2 bdrv_refresh_filename (bs=0x55c9d3cc2060) at block.c:6404 #3 bdrv_backing_attach (c=0x55c9d48e5520) at block.c:1063 #4 bdrv_replace_child_noperm (child=child@entry=0x55c9d48e5520, new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2290 #5 bdrv_replace_child (child=child@entry=0x55c9d48e5520, new_bs=new_bs@entry=0x55c9d3cc2060) at block.c:2320 #6 bdrv_root_attach_child (child_bs=child_bs@entry=0x55c9d3cc2060, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, ctx=<optimized out>, perm=<optimized out>, shared_perm=21, opaque=0x55c9d3c5a3d0, errp=0x7ffd117108e0) at block.c:2424 #7 bdrv_attach_child (parent_bs=parent_bs@entry=0x55c9d3c5a3d0, child_bs=child_bs@entry=0x55c9d3cc2060, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, errp=errp@entry=0x7ffd117108e0) at block.c:5876 #8 in bdrv_set_backing_hd (bs=bs@entry=0x55c9d3c5a3d0, backing_hd=backing_hd@entry=0x55c9d3cc2060, errp=errp@entry=0x7ffd117108e0) at block.c:2576 #9 stream_prepare (job=0x55c9d49d84a0) at block/stream.c:150 #10 job_prepare (job=0x55c9d49d84a0) at job.c:761 qemu#11 job_txn_apply (txn=<optimized out>, fn=<optimized out>) at job.c:145 qemu#12 job_do_finalize (job=0x55c9d49d84a0) at job.c:778 qemu#13 job_completed_txn_success (job=0x55c9d49d84a0) at job.c:832 qemu#14 job_completed (job=0x55c9d49d84a0) at job.c:845 qemu#15 job_completed (job=0x55c9d49d84a0) at job.c:836 qemu#16 job_exit (opaque=0x55c9d49d84a0) at job.c:864 qemu#17 aio_bh_call (bh=0x55c9d471a160) at util/async.c:117 qemu#18 aio_bh_poll (ctx=ctx@entry=0x55c9d3c46720) at util/async.c:117 qemu#19 aio_poll (ctx=ctx@entry=0x55c9d3c46720, blocking=blocking@entry=true) at util/aio-posix.c:728 qemu#20 bdrv_parent_drained_begin_single (poll=true, c=0x55c9d3d558f0) at block/io.c:121 qemu#21 bdrv_parent_drained_begin_single (c=c@entry=0x55c9d3d558f0, poll=poll@entry=true) at block/io.c:114 qemu#22 bdrv_replace_child_noperm (child=child@entry=0x55c9d3d558f0, new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2258 qemu#23 bdrv_replace_child (child=child@entry=0x55c9d3d558f0, new_bs=new_bs@entry=0x55c9d3d27300) at block.c:2320 qemu#24 bdrv_root_attach_child (child_bs=child_bs@entry=0x55c9d3d27300, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, ctx=<optimized out>, perm=<optimized out>, shared_perm=21, opaque=0x55c9d3cc2060, errp=0x7ffd11710c60) at block.c:2424 qemu#25 bdrv_attach_child (parent_bs=parent_bs@entry=0x55c9d3cc2060, child_bs=child_bs@entry=0x55c9d3d27300, child_name=child_name@entry=0x55c9d241d478 "backing", child_role=child_role@entry=0x55c9d26ecee0 <child_backing>, errp=errp@entry=0x7ffd11710c60) at block.c:5876 qemu#26 bdrv_set_backing_hd (bs=bs@entry=0x55c9d3cc2060, backing_hd=backing_hd@entry=0x55c9d3d27300, errp=errp@entry=0x7ffd11710c60) at block.c:2576 qemu#27 stream_prepare (job=0x55c9d495ead0) at block/stream.c:150 ... Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200316060631.30052-2-vsementsov@virtuozzo.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>

We neglect to free port->bh on the error paths. Fix that. Reproducer: {'execute': 'device_add', 'arguments': {'id': 'virtio_serial_pci0', 'driver': 'virtio-serial-pci', 'bus': 'pci.0', 'addr': '0x5'}, 'id': 'yVkZcGgV'} {'execute': 'device_add', 'arguments': {'id': 'port1', 'driver': 'virtserialport', 'name': 'port1', 'chardev': 'channel1', 'bus': 'virtio_serial_pci0.0', 'nr': 1}, 'id': '3dXdUgJA'} {'execute': 'device_add', 'arguments': {'id': 'port2', 'driver': 'virtserialport', 'name': 'port2', 'chardev': 'channel2', 'bus': 'virtio_serial_pci0.0', 'nr': 1}, 'id': 'qLzcCkob'} {'execute': 'device_add', 'arguments': {'id': 'port2', 'driver': 'virtserialport', 'name': 'port2', 'chardev': 'channel2', 'bus': 'virtio_serial_pci0.0', 'nr': 2}, 'id': 'qLzcCkob'} The leak stack: Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f04a8008ae8 in __interceptor_malloc (/lib64/libasan.so.5+0xefae8) #1 0x7f04a73cf1d5 in g_malloc (/lib64/libglib-2.0.so.0+0x531d5) #2 0x56273eaee484 in aio_bh_new /mnt/sdb/backup/qemu/util/async.c:125 #3 0x56273eafe9a8 in qemu_bh_new /mnt/sdb/backup/qemu/util/main-loop.c:532 #4 0x56273d52e62e in virtser_port_device_realize /mnt/sdb/backup/qemu/hw/char/virtio-serial-bus.c:946 #5 0x56273dcc5040 in device_set_realized /mnt/sdb/backup/qemu/hw/core/qdev.c:891 #6 0x56273e5ebbce in property_set_bool /mnt/sdb/backup/qemu/qom/object.c:2238 #7 0x56273e5e5a9c in object_property_set /mnt/sdb/backup/qemu/qom/object.c:1324 #8 0x56273e5ef5f8 in object_property_set_qobject /mnt/sdb/backup/qemu/qom/qom-qobject.c:26 #9 0x56273e5e5e6a in object_property_set_bool /mnt/sdb/backup/qemu/qom/object.c:1390 #10 0x56273daa40de in qdev_device_add /mnt/sdb/backup/qemu/qdev-monitor.c:680 qemu#11 0x56273daa53e9 in qmp_device_add /mnt/sdb/backup/qemu/qdev-monitor.c:805 Fixes: 199646d Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Amit Shah <amit@kernel.org> Message-Id: <20200309021738.30072-1-pannengyuan@huawei.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

The tulip networking card emulation has an OOB issue in 'tulip_copy_tx_buffers' when the guest provide malformed descriptor. This test will trigger a ASAN heap overflow crash. To trigger this issue we can construct the data as following: 1. construct a 'tulip_descriptor'. Its control is set to '0x7ff | 0x7ff << 11', this will make the 'tulip_copy_tx_buffers's 'len1' and 'len2' to 0x7ff(2047). So 'len1+len2' will overflow 'TULIPState's 'tx_frame' field. This descriptor's 'buf_addr1' and 'buf_addr2' should set to a guest address. 2. write this descriptor to tulip device's CSR4 register. This will set the 'TULIPState's 'current_tx_desc' field. 3. write 'CSR6_ST' to tulip device's CSR6 register. This will trigger 'tulip_xmit_list_update' and finally calls 'tulip_copy_tx_buffers'. Following shows the backtrack of crash: ==31781==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x628000007cd0 at pc 0x7fe03c5a077a bp 0x7fff05b46770 sp 0x7fff05b45f18 WRITE of size 2047 at 0x628000007cd0 thread T0 #0 0x7fe03c5a0779 (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x79779) #1 0x5575fb6daa6a in flatview_read_continue /home/test/qemu/exec.c:3194 #2 0x5575fb6daccb in flatview_read /home/test/qemu/exec.c:3227 #3 0x5575fb6dae66 in address_space_read_full /home/test/qemu/exec.c:3240 #4 0x5575fb6db0cb in address_space_rw /home/test/qemu/exec.c:3268 #5 0x5575fbdfd460 in dma_memory_rw_relaxed /home/test/qemu/include/sysemu/dma.h:87 #6 0x5575fbdfd4b5 in dma_memory_rw /home/test/qemu/include/sysemu/dma.h:110 #7 0x5575fbdfd866 in pci_dma_rw /home/test/qemu/include/hw/pci/pci.h:787 #8 0x5575fbdfd8a3 in pci_dma_read /home/test/qemu/include/hw/pci/pci.h:794 #9 0x5575fbe02761 in tulip_copy_tx_buffers hw/net/tulip.c:585 #10 0x5575fbe0366b in tulip_xmit_list_update hw/net/tulip.c:678 qemu#11 0x5575fbe04073 in tulip_write hw/net/tulip.c:783 Signed-off-by: Li Qiang <liq3ea@163.com> Signed-off-by: Jason Wang <jasowang@redhat.com>

Direct leak of 4120 byte(s) in 1 object(s) allocated from: #0 0x7fa114931887 in __interceptor_calloc (/lib64/libasan.so.6+0xb0887) #1 0x7fa1144ad8f0 in g_malloc0 (/lib64/libglib-2.0.so.0+0x588f0) #2 0x561e3c9c8897 in qmp_object_add /home/elmarco/src/qemu/qom/qom-qmp-cmds.c:291 #3 0x561e3cf48736 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:155 #4 0x561e3c8efb36 in monitor_qmp_dispatch /home/elmarco/src/qemu/monitor/qmp.c:145 #5 0x561e3c8f09ed in monitor_qmp_bh_dispatcher /home/elmarco/src/qemu/monitor/qmp.c:234 #6 0x561e3d08c993 in aio_bh_call /home/elmarco/src/qemu/util/async.c:136 #7 0x561e3d08d0a5 in aio_bh_poll /home/elmarco/src/qemu/util/async.c:164 #8 0x561e3d0a535a in aio_dispatch /home/elmarco/src/qemu/util/aio-posix.c:380 #9 0x561e3d08e3ca in aio_ctx_dispatch /home/elmarco/src/qemu/util/async.c:298 #10 0x7fa1144a776e in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x5276e) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200325184723.2029630-3-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

There is an overflow, the source 'datain.data[2]' is 100 bytes, but the 'ss' is 252 bytes.This may cause a security issue because we can access a lot of unrelated memory data. The len for sbp copy data should take the minimum of mx_sb_len and sb_len_wr, not the maximum. If we use iscsi device for VM backend storage, ASAN show stack: READ of size 252 at 0xfffd149dcfc4 thread T0 #0 0xaaad433d0d34 in __asan_memcpy (aarch64-softmmu/qemu-system-aarch64+0x2cb0d34) #1 0xaaad45f9d6d0 in iscsi_aio_ioctl_cb /qemu/block/iscsi.c:996:9 #2 0xfffd1af0e2dc (/usr/lib64/iscsi/libiscsi.so.8+0xe2dc) #3 0xfffd1af0d174 (/usr/lib64/iscsi/libiscsi.so.8+0xd174) #4 0xfffd1af19fac (/usr/lib64/iscsi/libiscsi.so.8+0x19fac) #5 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5 #6 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9 #7 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20 #8 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520 #9 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5 #10 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4) qemu#11 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9 qemu#12 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242 qemu#13 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518 qemu#14 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9 qemu#15 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5 qemu#16 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c) qemu#17 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740) 0xfffd149dcfc4 is located 0 bytes to the right of 100-byte region [0xfffd149dcf60,0xfffd149dcfc4) allocated by thread T0 here: #0 0xaaad433d1e70 in __interceptor_malloc (aarch64-softmmu/qemu-system-aarch64+0x2cb1e70) #1 0xfffd1af0e254 (/usr/lib64/iscsi/libiscsi.so.8+0xe254) #2 0xfffd1af0d174 (/usr/lib64/iscsi/libiscsi.so.8+0xd174) #3 0xfffd1af19fac (/usr/lib64/iscsi/libiscsi.so.8+0x19fac) #4 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5 #5 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9 #6 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20 #7 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520 #8 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5 #9 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4) #10 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9 qemu#11 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242 qemu#12 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518 qemu#13 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9 qemu#14 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5 qemu#15 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c) qemu#16 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740) Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200418062602.10776-1-kuhn.chenqun@huawei.com Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

when s->inflight is freed, vhost_dev_free_inflight may try to access s->inflight->addr, it will retrigger the following issue. ==7309==ERROR: AddressSanitizer: heap-use-after-free on address 0x604001020d18 at pc 0x555555ce948a bp 0x7fffffffb170 sp 0x7fffffffb160 READ of size 8 at 0x604001020d18 thread T0 #0 0x555555ce9489 in vhost_dev_free_inflight /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473 #1 0x555555cd86eb in virtio_reset /root/smartx/qemu-el7/qemu-test/hw/virtio/virtio.c:1214 #2 0x5555560d3eff in virtio_pci_reset hw/virtio/virtio-pci.c:1859 #3 0x555555f2ac53 in device_set_realized hw/core/qdev.c:893 #4 0x5555561d572c in property_set_bool qom/object.c:1925 #5 0x5555561de8de in object_property_set_qobject qom/qom-qobject.c:27 #6 0x5555561d99f4 in object_property_set_bool qom/object.c:1188 #7 0x555555e50ae7 in qdev_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:626 #8 0x555555e51213 in qmp_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:806 #9 0x555555e8ff40 in hmp_device_add /root/smartx/qemu-el7/qemu-test/hmp.c:1951 #10 0x555555be889a in handle_hmp_command /root/smartx/qemu-el7/qemu-test/monitor.c:3404 qemu#11 0x555555beac8b in monitor_command_cb /root/smartx/qemu-el7/qemu-test/monitor.c:4296 qemu#12 0x555556433eb7 in readline_handle_byte util/readline.c:393 qemu#13 0x555555be89ec in monitor_read /root/smartx/qemu-el7/qemu-test/monitor.c:4279 qemu#14 0x5555563285cc in tcp_chr_read chardev/char-socket.c:470 qemu#15 0x7ffff670b968 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4a968) qemu#16 0x55555640727c in glib_pollfds_poll util/main-loop.c:215 qemu#17 0x55555640727c in os_host_main_loop_wait util/main-loop.c:238 qemu#18 0x55555640727c in main_loop_wait util/main-loop.c:497 qemu#19 0x555555b2d0bf in main_loop /root/smartx/qemu-el7/qemu-test/vl.c:2013 qemu#20 0x555555b2d0bf in main /root/smartx/qemu-el7/qemu-test/vl.c:4776 qemu#21 0x7fffdd2eb444 in __libc_start_main (/lib64/libc.so.6+0x22444) qemu#22 0x555555b3767a (/root/smartx/qemu-el7/qemu-test/x86_64-softmmu/qemu-system-x86_64+0x5e367a) 0x604001020d18 is located 8 bytes inside of 40-byte region [0x604001020d10,0x604001020d38) freed by thread T0 here: #0 0x7ffff6f00508 in __interceptor_free (/lib64/libasan.so.4+0xde508) #1 0x7ffff671107d in g_free (/lib64/libglib-2.0.so.0+0x5007d) previously allocated by thread T0 here: #0 0x7ffff6f00a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88) #1 0x7ffff6710fc5 in g_malloc0 (/lib64/libglib-2.0.so.0+0x4ffc5) SUMMARY: AddressSanitizer: heap-use-after-free /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473 in vhost_dev_free_inflight Shadow bytes around the buggy address: 0x0c08801fc150: fa fa 00 00 00 00 04 fa fa fa fd fd fd fd fd fa 0x0c08801fc160: fa fa fd fd fd fd fd fd fa fa 00 00 00 00 04 fa 0x0c08801fc170: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 04 fa 0x0c08801fc180: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 00 01 0x0c08801fc190: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 04 fa =>0x0c08801fc1a0: fa fa fd[fd]fd fd fd fa fa fa fd fd fd fd fd fa 0x0c08801fc1b0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa 0x0c08801fc1c0: fa fa 00 00 00 00 00 fa fa fa fd fd fd fd fd fd 0x0c08801fc1d0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa 0x0c08801fc1e0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fd 0x0c08801fc1f0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==7309==ABORTING Signed-off-by: Li Feng <fengli@smartx.com> Message-Id: <20200417101707.14467-1-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

…gration 'rdma->host' is malloced in qemu_rdma_data_init, but forgot to free on the error path in rdma_start_incoming_migration(), this patch fix that. The leak stack: Direct leak of 2 byte(s) in 1 object(s) allocated from: #0 0x7fb7add18ae8 in __interceptor_malloc (/lib64/libasan.so.5+0xefae8) #1 0x7fb7ad0df1d5 in g_malloc (/lib64/libglib-2.0.so.0+0x531d5) #2 0x7fb7ad0f8b32 in g_strdup (/lib64/libglib-2.0.so.0+0x6cb32) #3 0x55a0464a0f6f in qemu_rdma_data_init /mnt/sdb/qemu/migration/rdma.c:2647 #4 0x55a0464b0e76 in rdma_start_incoming_migration /mnt/sdb/qemu/migration/rdma.c:4020 #5 0x55a0463f898a in qemu_start_incoming_migration /mnt/sdb/qemu/migration/migration.c:365 #6 0x55a0458c75d3 in qemu_init /mnt/sdb/qemu/softmmu/vl.c:4438 #7 0x55a046a3d811 in main /mnt/sdb/qemu/softmmu/main.c:48 #8 0x7fb7a8417872 in __libc_start_main (/lib64/libc.so.6+0x23872) #9 0x55a04536b26d in _start (/mnt/sdb/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x286926d) Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200420102727.17339-1-pannengyuan@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

When error happen in multifd_new_send_channel_async, 'sioc' will not be used to create the multifd_send_thread. Let's free it to avoid a memleak. And also do error_free after migrate_set_error() to avoid another leak in the same place. The leak stack: Direct leak of 2880 byte(s) in 8 object(s) allocated from: #0 0x7f20b5118ae8 in __interceptor_malloc (/lib64/libasan.so.5+0xefae8) #1 0x7f20b44df1d5 in g_malloc (/lib64/libglib-2.0.so.0+0x531d5) #2 0x564133bce18b in object_new_with_type /mnt/sdb/backup/qemu/qom/object.c:683 #3 0x564133eea950 in qio_channel_socket_new /mnt/sdb/backup/qemu/io/channel-socket.c:56 #4 0x5641339cfe4f in socket_send_channel_create /mnt/sdb/backup/qemu/migration/socket.c:37 #5 0x564133a10328 in multifd_save_setup /mnt/sdb/backup/qemu/migration/multifd.c:772 #6 0x5641339cebed in migrate_fd_connect /mnt/sdb/backup/qemu/migration/migration.c:3530 #7 0x5641339d15e4 in migration_channel_connect /mnt/sdb/backup/qemu/migration/channel.c:92 #8 0x5641339cf5b7 in socket_outgoing_migration /mnt/sdb/backup/qemu/migration/socket.c:108 Direct leak of 384 byte(s) in 8 object(s) allocated from: #0 0x7f20b5118cf0 in calloc (/lib64/libasan.so.5+0xefcf0) #1 0x7f20b44df22d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5322d) #2 0x56413406fc17 in error_setv /mnt/sdb/backup/qemu/util/error.c:61 #3 0x564134070464 in error_setg_errno_internal /mnt/sdb/backup/qemu/util/error.c:109 #4 0x5641340851be in inet_connect_addr /mnt/sdb/backup/qemu/util/qemu-sockets.c:379 #5 0x5641340851be in inet_connect_saddr /mnt/sdb/backup/qemu/util/qemu-sockets.c:458 #6 0x5641340870ab in socket_connect /mnt/sdb/backup/qemu/util/qemu-sockets.c:1105 #7 0x564133eeaabf in qio_channel_socket_connect_sync /mnt/sdb/backup/qemu/io/channel-socket.c:145 #8 0x564133eeabf5 in qio_channel_socket_connect_worker /mnt/sdb/backup/qemu/io/channel-socket.c:168 Indirect leak of 360 byte(s) in 8 object(s) allocated from: #0 0x7f20b5118ae8 in __interceptor_malloc (/lib64/libasan.so.5+0xefae8) #1 0x7f20af901817 in __GI___vasprintf_chk (/lib64/libc.so.6+0x10d817) #2 0x7f20b451fa6c in g_vasprintf (/lib64/libglib-2.0.so.0+0x93a6c) #3 0x7f20b44f8cd0 in g_strdup_vprintf (/lib64/libglib-2.0.so.0+0x6ccd0) #4 0x7f20b44f8d8c in g_strdup_printf (/lib64/libglib-2.0.so.0+0x6cd8c) #5 0x56413406fc86 in error_setv /mnt/sdb/backup/qemu/util/error.c:65 #6 0x564134070464 in error_setg_errno_internal /mnt/sdb/backup/qemu/util/error.c:109 #7 0x5641340851be in inet_connect_addr /mnt/sdb/backup/qemu/util/qemu-sockets.c:379 #8 0x5641340851be in inet_connect_saddr /mnt/sdb/backup/qemu/util/qemu-sockets.c:458 #9 0x5641340870ab in socket_connect /mnt/sdb/backup/qemu/util/qemu-sockets.c:1105 #10 0x564133eeaabf in qio_channel_socket_connect_sync /mnt/sdb/backup/qemu/io/channel-socket.c:145 qemu#11 0x564133eeabf5 in qio_channel_socket_connect_worker /mnt/sdb/backup/qemu/io/channel-socket.c:168 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200506095416.26099-2-pannengyuan@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

…leaks When error happen in multifd_send_thread, it use error_copy to set migrate error in multifd_send_terminate_threads(). We should call error_free after it. Similarly, fix another two places in multifd_recv_thread/multifd_save_cleanup. The leak stack: Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f781af07cf0 in calloc (/lib64/libasan.so.5+0xefcf0) #1 0x7f781a2ce22d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5322d) #2 0x55ee1d075c17 in error_setv /mnt/sdb/backup/qemu/util/error.c:61 #3 0x55ee1d076464 in error_setg_errno_internal /mnt/sdb/backup/qemu/util/error.c:109 #4 0x55ee1cef066e in qio_channel_socket_writev /mnt/sdb/backup/qemu/io/channel-socket.c:569 #5 0x55ee1cee806b in qio_channel_writev /mnt/sdb/backup/qemu/io/channel.c:207 #6 0x55ee1cee806b in qio_channel_writev_all /mnt/sdb/backup/qemu/io/channel.c:171 #7 0x55ee1cee8248 in qio_channel_write_all /mnt/sdb/backup/qemu/io/channel.c:257 #8 0x55ee1ca12c9a in multifd_send_thread /mnt/sdb/backup/qemu/migration/multifd.c:657 #9 0x55ee1d0607fc in qemu_thread_start /mnt/sdb/backup/qemu/util/qemu-thread-posix.c:519 #10 0x7f78159ae2dd in start_thread (/lib64/libpthread.so.0+0x82dd) qemu#11 0x7f78156df4b2 in __GI___clone (/lib64/libc.so.6+0xfc4b2) Indirect leak of 52 byte(s) in 1 object(s) allocated from: #0 0x7f781af07f28 in __interceptor_realloc (/lib64/libasan.so.5+0xeff28) #1 0x7f78156f07d9 in __GI___vasprintf_chk (/lib64/libc.so.6+0x10d7d9) #2 0x7f781a30ea6c in g_vasprintf (/lib64/libglib-2.0.so.0+0x93a6c) #3 0x7f781a2e7cd0 in g_strdup_vprintf (/lib64/libglib-2.0.so.0+0x6ccd0) #4 0x7f781a2e7d8c in g_strdup_printf (/lib64/libglib-2.0.so.0+0x6cd8c) #5 0x55ee1d075c86 in error_setv /mnt/sdb/backup/qemu/util/error.c:65 #6 0x55ee1d076464 in error_setg_errno_internal /mnt/sdb/backup/qemu/util/error.c:109 #7 0x55ee1cef066e in qio_channel_socket_writev /mnt/sdb/backup/qemu/io/channel-socket.c:569 #8 0x55ee1cee806b in qio_channel_writev /mnt/sdb/backup/qemu/io/channel.c:207 #9 0x55ee1cee806b in qio_channel_writev_all /mnt/sdb/backup/qemu/io/channel.c:171 #10 0x55ee1cee8248 in qio_channel_write_all /mnt/sdb/backup/qemu/io/channel.c:257 qemu#11 0x55ee1ca12c9a in multifd_send_thread /mnt/sdb/backup/qemu/migration/multifd.c:657 qemu#12 0x55ee1d0607fc in qemu_thread_start /mnt/sdb/backup/qemu/util/qemu-thread-posix.c:519 qemu#13 0x7f78159ae2dd in start_thread (/lib64/libpthread.so.0+0x82dd) qemu#14 0x7f78156df4b2 in __GI___clone (/lib64/libc.so.6+0xfc4b2) Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200506095416.26099-3-pannengyuan@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

When we hotplug vcpus, cpu_update_state is added to vm_change_state_head in kvm_arch_init_vcpu(). But it forgot to delete in kvm_arch_destroy_vcpu() after unplug. Then it will cause a use-after-free access. This patch delete it in kvm_arch_destroy_vcpu() to fix that. Reproducer: virsh setvcpus vm1 4 --live virsh setvcpus vm1 2 --live virsh suspend vm1 virsh resume vm1 The UAF stack: ==qemu-system-x86_64==28233==ERROR: AddressSanitizer: heap-use-after-free on address 0x62e00002e798 at pc 0x5573c6917d9e bp 0x7fff07139e50 sp 0x7fff07139e40 WRITE of size 1 at 0x62e00002e798 thread T0 #0 0x5573c6917d9d in cpu_update_state /mnt/sdb/qemu/target/i386/kvm.c:742 #1 0x5573c699121a in vm_state_notify /mnt/sdb/qemu/vl.c:1290 #2 0x5573c636287e in vm_prepare_start /mnt/sdb/qemu/cpus.c:2144 #3 0x5573c6362927 in vm_start /mnt/sdb/qemu/cpus.c:2150 #4 0x5573c71e8304 in qmp_cont /mnt/sdb/qemu/monitor/qmp-cmds.c:173 #5 0x5573c727cb1e in qmp_marshal_cont qapi/qapi-commands-misc.c:835 #6 0x5573c7694c7a in do_qmp_dispatch /mnt/sdb/qemu/qapi/qmp-dispatch.c:132 #7 0x5573c7694c7a in qmp_dispatch /mnt/sdb/qemu/qapi/qmp-dispatch.c:175 #8 0x5573c71d9110 in monitor_qmp_dispatch /mnt/sdb/qemu/monitor/qmp.c:145 #9 0x5573c71dad4f in monitor_qmp_bh_dispatcher /mnt/sdb/qemu/monitor/qmp.c:234 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200513132630.13412-1-pannengyuan@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Trying libFuzzer on the vmport device, we get: AddressSanitizer:DEADLYSIGNAL ================================================================= ==29476==ERROR: AddressSanitizer: SEGV on unknown address 0x000000008840 (pc 0x56448bec4d79 bp 0x7ffeec9741b0 sp 0x7ffeec9740e0 T0) ==29476==The signal is caused by a READ memory access. #0 0x56448bec4d78 in vmport_ioport_read (qemu-fuzz-i386+0x1260d78) #1 0x56448bb5f175 in memory_region_read_accessor (qemu-fuzz-i386+0xefb175) #2 0x56448bb30c13 in access_with_adjusted_size (qemu-fuzz-i386+0xeccc13) #3 0x56448bb2ea27 in memory_region_dispatch_read1 (qemu-fuzz-i386+0xecaa27) #4 0x56448bb2e443 in memory_region_dispatch_read (qemu-fuzz-i386+0xeca443) #5 0x56448b961ab1 in flatview_read_continue (qemu-fuzz-i386+0xcfdab1) #6 0x56448b96336d in flatview_read (qemu-fuzz-i386+0xcff36d) #7 0x56448b962ec4 in address_space_read_full (qemu-fuzz-i386+0xcfeec4) This is easily reproducible using: $ echo inb 0x5658 | qemu-system-i386 -M isapc,accel=qtest -qtest stdio [I 1589796572.009763] OPENED [R +0.008069] inb 0x5658 Segmentation fault (core dumped) $ coredumpctl gdb -q Program terminated with signal SIGSEGV, Segmentation fault. #0 0x00005605b54d0f21 in vmport_ioport_read (opaque=0x5605b7531ce0, addr=0, size=4) at hw/i386/vmport.c:77 77 eax = env->regs[R_EAX]; (gdb) p cpu $1 = (X86CPU *) 0x0 (gdb) bt #0 0x00005605b54d0f21 in vmport_ioport_read (opaque=0x5605b7531ce0, addr=0, size=4) at hw/i386/vmport.c:77 #1 0x00005605b53db114 in memory_region_read_accessor (mr=0x5605b7531d80, addr=0, value=0x7ffc9d261a30, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:434 #2 0x00005605b53db5d4 in access_with_adjusted_size (addr=0, value=0x7ffc9d261a30, size=1, access_size_min=4, access_size_max=4, access_fn= 0x5605b53db0d2 <memory_region_read_accessor>, mr=0x5605b7531d80, attrs=...) at memory.c:544 #3 0x00005605b53de156 in memory_region_dispatch_read1 (mr=0x5605b7531d80, addr=0, pval=0x7ffc9d261a30, size=1, attrs=...) at memory.c:1396 #4 0x00005605b53de228 in memory_region_dispatch_read (mr=0x5605b7531d80, addr=0, pval=0x7ffc9d261a30, op=MO_8, attrs=...) at memory.c:1424 #5 0x00005605b537c80a in flatview_read_continue (fv=0x5605b7650290, addr=22104, attrs=..., ptr=0x7ffc9d261b4b, len=1, addr1=0, l=1, mr=0x5605b7531d80) at exec.c:3200 #6 0x00005605b537c95d in flatview_read (fv=0x5605b7650290, addr=22104, attrs=..., buf=0x7ffc9d261b4b, len=1) at exec.c:3239 #7 0x00005605b537c9e6 in address_space_read_full (as=0x5605b5f74ac0 <address_space_io>, addr=22104, attrs=..., buf=0x7ffc9d261b4b, len=1) at exec.c:3252 #8 0x00005605b53d5a5d in address_space_read (len=1, buf=0x7ffc9d261b4b, attrs=..., addr=22104, as=0x5605b5f74ac0 <address_space_io>) at include/exec/memory.h:2401 #9 0x00005605b53d5a5d in cpu_inb (addr=22104) at ioport.c:88 X86CPU is NULL because QTest accelerator does not use CPU. Fix by returning default values when QTest accelerator is used. Reported-by: Clang AddressSanitizer Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

mon_get_cpu_env() is indirectly called monitor_parse_arguments() where the current monitor isn't set yet. Instead of using monitor_cur_env(), explicitly pass the Monitor pointer to the function. Without this fix, an HMP command like "x $pc" crashes like this: #0 0x0000555555caa01f in mon_get_cpu_sync (mon=0x0, synchronize=true) at ../monitor/misc.c:270 #1 0x0000555555caa141 in mon_get_cpu (mon=0x0) at ../monitor/misc.c:294 #2 0x0000555555caa158 in mon_get_cpu_env () at ../monitor/misc.c:299 #3 0x0000555555b19739 in monitor_get_pc (mon=0x555556ad2de0, md=0x5555565d2d40 <monitor_defs+1152>, val=0) at ../target/i386/monitor.c:607 #4 0x0000555555cadbec in get_monitor_def (mon=0x555556ad2de0, pval=0x7fffffffc208, name=0x7fffffffc220 "pc") at ../monitor/misc.c:1681 #5 0x000055555582ec4f in expr_unary (mon=0x555556ad2de0) at ../monitor/hmp.c:387 #6 0x000055555582edbb in expr_prod (mon=0x555556ad2de0) at ../monitor/hmp.c:421 #7 0x000055555582ee79 in expr_logic (mon=0x555556ad2de0) at ../monitor/hmp.c:455 #8 0x000055555582eefe in expr_sum (mon=0x555556ad2de0) at ../monitor/hmp.c:484 #9 0x000055555582efe8 in get_expr (mon=0x555556ad2de0, pval=0x7fffffffc418, pp=0x7fffffffc408) at ../monitor/hmp.c:511 #10 0x000055555582fcd4 in monitor_parse_arguments (mon=0x555556ad2de0, endp=0x7fffffffc890, cmd=0x555556675b50 <hmp_cmds+7920>) at ../monitor/hmp.c:876 qemu#11 0x00005555558306a8 in handle_hmp_command (mon=0x555556ad2de0, cmdline=0x555556ada452 "$pc") at ../monitor/hmp.c:1087 qemu#12 0x000055555582df14 in monitor_command_cb (opaque=0x555556ad2de0, cmdline=0x555556ada450 "x $pc", readline_opaque=0x0) at ../monitor/hmp.c:47 After this fix, nothing is left in monitor_parse_arguments() that can indirectly call monitor_cur(), so the fix is complete. Fixes: ff04108 Reported-by: lichun <lichun@ruijie.com.cn> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20201113114326.97663-4-kwolf@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

In qobject_type(), NULL is returned when the 'QObject' returned from parse_value() is not of QString type, and this 'QObject' memory will leaked. So we need to first cache the 'QObject' returned from parse_value(), and finally free 'QObject' memory at the end of the function. Also, we add a testcast about invalid dict key. The memleak stack is as follows: Direct leak of 32 byte(s) in 1 object(s) allocated from: #0 0xfffe4b3c34fb in __interceptor_malloc (/lib64/libasan.so.4+0xd34fb) #1 0xfffe4ae48aa3 in g_malloc (/lib64/libglib-2.0.so.0+0x58aa3) #2 0xaaab3557d9f7 in qnum_from_int qemu/qobject/qnum.c:25 #3 0xaaab35584d23 in parse_literal qemu/qobject/json-parser.c:511 #4 0xaaab35584d23 in parse_value qemu/qobject/json-parser.c:554 #5 0xaaab35583d77 in parse_pair qemu/qobject/json-parser.c:270 #6 0xaaab355845db in parse_object qemu/qobject/json-parser.c:327 #7 0xaaab355845db in parse_value qemu/qobject/json-parser.c:546 #8 0xaaab35585b1b in json_parser_parse qemu/qobject/json-parser.c:580 #9 0xaaab35583703 in json_message_process_token qemu/qobject/json-streamer.c:92 #10 0xaaab355ddccf in json_lexer_feed_char qemu/qobject/json-lexer.c:313 qemu#11 0xaaab355de0eb in json_lexer_feed qemu/qobject/json-lexer.c:350 qemu#12 0xaaab354aff67 in tcp_chr_read qemu/chardev/char-socket.c:525 qemu#13 0xfffe4ae429db in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x529db) qemu#14 0xfffe4ae42d8f (/lib64/libglib-2.0.so.0+0x52d8f) qemu#15 0xfffe4ae430df in g_main_loop_run (/lib64/libglib-2.0.so.0+0x530df) qemu#16 0xaaab34d70bff in iothread_run qemu/iothread.c:82 qemu#17 0xaaab3559d71b in qemu_thread_start qemu/util/qemu-thread-posix.c:519 Fixes: 532fb53 ("qapi: Make more of qobject_to()") Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Alex Chen <alex.chen@huawei.com> Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20201113145525.85151-1-alex.chen@huawei.com> [Commit message tweaked]

Properly free resp for get_watchdog_action() to avoid memory leak. ASAN shows memory leak stack: Indirect leak of 12360 byte(s) in 3 object(s) allocated from: #0 0x7f41ab6cbd4e in __interceptor_calloc (/lib64/libasan.so.5+0x112d4e) #1 0x7f41ab4eaa50 in g_malloc0 (/lib64/libglib-2.0.so.0+0x55a50) #2 0x556487d5374b in qdict_new ../qobject/qdict.c:29 #3 0x556487d65e1a in parse_object ../qobject/json-parser.c:318 #4 0x556487d65cb6 in parse_pair ../qobject/json-parser.c:287 #5 0x556487d65ebd in parse_object ../qobject/json-parser.c:343 #6 0x556487d661d5 in json_parser_parse ../qobject/json-parser.c:580 #7 0x556487d513df in json_message_process_token ../qobject/json-streamer.c:92 #8 0x556487d63919 in json_lexer_feed_char ../qobject/json-lexer.c:313 #9 0x556487d63d75 in json_lexer_feed ../qobject/json-lexer.c:350 #10 0x556487d28b2a in qmp_fd_receive ../tests/qtest/libqtest.c:613 qemu#11 0x556487d2a16f in qtest_qmp_eventwait_ref ../tests/qtest/libqtest.c:827 qemu#12 0x556487d248e2 in get_watchdog_action ../tests/qtest/npcm7xx_watchdog_timer-test.c:94 qemu#13 0x556487d25765 in test_enabling_flags ../tests/qtest/npcm7xx_watchdog_timer-test.c:243 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com> Message-Id: <20201118115646.2461726-3-kuhn.chenqun@huawei.com> Reviewed-by: Havard Skinnemoen <hskinnemoen@google.com> Reviewed-by: Hao Wu <wuhaotsh@google.com> Signed-off-by: Thomas Huth <thuth@redhat.com>

When 'j = icu->nr_sense – 1', the 'j < icu->nr_sense' condition is true, then 'j = icu->nr_sense', the'icu->init_sense[j]' has out-of-bounds access. The asan showed stack: ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000004d7d at pc 0x55852cd26a76 bp 0x7ffe39f26200 sp 0x7ffe39f261f0 READ of size 1 at 0x604000004d7d thread T0 #0 0x55852cd26a75 in rxicu_realize ../hw/intc/rx_icu.c:311 #1 0x55852cf075f7 in device_set_realized ../hw/core/qdev.c:886 #2 0x55852cd4a32f in property_set_bool ../qom/object.c:2251 #3 0x55852cd4f9bb in object_property_set ../qom/object.c:1398 #4 0x55852cd54f3f in object_property_set_qobject ../qom/qom-qobject.c:28 #5 0x55852cd4fc3f in object_property_set_bool ../qom/object.c:1465 #6 0x55852cbf0b27 in register_icu ../hw/rx/rx62n.c:156 #7 0x55852cbf12a6 in rx62n_realize ../hw/rx/rx62n.c:261 #8 0x55852cf075f7 in device_set_realized ../hw/core/qdev.c:886 #9 0x55852cd4a32f in property_set_bool ../qom/object.c:2251 #10 0x55852cd4f9bb in object_property_set ../qom/object.c:1398 qemu#11 0x55852cd54f3f in object_property_set_qobject ../qom/qom-qobject.c:28 qemu#12 0x55852cd4fc3f in object_property_set_bool ../qom/object.c:1465 qemu#13 0x55852cbf1a85 in rx_gdbsim_init ../hw/rx/rx-gdbsim.c:109 qemu#14 0x55852cd22de0 in qemu_init ../softmmu/vl.c:4380 qemu#15 0x55852ca57088 in main ../softmmu/main.c:49 qemu#16 0x7feefafa5d42 in __libc_start_main (/lib64/libc.so.6+0x26d42) Add the 'ice->src[i].sense' initialize to the default value, and then process init_sense array to identify which irqs should be level-triggered. Suggested-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20201111141733.2358800-1-kuhn.chenqun@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

When running device-introspect-test, a memory leak occurred in the s390_cpu_initfn function, this patch use timer_free() in the finalize function to fix it. ASAN shows memory leak stack: Direct leak of 3552 byte(s) in 74 object(s) allocated from: #0 0xfffeb3d4e1f0 in __interceptor_calloc (/lib64/libasan.so.5+0xee1f0) #1 0xfffeb36e6800 in g_malloc0 (/lib64/libglib-2.0.so.0+0x56800) #2 0xaaad51a8f9c4 in timer_new_full qemu/include/qemu/timer.h:523 #3 0xaaad51a8f9c4 in timer_new qemu/include/qemu/timer.h:544 #4 0xaaad51a8f9c4 in timer_new_ns qemu/include/qemu/timer.h:562 #5 0xaaad51a8f9c4 in s390_cpu_initfn qemu/target/s390x/cpu.c:304 #6 0xaaad51e00f58 in object_init_with_type qemu/qom/object.c:371 #7 0xaaad51e0406c in object_initialize_with_type qemu/qom/object.c:515 #8 0xaaad51e042e0 in object_new_with_type qemu/qom/object.c:729 #9 0xaaad51e3ff40 in qmp_device_list_properties qemu/qom/qom-qmp-cmds.c:153 #10 0xaaad51910518 in qdev_device_help qemu/softmmu/qdev-monitor.c:283 qemu#11 0xaaad51911918 in qmp_device_add qemu/softmmu/qdev-monitor.c:801 qemu#12 0xaaad51911e48 in hmp_device_add qemu/softmmu/qdev-monitor.c:916 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Gan Qixin <ganqixin@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20201204081209.360524-4-ganqixin@huawei.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>

Alexander reported an issue in gic_get_current_cpu() using the fuzzer. Yet another "deref current_cpu with QTest" bug, reproducible doing: $ echo readb 0xf03ff000 | qemu-system-arm -M npcm750-evb,accel=qtest -qtest stdio [I 1611849440.651452] OPENED [R +0.242498] readb 0xf03ff000 hw/intc/arm_gic.c:63:29: runtime error: member access within null pointer of type 'CPUState' (aka 'struct CPUState') SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior hw/intc/arm_gic.c:63:29 in AddressSanitizer:DEADLYSIGNAL ================================================================= ==3719691==ERROR: AddressSanitizer: SEGV on unknown address 0x0000000082a0 (pc 0x5618790ac882 bp 0x7ffca946f4f0 sp 0x7ffca946f4a0 T0) ==3719691==The signal is caused by a READ memory access. #0 0x5618790ac882 in gic_get_current_cpu hw/intc/arm_gic.c:63:29 #1 0x5618790a8901 in gic_dist_readb hw/intc/arm_gic.c:955:11 #2 0x5618790a7489 in gic_dist_read hw/intc/arm_gic.c:1158:17 #3 0x56187adc573b in memory_region_read_with_attrs_accessor softmmu/memory.c:464:9 #4 0x56187ad7903a in access_with_adjusted_size softmmu/memory.c:552:18 #5 0x56187ad766d6 in memory_region_dispatch_read1 softmmu/memory.c:1426:16 #6 0x56187ad758a8 in memory_region_dispatch_read softmmu/memory.c:1449:9 #7 0x56187b09e84c in flatview_read_continue softmmu/physmem.c:2822:23 #8 0x56187b0a0115 in flatview_read softmmu/physmem.c:2862:12 #9 0x56187b09fc9e in address_space_read_full softmmu/physmem.c:2875:18 #10 0x56187aa88633 in address_space_read include/exec/memory.h:2489:18 qemu#11 0x56187aa88633 in qtest_process_command softmmu/qtest.c:558:13 qemu#12 0x56187aa81881 in qtest_process_inbuf softmmu/qtest.c:797:9 qemu#13 0x56187aa80e02 in qtest_read softmmu/qtest.c:809:5 current_cpu is NULL because QTest accelerator does not use CPU. Fix by skipping the check and returning the first CPU index when QTest accelerator is used, similarly to commit c781a2c ("hw/i386/vmport: Allow QTest use without crashing"). Reported-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Alexander Bulekov <alxndr@bu.edu> Message-id: 20210128161417.3726358-1-philmd@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

When the reconnect in NBD client is in progress, the iochannel used for NBD connection doesn't exist. Therefore an attempt to detach it from the aio_context of the parent BlockDriverState results in a NULL pointer dereference. The problem is triggerable, in particular, when an outgoing migration is about to finish, and stopping the dataplane tries to move the BlockDriverState from the iothread aio_context to the main loop. If the NBD connection is lost before this point, and the NBD client has entered the reconnect procedure, QEMU crashes: #0 qemu_aio_coroutine_enter (ctx=0x5618056c7580, co=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-coroutine.c:109 #1 0x00005618034b1b68 in nbd_client_attach_aio_context_bh ( opaque=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block/nbd.c:164 #2 0x000056180353116b in aio_wait_bh (opaque=0x7f60e1e63700) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:55 #3 0x0000561803530633 in aio_bh_call (bh=0x7f60d40a7e80) at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:136 #4 aio_bh_poll (ctx=ctx@entry=0x5618056c7580) at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:164 #5 0x0000561803533e5a in aio_poll (ctx=ctx@entry=0x5618056c7580, blocking=blocking@entry=true) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-posix.c:650 #6 0x000056180353128d in aio_wait_bh_oneshot (ctx=0x5618056c7580, cb=<optimized out>, opaque=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:71 #7 0x000056180345c50a in bdrv_attach_aio_context (new_context=0x5618056c7580, bs=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6172 #8 bdrv_set_aio_context_ignore (bs=bs@entry=0x561805ed4c00, new_context=new_context@entry=0x5618056c7580, ignore=ignore@entry=0x7f60e1e63780) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6237 #9 0x000056180345c969 in bdrv_child_try_set_aio_context ( bs=bs@entry=0x561805ed4c00, ctx=0x5618056c7580, ignore_child=<optimized out>, errp=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6332 #10 0x00005618034957db in blk_do_set_aio_context (blk=0x56180695b3f0, new_context=0x5618056c7580, update_root_node=update_root_node@entry=true, errp=errp@entry=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:1989 qemu#11 0x00005618034980bd in blk_set_aio_context (blk=<optimized out>, new_context=<optimized out>, errp=errp@entry=0x0) at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:2010 qemu#12 0x0000561803197953 in virtio_blk_data_plane_stop (vdev=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292 qemu#13 0x00005618033d67bf in virtio_bus_stop_ioeventfd (bus=0x5618056d9f08) at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio-bus.c:245 qemu#14 0x00005618031c9b2e in virtio_vmstate_change (opaque=0x5618056d9f90, running=0, state=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio.c:3220 qemu#15 0x0000561803208bfd in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at /build/qemu-6MF7tq/qemu-5.0.1/softmmu/vl.c:1275 qemu#16 0x0000561803155c02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE, send_stop=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/cpus.c:1032 qemu#17 0x00005618033e3765 in migration_completion (s=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:2914 qemu#18 migration_iteration_run (s=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3275 qemu#19 migration_thread (opaque=opaque@entry=0x5618056e6960) at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3439 qemu#20 0x0000561803536ad6 in qemu_thread_start (args=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-thread-posix.c:519 qemu#21 0x00007f61085d06ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 qemu#22 0x00007f610830641d in sysctl () from /lib/x86_64-linux-gnu/libc.so.6 qemu#23 0x0000000000000000 in ?? () Fix it by checking that the iochannel is non-null before trying to detach it from the aio_context. If it is null, no detaching is needed, and it will get reattached in the proper aio_context once the connection is reestablished. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210129073859.683063-2-rvkagan@yandex-team.ru> Signed-off-by: Eric Blake <eblake@redhat.com>

When an NBD block driver state is moved from one aio_context to another (e.g. when doing a drain in a migration thread), nbd_client_attach_aio_context_bh is executed that enters the connection coroutine. However, the assumption that ->connection_co is always present here appears incorrect: the connection may have encountered an error other than -EIO in the underlying transport, and thus may have decided to quit rather than keep trying to reconnect, and therefore it may have terminated the connection coroutine. As a result an attempt to reassign the client in this state (NBD_CLIENT_QUIT) to a different aio_context leads to a null pointer dereference: #0 qio_channel_detach_aio_context (ioc=0x0) at /build/qemu-gYtjVn/qemu-5.0.1/io/channel.c:452 #1 0x0000562a242824b3 in bdrv_detach_aio_context (bs=0x562a268d6a00) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6151 #2 bdrv_set_aio_context_ignore (bs=bs@entry=0x562a268d6a00, new_context=new_context@entry=0x562a260c9580, ignore=ignore@entry=0x7feeadc9b780) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6230 #3 0x0000562a24282969 in bdrv_child_try_set_aio_context (bs=bs@entry=0x562a268d6a00, ctx=0x562a260c9580, ignore_child=<optimized out>, errp=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6332 #4 0x0000562a242bb7db in blk_do_set_aio_context (blk=0x562a2735d0d0, new_context=0x562a260c9580, update_root_node=update_root_node@entry=true, errp=errp@entry=0x0) at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:1989 #5 0x0000562a242be0bd in blk_set_aio_context (blk=<optimized out>, new_context=<optimized out>, errp=errp@entry=0x0) at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:2010 #6 0x0000562a23fbd953 in virtio_blk_data_plane_stop (vdev=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292 #7 0x0000562a241fc7bf in virtio_bus_stop_ioeventfd (bus=0x562a260dbf08) at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio-bus.c:245 #8 0x0000562a23fefb2e in virtio_vmstate_change (opaque=0x562a260dbf90, running=0, state=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio.c:3220 #9 0x0000562a2402ebfd in vm_state_notify (running=running@entry=0, state=state@entry=RUN_STATE_FINISH_MIGRATE) at /build/qemu-gYtjVn/qemu-5.0.1/softmmu/vl.c:1275 #10 0x0000562a23f7bc02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE, send_stop=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/cpus.c:1032 qemu#11 0x0000562a24209765 in migration_completion (s=0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:2914 qemu#12 migration_iteration_run (s=0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3275 qemu#13 migration_thread (opaque=opaque@entry=0x562a260e83a0) at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3439 qemu#14 0x0000562a2435ca96 in qemu_thread_start (args=<optimized out>) at /build/qemu-gYtjVn/qemu-5.0.1/util/qemu-thread-posix.c:519 qemu#15 0x00007feed31466ba in start_thread (arg=0x7feeadc9c700) at pthread_create.c:333 qemu#16 0x00007feed2e7c41d in __GI___sysctl (name=0x0, nlen=608471908, oldval=0x562a2452b138, oldlenp=0x0, newval=0x562a2452c5e0 <__func__.28102>, newlen=0) at ../sysdeps/unix/sysv/linux/sysctl.c:30 qemu#17 0x0000000000000000 in ?? () Fix it by checking that the connection coroutine is non-null before trying to enter it. If it is null, no entering is needed, as the connection is probably going down anyway. Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210129073859.683063-3-rvkagan@yandex-team.ru> Signed-off-by: Eric Blake <eblake@redhat.com>

Address space is destroyed without proper removal of its listeners with current code. They are expected to be removed in virtio_device_instance_finalize [1], but qemu calls it through object_deinit, after address_space_destroy call through device_set_realized [2]. Move it to virtio_device_unrealize, called before device_set_realized [3] and making it symmetric with memory_listener_register in virtio_device_realize. v2: Delete no-op call of virtio_device_instance_finalize. Add backtraces. [1] #0 virtio_device_instance_finalize (obj=0x555557de5120) at /home/qemu/include/hw/virtio/virtio.h:71 #1 0x0000555555b703c9 in object_deinit (type=0x555556639860, obj=<optimized out>) at ../qom/object.c:671 #2 object_finalize (data=0x555557de5120) at ../qom/object.c:685 #3 object_unref (objptr=0x555557de5120) at ../qom/object.c:1184 #4 0x0000555555b4de9d in bus_free_bus_child (kid=0x555557df0660) at ../hw/core/qdev.c:55 #5 0x0000555555c65003 in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:281 Queued by: #0 bus_remove_child (bus=0x555557de5098, child=child@entry=0x555557de5120) at ../hw/core/qdev.c:60 #1 0x0000555555b4ee31 in device_unparent (obj=<optimized out>) at ../hw/core/qdev.c:984 #2 0x0000555555b70465 in object_finalize_child_property ( obj=<optimized out>, name=<optimized out>, opaque=0x555557de5120) at ../qom/object.c:1725 #3 0x0000555555b6fa17 in object_property_del_child ( child=0x555557de5120, obj=0x555557ddcf90) at ../qom/object.c:645 #4 object_unparent (obj=0x555557de5120) at ../qom/object.c:664 #5 0x0000555555b4c071 in bus_unparent (obj=<optimized out>) at ../hw/core/bus.c:147 #6 0x0000555555b70465 in object_finalize_child_property ( obj=<optimized out>, name=<optimized out>, opaque=0x555557de5098) at ../qom/object.c:1725 #7 0x0000555555b6fa17 in object_property_del_child ( child=0x555557de5098, obj=0x555557ddcf90) at ../qom/object.c:645 #8 object_unparent (obj=0x555557de5098) at ../qom/object.c:664 #9 0x0000555555b4ee19 in device_unparent (obj=<optimized out>) at ../hw/core/qdev.c:981 #10 0x0000555555b70465 in object_finalize_child_property ( obj=<optimized out>, name=<optimized out>, opaque=0x555557ddcf90) at ../qom/object.c:1725 qemu#11 0x0000555555b6fa17 in object_property_del_child ( child=0x555557ddcf90, obj=0x55555685da10) at ../qom/object.c:645 qemu#12 object_unparent (obj=0x555557ddcf90) at ../qom/object.c:664 qemu#13 0x00005555558dc331 in pci_for_each_device_under_bus ( opaque=<optimized out>, fn=<optimized out>, bus=<optimized out>) at ../hw/pci/pci.c:1654 [2] Optimizer omits pci_qdev_unrealize, called by device_set_realized, and do_pci_unregister_device, called by pci_qdev_unrealize and caller of address_space_destroy. #0 address_space_destroy (as=0x555557ddd1b8) at ../softmmu/memory.c:2840 #1 0x0000555555b4fc53 in device_set_realized (obj=0x555557ddcf90, value=<optimized out>, errp=0x7fffeea8f1e0) at ../hw/core/qdev.c:850 #2 0x0000555555b6eaa6 in property_set_bool (obj=0x555557ddcf90, v=<optimized out>, name=<optimized out>, opaque=0x555556650ba0, errp=0x7fffeea8f1e0) at ../qom/object.c:2255 #3 0x0000555555b70e07 in object_property_set ( obj=obj@entry=0x555557ddcf90, name=name@entry=0x555555db99df "realized", v=v@entry=0x7fffe46b7500, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/object.c:1400 #4 0x0000555555b73c5f in object_property_set_qobject ( obj=obj@entry=0x555557ddcf90, name=name@entry=0x555555db99df "realized", value=value@entry=0x7fffe44f6180, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/qom-qobject.c:28 #5 0x0000555555b71044 in object_property_set_bool ( obj=0x555557ddcf90, name=0x555555db99df "realized", value=<optimized out>, errp=0x5555565bbf38 <error_abort>) at ../qom/object.c:1470 #6 0x0000555555921cb7 in pcie_unplug_device (bus=<optimized out>, dev=0x555557ddcf90, opaque=<optimized out>) at /home/qemu/include/hw/qdev-core.h:17 #7 0x00005555558dc331 in pci_for_each_device_under_bus ( opaque=<optimized out>, fn=<optimized out>, bus=<optimized out>) at ../hw/pci/pci.c:1654 [3] #0 virtio_device_unrealize (dev=0x555557de5120) at ../hw/virtio/virtio.c:3680 #1 0x0000555555b4fc63 in device_set_realized (obj=0x555557de5120, value=<optimized out>, errp=0x7fffee28df90) at ../hw/core/qdev.c:850 #2 0x0000555555b6eab6 in property_set_bool (obj=0x555557de5120, v=<optimized out>, name=<optimized out>, opaque=0x555556650ba0, errp=0x7fffee28df90) at ../qom/object.c:2255 #3 0x0000555555b70e17 in object_property_set ( obj=obj@entry=0x555557de5120, name=name@entry=0x555555db99ff "realized", v=v@entry=0x7ffdd8035040, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/object.c:1400 #4 0x0000555555b73c6f in object_property_set_qobject ( obj=obj@entry=0x555557de5120, name=name@entry=0x555555db99ff "realized", value=value@entry=0x7ffdd8035020, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/qom-qobject.c:28 #5 0x0000555555b71054 in object_property_set_bool ( obj=0x555557de5120, name=name@entry=0x555555db99ff "realized", value=value@entry=false, errp=0x5555565bbf38 <error_abort>) at ../qom/object.c:1470 #6 0x0000555555b4edc5 in qdev_unrealize (dev=<optimized out>) at ../hw/core/qdev.c:403 #7 0x0000555555b4c2a9 in bus_set_realized (obj=<optimized out>, value=<optimized out>, errp=<optimized out>) at ../hw/core/bus.c:204 #8 0x0000555555b6eab6 in property_set_bool (obj=0x555557de5098, v=<optimized out>, name=<optimized out>, opaque=0x555557df04c0, errp=0x7fffee28e0a0) at ../qom/object.c:2255 #9 0x0000555555b70e17 in object_property_set ( obj=obj@entry=0x555557de5098, name=name@entry=0x555555db99ff "realized", v=v@entry=0x7ffdd8034f50, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/object.c:1400 #10 0x0000555555b73c6f in object_property_set_qobject ( obj=obj@entry=0x555557de5098, name=name@entry=0x555555db99ff "realized", value=value@entry=0x7ffdd8020630, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/qom-qobject.c:28 qemu#11 0x0000555555b71054 in object_property_set_bool ( obj=obj@entry=0x555557de5098, name=name@entry=0x555555db99ff "realized", value=value@entry=false, errp=0x5555565bbf38 <error_abort>) at ../qom/object.c:1470 qemu#12 0x0000555555b4c725 in qbus_unrealize ( bus=bus@entry=0x555557de5098) at ../hw/core/bus.c:178 qemu#13 0x0000555555b4fc00 in device_set_realized (obj=0x555557ddcf90, value=<optimized out>, errp=0x7fffee28e1e0) at ../hw/core/qdev.c:844 qemu#14 0x0000555555b6eab6 in property_set_bool (obj=0x555557ddcf90, v=<optimized out>, name=<optimized out>, opaque=0x555556650ba0, errp=0x7fffee28e1e0) at ../qom/object.c:2255 qemu#15 0x0000555555b70e17 in object_property_set ( obj=obj@entry=0x555557ddcf90, name=name@entry=0x555555db99ff "realized", v=v@entry=0x7ffdd8020560, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/object.c:1400 qemu#16 0x0000555555b73c6f in object_property_set_qobject ( obj=obj@entry=0x555557ddcf90, name=name@entry=0x555555db99ff "realized", value=value@entry=0x7ffdd8020540, errp=errp@entry=0x5555565bbf38 <error_abort>) at ../qom/qom-qobject.c:28 qemu#17 0x0000555555b71054 in object_property_set_bool ( obj=0x555557ddcf90, name=0x555555db99ff "realized", value=<optimized out>, errp=0x5555565bbf38 <error_abort>) at ../qom/object.c:1470 qemu#18 0x0000555555921cb7 in pcie_unplug_device (bus=<optimized out>, dev=0x555557ddcf90, opaque=<optimized out>) at /home/qemu/include/hw/qdev-core.h:17 qemu#19 0x00005555558dc331 in pci_for_each_device_under_bus ( opaque=<optimized out>, fn=<optimized out>, bus=<optimized out>) at ../hw/pci/pci.c:1654 Fixes: c611c76 ("virtio: add MemoryListener to cache ring translations") Buglink: https://bugs.launchpad.net/qemu/+bug/1912846 Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20210125192505.390554-1-eperezma@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Incoming enabled bitmaps are busy, because we do bdrv_dirty_bitmap_create_successor() for them. But disabled bitmaps being migrated are not marked busy, and user can remove them during the incoming migration. Then we may crash in cancel_incoming_locked() when try to remove the bitmap that was already removed by user, like this: #0 qemu_mutex_lock_impl (mutex=0x5593d88c50d1, file=0x559680554b20 "../block/dirty-bitmap.c", line=64) at ../util/qemu-thread-posix.c:77 #1 bdrv_dirty_bitmaps_lock (bs=0x5593d88c0ee9) at ../block/dirty-bitmap.c:64 #2 bdrv_release_dirty_bitmap (bitmap=0x5596810e9570) at ../block/dirty-bitmap.c:362 #3 cancel_incoming_locked (s=0x559680be8208 <dbm_state+40>) at ../migration/block-dirty-bitmap.c:918 #4 dirty_bitmap_load (f=0x559681d02b10, opaque=0x559680be81e0 <dbm_state>, version_id=1) at ../migration/block-dirty-bitmap.c:1194 #5 vmstate_load (f=0x559681d02b10, se=0x559680fb5810) at ../migration/savevm.c:908 #6 qemu_loadvm_section_part_end (f=0x559681d02b10, mis=0x559680fb4a30) at ../migration/savevm.c:2473 #7 qemu_loadvm_state_main (f=0x559681d02b10, mis=0x559680fb4a30) at ../migration/savevm.c:2626 #8 postcopy_ram_listen_thread (opaque=0x0) at ../migration/savevm.c:1871 #9 qemu_thread_start (args=0x5596817ccd10) at ../util/qemu-thread-posix.c:521 #10 start_thread () at /lib64/libpthread.so.0 qemu#11 clone () at /lib64/libc.so.6 Note bs pointer taken from bitmap: it's definitely bad aligned. That's because we are in use after free, bitmap is already freed. So, let's make disabled bitmaps (being migrated) busy during incoming migration. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210322094906.5079-2-vsementsov@virtuozzo.com>

When building with --enable-sanitizers we get: Direct leak of 32 byte(s) in 2 object(s) allocated from: #0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf) #1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958) #2 0x561847f02ca2 in usb_packet_init hw/usb/core.c:531:5 #3 0x561848df4df4 in usb_ehci_init hw/usb/hcd-ehci.c:2575:5 #4 0x561847c119ac in ehci_sysbus_init hw/usb/hcd-ehci-sysbus.c:73:5 #5 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9 #6 0x56184a5bd955 in object_init_with_type qom/object.c:371:9 #7 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5 #8 0x56184a5a24d5 in object_initialize qom/object.c:536:5 #9 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5 #10 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10 qemu#11 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5 qemu#12 0x561849542d18 in npcm7xx_init hw/arm/npcm7xx.c:427:5 Similarly to commit d710e1e ("usb: ehci: fix memory leak in ehci"), fix by calling usb_ehci_finalize() to free the USBPacket. Fixes: 7341ea0 Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210323183701.281152-1-f4bug@amsat.org> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

When building with --enable-sanitizers we get: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf) #1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958) #2 0x561847c2dcc9 in xlnx_dp_init hw/display/xlnx_dp.c:1259:5 #3 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9 #4 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5 #5 0x56184a5a24d5 in object_initialize qom/object.c:536:5 #6 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5 #7 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10 #8 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5 #9 0x5618495aa431 in xlnx_zynqmp_init hw/arm/xlnx-zynqmp.c:273:5 The RX/TX FIFOs are created in xlnx_dp_init(), add xlnx_dp_finalize() to destroy them. Fixes: 58ac482 ("introduce xlnx-dp") Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20210323182958.277654-1-f4bug@amsat.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

fix memop size calculation for i386

7454f02

the memop size is encoded in mop variable and not in idx

ndesh26 force-pushed the qsim_v26 branch from ef26120 to 8d8a88e Compare March 11, 2018 20:12

initiate callback for mem access by helper functions

2705b6f

add callbacks for the helper functions defined in target-i386/mem_helper.c as they do not lead callbacks

ndesh26 force-pushed the qsim_v26 branch from 8d8a88e to 2705b6f Compare March 11, 2018 20:24

pranith merged commit 11c2cc9 into pranith:qsim_v26 Mar 11, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix memop size calculation for i386 #9

fix memop size calculation for i386 #9

ndesh26 commented Mar 6, 2018

ndesh26 commented Mar 6, 2018

fix memop size calculation for i386 #9

fix memop size calculation for i386 #9

Conversation

ndesh26 commented Mar 6, 2018

ndesh26 commented Mar 6, 2018