Fix address resolution issue with `-s` option #30

quantumdude836 · 2015-10-25T17:34:18Z

In qemu_chr_parse_socket, port is copied to addr, but has_port is not set, so port isn't copied in future qapi_copy_SocketAddress calls. This results in an "address resolution failed for ::" message when using the -s option (or using -gdb tcp::<port>, i.e. not specifying a hostname).

`port` is set in `addr`, but `has_port` is not, so `port` isn't copied in `qapi_copy_SocketAddress`

up2wing · 2015-10-26T01:08:34Z

Well, as I know, here is just a mirror of qemu.git. It seemed that Qemu only accept patches from mailing list:
http://wiki.qemu.org/Contribute/SubmitAPatch

Thx

quantumdude836 · 2015-10-26T04:51:09Z

Ah ok, thanks

target-mips: fix hflags modified in delay / forbidden slot Fixes qemu#29 Signed-off-by: Leon Alrae leon.alrae@imgtec.com

When a previously removed CPU is hotplugged, ensure that ICP is associated with it. This is required to correctly remove that CPU on the target side after migration. This fixes the following softlock up when CPU removal is attempted on the target side after migration: NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [drmgr:1755] CPU: 5 PID: 1755 Comm: drmgr Not tainted 4.1.0-rc8+ #30 task: c0000000f3fbaa00 ti: c0000000f9ee0000 task.ti: c0000000f9ee0000 NIP: c000000000145d6c LR: c000000000145d34 CTR: c0000000000b9730 REGS: c0000000f9ee30c0 TRAP: 0901 Not tainted (4.1.0-rc8+) MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI> CR: 42008482 XER: 00000000 CFAR: c000000000145d7c SOFTE: 1 00145d34 c0000000f9ee3340 c0000000014d0100 0000000000000000 00000040 0000000000000004 c0000000ffe1fab0 c000000001670100 00000001 0000000000000003 0000000000003800 c0000000008f0760 22000484 c00000000ff21180 NIP [c000000000145d6c] .smp_call_function_single+0x16c/0x1d0 LR [c000000000145d34] .smp_call_function_single+0x134/0x1d0 Call Trace: [c0000000f9ee3340] [c000000000145d34] .smp_call_function_single+0x134/0x1d0 (unreliable) [c0000000f9ee33f0] [c0000000001d1940] .perf_event_exit_cpu+0xd0/0x1a0 [c0000000f9ee34e0] [c0000000001d1c88] .perf_cpu_notify+0x58/0x90 [c0000000f9ee3550] [c0000000000cccfc] .notifier_call_chain+0x8c/0x100 [c0000000f9ee35f0] [c0000000001e4848] ._cpu_down+0xc8/0x3f0 [c0000000f9ee36c0] [c0000000001e4bc0] .cpu_down+0x50/0x90 [c0000000f9ee3740] [c000000000566494] .cpu_subsys_offline+0x24/0x40 [c0000000f9ee37c0] [c00000000055e6e8] .device_offline+0xf8/0x140 [c0000000f9ee3850] [c00000000007ead8] .dlpar_cpu_release+0x278/0x2e0 [c0000000f9ee3940] [c00000000001b214] .arch_cpu_release+0x54/0x80 [c0000000f9ee39c0] [c000000000566980] .cpu_release_store+0x40/0x70 [c0000000f9ee3a40] [c00000000055a754] .dev_attr_store+0x64/0xa0 [c0000000f9ee3ad0] [c00000000032096c] .sysfs_kf_write+0x7c/0xb0 [c0000000f9ee3b60] [c00000000031f900] .kernfs_fop_write+0x190/0x1f0 [c0000000f9ee3c00] [c00000000027afd8] .__vfs_write+0x68/0x190 [c0000000f9ee3cf0] [c00000000027bca8] .vfs_write+0xb8/0x200 [c0000000f9ee3d90] [c00000000027cc44] .SyS_write+0x64/0x110 Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

With a Spice port chardev, it is possible to reenter monitor_qapi_event_queue() (when the client disconnects for example). This will dead-lock on monitor_lock. Instead, use some TLS variables to check for recursion and queue the events. Fixes: (gdb) bt #0 0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0 #1 0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0 #2 0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66 #3 0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645 #4 0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149 #5 0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235 #6 0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316 #7 0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197 #8 0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197 #9 0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414 #10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388 #11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347 #12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0 #13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341 #14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305 #15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305 #16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353 #17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199 #18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112 #19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147 #20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42 #21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425 #22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468 #23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517 #24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538 #25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624 #26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649 #27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58 #28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822 #29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862 #30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644 Note that error report is now moved to the first caller, which may receive an error for a recursed event. This is probably fine (95% of callers use &error_abort, the rest have NULL error and ignore it) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180731150144.14022-1-marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [*_no_recurse renamed to *_no_reenter, local variables reordered] Signed-off-by: Markus Armbruster <armbru@redhat.com>

In qemu_laio_process_completions_and_submit, the AioContext is acquired before the ioq_submit iteration and after qemu_laio_process_completions, but the latter is not thread safe either. This change avoids a number of random crashes when the Main Thread and an IO Thread collide processing completions for the same AioContext. This is an example of such crash: - The IO Thread is trying to acquire the AioContext at aio_co_enter, which evidences that it didn't lock it before: Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)): #0 0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0 #2 0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0) at ../nptl/pthread_mutex_lock.c:114 #3 0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66 #4 0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493 #5 0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478 #6 0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true) at util/aio-posix.c:693 #11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64 #12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308 #13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 - The Main Thread is also processing completions from the same AioContext, and crashes due to failed assertion at util/iov.c:78: Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)): #0 0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90 #2 0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:92 #3 0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101 #4 0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78 #5 0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410 #6 0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 #8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 #9 0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497 #11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574 #12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false) at util/aio-posix.c:604 #13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273 #14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390 #15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590 #16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774 #17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401 #18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449 #19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080) at block/commit.c:92 #20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973 #21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90 #22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118 #23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436 #24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 #25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201 #26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854 #27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215 #28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238 #29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497 #30 0x00005631fbd81412 in main_loop () at vl.c:1866 #31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4647 - A closer examination shows that s->io_q.in_flight appears to have gone backwards: (gdb) frame 7 #7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 222 qemu_laio_process_completion(laiocb); (gdb) p s $2 = (LinuxAioState *) 0x7fdfc0297670 (gdb) p *s $3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0, in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0, sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241} (gdb) p/x s->io_q.in_flight $4 = 0xfffffff0 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Both virtio-blk and virtio-scsi use virtio_queue_empty() as the loop condition in VQ handlers (virtio_blk_handle_vq, virtio_scsi_handle_cmd_vq). When a device is marked broken in virtqueue_pop, for example if a vIOMMU address translation failed, we want to break out of the loop. This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64 kernel: $ qemu-system-x86_64 \ ... \ -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \ -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0 The dead loop happens immediately when the kernel boots and initializes the device, where virtio_scsi_data_plane_handle_cmd will not return: > ... > qemu#13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq > qemu#14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd > qemu#15 0x00005586602ddab7 in virtio_queue_notify_aio_vq > qemu#16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll > qemu#17 0x00005586607885da in run_poll_handlers_once > qemu#18 0x000055866078880e in try_poll_mode > qemu#19 0x00005586607888eb in aio_poll > qemu#20 0x0000558660784561 in aio_wait_bh_oneshot > qemu#21 0x00005586602b9582 in virtio_scsi_dataplane_stop > qemu#22 0x00005586605a7110 in virtio_bus_stop_ioeventfd > qemu#23 0x00005586605a9426 in virtio_pci_stop_ioeventfd > qemu#24 0x00005586605ab808 in virtio_pci_common_write > qemu#25 0x0000558660242396 in memory_region_write_accessor > qemu#26 0x00005586602425ab in access_with_adjusted_size > qemu#27 0x0000558660245281 in memory_region_dispatch_write > qemu#28 0x00005586601e008e in flatview_write_continue > qemu#29 0x00005586601e01d8 in flatview_write > qemu#30 0x00005586601e04de in address_space_write > qemu#31 0x00005586601e052f in address_space_rw > qemu#32 0x00005586602607f2 in kvm_cpu_exec > qemu#33 0x0000558660227148 in qemu_kvm_cpu_thread_fn > qemu#34 0x000055866078bde7 in qemu_thread_start > qemu#35 0x00007f5784906594 in start_thread > qemu#36 0x00007f5784639e6f in clone With this patch, virtio_queue_empty will now return 1 as soon as the vdev is marked as broken, after a "virtio: zero sized buffers are not allowed" error. To be consistent, update virtio_queue_empty_rcu as well. Signed-off-by: Fam Zheng <famz@redhat.com> --- v2: - Drop ATS condition from the patch description since it is not essential. - Drop patch 2. [Paolo]

Both virtio-blk and virtio-scsi use virtio_queue_empty() as the loop condition in VQ handlers (virtio_blk_handle_vq, virtio_scsi_handle_cmd_vq). When a device is marked broken in virtqueue_pop, for example if a vIOMMU address translation failed, we want to break out of the loop. This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64 kernel with ATS enabled: $ qemu-system-x86_64 \ ... \ -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \ -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0 The dead loop happens immediately when the kernel boots and initializes the device, where virtio_scsi_data_plane_handle_cmd will not return: > ... > #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq > #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd > #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq > #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll > #17 0x00005586607885da in run_poll_handlers_once > #18 0x000055866078880e in try_poll_mode > #19 0x00005586607888eb in aio_poll > #20 0x0000558660784561 in aio_wait_bh_oneshot > #21 0x00005586602b9582 in virtio_scsi_dataplane_stop > #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd > #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd > #24 0x00005586605ab808 in virtio_pci_common_write > #25 0x0000558660242396 in memory_region_write_accessor > #26 0x00005586602425ab in access_with_adjusted_size > #27 0x0000558660245281 in memory_region_dispatch_write > #28 0x00005586601e008e in flatview_write_continue > #29 0x00005586601e01d8 in flatview_write > #30 0x00005586601e04de in address_space_write > #31 0x00005586601e052f in address_space_rw > #32 0x00005586602607f2 in kvm_cpu_exec > #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn > #34 0x000055866078bde7 in qemu_thread_start > #35 0x00007f5784906594 in start_thread > #36 0x00007f5784639e6f in clone With this patch, virtio_queue_empty will now return 1 as soon as the vdev is marked as broken, after a "virtio: zero sized buffers are not allowed" error. To be consistent, update virtio_queue_empty_rcu as well. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180910145616.8598-2-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Both virtio-blk and virtio-scsi use virtio_queue_empty() as the loop condition in VQ handlers (virtio_blk_handle_vq, virtio_scsi_handle_cmd_vq). When a device is marked broken in virtqueue_pop, for example if a vIOMMU address translation failed, we want to break out of the loop. This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64 kernel with ATS enabled: $ qemu-system-x86_64 \ ... \ -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \ -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0 The dead loop happens immediately when the kernel boots and initializes the device, where virtio_scsi_data_plane_handle_cmd will not return: > ... > qemu#13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq > qemu#14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd > qemu#15 0x00005586602ddab7 in virtio_queue_notify_aio_vq > qemu#16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll > qemu#17 0x00005586607885da in run_poll_handlers_once > qemu#18 0x000055866078880e in try_poll_mode > qemu#19 0x00005586607888eb in aio_poll > qemu#20 0x0000558660784561 in aio_wait_bh_oneshot > qemu#21 0x00005586602b9582 in virtio_scsi_dataplane_stop > qemu#22 0x00005586605a7110 in virtio_bus_stop_ioeventfd > qemu#23 0x00005586605a9426 in virtio_pci_stop_ioeventfd > qemu#24 0x00005586605ab808 in virtio_pci_common_write > qemu#25 0x0000558660242396 in memory_region_write_accessor > qemu#26 0x00005586602425ab in access_with_adjusted_size > qemu#27 0x0000558660245281 in memory_region_dispatch_write > qemu#28 0x00005586601e008e in flatview_write_continue > qemu#29 0x00005586601e01d8 in flatview_write > qemu#30 0x00005586601e04de in address_space_write > qemu#31 0x00005586601e052f in address_space_rw > qemu#32 0x00005586602607f2 in kvm_cpu_exec > qemu#33 0x0000558660227148 in qemu_kvm_cpu_thread_fn > qemu#34 0x000055866078bde7 in qemu_thread_start > qemu#35 0x00007f5784906594 in start_thread > qemu#36 0x00007f5784639e6f in clone With this patch, virtio_queue_empty will now return 1 as soon as the vdev is marked as broken, after a "virtio: zero sized buffers are not allowed" error. To be consistent, update virtio_queue_empty_rcu as well. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20180910145616.8598-2-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

In qemu_laio_process_completions_and_submit, the AioContext is acquired before the ioq_submit iteration and after qemu_laio_process_completions, but the latter is not thread safe either. This change avoids a number of random crashes when the Main Thread and an IO Thread collide processing completions for the same AioContext. This is an example of such crash: - The IO Thread is trying to acquire the AioContext at aio_co_enter, which evidences that it didn't lock it before: Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)): #0 0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 qemu#1 0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0 qemu#2 0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0) at ../nptl/pthread_mutex_lock.c:114 qemu#3 0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66 qemu#4 0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493 qemu#5 0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478 qemu#6 0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104 qemu#7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 qemu#8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 qemu#9 0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406 qemu#10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true) at util/aio-posix.c:693 qemu#11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64 qemu#12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308 qemu#13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 - The Main Thread is also processing completions from the same AioContext, and crashes due to failed assertion at util/iov.c:78: Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)): #0 0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 qemu#1 0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90 qemu#2 0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:92 qemu#3 0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101 qemu#4 0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78 qemu#5 0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410 qemu#6 0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88 qemu#7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 qemu#8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 qemu#9 0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272 qemu#10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497 qemu#11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574 qemu#12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false) at util/aio-posix.c:604 qemu#13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273 qemu#14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390 qemu#15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590 qemu#16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774 qemu#17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401 qemu#18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449 qemu#19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080) at block/commit.c:92 qemu#20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973 qemu#21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90 qemu#22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118 qemu#23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436 qemu#24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 qemu#25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201 qemu#26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854 qemu#27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215 qemu#28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238 qemu#29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497 qemu#30 0x00005631fbd81412 in main_loop () at vl.c:1866 qemu#31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4647 - A closer examination shows that s->io_q.in_flight appears to have gone backwards: (gdb) frame 7 qemu#7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 222 qemu_laio_process_completion(laiocb); (gdb) p s $2 = (LinuxAioState *) 0x7fdfc0297670 (gdb) p *s $3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0, in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0, sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241} (gdb) p/x s->io_q.in_flight $4 = 0xfffffff0 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

RH-Author: Kevin Wolf <kwolf@redhat.com> Message-id: <20181010202213.7372-14-kwolf@redhat.com> Patchwork-id: 82603 O-Subject: [RHEL-8 qemu-kvm PATCH 23/44] block/linux-aio: acquire AioContext before qemu_laio_process_completions Bugzilla: 1637976 RH-Acked-by: Max Reitz <mreitz@redhat.com> RH-Acked-by: John Snow <jsnow@redhat.com> RH-Acked-by: Thomas Huth <thuth@redhat.com> From: Sergio Lopez <slp@redhat.com> In qemu_laio_process_completions_and_submit, the AioContext is acquired before the ioq_submit iteration and after qemu_laio_process_completions, but the latter is not thread safe either. This change avoids a number of random crashes when the Main Thread and an IO Thread collide processing completions for the same AioContext. This is an example of such crash: - The IO Thread is trying to acquire the AioContext at aio_co_enter, which evidences that it didn't lock it before: Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)): #0 0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 qemu#1 0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0 qemu#2 0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0) at ../nptl/pthread_mutex_lock.c:114 qemu#3 0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66 qemu#4 0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493 qemu#5 0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478 qemu#6 0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104 qemu#7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 qemu#8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 qemu#9 0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406 qemu#10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true) at util/aio-posix.c:693 qemu#11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64 qemu#12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308 qemu#13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 - The Main Thread is also processing completions from the same AioContext, and crashes due to failed assertion at util/iov.c:78: Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)): #0 0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 qemu#1 0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90 qemu#2 0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:92 qemu#3 0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101 qemu#4 0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78 qemu#5 0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410 qemu#6 0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88 qemu#7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 qemu#8 0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670) at block/linux-aio.c:237 qemu#9 0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272 qemu#10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497 qemu#11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574 qemu#12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false) at util/aio-posix.c:604 qemu#13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273 qemu#14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390 qemu#15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590 qemu#16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774 qemu#17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401 qemu#18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449 qemu#19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080) at block/commit.c:92 qemu#20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973 qemu#21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90 qemu#22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118 qemu#23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436 qemu#24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261 qemu#25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201 qemu#26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854 qemu#27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215 qemu#28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238 qemu#29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497 qemu#30 0x00005631fbd81412 in main_loop () at vl.c:1866 qemu#31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4647 - A closer examination shows that s->io_q.in_flight appears to have gone backwards: (gdb) frame 7 qemu#7 0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670) at block/linux-aio.c:222 222 qemu_laio_process_completion(laiocb); (gdb) p s $2 = (LinuxAioState *) 0x7fdfc0297670 (gdb) p *s $3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0, in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0, sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241} (gdb) p/x s->io_q.in_flight $4 = 0xfffffff0 Signed-off-by: Sergio Lopez <slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit e091f0e) Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>

repo-lockdown · 2020-04-08T01:10:13Z

Thank you for your interest in the QEMU project.

This repository is a read-only mirror of the project's master
repostories hosted on https://git.qemu.org/git/qemu.git.
The project does not process merge requests filed on GitHub.

QEMU welcomes contributions of code (either fixing bugs or adding new
functionality). However, we get a lot of patches, and so we have some
guidelines about contributing on the project website:
https://www.qemu.org/contribute/

Fix address resolution issue with -s option

0e874f2

`port` is set in `addr`, but `has_port` is not, so `port` isn't copied in `qapi_copy_SocketAddress`

quantumdude836 closed this Oct 26, 2015

lalrae pushed a commit to lalrae/qemu that referenced this pull request Apr 18, 2016

Merge pull request qemu#30 from lalrae/fix-hflags-in-delay-slot

dbea488

target-mips: fix hflags modified in delay / forbidden slot Fixes qemu#29 Signed-off-by: Leon Alrae leon.alrae@imgtec.com

repo-lockdown bot locked and limited conversation to collaborators Apr 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix address resolution issue with `-s` option #30

Fix address resolution issue with `-s` option #30

quantumdude836 commented Oct 25, 2015

up2wing commented Oct 26, 2015

quantumdude836 commented Oct 26, 2015

repo-lockdown bot commented Apr 8, 2020

Fix address resolution issue with -s option #30

Fix address resolution issue with -s option #30

Conversation

quantumdude836 commented Oct 25, 2015

up2wing commented Oct 26, 2015

quantumdude836 commented Oct 26, 2015

repo-lockdown bot commented Apr 8, 2020

Fix address resolution issue with `-s` option #30

Fix address resolution issue with `-s` option #30