Memory hot unplug feature is not available in 'hostos-devel' branch #1

nasastry · 2016-08-30T10:37:51Z

Memory hot unplug feature not available in 'hostos-devel' branch.

How to re-produce:

Hot plug the memory using the following command and xml

cat mem_hp.xml

262144

Before attaching memory at Guest:
[root@localhost ~]# cat /proc/meminfo | grep -i memtotal
MemTotal: 7838848 kB

Attach memory, here

virsh attach-device centos72 mem_hp.xml --live

Device attached successfully

After attach guest memory:
[root@localhost ~]# cat /proc/meminfo | grep -i memtotal
MemTotal: 8100992 kB

Try to detach memory using the following command and xml
cat /tmp/t1 # Detach memory xml, taken from virsh dumpxml
262144

virsh detach-device centos72 /tmp/t1 --live

error: Failed to detach device from /tmp/t1
error: internal error: unable to execute QEMU command 'device_del': Memory hot unplug not supported by sPAPR

Additional information:
Guest is having powerpc-utils at 1.3.0 level.

nasastry · 2016-08-30T10:41:30Z

xml files for memory hot plug and memory hot unplug got mangled so created as attachments.
mem_hp.xml.txt
mem_unplug.xml.txt

rmatinata-ibm · 2016-08-31T18:38:26Z

@aik can you, or somebody else appropriate, take a look into this ?

aik · 2016-09-01T03:03:14Z

The appropriate persons are Mike Roth @mdroth and Bharata B Rao bharata@linux.vnet.ibm.com (who does not seem to be on github and we need to invite him here).

The cssid 255 is reserved but still valid from an architectural point of view. However, feeding a bogus schid of 0xffffffff into the virtio hypercall will lead to a crash: Stack trace of thread 138363: #0 0x00000000100d168c css_find_subch (qemu-system-s390x) #1 0x00000000100d3290 virtio_ccw_hcall_notify #2 0x00000000100cbf60 s390_virtio_hypercall #3 0x000000001010ff7a handle_hypercall #4 0x0000000010079ed4 kvm_cpu_exec (qemu-system-s390x) #5 0x00000000100609b4 qemu_kvm_cpu_thread_fn #6 0x000003ff8b887bb4 start_thread (libpthread.so.0) #7 0x000003ff8b78df0a thread_start (libc.so.6) This is because the css array was only allocated for 0..254 instead of 0..255. Let's fix this by bumping MAX_CSSID to 255 and fencing off the reserved cssid of 255 during css image allocation. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: qemu-stable@nongnu.org Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>

Running cpuid instructions with a simple run like: i386-linux-user/qemu-i386 tests/tcg/sha1-i386 Results in the following assert: #0 0x00007ffff64246f5 in raise () from /lib64/libc.so.6 #1 0x00007ffff64262fa in abort () from /lib64/libc.so.6 #2 0x00007ffff7937ec5 in g_assertion_message () from /lib64/libglib-2.0.so.0 #3 0x00007ffff7937f5a in g_assertion_message_expr () from /lib64/libglib-2.0.so.0 #4 0x000055555561b54c in apicid_bitwidth_for_count (count=0) at /home/elmarco/src/qemu/include/hw/i386/topology.h:58 #5 0x000055555561b58a in apicid_smt_width (nr_cores=0, nr_threads=0) at /home/elmarco/src/qemu/include/hw/i386/topology.h:67 #6 0x000055555561b5c3 in apicid_core_offset (nr_cores=0, nr_threads=0) at /home/elmarco/src/qemu/include/hw/i386/topology.h:82 #7 0x000055555561b5e3 in apicid_pkg_offset (nr_cores=0, nr_threads=0) at /home/elmarco/src/qemu/include/hw/i386/topology.h:89 #8 0x000055555561dd86 in cpu_x86_cpuid (env=0x555557999550, index=4, count=3, eax=0x7fffffffcae8, ebx=0x7fffffffcaec, ecx=0x7fffffffcaf0, edx=0x7fffffffcaf4) at /home/elmarco/src/qemu/target-i386/cpu.c:2405 #9 0x0000555555638e8e in helper_cpuid (env=0x555557999550) at /home/elmarco/src/qemu/target-i386/misc_helper.c:106 #10 0x000055555599dc5e in static_code_gen_buffer () #11 0x00005555555952f8 in cpu_tb_exec (cpu=0x5555579912d0, itb=0x7ffff4371ab0) at /home/elmarco/src/qemu/cpu-exec.c:166 #12 0x0000555555595c8e in cpu_loop_exec_tb (cpu=0x5555579912d0, tb=0x7ffff4371ab0, last_tb=0x7fffffffd088, tb_exit=0x7fffffffd084, sc=0x7fffffffd0a0) at /home/elmarco/src/qemu/cpu-exec.c:517 #13 0x0000555555595e50 in cpu_exec (cpu=0x5555579912d0) at /home/elmarco/src/qemu/cpu-exec.c:612 #14 0x00005555555c065b in cpu_loop (env=0x555557999550) at /home/elmarco/src/qemu/linux-user/main.c:297 #15 0x00005555555c25b2 in main (argc=2, argv=0x7fffffffd848, envp=0x7fffffffd860) at /home/elmarco/src/qemu/linux-user/main.c:4803 The fields are set in qemu_init_vcpu() with softmmu, but it's a stub with linux-user. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Segfault happens when leaving qemu with msmouse backend: #0 0x00007fa8526ac975 in raise () at /lib64/libc.so.6 #1 0x00007fa8526add8a in abort () at /lib64/libc.so.6 #2 0x0000558be78846ab in error_exit (err=16, msg=0x558be799da10 ... #3 0x0000558be7884717 in qemu_mutex_destroy (mutex=0x558be93be750) at ... #4 0x0000558be7549951 in qemu_chr_free_common (chr=0x558be93be750) at ... #5 0x0000558be754999c in qemu_chr_free (chr=0x558be93be750) at ... #6 0x0000558be7549a20 in qemu_chr_delete (chr=0x558be93be750) at ... #7 0x0000558be754a8ef in qemu_chr_cleanup () at qemu-char.c:4643 #8 0x0000558be755843e in main (argc=5, argv=0x7ffe925d7118, ... The chr was freed by msmouse close callback before chardev cleanup, Then qemu_mutex_destroy triggered raise(). Because freeing chr is handled by qemu_chr_free_common, Remove the free from msmouse_chr_close to avoid double free. Fixes: c1111a2 Cc: qemu-stable@nongnu.org Signed-off-by: Lin Ma <lma@suse.com> Message-Id: <20160915143158.4796-1-lma@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Commit 0ed93d8 ("linux-aio: process completions from ioq_submit()") added an optimization that processes completions each time ioq_submit() returns with requests in flight. This commit introduces a "Co-routine re-entered recursively" error which can be triggered with -drive format=qcow2,aio=native. Fam Zheng <famz@redhat.com>, Kevin Wolf <kwolf@redhat.com>, and I debugged the following backtrace: (gdb) bt #0 0x00007ffff0a046f5 in raise () at /lib64/libc.so.6 #1 0x00007ffff0a062fa in abort () at /lib64/libc.so.6 #2 0x0000555555ac0013 in qemu_coroutine_enter (co=0x5555583464d0) at util/qemu-coroutine.c:113 #3 0x0000555555a4b663 in qemu_laio_process_completions (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:218 #4 0x0000555555a4b874 in ioq_submit (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:331 #5 0x0000555555a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x555559d38ae0, offset=offset@entry=2932727808, type=type@entry=1) at block/linux-aio.c:383 #6 0x0000555555a4bbd3 in laio_co_submit (bs=<optimized out>, s=0x555557e2f7f0, fd=13, offset=2932727808, qiov=0x555559d38e20, type=1) at block/linux-aio.c:402 #7 0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x55555663bcb0, offset=offset@entry=2932727808, bytes=bytes@entry=8192, qiov=qiov@entry=0x555559d38e20, flags=0) at block/io.c:804 #8 0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x55555663bcb0, req=req@entry=0x555559d38d20, offset=offset@entry=2932727808, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x555559d38e20, flags=0) at block/io.c:1041 #9 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=2932727808, bytes=8192, qiov=qiov@entry=0x555559d38e20, flags=flags@entry=0) at block/io.c:1133 #10 0x0000555555a29629 in qcow2_co_preadv (bs=0x555556635890, offset=6178725888, bytes=8192, qiov=0x555557527840, flags=<optimized out>) at block/qcow2.c:1509 #11 0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x555556635890, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x555557527840, flags=0) at block/io.c:804 #12 0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x555556635890, req=req@entry=0x555559d39000, offset=offset@entry=6178725888, bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x555557527840, flags=0) at block/io.c:1041 #13 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x555557527840, flags=flags@entry=0) at block/io.c:1133 #14 0x0000555555a4515a in blk_co_preadv (blk=0x5555566356d0, offset=6178725888, bytes=8192, qiov=0x555557527840, flags=0) at block/block-backend.c:783 #15 0x0000555555a45266 in blk_aio_read_entry (opaque=0x5555577025e0) at block/block-backend.c:991 #16 0x0000555555ac0cfa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78 It turned out that re-entrant ioq_submit() and completion processing between three requests caused this error. The following check is not sufficient to prevent recursively entering coroutines: if (laiocb->co != qemu_coroutine_self()) { qemu_coroutine_enter(laiocb->co); } As the following coroutine backtrace shows, not just the current coroutine (self) can be entered. There might also be other coroutines that are currently entered and transferred control due to the qcow2 lock (CoMutex): (gdb) qemu coroutine 0x5555583464d0 #0 0x0000555555ac0c90 in qemu_coroutine_switch (from_=from_@entry=0x5555583464d0, to_=to_@entry=0x5555572f9890, action=action@entry=COROUTINE_ENTER) at util/coroutine-ucontext.c:175 #1 0x0000555555abfe54 in qemu_coroutine_enter (co=0x5555572f9890) at util/qemu-coroutine.c:117 #2 0x0000555555ac031c in qemu_co_queue_run_restart (co=co@entry=0x5555583462c0) at util/qemu-coroutine-lock.c:60 #3 0x0000555555abfe5e in qemu_coroutine_enter (co=0x5555583462c0) at util/qemu-coroutine.c:119 #4 0x0000555555a4b663 in qemu_laio_process_completions (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:218 #5 0x0000555555a4b874 in ioq_submit (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:331 #6 0x0000555555a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x55555a338b40, offset=offset@entry=2911477760, type=type@entry=1) at block/linux-aio.c:383 #7 0x0000555555a4bbd3 in laio_co_submit (bs=<optimized out>, s=0x555557e2f7f0, fd=13, offset=2911477760, qiov=0x55555a338e80, type=1) at block/linux-aio.c:402 #8 0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x55555663bcb0, offset=offset@entry=2911477760, bytes=bytes@entry=8192, qiov=qiov@entry=0x55555a338e80, flags=0) at block/io.c:804 #9 0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x55555663bcb0, req=req@entry=0x55555a338d80, offset=offset@entry=2911477760, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x55555a338e80, flags=0) at block/io.c:1041 #10 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=2911477760, bytes=8192, qiov=qiov@entry=0x55555a338e80, flags=flags@entry=0) at block/io.c:1133 #11 0x0000555555a29629 in qcow2_co_preadv (bs=0x555556635890, offset=6157475840, bytes=8192, qiov=0x5555575df720, flags=<optimized out>) at block/qcow2.c:1509 #12 0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x555556635890, offset=offset@entry=6157475840, bytes=bytes@entry=8192, qiov=qiov@entry=0x5555575df720, flags=0) at block/io.c:804 #13 0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x555556635890, req=req@entry=0x55555a339060, offset=offset@entry=6157475840, bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x5555575df720, flags=0) at block/io.c:1041 #14 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=offset@entry=6157475840, bytes=bytes@entry=8192, qiov=qiov@entry=0x5555575df720, flags=flags@entry=0) at block/io.c:1133 #15 0x0000555555a4515a in blk_co_preadv (blk=0x5555566356d0, offset=6157475840, bytes=8192, qiov=0x5555575df720, flags=0) at block/block-backend.c:783 #16 0x0000555555a45266 in blk_aio_read_entry (opaque=0x555557231aa0) at block/block-backend.c:991 #17 0x0000555555ac0cfa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78 Use the new qemu_coroutine_entered() function instead of comparing against qemu_coroutine_self(). This is correct because: 1. If a coroutine is not entered then it must have yielded to wait for I/O completion. It is therefore safe to enter. 2. If a coroutine is entered then it must be in ioq_submit()/qemu_laio_process_completions() because otherwise it would be yielded while waiting for I/O completion. Therefore it will check laio->ret and return from ioq_submit() instead of yielding, i.e. it's guaranteed not to hang. Reported-by: Fam Zheng <famz@redhat.com> Tested-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1474989516-18255-4-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

The vhost-user & colo code is poking at the QemuOpts instance in the CharDriverState struct, not realizing that it is valid for this to be NULL. e.g. the following crash shows a codepath where it will be NULL: Program terminated with signal SIGSEGV, Segmentation fault. #0 0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617 617 QTAILQ_FOREACH(opt, &opts->head, next) { [Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))] (gdb) bt #0 0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617 #1 0x000055baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260, errp=0x7ffc51368e48) at net/vhost-user.c:314 #2 0x000055baf696b985 in net_init_vhost_user (netdev=0x55baf8ff9250, name=0x55baf879d270 "hostnet2", peer=0x0, errp=0x7ffc51368e48) at net/vhost-user.c:360 #3 0x000055baf6960216 in net_client_init1 (object=0x55baf8ff9250, is_netdev=true, errp=0x7ffc51368e48) at net/net.c:1051 #4 0x000055baf6960518 in net_client_init (opts=0x55baf776e7e0, is_netdev=true, errp=0x7ffc51368f00) at net/net.c:1108 #5 0x000055baf696083f in netdev_add (opts=0x55baf776e7e0, errp=0x7ffc51368f00) at net/net.c:1186 #6 0x000055baf69608c7 in qmp_netdev_add (qdict=0x55baf7afaf60, ret=0x7ffc51368f50, errp=0x7ffc51368f48) at net/net.c:1205 #7 0x000055baf6622135 in handle_qmp_command (parser=0x55baf77fb590, tokens=0x7f1d24011960) at /path/to/qemu.git/monitor.c:3978 #8 0x000055baf6a9d099 in json_message_process_token (lexer=0x55baf77fb598, input=0x55baf75acd20, type=JSON_RCURLY, x=113, y=19) at qobject/json-streamer.c:105 #9 0x000055baf6abf7aa in json_lexer_feed_char (lexer=0x55baf77fb598, ch=125 '}', flush=false) at qobject/json-lexer.c:319 #10 0x000055baf6abf8f2 in json_lexer_feed (lexer=0x55baf77fb598, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-lexer.c:369 #11 0x000055baf6a9d13c in json_message_parser_feed (parser=0x55baf77fb590, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-streamer.c:124 #12 0x000055baf66221f7 in monitor_qmp_read (opaque=0x55baf77fb530, buf=0x7ffc51369170 "}R\204\367\272U", size=1) at /path/to/qemu.git/monitor.c:3994 #13 0x000055baf6757014 in qemu_chr_be_write_impl (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:387 #14 0x000055baf6757076 in qemu_chr_be_write (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:399 #15 0x000055baf675b3b0 in tcp_chr_read (chan=0x55baf90244b0, cond=G_IO_IN, opaque=0x55baf7610a40) at qemu-char.c:2927 #16 0x000055baf6a5d655 in qio_channel_fd_source_dispatch (source=0x55baf7610df0, callback=0x55baf675b25a <tcp_chr_read>, user_data=0x55baf7610a40) at io/channel-watch.c:84 #17 0x00007f1d3e80cbbd in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #18 0x000055baf69d3720 in glib_pollfds_poll () at main-loop.c:213 #19 0x000055baf69d37fd in os_host_main_loop_wait (timeout=126000000) at main-loop.c:258 #20 0x000055baf69d38ad in main_loop_wait (nonblocking=0) at main-loop.c:506 #21 0x000055baf676587b in main_loop () at vl.c:1908 #22 0x000055baf676d3bf in main (argc=101, argv=0x7ffc5136a6c8, envp=0x7ffc5136a9f8) at vl.c:4604 (gdb) p opts $1 = (QemuOpts *) 0x0 The crash occurred when attaching vhost-user net via QMP: { "execute": "chardev-add", "arguments": { "id": "charnet2", "backend": { "type": "socket", "data": { "addr": { "type": "unix", "data": { "path": "/var/run/openvswitch/vhost-user1" } }, "wait": false, "server": false } } }, "id": "libvirt-19" } { "return": { }, "id": "libvirt-19" } { "execute": "netdev_add", "arguments": { "type": "vhost-user", "chardev": "charnet2", "id": "hostnet2" }, "id": "libvirt-20" } Code using chardevs should not be poking at the internals of the CharDriverState struct. What vhost-user wants is a chardev that is operating as reconnectable network service, along with the ability to do FD passing over the connection. The colo code simply wants a network service. Add a feature concept to the char drivers so that chardev users can query the actual features they wish to have supported. The QemuOpts member is removed to prevent future mistakes in this area. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

ASAN complains about buffer overflow when running: aarch64-softmmu/qemu-system-aarch64 -machine xilinx-zynq-a9 ==476==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000035e38 at pc 0x000000f75253 bp 0x7ffc597e0ec0 sp 0x7ffc597e0eb0 READ of size 8 at 0x602000035e38 thread T0 #0 0xf75252 in xilinx_spips_realize hw/ssi/xilinx_spips.c:623 #1 0xb9ef6c in device_set_realized hw/core/qdev.c:918 #2 0x129ae01 in property_set_bool qom/object.c:1854 #3 0x1296e70 in object_property_set qom/object.c:1088 #4 0x129dd1b in object_property_set_qobject qom/qom-qobject.c:27 #5 0x1297168 in object_property_set_bool qom/object.c:1157 #6 0xb9aeac in qdev_init_nofail hw/core/qdev.c:358 #7 0x78a5bf in zynq_init_spi_flashes /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:125 #8 0x78af60 in zynq_init /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:238 #9 0x998eac in main /home/elmarco/src/qemu/vl.c:4534 #10 0x7f96ed692730 in __libc_start_main (/lib64/libc.so.6+0x20730) #11 0x41d0a8 in _start (/home/elmarco/src/qemu/aarch64-softmmu/qemu-system-aarch64+0x41d0a8) 0x602000035e38 is located 0 bytes to the right of 8-byte region [0x602000035e30,0x602000035e38) allocated by thread T0 here: #0 0x7f970b014e60 in malloc (/lib64/libasan.so.3+0xc6e60) #1 0x7f96f15b0e18 in g_malloc (/lib64/libglib-2.0.so.0+0x4ee18) #2 0xb9ef6c in device_set_realized hw/core/qdev.c:918 #3 0x129ae01 in property_set_bool qom/object.c:1854 #4 0x1296e70 in object_property_set qom/object.c:1088 #5 0x129dd1b in object_property_set_qobject qom/qom-qobject.c:27 #6 0x1297168 in object_property_set_bool qom/object.c:1157 #7 0xb9aeac in qdev_init_nofail hw/core/qdev.c:358 #8 0x78a5bf in zynq_init_spi_flashes /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:125 #9 0x78af60 in zynq_init /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:238 #10 0x998eac in main /home/elmarco/src/qemu/vl.c:4534 #11 0x7f96ed692730 in __libc_start_main (/lib64/libc.so.6+0x20730) s->spi is allocated with the size of num_busses which may be 1 (by default). Change to use a loop up to s->num_busses also for the call to ssi_auto_connect_slaves(). Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

nasastry · 2016-12-01T08:40:21Z

Closing the issue here and this is handled at bugzilla.

Qemu crash in the source side while migrating, after starting ipmi service inside vm. ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 \ -drive file=/work/suse/suse11_sp3_64_vt,format=raw,if=none,id=drive-virtio-disk0,cache=none \ -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \ -vnc :99 -monitor vc -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-kcs,bmc=bmc0,ioport=0xca2 Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7ffec4268700 (LWP 7657)] __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757 (gdb) bt #0 __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757 #1 0x00005555559ef775 in memcpy (__len=3, __src=0xc1421c, __dest=<optimized out>) at /usr/include/bits/string3.h:51 #2 qemu_put_buffer (f=0x555557a97690, buf=0xc1421c <Address 0xc1421c out of bounds>, size=3) at migration/qemu-file.c:346 #3 0x00005555559eef66 in vmstate_save_state (f=f@entry=0x555557a97690, vmsd=0x555555f8a5a0 <vmstate_ISAIPMIKCSDevice>, opaque=0x555557231160, vmdesc=vmdesc@entry=0x55555798cc40) at migration/vmstate.c:333 #4 0x00005555557cfe45 in vmstate_save (f=f@entry=0x555557a97690, se=se@entry=0x555557231de0, vmdesc=vmdesc@entry=0x55555798cc40) at /mnt/sdb/zyy/qemu/migration/savevm.c:720 #5 0x00005555557d2be7 in qemu_savevm_state_complete_precopy (f=0x555557a97690, iterable_only=iterable_only@entry=false) at /mnt/sdb/zyy/qemu/migration/savevm.c:1128 #6 0x00005555559ea102 in migration_completion (start_time=<synthetic pointer>, old_vm_running=<synthetic pointer>, current_active_state=<optimized out>, s=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1707 #7 migration_thread (opaque=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1855 #8 0x00007ffff3900dc5 in start_thread (arg=0x7ffec4268700) at pthread_create.c:308 #9 0x00007fffefc6c71d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Running postcopy-test with ASAN produces the following error: QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64 tests/postcopy-test ... ================================================================= ==23641==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1556600000 at pc 0x55b8e9d28208 bp 0x7f1555f4d3c0 sp 0x7f1555f4d3b0 READ of size 8 at 0x7f1556600000 thread T6 #0 0x55b8e9d28207 in htab_save_first_pass /home/elmarco/src/qq/hw/ppc/spapr.c:1528 #1 0x55b8e9d2939c in htab_save_iterate /home/elmarco/src/qq/hw/ppc/spapr.c:1665 #2 0x55b8e9beae3a in qemu_savevm_state_iterate /home/elmarco/src/qq/migration/savevm.c:1044 #3 0x55b8ea677733 in migration_thread /home/elmarco/src/qq/migration/migration.c:1976 #4 0x7f15845f46c9 in start_thread (/lib64/libpthread.so.0+0x76c9) #5 0x7f157d9d0f7e in clone (/lib64/libc.so.6+0x107f7e) 0x7f1556600000 is located 0 bytes to the right of 2097152-byte region [0x7f1556400000,0x7f1556600000) allocated by thread T0 here: #0 0x7f159bb76980 in posix_memalign (/lib64/libasan.so.3+0xc7980) #1 0x55b8eab185b2 in qemu_try_memalign /home/elmarco/src/qq/util/oslib-posix.c:106 #2 0x55b8eab186c8 in qemu_memalign /home/elmarco/src/qq/util/oslib-posix.c:122 #3 0x55b8e9d268a8 in spapr_reallocate_hpt /home/elmarco/src/qq/hw/ppc/spapr.c:1214 #4 0x55b8e9d26e04 in ppc_spapr_reset /home/elmarco/src/qq/hw/ppc/spapr.c:1261 #5 0x55b8ea12e913 in qemu_system_reset /home/elmarco/src/qq/vl.c:1697 #6 0x55b8ea13fa40 in main /home/elmarco/src/qq/vl.c:4679 #7 0x7f157d8e9400 in __libc_start_main (/lib64/libc.so.6+0x20400) Thread T6 created by T0 here: #0 0x7f159bae0488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488) #1 0x55b8eab1d9cb in qemu_thread_create /home/elmarco/src/qq/util/qemu-thread-posix.c:465 #2 0x55b8ea67874c in migrate_fd_connect /home/elmarco/src/qq/migration/migration.c:2096 #3 0x55b8ea66cbb0 in migration_channel_connect /home/elmarco/src/qq/migration/migration.c:500 #4 0x55b8ea678f38 in socket_outgoing_migration /home/elmarco/src/qq/migration/socket.c:87 #5 0x55b8eaa5a03a in qio_task_complete /home/elmarco/src/qq/io/task.c:142 #6 0x55b8eaa599cc in gio_task_thread_result /home/elmarco/src/qq/io/task.c:88 #7 0x7f15823e38e6 (/lib64/libglib-2.0.so.0+0x468e6) SUMMARY: AddressSanitizer: heap-buffer-overflow /home/elmarco/src/qq/hw/ppc/spapr.c:1528 in htab_save_first_pass index seems to be wrongly incremented, unless I miss something that would be worth a comment. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

During block job completion, nothing is preventing block_job_defer_to_main_loop_bh from being called in a nested aio_poll(), which is a trouble, such as in this code path: qmp_block_commit commit_active_start bdrv_reopen bdrv_reopen_multiple bdrv_reopen_prepare bdrv_flush aio_poll aio_bh_poll aio_bh_call block_job_defer_to_main_loop_bh stream_complete bdrv_reopen block_job_defer_to_main_loop_bh is the last step of the stream job, which should have been "paused" by the bdrv_drained_begin/end in bdrv_reopen_multiple, but it is not done because it's in the form of a main loop BH. Similar to why block jobs should be paused between drained_begin and drained_end, BHs they schedule must be excluded as well. To achieve this, this patch forces draining the BH in BDRV_POLL_WHILE. As a side effect this fixes a hang in block_job_detach_aio_context during system_reset when a block job is ready: #0 0x0000555555aa79f3 in bdrv_drain_recurse #1 0x0000555555aa825d in bdrv_drained_begin #2 0x0000555555aa8449 in bdrv_drain #3 0x0000555555a9c356 in blk_drain #4 0x0000555555aa3cfd in mirror_drain #5 0x0000555555a66e11 in block_job_detach_aio_context #6 0x0000555555a62f4d in bdrv_detach_aio_context #7 0x0000555555a63116 in bdrv_set_aio_context #8 0x0000555555a9d326 in blk_set_aio_context #9 0x00005555557e38da in virtio_blk_data_plane_stop #10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd #11 0x00005555559fa49b in virtio_bus_stop_ioeventfd #12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd #13 0x00005555559f6a18 in virtio_pci_reset #14 0x00005555559139a9 in qdev_reset_one #15 0x0000555555916738 in qbus_walk_children #16 0x0000555555913318 in qdev_walk_children #17 0x0000555555916738 in qbus_walk_children #18 0x00005555559168ca in qemu_devices_reset #19 0x000055555581fcbb in pc_machine_reset #20 0x00005555558a4d96 in qemu_system_reset #21 0x000055555577157a in main_loop_should_exit #22 0x000055555577157a in main_loop #23 0x000055555577157a in main The rationale is that the loop in block_job_detach_aio_context cannot make any progress in pausing/completing the job, because bs->in_flight is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop BH. With this patch, it does. Reported-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170418143044.12187-3-famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Tested-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>

RH-Author: Andrea Arcangeli <aarcange@redhat.com> Message-id: <1392211818-14964-2-git-send-email-aarcange@redhat.com> Patchwork-id: 57245 O-Subject: [RHEL-7.0 qemu-kvm PATCH] fix guest physical bits to match host, to go beyond 1TB guests Bugzilla: 989677 RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com> RH-Acked-by: Andrew Jones <drjones@redhat.com> RH-Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Without this patch the guest physical bits are advertised as 40, not 44 or more depending on the hardware capability of the host. That leads to guest kernel crashes with injection of page faults 9 (see oops: 0009) as bits above 40 in the guest pagetables are considered reserved. exregion-0206 [324572448] [17] ex_system_memory_space: System-Memory (width 32) R/W 0 Address=00000000FED00000 BUG: unable to handle kernel paging request at ffffc9006030e000 IP: [<ffffffff812fbb6f>] acpi_ex_system_memory_space_handler+0x23e/0x2cb PGD e01f875067 PUD 1001f075067 PMD e0178d8067 PTE 80000000fed00173 Oops: 0009 [#1] SMP (see PUD with bit >=40 set) Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Chegu Vinod <chegu_vinod@hp.com>

RH-Author: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: <20160704205732.27022-6-marcandre.lureau@redhat.com> Patchwork-id: 70942 O-Subject: [RHEV-7.3 qemu-kvm-rhev PATCH v2 5/5] char: do not use atexit cleanup handler Bugzilla: 1347077 RH-Acked-by: Xiao Wang <jasowang@redhat.com> RH-Acked-by: Laurent Vivier <lvivier@redhat.com> RH-Acked-by: Victor Kaplansky <vkaplans@redhat.com> RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com> From: Marc-André Lureau <marcandre.lureau@redhat.com> It turns out qemu is calling exit() in various places from various threads without taking much care of resources state. The atexit() cleanup handlers cannot easily destroy resources that are in use (by the same thread or other). Since c1111a2, TCG arm guests run into the following abort() when running tests, the chardev mutex is locked during the write, so qemu_mutex_destroy() returns an error: #0 0x00007fffdbb806f5 in raise () at /lib64/libc.so.6 #1 0x00007fffdbb822fa in abort () at /lib64/libc.so.6 #2 0x00005555557616fe in error_exit (err=<optimized out>, msg=msg@entry=0x555555c38c30 <__func__.14622> "qemu_mutex_destroy") at /home/drjones/code/qemu/util/qemu-thread-posix.c:39 #3 0x0000555555b0be20 in qemu_mutex_destroy (mutex=mutex@entry=0x5555566aa0e0) at /home/drjones/code/qemu/util/qemu-thread-posix.c:57 open-power-host-os#4 0x00005555558aab00 in qemu_chr_free_common (chr=0x5555566aa0e0) at /home/drjones/code/qemu/qemu-char.c:4029 open-power-host-os#5 0x00005555558b05f9 in qemu_chr_delete (chr=<optimized out>) at /home/drjones/code/qemu/qemu-char.c:4038 open-power-host-os#6 0x00005555558b05f9 in qemu_chr_delete (chr=<optimized out>) at /home/drjones/code/qemu/qemu-char.c:4044 open-power-host-os#7 0x00005555558b062c in qemu_chr_cleanup () at /home/drjones/code/qemu/qemu-char.c:4557 open-power-host-os#8 0x00007fffdbb851e8 in __run_exit_handlers () at /lib64/libc.so.6 open-power-host-os#9 0x00007fffdbb85235 in () at /lib64/libc.so.6 open-power-host-os#10 0x00005555558d1b39 in testdev_write (testdev=0x5555566aa0a0) at /home/drjones/code/qemu/backends/testdev.c:71 open-power-host-os#11 0x00005555558d1b39 in testdev_write (chr=<optimized out>, buf=0x7fffc343fd9a "", len=0) at /home/drjones/code/qemu/backends/testdev.c:95 open-power-host-os#12 0x00005555558adced in qemu_chr_fe_write (s=0x5555566aa0e0, buf=buf@entry=0x7fffc343fd98 "0q", len=len@entry=2) at /home/drjones/code/qemu/qemu-char.c:282 Instead of using a atexit() handler, only run the chardev cleanup as initially proposed at the end of main(), where there are less chances (hic) of conflicts or other races. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reported-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit dd9250d5d99a2671584f65b7d0745355bbbe353f) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>

RH-Author: Markus Armbruster <armbru@redhat.com> Message-id: <1469602394-4025-2-git-send-email-armbru@redhat.com> Patchwork-id: 71467 O-Subject: [RHEV-7.3 qemu-kvm-rhev PATCH] block: drop support for using qcow[2] encryption with system emulators Bugzilla: 1336659 RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com> RH-Acked-by: Laurent Vivier <lvivier@redhat.com> RH-Acked-by: Daniel P. Berrange <berrange@redhat.com> From: "Daniel P. Berrange" <berrange@redhat.com> Back in the 2.3.0 release we declared qcow[2] encryption as deprecated, warning people that it would be removed in a future release. commit a1f688f Author: Markus Armbruster <armbru@redhat.com> Date: Fri Mar 13 21:09:40 2015 +0100 block: Deprecate QCOW/QCOW2 encryption The code still exists today, but by a (happy?) accident we entirely broke the ability to use qcow[2] encryption in the system emulators in the 2.4.0 release due to commit 8336aaf Author: Daniel P. Berrange <berrange@redhat.com> Date: Tue May 12 17:09:18 2015 +0100 qcow2/qcow: protect against uninitialized encryption key This commit was designed to prevent future coding bugs which might cause QEMU to read/write data on an encrypted block device in plain text mode before a decryption key is set. It turns out this preventative measure was a little too good, because we already had a long standing bug where QEMU read encrypted data in plain text mode during system emulator startup, in order to guess disk geometry: Thread 10 (Thread 0x7fffd3fff700 (LWP 30373)): #0 0x00007fffe90b1a28 in raise () at /lib64/libc.so.6 #1 0x00007fffe90b362a in abort () at /lib64/libc.so.6 #2 0x00007fffe90aa227 in __assert_fail_base () at /lib64/libc.so.6 #3 0x00007fffe90aa2d2 in () at /lib64/libc.so.6 open-power-host-os#4 0x000055555587ae19 in qcow2_co_readv (bs=0x5555562accb0, sector_num=0, remaining_sectors=1, qiov=0x7fffffffd260) at block/qcow2.c:1229 open-power-host-os#5 0x000055555589b60d in bdrv_aligned_preadv (bs=bs@entry=0x5555562accb0, req=req@entry=0x7fffd3ffea50, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7fffffffd260, flags=0) at block/io.c:908 open-power-host-os#6 0x000055555589b8bc in bdrv_co_do_preadv (bs=0x5555562accb0, offset=0, bytes=512, qiov=0x7fffffffd260, flags=<optimized out>) at block/io.c:999 open-power-host-os#7 0x000055555589c375 in bdrv_rw_co_entry (opaque=0x7fffffffd210) at block/io.c:544 open-power-host-os#8 0x000055555586933b in coroutine_thread (opaque=0x555557876310) at coroutine-gthread.c:134 open-power-host-os#9 0x00007ffff64e1835 in g_thread_proxy (data=0x5555562b5590) at gthread.c:778 open-power-host-os#10 0x00007ffff6bb760a in start_thread () at /lib64/libpthread.so.0 open-power-host-os#11 0x00007fffe917f59d in clone () at /lib64/libc.so.6 Thread 1 (Thread 0x7ffff7ecab40 (LWP 30343)): #0 0x00007fffe91797a9 in syscall () at /lib64/libc.so.6 #1 0x00007ffff64ff87f in g_cond_wait (cond=cond@entry=0x555555e085f0 <coroutine_cond>, mutex=mutex@entry=0x555555e08600 <coroutine_lock>) at gthread-posix.c:1397 #2 0x00005555558692c3 in qemu_coroutine_switch (co=<optimized out>) at coroutine-gthread.c:117 #3 0x00005555558692c3 in qemu_coroutine_switch (from_=0x5555562b5e30, to_=to_@entry=0x555557876310, action=action@entry=COROUTINE_ENTER) at coroutine-gthread.c:175 open-power-host-os#4 0x0000555555868a90 in qemu_coroutine_enter (co=0x555557876310, opaque=0x0) at qemu-coroutine.c:116 open-power-host-os#5 0x0000555555859b84 in thread_pool_completion_bh (opaque=0x7fffd40010e0) at thread-pool.c:187 open-power-host-os#6 0x0000555555859514 in aio_bh_poll (ctx=ctx@entry=0x5555562953b0) at async.c:85 open-power-host-os#7 0x0000555555864d10 in aio_dispatch (ctx=ctx@entry=0x5555562953b0) at aio-posix.c:135 open-power-host-os#8 0x0000555555864f75 in aio_poll (ctx=ctx@entry=0x5555562953b0, blocking=blocking@entry=true) at aio-posix.c:291 open-power-host-os#9 0x000055555589c40d in bdrv_prwv_co (bs=bs@entry=0x5555562accb0, offset=offset@entry=0, qiov=qiov@entry=0x7fffffffd260, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:591 open-power-host-os#10 0x000055555589c503 in bdrv_rw_co (bs=bs@entry=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:614 open-power-host-os#11 0x000055555589c562 in bdrv_read_unthrottled (nb_sectors=21845, buf=0x7fffffffd2e0 "\321,", sector_num=0, bs=0x5555562accb0) at block/io.c:622 open-power-host-os#12 0x000055555589c562 in bdrv_read_unthrottled (bs=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845) at block/io.c:634 nb_sectors@entry=1) at block/block-backend.c:504 open-power-host-os#14 0x0000555555752e9f in guess_disk_lchs (blk=blk@entry=0x5555562a5290, pcylinders=pcylinders@entry=0x7fffffffd52c, pheads=pheads@entry=0x7fffffffd530, psectors=psectors@entry=0x7fffffffd534) at hw/block/hd-geometry.c:68 open-power-host-os#15 0x0000555555752ff7 in hd_geometry_guess (blk=0x5555562a5290, pcyls=pcyls@entry=0x555557875d1c, pheads=pheads@entry=0x555557875d20, psecs=psecs@entry=0x555557875d24, ptrans=ptrans@entry=0x555557875d28) at hw/block/hd-geometry.c:133 open-power-host-os#16 0x0000555555752b87 in blkconf_geometry (conf=conf@entry=0x555557875d00, ptrans=ptrans@entry=0x555557875d28, cyls_max=cyls_max@entry=65536, heads_max=heads_max@entry=16, secs_max=secs_max@entry=255, errp=errp@entry=0x7fffffffd5e0) at hw/block/block.c:71 open-power-host-os#17 0x0000555555799bc4 in ide_dev_initfn (dev=0x555557875c80, kind=IDE_HD) at hw/ide/qdev.c:174 open-power-host-os#18 0x0000555555768394 in device_realize (dev=0x555557875c80, errp=0x7fffffffd640) at hw/core/qdev.c:247 open-power-host-os#19 0x0000555555769a81 in device_set_realized (obj=0x555557875c80, value=<optimized out>, errp=0x7fffffffd730) at hw/core/qdev.c:1058 open-power-host-os#20 0x00005555558240ce in property_set_bool (obj=0x555557875c80, v=<optimized out>, opaque=0x555557875de0, name=<optimized out>, errp=0x7fffffffd730) at qom/object.c:1514 open-power-host-os#21 0x0000555555826c87 in object_property_set_qobject (obj=obj@entry=0x555557875c80, value=value@entry=0x55555784bcb0, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/qom-qobject.c:24 open-power-host-os#22 0x0000555555825760 in object_property_set_bool (obj=obj@entry=0x555557875c80, value=value@entry=true, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/object.c:905 open-power-host-os#23 0x000055555576897b in qdev_init_nofail (dev=dev@entry=0x555557875c80) at hw/core/qdev.c:380 open-power-host-os#24 0x0000555555799ead in ide_create_drive (bus=bus@entry=0x555557629630, unit=unit@entry=0, drive=0x5555562b77e0) at hw/ide/qdev.c:122 open-power-host-os#25 0x000055555579a746 in pci_ide_create_devs (dev=dev@entry=0x555557628db0, hd_table=hd_table@entry=0x7fffffffd830) at hw/ide/pci.c:440 open-power-host-os#26 0x000055555579b165 in pci_piix3_ide_init (bus=<optimized out>, hd_table=0x7fffffffd830, devfn=<optimized out>) at hw/ide/piix.c:218 open-power-host-os#27 0x000055555568ca55 in pc_init1 (machine=0x5555562960a0, pci_enabled=1, kvmclock_enabled=<optimized out>) at /home/berrange/src/virt/qemu/hw/i386/pc_piix.c:256 open-power-host-os#28 0x0000555555603ab2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4249 So the safety net is correctly preventing QEMU reading cipher text as if it were plain text, during startup and aborting QEMU to avoid bad usage of this data. For added fun this bug only happens if the encrypted qcow2 file happens to have data written to the first cluster, otherwise the cluster won't be allocated and so qcow2 would not try the decryption routines at all, just return all 0's. That no one even noticed, let alone reported, this bug that has shipped in 2.4.0, 2.5.0 and 2.6.0 shows that the number of actual users of encrypted qcow2 is approximately zero. So rather than fix the crash, and backport it to stable releases, just go ahead with what we have warned users about and disable any use of qcow2 encryption in the system emulators. qemu-img/qemu-io/qemu-nbd are still able to access qcow2 encrypted images for the sake of data conversion. In the future, qcow2 will gain support for the alternative luks format, but when this happens it'll be using the '-object secret' infrastructure for getting keys, which avoids this problematic scenario entirely. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit 8c0dcbc) Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>

RH-Author: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: <20160810200144.31810-2-marcandre.lureau@redhat.com> Patchwork-id: 71914 O-Subject: [RHEV-7.3 qemu-kvm-rhev PATCH 1/2] monitor: fix crash when leaving qemu with spice audio Bugzilla: 1355704 RH-Acked-by: Thomas Huth <thuth@redhat.com> RH-Acked-by: Markus Armbruster <armbru@redhat.com> RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com> Since aa5cb7f, the chardevs are being cleaned up when leaving qemu. However, the monitor has still references to them, which may lead to crashes when running atexit() and trying to send monitor events: #0 0x00007fffdb18f6f5 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54 #1 0x00007fffdb1912fa in __GI_abort () at abort.c:89 #2 0x0000555555c263e7 in error_exit (err=22, msg=0x555555d47980 <__func__.13537> "qemu_mutex_lock") at util/qemu-thread-posix.c:39 #3 0x0000555555c26488 in qemu_mutex_lock (mutex=0x5555567a2420) at util/qemu-thread-posix.c:66 open-power-host-os#4 0x00005555558c52db in qemu_chr_fe_write (s=0x5555567a2420, buf=0x55555740dc40 "{\"timestamp\": {\"seconds\": 1470041716, \"microseconds\": 989699}, \"event\": \"SPICE_DISCONNECTED\", \"data\": {\"server\": {\"port\": \"5900\", \"family\": \"ipv4\", \"host\": \"127.0.0.1\"}, \"client\": {\"port\": \"40272\", \"f"..., len=240) at qemu-char.c:280 open-power-host-os#5 0x0000555555787cad in monitor_flush_locked (mon=0x5555567bd9e0) at /home/elmarco/src/qemu/monitor.c:311 open-power-host-os#6 0x0000555555787e46 in monitor_puts (mon=0x5555567bd9e0, str=0x5555567a44ef "") at /home/elmarco/src/qemu/monitor.c:353 open-power-host-os#7 0x00005555557880fe in monitor_json_emitter (mon=0x5555567bd9e0, data=0x5555567c73a0) at /home/elmarco/src/qemu/monitor.c:401 open-power-host-os#8 0x00005555557882d2 in monitor_qapi_event_emit (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x5555567c73a0) at /home/elmarco/src/qemu/monitor.c:472 open-power-host-os#9 0x000055555578838f in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x5555567c73a0, errp=0x7fffffffca88) at /home/elmarco/src/qemu/monitor.c:497 open-power-host-os#10 0x0000555555c15541 in qapi_event_send_spice_disconnected (server=0x5555571139d0, client=0x5555570d0db0, errp=0x5555566c0428 <error_abort>) at qapi-event.c:1038 open-power-host-os#11 0x0000555555b11bc6 in channel_event (event=3, info=0x5555570d6c00) at ui/spice-core.c:248 open-power-host-os#12 0x00007fffdcc9983a in adapter_channel_event (event=3, info=0x5555570d6c00) at reds.c:120 open-power-host-os#13 0x00007fffdcc99a25 in reds_handle_channel_event (reds=0x5555567a9d60, event=3, info=0x5555570d6c00) at reds.c:324 open-power-host-os#14 0x00007fffdcc7d4c4 in main_dispatcher_self_handle_channel_event (self=0x5555567b28b0, event=3, info=0x5555570d6c00) at main-dispatcher.c:175 open-power-host-os#15 0x00007fffdcc7d5b1 in main_dispatcher_channel_event (self=0x5555567b28b0, event=3, info=0x5555570d6c00) at main-dispatcher.c:194 open-power-host-os#16 0x00007fffdcca7674 in reds_stream_push_channel_event (s=0x5555570d9910, event=3) at reds-stream.c:354 open-power-host-os#17 0x00007fffdcca749b in reds_stream_free (s=0x5555570d9910) at reds-stream.c:323 open-power-host-os#18 0x00007fffdccb5dad in snd_disconnect_channel (channel=0x5555576a89a0) at sound.c:229 open-power-host-os#19 0x00007fffdccb9e57 in snd_detach_common (worker=0x555557739720) at sound.c:1589 open-power-host-os#20 0x00007fffdccb9f0e in snd_detach_playback (sin=0x5555569fe3f8) at sound.c:1602 open-power-host-os#21 0x00007fffdcca3373 in spice_server_remove_interface (sin=0x5555569fe3f8) at reds.c:3387 open-power-host-os#22 0x00005555558ff6e2 in line_out_fini (hw=0x5555569fe370) at audio/spiceaudio.c:152 open-power-host-os#23 0x00005555558f909e in audio_atexit () at audio/audio.c:1754 open-power-host-os#24 0x00007fffdb1941e8 in __run_exit_handlers (status=0, listp=0x7fffdb5175d8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82 open-power-host-os#25 0x00007fffdb194235 in __GI_exit (status=<optimized out>) at exit.c:104 open-power-host-os#26 0x00007fffdb17b738 in __libc_start_main (main=0x5555558d7874 <main>, argc=67, argv=0x7fffffffcf48, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffcf38) at ../csu/libc-start.c:323 Add a monitor_cleanup() functions to remove all the monitors before cleaning up the chardev. Note that we are "losing" some events that used to be sent during atexit(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20160801112343.29082-2-marcandre.lureau@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> (cherry picked from commit 2ef4571) BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1355704 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>

@entry

RH-Author: Gerd Hoffmann <kraxel@redhat.com> Message-id: <1470920000-23113-2-git-send-email-kraxel@redhat.com> Patchwork-id: 71946 O-Subject: [RHEL-7.3 qemu-kvm-rhev PATCH 1/3] vnc: don't crash getting server info if lsock is NULL Bugzilla: 1359655 RH-Acked-by: Thomas Huth <thuth@redhat.com> RH-Acked-by: Marcel Apfelbaum <marcel@redhat.com> RH-Acked-by: Markus Armbruster <armbru@redhat.com> From: "Daniel P. Berrange" <berrange@redhat.com> When VNC is started with '-vnc none' there will be no listener socket present. When we try to populate the VncServerInfo we'll crash accessing a NULL 'lsock' field. #0 qio_channel_socket_get_local_address (ioc=0x0, errp=errp@entry=0x7ffd5b8aa0f0) at io/channel-socket.c:33 #1 0x00007f4b9a297d6f in vnc_init_basic_info_from_server_addr (errp=0x7ffd5b8aa0f0, info=0x7f4b9d425460, ioc=<optimized out>) at ui/vnc.c:146 #2 vnc_server_info_get (vd=0x7f4b9e858000) at ui/vnc.c:223 #3 0x00007f4b9a29d318 in vnc_qmp_event (vs=0x7f4b9ef82000, vs=0x7f4b9ef82000, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:279 open-power-host-os#4 vnc_connect (vd=vd@entry=0x7f4b9e858000, sioc=sioc@entry=0x7f4b9e8b3a20, skipauth=skipauth@entry=true, websocket=websocket @entry=false) at ui/vnc.c:2994 open-power-host-os#5 0x00007f4b9a29e8c8 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>) at ui/v nc.c:3825 open-power-host-os#6 0x00007f4b9a18d8a1 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7ffd5b8aa230) at qmp-marsh al.c:123 open-power-host-os#7 0x00007f4b9a0b53f5 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.6.0/mon itor.c:3922 open-power-host-os#8 0x00007f4b9a348580 in json_message_process_token (lexer=0x7f4b9c78dfe8, input=0x7f4b9c7350e0, type=JSON_RCURLY, x=111, y=5 9) at qobject/json-streamer.c:94 open-power-host-os#9 0x00007f4b9a35cfeb in json_lexer_feed_char (lexer=lexer@entry=0x7f4b9c78dfe8, ch=125 '}', flush=flush@entry=false) at qobj ect/json-lexer.c:310 open-power-host-os#10 0x00007f4b9a35d0ae in json_lexer_feed (lexer=0x7f4b9c78dfe8, buffer=<optimized out>, size=<optimized out>) at qobject/json -lexer.c:360 open-power-host-os#11 0x00007f4b9a348679 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at q object/json-streamer.c:114 open-power-host-os#12 0x00007f4b9a0b3a1b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/deb ug/qemu-2.6.0/monitor.c:3938 open-power-host-os#13 0x00007f4b9a186751 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x7f4b9c7add40) at qemu-char.c:2895 open-power-host-os#14 0x00007f4b92b5c79a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 open-power-host-os#15 0x00007f4b9a2bb0c0 in glib_pollfds_poll () at main-loop.c:213 open-power-host-os#16 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:258 open-power-host-os#17 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506 open-power-host-os#18 0x00007f4b9a0835cf in main_loop () at vl.c:1934 open-power-host-os#19 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4667 Do an upfront check for a NULL lsock and report an error to the caller, which matches behaviour from before commit 04d2529 Author: Daniel P. Berrange <berrange@redhat.com> Date: Fri Feb 27 16:20:57 2015 +0000 ui: convert VNC server to use QIOChannelSocket where getsockname() would be given a FD value -1 and thus report an error to the caller. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1470134726-15697-2-git-send-email-berrange@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> (cherry picked from commit 624cdd4) Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>

RH-Author: Gerd Hoffmann <kraxel@redhat.com> Message-id: <1470920000-23113-3-git-send-email-kraxel@redhat.com> Patchwork-id: 71947 O-Subject: [RHEL-7.3 qemu-kvm-rhev PATCH 2/3] vnc: fix crash when vnc_server_info_get has an error Bugzilla: 1359655 RH-Acked-by: Thomas Huth <thuth@redhat.com> RH-Acked-by: Marcel Apfelbaum <marcel@redhat.com> RH-Acked-by: Markus Armbruster <armbru@redhat.com> From: "Daniel P. Berrange" <berrange@redhat.com> The vnc_server_info_get will allocate the VncServerInfo struct and then call vnc_init_basic_info_from_server_addr to populate the basic fields. If this returns an error though, the qapi_free_VncServerInfo call will then crash because the VncServerInfo struct instance was not properly NULL-initialized and thus contains random stack garbage. #0 0x00007f1987c8e6f5 in raise () at /lib64/libc.so.6 #1 0x00007f1987c902fa in abort () at /lib64/libc.so.6 #2 0x00007f1987ccf600 in __libc_message () at /lib64/libc.so.6 #3 0x00007f1987cd7d4a in _int_free () at /lib64/libc.so.6 open-power-host-os#4 0x00007f1987cdb2ac in free () at /lib64/libc.so.6 open-power-host-os#5 0x00007f198b654f6e in g_free () at /lib64/libglib-2.0.so.0 open-power-host-os#6 0x0000559193cdcf54 in visit_type_str (v=v@entry= 0x5591972f14b0, name=name@entry=0x559193de1e29 "host", obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899d80) at qapi/qapi-visit-core.c:255 open-power-host-os#7 0x0000559193cca8f3 in visit_type_VncBasicInfo_members (v=v@entry= 0x5591972f14b0, obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899dc0) at qapi-visit.c:12307 open-power-host-os#8 0x0000559193ccb523 in visit_type_VncServerInfo_members (v=v@entry= 0x5591972f14b0, obj=0x5591961dbfa0, errp=errp@entry=0x7fffd7899e00) at qapi-visit.c:12632 open-power-host-os#9 0x0000559193ccb60b in visit_type_VncServerInfo (v=v@entry= 0x5591972f14b0, name=name@entry=0x0, obj=obj@entry=0x7fffd7899e48, errp=errp@entry=0x0) at qapi-visit.c:12658 open-power-host-os#10 0x0000559193cb53d8 in qapi_free_VncServerInfo (obj=<optimized out>) at qapi-types.c:3970 open-power-host-os#11 0x0000559193c1e6ba in vnc_server_info_get (vd=0x7f1951498010) at ui/vnc.c:233 open-power-host-os#12 0x0000559193c24275 in vnc_connect (vs=0x559197b2f200, vs=0x559197b2f200, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:284 open-power-host-os#13 0x0000559193c24275 in vnc_connect (vd=vd@entry=0x7f1951498010, sioc=sioc@entry=0x559196bf9c00, skipauth=skipauth@entry=tru e, websocket=websocket@entry=false) at ui/vnc.c:3039 open-power-host-os#14 0x0000559193c25806 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>) at ui/vnc.c:3877 open-power-host-os#15 0x0000559193a90c28 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7fffd7899f90) at qmp-marshal.c:105 open-power-host-os#16 0x000055919399c2b7 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /home/berrange/src/virt/qemu/monitor.c:3971 open-power-host-os#17 0x0000559193ce3307 in json_message_process_token (lexer=0x559194ab0838, input=0x559194a6d940, type=JSON_RCURLY, x=111, y=1 2) at qobject/json-streamer.c:105 open-power-host-os#18 0x0000559193cfa90d in json_lexer_feed_char (lexer=lexer@entry=0x559194ab0838, ch=125 '}', flush=flush@entry=false) at qobject/json-lexer.c:319 open-power-host-os#19 0x0000559193cfaa1e in json_lexer_feed (lexer=0x559194ab0838, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:369 open-power-host-os#20 0x0000559193ce33c9 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:124 open-power-host-os#21 0x000055919399a85b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /home/berrange/src/virt/qemu/monitor.c:3987 open-power-host-os#22 0x0000559193a87d00 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x559194a7d900) at qemu-char.c:2895 open-power-host-os#23 0x00007f198b64f703 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0 open-power-host-os#24 0x0000559193c484b3 in main_loop_wait () at main-loop.c:213 open-power-host-os#25 0x0000559193c484b3 in main_loop_wait (timeout=<optimized out>) at main-loop.c:258 open-power-host-os#26 0x0000559193c484b3 in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506 open-power-host-os#27 0x0000559193964c55 in main () at vl.c:1908 open-power-host-os#28 0x0000559193964c55 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4603 This was introduced in commit 98481bf Author: Eric Blake <eblake@redhat.com> Date: Mon Oct 26 16:34:45 2015 -0600 vnc: Hoist allocation of VncBasicInfo to callers which added error reporting for vnc_init_basic_info_from_server_addr but didn't change the g_malloc calls to g_malloc0. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 1470134726-15697-3-git-send-email-berrange@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> (cherry picked from commit 3e7f136) Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>

RH-Author: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: <20161129073543.13711-11-marcandre.lureau@redhat.com> Patchwork-id: 72914 O-Subject: [RHEV-7.3.z qemu-kvm-rhev PATCH 10/10] ahci: fix sglist leak on retry Bugzilla: 1397745 RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com> RH-Acked-by: Laurent Vivier <lvivier@redhat.com> RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com> ahci-test /x86_64/ahci/io/dma/lba28/retry triggers the following leak: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7fc4b2a25e20 in malloc (/lib64/libasan.so.3+0xc6e20) #1 0x7fc4993bce58 in g_malloc (/lib64/libglib-2.0.so.0+0x4ee58) #2 0x556a187d4b34 in ahci_populate_sglist hw/ide/ahci.c:896 #3 0x556a187d8237 in ahci_dma_prepare_buf hw/ide/ahci.c:1367 open-power-host-os#4 0x556a187b5a1a in ide_dma_cb hw/ide/core.c:844 open-power-host-os#5 0x556a187d7eec in ahci_start_dma hw/ide/ahci.c:1333 open-power-host-os#6 0x556a187b650b in ide_start_dma hw/ide/core.c:921 open-power-host-os#7 0x556a187b61e6 in ide_sector_start_dma hw/ide/core.c:911 open-power-host-os#8 0x556a187b9e26 in cmd_write_dma hw/ide/core.c:1486 open-power-host-os#9 0x556a187bd519 in ide_exec_cmd hw/ide/core.c:2027 open-power-host-os#10 0x556a187d71c5 in handle_reg_h2d_fis hw/ide/ahci.c:1204 open-power-host-os#11 0x556a187d7681 in handle_cmd hw/ide/ahci.c:1254 open-power-host-os#12 0x556a187d168a in check_cmd hw/ide/ahci.c:510 open-power-host-os#13 0x556a187d0afc in ahci_port_write hw/ide/ahci.c:314 open-power-host-os#14 0x556a187d105d in ahci_mem_write hw/ide/ahci.c:435 open-power-host-os#15 0x556a1831d959 in memory_region_write_accessor /home/elmarco/src/qemu/memory.c:525 open-power-host-os#16 0x556a1831dc35 in access_with_adjusted_size /home/elmarco/src/qemu/memory.c:591 open-power-host-os#17 0x556a18323ce3 in memory_region_dispatch_write /home/elmarco/src/qemu/memory.c:1262 open-power-host-os#18 0x556a1828cf67 in address_space_write_continue /home/elmarco/src/qemu/exec.c:2578 open-power-host-os#19 0x556a1828d20b in address_space_write /home/elmarco/src/qemu/exec.c:2635 open-power-host-os#20 0x556a1828d92b in address_space_rw /home/elmarco/src/qemu/exec.c:2737 open-power-host-os#21 0x556a1828daf7 in cpu_physical_memory_rw /home/elmarco/src/qemu/exec.c:2746 open-power-host-os#22 0x556a183068d3 in cpu_physical_memory_write /home/elmarco/src/qemu/include/exec/cpu-common.h:72 open-power-host-os#23 0x556a18308194 in qtest_process_command /home/elmarco/src/qemu/qtest.c:382 open-power-host-os#24 0x556a18309999 in qtest_process_inbuf /home/elmarco/src/qemu/qtest.c:573 open-power-host-os#25 0x556a18309a4a in qtest_read /home/elmarco/src/qemu/qtest.c:585 open-power-host-os#26 0x556a18598b85 in qemu_chr_be_write_impl /home/elmarco/src/qemu/qemu-char.c:387 open-power-host-os#27 0x556a18598c52 in qemu_chr_be_write /home/elmarco/src/qemu/qemu-char.c:399 open-power-host-os#28 0x556a185a2afa in tcp_chr_read /home/elmarco/src/qemu/qemu-char.c:2902 open-power-host-os#29 0x556a18cbaf52 in qio_channel_fd_source_dispatch io/channel-watch.c:84 Follow John Snow recommendation: Everywhere else ncq_err is used, it is accompanied by a list cleanup except for ncq_cb, which is the case you are fixing here. Move the sglist destruction inside of ncq_err and then delete it from the other two locations to keep it tidy. Call dma_buf_commit in ide_dma_cb after the early return. Though, this is also a little wonky because this routine does more than clear the list, but it is at the moment the centralized "we're done with the sglist" function and none of the other side effects that occur in dma_buf_commit will interfere with the reset that occurs from ide_restart_bh, I think Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> (cherry picked from commit 5839df7) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>

RH-Author: Marc-André Lureau <marcandre.lureau@redhat.com> Message-id: <20170104133125.1134-1-marcandre.lureau@redhat.com> Patchwork-id: 73171 O-Subject: [RHEV-7.3.z qemu-kvm-rhev PATCH] net: don't poke at chardev internal QemuOpts Bugzilla: 1410200 RH-Acked-by: Michael S. Tsirkin <mst@redhat.com> RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com> RH-Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> RH-Acked-by: Victor Kaplansky <vkaplans@redhat.com> From: "Daniel P. Berrange" <berrange@redhat.com> The vhost-user & colo code is poking at the QemuOpts instance in the CharDriverState struct, not realizing that it is valid for this to be NULL. e.g. the following crash shows a codepath where it will be NULL: Program terminated with signal SIGSEGV, Segmentation fault. #0 0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617 617 QTAILQ_FOREACH(opt, &opts->head, next) { [Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))] (gdb) bt #0 0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617 #1 0x000055baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260, errp=0x7ffc51368e48) at net/vhost-user.c:314 #2 0x000055baf696b985 in net_init_vhost_user (netdev=0x55baf8ff9250, name=0x55baf879d270 "hostnet2", peer=0x0, errp=0x7ffc51368e48) at net/vhost-user.c:360 #3 0x000055baf6960216 in net_client_init1 (object=0x55baf8ff9250, is_netdev=true, errp=0x7ffc51368e48) at net/net.c:1051 open-power-host-os#4 0x000055baf6960518 in net_client_init (opts=0x55baf776e7e0, is_netdev=true, errp=0x7ffc51368f00) at net/net.c:1108 open-power-host-os#5 0x000055baf696083f in netdev_add (opts=0x55baf776e7e0, errp=0x7ffc51368f00) at net/net.c:1186 open-power-host-os#6 0x000055baf69608c7 in qmp_netdev_add (qdict=0x55baf7afaf60, ret=0x7ffc51368f50, errp=0x7ffc51368f48) at net/net.c:1205 open-power-host-os#7 0x000055baf6622135 in handle_qmp_command (parser=0x55baf77fb590, tokens=0x7f1d24011960) at /path/to/qemu.git/monitor.c:3978 open-power-host-os#8 0x000055baf6a9d099 in json_message_process_token (lexer=0x55baf77fb598, input=0x55baf75acd20, type=JSON_RCURLY, x=113, y=19) at qobject/json-streamer.c:105 open-power-host-os#9 0x000055baf6abf7aa in json_lexer_feed_char (lexer=0x55baf77fb598, ch=125 '}', flush=false) at qobject/json-lexer.c:319 open-power-host-os#10 0x000055baf6abf8f2 in json_lexer_feed (lexer=0x55baf77fb598, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-lexer.c:369 open-power-host-os#11 0x000055baf6a9d13c in json_message_parser_feed (parser=0x55baf77fb590, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-streamer.c:124 open-power-host-os#12 0x000055baf66221f7 in monitor_qmp_read (opaque=0x55baf77fb530, buf=0x7ffc51369170 "}R\204\367\272U", size=1) at /path/to/qemu.git/monitor.c:3994 open-power-host-os#13 0x000055baf6757014 in qemu_chr_be_write_impl (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:387 open-power-host-os#14 0x000055baf6757076 in qemu_chr_be_write (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:399 open-power-host-os#15 0x000055baf675b3b0 in tcp_chr_read (chan=0x55baf90244b0, cond=G_IO_IN, opaque=0x55baf7610a40) at qemu-char.c:2927 open-power-host-os#16 0x000055baf6a5d655 in qio_channel_fd_source_dispatch (source=0x55baf7610df0, callback=0x55baf675b25a <tcp_chr_read>, user_data=0x55baf7610a40) at io/channel-watch.c:84 open-power-host-os#17 0x00007f1d3e80cbbd in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 open-power-host-os#18 0x000055baf69d3720 in glib_pollfds_poll () at main-loop.c:213 open-power-host-os#19 0x000055baf69d37fd in os_host_main_loop_wait (timeout=126000000) at main-loop.c:258 open-power-host-os#20 0x000055baf69d38ad in main_loop_wait (nonblocking=0) at main-loop.c:506 open-power-host-os#21 0x000055baf676587b in main_loop () at vl.c:1908 open-power-host-os#22 0x000055baf676d3bf in main (argc=101, argv=0x7ffc5136a6c8, envp=0x7ffc5136a9f8) at vl.c:4604 (gdb) p opts $1 = (QemuOpts *) 0x0 The crash occurred when attaching vhost-user net via QMP: { "execute": "chardev-add", "arguments": { "id": "charnet2", "backend": { "type": "socket", "data": { "addr": { "type": "unix", "data": { "path": "/var/run/openvswitch/vhost-user1" } }, "wait": false, "server": false } } }, "id": "libvirt-19" } { "return": { }, "id": "libvirt-19" } { "execute": "netdev_add", "arguments": { "type": "vhost-user", "chardev": "charnet2", "id": "hostnet2" }, "id": "libvirt-20" } Code using chardevs should not be poking at the internals of the CharDriverState struct. What vhost-user wants is a chardev that is operating as reconnectable network service, along with the ability to do FD passing over the connection. The colo code simply wants a network service. Add a feature concept to the char drivers so that chardev users can query the actual features they wish to have supported. The QemuOpts member is removed to prevent future mistakes in this area. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 0a73336) BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1394140 Brew: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12298577 [ Marc-André - backport: drop colo-compare bits ] Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>

"nc" is freed after hotplug vhost-user, but the watcher is not removed. The QEMU crash when the watcher access the "nc" when socket disconnects. Program received signal SIGSEGV, Segmentation fault. #0 object_get_class (obj=obj@entry=0x2) at qom/object.c:750 #1 0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized out>) at chardev/char-fe.c:372 #2 0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized out>, cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188 #3 0x00007f9baf97f99a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #4 0x00007f9bb41d7ebc in glib_pollfds_poll () at util/main-loop.c:213 #5 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261 #6 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515 #7 0x00007f9bb3e266a7 in main_loop () at vl.c:1917 #8 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4786 Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

currently for sh4 cpu_model argument for '-cpu' option could be either 'cpu model' name or cpu_typename. however typically '-cpu' takes 'cpu model' name and cpu type for sh4 target isn't advertised publicly ('-cpu help' prints only 'cpu model' names) so we shouldn't care about this use case (it's more of a bug). 1. Drop '-cpu cpu_typename' to align with the rest of targets. 2. Compose searched for typename from cpu model and use it with object_class_by_name() directly instead of over-complicated object_class_get_list() g_slist_find_custom() + superh_cpu_name_compare() With #1 droped, #2 could be used for both lookups which simplifies superh_cpu_class_by_name() quite a bit. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <1507211474-188400-23-git-send-email-imammedo@redhat.com> [ehabkost: Include fixup sent by Igor] Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

when qemu is started with '-no-acpi' CLI option, an attempt to unplug a CPU using device_del results in null pointer dereference at: #0 object_get_class #1 pc_machine_device_unplug_request_cb #2 qmp_marshal_device_del which is caused by pcms->acpi_dev == NULL due to ACPI support being disabled. Considering that ACPI support is necessary for unplug to work, check that it's enabled and fail unplug request gracefully if no acpi device were found. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

The passed-through device might be an express device. In this case the old code allocated a too small emulated config space in pci_config_alloc() since pci_config_size() returned the size for a non-express device. This leads to an out-of-bound write in xen_pt_config_reg_init(), which sometimes results in crashes. So set is_express as already done for KVM in vfio-pci. Shortened ASan report: ==17512==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x611000041648 at pc 0x55e0fdac51ff bp 0x7ffe4af07410 sp 0x7ffe4af07408 WRITE of size 2 at 0x611000041648 thread T0 #0 0x55e0fdac51fe in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53 #1 0x55e0fdac51fe in stw_he_p include/qemu/bswap.h:330 #2 0x55e0fdac51fe in stw_le_p include/qemu/bswap.h:379 #3 0x55e0fdac51fe in pci_set_word include/hw/pci/pci.h:490 #4 0x55e0fdac51fe in xen_pt_config_reg_init hw/xen/xen_pt_config_init.c:1991 #5 0x55e0fdac51fe in xen_pt_config_init hw/xen/xen_pt_config_init.c:2067 #6 0x55e0fdabcf4d in xen_pt_realize hw/xen/xen_pt.c:830 #7 0x55e0fdf59666 in pci_qdev_realize hw/pci/pci.c:2034 #8 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914 [...] 0x611000041648 is located 8 bytes to the right of 256-byte region [0x611000041540,0x611000041640) allocated by thread T0 here: #0 0x7ff596a94bb8 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xd9bb8) #1 0x7ff57da66580 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x50580) #2 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914 [...] Signed-off-by: Simon Gaiser <hw42@ipsumj.de> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>

bdrv_unref() requires the AioContext lock because bdrv_flush() uses BDRV_POLL_WHILE(), which assumes the AioContext is currently held. If BDRV_POLL_WHILE() runs without AioContext held the pthread_mutex_unlock() call in aio_context_release() fails. This patch moves bdrv_unref() into the AioContext locked region to solve the following pthread_mutex_unlock() failure: #0 0x00007f566181969b in raise () at /lib64/libc.so.6 #1 0x00007f566181b3b1 in abort () at /lib64/libc.so.6 #2 0x00005592cd590458 in error_exit (err=<optimized out>, msg=msg@entry=0x5592cdaf6d60 <__func__.23977> "qemu_mutex_unlock") at util/qemu-thread-posix.c:36 #3 0x00005592cd96e738 in qemu_mutex_unlock (mutex=mutex@entry=0x5592ce9505e0) at util/qemu-thread-posix.c:96 #4 0x00005592cd969b69 in aio_context_release (ctx=ctx@entry=0x5592ce950580) at util/async.c:507 #5 0x00005592cd8ead78 in bdrv_flush (bs=bs@entry=0x5592cfa87210) at block/io.c:2478 #6 0x00005592cd89df30 in bdrv_close (bs=0x5592cfa87210) at block.c:3207 #7 0x00005592cd89df30 in bdrv_delete (bs=0x5592cfa87210) at block.c:3395 #8 0x00005592cd89df30 in bdrv_unref (bs=0x5592cfa87210) at block.c:4418 #9 0x00005592cd6b7f86 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=<optimized out>, errp=errp@entry=0x7ffe4a1fc9d8) at blockdev.c:2308 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20171206144550.22295-2-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

There is a small chance that iothread_stop() hangs as follows: Thread 3 (Thread 0x7f63eba5f700 (LWP 16105)): #0 0x00007f64012c09b6 in ppoll () at /lib64/libc.so.6 #1 0x000055959992eac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77 #2 0x000055959992eac9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322 #3 0x0000559599930711 in aio_poll (ctx=0x55959bdb83c0, blocking=blocking@entry=true) at util/aio-posix.c:629 #4 0x00005595996806fe in iothread_run (opaque=0x55959bd78400) at iothread.c:59 #5 0x00007f640159f609 in start_thread () at /lib64/libpthread.so.0 #6 0x00007f64012cce6f in clone () at /lib64/libc.so.6 Thread 1 (Thread 0x7f640b45b280 (LWP 16103)): #0 0x00007f64015a0b6d in pthread_join () at /lib64/libpthread.so.0 #1 0x00005595999332ef in qemu_thread_join (thread=<optimized out>) at util/qemu-thread-posix.c:547 #2 0x00005595996808ae in iothread_stop (iothread=<optimized out>) at iothread.c:91 #3 0x000055959968094d in iothread_stop_iter (object=<optimized out>, opaque=<optimized out>) at iothread.c:102 #4 0x0000559599857d97 in do_object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0, recurse=recurse@entry=false) at qom/object.c:852 #5 0x0000559599859477 in object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0) at qom/object.c:867 #6 0x0000559599680a6e in iothread_stop_all () at iothread.c:341 #7 0x000055959955b1d5 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4913 The relevant code from iothread_run() is: while (!atomic_read(&iothread->stopping)) { aio_poll(iothread->ctx, true); and iothread_stop(): iothread->stopping = true; aio_notify(iothread->ctx); ... qemu_thread_join(&iothread->thread); The following scenario can occur: 1. IOThread: while (!atomic_read(&iothread->stopping)) -> stopping=false 2. Main loop: iothread->stopping = true; aio_notify(iothread->ctx); 3. IOThread: aio_poll(iothread->ctx, true); -> hang The bug is explained by the AioContext->notify_me doc comments: "If this field is 0, everything (file descriptors, bottom halves, timers) will be re-evaluated before the next blocking poll(), thus the event_notifier_set call can be skipped." The problem is that "everything" does not include checking iothread->stopping. This means iothread_run() will block in aio_poll() if aio_notify() was called just before aio_poll(). This patch fixes the hang by replacing aio_notify() with aio_bh_schedule_oneshot(). This makes aio_poll() or g_main_loop_run() to return. Implementing this properly required a new bool running flag. The new flag prevents races that are tricky if we try to use iothread->stopping. Now iothread->stopping is purely for iothread_stop() and iothread->running is purely for the iothread_run() thread. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20171207201320.19284-6-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

If we create a thread with QEMU_THREAD_DETACHED mode, QEMU may get a segfault with low probability. The backtrace is: #0 0x00007f46c60291d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f46c602a8c8 in __GI_abort () at abort.c:90 #2 0x00000000008543c9 in PAT_abort () #3 0x000000000085140d in patchIllInsHandler () #4 <signal handler called> #5 pthread_detach (th=139933037614848) at pthread_detach.c:50 #6 0x0000000000829759 in qemu_thread_create (thread=thread@entry=0x7ffdaa8205e0, name=name@entry=0x94d94a "io-task-worker", start_routine=start_routine@entry=0x7eb9a0 <qio_task_thread_worker>, arg=arg@entry=0x3f5cf70, mode=mode@entry=1) at util/qemu_thread_posix.c:512 #7 0x00000000007ebc96 in qio_task_run_in_thread (task=0x31db2c0, worker=worker@entry=0x7e7e40 <qio_channel_socket_connect_worker>, opaque=0xcd23380, destroy=0x7f1180 <qapi_free_SocketAddress>) at io/task.c:141 #8 0x00000000007e7f33 in qio_channel_socket_connect_async (ioc=ioc@entry=0x626c0b0, addr=<optimized out>, callback=callback@entry=0x55e080 <qemu_chr_socket_connected>, opaque=opaque@entry=0x42862c0, destroy=destroy@entry=0x0) at io/channel_socket.c:194 #9 0x000000000055bdd1 in socket_reconnect_timeout (opaque=0x42862c0) at qemu_char.c:4744 #10 0x00007f46c72483b3 in g_timeout_dispatch () from /usr/lib64/libglib-2.0.so.0 #11 0x00007f46c724799a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #12 0x000000000076c646 in glib_pollfds_poll () at main_loop.c:228 #13 0x000000000076c6eb in os_host_main_loop_wait (timeout=348000000) at main_loop.c:273 #14 0x000000000076c815 in main_loop_wait (nonblocking=nonblocking@entry=0) at main_loop.c:521 #15 0x000000000056a511 in main_loop () at vl.c:2076 #16 0x0000000000420705 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4940 The cause of this problem is a glibc bug; for more information, see https://sourceware.org/bugzilla/show_bug.cgi?id=19951. The solution for this bug is to use pthread_attr_setdetachstate. There is a similar issue with pthread_setname_np, which is moved from creating thread to created thread. Signed-off-by: linzhecheng <linzhecheng@huawei.com> Message-Id: <20171128044656.10592-1-linzhecheng@huawei.com> Reviewed-by: Fam Zheng <famz@redhat.com> [Simplify the code by removing qemu_thread_set_name, and free the arguments before invoking the start routine. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

The find_desc_by_name() from util/qemu-option.c relies on the .name not being NULL to call strcmp(). This check becomes unsafe when the list is not NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can result in segmentation fault when strcmp() tries to access an invalid memory: #0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6 #1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166 #2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026 #3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406 #4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135 #5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395 >From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing an invalid memory: >>> p desc[5] $8 = { name = 0x1037f098 "R27A", type = 1561964883, help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>, def_value_str = 0x2 <error: Cannot access memory at address 0x2> } >>> p desc[6] $9 = { name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001", type = 272101528, help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>, def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9> } This patch fixes the segmentation fault in strcmp() by adding a NULL element at the end of nbd_runtime_opts.desc list, which is the common practice to most of other structs like runtime_opts in block/null.c. Thus, the desc[i].name != NULL check becomes safe because it will not evaluate to true when .desc list reached its end. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1727259 Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com> Message-Id: <20180105133241.14141-2-muriloo@linux.vnet.ibm.com> CC: qemu-stable@nongnu.org Fixes: 7ccc44f Signed-off-by: Eric Blake <eblake@redhat.com>

/public/qobject_is_equal_conversion: OK ================================================================= ==14396==ERROR: LeakSanitizer: detected memory leaks Direct leak of 56 byte(s) in 1 object(s) allocated from: #0 0x7f07682c5850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f0767d12f0c in g_malloc ../glib/gmem.c:94 #2 0x7f0767d131cf in g_malloc_n ../glib/gmem.c:331 #3 0x562bd767371f in do_test_equality /home/elmarco/src/qq/tests/check-qobject.c:49 #4 0x562bd7674a35 in qobject_is_equal_dict_test /home/elmarco/src/qq/tests/check-qobject.c:267 #5 0x7f0767d37b04 in test_case_run ../glib/gtestutils.c:2237 #6 0x7f0767d37ec4 in g_test_run_suite_internal ../glib/gtestutils.c:2321 #7 0x7f0767d37f6d in g_test_run_suite_internal ../glib/gtestutils.c:2333 #8 0x7f0767d38184 in g_test_run_suite ../glib/gtestutils.c:2408 #9 0x7f0767d36e0d in g_test_run ../glib/gtestutils.c:1674 #10 0x562bd7674e75 in main /home/elmarco/src/qq/tests/check-qobject.c:327 #11 0x7f0766009039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180104160523.22995-9-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Note that data_dir[] will now point to allocated strings. Fixes: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7f1448181850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f1446ed8f0c in g_malloc ../glib/gmem.c:94 #2 0x7f1446ed91cf in g_malloc_n ../glib/gmem.c:331 #3 0x7f1446ef739a in g_strsplit ../glib/gstrfuncs.c:2364 #4 0x55cf276439d7 in main /home/elmarco/src/qq/vl.c:4311 #5 0x7f143dfad039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180104160523.22995-10-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Fixes leaks such as: Direct leak of 2 byte(s) in 1 object(s) allocated from: #0 0x7eff58beb850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7eff57942f0c in g_malloc ../glib/gmem.c:94 #2 0x7eff579431cf in g_malloc_n ../glib/gmem.c:331 #3 0x7eff5795f6eb in g_strdup ../glib/gstrfuncs.c:363 #4 0x55db720f1d46 in readline_hist_add /home/elmarco/src/qq/util/readline.c:258 #5 0x55db720f2d34 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:387 #6 0x55db71539d00 in monitor_read /home/elmarco/src/qq/monitor.c:3896 #7 0x55db71f9be35 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167 #8 0x55db71f9bed3 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179 #9 0x55db71fa013c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66 #10 0x55db71fe18a8 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84 #11 0x7eff5793a90b in g_main_dispatch ../glib/gmain.c:3182 #12 0x7eff5793b7ac in g_main_context_dispatch ../glib/gmain.c:3847 #13 0x55db720af3bd in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214 #14 0x55db720af505 in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261 #15 0x55db720af6d6 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515 #16 0x55db7184e0de in main_loop /home/elmarco/src/qq/vl.c:1995 #17 0x55db7185e956 in main /home/elmarco/src/qq/vl.c:4914 #18 0x7eff4ea17039 in __libc_start_main (/lib64/libc.so.6+0x21039) (while at it, use g_new0(ReadLineState), it's a bit easier to read) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180104160523.22995-11-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Direct leak of 12 byte(s) in 2 object(s) allocated from: #0 0x7f50d403c850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f50d1ddf98f in vasprintf (/lib64/libc.so.6+0x8098f) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180104160523.22995-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ASAN complains about: ==8856==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd8a1fe168 at pc 0x561136cb4451 bp 0x7ffd8a1fe130 sp 0x7ffd8a1fd8e0 READ of size 16 at 0x7ffd8a1fe168 thread T0 #0 0x561136cb4450 in __asan_memcpy (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x110450) #1 0x561136d2a6a7 in qcrypto_ivgen_essiv_calculate /home/elmarco/src/qq/crypto/ivgen-essiv.c:83:5 #2 0x561136d29af8 in qcrypto_ivgen_calculate /home/elmarco/src/qq/crypto/ivgen.c:72:12 #3 0x561136d07c8e in test_ivgen /home/elmarco/src/qq/tests/test-crypto-ivgen.c:148:5 #4 0x7f77772c3b04 in test_case_run /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2237 #5 0x7f77772c3ec4 in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2321 #6 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #7 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #8 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333 #9 0x7f77772c4184 in g_test_run_suite /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2408 #10 0x7f77772c2e0d in g_test_run /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:1674 #11 0x561136d0799b in main /home/elmarco/src/qq/tests/test-crypto-ivgen.c:173:12 #12 0x7f77756e6039 in __libc_start_main (/lib64/libc.so.6+0x21039) #13 0x561136c13d89 in _start (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x6fd89) Address 0x7ffd8a1fe168 is located in stack of thread T0 at offset 40 in frame #0 0x561136d2a40f in qcrypto_ivgen_essiv_calculate /home/elmarco/src/qq/crypto/ivgen-essiv.c:76 This frame has 1 object(s): [32, 40) 'sector.addr' <== Memory access at offset 40 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x110450) in __asan_memcpy Shadow bytes around the buggy address: 0x100031437bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x100031437c20: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00[f3]f3 f3 0x100031437c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x100031437c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb It looks like the rest of the code copes with ndata being larger than sizeof(sector), so limit the memcpy() range. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <20180104160523.22995-13-marcandre.lureau@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Direct leak of 160 byte(s) in 4 object(s) allocated from: #0 0x55ed7678cda8 in calloc (/home/elmarco/src/qq/build/x86_64-softmmu/qemu-system-x86_64+0x797da8) #1 0x7f3f5e725f75 in g_malloc0 /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:124 #2 0x55ed778aa3a7 in query_option_descs /home/elmarco/src/qq/util/qemu-config.c:60:16 #3 0x55ed778aa307 in get_drive_infolist /home/elmarco/src/qq/util/qemu-config.c:140:19 #4 0x55ed778a9f40 in qmp_query_command_line_options /home/elmarco/src/qq/util/qemu-config.c:254:36 #5 0x55ed76d4868c in qmp_marshal_query_command_line_options /home/elmarco/src/qq/build/qmp-marshal.c:3078:14 #6 0x55ed77855dd5 in do_qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:104:5 #7 0x55ed778558cc in qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:131:11 #8 0x55ed768b592f in handle_qmp_command /home/elmarco/src/qq/monitor.c:3840:11 #9 0x55ed7786ccfe in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105:5 #10 0x55ed778fe37c in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323:13 #11 0x55ed778fdde6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373:15 #12 0x55ed7786cd83 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124:12 #13 0x55ed768b559e in monitor_qmp_read /home/elmarco/src/qq/monitor.c:3882:5 #14 0x55ed77714f29 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167:9 #15 0x55ed77714fde in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179:9 #16 0x55ed7772ffad in tcp_chr_read /home/elmarco/src/qq/chardev/char-socket.c:440:13 #17 0x55ed7777113b in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84:12 #18 0x7f3f5e71d90b in g_main_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3182 #19 0x7f3f5e71e7ac in g_main_context_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3847 #20 0x55ed77886ffc in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214:9 #21 0x55ed778865fd in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261:5 #22 0x55ed77886222 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515:11 #23 0x55ed76d2a4df in main_loop /home/elmarco/src/qq/vl.c:1995:9 #24 0x55ed76d1cb4a in main /home/elmarco/src/qq/vl.c:4914:5 #25 0x7f3f555f6039 in __libc_start_main (/lib64/libc.so.6+0x21039) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20180104160523.22995-14-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Direct leak of 913 byte(s) in 43 object(s) allocated from: #0 0x55880a15df60 in __interceptor_malloc (/home/elmarco/src/qq/build/tests/qmp-test+0x110f60) #1 0x7f3f20fd098f in _IO_vasprintf (/lib64/libc.so.6+0x8098f) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180104160523.22995-15-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

The coroutine is not finished by the time the test ends, resulting in ASAN warning: ==7005==ERROR: LeakSanitizer: detected memory leaks Direct leak of 312 byte(s) in 1 object(s) allocated from: #0 0x7fd35290fa38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7fd3506c5f75 in g_malloc0 ../glib/gmem.c:124 #2 0x55994af03e47 in qemu_coroutine_new /home/elmarco/src/qemu/util/coroutine-ucontext.c:144 #3 0x55994aefed99 in qemu_coroutine_create /home/elmarco/src/qemu/util/qemu-coroutine.c:76 #4 0x55994ac1eb50 in verify_entered_step_1 /home/elmarco/src/qemu/tests/test-coroutine.c:80 #5 0x55994af03c75 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:119 #6 0x7fd34ec02bef (/lib64/libc.so.6+0x50bef) Do not yield() to let the coroutine terminate. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20180104160523.22995-17-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Spotted thanks to ASAN: ==25226==ERROR: AddressSanitizer: global-buffer-overflow on address 0x556715a1f120 at pc 0x556714b6f6b1 bp 0x7ffcdfac1360 sp 0x7ffcdfac1350 READ of size 1 at 0x556715a1f120 thread T0 #0 0x556714b6f6b0 in init_disasm /home/elmarco/src/qemu/disas/s390.c:219 #1 0x556714b6fa6a in print_insn_s390 /home/elmarco/src/qemu/disas/s390.c:294 #2 0x55671484d031 in monitor_disas /home/elmarco/src/qemu/disas.c:635 #3 0x556714862ec0 in memory_dump /home/elmarco/src/qemu/monitor.c:1324 #4 0x55671486342a in hmp_memory_dump /home/elmarco/src/qemu/monitor.c:1418 #5 0x5567148670be in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3109 #6 0x5567148674ed in qmp_human_monitor_command /home/elmarco/src/qemu/monitor.c:613 #7 0x556714b00918 in qmp_marshal_human_monitor_command /home/elmarco/src/qemu/build/qmp-marshal.c:1704 #8 0x556715138a3e in do_qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:104 #9 0x556715138f83 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:131 #10 0x55671485cf88 in handle_qmp_command /home/elmarco/src/qemu/monitor.c:3839 #11 0x55671514e80b in json_message_process_token /home/elmarco/src/qemu/qobject/json-streamer.c:105 #12 0x5567151bf2dc in json_lexer_feed_char /home/elmarco/src/qemu/qobject/json-lexer.c:323 #13 0x5567151bf827 in json_lexer_feed /home/elmarco/src/qemu/qobject/json-lexer.c:373 #14 0x55671514ee62 in json_message_parser_feed /home/elmarco/src/qemu/qobject/json-streamer.c:124 #15 0x556714854b1f in monitor_qmp_read /home/elmarco/src/qemu/monitor.c:3881 #16 0x556715045440 in qemu_chr_be_write_impl /home/elmarco/src/qemu/chardev/char.c:172 #17 0x556715047184 in qemu_chr_be_write /home/elmarco/src/qemu/chardev/char.c:184 #18 0x55671505a8e6 in tcp_chr_read /home/elmarco/src/qemu/chardev/char-socket.c:440 #19 0x5567150943c3 in qio_channel_fd_source_dispatch /home/elmarco/src/qemu/io/channel-watch.c:84 #20 0x7fb90292b90b in g_main_dispatch ../glib/gmain.c:3182 #21 0x7fb90292c7ac in g_main_context_dispatch ../glib/gmain.c:3847 #22 0x556715162eca in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #23 0x556715163001 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #24 0x5567151631fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #25 0x556714ad6d3b in main_loop /home/elmarco/src/qemu/vl.c:1950 #26 0x556714ade329 in main /home/elmarco/src/qemu/vl.c:4865 #27 0x7fb8fe5c9009 in __libc_start_main (/lib64/libc.so.6+0x21009) #28 0x5567147af4d9 in _start (/home/elmarco/src/qemu/build/s390x-softmmu/qemu-system-s390x+0xf674d9) 0x556715a1f120 is located 32 bytes to the left of global variable 'char_hci_type_info' defined in '/home/elmarco/src/qemu/hw/bt/hci-csr.c:493:23' (0x556715a1f140) of size 104 0x556715a1f120 is located 8 bytes to the right of global variable 's390_opcodes' defined in '/home/elmarco/src/qemu/disas/s390.c:860:33' (0x556715a15280) of size 40600 This fix is based on Andreas Arnez <arnez@linux.vnet.ibm.com> upstream commit: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=9ace48f3d7d80ce09c5df60cccb433470410b11b 2014-08-19 Andreas Arnez <arnez@linux.vnet.ibm.com> * s390-dis.c (init_disasm): Simplify initialization of opc_index[]. This also fixes an access after the last element of s390_opcodes[]. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180104160523.22995-19-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

…e object missed in 60765b6. Thread 1 "qemu-system-aarch64" received signal SIGSEGV, Segmentation fault. address_space_init (as=0x0, root=0x55555726e410, name=name@entry=0x555555e3f0a7 "sdhci-dma") at memory.c:3050 3050 as->root = root; (gdb) bt #0 address_space_init (as=0x0, root=0x55555726e410, name=name@entry=0x555555e3f0a7 "sdhci-dma") at memory.c:3050 #1 0x0000555555af62c3 in sdhci_sysbus_realize (dev=<optimized out>, errp=0x7fff7f931150) at hw/sd/sdhci.c:1564 #2 0x00005555558b25e5 in zynqmp_sdhci_realize (dev=0x555557051520, errp=0x7fff7f931150) at hw/sd/zynqmp-sdhci.c:151 #3 0x0000555555a2e7f3 in device_set_realized (obj=0x555557051520, value=<optimized out>, errp=0x7fff7f931270) at hw/core/qdev.c:966 #4 0x0000555555ba3f74 in property_set_bool (obj=0x555557051520, v=<optimized out>, name=<optimized out>, opaque=0x555556e04a20, errp=0x7fff7f931270) at qom/object.c:1906 #5 0x0000555555ba51f4 in object_property_set (obj=obj@entry=0x555557051520, v=v@entry=0x5555576dbd60, name=name@entry=0x555555dd6306 "realized", errp=errp@entry=0x7fff7f931270) at qom/object.c:1102 Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180123132051.24448-1-f4bug@amsat.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Fixes the following ASAN report: Direct leak of 128 byte(s) in 8 object(s) allocated from: #0 0x7fefce311850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7fefcdd5ef0c in g_malloc ../glib/gmem.c:94 #2 0x559b976faff0 in create_ahci_io_test /home/elmarco/src/qemu/tests/ahci-test.c:1810 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180215212552.26997-6-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Fix the following ASAN reports: ==20125==ERROR: LeakSanitizer: detected memory leaks Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x7f0faea03a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f0fae450f75 in g_malloc0 ../glib/gmem.c:124 #2 0x562fffd526fc in machine_start /home/elmarco/src/qemu/tests/sdhci-test.c:180 Indirect leak of 152 byte(s) in 1 object(s) allocated from: #0 0x7f0faea03850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f0fae450f0c in g_malloc ../glib/gmem.c:94 #2 0x562fffd5d21d in qpci_init_pc /home/elmarco/src/qemu/tests/libqos/pci-pc.c:122 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180215212552.26997-7-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Spotted by ASAN: QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7ff8a9b0ca38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7ff8a8ea7f75 in g_malloc0 ../glib/gmem.c:124 #2 0x55fef3d99129 in error_setv /home/elmarco/src/qemu/util/error.c:59 #3 0x55fef3d99738 in error_setg_internal /home/elmarco/src/qemu/util/error.c:95 #4 0x55fef323acb2 in load_elf_hdr /home/elmarco/src/qemu/hw/core/loader.c:393 #5 0x55fef2d15776 in arm_load_elf /home/elmarco/src/qemu/hw/arm/boot.c:830 #6 0x55fef2d16d39 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1022 #7 0x55fef3dc634d in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40 #8 0x55fef2fc3182 in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716 #9 0x55fef2fcbbd1 in main /home/elmarco/src/qemu/vl.c:4679 #10 0x7ff89dfed009 in __libc_start_main (/lib64/libc.so.6+0x21009) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Spotted by ASAN: elmarco@boraha:~/src/qemu/build (master *%)$ QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test /aarch64/boot-serial/virt: ** (process:19740): DEBUG: 18:39:30.275: foo /tmp/qtest-boot-serial-cXaS94D ================================================================= ==19740==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000069648 at pc 0x7f1d2201cc54 bp 0x7fff331f6a40 sp 0x7fff331f61e8 READ of size 4 at 0x603000069648 thread T0 #0 0x7f1d2201cc53 (/lib64/libasan.so.4+0xafc53) #1 0x55bc86685ee3 in load_aarch64_image /home/elmarco/src/qemu/hw/arm/boot.c:894 #2 0x55bc86687217 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1047 #3 0x55bc877363b5 in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40 #4 0x55bc869331ea in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716 #5 0x55bc8693bc39 in main /home/elmarco/src/qemu/vl.c:4679 #6 0x7f1d1652c009 in __libc_start_main (/lib64/libc.so.6+0x21009) #7 0x55bc86255cc9 in _start (/home/elmarco/src/qemu/build/aarch64-softmmu/qemu-system-aarch64+0x1ae5cc9) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Spotted thanks to ASAN: QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/migration-test -p /x86_64/migration/bad_dest ==30302==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f60efba1a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f60eef3cf75 in g_malloc0 ../glib/gmem.c:124 #2 0x55ca9094702c in error_copy /home/elmarco/src/qemu/util/error.c:203 #3 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139 #4 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150 #5 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411 #6 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81 #7 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85 #8 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142 #9 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88 #10 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552 #11 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182 #12 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847 #13 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #14 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #15 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #16 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942 #17 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724 #18 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009) Indirect leak of 45 byte(s) in 1 object(s) allocated from: #0 0x7f60efba1850 in malloc (/lib64/libasan.so.4+0xde850) #1 0x7f60eef3cf0c in g_malloc ../glib/gmem.c:94 #2 0x7f60eef3d1cf in g_malloc_n ../glib/gmem.c:331 #3 0x7f60eef596eb in g_strdup ../glib/gstrfuncs.c:363 #4 0x55ca90947085 in error_copy /home/elmarco/src/qemu/util/error.c:204 #5 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139 #6 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150 #7 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411 #8 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81 #9 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85 #10 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142 #11 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88 #12 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552 #13 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182 #14 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847 #15 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214 #16 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261 #17 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515 #18 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942 #19 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724 #20 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180306170959.3921-1-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Found thanks to ASAN: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7efe20417a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7efe1f7b2f75 in g_malloc0 ../glib/gmem.c:124 #2 0x7efe1f7b3249 in g_malloc0_n ../glib/gmem.c:355 #3 0x558272879162 in sev_get_info /home/elmarco/src/qemu/target/i386/sev.c:414 #4 0x55827285113b in hmp_info_sev /home/elmarco/src/qemu/target/i386/monitor.c:684 #5 0x5582724043b8 in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3333 Fixes: 6303631 Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180319175823.22111-1-marcandre.lureau@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Fix leak spotted by ASAN: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7fe1abb80a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7fe1aaf1bf75 in g_malloc0 ../glib/gmem.c:124 #2 0x7fe1aaf1c249 in g_malloc0_n ../glib/gmem.c:355 #3 0x55f4841cfaa9 in postcopy_ram_fault_thread /home/elmarco/src/qemu/migration/postcopy-ram.c:596 #4 0x55f48479447b in qemu_thread_start /home/elmarco/src/qemu/util/qemu-thread-posix.c:504 #5 0x7fe1a043550a in start_thread (/lib64/libpthread.so.0+0x750a) Regression introduced with commit 00fa4fc. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20180321113644.21899-1-marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

This fixes leaks found by ASAN such as: GTESTER tests/test-blockjob ================================================================= ==31442==ERROR: LeakSanitizer: detected memory leaks Direct leak of 24 byte(s) in 1 object(s) allocated from: #0 0x7f88483cba38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38) #1 0x7f8845e1bd77 in g_malloc0 ../glib/gmem.c:129 #2 0x7f8845e1c04b in g_malloc0_n ../glib/gmem.c:360 #3 0x5584d2732498 in block_job_txn_new /home/elmarco/src/qemu/blockjob.c:172 #4 0x5584d2739b28 in block_job_create /home/elmarco/src/qemu/blockjob.c:973 #5 0x5584d270ae31 in mk_job /home/elmarco/src/qemu/tests/test-blockjob.c:34 #6 0x5584d270b1c1 in do_test_id /home/elmarco/src/qemu/tests/test-blockjob.c:57 #7 0x5584d270b65c in test_job_ids /home/elmarco/src/qemu/tests/test-blockjob.c:118 #8 0x7f8845e40b69 in test_case_run ../glib/gtestutils.c:2255 #9 0x7f8845e40f29 in g_test_run_suite_internal ../glib/gtestutils.c:2339 #10 0x7f8845e40fd2 in g_test_run_suite_internal ../glib/gtestutils.c:2351 #11 0x7f8845e411e9 in g_test_run_suite ../glib/gtestutils.c:2426 #12 0x7f8845e3fe72 in g_test_run ../glib/gtestutils.c:1692 #13 0x5584d270d6e2 in main /home/elmarco/src/qemu/tests/test-blockjob.c:377 #14 0x7f8843641f29 in __libc_start_main (/lib64/libc.so.6+0x20f29) Add an assert to make sure that the job doesn't have associated txn before free(). [Jeff Cody: N.B., used updated patch provided by John Snow] Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com>

Free the AIO context earlier than the GMainContext (if we have) to workaround a glib2 bug that GSource context pointer is not cleared even if the context has already been destroyed (while it should). The patch itself only changed the order to destroy the objects, no functional change at all. Without this workaround, we can encounter qmp-test hang with oob (and possibly any other use case when iothread is used with GMainContexts): #0 0x00007f35ffe45334 in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f35ffe405d8 in _L_lock_854 () from /lib64/libpthread.so.0 #2 0x00007f35ffe404a7 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x00007f35fc5b9c9d in g_source_unref_internal (source=0x24f0600, context=0x7f35f0000960, have_lock=0) at gmain.c:1685 #4 0x0000000000aa6672 in aio_context_unref (ctx=0x24f0600) at /root/qemu/util/async.c:497 #5 0x000000000065851c in iothread_instance_finalize (obj=0x24f0380) at /root/qemu/iothread.c:129 #6 0x0000000000962d79 in object_deinit (obj=0x24f0380, type=0x242e960) at /root/qemu/qom/object.c:462 #7 0x0000000000962e0d in object_finalize (data=0x24f0380) at /root/qemu/qom/object.c:476 #8 0x0000000000964146 in object_unref (obj=0x24f0380) at /root/qemu/qom/object.c:924 #9 0x0000000000965880 in object_finalize_child_property (obj=0x24ec640, name=0x24efca0 "mon_iothread", opaque=0x24f0380) at /root/qemu/qom/object.c:1436 #10 0x0000000000962c33 in object_property_del_child (obj=0x24ec640, child=0x24f0380, errp=0x0) at /root/qemu/qom/object.c:436 #11 0x0000000000962d26 in object_unparent (obj=0x24f0380) at /root/qemu/qom/object.c:455 #12 0x0000000000658f00 in iothread_destroy (iothread=0x24f0380) at /root/qemu/iothread.c:365 #13 0x00000000004c67a8 in monitor_cleanup () at /root/qemu/monitor.c:4663 #14 0x0000000000669e27 in main (argc=16, argv=0x7ffc8b1ae2f8, envp=0x7ffc8b1ae380) at /root/qemu/vl.c:4749 The glib2 bug is fixed in commit 26056558b ("gmain: allow g_source_get_context() on destroyed sources", 2012-07-30), so the first good version is glib2 2.33.10. But we still support building with glib as old as 2.28, so we need the workaround. Let's make sure we destroy the GSources first before its owner context until we drop support for glib older than 2.33.10. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180409083956.1780-1-peterx@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>

Eric Auger reported the problem days ago that OOB broke ARM when running with libvirt: http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html The problem was that the monitor dispatcher bottom half was bound to qemu_aio_context now, which could be polled unexpectedly in block code. We should keep the dispatchers run in iohandler_ctx just like what we did before the Out-Of-Band series (chardev uses qio, and qio binds everything with iohandler_ctx). If without this change, QMP dispatcher might be run even before reaching main loop in block IO path, for example, in a stack like (the ARM case, "cont" command handler run even during machine init phase): #0 qmp_cont () #1 0x00000000006bd210 in qmp_marshal_cont () #2 0x0000000000ac05c4 in do_qmp_dispatch () #3 0x0000000000ac07a0 in qmp_dispatch () #4 0x0000000000472d60 in monitor_qmp_dispatch_one () #5 0x000000000047302c in monitor_qmp_bh_dispatcher () #6 0x0000000000acf374 in aio_bh_call () #7 0x0000000000acf428 in aio_bh_poll () #8 0x0000000000ad5110 in aio_poll () #9 0x0000000000a08ab8 in blk_prw () #10 0x0000000000a091c4 in blk_pread () #11 0x0000000000734f94 in pflash_cfi01_realize () #12 0x000000000075a3a4 in device_set_realized () #13 0x00000000009a26cc in property_set_bool () #14 0x00000000009a0a40 in object_property_set () #15 0x00000000009a3a08 in object_property_set_qobject () #16 0x00000000009a0c8c in object_property_set_bool () #17 0x0000000000758f94 in qdev_init_nofail () #18 0x000000000058e190 in create_one_flash () #19 0x000000000058e2f4 in create_flash () #20 0x00000000005902f0 in machvirt_init () #21 0x00000000007635cc in machine_run_board_init () #22 0x00000000006b135c in main () Actually the problem is more severe than that. After we switched to the qemu AIO handler it means the monitor dispatcher code can even be called with nested aio_poll(), then it can be an explicit aio_poll() inside another main loop aio_poll() which could be racy too; breaking code like TPM and 9p that use nested event loops. Switch to use the iohandler_ctx for monitor dispatchers. My sincere thanks to Eric Auger who offered great help during both debugging and verifying the problem. The ARM test was carried out by applying this patch upon QEMU 2.12.0-rc0 and problem is gone after the patch. A quick test of mine shows that after this patch applied we can pass all raw iotests even with OOB on by default. CC: Eric Blake <eblake@redhat.com> CC: Markus Armbruster <armbru@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com> CC: Fam Zheng <famz@redhat.com> Reported-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180410044942.17059-1-peterx@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>

nasastry closed this as completed Dec 1, 2016

This was referenced Oct 9, 2017

qemu segfaults with dump-guest-memory #8

Closed

qemu crashes with qemu_event_set: Assertion `ev->initialized' failed error #20

Closed

cdeadmin mentioned this issue Oct 23, 2017

Guest shuts-off with call trace after migrating between Boston and ZZ systems #19

Closed

nasastry mentioned this issue Oct 30, 2017

qemu-io crashes with SIGABRT and Assertion `c->entries[i].offset != 0' failed #25

Closed

nasastry mentioned this issue Nov 21, 2017

numastat crashes with buffer overflow #30

Closed

nasastry mentioned this issue Nov 28, 2017

[libvirt] With default cpu mode guest is not starting #33

Closed

cdeadmin mentioned this issue Dec 20, 2017

qemu crashed during "info cpus" in monitor with change in default cpu in hotplug/unplug sequence #11

Closed

cdeadmin mentioned this issue Jan 1, 2018

qemu guest is not booting with qemu's max-cpu-compat=power7+ #10

Closed

cdeadmin mentioned this issue Feb 2, 2018

[Power9] [Qemu] Migration started after postcopy_ram enabled causes guest reboot in destination and guest remains running state in source as well #34

Closed

cdeadmin mentioned this issue Mar 7, 2018

[qemu] loadvm fails with "Failed to load virtio-net:virtio" #37

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Memory hot unplug feature is not available in 'hostos-devel' branch #1

Memory hot unplug feature is not available in 'hostos-devel' branch #1

nasastry commented Aug 30, 2016

cat /tmp/t1 # Detach memory xml, taken from virsh dumpxml

nasastry commented Aug 30, 2016

rmatinata-ibm commented Aug 31, 2016

aik commented Sep 1, 2016

nasastry commented Dec 1, 2016

Memory hot unplug feature is not available in 'hostos-devel' branch #1

Memory hot unplug feature is not available in 'hostos-devel' branch #1

Comments

nasastry commented Aug 30, 2016

cat mem_hp.xml

virsh attach-device centos72 mem_hp.xml --live

cat /tmp/t1 # Detach memory xml, taken from virsh dumpxml

virsh detach-device centos72 /tmp/t1 --live

nasastry commented Aug 30, 2016

rmatinata-ibm commented Aug 31, 2016

aik commented Sep 1, 2016

nasastry commented Dec 1, 2016