New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hangs and prints "main-loop: WARNING: I/O thread spun for 1000 iterations" #3

Closed
manuelafm opened this Issue Jan 17, 2015 · 10 comments

Comments

Projects
None yet
7 participants
@manuelafm

manuelafm commented Jan 17, 2015

With riscv-qemu from commit bfd1ee7 (29th of Dec of 2014) and using the kernel currently provided as a download from the website (version 3.14.15-g4073e84-dirty (skarandikar@a8)), the emulation seems to stop after printing this message:

main-loop: WARNING: I/O thread spun for 1000 iterations

Sometimes it happens very early, other times takes hours before this happens.

When that happens, I am able to quit with "ctrl-a x", but not to cancel running programs with "ctrl-c" nor to type new commands if this happens at the shell prompt.

This is possibly a bug in riscv-linux instead of riscv-qemu. Sadly I cannot provide any useful information, except possibly a backtrace of qemu-system-riscv when this happens, if you think that it's useful.

@NCommander

This comment has been minimized.

Contributor

NCommander commented Feb 16, 2015

I can confirm this behavior, it seems related to heavy net I/O for me.

@NCommander

This comment has been minimized.

Contributor

NCommander commented Feb 16, 2015

I applied this patch: http://comments.gmane.org/gmane.comp.emulators.qemu/236931 manually to the QEMU source tree and rebuild. This seems to allow it to get past the deadlock, you'll get the I/O blocking messages occassionally, but after a delay, QEMU will recover and continue where it left off. Seems to be an OK workaround for now, I was able to native-compile gmake using this w/ a NFS root.

@NCommander

This comment has been minimized.

Contributor

NCommander commented Feb 16, 2015

Ok, I wasn't 100% right on that being a fix. The issue appears to be coming from something setting a blocking I/O, which seems to become a non-blocking I/O which causes the mutexs to deadlock. My attempts to add debugging printfs seem to have "fixed" the problem from occuring so trying to debug this is stalled ...

@lambdafu

This comment has been minimized.

lambdafu commented Mar 4, 2015

+1. this is a real show stopper.

@NCommander

This comment has been minimized.

Contributor

NCommander commented Mar 5, 2015

Eek, I forgot to post I found a workaround. If you force everything to be blocking, the issue goes away, you have to set a variable in the main loop which controls blocking.

In vl.c:

Find:

    nonblocking = !kvm_enabled() && !xen_enabled() && last_io > 0;

and set it to 0.

I think. It might be one. I don't have my machine handy to verify which value fixes it. Performance takes a bit of a dive but it works; I was able to native build GCC like this.

@lambdafu

This comment has been minimized.

lambdafu commented Mar 5, 2015

yeah, that helps a lot! thanks!

@sagark

This comment has been minimized.

sagark commented Feb 11, 2016

Closing this since I haven't seen it happen with the privileged update + rebase. Let me know if it reappears.

@sagark sagark closed this Feb 11, 2016

@visbits

This comment has been minimized.

visbits commented Apr 17, 2016

Seeing this currently.

root@osc-1011.prd > rpm -qa | grep qemu
qemu-kvm-1.5.3-105.el7_2.3.x86_64
qemu-kvm-common-1.5.3-105.el7_2.3.x86_64
qemu-img-1.5.3-105.el7_2.3.x86_64
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64
ipxe-roms-qemu-20130517-7.gitc4bce43.el7.noarch
root@osc-1011.prd > rpm -qa | grep libvirt
libvirt-client-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-storage-1.2.17-13.el7_2.4.x86_64
libvirt-python-1.2.17-2.el7.x86_64
libvirt-daemon-driver-nodedev-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-network-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-interface-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-nwfilter-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-driver-secret-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-1.2.17-13.el7_2.4.x86_64
libvirt-daemon-kvm-1.2.17-13.el7_2.4.x86_64

Linux osc-1011.prd 3.10.0-327.13.1.el7.x86_64 #1 SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

sagark pushed a commit that referenced this issue Sep 23, 2016

linux-user-i386: Fix crash on cpuid
Running cpuid instructions with a simple run like:
i386-linux-user/qemu-i386 tests/tcg/sha1-i386

Results in the following assert:
 #0  0x00007ffff64246f5 in raise () from /lib64/libc.so.6
 #1  0x00007ffff64262fa in abort () from /lib64/libc.so.6
 #2  0x00007ffff7937ec5 in g_assertion_message () from /lib64/libglib-2.0.so.0
 #3  0x00007ffff7937f5a in g_assertion_message_expr () from /lib64/libglib-2.0.so.0
 #4  0x000055555561b54c in apicid_bitwidth_for_count (count=0) at /home/elmarco/src/qemu/include/hw/i386/topology.h:58
 #5  0x000055555561b58a in apicid_smt_width (nr_cores=0, nr_threads=0) at /home/elmarco/src/qemu/include/hw/i386/topology.h:67
 #6  0x000055555561b5c3 in apicid_core_offset (nr_cores=0, nr_threads=0) at /home/elmarco/src/qemu/include/hw/i386/topology.h:82
 #7  0x000055555561b5e3 in apicid_pkg_offset (nr_cores=0, nr_threads=0) at /home/elmarco/src/qemu/include/hw/i386/topology.h:89
 #8  0x000055555561dd86 in cpu_x86_cpuid (env=0x555557999550, index=4, count=3, eax=0x7fffffffcae8, ebx=0x7fffffffcaec, ecx=0x7fffffffcaf0, edx=0x7fffffffcaf4) at /home/elmarco/src/qemu/target-i386/cpu.c:2405
 #9  0x0000555555638e8e in helper_cpuid (env=0x555557999550) at /home/elmarco/src/qemu/target-i386/misc_helper.c:106
 #10 0x000055555599dc5e in static_code_gen_buffer ()
 #11 0x00005555555952f8 in cpu_tb_exec (cpu=0x5555579912d0, itb=0x7ffff4371ab0) at /home/elmarco/src/qemu/cpu-exec.c:166
 #12 0x0000555555595c8e in cpu_loop_exec_tb (cpu=0x5555579912d0, tb=0x7ffff4371ab0, last_tb=0x7fffffffd088, tb_exit=0x7fffffffd084, sc=0x7fffffffd0a0) at /home/elmarco/src/qemu/cpu-exec.c:517
 #13 0x0000555555595e50 in cpu_exec (cpu=0x5555579912d0) at /home/elmarco/src/qemu/cpu-exec.c:612
 #14 0x00005555555c065b in cpu_loop (env=0x555557999550) at /home/elmarco/src/qemu/linux-user/main.c:297
 #15 0x00005555555c25b2 in main (argc=2, argv=0x7fffffffd848, envp=0x7fffffffd860) at /home/elmarco/src/qemu/linux-user/main.c:4803

The fields are set in qemu_init_vcpu() with softmmu, but it's a stub
with linux-user.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

sagark pushed a commit that referenced this issue Sep 23, 2016

msmouse: Fix segfault caused by free the chr before chardev cleanup.
Segfault happens when leaving qemu with msmouse backend:

 #0  0x00007fa8526ac975 in raise () at /lib64/libc.so.6
 #1  0x00007fa8526add8a in abort () at /lib64/libc.so.6
 #2  0x0000558be78846ab in error_exit (err=16, msg=0x558be799da10 ...
 #3  0x0000558be7884717 in qemu_mutex_destroy (mutex=0x558be93be750) at ...
 #4  0x0000558be7549951 in qemu_chr_free_common (chr=0x558be93be750) at ...
 #5  0x0000558be754999c in qemu_chr_free (chr=0x558be93be750) at ...
 #6  0x0000558be7549a20 in qemu_chr_delete (chr=0x558be93be750) at ...
 #7  0x0000558be754a8ef in qemu_chr_cleanup () at qemu-char.c:4643
 #8  0x0000558be755843e in main (argc=5, argv=0x7ffe925d7118, ...

The chr was freed by msmouse close callback before chardev cleanup,
Then qemu_mutex_destroy triggered raise().

Because freeing chr is handled by qemu_chr_free_common, Remove the free from
msmouse_chr_close to avoid double free.

Fixes: c1111a2
Cc: qemu-stable@nongnu.org
Signed-off-by: Lin Ma <lma@suse.com>
Message-Id: <20160915143158.4796-1-lma@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
@marmistrz

This comment has been minimized.

marmistrz commented Apr 7, 2017

I experience the same while running Minix under Arch Linux, qemu 1.8.0. During heavy IO such as compiling or greping recursively.

@specing

This comment has been minimized.

specing commented Jul 19, 2017

While I am not using riscv-qemu, I have this exact issue only when qemu is configured with 2 network interface cards, with one the above does not occur.

xorgy pushed a commit to xorgy/riscv-qemu that referenced this issue Aug 22, 2017

Stefan Hajnoczi
linux-aio: fix re-entrant completion processing
Commit 0ed93d8 ("linux-aio: process
completions from ioq_submit()") added an optimization that processes
completions each time ioq_submit() returns with requests in flight.
This commit introduces a "Co-routine re-entered recursively" error which
can be triggered with -drive format=qcow2,aio=native.

Fam Zheng <famz@redhat.com>, Kevin Wolf <kwolf@redhat.com>, and I
debugged the following backtrace:

  (gdb) bt
  #0  0x00007ffff0a046f5 in raise () at /lib64/libc.so.6
  riscv#1  0x00007ffff0a062fa in abort () at /lib64/libc.so.6
  riscv#2  0x0000555555ac0013 in qemu_coroutine_enter (co=0x5555583464d0) at util/qemu-coroutine.c:113
  riscv#3  0x0000555555a4b663 in qemu_laio_process_completions (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:218
  riscv#4  0x0000555555a4b874 in ioq_submit (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:331
  riscv#5  0x0000555555a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x555559d38ae0, offset=offset@entry=2932727808, type=type@entry=1) at block/linux-aio.c:383
  riscv#6  0x0000555555a4bbd3 in laio_co_submit (bs=<optimized out>, s=0x555557e2f7f0, fd=13, offset=2932727808, qiov=0x555559d38e20, type=1) at block/linux-aio.c:402
  riscv#7  0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x55555663bcb0, offset=offset@entry=2932727808, bytes=bytes@entry=8192, qiov=qiov@entry=0x555559d38e20, flags=0) at block/io.c:804
  riscv#8  0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x55555663bcb0, req=req@entry=0x555559d38d20, offset=offset@entry=2932727808, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x555559d38e20, flags=0) at block/io.c:1041
  riscv#9  0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=2932727808, bytes=8192, qiov=qiov@entry=0x555559d38e20, flags=flags@entry=0) at block/io.c:1133
  riscv#10 0x0000555555a29629 in qcow2_co_preadv (bs=0x555556635890, offset=6178725888, bytes=8192, qiov=0x555557527840, flags=<optimized out>) at block/qcow2.c:1509
  riscv#11 0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x555556635890, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x555557527840, flags=0) at block/io.c:804
  riscv#12 0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x555556635890, req=req@entry=0x555559d39000, offset=offset@entry=6178725888, bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x555557527840, flags=0) at block/io.c:1041
  riscv#13 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=offset@entry=6178725888, bytes=bytes@entry=8192, qiov=qiov@entry=0x555557527840, flags=flags@entry=0) at block/io.c:1133
  riscv#14 0x0000555555a4515a in blk_co_preadv (blk=0x5555566356d0, offset=6178725888, bytes=8192, qiov=0x555557527840, flags=0) at block/block-backend.c:783
  riscv#15 0x0000555555a45266 in blk_aio_read_entry (opaque=0x5555577025e0) at block/block-backend.c:991
  riscv#16 0x0000555555ac0cfa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78

It turned out that re-entrant ioq_submit() and completion processing
between three requests caused this error.  The following check is not
sufficient to prevent recursively entering coroutines:

  if (laiocb->co != qemu_coroutine_self()) {
      qemu_coroutine_enter(laiocb->co);
  }

As the following coroutine backtrace shows, not just the current
coroutine (self) can be entered.  There might also be other coroutines
that are currently entered and transferred control due to the qcow2 lock
(CoMutex):

  (gdb) qemu coroutine 0x5555583464d0
  #0  0x0000555555ac0c90 in qemu_coroutine_switch (from_=from_@entry=0x5555583464d0, to_=to_@entry=0x5555572f9890, action=action@entry=COROUTINE_ENTER) at util/coroutine-ucontext.c:175
  riscv#1  0x0000555555abfe54 in qemu_coroutine_enter (co=0x5555572f9890) at util/qemu-coroutine.c:117
  riscv#2  0x0000555555ac031c in qemu_co_queue_run_restart (co=co@entry=0x5555583462c0) at util/qemu-coroutine-lock.c:60
  riscv#3  0x0000555555abfe5e in qemu_coroutine_enter (co=0x5555583462c0) at util/qemu-coroutine.c:119
  riscv#4  0x0000555555a4b663 in qemu_laio_process_completions (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:218
  riscv#5  0x0000555555a4b874 in ioq_submit (s=s@entry=0x555557e2f7f0) at block/linux-aio.c:331
  riscv#6  0x0000555555a4ba12 in laio_do_submit (fd=fd@entry=13, laiocb=laiocb@entry=0x55555a338b40, offset=offset@entry=2911477760, type=type@entry=1) at block/linux-aio.c:383
  riscv#7  0x0000555555a4bbd3 in laio_co_submit (bs=<optimized out>, s=0x555557e2f7f0, fd=13, offset=2911477760, qiov=0x55555a338e80, type=1) at block/linux-aio.c:402
  riscv#8  0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x55555663bcb0, offset=offset@entry=2911477760, bytes=bytes@entry=8192, qiov=qiov@entry=0x55555a338e80, flags=0) at block/io.c:804
  riscv#9  0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x55555663bcb0, req=req@entry=0x55555a338d80, offset=offset@entry=2911477760, bytes=bytes@entry=8192, align=align@entry=512, qiov=qiov@entry=0x55555a338e80, flags=0) at block/io.c:1041
  riscv#10 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=2911477760, bytes=8192, qiov=qiov@entry=0x55555a338e80, flags=flags@entry=0) at block/io.c:1133
  riscv#11 0x0000555555a29629 in qcow2_co_preadv (bs=0x555556635890, offset=6157475840, bytes=8192, qiov=0x5555575df720, flags=<optimized out>) at block/qcow2.c:1509
  riscv#12 0x0000555555a4fd23 in bdrv_driver_preadv (bs=bs@entry=0x555556635890, offset=offset@entry=6157475840, bytes=bytes@entry=8192, qiov=qiov@entry=0x5555575df720, flags=0) at block/io.c:804
  riscv#13 0x0000555555a52b34 in bdrv_aligned_preadv (bs=bs@entry=0x555556635890, req=req@entry=0x55555a339060, offset=offset@entry=6157475840, bytes=bytes@entry=8192, align=align@entry=1, qiov=qiov@entry=0x5555575df720, flags=0) at block/io.c:1041
  riscv#14 0x0000555555a52db8 in bdrv_co_preadv (child=<optimized out>, offset=offset@entry=6157475840, bytes=bytes@entry=8192, qiov=qiov@entry=0x5555575df720, flags=flags@entry=0) at block/io.c:1133
  riscv#15 0x0000555555a4515a in blk_co_preadv (blk=0x5555566356d0, offset=6157475840, bytes=8192, qiov=0x5555575df720, flags=0) at block/block-backend.c:783
  riscv#16 0x0000555555a45266 in blk_aio_read_entry (opaque=0x555557231aa0) at block/block-backend.c:991
  riscv#17 0x0000555555ac0cfa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78

Use the new qemu_coroutine_entered() function instead of comparing
against qemu_coroutine_self().  This is correct because:

1. If a coroutine is not entered then it must have yielded to wait for
   I/O completion.  It is therefore safe to enter.

2. If a coroutine is entered then it must be in
   ioq_submit()/qemu_laio_process_completions() because otherwise it
   would be yielded while waiting for I/O completion.  Therefore it will
   check laio->ret and return from ioq_submit() instead of yielding,
   i.e. it's guaranteed not to hang.

Reported-by: Fam Zheng <famz@redhat.com>
Tested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1474989516-18255-4-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

xorgy pushed a commit to xorgy/riscv-qemu that referenced this issue Aug 22, 2017

net: don't poke at chardev internal QemuOpts
The vhost-user & colo code is poking at the QemuOpts instance
in the CharDriverState struct, not realizing that it is valid
for this to be NULL. e.g. the following crash shows a codepath
where it will be NULL:

 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617
 617         QTAILQ_FOREACH(opt, &opts->head, next) {
 [Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))]
 (gdb) bt
 #0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617
 riscv#1  0x000055baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260, errp=0x7ffc51368e48) at net/vhost-user.c:314
 riscv#2  0x000055baf696b985 in net_init_vhost_user (netdev=0x55baf8ff9250, name=0x55baf879d270 "hostnet2", peer=0x0, errp=0x7ffc51368e48) at net/vhost-user.c:360
 riscv#3  0x000055baf6960216 in net_client_init1 (object=0x55baf8ff9250, is_netdev=true, errp=0x7ffc51368e48) at net/net.c:1051
 riscv#4  0x000055baf6960518 in net_client_init (opts=0x55baf776e7e0, is_netdev=true, errp=0x7ffc51368f00) at net/net.c:1108
 riscv#5  0x000055baf696083f in netdev_add (opts=0x55baf776e7e0, errp=0x7ffc51368f00) at net/net.c:1186
 riscv#6  0x000055baf69608c7 in qmp_netdev_add (qdict=0x55baf7afaf60, ret=0x7ffc51368f50, errp=0x7ffc51368f48) at net/net.c:1205
 riscv#7  0x000055baf6622135 in handle_qmp_command (parser=0x55baf77fb590, tokens=0x7f1d24011960) at /path/to/qemu.git/monitor.c:3978
 riscv#8  0x000055baf6a9d099 in json_message_process_token (lexer=0x55baf77fb598, input=0x55baf75acd20, type=JSON_RCURLY, x=113, y=19) at qobject/json-streamer.c:105
 riscv#9  0x000055baf6abf7aa in json_lexer_feed_char (lexer=0x55baf77fb598, ch=125 '}', flush=false) at qobject/json-lexer.c:319
 riscv#10 0x000055baf6abf8f2 in json_lexer_feed (lexer=0x55baf77fb598, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-lexer.c:369
 riscv#11 0x000055baf6a9d13c in json_message_parser_feed (parser=0x55baf77fb590, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-streamer.c:124
 riscv#12 0x000055baf66221f7 in monitor_qmp_read (opaque=0x55baf77fb530, buf=0x7ffc51369170 "}R\204\367\272U", size=1) at /path/to/qemu.git/monitor.c:3994
 riscv#13 0x000055baf6757014 in qemu_chr_be_write_impl (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:387
 riscv#14 0x000055baf6757076 in qemu_chr_be_write (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:399
 riscv#15 0x000055baf675b3b0 in tcp_chr_read (chan=0x55baf90244b0, cond=G_IO_IN, opaque=0x55baf7610a40) at qemu-char.c:2927
 riscv#16 0x000055baf6a5d655 in qio_channel_fd_source_dispatch (source=0x55baf7610df0, callback=0x55baf675b25a <tcp_chr_read>, user_data=0x55baf7610a40) at io/channel-watch.c:84
 riscv#17 0x00007f1d3e80cbbd in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
 riscv#18 0x000055baf69d3720 in glib_pollfds_poll () at main-loop.c:213
 riscv#19 0x000055baf69d37fd in os_host_main_loop_wait (timeout=126000000) at main-loop.c:258
 riscv#20 0x000055baf69d38ad in main_loop_wait (nonblocking=0) at main-loop.c:506
 riscv#21 0x000055baf676587b in main_loop () at vl.c:1908
 riscv#22 0x000055baf676d3bf in main (argc=101, argv=0x7ffc5136a6c8, envp=0x7ffc5136a9f8) at vl.c:4604
 (gdb) p opts
 $1 = (QemuOpts *) 0x0

The crash occurred when attaching vhost-user net via QMP:

{
    "execute": "chardev-add",
    "arguments": {
        "id": "charnet2",
        "backend": {
            "type": "socket",
            "data": {
                "addr": {
                    "type": "unix",
                    "data": {
                        "path": "/var/run/openvswitch/vhost-user1"
                    }
                },
                "wait": false,
                "server": false
            }
        }
    },
    "id": "libvirt-19"
}
{
    "return": {

    },
    "id": "libvirt-19"
}
{
    "execute": "netdev_add",
    "arguments": {
        "type": "vhost-user",
        "chardev": "charnet2",
        "id": "hostnet2"
    },
    "id": "libvirt-20"
}

Code using chardevs should not be poking at the internals of the
CharDriverState struct. What vhost-user wants is a chardev that is
operating as reconnectable network service, along with the ability
to do FD passing over the connection. The colo code simply wants
a network service. Add a feature concept to the char drivers so
that chardev users can query the actual features they wish to have
supported. The QemuOpts member is removed to prevent future mistakes
in this area.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

xorgy pushed a commit to xorgy/riscv-qemu that referenced this issue Aug 22, 2017

xilinx: fix buffer overflow on realize
ASAN complains about buffer overflow when running:
aarch64-softmmu/qemu-system-aarch64 -machine xilinx-zynq-a9

==476==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000035e38 at pc 0x000000f75253 bp 0x7ffc597e0ec0 sp 0x7ffc597e0eb0
READ of size 8 at 0x602000035e38 thread T0
    #0 0xf75252 in xilinx_spips_realize hw/ssi/xilinx_spips.c:623
    riscv#1 0xb9ef6c in device_set_realized hw/core/qdev.c:918
    riscv#2 0x129ae01 in property_set_bool qom/object.c:1854
    riscv#3 0x1296e70 in object_property_set qom/object.c:1088
    riscv#4 0x129dd1b in object_property_set_qobject qom/qom-qobject.c:27
    riscv#5 0x1297168 in object_property_set_bool qom/object.c:1157
    riscv#6 0xb9aeac in qdev_init_nofail hw/core/qdev.c:358
    riscv#7 0x78a5bf in zynq_init_spi_flashes /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:125
    riscv#8 0x78af60 in zynq_init /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:238
    riscv#9 0x998eac in main /home/elmarco/src/qemu/vl.c:4534
    riscv#10 0x7f96ed692730 in __libc_start_main (/lib64/libc.so.6+0x20730)
    riscv#11 0x41d0a8 in _start (/home/elmarco/src/qemu/aarch64-softmmu/qemu-system-aarch64+0x41d0a8)

0x602000035e38 is located 0 bytes to the right of 8-byte region [0x602000035e30,0x602000035e38)
allocated by thread T0 here:
    #0 0x7f970b014e60 in malloc (/lib64/libasan.so.3+0xc6e60)
    riscv#1 0x7f96f15b0e18 in g_malloc (/lib64/libglib-2.0.so.0+0x4ee18)
    riscv#2 0xb9ef6c in device_set_realized hw/core/qdev.c:918
    riscv#3 0x129ae01 in property_set_bool qom/object.c:1854
    riscv#4 0x1296e70 in object_property_set qom/object.c:1088
    riscv#5 0x129dd1b in object_property_set_qobject qom/qom-qobject.c:27
    riscv#6 0x1297168 in object_property_set_bool qom/object.c:1157
    riscv#7 0xb9aeac in qdev_init_nofail hw/core/qdev.c:358
    riscv#8 0x78a5bf in zynq_init_spi_flashes /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:125
    riscv#9 0x78af60 in zynq_init /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:238
    riscv#10 0x998eac in main /home/elmarco/src/qemu/vl.c:4534
    riscv#11 0x7f96ed692730 in __libc_start_main (/lib64/libc.so.6+0x20730)

s->spi is allocated with the size of num_busses which may be 1 (by
default).  Change to use a loop up to s->num_busses also for the
call to ssi_auto_connect_slaves().

Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

xorgy pushed a commit to xorgy/riscv-qemu that referenced this issue Aug 22, 2017

ipmi: fix qemu crash while migrating with ipmi
Qemu crash in the source side while migrating, after starting ipmi service inside vm.

./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 \
-drive file=/work/suse/suse11_sp3_64_vt,format=raw,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \
-vnc :99 -monitor vc -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-kcs,bmc=bmc0,ioport=0xca2

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffec4268700 (LWP 7657)]
__memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757
(gdb) bt
 #0  __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757
 riscv#1  0x00005555559ef775 in memcpy (__len=3, __src=0xc1421c, __dest=<optimized out>)
     at /usr/include/bits/string3.h:51
 riscv#2  qemu_put_buffer (f=0x555557a97690, buf=0xc1421c <Address 0xc1421c out of bounds>, size=3)
     at migration/qemu-file.c:346
 riscv#3  0x00005555559eef66 in vmstate_save_state (f=f@entry=0x555557a97690,
     vmsd=0x555555f8a5a0 <vmstate_ISAIPMIKCSDevice>, opaque=0x555557231160,
     vmdesc=vmdesc@entry=0x55555798cc40) at migration/vmstate.c:333
 riscv#4  0x00005555557cfe45 in vmstate_save (f=f@entry=0x555557a97690, se=se@entry=0x555557231de0,
     vmdesc=vmdesc@entry=0x55555798cc40) at /mnt/sdb/zyy/qemu/migration/savevm.c:720
 riscv#5  0x00005555557d2be7 in qemu_savevm_state_complete_precopy (f=0x555557a97690,
     iterable_only=iterable_only@entry=false) at /mnt/sdb/zyy/qemu/migration/savevm.c:1128
 riscv#6  0x00005555559ea102 in migration_completion (start_time=<synthetic pointer>,
     old_vm_running=<synthetic pointer>, current_active_state=<optimized out>,
     s=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1707
 riscv#7  migration_thread (opaque=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1855
 riscv#8  0x00007ffff3900dc5 in start_thread (arg=0x7ffec4268700) at pthread_create.c:308
 riscv#9  0x00007fffefc6c71d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Jan 10, 2018

block/nbd: fix segmentation fault when .desc is not null-terminated
The find_desc_by_name() from util/qemu-option.c relies on the .name not being
NULL to call strcmp(). This check becomes unsafe when the list is not
NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can
result in segmentation fault when strcmp() tries to access an invalid memory:

    #0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6
    #1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166
    riscv#2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026
    riscv#3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406
    riscv#4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135
    riscv#5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395

>From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing an
invalid memory:

    >>> p desc[5]
    $8 = {
      name = 0x1037f098 "R27A",
      type = 1561964883,
      help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>,
      def_value_str = 0x2 <error: Cannot access memory at address 0x2>
    }
    >>> p desc[6]
    $9 = {
      name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001",
      type = 272101528,
      help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>,
      def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9>
    }

This patch fixes the segmentation fault in strcmp() by adding a NULL element at
the end of nbd_runtime_opts.desc list, which is the common practice to most of
other structs like runtime_opts in block/null.c. Thus, the desc[i].name != NULL
check becomes safe because it will not evaluate to true when .desc list reached
its end.

Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Buglink: https://bugs.launchpad.net/qemu/+bug/1727259
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com>
Message-Id: <20180105133241.14141-2-muriloo@linux.vnet.ibm.com>
CC: qemu-stable@nongnu.org
Fixes: 7ccc44f
Signed-off-by: Eric Blake <eblake@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Jan 19, 2018

tests: fix check-qobject leak
/public/qobject_is_equal_conversion: OK

=================================================================
==14396==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 56 byte(s) in 1 object(s) allocated from:
    #0 0x7f07682c5850 in malloc (/lib64/libasan.so.4+0xde850)
    #1 0x7f0767d12f0c in g_malloc ../glib/gmem.c:94
    riscv#2 0x7f0767d131cf in g_malloc_n ../glib/gmem.c:331
    riscv#3 0x562bd767371f in do_test_equality /home/elmarco/src/qq/tests/check-qobject.c:49
    riscv#4 0x562bd7674a35 in qobject_is_equal_dict_test /home/elmarco/src/qq/tests/check-qobject.c:267
    riscv#5 0x7f0767d37b04 in test_case_run ../glib/gtestutils.c:2237
    riscv#6 0x7f0767d37ec4 in g_test_run_suite_internal ../glib/gtestutils.c:2321
    riscv#7 0x7f0767d37f6d in g_test_run_suite_internal ../glib/gtestutils.c:2333
    riscv#8 0x7f0767d38184 in g_test_run_suite ../glib/gtestutils.c:2408
    riscv#9 0x7f0767d36e0d in g_test_run ../glib/gtestutils.c:1674
    riscv#10 0x562bd7674e75 in main /home/elmarco/src/qq/tests/check-qobject.c:327
    riscv#11 0x7f0766009039 in __libc_start_main (/lib64/libc.so.6+0x21039)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180104160523.22995-9-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Jan 19, 2018

vl: fix direct firmware directories leak
Note that data_dir[] will now point to allocated strings.

Fixes:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7f1448181850 in malloc (/lib64/libasan.so.4+0xde850)
    #1 0x7f1446ed8f0c in g_malloc ../glib/gmem.c:94
    riscv#2 0x7f1446ed91cf in g_malloc_n ../glib/gmem.c:331
    riscv#3 0x7f1446ef739a in g_strsplit ../glib/gstrfuncs.c:2364
    riscv#4 0x55cf276439d7 in main /home/elmarco/src/qq/vl.c:4311
    riscv#5 0x7f143dfad039 in __libc_start_main (/lib64/libc.so.6+0x21039)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180104160523.22995-10-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Jan 19, 2018

readline: add a free function
Fixes leaks such as:

Direct leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x7eff58beb850 in malloc (/lib64/libasan.so.4+0xde850)
    #1 0x7eff57942f0c in g_malloc ../glib/gmem.c:94
    riscv#2 0x7eff579431cf in g_malloc_n ../glib/gmem.c:331
    riscv#3 0x7eff5795f6eb in g_strdup ../glib/gstrfuncs.c:363
    riscv#4 0x55db720f1d46 in readline_hist_add /home/elmarco/src/qq/util/readline.c:258
    riscv#5 0x55db720f2d34 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:387
    riscv#6 0x55db71539d00 in monitor_read /home/elmarco/src/qq/monitor.c:3896
    riscv#7 0x55db71f9be35 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167
    riscv#8 0x55db71f9bed3 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179
    riscv#9 0x55db71fa013c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
    riscv#10 0x55db71fe18a8 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
    riscv#11 0x7eff5793a90b in g_main_dispatch ../glib/gmain.c:3182
    riscv#12 0x7eff5793b7ac in g_main_context_dispatch ../glib/gmain.c:3847
    riscv#13 0x55db720af3bd in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214
    riscv#14 0x55db720af505 in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261
    riscv#15 0x55db720af6d6 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515
    riscv#16 0x55db7184e0de in main_loop /home/elmarco/src/qq/vl.c:1995
    riscv#17 0x55db7185e956 in main /home/elmarco/src/qq/vl.c:4914
    riscv#18 0x7eff4ea17039 in __libc_start_main (/lib64/libc.so.6+0x21039)

(while at it, use g_new0(ReadLineState), it's a bit easier to read)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180104160523.22995-11-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Jan 19, 2018

crypto: fix stack-buffer-overflow error
ASAN complains about:

==8856==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd8a1fe168 at pc 0x561136cb4451 bp 0x7ffd8a1fe130 sp 0x7ffd8a1fd8e0
READ of size 16 at 0x7ffd8a1fe168 thread T0
    #0 0x561136cb4450 in __asan_memcpy (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x110450)
    #1 0x561136d2a6a7 in qcrypto_ivgen_essiv_calculate /home/elmarco/src/qq/crypto/ivgen-essiv.c:83:5
    riscv#2 0x561136d29af8 in qcrypto_ivgen_calculate /home/elmarco/src/qq/crypto/ivgen.c:72:12
    riscv#3 0x561136d07c8e in test_ivgen /home/elmarco/src/qq/tests/test-crypto-ivgen.c:148:5
    riscv#4 0x7f77772c3b04 in test_case_run /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2237
    riscv#5 0x7f77772c3ec4 in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2321
    riscv#6 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333
    riscv#7 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333
    riscv#8 0x7f77772c3f6d in g_test_run_suite_internal /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2333
    riscv#9 0x7f77772c4184 in g_test_run_suite /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:2408
    riscv#10 0x7f77772c2e0d in g_test_run /home/elmarco/src/gnome/glib/builddir/../glib/gtestutils.c:1674
    riscv#11 0x561136d0799b in main /home/elmarco/src/qq/tests/test-crypto-ivgen.c:173:12
    riscv#12 0x7f77756e6039 in __libc_start_main (/lib64/libc.so.6+0x21039)
    riscv#13 0x561136c13d89 in _start (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x6fd89)

Address 0x7ffd8a1fe168 is located in stack of thread T0 at offset 40 in frame
    #0 0x561136d2a40f in qcrypto_ivgen_essiv_calculate /home/elmarco/src/qq/crypto/ivgen-essiv.c:76

  This frame has 1 object(s):
    [32, 40) 'sector.addr' <== Memory access at offset 40 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (/home/elmarco/src/qq/build/tests/test-crypto-ivgen+0x110450) in __asan_memcpy
Shadow bytes around the buggy address:
  0x100031437bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100031437be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100031437bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100031437c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100031437c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x100031437c20: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00[f3]f3 f3
  0x100031437c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100031437c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100031437c50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100031437c60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x100031437c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb

It looks like the rest of the code copes with ndata being larger than
sizeof(sector), so limit the memcpy() range.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180104160523.22995-13-marcandre.lureau@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Jan 19, 2018

qemu-config: fix leak in query-command-line-options
Direct leak of 160 byte(s) in 4 object(s) allocated from:
    #0 0x55ed7678cda8 in calloc (/home/elmarco/src/qq/build/x86_64-softmmu/qemu-system-x86_64+0x797da8)
    #1 0x7f3f5e725f75 in g_malloc0 /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:124
    riscv#2 0x55ed778aa3a7 in query_option_descs /home/elmarco/src/qq/util/qemu-config.c:60:16
    riscv#3 0x55ed778aa307 in get_drive_infolist /home/elmarco/src/qq/util/qemu-config.c:140:19
    riscv#4 0x55ed778a9f40 in qmp_query_command_line_options /home/elmarco/src/qq/util/qemu-config.c:254:36
    riscv#5 0x55ed76d4868c in qmp_marshal_query_command_line_options /home/elmarco/src/qq/build/qmp-marshal.c:3078:14
    riscv#6 0x55ed77855dd5 in do_qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:104:5
    riscv#7 0x55ed778558cc in qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:131:11
    riscv#8 0x55ed768b592f in handle_qmp_command /home/elmarco/src/qq/monitor.c:3840:11
    riscv#9 0x55ed7786ccfe in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105:5
    riscv#10 0x55ed778fe37c in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323:13
    riscv#11 0x55ed778fdde6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373:15
    riscv#12 0x55ed7786cd83 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124:12
    riscv#13 0x55ed768b559e in monitor_qmp_read /home/elmarco/src/qq/monitor.c:3882:5
    riscv#14 0x55ed77714f29 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167:9
    riscv#15 0x55ed77714fde in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179:9
    riscv#16 0x55ed7772ffad in tcp_chr_read /home/elmarco/src/qq/chardev/char-socket.c:440:13
    riscv#17 0x55ed7777113b in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84:12
    riscv#18 0x7f3f5e71d90b in g_main_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3182
    riscv#19 0x7f3f5e71e7ac in g_main_context_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3847
    riscv#20 0x55ed77886ffc in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214:9
    riscv#21 0x55ed778865fd in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261:5
    riscv#22 0x55ed77886222 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515:11
    riscv#23 0x55ed76d2a4df in main_loop /home/elmarco/src/qq/vl.c:1995:9
    riscv#24 0x55ed76d1cb4a in main /home/elmarco/src/qq/vl.c:4914:5
    riscv#25 0x7f3f555f6039 in __libc_start_main (/lib64/libc.so.6+0x21039)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180104160523.22995-14-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Jan 19, 2018

tests: fix coroutine leak in /basic/entered
The coroutine is not finished by the time the test ends, resulting in
ASAN warning:

==7005==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 312 byte(s) in 1 object(s) allocated from:
    #0 0x7fd35290fa38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7fd3506c5f75 in g_malloc0 ../glib/gmem.c:124
    riscv#2 0x55994af03e47 in qemu_coroutine_new /home/elmarco/src/qemu/util/coroutine-ucontext.c:144
    riscv#3 0x55994aefed99 in qemu_coroutine_create /home/elmarco/src/qemu/util/qemu-coroutine.c:76
    riscv#4 0x55994ac1eb50 in verify_entered_step_1 /home/elmarco/src/qemu/tests/test-coroutine.c:80
    riscv#5 0x55994af03c75 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:119
    riscv#6 0x7fd34ec02bef  (/lib64/libc.so.6+0x50bef)

Do not yield() to let the coroutine terminate.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180104160523.22995-17-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Jan 19, 2018

disas/s390: fix global-buffer-overflow
Spotted thanks to ASAN:

==25226==ERROR: AddressSanitizer: global-buffer-overflow on address 0x556715a1f120 at pc 0x556714b6f6b1 bp 0x7ffcdfac1360 sp 0x7ffcdfac1350
READ of size 1 at 0x556715a1f120 thread T0
    #0 0x556714b6f6b0 in init_disasm /home/elmarco/src/qemu/disas/s390.c:219
    #1 0x556714b6fa6a in print_insn_s390 /home/elmarco/src/qemu/disas/s390.c:294
    riscv#2 0x55671484d031 in monitor_disas /home/elmarco/src/qemu/disas.c:635
    riscv#3 0x556714862ec0 in memory_dump /home/elmarco/src/qemu/monitor.c:1324
    riscv#4 0x55671486342a in hmp_memory_dump /home/elmarco/src/qemu/monitor.c:1418
    riscv#5 0x5567148670be in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3109
    riscv#6 0x5567148674ed in qmp_human_monitor_command /home/elmarco/src/qemu/monitor.c:613
    riscv#7 0x556714b00918 in qmp_marshal_human_monitor_command /home/elmarco/src/qemu/build/qmp-marshal.c:1704
    riscv#8 0x556715138a3e in do_qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:104
    riscv#9 0x556715138f83 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:131
    riscv#10 0x55671485cf88 in handle_qmp_command /home/elmarco/src/qemu/monitor.c:3839
    riscv#11 0x55671514e80b in json_message_process_token /home/elmarco/src/qemu/qobject/json-streamer.c:105
    riscv#12 0x5567151bf2dc in json_lexer_feed_char /home/elmarco/src/qemu/qobject/json-lexer.c:323
    riscv#13 0x5567151bf827 in json_lexer_feed /home/elmarco/src/qemu/qobject/json-lexer.c:373
    riscv#14 0x55671514ee62 in json_message_parser_feed /home/elmarco/src/qemu/qobject/json-streamer.c:124
    riscv#15 0x556714854b1f in monitor_qmp_read /home/elmarco/src/qemu/monitor.c:3881
    riscv#16 0x556715045440 in qemu_chr_be_write_impl /home/elmarco/src/qemu/chardev/char.c:172
    riscv#17 0x556715047184 in qemu_chr_be_write /home/elmarco/src/qemu/chardev/char.c:184
    riscv#18 0x55671505a8e6 in tcp_chr_read /home/elmarco/src/qemu/chardev/char-socket.c:440
    riscv#19 0x5567150943c3 in qio_channel_fd_source_dispatch /home/elmarco/src/qemu/io/channel-watch.c:84
    riscv#20 0x7fb90292b90b in g_main_dispatch ../glib/gmain.c:3182
    riscv#21 0x7fb90292c7ac in g_main_context_dispatch ../glib/gmain.c:3847
    riscv#22 0x556715162eca in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
    riscv#23 0x556715163001 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
    riscv#24 0x5567151631fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
    riscv#25 0x556714ad6d3b in main_loop /home/elmarco/src/qemu/vl.c:1950
    riscv#26 0x556714ade329 in main /home/elmarco/src/qemu/vl.c:4865
    riscv#27 0x7fb8fe5c9009 in __libc_start_main (/lib64/libc.so.6+0x21009)
    riscv#28 0x5567147af4d9 in _start (/home/elmarco/src/qemu/build/s390x-softmmu/qemu-system-s390x+0xf674d9)

0x556715a1f120 is located 32 bytes to the left of global variable 'char_hci_type_info' defined in '/home/elmarco/src/qemu/hw/bt/hci-csr.c:493:23' (0x556715a1f140) of size 104
0x556715a1f120 is located 8 bytes to the right of global variable 's390_opcodes' defined in '/home/elmarco/src/qemu/disas/s390.c:860:33' (0x556715a15280) of size 40600

This fix is based on Andreas Arnez <arnez@linux.vnet.ibm.com> upstream
commit:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=9ace48f3d7d80ce09c5df60cccb433470410b11b

2014-08-19  Andreas Arnez  <arnez@linux.vnet.ibm.com>

       * s390-dis.c (init_disasm): Simplify initialization of
       opc_index[].  This also fixes an access after the last element
       of s390_opcodes[].

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180104160523.22995-19-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Feb 4, 2018

sdhci: fix a NULL pointer dereference due to uninitialized AddresSpac…
…e object

missed in 60765b6.

  Thread 1 "qemu-system-aarch64" received signal SIGSEGV, Segmentation fault.
  address_space_init (as=0x0, root=0x55555726e410, name=name@entry=0x555555e3f0a7 "sdhci-dma") at memory.c:3050
  3050	    as->root = root;
  (gdb) bt
  #0  address_space_init (as=0x0, root=0x55555726e410, name=name@entry=0x555555e3f0a7 "sdhci-dma") at memory.c:3050
  #1  0x0000555555af62c3 in sdhci_sysbus_realize (dev=<optimized out>, errp=0x7fff7f931150) at hw/sd/sdhci.c:1564
  riscv#2  0x00005555558b25e5 in zynqmp_sdhci_realize (dev=0x555557051520, errp=0x7fff7f931150) at hw/sd/zynqmp-sdhci.c:151
  riscv#3  0x0000555555a2e7f3 in device_set_realized (obj=0x555557051520, value=<optimized out>, errp=0x7fff7f931270) at hw/core/qdev.c:966
  riscv#4  0x0000555555ba3f74 in property_set_bool (obj=0x555557051520, v=<optimized out>, name=<optimized out>, opaque=0x555556e04a20,
      errp=0x7fff7f931270) at qom/object.c:1906
  riscv#5  0x0000555555ba51f4 in object_property_set (obj=obj@entry=0x555557051520, v=v@entry=0x5555576dbd60,
      name=name@entry=0x555555dd6306 "realized", errp=errp@entry=0x7fff7f931270) at qom/object.c:1102

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180123132051.24448-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Feb 22, 2018

9p: fix leak in synth_name_to_path()
Leak found thanks to ASAN:

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x55995789ac90 in __interceptor_malloc (/home/elmarco/src/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x1510c90)
    #1 0x7f0a91190f0c in g_malloc /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:94
    riscv#2 0x5599580a281c in v9fs_path_copy /home/elmarco/src/qemu/hw/9pfs/9p.c:196:17
    riscv#3 0x559958f9ec5d in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:116:9
    riscv#4 0x7f0a8766ebbf  (/lib64/libc.so.6+0x50bbf)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>

michaeljclark pushed a commit that referenced this issue Mar 16, 2018

arm: fix load ELF error leak
Spotted by ASAN:
QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7ff8a9b0ca38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7ff8a8ea7f75 in g_malloc0 ../glib/gmem.c:124
    #2 0x55fef3d99129 in error_setv /home/elmarco/src/qemu/util/error.c:59
    #3 0x55fef3d99738 in error_setg_internal /home/elmarco/src/qemu/util/error.c:95
    #4 0x55fef323acb2 in load_elf_hdr /home/elmarco/src/qemu/hw/core/loader.c:393
    #5 0x55fef2d15776 in arm_load_elf /home/elmarco/src/qemu/hw/arm/boot.c:830
    #6 0x55fef2d16d39 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1022
    #7 0x55fef3dc634d in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40
    #8 0x55fef2fc3182 in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716
    #9 0x55fef2fcbbd1 in main /home/elmarco/src/qemu/vl.c:4679
    #10 0x7ff89dfed009 in __libc_start_main (/lib64/libc.so.6+0x21009)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Mar 16, 2018

arm: avoid heap-buffer-overflow in load_aarch64_image
Spotted by ASAN:

elmarco@boraha:~/src/qemu/build (master *%)$ QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test
/aarch64/boot-serial/virt: ** (process:19740): DEBUG: 18:39:30.275: foo /tmp/qtest-boot-serial-cXaS94D
=================================================================
==19740==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000069648 at pc 0x7f1d2201cc54 bp 0x7fff331f6a40 sp 0x7fff331f61e8
READ of size 4 at 0x603000069648 thread T0
    #0 0x7f1d2201cc53  (/lib64/libasan.so.4+0xafc53)
    #1 0x55bc86685ee3 in load_aarch64_image /home/elmarco/src/qemu/hw/arm/boot.c:894
    riscv#2 0x55bc86687217 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1047
    riscv#3 0x55bc877363b5 in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40
    riscv#4 0x55bc869331ea in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716
    riscv#5 0x55bc8693bc39 in main /home/elmarco/src/qemu/vl.c:4679
    riscv#6 0x7f1d1652c009 in __libc_start_main (/lib64/libc.so.6+0x21009)
    riscv#7 0x55bc86255cc9 in _start (/home/elmarco/src/qemu/build/aarch64-softmmu/qemu-system-aarch64+0x1ae5cc9)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

michaeljclark pushed a commit that referenced this issue Mar 16, 2018

migration: fix minor finalize leak
Spotted thanks to ASAN:
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/migration-test -p /x86_64/migration/bad_dest

==30302==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f60efba1a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7f60eef3cf75 in g_malloc0 ../glib/gmem.c:124
    #2 0x55ca9094702c in error_copy /home/elmarco/src/qemu/util/error.c:203
    #3 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139
    #4 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150
    #5 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411
    #6 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81
    #7 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85
    #8 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142
    #9 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88
    #10 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552
    #11 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182
    #12 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847
    #13 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
    #14 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
    #15 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
    #16 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942
    #17 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724
    #18 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009)

Indirect leak of 45 byte(s) in 1 object(s) allocated from:
    #0 0x7f60efba1850 in malloc (/lib64/libasan.so.4+0xde850)
    #1 0x7f60eef3cf0c in g_malloc ../glib/gmem.c:94
    #2 0x7f60eef3d1cf in g_malloc_n ../glib/gmem.c:331
    #3 0x7f60eef596eb in g_strdup ../glib/gstrfuncs.c:363
    #4 0x55ca90947085 in error_copy /home/elmarco/src/qemu/util/error.c:204
    #5 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139
    #6 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150
    #7 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411
    #8 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81
    #9 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85
    #10 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142
    #11 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88
    #12 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552
    #13 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182
    #14 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847
    #15 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
    #16 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
    #17 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
    #18 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942
    #19 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724
    #20 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180306170959.3921-1-marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

michaeljclark pushed a commit that referenced this issue Mar 20, 2018

hmp: free sev info
Found thanks to ASAN:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7efe20417a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7efe1f7b2f75 in g_malloc0 ../glib/gmem.c:124
    #2 0x7efe1f7b3249 in g_malloc0_n ../glib/gmem.c:355
    #3 0x558272879162 in sev_get_info /home/elmarco/src/qemu/target/i386/sev.c:414
    #4 0x55827285113b in hmp_info_sev /home/elmarco/src/qemu/target/i386/monitor.c:684
    #5 0x5582724043b8 in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3333

Fixes: 6303631
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180319175823.22111-1-marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

michaeljclark pushed a commit that referenced this issue Mar 31, 2018

migration: fix pfd leak
Fix leak spotted by ASAN:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7fe1abb80a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7fe1aaf1bf75 in g_malloc0 ../glib/gmem.c:124
    #2 0x7fe1aaf1c249 in g_malloc0_n ../glib/gmem.c:355
    #3 0x55f4841cfaa9 in postcopy_ram_fault_thread /home/elmarco/src/qemu/migration/postcopy-ram.c:596
    #4 0x55f48479447b in qemu_thread_start /home/elmarco/src/qemu/util/qemu-thread-posix.c:504
    #5 0x7fe1a043550a in start_thread (/lib64/libpthread.so.0+0x750a)

Regression introduced with commit 00fa4fc.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180321113644.21899-1-marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

michaeljclark pushed a commit that referenced this issue Apr 6, 2018

blockjob: leak fix, remove from txn when failing early
This fixes leaks found by ASAN such as:
  GTESTER tests/test-blockjob
=================================================================
==31442==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7f88483cba38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
    #1 0x7f8845e1bd77 in g_malloc0 ../glib/gmem.c:129
    #2 0x7f8845e1c04b in g_malloc0_n ../glib/gmem.c:360
    #3 0x5584d2732498 in block_job_txn_new /home/elmarco/src/qemu/blockjob.c:172
    #4 0x5584d2739b28 in block_job_create /home/elmarco/src/qemu/blockjob.c:973
    #5 0x5584d270ae31 in mk_job /home/elmarco/src/qemu/tests/test-blockjob.c:34
    #6 0x5584d270b1c1 in do_test_id /home/elmarco/src/qemu/tests/test-blockjob.c:57
    #7 0x5584d270b65c in test_job_ids /home/elmarco/src/qemu/tests/test-blockjob.c:118
    #8 0x7f8845e40b69 in test_case_run ../glib/gtestutils.c:2255
    #9 0x7f8845e40f29 in g_test_run_suite_internal ../glib/gtestutils.c:2339
    #10 0x7f8845e40fd2 in g_test_run_suite_internal ../glib/gtestutils.c:2351
    #11 0x7f8845e411e9 in g_test_run_suite ../glib/gtestutils.c:2426
    #12 0x7f8845e3fe72 in g_test_run ../glib/gtestutils.c:1692
    #13 0x5584d270d6e2 in main /home/elmarco/src/qemu/tests/test-blockjob.c:377
    #14 0x7f8843641f29 in __libc_start_main (/lib64/libc.so.6+0x20f29)

Add an assert to make sure that the job doesn't have associated txn before free().

[Jeff Cody: N.B., used updated patch provided by John Snow]

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Apr 12, 2018

iothread: workaround glib bug which hangs qmp-test
Free the AIO context earlier than the GMainContext (if we have) to
workaround a glib2 bug that GSource context pointer is not cleared even
if the context has already been destroyed (while it should).

The patch itself only changed the order to destroy the objects, no
functional change at all. Without this workaround, we can encounter
qmp-test hang with oob (and possibly any other use case when iothread is
used with GMainContexts):

  #0  0x00007f35ffe45334 in __lll_lock_wait () from /lib64/libpthread.so.0
  #1  0x00007f35ffe405d8 in _L_lock_854 () from /lib64/libpthread.so.0
  riscv#2  0x00007f35ffe404a7 in pthread_mutex_lock () from /lib64/libpthread.so.0
  riscv#3  0x00007f35fc5b9c9d in g_source_unref_internal (source=0x24f0600, context=0x7f35f0000960, have_lock=0) at gmain.c:1685
  riscv#4  0x0000000000aa6672 in aio_context_unref (ctx=0x24f0600) at /root/qemu/util/async.c:497
  riscv#5  0x000000000065851c in iothread_instance_finalize (obj=0x24f0380) at /root/qemu/iothread.c:129
  riscv#6  0x0000000000962d79 in object_deinit (obj=0x24f0380, type=0x242e960) at /root/qemu/qom/object.c:462
  riscv#7  0x0000000000962e0d in object_finalize (data=0x24f0380) at /root/qemu/qom/object.c:476
  riscv#8  0x0000000000964146 in object_unref (obj=0x24f0380) at /root/qemu/qom/object.c:924
  riscv#9  0x0000000000965880 in object_finalize_child_property (obj=0x24ec640, name=0x24efca0 "mon_iothread", opaque=0x24f0380) at /root/qemu/qom/object.c:1436
  riscv#10 0x0000000000962c33 in object_property_del_child (obj=0x24ec640, child=0x24f0380, errp=0x0) at /root/qemu/qom/object.c:436
  riscv#11 0x0000000000962d26 in object_unparent (obj=0x24f0380) at /root/qemu/qom/object.c:455
  riscv#12 0x0000000000658f00 in iothread_destroy (iothread=0x24f0380) at /root/qemu/iothread.c:365
  riscv#13 0x00000000004c67a8 in monitor_cleanup () at /root/qemu/monitor.c:4663
  riscv#14 0x0000000000669e27 in main (argc=16, argv=0x7ffc8b1ae2f8, envp=0x7ffc8b1ae380) at /root/qemu/vl.c:4749

The glib2 bug is fixed in commit 26056558b ("gmain: allow
g_source_get_context() on destroyed sources", 2012-07-30), so the first
good version is glib2 2.33.10. But we still support building with
glib as old as 2.28, so we need the workaround.

Let's make sure we destroy the GSources first before its owner context
until we drop support for glib older than 2.33.10.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180409083956.1780-1-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

michaeljclark pushed a commit to michaeljclark/riscv-qemu that referenced this issue Apr 12, 2018

monitor: bind dispatch bh to iohandler context
Eric Auger reported the problem days ago that OOB broke ARM when running
with libvirt:

http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html

The problem was that the monitor dispatcher bottom half was bound to
qemu_aio_context now, which could be polled unexpectedly in block code.
We should keep the dispatchers run in iohandler_ctx just like what we
did before the Out-Of-Band series (chardev uses qio, and qio binds
everything with iohandler_ctx).

If without this change, QMP dispatcher might be run even before reaching
main loop in block IO path, for example, in a stack like (the ARM case,
"cont" command handler run even during machine init phase):

        #0  qmp_cont ()
        #1  0x00000000006bd210 in qmp_marshal_cont ()
        riscv#2  0x0000000000ac05c4 in do_qmp_dispatch ()
        riscv#3  0x0000000000ac07a0 in qmp_dispatch ()
        riscv#4  0x0000000000472d60 in monitor_qmp_dispatch_one ()
        riscv#5  0x000000000047302c in monitor_qmp_bh_dispatcher ()
        riscv#6  0x0000000000acf374 in aio_bh_call ()
        riscv#7  0x0000000000acf428 in aio_bh_poll ()
        riscv#8  0x0000000000ad5110 in aio_poll ()
        riscv#9  0x0000000000a08ab8 in blk_prw ()
        riscv#10 0x0000000000a091c4 in blk_pread ()
        riscv#11 0x0000000000734f94 in pflash_cfi01_realize ()
        riscv#12 0x000000000075a3a4 in device_set_realized ()
        riscv#13 0x00000000009a26cc in property_set_bool ()
        riscv#14 0x00000000009a0a40 in object_property_set ()
        riscv#15 0x00000000009a3a08 in object_property_set_qobject ()
        riscv#16 0x00000000009a0c8c in object_property_set_bool ()
        riscv#17 0x0000000000758f94 in qdev_init_nofail ()
        riscv#18 0x000000000058e190 in create_one_flash ()
        riscv#19 0x000000000058e2f4 in create_flash ()
        riscv#20 0x00000000005902f0 in machvirt_init ()
        riscv#21 0x00000000007635cc in machine_run_board_init ()
        riscv#22 0x00000000006b135c in main ()

Actually the problem is more severe than that.  After we switched to the
qemu AIO handler it means the monitor dispatcher code can even be called
with nested aio_poll(), then it can be an explicit aio_poll() inside
another main loop aio_poll() which could be racy too; breaking code
like TPM and 9p that use nested event loops.

Switch to use the iohandler_ctx for monitor dispatchers.

My sincere thanks to Eric Auger who offered great help during both
debugging and verifying the problem.  The ARM test was carried out by
applying this patch upon QEMU 2.12.0-rc0 and problem is gone after the
patch.

A quick test of mine shows that after this patch applied we can pass all
raw iotests even with OOB on by default.

CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <famz@redhat.com>
Reported-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180410044942.17059-1-peterx@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

monitor: temporary fix for dead-lock on event recursion
With a Spice port chardev, it is possible to reenter
monitor_qapi_event_queue() (when the client disconnects for
example). This will dead-lock on monitor_lock.

Instead, use some TLS variables to check for recursion and queue the
events.

Fixes:
 (gdb) bt
 #0  0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0
 #1  0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
 #2  0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
 #3  0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645
 #4  0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149
 #5  0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235
 #6  0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316
 #7  0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197
 #8  0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197
 #9  0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414
 #10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388
 #11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347
 #12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
 #13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341
 #14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
 #15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
 #16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353
 #17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199
 #18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112
 #19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147
 #20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42
 #21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425
 #22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468
 #23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517
 #24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538
 #25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624
 #26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649
 #27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58
 #28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822
 #29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862
 #30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644

Note that error report is now moved to the first caller, which may
receive an error for a recursed event. This is probably fine (95% of
callers use &error_abort, the rest have NULL error and ignore it)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180731150144.14022-1-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[*_no_recurse renamed to *_no_reenter, local variables reordered]
Signed-off-by: Markus Armbruster <armbru@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

tests: fix crumple/recursive leak
Spotted by ASAN:

=================================================================
==27907==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4120 byte(s) in 1 object(s) allocated from:
    #0 0x7f913458ce50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f9133fd641d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
    #2 0x5561c6643c95 in qdict_crumple_test_recursive /home/elmarco/src/qq/tests/check-block-qdict.c:438
    #3 0x7f9133ff7c49  (/lib64/libglib-2.0.so.0+0x73c49)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180809114417.28718-2-marcandre.lureau@redhat.com>
[Screwed up in commit 2860b2b]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

monitor: fix oob command leak
Spotted by ASAN, during make check...

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f8e27262c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f8e26a5f3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x555ab67078a8 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:67
    #3 0x555ab67071e4 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:24
    #4 0x555ab6713fbf in qstring_from_escaped_str /home/elmarco/src/qq/qobject/json-parser.c:144
    #5 0x555ab671738c in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:506
    #6 0x555ab67179c3 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:569
    #7 0x555ab6715123 in parse_pair /home/elmarco/src/qq/qobject/json-parser.c:306
    #8 0x555ab6715483 in parse_object /home/elmarco/src/qq/qobject/json-parser.c:357
    #9 0x555ab671798b in parse_value /home/elmarco/src/qq/qobject/json-parser.c:561
    #10 0x555ab6717a6b in json_parser_parse_err /home/elmarco/src/qq/qobject/json-parser.c:592
    #11 0x555ab4fd4dcf in handle_qmp_command /home/elmarco/src/qq/monitor.c:4257
    #12 0x555ab6712c4d in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105
    #13 0x555ab67e01e2 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323
    #14 0x555ab67e0af6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373
    #15 0x555ab6713010 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124
    #16 0x555ab4fd58ec in monitor_qmp_read /home/elmarco/src/qq/monitor.c:4337
    #17 0x555ab6559df2 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
    #18 0x555ab6559e95 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
    #19 0x555ab6560127 in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
    #20 0x555ab65d9c73 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
    #21 0x7f8e26a598ac in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4c8ac)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180809114417.28718-4-marcandre.lureau@redhat.com>
[Screwed up in commit b273145]
Cc: qemu-stable@nongnu.org
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

migration: implement io_set_aio_fd_handler function for RDMA QIOChannel
if qio_channel_rdma_readv return QIO_CHANNEL_ERR_BLOCK, the destination qemu
crash.

The backtrace is:
(gdb) bt
    #0  0x0000000000000000 in ?? ()
    #1  0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080,
        io_read=0x8db841 <qio_channel_restart_read>, io_write=0x0, opaque=0x38111e0) at io/channel.c:
    #2  0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438
    #3  0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47
    #4  0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327
        at migration/qemu-file-channel.c:83
    #5  0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299
    #6  0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562
    #7  0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575
    #8  0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655
    #9  0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126
    #10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366
    #11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1
    #12 0x00007f9c0db56d40 in ?? () from /lib64/libc.so.6
    #13 0x00007f96fe858760 in ?? ()
    #14 0x0000000000000000 in ?? ()

RDMA QIOChannel not implement io_set_aio_fd_handler. so
qio_channel_set_aio_fd_handler will access NULL pointer.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

migration: invoke qio_channel_yield only when qemu_in_coroutine()
when qio_channel_read return QIO_CHANNEL_ERR_BLOCK, the source qemu crash.

The backtrace is:
    (gdb) bt
    #0  0x00007fb20aba91d7 in raise () from /lib64/libc.so.6
    #1  0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6
    #2  0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6
    #3  0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6
    #4  0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460
    #5  0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768)
        at migration/qemu-file-channel.c:83
    #6  0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299
    #7  0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562
    #8  0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575
    #9  0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647
    #10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794
    #11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504
    #12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0
    #13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6

This patch fixed by invoke qio_channel_yield only when qemu_in_coroutine().

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

migration: implement the shutdown for RDMA QIOChannel
Because RDMA QIOChannel not implement shutdown function,
If the to_dst_file was set error, the return path thread
will wait forever. and the migration thread will wait
return path thread exit.

the backtrace of return path thread is:

(gdb) bt
    #0  0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6
    #1  0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000)
        at qemu-timer.c:325
    #2  0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000)
        at migration/rdma.c:1501
    #3  0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000,
        byte_len=0x7ef7091d0640) at migration/rdma.c:1580
    #4  0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000,
        head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726
    #5  0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720,
        expecting=3) at migration/rdma.c:1903
    #6  0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8,
        size=32768) at migration/rdma.c:2714
    #7  0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232
    #8  0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0)
        at migration/qemu-file.c:502
    #9  0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515
    #10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591
    #11 0x00000000006a46d3 in source_return_path_thread (
        opaque=0xd826a0 <current_migration.37100>) at migration/migration.c:1331
    #12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
    #13 0x00007f372a77635d in clone () from /lib64/libc.so.6

the backtrace of migration thread is:

(gdb) bt
    #0  0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0
    #1  0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 <current_migration.37100+88>)
        at util/qemu-thread-posix.c:504
    #2  0x00000000006a4bc5 in await_return_path_close_on_source (
        ms=0xd826a0 <current_migration.37100>) at migration/migration.c:1460
    #3  0x00000000006a53e4 in migration_completion (s=0xd826a0 <current_migration.37100>,
        current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980)
        at migration/migration.c:1695
    #4  0x00000000006a5c54 in migration_thread (opaque=0xd826a0 <current_migration.37100>)
        at migration/migration.c:1837
    #5  0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
    #6  0x00007f372a77635d in clone () from /lib64/libc.so.6

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

megasas: fix sglist leak
tests/cdrom-test -p /x86_64/cdrom/boot/megasas

Produces the following ASAN leak.

==25700==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7f06f8faac48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f06f87a73c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x55a729f17738 in pci_dma_sglist_init /home/elmarco/src/qq/include/hw/pci/pci.h:818
    #3 0x55a729f2a706 in megasas_map_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:698
    #4 0x55a729f39421 in megasas_handle_dcmd /home/elmarco/src/qq/hw/scsi/megasas.c:1574
    #5 0x55a729f3f70d in megasas_handle_frame /home/elmarco/src/qq/hw/scsi/megasas.c:1955
    #6 0x55a729f40939 in megasas_mmio_write /home/elmarco/src/qq/hw/scsi/megasas.c:2119
    #7 0x55a729f41102 in megasas_port_write /home/elmarco/src/qq/hw/scsi/megasas.c:2170
    #8 0x55a729220e60 in memory_region_write_accessor /home/elmarco/src/qq/memory.c:527
    #9 0x55a7292212b3 in access_with_adjusted_size /home/elmarco/src/qq/memory.c:594
    #10 0x55a72922cf70 in memory_region_dispatch_write /home/elmarco/src/qq/memory.c:1473
    #11 0x55a7290f5907 in flatview_write_continue /home/elmarco/src/qq/exec.c:3255
    #12 0x55a7290f5ceb in flatview_write /home/elmarco/src/qq/exec.c:3294
    #13 0x55a7290f6457 in address_space_write /home/elmarco/src/qq/exec.c:3384
    #14 0x55a7290f64a8 in address_space_rw /home/elmarco/src/qq/exec.c:3395
    #15 0x55a72929ecb0 in kvm_handle_io /home/elmarco/src/qq/accel/kvm/kvm-all.c:1729
    #16 0x55a7292a0db5 in kvm_cpu_exec /home/elmarco/src/qq/accel/kvm/kvm-all.c:1969
    #17 0x55a7291c4212 in qemu_kvm_cpu_thread_fn /home/elmarco/src/qq/cpus.c:1215
    #18 0x55a72a966a6c in qemu_thread_start /home/elmarco/src/qq/util/qemu-thread-posix.c:504
    #19 0x7f06ed486593 in start_thread (/lib64/libpthread.so.0+0x7593)

Move the qemu_sglist_destroy() from megasas_complete_command() to
megasas_unmap_frame(), so map/unmap are balanced.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180814141247.32336-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

tests: fix bdrv-drain leak
Spotted by ASAN:

=================================================================
==5378==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 65536 byte(s) in 1 object(s) allocated from:
    #0 0x7f788f83bc48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7f788c9923c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x5622a1fe37bc in coroutine_trampoline /home/elmarco/src/qq/util/coroutine-ucontext.c:116
    #3 0x7f788a15d75f in __correctly_grouped_prefixwc (/lib64/libc.so.6+0x4c75f)

(Broken in commit 4c8158e.)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180809114417.28718-3-marcandre.lureau@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

Fix a deadlock case in the CPU hotplug flow
We need to set cs->halted to 1 before calling ppc_set_compat. The reason
is that ppc_set_compat kicks up the new thread created to manage the
hotplugged KVM virtual CPU and the code drives directly to KVM_RUN
ioctl. When cs->halted is 1, the code:

int kvm_cpu_exec(CPUState *cpu)
...
     if (kvm_arch_process_async_events(cpu)) {
         atomic_set(&cpu->exit_request, 0);
         return EXCP_HLT;
     }
...

returns before it reaches KVM_RUN, giving time to the main thread to
finish its job. Otherwise we can fall in a deadlock because the KVM
thread will issue the KVM_RUN ioctl while the main thread is setting up
KVM registers. Depending on how these jobs are scheduled we'll end up
freezing QEMU.

The following output shows kvm_vcpu_ioctl sleeping because it cannot get
the mutex and never will.
PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr.

STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL

PID: 61564  TASK: c000003e981e0780  CPU: 48  COMMAND: "qemu-system-ppc"
 #0 [c000003e982679a0] __schedule at c000000000b10a44
 #1 [c000003e98267a60] schedule at c000000000b113a8
 #2 [c000003e98267a90] schedule_preempt_disabled at c000000000b11910
 #3 [c000003e98267ab0] __mutex_lock at c000000000b132ec
 #4 [c000003e98267bc0] kvm_vcpu_ioctl at c00800000ea03140 [kvm]
 #5 [c000003e98267d20] do_vfs_ioctl at c000000000407d30
 #6 [c000003e98267dc0] ksys_ioctl at c000000000408674
 #7 [c000003e98267e10] sys_ioctl at c0000000004086f8
 #8 [c000003e98267e30] system_call at c00000000000b488

crash> struct -x kvm.vcpus 0xc000003da0000000
vcpus = {0xc000003db4880000, 0xc000003d52b80000, 0xc0000039e9c80000, 0xc000003d0e200000, 0xc000003d58280000, 0x0, 0x0, ...}

crash> struct -x kvm_vcpu.mutex.owner 0xc000003d58280000
  mutex.owner = {
    counter = 0xc000003a23a5c881 <- flag 1: waiters
  },

crash> bt 0xc000003a23a5c880
PID: 61579  TASK: c000003a23a5c880  CPU: 9   COMMAND: "CPU 4/KVM"
(active)

crash> struct -x kvm_vcpu.mutex.wait_list 0xc000003d58280000
  mutex.wait_list = {
    next = 0xc000003e98267b10,
    prev = 0xc000003e98267b10
  },

crash> struct -x mutex_waiter.task 0xc000003e98267b10
  task = 0xc000003e981e0780

The following command-line was used to reproduce the problem (note: gdb
and trace can change the results).

 $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \
     -enable-kvm -m 4096 \
     -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \
     -display none -nographic \
     -drive file=disk1.qcow2,format=qcow2
 ...
 (qemu) device_add host-spapr-cpu-core,core-id=4
[no interaction is possible after it, only SIGKILL to take the terminal
back]

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

hmp: fix migrate status timer leak
Spotted by ASAN doing some manual testing:

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f5fcdc75e50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f5fcd47241d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
    #2 0x55f989be92ce in timer_new /home/elmarco/src/qq/include/qemu/timer.h:561
    #3 0x55f989be92ff in timer_new_ms /home/elmarco/src/qq/include/qemu/timer.h:630
    #4 0x55f989c0219d in hmp_migrate /home/elmarco/src/qq/hmp.c:2038
    #5 0x55f98955927b in handle_hmp_command /home/elmarco/src/qq/monitor.c:3498
    #6 0x55f98955fb8c in monitor_command_cb /home/elmarco/src/qq/monitor.c:4371
    #7 0x55f98ad40f11 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:393
    #8 0x55f98955fa4f in monitor_read /home/elmarco/src/qq/monitor.c:4354
    #9 0x55f98aae30d7 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:175
    #10 0x55f98aae317a in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:187
    #11 0x55f98aae940c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
    #12 0x55f98ab63018 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
    #13 0x7f5fcd46c8ac in g_main_dispatch gmain.c:3177

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180901134652.25884-1-marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

Sergio Lopez Kevin Wolf
block/linux-aio: acquire AioContext before qemu_laio_process_completions
In qemu_laio_process_completions_and_submit, the AioContext is acquired
before the ioq_submit iteration and after qemu_laio_process_completions,
but the latter is not thread safe either.

This change avoids a number of random crashes when the Main Thread and
an IO Thread collide processing completions for the same AioContext.
This is an example of such crash:

 - The IO Thread is trying to acquire the AioContext at aio_co_enter,
   which evidences that it didn't lock it before:

Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)):
 #0  0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0
 #2  0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0)
    at ../nptl/pthread_mutex_lock.c:114
 #3  0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66
 #4  0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493
 #5  0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478
 #6  0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 #8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 #9  0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406
 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true)
    at util/aio-posix.c:693
 #11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64
 #12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308
 #13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

 - The Main Thread is also processing completions from the same
   AioContext, and crashes due to failed assertion at util/iov.c:78:

Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)):
 #0  0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #1  0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90
 #2  0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset")
    at assert.c:92
 #3  0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101
 #4  0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78
 #5  0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410
 #6  0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 #8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 #9  0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272
 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497
 #11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574
 #12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false)
    at util/aio-posix.c:604
 #13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273
 #14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390
 #15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590
 #16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774
 #17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401
 #18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449
 #19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080)
    at block/commit.c:92
 #20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973
 #21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90
 #22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118
 #23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436
 #24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
 #25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201
 #26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854
 #27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215
 #28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
 #29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
 #30 0x00005631fbd81412 in main_loop () at vl.c:1866
 #31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4647

 - A closer examination shows that s->io_q.in_flight appears to have
   gone backwards:

(gdb) frame 7
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
222	            qemu_laio_process_completion(laiocb);
(gdb) p s
$2 = (LinuxAioState *) 0x7fdfc0297670
(gdb) p *s
$3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0,
    in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0,
      sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241}
(gdb) p/x s->io_q.in_flight
$4 = 0xfffffff0

Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

migration: fix QEMUFile leak
Spotted by ASAN while running:

$ tests/migration-test -p /x86_64/migration/postcopy/recovery

=================================================================
==18034==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 33864 byte(s) in 1 object(s) allocated from:
    #0 0x7f3da7f31e50 in calloc (/lib64/libasan.so.5+0xeee50)
    #1 0x7f3da644441d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5241d)
    #2 0x55af9db15440 in qemu_fopen_channel_input /home/elmarco/src/qemu/migration/qemu-file-channel.c:183
    #3 0x55af9db15413 in channel_get_output_return_path /home/elmarco/src/qemu/migration/qemu-file-channel.c:159
    #4 0x55af9db0d4ac in qemu_file_get_return_path /home/elmarco/src/qemu/migration/qemu-file.c:78
    #5 0x55af9dad5e4f in open_return_path_on_source /home/elmarco/src/qemu/migration/migration.c:2295
    #6 0x55af9dadb3bf in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3111
    #7 0x55af9dae1bf3 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:91
    #8 0x55af9daddeca in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108
    #9 0x55af9e13d3db in qio_task_complete /home/elmarco/src/qemu/io/task.c:158
    #10 0x55af9e13ca03 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:89
    #11 0x7f3da643b1ca in g_idle_dispatch gmain.c:5535

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180925092245.29565-1-marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

Kevin Wolf
test-replication: Lock AioContext around blk_unref()
Recently, the test case has started failing because some job related
functions want to drop the AioContext lock even though it hasn't been
taken:

    (gdb) bt
    #0  0x00007f51c067c9fb in raise () from /lib64/libc.so.6
    #1  0x00007f51c067e77d in abort () from /lib64/libc.so.6
    #2  0x0000558c9d5dde7b in error_exit (err=<optimized out>, msg=msg@entry=0x558c9d6fe120 <__func__.18373> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
    #3  0x0000558c9d6b5263 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x558c9f3999a0, file=file@entry=0x558c9d6fd36f "util/async.c", line=line@entry=516) at util/qemu-thread-posix.c:96
    #4  0x0000558c9d6b0565 in aio_context_release (ctx=ctx@entry=0x558c9f399940) at util/async.c:516
    #5  0x0000558c9d5eb3da in job_completed_txn_abort (job=0x558c9f68e640) at job.c:738
    #6  0x0000558c9d5eb227 in job_finish_sync (job=0x558c9f68e640, finish=finish@entry=0x558c9d5eb8d0 <job_cancel_err>, errp=errp@entry=0x0) at job.c:986
    #7  0x0000558c9d5eb8ee in job_cancel_sync (job=<optimized out>) at job.c:941
    #8  0x0000558c9d64d853 in replication_close (bs=<optimized out>) at block/replication.c:148
    #9  0x0000558c9d5e5c9f in bdrv_close (bs=0x558c9f41b020) at block.c:3420
    #10 bdrv_delete (bs=0x558c9f41b020) at block.c:3629
    #11 bdrv_unref (bs=0x558c9f41b020) at block.c:4685
    #12 0x0000558c9d62a3f3 in blk_remove_bs (blk=blk@entry=0x558c9f42a7c0) at block/block-backend.c:783
    #13 0x0000558c9d62a667 in blk_delete (blk=0x558c9f42a7c0) at block/block-backend.c:402
    #14 blk_unref (blk=0x558c9f42a7c0) at block/block-backend.c:457
    #15 0x0000558c9d5dfcea in test_secondary_stop () at tests/test-replication.c:478
    #16 0x00007f51c1f13178 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    #17 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    #18 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
    #19 0x00007f51c1f13552 in g_test_run_suite () from /lib64/libglib-2.0.so.0
    #20 0x00007f51c1f13571 in g_test_run () from /lib64/libglib-2.0.so.0
    #21 0x0000558c9d5de31f in main (argc=<optimized out>, argv=<optimized out>) at tests/test-replication.c:581

It is yet unclear whether this should really be considered a bug in the
test case or whether blk_unref() should work for callers that haven't
taken the AioContext lock, but in order to fix the build tests quickly,
just take the AioContext lock around blk_unref().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

chardev: avoid crash if no associated address
A socket chardev may not have associated address (when adding client
fd manually for example). But on disconnect, updating socket filename
expects an address and may lead to this crash:

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  0x0000555555d8c70c in SocketAddress_to_str (prefix=0x555556043062 "disconnected:", addr=0x0, is_listen=false, is_telnet=false) at /home/elmarco/src/qq/chardev/char-socket.c:388
  388	    switch (addr->type) {
  (gdb) bt
  #0  0x0000555555d8c70c in SocketAddress_to_str (prefix=0x555556043062 "disconnected:", addr=0x0, is_listen=false, is_telnet=false) at /home/elmarco/src/qq/chardev/char-socket.c:388
  #1  0x0000555555d8c8aa in update_disconnected_filename (s=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:419
  #2  0x0000555555d8c959 in tcp_chr_disconnect (chr=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:438
  #3  0x0000555555d8cba1 in tcp_chr_hup (channel=0x555556b75690, cond=G_IO_HUP, opaque=0x555556b1ed00) at /home/elmarco/src/qq/chardev/char-socket.c:482
  #4  0x0000555555da596e in qio_channel_fd_source_dispatch (source=0x555556bb68b0, callback=0x555555d8cb58 <tcp_chr_hup>, user_data=0x555556b1ed00) at /home/elmarco/src/qq/io/channel-watch.c:84

Replace filename with a generic "disconnected:socket" in this case.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Oct 17, 2018

tests/check-qjson: fix a leak
Spotted by ASAN:
=================================================================
==11893==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1120 byte(s) in 28 object(s) allocated from:
    #0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5)
    #2 0x559e708b56a4 in qstring_from_str /home/elmarco/src/qq/qobject/qstring.c:66
    #3 0x559e708b4fe0 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:23
    #4 0x559e708bda7d in parse_string /home/elmarco/src/qq/qobject/json-parser.c:143
    #5 0x559e708c1009 in parse_literal /home/elmarco/src/qq/qobject/json-parser.c:484
    #6 0x559e708c1627 in parse_value /home/elmarco/src/qq/qobject/json-parser.c:547
    #7 0x559e708c1c67 in json_parser_parse /home/elmarco/src/qq/qobject/json-parser.c:573
    #8 0x559e708bc0ff in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:92
    #9 0x559e708d1655 in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:292
    #10 0x559e708d1fe1 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:339
    #11 0x559e708bc856 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:121
    #12 0x559e708b8b4b in qobject_from_jsonv /home/elmarco/src/qq/qobject/qjson.c:69
    #13 0x559e708b8d02 in qobject_from_json /home/elmarco/src/qq/qobject/qjson.c:83
    #14 0x559e708a74ae in from_json_str /home/elmarco/src/qq/tests/check-qjson.c:30
    #15 0x559e708a9f83 in utf8_string /home/elmarco/src/qq/tests/check-qjson.c:781
    #16 0x7fd05101bc49 in test_case_run gtestutils.c:2255
    #17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180901211917.10372-1-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Nov 21, 2018

vl: Avoid crash when -mon is underspecified
A quick coredump on an incomplete command line:
./x86_64-softmmu/qemu-system-x86_64 -mon mode=control,pretty=on

 #0  0x00007ffff723d9e4 in g_str_hash () at /lib64/libglib-2.0.so.0
 #1  0x00007ffff723ce38 in g_hash_table_lookup () at /lib64/libglib-2.0.so.0
 #2  0x0000555555cc0073 in object_class_property_find (klass=0x5555566a94b0, name=0x0, errp=0x0) at qom/object.c:1135
 #3  0x0000555555cc004b in object_class_property_find (klass=0x5555566a9440, name=0x0, errp=0x0) at qom/object.c:1129
 #4  0x0000555555cbfe6e in object_property_find (obj=0x5555568348c0, name=0x0, errp=0x0) at qom/object.c:1080
 #5  0x0000555555cc183d in object_resolve_path_component (parent=0x5555568348c0, part=0x0) at qom/object.c:1762
 #6  0x0000555555d82071 in qemu_chr_find (name=0x0) at chardev/char.c:802
 #7  0x00005555559d77cb in mon_init_func (opaque=0x0, opts=0x5555566b65a0, errp=0x0) at vl.c:2291

Fix it to instead fail gracefully.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181023213600.364086-1-eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

palmer-dabbelt pushed a commit that referenced this issue Nov 28, 2018

9p: fix QEMU crash when renaming files
When using the 9P2000.u version of the protocol, the following shell
command line in the guest can cause QEMU to crash:

    while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done

With 9P2000.u, file renaming is handled by the WSTAT command. The
v9fs_wstat() function calls v9fs_complete_rename(), which calls
v9fs_fix_path() for every fid whose path is affected by the change.
The involved calls to v9fs_path_copy() may race with any other access
to the fid path performed by some worker thread, causing a crash like
shown below:

Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0,
 flags=65536, mode=0) at hw/9pfs/9p-local.c:59
59          while (*path && fd != -1) {
(gdb) bt
#0  0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8,
 path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59
#1  0x0000555555a25e0c in local_opendir_nofollow (fs_ctx=0x555557d958b8,
 path=0x0) at hw/9pfs/9p-local.c:92
#2  0x0000555555a261b8 in local_lstat (fs_ctx=0x555557d958b8,
 fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185
#3  0x0000555555a2b367 in v9fs_co_lstat (pdu=0x555557d97498,
 path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53
#4  0x0000555555a1e9e2 in v9fs_stat (opaque=0x555557d97498)
 at hw/9pfs/9p.c:1083
#5  0x0000555555e060a2 in coroutine_trampoline (i0=-669165424, i1=32767)
 at util/coroutine-ucontext.c:116
#6  0x00007fffef4f5600 in __start_context () at /lib64/libc.so.6
#7  0x0000000000000000 in  ()
(gdb)

The fix is to take the path write lock when calling v9fs_complete_rename(),
like in v9fs_rename().

Impact:  DoS triggered by unprivileged guest users.

Fixes: CVE-2018-19489
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment