-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Illegal instruction 2f0a when running msgmerge #2
Comments
2f0a is a "move.l %a2, #imm" which uses an invalid destination address mode. (#imm). So it is really invalid. The problem is surely before. Could you run, something like: QEMU_LOG=in_asm,op QEMU_LOG_FILENAME=/qemu.log /usr/bin//msgmerge --add-location=file ar.po ../build/po/apt.pot -o ../build/po/domains/apt/ar.po |
I haven't had the time yet to re-test this and provide the necessary debug information, however, I just noticed that - without changing anything - apt now builds fine on m68k while it now fails the same way on sh4 it previously failed on m68k. Current m68k:
Current sh4:
Fails with: Generating ../build/po/domains/apt/gl.po echo ../build/po/domains/apt/gl.po : gl.po ../build/po/apt.pot > ../build/po/apt_gl.po.d As you can see in the build log on sh4, msgmerge is called multiple times before in similar fashion without failing before apparently the thread just dies. Looks very similar to the issue with the signals with previously had on m68k but it has to be something different as this was already fixed. I guess it will just fail on m68k during the next build. |
Alright, as Michael Karcher has figured out, this is simply register_m68k_insns not being thread-safe and just adding "if(opcode_table[0] != NULL) return;" to the top of this function makes sure that the opcode table is only built once and not destroyed when multiple threads are involved. I have added this change to the qemu-m68k instances running on my buildds and fixes the build failures for many packages including exim4, cups and others. |
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
I think we can close this issue, can't we? The main culprit was the opcode table being destroyed on context switch and that was fixed. Although it's not been upstreamed yet. |
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
"name" is freed after visiting options, instead use the first NetClientState name. Adds a few assert() for clarifying and checking some impossible states. READ of size 1 at 0x602000000990 thread T0 #0 0x7f6b251c570c (/lib64/libasan.so.2+0x4770c) #1 0x5566dc380600 in qemu_find_net_clients_except net/net.c:824 #2 0x5566dc39bac7 in net_vhost_user_event net/vhost-user.c:193 #3 0x5566dbee862a in qemu_chr_be_event /home/elmarco/src/qemu/qemu-char.c:201 #4 0x5566dbef2890 in tcp_chr_disconnect /home/elmarco/src/qemu/qemu-char.c:2790 #5 0x5566dbef2d0b in tcp_chr_sync_read /home/elmarco/src/qemu/qemu-char.c:2835 #6 0x5566dbee8a99 in qemu_chr_fe_read_all /home/elmarco/src/qemu/qemu-char.c:295 #7 0x5566dc39b964 in net_vhost_user_watch net/vhost-user.c:180 #8 0x5566dc5a06c7 in qio_channel_fd_source_dispatch io/channel-watch.c:70 #9 0x7f6b1aa2ab87 in g_main_dispatch /home/elmarco/src/gnome/glib/glib/gmain.c:3154 #10 0x7f6b1aa2b9cb in g_main_context_dispatch /home/elmarco/src/gnome/glib/glib/gmain.c:3769 #11 0x5566dc475ed4 in glib_pollfds_poll /home/elmarco/src/qemu/main-loop.c:212 #12 0x5566dc476029 in os_host_main_loop_wait /home/elmarco/src/qemu/main-loop.c:257 #13 0x5566dc476165 in main_loop_wait /home/elmarco/src/qemu/main-loop.c:505 #14 0x5566dbf08d31 in main_loop /home/elmarco/src/qemu/vl.c:1932 #15 0x5566dbf16783 in main /home/elmarco/src/qemu/vl.c:4646 #16 0x7f6b180bb57f in __libc_start_main (/lib64/libc.so.6+0x2057f) #17 0x5566dbbf5348 in _start (/home/elmarco/src/qemu/x86_64-softmmu/qemu-system-x86_64+0x3f9348) 0x602000000990 is located 0 bytes inside of 5-byte region [0x602000000990,0x602000000995) freed by thread T0 here: #0 0x7f6b2521666a in __interceptor_free (/lib64/libasan.so.2+0x9866a) #1 0x7f6b1aa332a4 in g_free /home/elmarco/src/gnome/glib/glib/gmem.c:189 #2 0x5566dc5f416f in qapi_dealloc_type_str qapi/qapi-dealloc-visitor.c:134 #3 0x5566dc5f3268 in visit_type_str qapi/qapi-visit-core.c:196 #4 0x5566dc5ced58 in visit_type_Netdev_fields /home/elmarco/src/qemu/qapi-visit.c:5936 #5 0x5566dc5cef71 in visit_type_Netdev /home/elmarco/src/qemu/qapi-visit.c:5960 #6 0x5566dc381a8d in net_visit net/net.c:1049 #7 0x5566dc381c37 in net_client_init net/net.c:1076 #8 0x5566dc3839e2 in net_init_netdev net/net.c:1473 #9 0x5566dc63cc0a in qemu_opts_foreach util/qemu-option.c:1112 #10 0x5566dc383b36 in net_init_clients net/net.c:1499 #11 0x5566dbf15d86 in main /home/elmarco/src/qemu/vl.c:4397 #12 0x7f6b180bb57f in __libc_start_main (/lib64/libc.so.6+0x2057f) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Back in the 2.3.0 release we declared qcow[2] encryption as deprecated, warning people that it would be removed in a future release. commit a1f688f Author: Markus Armbruster <armbru@redhat.com> Date: Fri Mar 13 21:09:40 2015 +0100 block: Deprecate QCOW/QCOW2 encryption The code still exists today, but by a (happy?) accident we entirely broke the ability to use qcow[2] encryption in the system emulators in the 2.4.0 release due to commit 8336aaf Author: Daniel P. Berrange <berrange@redhat.com> Date: Tue May 12 17:09:18 2015 +0100 qcow2/qcow: protect against uninitialized encryption key This commit was designed to prevent future coding bugs which might cause QEMU to read/write data on an encrypted block device in plain text mode before a decryption key is set. It turns out this preventative measure was a little too good, because we already had a long standing bug where QEMU read encrypted data in plain text mode during system emulator startup, in order to guess disk geometry: Thread 10 (Thread 0x7fffd3fff700 (LWP 30373)): #0 0x00007fffe90b1a28 in raise () at /lib64/libc.so.6 #1 0x00007fffe90b362a in abort () at /lib64/libc.so.6 #2 0x00007fffe90aa227 in __assert_fail_base () at /lib64/libc.so.6 #3 0x00007fffe90aa2d2 in () at /lib64/libc.so.6 #4 0x000055555587ae19 in qcow2_co_readv (bs=0x5555562accb0, sector_num=0, remaining_sectors=1, qiov=0x7fffffffd260) at block/qcow2.c:1229 #5 0x000055555589b60d in bdrv_aligned_preadv (bs=bs@entry=0x5555562accb0, req=req@entry=0x7fffd3ffea50, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7fffffffd260, flags=0) at block/io.c:908 #6 0x000055555589b8bc in bdrv_co_do_preadv (bs=0x5555562accb0, offset=0, bytes=512, qiov=0x7fffffffd260, flags=<optimized out>) at block/io.c:999 #7 0x000055555589c375 in bdrv_rw_co_entry (opaque=0x7fffffffd210) at block/io.c:544 #8 0x000055555586933b in coroutine_thread (opaque=0x555557876310) at coroutine-gthread.c:134 #9 0x00007ffff64e1835 in g_thread_proxy (data=0x5555562b5590) at gthread.c:778 #10 0x00007ffff6bb760a in start_thread () at /lib64/libpthread.so.0 #11 0x00007fffe917f59d in clone () at /lib64/libc.so.6 Thread 1 (Thread 0x7ffff7ecab40 (LWP 30343)): #0 0x00007fffe91797a9 in syscall () at /lib64/libc.so.6 #1 0x00007ffff64ff87f in g_cond_wait (cond=cond@entry=0x555555e085f0 <coroutine_cond>, mutex=mutex@entry=0x555555e08600 <coroutine_lock>) at gthread-posix.c:1397 #2 0x00005555558692c3 in qemu_coroutine_switch (co=<optimized out>) at coroutine-gthread.c:117 #3 0x00005555558692c3 in qemu_coroutine_switch (from_=0x5555562b5e30, to_=to_@entry=0x555557876310, action=action@entry=COROUTINE_ENTER) at coroutine-gthread.c:175 #4 0x0000555555868a90 in qemu_coroutine_enter (co=0x555557876310, opaque=0x0) at qemu-coroutine.c:116 #5 0x0000555555859b84 in thread_pool_completion_bh (opaque=0x7fffd40010e0) at thread-pool.c:187 #6 0x0000555555859514 in aio_bh_poll (ctx=ctx@entry=0x5555562953b0) at async.c:85 #7 0x0000555555864d10 in aio_dispatch (ctx=ctx@entry=0x5555562953b0) at aio-posix.c:135 #8 0x0000555555864f75 in aio_poll (ctx=ctx@entry=0x5555562953b0, blocking=blocking@entry=true) at aio-posix.c:291 #9 0x000055555589c40d in bdrv_prwv_co (bs=bs@entry=0x5555562accb0, offset=offset@entry=0, qiov=qiov@entry=0x7fffffffd260, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:591 #10 0x000055555589c503 in bdrv_rw_co (bs=bs@entry=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:614 #11 0x000055555589c562 in bdrv_read_unthrottled (nb_sectors=21845, buf=0x7fffffffd2e0 "\321,", sector_num=0, bs=0x5555562accb0) at block/io.c:622 #12 0x000055555589c562 in bdrv_read_unthrottled (bs=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845) at block/io.c:634 nb_sectors@entry=1) at block/block-backend.c:504 #14 0x0000555555752e9f in guess_disk_lchs (blk=blk@entry=0x5555562a5290, pcylinders=pcylinders@entry=0x7fffffffd52c, pheads=pheads@entry=0x7fffffffd530, psectors=psectors@entry=0x7fffffffd534) at hw/block/hd-geometry.c:68 #15 0x0000555555752ff7 in hd_geometry_guess (blk=0x5555562a5290, pcyls=pcyls@entry=0x555557875d1c, pheads=pheads@entry=0x555557875d20, psecs=psecs@entry=0x555557875d24, ptrans=ptrans@entry=0x555557875d28) at hw/block/hd-geometry.c:133 #16 0x0000555555752b87 in blkconf_geometry (conf=conf@entry=0x555557875d00, ptrans=ptrans@entry=0x555557875d28, cyls_max=cyls_max@entry=65536, heads_max=heads_max@entry=16, secs_max=secs_max@entry=255, errp=errp@entry=0x7fffffffd5e0) at hw/block/block.c:71 #17 0x0000555555799bc4 in ide_dev_initfn (dev=0x555557875c80, kind=IDE_HD) at hw/ide/qdev.c:174 #18 0x0000555555768394 in device_realize (dev=0x555557875c80, errp=0x7fffffffd640) at hw/core/qdev.c:247 #19 0x0000555555769a81 in device_set_realized (obj=0x555557875c80, value=<optimized out>, errp=0x7fffffffd730) at hw/core/qdev.c:1058 #20 0x00005555558240ce in property_set_bool (obj=0x555557875c80, v=<optimized out>, opaque=0x555557875de0, name=<optimized out>, errp=0x7fffffffd730) at qom/object.c:1514 #21 0x0000555555826c87 in object_property_set_qobject (obj=obj@entry=0x555557875c80, value=value@entry=0x55555784bcb0, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/qom-qobject.c:24 #22 0x0000555555825760 in object_property_set_bool (obj=obj@entry=0x555557875c80, value=value@entry=true, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/object.c:905 #23 0x000055555576897b in qdev_init_nofail (dev=dev@entry=0x555557875c80) at hw/core/qdev.c:380 #24 0x0000555555799ead in ide_create_drive (bus=bus@entry=0x555557629630, unit=unit@entry=0, drive=0x5555562b77e0) at hw/ide/qdev.c:122 #25 0x000055555579a746 in pci_ide_create_devs (dev=dev@entry=0x555557628db0, hd_table=hd_table@entry=0x7fffffffd830) at hw/ide/pci.c:440 #26 0x000055555579b165 in pci_piix3_ide_init (bus=<optimized out>, hd_table=0x7fffffffd830, devfn=<optimized out>) at hw/ide/piix.c:218 #27 0x000055555568ca55 in pc_init1 (machine=0x5555562960a0, pci_enabled=1, kvmclock_enabled=<optimized out>) at /home/berrange/src/virt/qemu/hw/i386/pc_piix.c:256 #28 0x0000555555603ab2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4249 So the safety net is correctly preventing QEMU reading cipher text as if it were plain text, during startup and aborting QEMU to avoid bad usage of this data. For added fun this bug only happens if the encrypted qcow2 file happens to have data written to the first cluster, otherwise the cluster won't be allocated and so qcow2 would not try the decryption routines at all, just return all 0's. That no one even noticed, let alone reported, this bug that has shipped in 2.4.0, 2.5.0 and 2.6.0 shows that the number of actual users of encrypted qcow2 is approximately zero. So rather than fix the crash, and backport it to stable releases, just go ahead with what we have warned users about and disable any use of qcow2 encryption in the system emulators. qemu-img/qemu-io/qemu-nbd are still able to access qcow2 encrypted images for the sake of data conversion. In the future, qcow2 will gain support for the alternative luks format, but when this happens it'll be using the '-object secret' infrastructure for getting keys, which avoids this problematic scenario entirely. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch modifies SoftFloat library so that it can be configured in run-time in relation to the meaning of signaling NaN bit, while, at the same time, strictly preserving its behavior on all existing platforms. Background: In floating-point calculations, there is a need for denoting undefined or unrepresentable values. This is achieved by defining certain floating-point numerical values to be NaNs (which stands for "not a number"). For additional reasons, virtually all modern floating-point unit implementations use two kinds of NaNs: quiet and signaling. The binary representations of these two kinds of NaNs, as a rule, differ only in one bit (that bit is, traditionally, the first bit of mantissa). Up to 2008, standards for floating-point did not specify all details about binary representation of NaNs. More specifically, the meaning of the bit that is used for distinguishing between signaling and quiet NaNs was not strictly prescribed. (IEEE 754-2008 was the first floating-point standard that defined that meaning clearly, see [1], p. 35) As a result, different platforms took different approaches, and that presented considerable challenge for multi-platform emulators like QEMU. Mips platform represents the most complex case among QEMU-supported platforms regarding signaling NaN bit. Up to the Release 6 of Mips architecture, "1" in signaling NaN bit denoted signaling NaN, which is opposite to IEEE 754-2008 standard. From Release 6 on, Mips architecture adopted IEEE standard prescription, and "0" denotes signaling NaN. On top of that, Mips architecture for SIMD (also known as MSA, or vector instructions) also specifies signaling bit in accordance to IEEE standard. MSA unit can be implemented with both pre-Release 6 and Release 6 main processor units. QEMU uses SoftFloat library to implement various floating-point-related instructions on all platforms. The current QEMU implementation allows for defining meaning of signaling NaN bit during build time, and is implemented via preprocessor macro called SNAN_BIT_IS_ONE. On the other hand, the change in this patch enables SoftFloat library to be configured in run-time. This configuration is meant to occur during CPU initialization, at the moment when it is definitely known what desired behavior for particular CPU (or any additional FPUs) is. The change is implemented so that it is consistent with existing implementation of similar cases. This means that structure float_status is used for passing the information about desired signaling NaN bit on each invocation of SoftFloat functions. The additional field in float_status is called snan_bit_is_one, which supersedes macro SNAN_BIT_IS_ONE. IMPORTANT: This change is not meant to create any change in emulator behavior or functionality on any platform. It just provides the means for SoftFloat library to be used in a more flexible way - in other words, it will just prepare SoftFloat library for usage related to Mips platform and its specifics regarding signaling bit meaning, which is done in some of subsequent patches from this series. Further break down of changes: 1) Added field snan_bit_is_one to the structure float_status, and correspondent setter function set_snan_bit_is_one(). 2) Constants <float16|float32|float64|floatx80|float128>_default_nan (used both internally and externally) converted to functions <float16|float32|float64|floatx80|float128>_default_nan(float_status*). This is necessary since they are dependent on signaling bit meaning. At the same time, for the sake of code cleanup and simplicity, constants <floatx80|float128>_default_nan_<low|high> (used only internally within SoftFloat library) are removed, as not needed. 3) Added a float_status* argument to SoftFloat library functions XXX_is_quiet_nan(XXX a_), XXX_is_signaling_nan(XXX a_), XXX_maybe_silence_nan(XXX a_). This argument must be present in order to enable correct invocation of new version of functions XXX_default_nan(). (XXX is <float16|float32|float64|floatx80|float128> here) 4) Updated code for all platforms to reflect changes in SoftFloat library. This change is twofolds: it includes modifications of SoftFloat library functions invocations, and an addition of invocation of function set_snan_bit_is_one() during CPU initialization, with arguments that are appropriate for each particular platform. It was established that all platforms zero their main CPU data structures, so snan_bit_is_one(0) in appropriate places is not added, as it is not needed. [1] "IEEE Standard for Floating-Point Arithmetic", IEEE Computer Society, August 29, 2008. Signed-off-by: Thomas Schwinge <thomas@codesourcery.com> Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com> Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com> Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Reviewed-by: Leon Alrae <leon.alrae@imgtec.com> Tested-by: Leon Alrae <leon.alrae@imgtec.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [leon.alrae@imgtec.com: * cherry-picked 2 chunks from patch #2 to fix compilation warnings] Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
It turns out qemu is calling exit() in various places from various threads without taking much care of resources state. The atexit() cleanup handlers cannot easily destroy resources that are in use (by the same thread or other). Since c1111a2, TCG arm guests run into the following abort() when running tests, the chardev mutex is locked during the write, so qemu_mutex_destroy() returns an error: #0 0x00007fffdbb806f5 in raise () at /lib64/libc.so.6 #1 0x00007fffdbb822fa in abort () at /lib64/libc.so.6 #2 0x00005555557616fe in error_exit (err=<optimized out>, msg=msg@entry=0x555555c38c30 <__func__.14622> "qemu_mutex_destroy") at /home/drjones/code/qemu/util/qemu-thread-posix.c:39 #3 0x0000555555b0be20 in qemu_mutex_destroy (mutex=mutex@entry=0x5555566aa0e0) at /home/drjones/code/qemu/util/qemu-thread-posix.c:57 #4 0x00005555558aab00 in qemu_chr_free_common (chr=0x5555566aa0e0) at /home/drjones/code/qemu/qemu-char.c:4029 #5 0x00005555558b05f9 in qemu_chr_delete (chr=<optimized out>) at /home/drjones/code/qemu/qemu-char.c:4038 #6 0x00005555558b05f9 in qemu_chr_delete (chr=<optimized out>) at /home/drjones/code/qemu/qemu-char.c:4044 #7 0x00005555558b062c in qemu_chr_cleanup () at /home/drjones/code/qemu/qemu-char.c:4557 #8 0x00007fffdbb851e8 in __run_exit_handlers () at /lib64/libc.so.6 #9 0x00007fffdbb85235 in () at /lib64/libc.so.6 #10 0x00005555558d1b39 in testdev_write (testdev=0x5555566aa0a0) at /home/drjones/code/qemu/backends/testdev.c:71 #11 0x00005555558d1b39 in testdev_write (chr=<optimized out>, buf=0x7fffc343fd9a "", len=0) at /home/drjones/code/qemu/backends/testdev.c:95 #12 0x00005555558adced in qemu_chr_fe_write (s=0x5555566aa0e0, buf=buf@entry=0x7fffc343fd98 "0q", len=len@entry=2) at /home/drjones/code/qemu/qemu-char.c:282 Instead of using a atexit() handler, only run the chardev cleanup as initially proposed at the end of main(), where there are less chances (hic) of conflicts or other races. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reported-by: Andrew Jones <drjones@redhat.com> Message-Id: <20160704153823.16879-1-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch allows to manage instructions like "fcmpd #2.2, %fp0". Original function manages only data accessed through an address register. Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Not checking this can lead to invalid dev->vdev member access in vhost_device_iotlb_miss if backend issue an iotlb message in a bad timing, either maliciously or by a bug. Reproduced rebooting a guest with testpmd in txonly forward mode. #0 0x0000559ffff94394 in vhost_device_iotlb_miss ( dev=dev@entry=0x55a0012f6680, iova=10245279744, write=1) at ../hw/virtio/vhost.c:1013 #1 0x0000559ffff9ac31 in vhost_backend_handle_iotlb_msg ( imsg=0x7ffddcfd32c0, dev=0x55a0012f6680) at ../hw/virtio/vhost-backend.c:411 #2 vhost_backend_handle_iotlb_msg (dev=dev@entry=0x55a0012f6680, imsg=imsg@entry=0x7ffddcfd32c0) at ../hw/virtio/vhost-backend.c:404 #3 0x0000559fffeded7b in slave_read (opaque=0x55a0012f6680) at ../hw/virtio/vhost-user.c:1464 #4 0x000055a0000c541b in aio_dispatch_handler ( ctx=ctx@entry=0x55a0010a2120, node=0x55a0012d9e00) at ../util/aio-posix.c:329 Fixes: 020e571 ("vhost: rework IOTLB messaging") Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Message-Id: <20210129090728.831208-1-eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit v5.2.0-190-g0546c0609c ("vl: split various early command line options to a separate function") moved the trace backend init code to the qemu_process_early_options(). Which is now being called before os_daemonize() via qemu_maybe_daemonize(). Turns out that this change of order causes a problem when executing QEMU in daemon mode and with CONFIG_TRACE_SIMPLE. The trace thread is now being created by the parent, and the parent is left waiting for a trace file flush that was registered via st_init(). The result is that the parent process never exits. To reproduce, fire up a QEMU process with -daemonize and with CONFIG_TRACE_SIMPLE enabled. Two QEMU process will be left in the host: $ sudo ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults \ -nographic -machine none,accel=kvm:tcg -daemonize $ ps axf | grep qemu 529710 pts/3 S+ 0:00 | \_ grep --color=auto qemu 529697 ? Ssl 0:00 \_ ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -daemonize 529699 ? Sl 0:00 \_ ./x86_64-softmmu/qemu-system-x86_64 -S -no-user-config -nodefaults -nographic -machine none,accel=kvm:tcg -daemonize The parent thread is hang in flush_trace_file: $ sudo gdb ./x86_64-softmmu/qemu-system-x86_64 529697 (..) (gdb) bt #0 0x00007f9dac6a137d in syscall () at /lib64/libc.so.6 #1 0x00007f9dacc3c4f3 in g_cond_wait () at /lib64/libglib-2.0.so.0 #2 0x0000555d12f952da in flush_trace_file (wait=true) at ../trace/simple.c:140 #3 0x0000555d12f95b4c in st_flush_trace_buffer () at ../trace/simple.c:383 #4 0x00007f9dac5e43a7 in __run_exit_handlers () at /lib64/libc.so.6 #5 0x00007f9dac5e4550 in on_exit () at /lib64/libc.so.6 #6 0x0000555d12d454de in os_daemonize () at ../os-posix.c:255 #7 0x0000555d12d0bd5c in qemu_maybe_daemonize (pid_file=0x0) at ../softmmu/vl.c:2408 #8 0x0000555d12d0e566 in qemu_init (argc=8, argv=0x7fffc594d9b8, envp=0x7fffc594da00) at ../softmmu/vl.c:3459 #9 0x0000555d128edac1 in main (argc=8, argv=0x7fffc594d9b8, envp=0x7fffc594da00) at ../softmmu/main.c:49 (gdb) Aside from the 'zombie' process in the host, this is directly impacting Libvirt. Libvirt waits for the parent process to exit to be sure that the QMP monitor is available in the daemonized process to fetch QEMU capabilities, and as is now Libvirt hangs at daemon start waiting for the parent thread to exit. The fix is simple: just move the trace backend related code back to be executed after daemonizing. Fixes: 0546c06 Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20210105181437.538366-2-danielhb413@gmail.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We can't know the caller read enough data in the memory pointed by ext_hdr to cast it as a ip6_ext_hdr_routing. Declare rt_hdr on the stack and fill it again from the iovec. Since we already checked there is enough data in the iovec buffer, simply add an assert() call to consume the bytes_read variable. This fix a 2 bytes buffer overrun in eth_parse_ipv6_hdr() reported by QEMU fuzzer: $ cat << EOF | ./qemu-system-i386 -M pc-q35-5.0 \ -accel qtest -monitor none \ -serial none -nographic -qtest stdio outl 0xcf8 0x80001010 outl 0xcfc 0xe1020000 outl 0xcf8 0x80001004 outw 0xcfc 0x7 write 0x25 0x1 0x86 write 0x26 0x1 0xdd write 0x4f 0x1 0x2b write 0xe1020030 0x4 0x190002e1 write 0xe102003a 0x2 0x0807 write 0xe1020048 0x4 0x12077cdd write 0xe1020400 0x4 0xba077cdd write 0xe1020420 0x4 0x190002e1 write 0xe1020428 0x4 0x3509d807 write 0xe1020438 0x1 0xe2 EOF ================================================================= ==2859770==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdef904902 at pc 0x561ceefa78de bp 0x7ffdef904820 sp 0x7ffdef904818 READ of size 1 at 0x7ffdef904902 thread T0 #0 0x561ceefa78dd in _eth_get_rss_ex_dst_addr net/eth.c:410:17 #1 0x561ceefa41fb in eth_parse_ipv6_hdr net/eth.c:532:17 #2 0x561cef7de639 in net_tx_pkt_parse_headers hw/net/net_tx_pkt.c:228:14 #3 0x561cef7dbef4 in net_tx_pkt_parse hw/net/net_tx_pkt.c:273:9 #4 0x561ceec29f22 in e1000e_process_tx_desc hw/net/e1000e_core.c:730:29 #5 0x561ceec28eac in e1000e_start_xmit hw/net/e1000e_core.c:927:9 #6 0x561ceec1baab in e1000e_set_tdt hw/net/e1000e_core.c:2444:9 #7 0x561ceebf300e in e1000e_core_write hw/net/e1000e_core.c:3256:9 #8 0x561cef3cd4cd in e1000e_mmio_write hw/net/e1000e.c:110:5 Address 0x7ffdef904902 is located in stack of thread T0 at offset 34 in frame #0 0x561ceefa320f in eth_parse_ipv6_hdr net/eth.c:486 This frame has 1 object(s): [32, 34) 'ext_hdr' (line 487) <== Memory access at offset 34 overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: stack-buffer-overflow net/eth.c:410:17 in _eth_get_rss_ex_dst_addr Shadow bytes around the buggy address: 0x10003df188d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df188e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df188f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df18900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df18910: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 =>0x10003df18920:[02]f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df18930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df18940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df18950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df18960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10003df18970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Stack left redzone: f1 Stack right redzone: f3 ==2859770==ABORTING Add the corresponding qtest case with the fuzzer reproducer. FWIW GCC 11 similarly reported: net/eth.c: In function 'eth_parse_ipv6_hdr': net/eth.c:410:15: error: array subscript 'struct ip6_ext_hdr_routing[0]' is partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds] 410 | if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) { | ~~~~~^~~~~~~ net/eth.c:485:24: note: while referencing 'ext_hdr' 485 | struct ip6_ext_hdr ext_hdr; | ^~~~~~~ net/eth.c:410:38: error: array subscript 'struct ip6_ext_hdr_routing[0]' is partly outside array bounds of 'struct ip6_ext_hdr[1]' [-Werror=array-bounds] 410 | if ((rthdr->rtype == 2) && (rthdr->segleft == 1)) { | ~~~~~^~~~~~~~~ net/eth.c:485:24: note: while referencing 'ext_hdr' 485 | struct ip6_ext_hdr ext_hdr; | ^~~~~~~ Cc: qemu-stable@nongnu.org Buglink: https://bugs.launchpad.net/qemu/+bug/1879531 Reported-by: Alexander Bulekov <alxndr@bu.edu> Reported-by: Miroslav Rezanina <mrezanin@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com> Fixes: eb70002 ("net_pkt: Extend packet abstraction as required by e1000e functionality") Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
Incoming enabled bitmaps are busy, because we do bdrv_dirty_bitmap_create_successor() for them. But disabled bitmaps being migrated are not marked busy, and user can remove them during the incoming migration. Then we may crash in cancel_incoming_locked() when try to remove the bitmap that was already removed by user, like this: #0 qemu_mutex_lock_impl (mutex=0x5593d88c50d1, file=0x559680554b20 "../block/dirty-bitmap.c", line=64) at ../util/qemu-thread-posix.c:77 #1 bdrv_dirty_bitmaps_lock (bs=0x5593d88c0ee9) at ../block/dirty-bitmap.c:64 #2 bdrv_release_dirty_bitmap (bitmap=0x5596810e9570) at ../block/dirty-bitmap.c:362 #3 cancel_incoming_locked (s=0x559680be8208 <dbm_state+40>) at ../migration/block-dirty-bitmap.c:918 #4 dirty_bitmap_load (f=0x559681d02b10, opaque=0x559680be81e0 <dbm_state>, version_id=1) at ../migration/block-dirty-bitmap.c:1194 #5 vmstate_load (f=0x559681d02b10, se=0x559680fb5810) at ../migration/savevm.c:908 #6 qemu_loadvm_section_part_end (f=0x559681d02b10, mis=0x559680fb4a30) at ../migration/savevm.c:2473 #7 qemu_loadvm_state_main (f=0x559681d02b10, mis=0x559680fb4a30) at ../migration/savevm.c:2626 #8 postcopy_ram_listen_thread (opaque=0x0) at ../migration/savevm.c:1871 #9 qemu_thread_start (args=0x5596817ccd10) at ../util/qemu-thread-posix.c:521 #10 start_thread () at /lib64/libpthread.so.0 #11 clone () at /lib64/libc.so.6 Note bs pointer taken from bitmap: it's definitely bad aligned. That's because we are in use after free, bitmap is already freed. So, let's make disabled bitmaps (being migrated) busy during incoming migration. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20210322094906.5079-2-vsementsov@virtuozzo.com>
When building with --enable-sanitizers we get: Direct leak of 32 byte(s) in 2 object(s) allocated from: #0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf) #1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958) #2 0x561847f02ca2 in usb_packet_init hw/usb/core.c:531:5 #3 0x561848df4df4 in usb_ehci_init hw/usb/hcd-ehci.c:2575:5 #4 0x561847c119ac in ehci_sysbus_init hw/usb/hcd-ehci-sysbus.c:73:5 #5 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9 #6 0x56184a5bd955 in object_init_with_type qom/object.c:371:9 #7 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5 #8 0x56184a5a24d5 in object_initialize qom/object.c:536:5 #9 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5 #10 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10 #11 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5 #12 0x561849542d18 in npcm7xx_init hw/arm/npcm7xx.c:427:5 Similarly to commit d710e1e ("usb: ehci: fix memory leak in ehci"), fix by calling usb_ehci_finalize() to free the USBPacket. Fixes: 7341ea0 Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210323183701.281152-1-f4bug@amsat.org> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When building with --enable-sanitizers we get: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x5618479ec7cf in malloc (qemu-system-aarch64+0x233b7cf) #1 0x7f675745f958 in g_malloc (/lib64/libglib-2.0.so.0+0x58958) #2 0x561847c2dcc9 in xlnx_dp_init hw/display/xlnx_dp.c:1259:5 #3 0x56184a5bdab8 in object_init_with_type qom/object.c:375:9 #4 0x56184a5a2bda in object_initialize_with_type qom/object.c:517:5 #5 0x56184a5a24d5 in object_initialize qom/object.c:536:5 #6 0x56184a5a2f6c in object_initialize_child_with_propsv qom/object.c:566:5 #7 0x56184a5a2e60 in object_initialize_child_with_props qom/object.c:549:10 #8 0x56184a5a3a1e in object_initialize_child_internal qom/object.c:603:5 #9 0x5618495aa431 in xlnx_zynqmp_init hw/arm/xlnx-zynqmp.c:273:5 The RX/TX FIFOs are created in xlnx_dp_init(), add xlnx_dp_finalize() to destroy them. Fixes: 58ac482 ("introduce xlnx-dp") Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20210323182958.277654-1-f4bug@amsat.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
g_hash_table_add always retains ownership of the pointer passed in as the key. Its return status merely indicates whether the added entry was new, or replaced an existing entry. Thus key must never be freed after this method returns. Spotted by ASAN: ==2407186==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020003ac4f0 at pc 0x7ffff766659c bp 0x7fffffffd1d0 sp 0x7fffffffc980 READ of size 1 at 0x6020003ac4f0 thread T0 #0 0x7ffff766659b (/lib64/libasan.so.6+0x8a59b) #1 0x7ffff6bfa843 in g_str_equal ../glib/ghash.c:2303 #2 0x7ffff6bf8167 in g_hash_table_lookup_node ../glib/ghash.c:493 #3 0x7ffff6bf9b78 in g_hash_table_insert_internal ../glib/ghash.c:1598 #4 0x7ffff6bf9c32 in g_hash_table_add ../glib/ghash.c:1689 #5 0x5555596caad4 in module_load_one ../util/module.c:233 #6 0x5555596ca949 in module_load_one ../util/module.c:225 #7 0x5555596ca949 in module_load_one ../util/module.c:225 #8 0x5555596cbdf4 in module_load_qom_all ../util/module.c:349 Typical C bug... Fixes: 9062912 ("module: use g_hash_table_add()") Cc: qemu-stable@nongnu.org Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210316134456.3243102-1-marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
To: <quintela@redhat.com>, <dgilbert@redhat.com>, <qemu-devel@nongnu.org> CC: Li Zhijian <lizhijian@cn.fujitsu.com> Date: Sat, 31 Jul 2021 22:05:51 +0800 (5 weeks, 4 days, 17 hours ago) multifd with unsupported protocol will cause a segment fault. (gdb) bt #0 0x0000563b4a93faf8 in socket_connect (addr=0x0, errp=0x7f7f02675410) at ../util/qemu-sockets.c:1190 #1 0x0000563b4a797a03 in qio_channel_socket_connect_sync (ioc=0x563b4d16e8c0, addr=0x0, errp=0x7f7f02675410) at ../io/channel-socket.c:145 #2 0x0000563b4a797abf in qio_channel_socket_connect_worker (task=0x563b4cd86c30, opaque=0x0) at ../io/channel-socket.c:168 #3 0x0000563b4a792631 in qio_task_thread_worker (opaque=0x563b4cd86c30) at ../io/task.c:124 #4 0x0000563b4a91da69 in qemu_thread_start (args=0x563b4c44bb80) at ../util/qemu-thread-posix.c:541 #5 0x00007f7fe9b5b3f9 in ?? () #6 0x0000000000000000 in ?? () It's enough to check migrate_multifd_is_allowed() in multifd cleanup() and multifd setup() though there are so many other places using migrate_use_multifd(). Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
PCI resource reserve capability should use LE format as all other PCI things. If we don't then seabios won't boot: === PCI new allocation pass #1 === PCI: check devices PCI: QEMU resource reserve cap: size 10000000000000 type io PCI: secondary bus 1 size 10000000000000 type io PCI: secondary bus 1 size 00200000 type mem PCI: secondary bus 1 size 00200000 type prefmem === PCI new allocation pass #2 === PCI: out of I/O address space This became more important since we started reserving IO by default, previously no one noticed. Fixes: e2a6290 ("hw/pcie-root-port: Fix hotplug for PCI devices requiring IO") Cc: marcel.apfelbaum@gmail.com Fixes: 226263f ("hw/pci: add QEMU-specific PCI capability to the Generic PCI Express Root Port") Cc: zuban32s@gmail.com Fixes: 6755e61 ("hw/pci: add PCI resource reserve capability to legacy PCI bridge") Cc: jing2.liu@linux.intel.com Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
This patch fixed as follows: Thread 1 (Thread 0x7f34ee738d80 (LWP 11212)): #0 __pthread_clockjoin_ex (threadid=139847152957184, thread_return=0x7f30b1febf30, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145 #1 0x0000563401998e36 in qemu_thread_join (thread=0x563402d66610) at util/qemu-thread-posix.c:587 #2 0x00005634017a79fa in process_incoming_migration_co (opaque=0x0) at migration/migration.c:502 #3 0x00005634019b59c9 in coroutine_trampoline (i0=63395504, i1=22068) at util/coroutine-ucontext.c:115 #4 0x00007f34ef860660 in ?? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91 from /lib/x86_64-linux-gnu/libc.so.6 #5 0x00007f30b21ee730 in ?? () #6 0x0000000000000000 in ?? () Thread 13 (Thread 0x7f30b3dff700 (LWP 11747)): #0 __lll_lock_wait (futex=futex@entry=0x56340218ffa0 <qemu_global_mutex>, private=0) at lowlevellock.c:52 #1 0x00007f34efa000a3 in _GI__pthread_mutex_lock (mutex=0x56340218ffa0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80 #2 0x0000563401997f99 in qemu_mutex_lock_impl (mutex=0x56340218ffa0 <qemu_global_mutex>, file=0x563401b7a80e "migration/colo.c", line=806) at util/qemu-thread-posix.c:78 #3 0x0000563401407144 in qemu_mutex_lock_iothread_impl (file=0x563401b7a80e "migration/colo.c", line=806) at /home/workspace/colo-qemu/cpus.c:1899 #4 0x00005634017ba8e8 in colo_process_incoming_thread (opaque=0x563402d664c0) at migration/colo.c:806 #5 0x0000563401998b72 in qemu_thread_start (args=0x5634039f8370) at util/qemu-thread-posix.c:519 #6 0x00007f34ef9fd609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #7 0x00007f34ef924293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 The QEMU main thread is holding the lock: (gdb) p qemu_global_mutex $1 = {lock = {_data = {lock = 2, __count = 0, __owner = 11212, __nusers = 9, __kind = 0, __spins = 0, __elision = 0, __list = {_prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\314+\000\000\t", '\000' <repeats 26 times>, __align = 2}, file = 0x563401c07e4b "util/main-loop.c", line = 240, initialized = true} >From the call trace, we can see it is a deadlock bug. and the QEMU main thread holds the global mutex to wait until the COLO thread ends. and the colo thread wants to acquire the global mutex, which will cause a deadlock. So, we should release the qemu_global_mutex before waiting colo thread ends. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
When trying to use the pc-dimm device on a non-NUMA machine, we get: $ qemu-system-arm -M none -cpu max -S \ -object memory-backend-file,id=mem1,size=1M,mem-path=/tmp/1m \ -device pc-dimm,id=dimm1,memdev=mem1 Segmentation fault (core dumped) (gdb) bt #0 pc_dimm_realize (dev=0x555556da3e90, errp=0x7fffffffcd10) at hw/mem/pc-dimm.c:184 #1 0x0000555555fe1f8f in device_set_realized (obj=0x555556da3e90, value=true, errp=0x7fffffffce18) at hw/core/qdev.c:531 #2 0x0000555555feb4a9 in property_set_bool (obj=0x555556da3e90, v=0x555556e54420, name=0x5555563c3c41 "realized", opaque=0x555556a704f0, errp=0x7fffffffce18) at qom/object.c:2257 To avoid that crash, restrict the pc-dimm NUMA check to machines supporting NUMA, and do not allow the use of 'node' property on non-NUMA machines. Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20211106145016.611332-1-f4bug@amsat.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Without the previous commit, when running 'make check-qtest-i386' with QEMU configured with '--enable-sanitizers' we get: AddressSanitizer:DEADLYSIGNAL ================================================================= ==287878==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000344 ==287878==The signal is caused by a WRITE memory access. ==287878==Hint: address points to the zero page. #0 0x564b2e5bac27 in blk_inc_in_flight block/block-backend.c:1346:5 #1 0x564b2e5bb228 in blk_pwritev_part block/block-backend.c:1317:5 #2 0x564b2e5bcd57 in blk_pwrite block/block-backend.c:1498:11 #3 0x564b2ca1cdd3 in fdctrl_write_data hw/block/fdc.c:2221:17 #4 0x564b2ca1b2f7 in fdctrl_write hw/block/fdc.c:829:9 #5 0x564b2dc49503 in portio_write softmmu/ioport.c:201:9 Add the reproducer for CVE-2021-20196. Suggested-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20211124161536.631563-4-philmd@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>
tl;dr: This allows Big Endian zImage booting via -kernel + x-vof=on. QEMU loads the kernel at 0x400000 by default which works most of the time as Linux kernels are relocatable, 64bit and compiled with "-pie" (position independent code). This works for a little endian zImage too. However a big endian zImage is compiled without -pie, is 32bit, linked to 0x4000000 so current QEMU ends up loading it at 0x4400000 but keeps spapr->kernel_addr unchanged so booting fails. This uses the kernel address returned from load_elf(). If the default kernel_addr is used, there is no change in behavior (as translate_kernel_address() takes care of this), which is: LE/BE vmlinux and LE zImage boot, BE zImage does not. If the VM created with "-machine kernel-addr=0,x-vof=on", then QEMU prints a warning and BE zImage boots. Note #1: SLOF (x-vof=off) still cannot boot a big endian zImage as SLOF enables MSR_SF for everything loaded by QEMU and this leads to early crash of 32bit zImage. Note #2: BE/LE vmlinux images set MSR_SF in early boot so these just work; a LE zImage restores MSR_SF after every CI call and we are lucky enough not to crash before the first CI call. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220504065536.3534488-1-aik@ozlabs.ru> [danielhb: use PRIx64 instead of lx in warn_report] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
the code in pcibus_get_fw_dev_path contained the potential for a stack buffer overflow of 1 byte, potentially writing to the stack an extra NUL byte. This overflow could happen if the PCI slot is >= 0x10000000, and the PCI function is >= 0x10000000, due to the size parameter of snprintf being incorrectly calculated in the call: if (PCI_FUNC(d->devfn)) snprintf(path + off, sizeof(path) + off, ",%x", PCI_FUNC(d->devfn)); since the off obtained from a previous call to snprintf is added instead of subtracted from the total available size of the buffer. Without the accurate size guard from snprintf, we end up writing in the worst case: name (32) + "@" (1) + SLOT (8) + "," (1) + FUNC (8) + term NUL (1) = 51 bytes In order to provide something more robust, replace all of the code in pcibus_get_fw_dev_path with a single call to g_strdup_printf, so there is no need to rely on manual calculations. Found by compiling QEMU with FORTIFY_SOURCE=3 as the error: *** buffer overflow detected ***: terminated Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted. [Switching to Thread 0x7ffff642c380 (LWP 121307)] 0x00007ffff71ff55c in __pthread_kill_implementation () from /lib64/libc.so.6 (gdb) bt #0 0x00007ffff71ff55c in __pthread_kill_implementation () at /lib64/libc.so.6 #1 0x00007ffff71ac6f6 in raise () at /lib64/libc.so.6 #2 0x00007ffff7195814 in abort () at /lib64/libc.so.6 #3 0x00007ffff71f279e in __libc_message () at /lib64/libc.so.6 #4 0x00007ffff729767a in __fortify_fail () at /lib64/libc.so.6 #5 0x00007ffff7295c36 in () at /lib64/libc.so.6 #6 0x00007ffff72957f5 in __snprintf_chk () at /lib64/libc.so.6 #7 0x0000555555b1c1fd in pcibus_get_fw_dev_path () #8 0x0000555555f2bde4 in qdev_get_fw_dev_path_helper.constprop () #9 0x0000555555f2bd86 in qdev_get_fw_dev_path_helper.constprop () #10 0x00005555559a6e5d in get_boot_device_path () #11 0x00005555559a712c in get_boot_devices_list () #12 0x0000555555b1a3d0 in fw_cfg_machine_reset () #13 0x0000555555bf4c2d in pc_machine_reset () #14 0x0000555555c66988 in qemu_system_reset () #15 0x0000555555a6dff6 in qdev_machine_creation_done () #16 0x0000555555c79186 in qmp_x_exit_preconfig.part () #17 0x0000555555c7b459 in qemu_init () #18 0x0000555555960a29 in main () Found-by: Dario Faggioli <Dario Faggioli <dfaggioli@suse.com> Found-by: Martin Liška <martin.liska@suse.com> Cc: qemu-stable@nongnu.org Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20220531114707.18830-1-cfontana@suse.de> Reviewed-by: Ani Sinha <ani@anisinha.ca>
In two places in gdbstub.c we look at gdbserver_state.init to decide whether we're going to do a semihosting syscall via the gdb remote protocol: * when setting up, if the user didn't explicitly select either native semihosting or gdb semihosting, we autoselect, with the intended behaviour "use gdb if gdb is connected" * when the semihosting layer attempts to do a syscall via gdb, we silently ignore it if the gdbstub wasn't actually set up However, if the user's commandline sets up the gdbstub but tells QEMU to start rather than waiting for a GDB to connect (eg using '-s' but not '-S'), then we will have gdbserver_state.init true but no actual connection; an attempt to use gdb syscalls will then crash because we try to use gdbserver_state.c_cpu when it hasn't been set up: #0 0x00007ffff6803ba8 in qemu_cpu_kick (cpu=0x0) at ../../softmmu/cpus.c:457 #1 0x00007ffff6c03913 in gdb_do_syscallv (cb=0x7ffff6c19944 <common_semi_cb>, fmt=0x7ffff7573b7e "", va=0x7ffff56294c0) at ../../gdbstub.c:2946 #2 0x00007ffff6c19c3a in common_semi_gdb_syscall (cs=0x7ffff83fe060, cb=0x7ffff6c19944 <common_semi_cb>, fmt=0x7ffff7573b75 "isatty,%x") at ../../semihosting/arm-compat-semi.c:494 #3 0x00007ffff6c1a064 in gdb_isattyfn (cs=0x7ffff83fe060, gf=0x7ffff86a3690) at ../../semihosting/arm-compat-semi.c:636 #4 0x00007ffff6c1b20f in do_common_semihosting (cs=0x7ffff83fe060) at ../../semihosting/arm-compat-semi.c:967 #5 0x00007ffff693a037 in handle_semihosting (cs=0x7ffff83fe060) at ../../target/arm/helper.c:10316 You can probably also get into this state via some odd corner cases involving connecting a GDB and then telling it to detach from all the vCPUs. Abstract out the test into a new gdb_attached() function which returns true only if there's actually a GDB connected to the debug stub and attached to at least one vCPU. Reported-by: Liviu Ionescu <ilg@livius.net> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Luc Michel <luc@lmichel.fr> Message-id: 20220526190053.521505-2-peter.maydell@linaro.org
There is such error info when running linux kernel: tcg_handle_interrupt: assertion failed: (qemu_mutex_iothread_locked()). calling stack: #0 in raise () at /lib64/libc.so.6 #1 in abort () at /lib64/libc.so.6 #2 in g_assertion_message_expr.cold () at /lib64/libglib-2.0.so.0 #3 in g_assertion_message_expr () at /lib64/libglib-2.0.so.0 #4 in tcg_handle_interrupt (cpu=0x632000030800, mask=2) at ../accel/tcg/tcg-accel-ops.c:79 #5 in cpu_interrupt (cpu=0x632000030800, mask=2) at ../softmmu/cpus.c:248 #6 in loongarch_cpu_set_irq (opaque=0x632000030800, irq=11, level=0) at ../target/loongarch/cpu.c:100 #7 in helper_csrwr_ticlr (env=0x632000039440, val=1) at ../target/loongarch/csr_helper.c:85 #8 in code_gen_buffer () #9 in cpu_tb_exec (cpu=0x632000030800, itb=0x7fff946ac280, tb_exit=0x7ffe4fcb6c30) at ../accel/tcg/cpu-exec.c:358 Add mutex iothread lock around loongarch_cpu_set_irq in csrwr_ticlr() to fix the bug. Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220701093407.2150607-10-yangxiaojuan@loongson.cn> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
As soon as virtio_scsi_data_plane_start() attaches host notifiers the IOThread may start virtqueue processing. There is a race between IOThread virtqueue processing and virtio_scsi_data_plane_start() because it only assigns s->dataplane_started after attaching host notifiers. When a virtqueue handler function in the IOThread calls virtio_scsi_defer_to_dataplane() it may see !s->dataplane_started and attempt to start dataplane even though we're already in the IOThread: #0 0x00007f67b360857c __pthread_kill_implementation (libc.so.6 + 0xa257c) #1 0x00007f67b35bbd56 raise (libc.so.6 + 0x55d56) #2 0x00007f67b358e833 abort (libc.so.6 + 0x28833) #3 0x00007f67b358e75b __assert_fail_base.cold (libc.so.6 + 0x2875b) #4 0x00007f67b35b4cd6 __assert_fail (libc.so.6 + 0x4ecd6) #5 0x000055ca87fd411b memory_region_transaction_commit (qemu-kvm + 0x67511b) #6 0x000055ca87e17811 virtio_pci_ioeventfd_assign (qemu-kvm + 0x4b8811) #7 0x000055ca87e14836 virtio_bus_set_host_notifier (qemu-kvm + 0x4b5836) #8 0x000055ca87f8e14e virtio_scsi_set_host_notifier (qemu-kvm + 0x62f14e) #9 0x000055ca87f8dd62 virtio_scsi_dataplane_start (qemu-kvm + 0x62ed62) #10 0x000055ca87e14610 virtio_bus_start_ioeventfd (qemu-kvm + 0x4b5610) #11 0x000055ca87f8c29a virtio_scsi_handle_ctrl (qemu-kvm + 0x62d29a) #12 0x000055ca87fa5902 virtio_queue_host_notifier_read (qemu-kvm + 0x646902) #13 0x000055ca882c099e aio_dispatch_handler (qemu-kvm + 0x96199e) #14 0x000055ca882c1761 aio_poll (qemu-kvm + 0x962761) #15 0x000055ca880e1052 iothread_run (qemu-kvm + 0x782052) #16 0x000055ca882c562a qemu_thread_start (qemu-kvm + 0x96662a) This patch assigns s->dataplane_started before attaching host notifiers so that virtqueue handler functions that run in the IOThread before virtio_scsi_data_plane_start() returns correctly identify that dataplane does not need to be started. This fix is taken from the virtio-blk dataplane code and it's worth adding a comment in virtio-blk as well to explain why it works. Note that s->dataplane_started does not need the AioContext lock because it is set before attaching host notifiers and cleared after detaching host notifiers. In other words, the IOThread always sees the value true and the main loop thread does not modify it while the IOThread is active. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2099541 Reported-by: Qing Wang <qinwang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220808162134.240405-1-stefanha@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The DMA engine is started by I/O access and then itself accesses the I/O registers, triggering a reentrancy bug. The following log can reveal it: ==5637==ERROR: AddressSanitizer: stack-overflow #0 0x5595435f6078 in tulip_xmit_list_update qemu/hw/net/tulip.c:673 #1 0x5595435f204a in tulip_write qemu/hw/net/tulip.c:805:13 #2 0x559544637f86 in memory_region_write_accessor qemu/softmmu/memory.c:492:5 #3 0x5595446379fa in access_with_adjusted_size qemu/softmmu/memory.c:554:18 #4 0x5595446372fa in memory_region_dispatch_write qemu/softmmu/memory.c #5 0x55954468b74c in flatview_write_continue qemu/softmmu/physmem.c:2825:23 #6 0x559544683662 in flatview_write qemu/softmmu/physmem.c:2867:12 #7 0x5595446833f3 in address_space_write qemu/softmmu/physmem.c:2963:18 #8 0x5595435fb082 in dma_memory_rw_relaxed qemu/include/sysemu/dma.h:87:12 #9 0x5595435fb082 in dma_memory_rw qemu/include/sysemu/dma.h:130:12 #10 0x5595435fb082 in dma_memory_write qemu/include/sysemu/dma.h:171:12 #11 0x5595435fb082 in stl_le_dma qemu/include/sysemu/dma.h:272:1 #12 0x5595435fb082 in stl_le_pci_dma qemu/include/hw/pci/pci.h:910:1 #13 0x5595435fb082 in tulip_desc_write qemu/hw/net/tulip.c:101:9 #14 0x5595435f7e3d in tulip_xmit_list_update qemu/hw/net/tulip.c:706:9 #15 0x5595435f204a in tulip_write qemu/hw/net/tulip.c:805:13 Fix this bug by restricting the DMA engine to memories regions. Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
Setup an ARM virtual machine of machine virt and execute qmp "query-acpi-ospm-status" causes segmentation fault with following dumpstack: #1 0x0000aaaaab64235c in qmp_query_acpi_ospm_status (errp=errp@entry=0xfffffffff030) at ../monitor/qmp-cmds.c:312 #2 0x0000aaaaabfc4e20 in qmp_marshal_query_acpi_ospm_status (args=<optimized out>, ret=0xffffea4ffe90, errp=0xffffea4ffe88) at qapi/qapi-commands-acpi.c:63 #3 0x0000aaaaabff8ba0 in do_qmp_dispatch_bh (opaque=0xffffea4ffe98) at ../qapi/qmp-dispatch.c:128 #4 0x0000aaaaac02e594 in aio_bh_call (bh=0xffffe0004d80) at ../util/async.c:150 #5 aio_bh_poll (ctx=ctx@entry=0xaaaaad0f6040) at ../util/async.c:178 #6 0x0000aaaaac00bd40 in aio_dispatch (ctx=ctx@entry=0xaaaaad0f6040) at ../util/aio-posix.c:421 #7 0x0000aaaaac02e010 in aio_ctx_dispatch (source=0xaaaaad0f6040, callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:320 #8 0x0000fffff76f6884 in g_main_context_dispatch () at /usr/lib64/libglib-2.0.so.0 #9 0x0000aaaaac0452d4 in glib_pollfds_poll () at ../util/main-loop.c:297 #10 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:320 #11 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:596 #12 0x0000aaaaab5c9e50 in qemu_main_loop () at ../softmmu/runstate.c:734 #13 0x0000aaaaab185370 in qemu_main (argc=argc@entry=47, argv=argv@entry=0xfffffffff518, envp=envp@entry=0x0) at ../softmmu/main.c:38 #14 0x0000aaaaab16f99c in main (argc=47, argv=0xfffffffff518) at ../softmmu/main.c:47 Fixes: ebb6207 ("hw/acpi: Add ACPI Generic Event Device Support") Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Message-id: 20220816094957.31700-1-zhukeqian1@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When building QEMU with DEBUG_ATI defined then running with '-device ati-vga,romfile="" -d unimp,guest_errors -trace ati\*' we get: ati_mm_write 4 0x16c0 DP_CNTL <- 0x1 ati_mm_write 4 0x146c DP_GUI_MASTER_CNTL <- 0x2 ati_mm_write 4 0x16c8 DP_MIX <- 0xff0000 ati_mm_write 4 0x16c4 DP_DATATYPE <- 0x2 ati_mm_write 4 0x224 CRTC_OFFSET <- 0x0 ati_mm_write 4 0x142c DST_PITCH_OFFSET <- 0xfe00000 ati_mm_write 4 0x1420 DST_Y <- 0x3fff ati_mm_write 4 0x1410 DST_HEIGHT <- 0x3fff ati_mm_write 4 0x1588 DST_WIDTH_X <- 0x3fff3fff ati_2d_blt: vram:0x7fff5fa00000 addr:0 ds:0x7fff61273800 stride:2560 bpp:32 rop:0xff ati_2d_blt: 0 0 0, 0 127 0, (0,0) -> (16383,16383) 16383x16383 > ^ ati_2d_blt: pixman_fill(dst:0x7fff5fa00000, stride:254, bpp:8, x:16383, y:16383, w:16383, h:16383, xor:0xff000000) Thread 3 "qemu-system-i38" received signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007ffff7f62ce0 in sse2_fill.lto_priv () at /lib64/libpixman-1.so.0 #1 0x00007ffff7f09278 in pixman_fill () at /lib64/libpixman-1.so.0 #2 0x0000555557b5a9af in ati_2d_blt (s=0x631000028800) at hw/display/ati_2d.c:196 #3 0x0000555557b4b5a2 in ati_mm_write (opaque=0x631000028800, addr=5512, data=1073692671, size=4) at hw/display/ati.c:843 #4 0x0000555558b90ec4 in memory_region_write_accessor (mr=0x631000039cc0, addr=5512, ..., size=4, ...) at softmmu/memory.c:492 Commit 584acf3 ("ati-vga: Fix reverse bit blts") introduced the local dst_x and dst_y which adjust the (x, y) coordinates depending on the direction in the SRCCOPY ROP3 operation, but forgot to address the same issue for the PATCOPY, BLACKNESS and WHITENESS operations, which also call pixman_fill(). Fix that now by using the adjusted coordinates in the pixman_fill call, and update the related debug printf(). Reported-by: Qiang Liu <qiangliu@zju.edu.cn> Fixes: 584acf3 ("ati-vga: Fix reverse bit blts") Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Mauro Matteo Cascella <mcascell@redhat.com> Message-Id: <20210906153103.1661195-1-philmd@redhat.com> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Fix warnings such: disas/nanomips.c:3251:64: warning: format specifies type 'char *' but the argument has type 'int64' (aka 'long long') [-Wformat] return img_format("CACHE 0x%" PRIx64 ", %s(%s)", op_value, s_value, rs); ~~ ^~~~~~~ %lld To avoid crashes such (kernel from commit f375ad6): $ qemu-system-mipsel -cpu I7200 -d in_asm -kernel generic_nano32r6el_page4k ... ---------------- IN: __bzero 0x805c6084: 20c4 6950 ADDU r13, a0, a2 0x805c6088: 9089 ADDIU a0, 1 Process 70261 stopped * thread #6, stop reason = EXC_BAD_ACCESS (code=1, address=0xfffffffffffffff0) frame #0: 0x00000001bfe38864 libsystem_platform.dylib`_platform_strlen + 4 libsystem_platform.dylib`: -> 0x1bfe38864 <+4>: ldr q0, [x1] 0x1bfe38868 <+8>: adr x3, #-0xc8 ; ___lldb_unnamed_symbol314 0x1bfe3886c <+12>: ldr q2, [x3], #0x10 0x1bfe38870 <+16>: and x2, x0, #0xf Target 0: (qemu-system-mipsel) stopped. (lldb) bt * thread #6, stop reason = EXC_BAD_ACCESS (code=1, address=0xfffffffffffffff0) * frame #0: 0x00000001bfe38864 libsystem_platform.dylib`_platform_strlen + 4 frame #1: 0x00000001bfce76a0 libsystem_c.dylib`__vfprintf + 4544 frame #2: 0x00000001bfd158b4 libsystem_c.dylib`_vasprintf + 280 frame #3: 0x0000000101c22fb0 libglib-2.0.0.dylib`g_vasprintf + 28 frame #4: 0x0000000101bfb7d8 libglib-2.0.0.dylib`g_strdup_vprintf + 32 frame #5: 0x000000010000fb70 qemu-system-mipsel`img_format(format=<unavailable>) at nanomips.c:103:14 [opt] frame #6: 0x0000000100018868 qemu-system-mipsel`SB_S9_(instruction=<unavailable>, info=<unavailable>) at nanomips.c:12616:12 [opt] frame #7: 0x000000010000f90c qemu-system-mipsel`print_insn_nanomips at nanomips.c:589:28 [opt] Fixes: 4066c15 ("disas/nanomips: Remove IMMEDIATE functions") Reported-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20221101114458.25756-2-philmd@linaro.org>
This is below memleak detected when to quit the qemu-system-x86_64 (with vhost-scsi-pci). (qemu) quit ================================================================= ==15568==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f00aec57917 in __interceptor_calloc (/lib64/libasan.so.6+0xb4917) #1 0x7f00ada0d7b5 in g_malloc0 (/lib64/libglib-2.0.so.0+0x517b5) #2 0x5648ffd38bac in vhost_scsi_start ../hw/scsi/vhost-scsi.c:92 #3 0x5648ffd38d52 in vhost_scsi_set_status ../hw/scsi/vhost-scsi.c:131 #4 0x5648ffda340e in virtio_set_status ../hw/virtio/virtio.c:2036 #5 0x5648ff8de281 in virtio_ioport_write ../hw/virtio/virtio-pci.c:431 #6 0x5648ff8deb29 in virtio_pci_config_write ../hw/virtio/virtio-pci.c:576 #7 0x5648ffe5c0c2 in memory_region_write_accessor ../softmmu/memory.c:493 #8 0x5648ffe5c424 in access_with_adjusted_size ../softmmu/memory.c:555 #9 0x5648ffe6428f in memory_region_dispatch_write ../softmmu/memory.c:1515 #10 0x5648ffe8613d in flatview_write_continue ../softmmu/physmem.c:2825 #11 0x5648ffe86490 in flatview_write ../softmmu/physmem.c:2867 #12 0x5648ffe86d9f in address_space_write ../softmmu/physmem.c:2963 #13 0x5648ffe86e57 in address_space_rw ../softmmu/physmem.c:2973 #14 0x5648fffbfb3d in kvm_handle_io ../accel/kvm/kvm-all.c:2639 #15 0x5648fffc0e0d in kvm_cpu_exec ../accel/kvm/kvm-all.c:2890 #16 0x5648fffc90a7 in kvm_vcpu_thread_fn ../accel/kvm/kvm-accel-ops.c:51 #17 0x56490042400a in qemu_thread_start ../util/qemu-thread-posix.c:505 #18 0x7f00ac3b6ea4 in start_thread (/lib64/libpthread.so.0+0x7ea4) Free the vsc->inflight at the 'stop' path. Fixes: b82526c ("vhost-scsi: support inflight io track") Cc: Joe Jin <joe.jin@oracle.com> Cc: Li Feng <fengli@smartx.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Message-Id: <20230104160433.21353-1-dongli.zhang@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes the appended use-after-free. The root cause is that during tb invalidation we use CPU_FOREACH, and therefore to safely free a vCPU we must wait for an RCU grace period to elapse. $ x86_64-linux-user/qemu-x86_64 tests/tcg/x86_64-linux-user/munmap-pthread ================================================================= ==1800604==ERROR: AddressSanitizer: heap-use-after-free on address 0x62d0005f7418 at pc 0x5593da6704eb bp 0x7f4961a7ac70 sp 0x7f4961a7ac60 READ of size 8 at 0x62d0005f7418 thread T2 #0 0x5593da6704ea in tb_jmp_cache_inval_tb ../accel/tcg/tb-maint.c:244 #1 0x5593da6704ea in do_tb_phys_invalidate ../accel/tcg/tb-maint.c:290 #2 0x5593da670631 in tb_phys_invalidate__locked ../accel/tcg/tb-maint.c:306 #3 0x5593da670631 in tb_invalidate_phys_page_range__locked ../accel/tcg/tb-maint.c:542 #4 0x5593da67106d in tb_invalidate_phys_range ../accel/tcg/tb-maint.c:614 #5 0x5593da6a64d4 in target_munmap ../linux-user/mmap.c:766 #6 0x5593da6dba05 in do_syscall1 ../linux-user/syscall.c:10105 #7 0x5593da6f564c in do_syscall ../linux-user/syscall.c:13329 #8 0x5593da49e80c in cpu_loop ../linux-user/x86_64/../i386/cpu_loop.c:233 #9 0x5593da6be28c in clone_func ../linux-user/syscall.c:6633 #10 0x7f496231cb42 in start_thread nptl/pthread_create.c:442 #11 0x7f49623ae9ff (/lib/x86_64-linux-gnu/libc.so.6+0x1269ff) 0x62d0005f7418 is located 28696 bytes inside of 32768-byte region [0x62d0005f0400,0x62d0005f8400) freed by thread T148 here: #0 0x7f49627b6460 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:52 #1 0x5593da5ac057 in cpu_exec_unrealizefn ../cpu.c:180 #2 0x5593da81f851 (/home/cota/src/qemu/build/qemu-x86_64+0x484851) Signed-off-by: Emilio Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230111151628.320011-2-cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230124180127.1881110-27-alex.bennee@linaro.org>
Fixes this tsan crash, easy to reproduce with any large enough program: $ tests/unit/test-qht 1..2 ThreadSanitizer: CHECK failed: sanitizer_deadlock_detector.h:67 "((n_all_locks_)) < (((sizeof(all_locks_with_contexts_)/sizeof((all_locks_with_contexts_)[0]))))" (0x40, 0x40) (tid=1821568) #0 __tsan::CheckUnwind() ../../../../src/libsanitizer/tsan/tsan_rtl.cpp:353 (libtsan.so.2+0x90034) #1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ../../../../src/libsanitizer/sanitizer_common/sanitizer_termination.cpp:86 (libtsan.so.2+0xca555) #2 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >::addLock(unsigned long, unsigned long, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:67 (libtsan.so.2+0xb3616) #3 __sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >::addLock(unsigned long, unsigned long, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:59 (libtsan.so.2+0xb3616) #4 __sanitizer::DeadlockDetector<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >::onLockAfter(__sanitizer::DeadlockDetectorTLS<__sanitizer::TwoLevelBitVector<1ul, __sanitizer::BasicBitVector<unsigned long> > >*, unsigned long, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector.h:216 (libtsan.so.2+0xb3616) #5 __sanitizer::DD::MutexAfterLock(__sanitizer::DDCallback*, __sanitizer::DDMutex*, bool, bool) ../../../../src/libsanitizer/sanitizer_common/sanitizer_deadlock_detector1.cpp:169 (libtsan.so.2+0xb3616) #6 __tsan::MutexPostLock(__tsan::ThreadState*, unsigned long, unsigned long, unsigned int, int) ../../../../src/libsanitizer/tsan/tsan_rtl_mutex.cpp:200 (libtsan.so.2+0xa3382) #7 __tsan_mutex_post_lock ../../../../src/libsanitizer/tsan/tsan_interface_ann.cpp:384 (libtsan.so.2+0x76bc3) #8 qemu_spin_lock /home/cota/src/qemu/include/qemu/thread.h:259 (test-qht+0x44a97) #9 qht_map_lock_buckets ../util/qht.c:253 (test-qht+0x44a97) #10 do_qht_iter ../util/qht.c:809 (test-qht+0x45f33) #11 qht_iter ../util/qht.c:821 (test-qht+0x45f33) #12 iter_check ../tests/unit/test-qht.c:121 (test-qht+0xe473) #13 qht_do_test ../tests/unit/test-qht.c:202 (test-qht+0xe473) #14 qht_test ../tests/unit/test-qht.c:240 (test-qht+0xe7c1) #15 test_default ../tests/unit/test-qht.c:246 (test-qht+0xe828) #16 <null> <null> (libglib-2.0.so.0+0x7daed) #17 <null> <null> (libglib-2.0.so.0+0x7d80a) #18 <null> <null> (libglib-2.0.so.0+0x7d80a) #19 g_test_run_suite <null> (libglib-2.0.so.0+0x7dfe9) #20 g_test_run <null> (libglib-2.0.so.0+0x7e055) #21 main ../tests/unit/test-qht.c:259 (test-qht+0xd2c6) #22 __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 (libc.so.6+0x29d8f) #23 __libc_start_main_impl ../csu/libc-start.c:392 (libc.so.6+0x29e3f) #24 _start <null> (test-qht+0xdb44) Signed-off-by: Emilio Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230111151628.320011-5-cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20230124180127.1881110-30-alex.bennee@linaro.org>
address_space_map() can fail: uart:~$ hash test sha256_test tv[0]: Segmentation fault: 11 Thread 3 "qemu-system-arm" received signal SIGSEGV, Segmentation fault. gen_acc_mode_iov (req_len=0x7ffff18b7778, id=<optimized out>, iov=0x7ffff18b7780, s=0x555556ce0bd0) at ../hw/misc/aspeed_hace.c:171 171 if (has_padding(s, &iov[id], *req_len, &total_msg_len, &pad_offset)) { (gdb) bt #0 gen_acc_mode_iov (req_len=0x7ffff18b7778, id=<optimized out>, iov=0x7ffff18b7780, s=0x555556ce0bd0) at ../hw/misc/aspeed_hace.c:171 #1 do_hash_operation (s=s@entry=0x555556ce0bd0, algo=3, sg_mode=sg_mode@entry=true, acc_mode=acc_mode@entry=true) at ../hw/misc/aspeed_hace.c:224 #2 0x00005555559bdbb8 in aspeed_hace_write (opaque=<optimized out>, addr=12, data=262488, size=<optimized out>) at ../hw/misc/aspeed_hace.c:358 This change doesn't fix much, but at least the guest can't crash QEMU anymore. Instead it is still usable: uart:~$ hash test sha256_test tv[0]:hash_final error sha384_test tv[0]:hash_final error sha512_test tv[0]:hash_final error [00:00:06.278,000] <err> hace_global: HACE poll timeout [00:00:09.324,000] <err> hace_global: HACE poll timeout [00:00:12.261,000] <err> hace_global: HACE poll timeout uart:~$ Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Delevoryas <peter@pjd.dev> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
After live migration with virtio block device, qemu crash at: #0 0x000055914f46f795 in object_dynamic_cast_assert (obj=0x559151b7b090, typename=0x55914f80fbc4 "qio-channel", file=0x55914f80fb90 "/images/testvfe/sw/qemu.gerrit/include/io/channel.h", line=30, func=0x55914f80fcb8 <__func__.17257> "QIO_CHANNEL") at ../qom/object.c:872 #1 0x000055914f480d68 in QIO_CHANNEL (obj=0x559151b7b090) at /images/testvfe/sw/qemu.gerrit/include/io/channel.h:29 #2 0x000055914f4812f8 in qio_net_listener_set_client_func_full (listener=0x559151b7a720, func=0x55914f580b97 <tcp_chr_accept>, data=0x5591519f4ea0, notify=0x0, context=0x0) at ../io/net-listener.c:166 #3 0x000055914f580059 in tcp_chr_update_read_handler (chr=0x5591519f4ea0) at ../chardev/char-socket.c:637 #4 0x000055914f583dca in qemu_chr_be_update_read_handlers (s=0x5591519f4ea0, context=0x0) at ../chardev/char.c:226 #5 0x000055914f57b7c9 in qemu_chr_fe_set_handlers_full (b=0x559152bf23a0, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=false, sync_state=true) at ../chardev/char-fe.c:279 #6 0x000055914f57b86d in qemu_chr_fe_set_handlers (b=0x559152bf23a0, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=false) at ../chardev/char-fe.c:304 #7 0x000055914f378caf in vhost_user_async_close (d=0x559152bf21a0, chardev=0x559152bf23a0, vhost=0x559152bf2420, cb=0x55914f2fb8c1 <vhost_user_blk_disconnect>) at ../hw/virtio/vhost-user.c:2725 #8 0x000055914f2fba40 in vhost_user_blk_event (opaque=0x559152bf21a0, event=CHR_EVENT_CLOSED) at ../hw/block/vhost-user-blk.c:395 #9 0x000055914f58388c in chr_be_event (s=0x5591519f4ea0, event=CHR_EVENT_CLOSED) at ../chardev/char.c:61 #10 0x000055914f583905 in qemu_chr_be_event (s=0x5591519f4ea0, event=CHR_EVENT_CLOSED) at ../chardev/char.c:81 #11 0x000055914f581275 in char_socket_finalize (obj=0x5591519f4ea0) at ../chardev/char-socket.c:1083 #12 0x000055914f46f073 in object_deinit (obj=0x5591519f4ea0, type=0x5591519055c0) at ../qom/object.c:680 #13 0x000055914f46f0e5 in object_finalize (data=0x5591519f4ea0) at ../qom/object.c:694 #14 0x000055914f46ff06 in object_unref (objptr=0x5591519f4ea0) at ../qom/object.c:1202 #15 0x000055914f4715a4 in object_finalize_child_property (obj=0x559151b76c50, name=0x559151b7b250 "char3", opaque=0x5591519f4ea0) at ../qom/object.c:1747 #16 0x000055914f46ee86 in object_property_del_all (obj=0x559151b76c50) at ../qom/object.c:632 #17 0x000055914f46f0d2 in object_finalize (data=0x559151b76c50) at ../qom/object.c:693 #18 0x000055914f46ff06 in object_unref (objptr=0x559151b76c50) at ../qom/object.c:1202 #19 0x000055914f4715a4 in object_finalize_child_property (obj=0x559151b6b560, name=0x559151b76630 "chardevs", opaque=0x559151b76c50) at ../qom/object.c:1747 #20 0x000055914f46ef67 in object_property_del_child (obj=0x559151b6b560, child=0x559151b76c50) at ../qom/object.c:654 #21 0x000055914f46f042 in object_unparent (obj=0x559151b76c50) at ../qom/object.c:673 #22 0x000055914f58632a in qemu_chr_cleanup () at ../chardev/char.c:1189 #23 0x000055914f16c66c in qemu_cleanup () at ../softmmu/runstate.c:830 #24 0x000055914eee7b9e in qemu_default_main () at ../softmmu/main.c:38 #25 0x000055914eee7bcc in main (argc=86, argv=0x7ffc97cb8d88) at ../softmmu/main.c:48 In char_socket_finalize after s->listener freed, event callback function vhost_user_blk_event will be called to handle CHR_EVENT_CLOSED. vhost_user_blk_event is calling qio_net_listener_set_client_func_full which is still using s->listener. Setting s->listener = NULL after object_unref(OBJECT(s->listener)) can solve this issue. Signed-off-by: Yajun Wu <yajunw@nvidia.com> Acked-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20230214021430.3638579-1-yajunw@nvidia.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This leakage can be seen through test-io-channel-tls: $ ../configure --target-list=aarch64-softmmu --enable-sanitizers $ make ./tests/unit/test-io-channel-tls $ ./tests/unit/test-io-channel-tls Indirect leak of 104 byte(s) in 1 object(s) allocated from: #0 0x7f81d1725808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144 #1 0x7f81d135ae98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98) #2 0x55616c5d4c1b in object_new_with_propv ../qom/object.c:795 #3 0x55616c5d4a83 in object_new_with_props ../qom/object.c:768 #4 0x55616c5c5415 in test_tls_creds_create ../tests/unit/test-io-channel-tls.c:70 #5 0x55616c5c5a6b in test_io_channel_tls ../tests/unit/test-io-channel-tls.c:158 #6 0x7f81d137d58d (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a58d) Indirect leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7f81d1725a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153 #1 0x7f81d1472a20 in gnutls_dh_params_init (/lib/x86_64-linux-gnu/libgnutls.so.30+0x46a20) #2 0x55616c6485ff in qcrypto_tls_creds_x509_load ../crypto/tlscredsx509.c:634 #3 0x55616c648ba2 in qcrypto_tls_creds_x509_complete ../crypto/tlscredsx509.c:694 #4 0x55616c5e1fea in user_creatable_complete ../qom/object_interfaces.c:28 #5 0x55616c5d4c8c in object_new_with_propv ../qom/object.c:807 #6 0x55616c5d4a83 in object_new_with_props ../qom/object.c:768 #7 0x55616c5c5415 in test_tls_creds_create ../tests/unit/test-io-channel-tls.c:70 #8 0x55616c5c5a6b in test_io_channel_tls ../tests/unit/test-io-channel-tls.c:158 #9 0x7f81d137d58d (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a58d) ... SUMMARY: AddressSanitizer: 49143 byte(s) leaked in 184 allocation(s). The docs for `g_source_add_child_source(source, child_source)` says "source will hold a reference on child_source while child_source is attached to it." Therefore, we should unreference the child source at `qio_channel_tls_read_watch()` after attaching it to `source`. With this change, ./tests/unit/test-io-channel-tls shows no leakages. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
xbzrle_encode_buffer_avx512() checks for overflows too scarcely in its outer loop, causing out-of-bounds writes: $ ../configure --target-list=aarch64-softmmu --enable-sanitizers --enable-avx512bw $ make tests/unit/test-xbzrle && ./tests/unit/test-xbzrle ==5518==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x62100000b100 at pc 0x561109a7714d bp 0x7ffed712a440 sp 0x7ffed712a430 WRITE of size 1 at 0x62100000b100 thread T0 #0 0x561109a7714c in uleb128_encode_small ../util/cutils.c:831 #1 0x561109b67f6a in xbzrle_encode_buffer_avx512 ../migration/xbzrle.c:275 #2 0x5611099a7428 in test_encode_decode_overflow ../tests/unit/test-xbzrle.c:153 #3 0x7fb2fb65a58d (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a58d) #4 0x7fb2fb65a333 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a333) #5 0x7fb2fb65aa79 in g_test_run_suite (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7aa79) #6 0x7fb2fb65aa94 in g_test_run (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7aa94) #7 0x5611099a3a23 in main ../tests/unit/test-xbzrle.c:218 #8 0x7fb2fa78c082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082) #9 0x5611099a608d in _start (/qemu/build/tests/unit/test-xbzrle+0x28408d) 0x62100000b100 is located 0 bytes to the right of 4096-byte region [0x62100000a100,0x62100000b100) allocated by thread T0 here: #0 0x7fb2fb823a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153 #1 0x7fb2fb637ef0 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57ef0) Fix that by performing the overflow check in the inner loop, instead. Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
For ex, when resetting the xlnx-zcu102 machine: (lldb) bt * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x50) * frame #0: 0x10020a740 gd_vc_send_chars(vc=0x000000000) at gtk.c:1759:41 [opt] frame #1: 0x100636264 qemu_chr_fe_accept_input(be=<unavailable>) at char-fe.c:159:9 [opt] frame #2: 0x1000608e0 cadence_uart_reset_hold [inlined] uart_rx_reset(s=0x10810a960) at cadence_uart.c:158:5 [opt] frame #3: 0x1000608d4 cadence_uart_reset_hold(obj=0x10810a960) at cadence_uart.c:530:5 [opt] frame #4: 0x100580ab4 resettable_phase_hold(obj=0x10810a960, opaque=0x000000000, type=<unavailable>) at resettable.c:0 [opt] frame #5: 0x10057d1b0 bus_reset_child_foreach(obj=<unavailable>, cb=(resettable_phase_hold at resettable.c:162), opaque=0x000000000, type=RESET_TYPE_COLD) at bus.c:97:13 [opt] frame #6: 0x1005809f8 resettable_phase_hold [inlined] resettable_child_foreach(rc=0x000060000332d2c0, obj=0x0000600002c1c180, cb=<unavailable>, opaque=0x000000000, type=RESET_TYPE_COLD) at resettable.c:96:9 [opt] frame #7: 0x1005809d8 resettable_phase_hold(obj=0x0000600002c1c180, opaque=0x000000000, type=RESET_TYPE_COLD) at resettable.c:173:5 [opt] frame #8: 0x1005803a0 resettable_assert_reset(obj=0x0000600002c1c180, type=<unavailable>) at resettable.c:60:5 [opt] frame #9: 0x10058027c resettable_reset(obj=0x0000600002c1c180, type=RESET_TYPE_COLD) at resettable.c:45:5 [opt] While the chardev is created early, the VirtualConsole is associated after, during qemu_init_displays(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230220072251.3385878-1-marcandre.lureau@redhat.com>
blk_get_geometry() eventually calls bdrv_nb_sectors(), which is a co_wrapper_mixed_bdrv_rdlock. This means that when it is called from coroutine context, it already assume to have the graph locked. However, virtio_blk_sect_range_ok() in block/export/virtio-blk-handler.c (used by vhost-user-blk and VDUSE exports) runs in a coroutine, but doesn't take the graph lock - blk_*() functions are generally expected to do that internally. This causes an assertion failure when accessing an export for the first time if it runs in an iothread. This is an example of the crash: $ ./storage-daemon/qemu-storage-daemon --object iothread,id=th0 --blockdev file,filename=/home/kwolf/images/hd.img,node-name=disk --export vhost-user-blk,addr.type=unix,addr.path=/tmp/vhost.sock,node-name=disk,id=exp0,iothread=th0 qemu-storage-daemon: ../block/graph-lock.c:268: void assert_bdrv_graph_readable(void): Assertion `qemu_in_main_thread() || reader_count()' failed. (gdb) bt #0 0x00007ffff6eafe5c in __pthread_kill_implementation () from /lib64/libc.so.6 #1 0x00007ffff6e5fa76 in raise () from /lib64/libc.so.6 #2 0x00007ffff6e497fc in abort () from /lib64/libc.so.6 #3 0x00007ffff6e4971b in __assert_fail_base.cold () from /lib64/libc.so.6 #4 0x00007ffff6e58656 in __assert_fail () from /lib64/libc.so.6 #5 0x00005555556337a3 in assert_bdrv_graph_readable () at ../block/graph-lock.c:268 #6 0x00005555555fd5a2 in bdrv_co_nb_sectors (bs=0x5555564c5ef0) at ../block.c:5847 #7 0x00005555555ee949 in bdrv_nb_sectors (bs=0x5555564c5ef0) at block/block-gen.c:256 #8 0x00005555555fd6b9 in bdrv_get_geometry (bs=0x5555564c5ef0, nb_sectors_ptr=0x7fffef7fedd0) at ../block.c:5884 #9 0x000055555562ad6d in blk_get_geometry (blk=0x5555564cb200, nb_sectors_ptr=0x7fffef7fedd0) at ../block/block-backend.c:1624 #10 0x00005555555ddb74 in virtio_blk_sect_range_ok (blk=0x5555564cb200, block_size=512, sector=0, size=512) at ../block/export/virtio-blk-handler.c:44 #11 0x00005555555dd80d in virtio_blk_process_req (handler=0x5555564cbb98, in_iov=0x7fffe8003830, out_iov=0x7fffe8003860, in_num=1, out_num=0) at ../block/export/virtio-blk-handler.c:189 #12 0x00005555555dd546 in vu_blk_virtio_process_req (opaque=0x7fffe8003800) at ../block/export/vhost-user-blk-server.c:66 #13 0x00005555557bf4a1 in coroutine_trampoline (i0=-402635264, i1=32767) at ../util/coroutine-ucontext.c:177 #14 0x00007ffff6e75c20 in ?? () from /lib64/libc.so.6 #15 0x00007fffefffa870 in ?? () #16 0x0000000000000000 in ?? () Fix this by creating a new blk_co_get_geometry() that takes the lock, and changing blk_get_geometry() to be a co_wrapper_mixed around it. To make the resulting code cleaner, virtio-blk-handler.c can directly call the coroutine version now (though that wouldn't be necessary for fixing the bug, taking the lock in blk_co_get_geometry() is what fixes it). Fixes: 8ab8140 Reported-by: Lukáš Doktor <ldoktor@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230327113959.60071-1-kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Just updated master-dev to git 5e79823 and tried to build apt in a fresh unstable chroot which previously failed with timeouts when running /usr/bin/msgmerge:
Generating ../build/po/domains/apt/ast.po echo ../build/po/domains/apt/ast.po : ast.po ../build/po/apt.pot > ../build/po/apt_ast.po.d
/usr/bin//msgmerge --add-location=file ast.po ../build/po/apt.pot -o ../build/po/domains/apt/ast.po
............E: Caught signal ‘Terminated’: terminating immediately
make[2]: *** wait: No child processes. Stop.
make[2]: *** Waiting for unfinished jobs....
make[2]: *** wait: No child processes. Stop.
make[1]: *** wait: No child processes. Stop.
make[1]: *** Waiting for unfinished jobs....
make[1]: *** wait: No child processes. Stop.
make: *** wait: No child processes. Stop.
make: *** Waiting for unfinished jobs....
make: *** wait: No child processes. Stop.
Build killed with signal TERM after 30 minutes of inactivity
The build still fails, but this time not with a timeout but with an illegal instruction:
Generating ../build/po/domains/apt/ar.po echo ../build/po/domains/apt/ar.po : ar.po ../build/po/apt.pot > ../build/po/apt_ar.po.d
/usr/bin//msgmerge --add-location=file ar.po ../build/po/apt.pot -o ../build/po/domains/apt/ar.po
qemu: fatal: Illegal instruction: 2f0a @ f615cd26
D0 = 00000004 A0 = 80093c70 F0 = 7fff 4000000000000000
D1 = 00000000 A1 = 80015150 F1 = 7fff 4000000000000000
D2 = f6fff428 A2 = f42ff91c F2 = 7fff 4000000000000000
D3 = 80093c84 A3 = 80003bf0 F3 = 7fff 4000000000000000
D4 = f42ff400 A4 = f6fff22c F4 = 7fff 4000000000000000
D5 = 00001000 A5 = f6168000 F5 = 7fff 4000000000000000
D6 = f614b3e0 A6 = f42feef0 F6 = 7fff 4000000000000000
D7 = 00800000 A7 = f42feed0 F7 = 7fff 4000000000000000
PC = f615cd22 SR = 0008 -N--- makefile:60: recipe for target '../build/po/domains/apt/ar.po' failed
Attaching the build log.
apt_1.1.10_m68k-20160111-1534.build.zip
The text was updated successfully, but these errors were encountered: