typofix - https://github.com/vlajos/misspell_fixer #28

Closed
wants to merge 1 commit

Conversation

@vlajos (Contributor) commented Aug 7, 2015

No description provided.

@crobinso (Contributor) commented Aug 9, 2015

Thanks for the contribution, but qemu does not accept patches via pull-request. Please close this pull-request and submit your patch to the mailing list:

http://wiki.qemu.org/Contribute/SubmitAPatch

@vlajos (Contributor, Author) commented Aug 9, 2015

OK. I have just sent an email to the qemu-trivial@nongnu.org list. (I hope it is fine.)

@vlajos closed this Aug 9, 2015
lalrae pushed a commit to lalrae/qemu that referenced this pull request Apr 18, 2016
Do not clear BEV for MIPS malta kernel load

Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
qemu-deploy pushed a commit that referenced this pull request Jun 16, 2016
Back in the 2.3.0 release we declared qcow[2] encryption as
deprecated, warning people that it would be removed in a future
release.

  commit a1f688f
  Author: Markus Armbruster <armbru@redhat.com>
  Date:   Fri Mar 13 21:09:40 2015 +0100

    block: Deprecate QCOW/QCOW2 encryption

The code still exists today, but by a (happy?) accident we entirely
broke the ability to use qcow[2] encryption in the system emulators
in the 2.4.0 release due to

  commit 8336aaf
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Tue May 12 17:09:18 2015 +0100

    qcow2/qcow: protect against uninitialized encryption key

This commit was designed to prevent future coding bugs which
might cause QEMU to read/write data on an encrypted block
device in plain text mode before a decryption key is set.
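
The shape of such a guard, as a minimal sketch with hypothetical names (ImgState and img_co_read are stand-ins, not the real block/qcow2.c code):

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Stand-in for the driver state: 'cipher' is only non-NULL once a
     * decryption key has been configured. */
    typedef struct {
        bool encrypted;
        void *cipher;
    } ImgState;

    static int img_co_read(ImgState *s, void *buf, size_t len)
    {
        if (s->encrypted) {
            /* The guard: refuse to touch encrypted data before a key
             * exists -- this is the assertion seen firing below. */
            assert(s->cipher != NULL);
        }
        /* ... read the cluster and decrypt it via s->cipher ... */
        (void)buf; (void)len;
        return 0;
    }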

It turns out this preventative measure was a little too good,
because we already had a long standing bug where QEMU read
encrypted data in plain text mode during system emulator
startup, in order to guess disk geometry:

  Thread 10 (Thread 0x7fffd3fff700 (LWP 30373)):
  #0  0x00007fffe90b1a28 in raise () at /lib64/libc.so.6
  #1  0x00007fffe90b362a in abort () at /lib64/libc.so.6
  #2  0x00007fffe90aa227 in __assert_fail_base () at /lib64/libc.so.6
  #3  0x00007fffe90aa2d2 in  () at /lib64/libc.so.6
  #4  0x000055555587ae19 in qcow2_co_readv (bs=0x5555562accb0, sector_num=0, remaining_sectors=1, qiov=0x7fffffffd260) at block/qcow2.c:1229
  #5  0x000055555589b60d in bdrv_aligned_preadv (bs=bs@entry=0x5555562accb0, req=req@entry=0x7fffd3ffea50, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7fffffffd260, flags=0) at block/io.c:908
  #6  0x000055555589b8bc in bdrv_co_do_preadv (bs=0x5555562accb0, offset=0, bytes=512, qiov=0x7fffffffd260, flags=<optimized out>) at block/io.c:999
  #7  0x000055555589c375 in bdrv_rw_co_entry (opaque=0x7fffffffd210) at block/io.c:544
  #8  0x000055555586933b in coroutine_thread (opaque=0x555557876310) at coroutine-gthread.c:134
  #9  0x00007ffff64e1835 in g_thread_proxy (data=0x5555562b5590) at gthread.c:778
  #10 0x00007ffff6bb760a in start_thread () at /lib64/libpthread.so.0
  #11 0x00007fffe917f59d in clone () at /lib64/libc.so.6

  Thread 1 (Thread 0x7ffff7ecab40 (LWP 30343)):
  #0  0x00007fffe91797a9 in syscall () at /lib64/libc.so.6
  #1  0x00007ffff64ff87f in g_cond_wait (cond=cond@entry=0x555555e085f0 <coroutine_cond>, mutex=mutex@entry=0x555555e08600 <coroutine_lock>) at gthread-posix.c:1397
  #2  0x00005555558692c3 in qemu_coroutine_switch (co=<optimized out>) at coroutine-gthread.c:117
  #3  0x00005555558692c3 in qemu_coroutine_switch (from_=0x5555562b5e30, to_=to_@entry=0x555557876310, action=action@entry=COROUTINE_ENTER) at coroutine-gthread.c:175
  #4  0x0000555555868a90 in qemu_coroutine_enter (co=0x555557876310, opaque=0x0) at qemu-coroutine.c:116
  #5  0x0000555555859b84 in thread_pool_completion_bh (opaque=0x7fffd40010e0) at thread-pool.c:187
  #6  0x0000555555859514 in aio_bh_poll (ctx=ctx@entry=0x5555562953b0) at async.c:85
  #7  0x0000555555864d10 in aio_dispatch (ctx=ctx@entry=0x5555562953b0) at aio-posix.c:135
  #8  0x0000555555864f75 in aio_poll (ctx=ctx@entry=0x5555562953b0, blocking=blocking@entry=true) at aio-posix.c:291
  #9  0x000055555589c40d in bdrv_prwv_co (bs=bs@entry=0x5555562accb0, offset=offset@entry=0, qiov=qiov@entry=0x7fffffffd260, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:591
  #10 0x000055555589c503 in bdrv_rw_co (bs=bs@entry=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:614
  #11 0x000055555589c562 in bdrv_read_unthrottled (nb_sectors=21845, buf=0x7fffffffd2e0 "\321,", sector_num=0, bs=0x5555562accb0) at block/io.c:622
  #12 0x000055555589c562 in bdrv_read_unthrottled (bs=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845) at block/io.c:634
    nb_sectors@entry=1) at block/block-backend.c:504
  #14 0x0000555555752e9f in guess_disk_lchs (blk=blk@entry=0x5555562a5290, pcylinders=pcylinders@entry=0x7fffffffd52c, pheads=pheads@entry=0x7fffffffd530, psectors=psectors@entry=0x7fffffffd534) at hw/block/hd-geometry.c:68
  #15 0x0000555555752ff7 in hd_geometry_guess (blk=0x5555562a5290, pcyls=pcyls@entry=0x555557875d1c, pheads=pheads@entry=0x555557875d20, psecs=psecs@entry=0x555557875d24, ptrans=ptrans@entry=0x555557875d28) at hw/block/hd-geometry.c:133
  #16 0x0000555555752b87 in blkconf_geometry (conf=conf@entry=0x555557875d00, ptrans=ptrans@entry=0x555557875d28, cyls_max=cyls_max@entry=65536, heads_max=heads_max@entry=16, secs_max=secs_max@entry=255, errp=errp@entry=0x7fffffffd5e0) at hw/block/block.c:71
  #17 0x0000555555799bc4 in ide_dev_initfn (dev=0x555557875c80, kind=IDE_HD) at hw/ide/qdev.c:174
  #18 0x0000555555768394 in device_realize (dev=0x555557875c80, errp=0x7fffffffd640) at hw/core/qdev.c:247
  #19 0x0000555555769a81 in device_set_realized (obj=0x555557875c80, value=<optimized out>, errp=0x7fffffffd730) at hw/core/qdev.c:1058
  #20 0x00005555558240ce in property_set_bool (obj=0x555557875c80, v=<optimized out>, opaque=0x555557875de0, name=<optimized out>, errp=0x7fffffffd730)
        at qom/object.c:1514
  #21 0x0000555555826c87 in object_property_set_qobject (obj=obj@entry=0x555557875c80, value=value@entry=0x55555784bcb0, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/qom-qobject.c:24
  #22 0x0000555555825760 in object_property_set_bool (obj=obj@entry=0x555557875c80, value=value@entry=true, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/object.c:905
  #23 0x000055555576897b in qdev_init_nofail (dev=dev@entry=0x555557875c80) at hw/core/qdev.c:380
  #24 0x0000555555799ead in ide_create_drive (bus=bus@entry=0x555557629630, unit=unit@entry=0, drive=0x5555562b77e0) at hw/ide/qdev.c:122
  #25 0x000055555579a746 in pci_ide_create_devs (dev=dev@entry=0x555557628db0, hd_table=hd_table@entry=0x7fffffffd830) at hw/ide/pci.c:440
  #26 0x000055555579b165 in pci_piix3_ide_init (bus=<optimized out>, hd_table=0x7fffffffd830, devfn=<optimized out>) at hw/ide/piix.c:218
  #27 0x000055555568ca55 in pc_init1 (machine=0x5555562960a0, pci_enabled=1, kvmclock_enabled=<optimized out>) at /home/berrange/src/virt/qemu/hw/i386/pc_piix.c:256
  #28 0x0000555555603ab2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4249

So the safety net is correctly preventing QEMU from reading cipher
text as if it were plain text during startup, and is aborting QEMU
to avoid bad usage of this data.

For added fun this bug only happens if the encrypted qcow2
file happens to have data written to the first cluster,
otherwise the cluster won't be allocated and so qcow2 would
not try the decryption routines at all, just return all 0's.

That no one even noticed, let alone reported, this bug that
has shipped in 2.4.0, 2.5.0 and 2.6.0 shows that the number
of actual users of encrypted qcow2 is approximately zero.

So rather than fixing the crash and backporting the fix to stable
releases, just go ahead with what we have warned users about
and disable any use of qcow2 encryption in the system
emulators. qemu-img/qemu-io/qemu-nbd are still able to access
qcow2 encrypted images for the sake of data conversion.

In the future, qcow2 will gain support for the alternative
luks format, but when this happens it'll be using the
'-object secret' infrastructure for getting keys, which
avoids this problematic scenario entirely.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
0day-ci pushed a commit to 0day-ci/qemu that referenced this pull request Jul 28, 2016
ahci-test /x86_64/ahci/io/dma/lba28/retry triggers the following leak:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7fc4b2a25e20 in malloc (/lib64/libasan.so.3+0xc6e20)
    #1 0x7fc4993bce58 in g_malloc (/lib64/libglib-2.0.so.0+0x4ee58)
    #2 0x556a187d4b34 in ahci_populate_sglist hw/ide/ahci.c:896
    #3 0x556a187d8237 in ahci_dma_prepare_buf hw/ide/ahci.c:1367
    #4 0x556a187b5a1a in ide_dma_cb hw/ide/core.c:844
    #5 0x556a187d7eec in ahci_start_dma hw/ide/ahci.c:1333
    #6 0x556a187b650b in ide_start_dma hw/ide/core.c:921
    #7 0x556a187b61e6 in ide_sector_start_dma hw/ide/core.c:911
    #8 0x556a187b9e26 in cmd_write_dma hw/ide/core.c:1486
    #9 0x556a187bd519 in ide_exec_cmd hw/ide/core.c:2027
    #10 0x556a187d71c5 in handle_reg_h2d_fis hw/ide/ahci.c:1204
    #11 0x556a187d7681 in handle_cmd hw/ide/ahci.c:1254
    #12 0x556a187d168a in check_cmd hw/ide/ahci.c:510
    #13 0x556a187d0afc in ahci_port_write hw/ide/ahci.c:314
    #14 0x556a187d105d in ahci_mem_write hw/ide/ahci.c:435
    #15 0x556a1831d959 in memory_region_write_accessor /home/elmarco/src/qemu/memory.c:525
    #16 0x556a1831dc35 in access_with_adjusted_size /home/elmarco/src/qemu/memory.c:591
    #17 0x556a18323ce3 in memory_region_dispatch_write /home/elmarco/src/qemu/memory.c:1262
    #18 0x556a1828cf67 in address_space_write_continue /home/elmarco/src/qemu/exec.c:2578
    #19 0x556a1828d20b in address_space_write /home/elmarco/src/qemu/exec.c:2635
    #20 0x556a1828d92b in address_space_rw /home/elmarco/src/qemu/exec.c:2737
    #21 0x556a1828daf7 in cpu_physical_memory_rw /home/elmarco/src/qemu/exec.c:2746
    #22 0x556a183068d3 in cpu_physical_memory_write /home/elmarco/src/qemu/include/exec/cpu-common.h:72
    #23 0x556a18308194 in qtest_process_command /home/elmarco/src/qemu/qtest.c:382
    #24 0x556a18309999 in qtest_process_inbuf /home/elmarco/src/qemu/qtest.c:573
    #25 0x556a18309a4a in qtest_read /home/elmarco/src/qemu/qtest.c:585
    #26 0x556a18598b85 in qemu_chr_be_write_impl /home/elmarco/src/qemu/qemu-char.c:387
    #27 0x556a18598c52 in qemu_chr_be_write /home/elmarco/src/qemu/qemu-char.c:399
    #28 0x556a185a2afa in tcp_chr_read /home/elmarco/src/qemu/qemu-char.c:2902
    #29 0x556a18cbaf52 in qio_channel_fd_source_dispatch io/channel-watch.c:84

Following John Snow's recommendation:
  Everywhere else ncq_err is used, it is accompanied by a list cleanup
  except for ncq_cb, which is the case you are fixing here.

  Move the sglist destruction inside of ncq_err and then delete it from
  the other two locations to keep it tidy.

  Call dma_buf_commit in ide_dma_cb after the early return. Though, this
  is also a little wonky because this routine does more than clear the
  list, but it is at the moment the centralized "we're done with the
  sglist" function and none of the other side effects that occur in
  dma_buf_commit will interfere with the reset that occurs from
  ide_restart_bh, I think.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
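
The cleanup John Snow describes can be sketched as follows; SGListSketch and NCQTransferSketch are hypothetical stand-ins for the real hw/ide/ahci.c types. The point is only that ncq_err becomes the single owner of scatter/gather-list teardown, so no error path can leak it:

    typedef struct { void *entries; int nsg; } SGListSketch;
    typedef struct { SGListSketch sglist; int used; } NCQTransferSketch;

    /* Stand-in for the real list teardown helper: frees the mapped
     * entries and resets the bookkeeping. */
    static void sglist_destroy(SGListSketch *sg)
    {
        sg->entries = 0;
        sg->nsg = 0;
    }

    static void ncq_err(NCQTransferSketch *ncq_tfs)
    {
        /* ... report the error to the guest ... */
        sglist_destroy(&ncq_tfs->sglist);  /* centralized teardown */
        ncq_tfs->used = 0;
    }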
0day-ci pushed a commit to 0day-ci/qemu that referenced this pull request Aug 2, 2016
The vnc_server_info_get function will allocate the VncServerInfo
struct and then call vnc_init_basic_info_from_server_addr
to populate the basic fields. If this returns an error
though, the qapi_free_VncServerInfo call will then crash
because the VncServerInfo struct instance was not properly
NULL-initialized and thus contains random stack garbage.

 #0  0x00007f1987c8e6f5 in raise () at /lib64/libc.so.6
 #1  0x00007f1987c902fa in abort () at /lib64/libc.so.6
 #2  0x00007f1987ccf600 in __libc_message () at /lib64/libc.so.6
 #3  0x00007f1987cd7d4a in _int_free () at /lib64/libc.so.6
 #4  0x00007f1987cdb2ac in free () at /lib64/libc.so.6
 #5  0x00007f198b654f6e in g_free () at /lib64/libglib-2.0.so.0
 #6  0x0000559193cdcf54 in visit_type_str (v=v@entry=
     0x5591972f14b0, name=name@entry=0x559193de1e29 "host", obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899d80)
     at qapi/qapi-visit-core.c:255
 #7  0x0000559193cca8f3 in visit_type_VncBasicInfo_members (v=v@entry=
     0x5591972f14b0, obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899dc0) at qapi-visit.c:12307
 #8  0x0000559193ccb523 in visit_type_VncServerInfo_members (v=v@entry=
     0x5591972f14b0, obj=0x5591961dbfa0, errp=errp@entry=0x7fffd7899e00) at qapi-visit.c:12632
 #9  0x0000559193ccb60b in visit_type_VncServerInfo (v=v@entry=
     0x5591972f14b0, name=name@entry=0x0, obj=obj@entry=0x7fffd7899e48, errp=errp@entry=0x0) at qapi-visit.c:12658
 #10 0x0000559193cb53d8 in qapi_free_VncServerInfo (obj=<optimized out>) at qapi-types.c:3970
 #11 0x0000559193c1e6ba in vnc_server_info_get (vd=0x7f1951498010) at ui/vnc.c:233
 #12 0x0000559193c24275 in vnc_connect (vs=0x559197b2f200, vs=0x559197b2f200, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:284
 #13 0x0000559193c24275 in vnc_connect (vd=vd@entry=0x7f1951498010, sioc=sioc@entry=0x559196bf9c00, skipauth=skipauth@entry=true, websocket=websocket@entry=false) at ui/vnc.c:3039
 #14 0x0000559193c25806 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>)
     at ui/vnc.c:3877
 #15 0x0000559193a90c28 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7fffd7899f90)
     at qmp-marshal.c:105
 #16 0x000055919399c2b7 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>)
     at /home/berrange/src/virt/qemu/monitor.c:3971
 #17 0x0000559193ce3307 in json_message_process_token (lexer=0x559194ab0838, input=0x559194a6d940, type=JSON_RCURLY, x=111, y=12) at qobject/json-streamer.c:105
 #18 0x0000559193cfa90d in json_lexer_feed_char (lexer=lexer@entry=0x559194ab0838, ch=125 '}', flush=flush@entry=false)
     at qobject/json-lexer.c:319
 #19 0x0000559193cfaa1e in json_lexer_feed (lexer=0x559194ab0838, buffer=<optimized out>, size=<optimized out>)
     at qobject/json-lexer.c:369
 #20 0x0000559193ce33c9 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>)
     at qobject/json-streamer.c:124
 #21 0x000055919399a85b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>)
     at /home/berrange/src/virt/qemu/monitor.c:3987
 #22 0x0000559193a87d00 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x559194a7d900)
     at qemu-char.c:2895
 #23 0x00007f198b64f703 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
 #24 0x0000559193c484b3 in main_loop_wait () at main-loop.c:213
 #25 0x0000559193c484b3 in main_loop_wait (timeout=<optimized out>) at main-loop.c:258
 #26 0x0000559193c484b3 in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506
 #27 0x0000559193964c55 in main () at vl.c:1908
 #28 0x0000559193964c55 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4603

This was introduced in

  commit 98481bf
  Author: Eric Blake <eblake@redhat.com>
  Date:   Mon Oct 26 16:34:45 2015 -0600

    vnc: Hoist allocation of VncBasicInfo to callers

which added error reporting for vnc_init_basic_info_from_server_addr
but didn't change the g_malloc calls to g_malloc0.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
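
The fix implied by the last paragraph is a one-word change: allocate with g_malloc0 so a partially populated struct can safely go through the qapi_free_* path. A minimal self-contained sketch (ServerInfoSketch is hypothetical, standing in for VncServerInfo):

    #include <glib.h>

    typedef struct {
        char *host;
        char *service;
    } ServerInfoSketch;

    int main(void)
    {
        /* g_malloc() would leave host/service as heap garbage, so freeing
         * the struct on an error path would call free() on wild pointers.
         * g_malloc0() guarantees the fields start as NULL, and
         * g_free(NULL) is a safe no-op. */
        ServerInfoSketch *info = g_malloc0(sizeof(*info));
        g_free(info->host);
        g_free(info->service);
        g_free(info);
        return 0;
    }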
qemu-deploy pushed a commit that referenced this pull request Aug 4, 2016
vnc: fix crash when vnc_server_info_get has an error

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1470134726-15697-3-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
elmarco added a commit to elmarco/qemu that referenced this pull request Aug 7, 2016
ahci: fix sglist leak on retry

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
qemu-deploy pushed a commit that referenced this pull request Aug 17, 2016
ahci: fix sglist leak on retry

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 5839df7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qemu-deploy pushed a commit that referenced this pull request Sep 29, 2016
vnc: fix crash when vnc_server_info_get has an error

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1470134726-15697-3-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3e7f136)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
cuinutanix referenced this pull request in NXPower/qemu May 17, 2017
RH-Author: Markus Armbruster <armbru@redhat.com>
Message-id: <1469602394-4025-2-git-send-email-armbru@redhat.com>
Patchwork-id: 71467
O-Subject: [RHEV-7.3 qemu-kvm-rhev PATCH] block: drop support for using qcow[2] encryption with system emulators
Bugzilla: 1336659
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
RH-Acked-by: Daniel P. Berrange <berrange@redhat.com>

From: "Daniel P. Berrange" <berrange@redhat.com>

block: drop support for using qcow[2] encryption with system emulators

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8c0dcbc)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
cuinutanix referenced this pull request in NXPower/qemu May 17, 2017
RH-Author: Gerd Hoffmann <kraxel@redhat.com>
Message-id: <1470920000-23113-3-git-send-email-kraxel@redhat.com>
Patchwork-id: 71947
O-Subject: [RHEL-7.3 qemu-kvm-rhev PATCH 2/3] vnc: fix crash when vnc_server_info_get has an error
Bugzilla: 1359655
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Marcel Apfelbaum <marcel@redhat.com>
RH-Acked-by: Markus Armbruster <armbru@redhat.com>

From: "Daniel P. Berrange" <berrange@redhat.com>

vnc: fix crash when vnc_server_info_get has an error

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1470134726-15697-3-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3e7f136)
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
cuinutanix referenced this pull request in NXPower/qemu May 17, 2017
RH-Author: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: <20161129073543.13711-11-marcandre.lureau@redhat.com>
Patchwork-id: 72914
O-Subject: [RHEV-7.3.z qemu-kvm-rhev PATCH 10/10] ahci: fix sglist leak on retry
Bugzilla: 1397745
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>

ahci: fix sglist leak on retry

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

(cherry picked from commit 5839df7)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
fengguang pushed a commit to 0day-ci/qemu that referenced this pull request Jun 13, 2017
Hi,

I'd like to report a use-after-free problem found by AddressSanitizer.
My company provides a virtualization server based on Qemu 2.7.
When a customer requests a disk hot-unplug from our web-based application,
the application sends "device_del" and "drive_del" commands to the qemu process via QMP.
This usually works fine.
But I found a corner case like the following:
1. The customer requests a disk hot-unplug via the web-based application.
2. The application sends "device_del", which sometimes takes an unpredictable amount of time.
3. There are some in-flight IOs.
4. The customer reboots or shuts down the guest OS.
5. The qemu process generates a segfault.

If a disk is unplugged with both the "device_del" and "drive_del" commands,
qemu does not have a problem. But if only the "device_del" command has finished and
the guest OS reboots, qemu generates a segfault.
I can reproduce that case easily with the following steps:
1. Generate heavy IO on the disk in the guest OS.
2. Run "device_del" on the qemu monitor.
3. Run "system_reboot" on the qemu monitor.

According to AddressSanitizer, the VirtQueue is freed by virtio_cleanup but dereferenced
by an asynchronous request. I think asynchronous requests should be finished
before virtio_cleanup, so I added blk_drain() before virtio_cleanup.
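
A sketch of that ordering with stub declarations (virtio_blk_unrealize_sketch is hypothetical; blk_drain and virtio_cleanup imitate the real QEMU entry points, and only the ordering matters: drain in-flight requests while the VirtQueues still exist, then free them):

    typedef struct BlockBackend BlockBackend;
    typedef struct VirtIODevice VirtIODevice;

    /* Stubs standing in for the real APIs. */
    static void blk_drain(BlockBackend *blk) { (void)blk; /* wait for in-flight AIO */ }
    static void virtio_cleanup(VirtIODevice *vdev) { (void)vdev; /* free the VirtQueues */ }

    static void virtio_blk_unrealize_sketch(VirtIODevice *vdev, BlockBackend *blk)
    {
        blk_drain(blk);        /* added: completions run while the vqs are alive */
        virtio_cleanup(vdev);  /* now safe: no request can touch freed vqs */
    }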

The following are the configure options I used to build qemu with AddressSanitizer:
./configure --target-list=x86_64-softmmu --extra-cflags="-fsanitize=address -fno-omit-frame-pointer" --enable-debug

The following is the AddressSanitizer report.

------------------------------------- 8< ----------------------------------------
==8801==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
==8801==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc984da000; bottom 0x7fa524363000; size: 0x005774177000 (375609847808)
False positive error reports may follow
For details see http://code.google.com/p/address-sanitizer/issues/detail?id=189
=================================================================
==8801==ERROR: AddressSanitizer: heap-use-after-free on address 0x7fa5240ab82c at pc 0x55ccd7bffb76 bp 0x7fa524364590 sp 0x7fa524364580
READ of size 2 at 0x7fa5240ab82c thread T0
    #0 0x55ccd7bffb75 in virtqueue_fill /home/gohkim/work/tools/qemu/hw/virtio/virtio.c:284
    #1 0x55ccd7bffe46 in virtqueue_push /home/gohkim/work/tools/qemu/hw/virtio/virtio.c:308
    #2 0x55ccd7b66c74 in virtio_blk_req_complete /home/gohkim/work/tools/qemu/hw/block/virtio-blk.c:58
    #3 0x55ccd7b67154 in virtio_blk_rw_complete /home/gohkim/work/tools/qemu/hw/block/virtio-blk.c:121
    #4 0x55ccd83852b9 in blk_aio_complete block/block-backend.c:923
    #5 0x55ccd8385a73 in blk_aio_write_entry block/block-backend.c:984
    #6 0x55ccd84c7d68 in coroutine_trampoline util/coroutine-ucontext.c:78
    #7 0x7fa5702d45cf  (/lib/x86_64-linux-gnu/libc.so.6+0x495cf)

0x7fa5240ab82c is located 44 bytes inside of 131072-byte region [0x7fa5240ab800,0x7fa5240cb800)
freed by thread T0 here:
    #0 0x7fa573f876aa in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x986aa)
    #1 0x55ccd7c07fa4 in virtio_cleanup /home/gohkim/work/tools/qemu/hw/virtio/virtio.c:1678
    #2 0x55ccd7b6c585 in virtio_blk_device_unrealize /home/gohkim/work/tools/qemu/hw/block/virtio-blk.c:948
    #3 0x55ccd7c09734 in virtio_device_unrealize /home/gohkim/work/tools/qemu/hw/virtio/virtio.c:1950
    #4 0x55ccd7eb9bb7 in device_set_realized hw/core/qdev.c:964
    #5 0x55ccd82c1bef in property_set_bool qom/object.c:1853
    #6 0x55ccd82be535 in object_property_set qom/object.c:1087
    #7 0x55ccd82c454f in object_property_set_qobject qom/qom-qobject.c:27
    #8 0x55ccd82be810 in object_property_set_bool qom/object.c:1156
    #9 0x55ccd7ec1395 in bus_set_realized hw/core/bus.c:181
    #10 0x55ccd82c1bef in property_set_bool qom/object.c:1853
    #11 0x55ccd82be535 in object_property_set qom/object.c:1087
    #12 0x55ccd82c454f in object_property_set_qobject qom/qom-qobject.c:27
    #13 0x55ccd82be810 in object_property_set_bool qom/object.c:1156
    #14 0x55ccd7eb9a8a in device_set_realized hw/core/qdev.c:956
    #15 0x55ccd82c1bef in property_set_bool qom/object.c:1853
    #16 0x55ccd82be535 in object_property_set qom/object.c:1087
    #17 0x55ccd82c454f in object_property_set_qobject qom/qom-qobject.c:27
    #18 0x55ccd82be810 in object_property_set_bool qom/object.c:1156
    #19 0x55ccd7eba820 in device_unparent hw/core/qdev.c:1099
    #20 0x55ccd82bf50b in object_finalize_child_property qom/object.c:1362
    #21 0x55ccd82bb3c6 in object_property_del_child qom/object.c:422
    #22 0x55ccd82bb601 in object_unparent qom/object.c:441
    #23 0x55ccd7e221f1 in acpi_pcihp_eject_slot hw/acpi/pcihp.c:139
    #24 0x55ccd7e22301 in acpi_pcihp_update_hotplug_bus hw/acpi/pcihp.c:152
    #25 0x55ccd7e22600 in acpi_pcihp_update hw/acpi/pcihp.c:176
    #26 0x55ccd7e22628 in acpi_pcihp_reset hw/acpi/pcihp.c:182
    #27 0x55ccd7e1fc10 in piix4_reset hw/acpi/piix4.c:363
    #28 0x55ccd7da4ebd in qemu_devices_reset /home/gohkim/work/tools/qemu/vl.c:1713
    #29 0x55ccd7c28354 in pc_machine_reset /home/gohkim/work/tools/qemu/hw/i386/pc.c:2178

previously allocated by thread T0 here:
    #0 0x7fa573f87b49 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98b49)
    #1 0x7fa5718575d0 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f5d0)
    #2 0x55ccd7b6bfa5 in virtio_blk_device_realize /home/gohkim/work/tools/qemu/hw/block/virtio-blk.c:910
    #3 0x55ccd7c09500 in virtio_device_realize /home/gohkim/work/tools/qemu/hw/virtio/virtio.c:1927
    #4 0x55ccd7eb9690 in device_set_realized hw/core/qdev.c:918
    #5 0x55ccd82c1bef in property_set_bool qom/object.c:1853
    #6 0x55ccd82be535 in object_property_set qom/object.c:1087
    #7 0x55ccd82c454f in object_property_set_qobject qom/qom-qobject.c:27
    #8 0x55ccd82be810 in object_property_set_bool qom/object.c:1156
    #9 0x55ccd81b1b5c in virtio_blk_pci_realize hw/virtio/virtio-pci.c:1897
    #10 0x55ccd81b14df in virtio_pci_realize hw/virtio/virtio-pci.c:1799
    #11 0x55ccd805e2c5 in pci_qdev_realize hw/pci/pci.c:1966
    #12 0x55ccd81b17db in virtio_pci_dc_realize hw/virtio/virtio-pci.c:1852
    #13 0x55ccd7eb9690 in device_set_realized hw/core/qdev.c:918
    #14 0x55ccd82c1bef in property_set_bool qom/object.c:1853
    #15 0x55ccd82be535 in object_property_set qom/object.c:1087
    #16 0x55ccd82c454f in object_property_set_qobject qom/qom-qobject.c:27
    #17 0x55ccd82be810 in object_property_set_bool qom/object.c:1156
    #18 0x55ccd7d8141e in qdev_device_add /home/gohkim/work/tools/qemu/qdev-monitor.c:618
    #19 0x55ccd7dab462 in device_init_func /home/gohkim/work/tools/qemu/vl.c:2316
    #20 0x55ccd84bd18c in qemu_opts_foreach util/qemu-option.c:1116
    #21 0x55ccd7db2bdb in main /home/gohkim/work/tools/qemu/vl.c:4507
    #22 0x7fa5702ababf in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x20abf)

SUMMARY: AddressSanitizer: heap-use-after-free /home/gohkim/work/tools/qemu/hw/virtio/virtio.c:284 virtqueue_fill
Shadow bytes around the buggy address:
  0x0ff52480d6b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff52480d6c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff52480d6d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff52480d6e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff52480d6f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0ff52480d700: fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd
  0x0ff52480d710: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff52480d720: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff52480d730: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff52480d740: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0ff52480d750: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
==8801==ABORTING

[v1]: add braces to fix style error

Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
qemu-deploy pushed a commit that referenced this pull request Jan 16, 2018
Spotted thanks to ASAN:

==25226==ERROR: AddressSanitizer: global-buffer-overflow on address 0x556715a1f120 at pc 0x556714b6f6b1 bp 0x7ffcdfac1360 sp 0x7ffcdfac1350
READ of size 1 at 0x556715a1f120 thread T0
    #0 0x556714b6f6b0 in init_disasm /home/elmarco/src/qemu/disas/s390.c:219
    #1 0x556714b6fa6a in print_insn_s390 /home/elmarco/src/qemu/disas/s390.c:294
    #2 0x55671484d031 in monitor_disas /home/elmarco/src/qemu/disas.c:635
    #3 0x556714862ec0 in memory_dump /home/elmarco/src/qemu/monitor.c:1324
    #4 0x55671486342a in hmp_memory_dump /home/elmarco/src/qemu/monitor.c:1418
    #5 0x5567148670be in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3109
    #6 0x5567148674ed in qmp_human_monitor_command /home/elmarco/src/qemu/monitor.c:613
    #7 0x556714b00918 in qmp_marshal_human_monitor_command /home/elmarco/src/qemu/build/qmp-marshal.c:1704
    #8 0x556715138a3e in do_qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:104
    #9 0x556715138f83 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:131
    #10 0x55671485cf88 in handle_qmp_command /home/elmarco/src/qemu/monitor.c:3839
    #11 0x55671514e80b in json_message_process_token /home/elmarco/src/qemu/qobject/json-streamer.c:105
    #12 0x5567151bf2dc in json_lexer_feed_char /home/elmarco/src/qemu/qobject/json-lexer.c:323
    #13 0x5567151bf827 in json_lexer_feed /home/elmarco/src/qemu/qobject/json-lexer.c:373
    #14 0x55671514ee62 in json_message_parser_feed /home/elmarco/src/qemu/qobject/json-streamer.c:124
    #15 0x556714854b1f in monitor_qmp_read /home/elmarco/src/qemu/monitor.c:3881
    #16 0x556715045440 in qemu_chr_be_write_impl /home/elmarco/src/qemu/chardev/char.c:172
    #17 0x556715047184 in qemu_chr_be_write /home/elmarco/src/qemu/chardev/char.c:184
    #18 0x55671505a8e6 in tcp_chr_read /home/elmarco/src/qemu/chardev/char-socket.c:440
    #19 0x5567150943c3 in qio_channel_fd_source_dispatch /home/elmarco/src/qemu/io/channel-watch.c:84
    #20 0x7fb90292b90b in g_main_dispatch ../glib/gmain.c:3182
    #21 0x7fb90292c7ac in g_main_context_dispatch ../glib/gmain.c:3847
    #22 0x556715162eca in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
    #23 0x556715163001 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
    #24 0x5567151631fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
    #25 0x556714ad6d3b in main_loop /home/elmarco/src/qemu/vl.c:1950
    #26 0x556714ade329 in main /home/elmarco/src/qemu/vl.c:4865
    #27 0x7fb8fe5c9009 in __libc_start_main (/lib64/libc.so.6+0x21009)
    #28 0x5567147af4d9 in _start (/home/elmarco/src/qemu/build/s390x-softmmu/qemu-system-s390x+0xf674d9)

0x556715a1f120 is located 32 bytes to the left of global variable 'char_hci_type_info' defined in '/home/elmarco/src/qemu/hw/bt/hci-csr.c:493:23' (0x556715a1f140) of size 104
0x556715a1f120 is located 8 bytes to the right of global variable 's390_opcodes' defined in '/home/elmarco/src/qemu/disas/s390.c:860:33' (0x556715a15280) of size 40600

This fix is based on an upstream commit by Andreas Arnez
<arnez@linux.vnet.ibm.com>:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=9ace48f3d7d80ce09c5df60cccb433470410b11b

2014-08-19  Andreas Arnez  <arnez@linux.vnet.ibm.com>

       * s390-dis.c (init_disasm): Simplify initialization of
       opc_index[].  This also fixes an access after the last element
       of s390_opcodes[].
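
For illustration, a minimal sketch of the pattern behind that fix
(hypothetical code, not the verbatim disas/s390.c source): the old loop
peeked at the next table entry to skip duplicate leading bytes, reading
one element past the end of s390_opcodes[]; filling opc_index[] with a
simple bounded loop in reverse needs no lookahead at all.

    /* Sketch only: stand-ins for the real opcode table in disas/s390.c. */
    struct s390_opcode { unsigned char opcode[6]; /* ... */ };
    extern const struct s390_opcode s390_opcodes[];
    extern const int s390_num_opcodes;
    static int opc_index[256];

    static void init_disasm_buggy(void)
    {
        const struct s390_opcode *op;

        for (op = s390_opcodes; op < s390_opcodes + s390_num_opcodes; op++) {
            opc_index[op->opcode[0]] = op - s390_opcodes;
            /* BUG: op[1] is read even when op is the last valid entry. */
            while (op[1].opcode[0] == op->opcode[0]) {
                op++;
            }
        }
    }

    static void init_disasm_fixed(void)
    {
        int i;

        /* Walking backwards leaves each slot pointing at the first opcode
         * with that leading byte, with no out-of-bounds lookahead. */
        for (i = s390_num_opcodes - 1; i >= 0; i--) {
            opc_index[s390_opcodes[i].opcode[0]] = i;
        }
    }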

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180104160523.22995-19-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu-deploy pushed a commit that referenced this pull request Jul 31, 2018
With a Spice port chardev, it is possible to reenter
monitor_qapi_event_queue() (when the client disconnects, for
example). This will deadlock on monitor_lock.

Instead, use thread-local (TLS) variables to detect recursion and queue
the events, as sketched below.
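
A minimal sketch of that technique, assuming a hypothetical
emit_event_locked() helper that takes monitor_lock (illustrative, not
the verbatim QEMU patch): a thread-local flag marks that the function
is active, so a reentrant call only records the event, and the
outermost call drains the deferred events afterwards.

    /* Thread-local state: a reentered call must not retake monitor_lock. */
    static __thread bool in_event_queue;
    static __thread GQueue pending_events = G_QUEUE_INIT;

    static void monitor_event_queue(QDict *data)
    {
        if (in_event_queue) {
            /* Reentered, e.g. from a chardev callback: defer the event. */
            g_queue_push_tail(&pending_events, qobject_ref(data));
            return;
        }

        in_event_queue = true;
        emit_event_locked(data);                   /* takes monitor_lock */
        while (!g_queue_is_empty(&pending_events)) {
            QDict *ev = g_queue_pop_head(&pending_events);
            emit_event_locked(ev);                 /* drain deferred events */
            qobject_unref(ev);
        }
        in_event_queue = false;
    }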

Fixes:
 (gdb) bt
 #0  0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0
 #1  0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
 #2  0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
 #3  0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645
 #4  0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149
 #5  0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235
 #6  0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316
 #7  0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197
 #8  0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197
 #9  0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414
 #10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388
 #11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347
 #12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
 #13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341
 #14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
 #15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
 #16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353
 #17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199
 #18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112
 #19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147
 #20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42
 #21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425
 #22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468
 #23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517
 #24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538
 #25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624
 #26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649
 #27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58
 #28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822
 #29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862
 #30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644

Note that the error report is now moved to the first caller, which may
receive an error for a recursed event. This is probably fine (95% of
callers use &error_abort; the rest pass a NULL error and ignore it).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180731150144.14022-1-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[*_no_recurse renamed to *_no_reenter, local variables reordered]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
qemu-deploy pushed a commit that referenced this pull request Sep 25, 2018
In qemu_laio_process_completions_and_submit, the AioContext is acquired
before the ioq_submit iteration and after qemu_laio_process_completions,
but the latter is not thread safe either.
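
The shape of the fix, as a sketch based on the description above (the
in-tree function may differ in detail): widen the locked region so that
qemu_laio_process_completions() also runs under the AioContext lock,
rather than only ioq_submit().

    static void qemu_laio_process_completions_and_submit(LinuxAioState *s)
    {
        /* Take the lock before processing completions; previously it was
         * acquired only around the submission step below. */
        aio_context_acquire(s->aio_context);
        qemu_laio_process_completions(s);

        if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
            ioq_submit(s);
        }
        aio_context_release(s->aio_context);
    }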

This change avoids a number of random crashes when the Main Thread and
an IO Thread collide while processing completions for the same
AioContext. This is an example of one such crash:

 - The IO Thread is trying to acquire the AioContext at aio_co_enter,
   which shows that it had not locked it beforehand:

Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)):
 #0  0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0
 #2  0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0)
    at ../nptl/pthread_mutex_lock.c:114
 #3  0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66
 #4  0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493
 #5  0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478
 #6  0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 #8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 #9  0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406
 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true)
    at util/aio-posix.c:693
 #11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64
 #12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308
 #13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

 - The Main Thread is also processing completions from the same
   AioContext, and crashes due to failed assertion at util/iov.c:78:

Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)):
 #0  0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #1  0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90
 #2  0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset")
    at assert.c:92
 #3  0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101
 #4  0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78
 #5  0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410
 #6  0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 #8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 #9  0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272
 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497
 #11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574
 #12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false)
    at util/aio-posix.c:604
 #13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273
 #14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390
 #15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590
 #16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774
 #17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401
 #18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449
 #19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080)
    at block/commit.c:92
 #20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973
 #21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90
 #22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118
 #23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436
 #24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
 #25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201
 #26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854
 #27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215
 #28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
 #29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
 #30 0x00005631fbd81412 in main_loop () at vl.c:1866
 #31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4647

 - A closer examination shows that s->io_q.in_flight appears to have
   gone backwards:

(gdb) frame 7
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
222	            qemu_laio_process_completion(laiocb);
(gdb) p s
$2 = (LinuxAioState *) 0x7fdfc0297670
(gdb) p *s
$3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0,
    in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0,
      sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241}
(gdb) p/x s->io_q.in_flight
$4 = 0xfffffff0
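
0xfffffff0 is -16 reinterpreted as a 32-bit unsigned value, which
suggests the unlocked race allowed the counter to be decremented more
times than it was incremented until it wrapped around. A trivial
standalone illustration of that wrap-around:

    #include <stdio.h>

    int main(void)
    {
        unsigned int in_flight = 0;

        in_flight -= 16;  /* unsigned arithmetic is modular: wraps, no trap */
        printf("%u (0x%x)\n", in_flight, in_flight);
        /* prints: 4294967280 (0xfffffff0), the value seen in gdb above */
        return 0;
    }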

Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
famz pushed a commit to famz/qemu that referenced this pull request Sep 28, 2018
Both virtio-blk and virtio-scsi use virtio_queue_empty() as the
loop condition in VQ handlers (virtio_blk_handle_vq,
virtio_scsi_handle_cmd_vq). When a device is marked broken in
virtqueue_pop, for example if a vIOMMU address translation failed, we
want to break out of the loop.

This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64
kernel:

  $ qemu-system-x86_64 \
    ... \
    -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \
    -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0

The dead loop happens immediately when the kernel boots and initializes
the device, where virtio_scsi_data_plane_handle_cmd will not return:

    > ...
    > #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq
    > #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd
    > #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq
    > #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll
    > #17 0x00005586607885da in run_poll_handlers_once
    > #18 0x000055866078880e in try_poll_mode
    > #19 0x00005586607888eb in aio_poll
    > #20 0x0000558660784561 in aio_wait_bh_oneshot
    > #21 0x00005586602b9582 in virtio_scsi_dataplane_stop
    > #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd
    > #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd
    > #24 0x00005586605ab808 in virtio_pci_common_write
    > #25 0x0000558660242396 in memory_region_write_accessor
    > #26 0x00005586602425ab in access_with_adjusted_size
    > #27 0x0000558660245281 in memory_region_dispatch_write
    > #28 0x00005586601e008e in flatview_write_continue
    > #29 0x00005586601e01d8 in flatview_write
    > #30 0x00005586601e04de in address_space_write
    > #31 0x00005586601e052f in address_space_rw
    > #32 0x00005586602607f2 in kvm_cpu_exec
    > #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn
    > #34 0x000055866078bde7 in qemu_thread_start
    > #35 0x00007f5784906594 in start_thread
    > #36 0x00007f5784639e6f in clone

With this patch, virtio_queue_empty will now return 1 as soon as the
vdev is marked as broken, after a "virtio: zero sized buffers are not
allowed" error.

To be consistent, update virtio_queue_empty_rcu as well.
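
The gist of the change, as a sketch (virtio_queue_ring_empty() below is
a hypothetical stand-in for the pre-existing avail-ring check, which is
unchanged):

    int virtio_queue_empty(VirtQueue *vq)
    {
        /* New early-out: a broken device reports its queue as empty, so
         * handler loops conditioned on !virtio_queue_empty(vq) terminate. */
        if (unlikely(vq->vdev->broken)) {
            return 1;
        }
        return virtio_queue_ring_empty(vq);  /* hypothetical: existing check */
    }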

Signed-off-by: Fam Zheng <famz@redhat.com>

---

v2: - Drop ATS condition from the patch description since it is not
      essential.
    - Drop patch 2. [Paolo]
qemu-deploy pushed a commit that referenced this pull request Oct 3, 2018
Both virtio-blk and virtio-scsi use virtio_queue_empty() as the
loop condition in VQ handlers (virtio_blk_handle_vq,
virtio_scsi_handle_cmd_vq). When a device is marked broken in
virtqueue_pop, for example if a vIOMMU address translation failed, we
want to break out of the loop.

This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64
kernel with ATS enabled:

  $ qemu-system-x86_64 \
    ... \
    -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \
    -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0

The dead loop happens immediately when the kernel boots and initializes
the device, where virtio_scsi_data_plane_handle_cmd will not return:

    > ...
    > #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq
    > #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd
    > #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq
    > #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll
    > #17 0x00005586607885da in run_poll_handlers_once
    > #18 0x000055866078880e in try_poll_mode
    > #19 0x00005586607888eb in aio_poll
    > #20 0x0000558660784561 in aio_wait_bh_oneshot
    > #21 0x00005586602b9582 in virtio_scsi_dataplane_stop
    > #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd
    > #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd
    > #24 0x00005586605ab808 in virtio_pci_common_write
    > #25 0x0000558660242396 in memory_region_write_accessor
    > #26 0x00005586602425ab in access_with_adjusted_size
    > #27 0x0000558660245281 in memory_region_dispatch_write
    > #28 0x00005586601e008e in flatview_write_continue
    > #29 0x00005586601e01d8 in flatview_write
    > #30 0x00005586601e04de in address_space_write
    > #31 0x00005586601e052f in address_space_rw
    > #32 0x00005586602607f2 in kvm_cpu_exec
    > #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn
    > #34 0x000055866078bde7 in qemu_thread_start
    > #35 0x00007f5784906594 in start_thread
    > #36 0x00007f5784639e6f in clone

With this patch, virtio_queue_empty will now return 1 as soon as the
vdev is marked as broken, after a "virtio: zero sized buffers are not
allowed" error.

To be consistent, update virtio_queue_empty_rcu as well.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180910145616.8598-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pranith pushed a commit to pranith/qemu that referenced this pull request Oct 8, 2018
Both virtio-blk and virtio-scsi use virtio_queue_empty() as the
loop condition in VQ handlers (virtio_blk_handle_vq,
virtio_scsi_handle_cmd_vq). When a device is marked broken in
virtqueue_pop, for example if a vIOMMU address translation failed, we
want to break out of the loop.

This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64
kernel with ATS enabled:

  $ qemu-system-x86_64 \
    ... \
    -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \
    -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0

The dead loop happens immediately when the kernel boots and initializes
the device, where virtio_scsi_data_plane_handle_cmd will not return:

    > ...
    > #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq
    > #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd
    > #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq
    > #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll
    > #17 0x00005586607885da in run_poll_handlers_once
    > #18 0x000055866078880e in try_poll_mode
    > #19 0x00005586607888eb in aio_poll
    > #20 0x0000558660784561 in aio_wait_bh_oneshot
    > #21 0x00005586602b9582 in virtio_scsi_dataplane_stop
    > #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd
    > #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd
    > #24 0x00005586605ab808 in virtio_pci_common_write
    > #25 0x0000558660242396 in memory_region_write_accessor
    > #26 0x00005586602425ab in access_with_adjusted_size
    > #27 0x0000558660245281 in memory_region_dispatch_write
    > #28 0x00005586601e008e in flatview_write_continue
    > #29 0x00005586601e01d8 in flatview_write
    > #30 0x00005586601e04de in address_space_write
    > #31 0x00005586601e052f in address_space_rw
    > #32 0x00005586602607f2 in kvm_cpu_exec
    > #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn
    > #34 0x000055866078bde7 in qemu_thread_start
    > #35 0x00007f5784906594 in start_thread
    > #36 0x00007f5784639e6f in clone

With this patch, virtio_queue_empty will now return 1 as soon as the
vdev is marked as broken, after a "virtio: zero sized buffers are not
allowed" error.

To be consistent, update virtio_queue_empty_rcu as well.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180910145616.8598-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
deltova pushed a commit to EPITA-GISTRE/qemu that referenced this pull request Oct 8, 2018
In qemu_laio_process_completions_and_submit, the AioContext is acquired
before the ioq_submit iteration and after qemu_laio_process_completions,
but the latter is not thread safe either.

This change avoids a number of random crashes when the Main Thread and
an IO Thread collide while processing completions for the same
AioContext. This is an example of one such crash:

 - The IO Thread is trying to acquire the AioContext at aio_co_enter,
   which shows that it had not locked it beforehand:

Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)):
 #0  0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0
 #2  0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0)
    at ../nptl/pthread_mutex_lock.c:114
 #3  0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66
 #4  0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493
 #5  0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478
 #6  0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 #8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 #9  0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406
 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true)
    at util/aio-posix.c:693
 #11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64
 #12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308
 #13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

 - The Main Thread is also processing completions from the same
   AioContext, and crashes due to failed assertion at util/iov.c:78:

Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)):
 #0  0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #1  0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90
 #2  0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset")
    at assert.c:92
 #3  0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101
 #4  0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78
 #5  0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410
 #6  0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 #8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 #9  0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272
 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497
 #11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574
 #12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false)
    at util/aio-posix.c:604
 #13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273
 #14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390
 #15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590
 #16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774
 #17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401
 #18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449
 #19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080)
    at block/commit.c:92
 #20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973
 #21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90
 #22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118
 #23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436
 #24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
 #25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201
 #26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854
 #27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215
 #28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
 #29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
 #30 0x00005631fbd81412 in main_loop () at vl.c:1866
 #31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4647

 - A closer examination shows that s->io_q.in_flight appears to have
   gone backwards:

(gdb) frame 7
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
222	            qemu_laio_process_completion(laiocb);
(gdb) p s
$2 = (LinuxAioState *) 0x7fdfc0297670
(gdb) p *s
$3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0,
    in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0,
      sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241}
(gdb) p/x s->io_q.in_flight
$4 = 0xfffffff0

Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
rth7680 pushed a commit to rth7680/qemu that referenced this pull request Oct 16, 2018
On PCXS chips (PA7000, pa 1.1a), trap #18 is raised on memory faults,
while all later chips (>= PA7100) generate either trap #26, #27 or #28
(depending on the fault type).

Since the current qemu emulation models a B160L machine (with a
PA7300LC PCX-L2 chip), we should raise trap #26 (EXCP_DMAR) instead
of #18 (EXCP_DMP) on access faults by the Linux kernel to page zero.

With the patch we now get the correct output (I tested against real
hardware):
 Kernel Fault: Code=26 (Data memory access rights trap)
instead of:
 Kernel Fault: Code=18 (Data memory protection/unaligned access trap)
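
In the emulation this boils down to raising EXCP_DMAR rather than
EXCP_DMP when such a fault is detected; roughly (a sketch under the
assumption of QEMU's usual fault-reporting pattern, not the verbatim
patch):

    /* Sketch: report the access-rights trap (#26) used by PA7100 and
     * later chips rather than the PCXS-only protection trap (#18). */
    cs->exception_index = EXCP_DMAR;     /* was: EXCP_DMP */
    cpu_loop_exit_restore(cs, retaddr);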

Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20181007205153.GA30270@ls3530.fritz.box>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
yashkmankad pushed a commit to yashkmankad/qemu that referenced this pull request Dec 5, 2018
RH-Author: Kevin Wolf <kwolf@redhat.com>
Message-id: <20181010202213.7372-14-kwolf@redhat.com>
Patchwork-id: 82603
O-Subject: [RHEL-8 qemu-kvm PATCH 23/44] block/linux-aio: acquire AioContext before qemu_laio_process_completions
Bugzilla: 1637976
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: John Snow <jsnow@redhat.com>
RH-Acked-by: Thomas Huth <thuth@redhat.com>

From: Sergio Lopez <slp@redhat.com>

In qemu_laio_process_completions_and_submit, the AioContext is acquired
before the ioq_submit iteration and after qemu_laio_process_completions,
but the latter is not thread safe either.

This change avoids a number of random crashes when the Main Thread and
an IO Thread collide while processing completions for the same
AioContext. This is an example of one such crash:

 - The IO Thread is trying to acquire the AioContext at aio_co_enter,
   which shows that it had not locked it beforehand:

Thread 3 (Thread 0x7fdfd8bd8700 (LWP 36743)):
 #0  0x00007fdfe0dd542d in __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007fdfe0dd0de6 in _L_lock_870 () at /lib64/libpthread.so.0
 #2  0x00007fdfe0dd0cdf in __GI___pthread_mutex_lock (mutex=mutex@entry=0x5631fde0e6c0)
    at ../nptl/pthread_mutex_lock.c:114
 #3  0x00005631fc0603a7 in qemu_mutex_lock_impl (mutex=0x5631fde0e6c0, file=0x5631fc23520f "util/async.c", line=511) at util/qemu-thread-posix.c:66
 #4  0x00005631fc05b558 in aio_co_enter (ctx=0x5631fde0e660, co=0x7fdfcc0c2b40) at util/async.c:493
 #5  0x00005631fc05b5ac in aio_co_wake (co=<optimized out>) at util/async.c:478
 #6  0x00005631fbfc51ad in qemu_laio_process_completion (laiocb=<optimized out>) at block/linux-aio.c:104
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 #8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 #9  0x00005631fc05d978 in aio_dispatch_handlers (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:406
 #10 0x00005631fc05e3ea in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=true)
    at util/aio-posix.c:693
 #11 0x00005631fbd7ad96 in iothread_run (opaque=0x5631fde0e1c0) at iothread.c:64
 #12 0x00007fdfe0dcee25 in start_thread (arg=0x7fdfd8bd8700) at pthread_create.c:308
 #13 0x00007fdfe0afc34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

 - The Main Thread is also processing completions from the same
   AioContext, and crashes due to failed assertion at util/iov.c:78:

Thread 1 (Thread 0x7fdfeb5eac80 (LWP 36740)):
 #0  0x00007fdfe0a391f7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #1  0x00007fdfe0a3a8e8 in __GI_abort () at abort.c:90
 #2  0x00007fdfe0a32266 in __assert_fail_base (fmt=0x7fdfe0b84e68 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset")
    at assert.c:92
 #3  0x00007fdfe0a32312 in __GI___assert_fail (assertion=assertion@entry=0x5631fc238ccb "offset == 0", file=file@entry=0x5631fc23698e "util/iov.c", line=line@entry=78, function=function@entry=0x5631fc236adc <__PRETTY_FUNCTION__.15220> "iov_memset") at assert.c:101
 #4  0x00005631fc065287 in iov_memset (iov=<optimized out>, iov_cnt=<optimized out>, offset=<optimized out>, offset@entry=65536, fillc=fillc@entry=0, bytes=15515191315812405248) at util/iov.c:78
 #5  0x00005631fc065a63 in qemu_iovec_memset (qiov=<optimized out>, offset=offset@entry=65536, fillc=fillc@entry=0, bytes=<optimized out>) at util/iov.c:410
 #6  0x00005631fbfc5178 in qemu_laio_process_completion (laiocb=0x7fdd920df630) at block/linux-aio.c:88
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
 #8  0x00005631fbfc5499 in qemu_laio_process_completions_and_submit (s=0x7fdfc0297670)
    at block/linux-aio.c:237
 #9  0x00005631fbfc54ed in qemu_laio_poll_cb (opaque=<optimized out>) at block/linux-aio.c:272
 #10 0x00005631fc05d85e in run_poll_handlers_once (ctx=ctx@entry=0x5631fde0e660) at util/aio-posix.c:497
 #11 0x00005631fc05e2ca in aio_poll (blocking=false, ctx=0x5631fde0e660) at util/aio-posix.c:574
 #12 0x00005631fc05e2ca in aio_poll (ctx=0x5631fde0e660, blocking=blocking@entry=false)
    at util/aio-posix.c:604
 #13 0x00005631fbfcb8a3 in bdrv_do_drained_begin (ignore_parent=<optimized out>, recursive=<optimized out>, bs=<optimized out>) at block/io.c:273
 #14 0x00005631fbfcb8a3 in bdrv_do_drained_begin (bs=0x5631fe8b6200, recursive=<optimized out>, parent=0x0, ignore_bds_parents=<optimized out>, poll=<optimized out>) at block/io.c:390
 #15 0x00005631fbfbcd2e in blk_drain (blk=0x5631fe83ac80) at block/block-backend.c:1590
 #16 0x00005631fbfbe138 in blk_remove_bs (blk=blk@entry=0x5631fe83ac80) at block/block-backend.c:774
 #17 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:401
 #18 0x00005631fbfbe3d6 in blk_unref (blk=0x5631fe83ac80) at block/block-backend.c:449
 #19 0x00005631fbfc9a69 in commit_complete (job=0x5631fe8b94b0, opaque=0x7fdfcc1bb080)
    at block/commit.c:92
 #20 0x00005631fbf7d662 in job_defer_to_main_loop_bh (opaque=0x7fdfcc1b4560) at job.c:973
 #21 0x00005631fc05ad41 in aio_bh_poll (bh=0x7fdfcc01ad90) at util/async.c:90
 #22 0x00005631fc05ad41 in aio_bh_poll (ctx=ctx@entry=0x5631fddffdb0) at util/async.c:118
 #23 0x00005631fc05e210 in aio_dispatch (ctx=0x5631fddffdb0) at util/aio-posix.c:436
 #24 0x00005631fc05ac1e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
 #25 0x00007fdfeaae44c9 in g_main_context_dispatch (context=0x5631fde00140) at gmain.c:3201
 #26 0x00007fdfeaae44c9 in g_main_context_dispatch (context=context@entry=0x5631fde00140) at gmain.c:3854
 #27 0x00005631fc05d503 in main_loop_wait () at util/main-loop.c:215
 #28 0x00005631fc05d503 in main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
 #29 0x00005631fc05d503 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:497
 #30 0x00005631fbd81412 in main_loop () at vl.c:1866
 #31 0x00005631fbc18ff3 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4647

 - A closer examination shows that s->io_q.in_flight appears to have
   gone backwards:

(gdb) frame 7
 #7  0x00005631fbfc523c in qemu_laio_process_completions (s=s@entry=0x7fdfc0297670)
    at block/linux-aio.c:222
222	            qemu_laio_process_completion(laiocb);
(gdb) p s
$2 = (LinuxAioState *) 0x7fdfc0297670
(gdb) p *s
$3 = {aio_context = 0x5631fde0e660, ctx = 0x7fdfeb43b000, e = {rfd = 33, wfd = 33}, io_q = {plugged = 0,
    in_queue = 0, in_flight = 4294967280, blocked = false, pending = {sqh_first = 0x0,
      sqh_last = 0x7fdfc0297698}}, completion_bh = 0x7fdfc0280ef0, event_idx = 21, event_max = 241}
(gdb) p/x s->io_q.in_flight
$4 = 0xfffffff0

Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e091f0e)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
slp added a commit to slp/qemu that referenced this pull request Jan 8, 2020
All paths that lead to bdrv_backup_top_drop(), except for the call
from backup_clean(), imply that the BDS AioContext has already been
acquired, so acquiring it there as well can potentially lead to QEMU
hanging on AIO_WAIT_WHILE().

An easy way to trigger this situation is to issue a transaction with
two actions, a proper blockdev-backup and a bogus one, so that the
second one triggers a rollback. This produces a hang with a stack
trace like this one:

 #0  0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>,
     timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
 #1  0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
     at /usr/include/bits/poll2.h:77
 #2  0x000055e743386e09 in qemu_poll_ns
     (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336
 #3  0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true)
     at util/aio-posix.c:669
 #4  0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878
 #5  0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017
 #6  0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262
 #7  0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644
 #8  0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273
 #9  0x000055e74331461f in backup_job_create
     (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478
 #10 0x000055e74315bc52 in do_backup_common
     (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at blockdev.c:3580
 #11 0x000055e74315c37c in do_blockdev_backup
     (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492
 #12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018)
     at blockdev.c:1885
 #13 0x000055e743160152 in qmp_transaction
     (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340
 #14 0x000055e743287ff5 in qmp_marshal_transaction
     (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8)
     at qapi/qapi-commands-transaction.c:44
 #15 0x000055e74333de6c in do_qmp_dispatch
     (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 #16 0x000055e74333de6c in qmp_dispatch
     (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>)
     at qapi/qmp-dispatch.c:175
 #17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>)
     at monitor/qmp.c:145
 #18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
 #19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117
 #20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117
 #21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459
 #22 0x000055e743385742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 #23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176
 #24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829
 #25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219
 #26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 #27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 #28 0x000055e74316a3c1 in main_loop () at vl.c:1828
 #29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by not acquiring the AioContext there, and ensuring all paths
leading to it have it already acquired (backup_clean()).
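
A sketch of what that amounts to (assumed shape, including the
BackupBlockJob field names, not the verbatim patch): the lock handling
moves out of the helper and into backup_clean(), the one caller that
did not already hold it.

    /* bdrv_backup_top_drop() no longer acquires the AioContext itself;
     * every caller must already hold it. */

    static void backup_clean(Job *job)
    {
        BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
        AioContext *aio_context = bdrv_get_aio_context(s->backup_top);

        aio_context_acquire(aio_context);
        bdrv_backup_top_drop(s->backup_top);  /* runs with the lock held */
        aio_context_release(aio_context);
    }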

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111
Signed-off-by: Sergio Lopez <slp@redhat.com>
crobinso pushed a commit to crobinso/qemu that referenced this pull request Jan 28, 2020
All paths that lead to bdrv_backup_top_drop(), except for the call
from backup_clean(), imply that the BDS AioContext has already been
acquired, so acquiring it there as well can potentially lead to QEMU
hanging on AIO_WAIT_WHILE().

An easy way to trigger this situation is to issue a transaction with
two actions, a proper blockdev-backup and a bogus one, so that the
second one triggers a rollback. This produces a hang with a stack
trace like this one:

 #0  0x00007fb680c75016 in __GI_ppoll (fds=0x55e74580f7c0, nfds=1, timeout=<optimized out>,
     timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
 #1  0x000055e743386e09 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>)
     at /usr/include/bits/poll2.h:77
 #2  0x000055e743386e09 in qemu_poll_ns
     (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:336
 #3  0x000055e743388dc4 in aio_poll (ctx=0x55e7458925d0, blocking=blocking@entry=true)
     at util/aio-posix.c:669
 #4  0x000055e743305dea in bdrv_flush (bs=bs@entry=0x55e74593c0d0) at block/io.c:2878
 #5  0x000055e7432be58e in bdrv_close (bs=0x55e74593c0d0) at block.c:4017
 #6  0x000055e7432be58e in bdrv_delete (bs=<optimized out>) at block.c:4262
 #7  0x000055e7432be58e in bdrv_unref (bs=bs@entry=0x55e74593c0d0) at block.c:5644
 #8  0x000055e743316b9b in bdrv_backup_top_drop (bs=bs@entry=0x55e74593c0d0) at block/backup-top.c:273
 #9  0x000055e74331461f in backup_job_create
     (job_id=0x0, bs=bs@entry=0x55e7458d5820, target=target@entry=0x55e74589f640, speed=0, sync_mode=MIRROR_SYNC_MODE_FULL, sync_bitmap=sync_bitmap@entry=0x0, bitmap_mode=BITMAP_SYNC_MODE_ON_SUCCESS, compress=false, filter_node_name=0x0, on_source_error=BLOCKDEV_ON_ERROR_REPORT, on_target_error=BLOCKDEV_ON_ERROR_REPORT, creation_flags=0, cb=0x0, opaque=0x0, txn=0x0, errp=0x7ffddfd1efb0) at block/backup.c:478
 #10 0x000055e74315bc52 in do_backup_common
     (backup=backup@entry=0x55e746c066d0, bs=bs@entry=0x55e7458d5820, target_bs=target_bs@entry=0x55e74589f640, aio_context=aio_context@entry=0x55e7458a91e0, txn=txn@entry=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at blockdev.c:3580
 #11 0x000055e74315c37c in do_blockdev_backup
     (backup=backup@entry=0x55e746c066d0, txn=0x0, errp=errp@entry=0x7ffddfd1efb0)
     at /usr/src/debug/qemu-kvm-4.2.0-2.module+el8.2.0+5135+ed3b2489.x86_64/./qapi/qapi-types-block-core.h:1492
 #12 0x000055e74315c449 in blockdev_backup_prepare (common=0x55e746a8de90, errp=0x7ffddfd1f018)
     at blockdev.c:1885
 #13 0x000055e743160152 in qmp_transaction
     (dev_list=<optimized out>, has_props=<optimized out>, props=0x55e7467fe2c0, errp=errp@entry=0x7ffddfd1f088) at blockdev.c:2340
 #14 0x000055e743287ff5 in qmp_marshal_transaction
     (args=<optimized out>, ret=<optimized out>, errp=0x7ffddfd1f0f8)
     at qapi/qapi-commands-transaction.c:44
 #15 0x000055e74333de6c in do_qmp_dispatch
     (errp=0x7ffddfd1f0f0, allow_oob=<optimized out>, request=<optimized out>, cmds=0x55e743c28d60 <qmp_commands>) at qapi/qmp-dispatch.c:132
 #16 0x000055e74333de6c in qmp_dispatch
     (cmds=0x55e743c28d60 <qmp_commands>, request=<optimized out>, allow_oob=<optimized out>)
     at qapi/qmp-dispatch.c:175
 #17 0x000055e74325c061 in monitor_qmp_dispatch (mon=0x55e745908030, req=<optimized out>)
     at monitor/qmp.c:145
 #18 0x000055e74325c6fa in monitor_qmp_bh_dispatcher (data=<optimized out>) at monitor/qmp.c:234
 #19 0x000055e743385866 in aio_bh_call (bh=0x55e745807ae0) at util/async.c:117
 #20 0x000055e743385866 in aio_bh_poll (ctx=ctx@entry=0x55e7458067a0) at util/async.c:117
 #21 0x000055e743388c54 in aio_dispatch (ctx=0x55e7458067a0) at util/aio-posix.c:459
 #22 0x000055e743385742 in aio_ctx_dispatch
     (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
 #23 0x00007fb68543e67d in g_main_dispatch (context=0x55e745893a40) at gmain.c:3176
 #24 0x00007fb68543e67d in g_main_context_dispatch (context=context@entry=0x55e745893a40) at gmain.c:3829
 #25 0x000055e743387d08 in glib_pollfds_poll () at util/main-loop.c:219
 #26 0x000055e743387d08 in os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
 #27 0x000055e743387d08 in main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
 #28 0x000055e74316a3c1 in main_loop () at vl.c:1828
 #29 0x000055e743016a72 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
     at vl.c:4504

Fix this by not acquiring the AioContext there, and ensuring all paths
leading to it have it already acquired (backup_clean()).

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1782111
Signed-off-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
@repo-lockdown

repo-lockdown bot commented Apr 8, 2020

Thank you for your interest in the QEMU project.

This repository is a read-only mirror of the project's master
repositories hosted on https://git.qemu.org/git/qemu.git.
The project does not process merge requests filed on GitHub.

QEMU welcomes contributions of code (either fixing bugs or adding new
functionality). However, we get a lot of patches, and so we have some
guidelines about contributing on the project website:
https://www.qemu.org/contribute/

@repo-lockdown repo-lockdown bot locked and limited conversation to collaborators Apr 8, 2020