Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

qemu build fails on wheezy x86_64 #19

Closed
smclinden opened this issue Sep 17, 2014 · 9 comments
Closed

qemu build fails on wheezy x86_64 #19

smclinden opened this issue Sep 17, 2014 · 9 comments

Comments

@smclinden
Copy link

I've built LLVM 3.3 according to the documentation (but not with the debug option). When I attempt to compile Panda I get the following:

  • nproc
  • make -j 1
    GEN config-host.h
    GEN trace.h
    GEN trace.c
    GEN config-target.h
    LINK x86_64-softmmu/qemu-system-x86_64
    ../slirp/misc.o: In function memset': /usr/include/x86_64-linux-gnu/bits/string3.h:82: warning: memset used with constant zero length parameter; this could be due to transposed parameters panda/panda_dynval_inst.o:(.data.rel.ro._ZTIN4llvm22PandaInstrFunctionPassE[_ZTIN4llvm22PandaInstrFunctionPassE]+0x10): undefined reference totypeinfo for llvm::FunctionPass'
    panda/panda_helper_call_morph.o:(.data.rel.ro._ZTIN4llvm26PandaCallMorphFunctionPassE[_ZTIN4llvm26PandaCallMorphFunctionPassE]+0x10): undefined reference to typeinfo for llvm::FunctionPass' tcg/tcg-llvm.o:(.data.rel.ro._ZTI17TJITMemoryManager[_ZTI17TJITMemoryManager]+0x10): undefined reference totypeinfo for llvm::SectionMemoryManager'
    collect2: error: ld returned 1 exit status
    make[1]: *** [qemu-system-x86_64] Error 1
    make: *** [subdir-x86_64-softmmu] Error 2

Suggestions?

@moyix
Copy link
Collaborator

moyix commented Sep 17, 2014

Could you make sure that LLVM and QEMU were built with the same compiler?

Also, make sure that you're using the ./build.sh for QEMU.

@smclinden
Copy link
Author

Yes to both. Same compiler and used the ./build.sh (but changed the path to the llvm build).

@smclinden
Copy link
Author

64-bit Linux should not be a problem, right?

@jhodosh
Copy link

jhodosh commented Sep 17, 2014

64-bit Linux is the only supported build environment.

@moyix
Copy link
Collaborator

moyix commented Sep 17, 2014

I believe this is caused by building LLVM without RTTI support. Make sure that you have built LLVM using:

REQUIRES_RTTI=1 make

@smclinden
Copy link
Author

That may be it. I believe that LLVM is built with -fno-rtti by default, though I may be incorrect about that.

@smclinden
Copy link
Author

I checked and I actually cut and pasted this line from the Panda documentation:
CC=gcc-4.7 CXX=g++-4.7 ./configure --enable-optimized --disable-assertions --enable-targets=x86 && REQUIRES_RTTI=1 make -j $(nproc)

I also checked and build.sh uses version 4.7 for gcc and g++ so they are using the same compiler.

@moyix
Copy link
Collaborator

moyix commented Sep 23, 2014

Well, I the only way to proceed now is to start checking low-level details. Since it's complaining about an undefined symbol, make sure the LLVM library you're linking to defines typeinfo for llvm::FunctionPass (it should be in libLLVMCore.a).

Mine looks like (the relevant line is the one with the "V"):

$ nm -C ../llvm-3.3/Release/lib/libLLVMCore.a | grep "typeinfo for llvm::FunctionPass"
                 U typeinfo for llvm::FunctionPass
0000000000000000 V typeinfo for llvm::FunctionPass
                 U typeinfo for llvm::FunctionPass
0000000000000000 V typeinfo for llvm::FunctionPassManager
0000000000000000 V typeinfo for llvm::FunctionPassManagerImpl
                 U typeinfo for llvm::FunctionPass
                 U typeinfo for llvm::FunctionPass

Also make sure the library you run nm on is actually the one you're linking to. If you do V=1 make &> compile.log in the qemu directory, you should at the end see a line like:

g++-4.7 -m64 -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE [... lots and lots snipped ...] -lLLVMTableGen -lLLVMDebugInfo -lLLVMOption -lLLVMX86Disassembler -lLLVMX86AsmParser -lLLVMX86CodeGen -lLLVMSelectionDAG -lLLVMAsmPrinter -lLLVMX86Desc -lLLVMX86Info -lLLVMX86AsmPrinter -lLLVMX86Utils -lLLVMIRReader -lLLVMAsmParser -lLLVMMCDisassembler -lLLVMMCParser -lLLVMInstrumentation -lLLVMArchive -lLLVMBitReader -lLLVMInterpreter -lLLVMipo -lLLVMVectorize -lLLVMLinker -lLLVMBitWriter -lLLVMMCJIT -lLLVMJIT -lLLVMCodeGen -lLLVMObjCARCOpts -lLLVMScalarOpts -lLLVMInstCombine -lLLVMTransformUtils -lLLVMipa -lLLVMAnalysis -lLLVMRuntimeDyld -lLLVMExecutionEngine -lLLVMTarget -lLLVMMC -lLLVMObject -lLLVMCore -lLLVMSupport -L/scratch/git/panda/llvm-3.3/Release/lib [...]

The part at the end (-L/scratch/git/panda/llvm-3.3/Release/lib) is the path to LLVM QEMU is using.

Since ld is complaining, I would expect that the libLLVMCore.a you're linking against does not have the typeinfo for llvm::FunctionPass defined; your goal will be to figure out why that's happening and make sure it gets built with that symbol.

Unfortunately I can't be of much more help, since I can't reproduce this on my end.

@yrp604
Copy link

yrp604 commented Sep 28, 2014

You're likely linking against llvm 3.4, which moved llvm::FunctionPass into llvm::legacy::FunctionPass. If you change the order of your include dirs (so 3.3 is first), it will probably work.

@phulin phulin closed this as completed Feb 27, 2015
moyix pushed a commit that referenced this issue Feb 8, 2019
During block job completion, nothing is preventing
block_job_defer_to_main_loop_bh from being called in a nested
aio_poll(), which is a trouble, such as in this code path:

    qmp_block_commit
      commit_active_start
        bdrv_reopen
          bdrv_reopen_multiple
            bdrv_reopen_prepare
              bdrv_flush
                aio_poll
                  aio_bh_poll
                    aio_bh_call
                      block_job_defer_to_main_loop_bh
                        stream_complete
                          bdrv_reopen

block_job_defer_to_main_loop_bh is the last step of the stream job,
which should have been "paused" by the bdrv_drained_begin/end in
bdrv_reopen_multiple, but it is not done because it's in the form of a
main loop BH.

Similar to why block jobs should be paused between drained_begin and
drained_end, BHs they schedule must be excluded as well.  To achieve
this, this patch forces draining the BH in BDRV_POLL_WHILE.

As a side effect this fixes a hang in block_job_detach_aio_context
during system_reset when a block job is ready:

    #0  0x0000555555aa79f3 in bdrv_drain_recurse
    #1  0x0000555555aa825d in bdrv_drained_begin
    #2  0x0000555555aa8449 in bdrv_drain
    #3  0x0000555555a9c356 in blk_drain
    #4  0x0000555555aa3cfd in mirror_drain
    #5  0x0000555555a66e11 in block_job_detach_aio_context
    #6  0x0000555555a62f4d in bdrv_detach_aio_context
    #7  0x0000555555a63116 in bdrv_set_aio_context
    #8  0x0000555555a9d326 in blk_set_aio_context
    #9  0x00005555557e38da in virtio_blk_data_plane_stop
    #10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd
    #11 0x00005555559fa49b in virtio_bus_stop_ioeventfd
    #12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd
    #13 0x00005555559f6a18 in virtio_pci_reset
    #14 0x00005555559139a9 in qdev_reset_one
    #15 0x0000555555916738 in qbus_walk_children
    #16 0x0000555555913318 in qdev_walk_children
    #17 0x0000555555916738 in qbus_walk_children
    #18 0x00005555559168ca in qemu_devices_reset
    #19 0x000055555581fcbb in pc_machine_reset
    #20 0x00005555558a4d96 in qemu_system_reset
    #21 0x000055555577157a in main_loop_should_exit
    #22 0x000055555577157a in main_loop
    #23 0x000055555577157a in main

The rationale is that the loop in block_job_detach_aio_context cannot
make any progress in pausing/completing the job, because bs->in_flight
is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
BH. With this patch, it does.

Reported-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-3-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
AndrewFasano pushed a commit that referenced this issue Oct 16, 2023
Both virtio-blk and virtio-scsi use virtio_queue_empty() as the
loop condition in VQ handlers (virtio_blk_handle_vq,
virtio_scsi_handle_cmd_vq). When a device is marked broken in
virtqueue_pop, for example if a vIOMMU address translation failed, we
want to break out of the loop.

This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64
kernel with ATS enabled:

  $ qemu-system-x86_64 \
    ... \
    -device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \
    -device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0

The dead loop happens immediately when the kernel boots and initializes
the device, where virtio_scsi_data_plane_handle_cmd will not return:

    > ...
    > #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq
    > #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd
    > #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq
    > #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll
    > #17 0x00005586607885da in run_poll_handlers_once
    > #18 0x000055866078880e in try_poll_mode
    > #19 0x00005586607888eb in aio_poll
    > #20 0x0000558660784561 in aio_wait_bh_oneshot
    > #21 0x00005586602b9582 in virtio_scsi_dataplane_stop
    > #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd
    > #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd
    > #24 0x00005586605ab808 in virtio_pci_common_write
    > #25 0x0000558660242396 in memory_region_write_accessor
    > #26 0x00005586602425ab in access_with_adjusted_size
    > #27 0x0000558660245281 in memory_region_dispatch_write
    > #28 0x00005586601e008e in flatview_write_continue
    > #29 0x00005586601e01d8 in flatview_write
    > #30 0x00005586601e04de in address_space_write
    > #31 0x00005586601e052f in address_space_rw
    > #32 0x00005586602607f2 in kvm_cpu_exec
    > #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn
    > #34 0x000055866078bde7 in qemu_thread_start
    > #35 0x00007f5784906594 in start_thread
    > #36 0x00007f5784639e6f in clone

With this patch, virtio_queue_empty will now return 1 as soon as the
vdev is marked as broken, after a "virtio: zero sized buffers are not
allowed" error.

To be consistent, update virtio_queue_empty_rcu as well.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180910145616.8598-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
AndrewFasano pushed a commit that referenced this issue Oct 16, 2023
The virtqueue code sets up MemoryRegionCaches to access the virtqueue
guest RAM data structures.  The code currently assumes that
VRingMemoryRegionCaches is initialized before device emulation code
accesses the virtqueue.  An assertion will fail in
vring_get_region_caches() when this is not true.  Device fuzzing found a
case where this assumption is false (see below).

Virtqueue guest RAM addresses can also be changed from a vCPU thread
while an IOThread is accessing the virtqueue.  This breaks the same
assumption but this time the caches could become invalid partway through
the virtqueue code.  The code fetches the caches RCU pointer multiple
times so we will need to validate the pointer every time it is fetched.

Add checks each time we call vring_get_region_caches() and treat invalid
caches as a nop: memory stores are ignored and memory reads return 0.

The fuzz test failure is as follows:

  $ qemu -M pc -device virtio-blk-pci,id=drv0,drive=drive0,addr=4.0 \
         -drive if=none,id=drive0,file=null-co://,format=raw,auto-read-only=off \
         -drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw \
         -display none \
         -qtest stdio
  endianness
  outl 0xcf8 0x80002020
  outl 0xcfc 0xe0000000
  outl 0xcf8 0x80002004
  outw 0xcfc 0x7
  write 0xe0000000 0x24 0x00ffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab5cffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffabffffffab0000000001
  inb 0x4
  writew 0xe000001c 0x1
  write 0xe0000014 0x1 0x0d

The following error message is produced:

  qemu-system-x86_64: /home/stefanha/qemu/hw/virtio/virtio.c:286: vring_get_region_caches: Assertion `caches != NULL' failed.

The backtrace looks like this:

  #0  0x00007ffff5520625 in raise () at /lib64/libc.so.6
  #1  0x00007ffff55098d9 in abort () at /lib64/libc.so.6
  #2  0x00007ffff55097a9 in _nl_load_domain.cold () at /lib64/libc.so.6
  #3  0x00007ffff5518a66 in annobin_assert.c_end () at /lib64/libc.so.6
  #4  0x00005555559073da in vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:286
  #5  vring_get_region_caches (vq=<optimized out>) at qemu/hw/virtio/virtio.c:283
  #6  0x000055555590818d in vring_used_flags_set_bit (mask=1, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398
  #7  virtio_queue_split_set_notification (enable=0, vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:398
  #8  virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:451
  #9  0x0000555555908512 in virtio_queue_set_notification (vq=vq@entry=0x5555575ceea0, enable=enable@entry=0) at qemu/hw/virtio/virtio.c:444
  #10 0x00005555558c697a in virtio_blk_handle_vq (s=0x5555575c57e0, vq=0x5555575ceea0) at qemu/hw/block/virtio-blk.c:775
  #11 0x0000555555907836 in virtio_queue_notify_aio_vq (vq=0x5555575ceea0) at qemu/hw/virtio/virtio.c:2244
  #12 0x0000555555cb5dd7 in aio_dispatch_handlers (ctx=ctx@entry=0x55555671a420) at util/aio-posix.c:429
  #13 0x0000555555cb67a8 in aio_dispatch (ctx=0x55555671a420) at util/aio-posix.c:460
  #14 0x0000555555cb307e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at util/async.c:260
  #15 0x00007ffff7bbc510 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
  #16 0x0000555555cb5848 in glib_pollfds_poll () at util/main-loop.c:219
  #17 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
  #18 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:518
  #19 0x00005555559b20c9 in main_loop () at vl.c:1683
  #20 0x0000555555838115 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4441

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Cc: Michael Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200207104619.164892-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
AndrewFasano pushed a commit that referenced this issue Oct 16, 2023
when s->inflight is freed, vhost_dev_free_inflight may try to access
s->inflight->addr, it will retrigger the following issue.

==7309==ERROR: AddressSanitizer: heap-use-after-free on address 0x604001020d18 at pc 0x555555ce948a bp 0x7fffffffb170 sp 0x7fffffffb160
READ of size 8 at 0x604001020d18 thread T0
    #0 0x555555ce9489 in vhost_dev_free_inflight /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473
    #1 0x555555cd86eb in virtio_reset /root/smartx/qemu-el7/qemu-test/hw/virtio/virtio.c:1214
    #2 0x5555560d3eff in virtio_pci_reset hw/virtio/virtio-pci.c:1859
    #3 0x555555f2ac53 in device_set_realized hw/core/qdev.c:893
    #4 0x5555561d572c in property_set_bool qom/object.c:1925
    #5 0x5555561de8de in object_property_set_qobject qom/qom-qobject.c:27
    #6 0x5555561d99f4 in object_property_set_bool qom/object.c:1188
    #7 0x555555e50ae7 in qdev_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:626
    #8 0x555555e51213 in qmp_device_add /root/smartx/qemu-el7/qemu-test/qdev-monitor.c:806
    #9 0x555555e8ff40 in hmp_device_add /root/smartx/qemu-el7/qemu-test/hmp.c:1951
    #10 0x555555be889a in handle_hmp_command /root/smartx/qemu-el7/qemu-test/monitor.c:3404
    #11 0x555555beac8b in monitor_command_cb /root/smartx/qemu-el7/qemu-test/monitor.c:4296
    #12 0x555556433eb7 in readline_handle_byte util/readline.c:393
    #13 0x555555be89ec in monitor_read /root/smartx/qemu-el7/qemu-test/monitor.c:4279
    #14 0x5555563285cc in tcp_chr_read chardev/char-socket.c:470
    #15 0x7ffff670b968 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x4a968)
    #16 0x55555640727c in glib_pollfds_poll util/main-loop.c:215
    #17 0x55555640727c in os_host_main_loop_wait util/main-loop.c:238
    #18 0x55555640727c in main_loop_wait util/main-loop.c:497
    #19 0x555555b2d0bf in main_loop /root/smartx/qemu-el7/qemu-test/vl.c:2013
    #20 0x555555b2d0bf in main /root/smartx/qemu-el7/qemu-test/vl.c:4776
    #21 0x7fffdd2eb444 in __libc_start_main (/lib64/libc.so.6+0x22444)
    #22 0x555555b3767a  (/root/smartx/qemu-el7/qemu-test/x86_64-softmmu/qemu-system-x86_64+0x5e367a)

0x604001020d18 is located 8 bytes inside of 40-byte region [0x604001020d10,0x604001020d38)
freed by thread T0 here:
    #0 0x7ffff6f00508 in __interceptor_free (/lib64/libasan.so.4+0xde508)
    #1 0x7ffff671107d in g_free (/lib64/libglib-2.0.so.0+0x5007d)

previously allocated by thread T0 here:
    #0 0x7ffff6f00a88 in __interceptor_calloc (/lib64/libasan.so.4+0xdea88)
    #1 0x7ffff6710fc5 in g_malloc0 (/lib64/libglib-2.0.so.0+0x4ffc5)

SUMMARY: AddressSanitizer: heap-use-after-free /root/smartx/qemu-el7/qemu-test/hw/virtio/vhost.c:1473 in vhost_dev_free_inflight
Shadow bytes around the buggy address:
  0x0c08801fc150: fa fa 00 00 00 00 04 fa fa fa fd fd fd fd fd fa
  0x0c08801fc160: fa fa fd fd fd fd fd fd fa fa 00 00 00 00 04 fa
  0x0c08801fc170: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 04 fa
  0x0c08801fc180: fa fa 00 00 00 00 00 01 fa fa 00 00 00 00 00 01
  0x0c08801fc190: fa fa 00 00 00 00 00 fa fa fa 00 00 00 00 04 fa
=>0x0c08801fc1a0: fa fa fd[fd]fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c08801fc1b0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x0c08801fc1c0: fa fa 00 00 00 00 00 fa fa fa fd fd fd fd fd fd
  0x0c08801fc1d0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa
  0x0c08801fc1e0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fd
  0x0c08801fc1f0: fa fa 00 00 00 00 00 01 fa fa fd fd fd fd fd fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==7309==ABORTING

Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20200417101707.14467-1-fengli@smartx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
MarkMankins added a commit to MarkMankins/panda that referenced this issue Dec 18, 2023
…replayable_recordings

Ace758 make replayable recordings
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants