
Comparing changes

base repository: qemu/qemu
base: 26da5de3a09d
head repository: qemu/qemu
compare: cde0704a76cc
  • 8 commits
  • 17 files changed
  • 5 contributors

Commits on May 22, 2023

  1. hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0

    Since its implementation in v8.0.0-rc0, having PCI_ERR_UNCOR_MASK
    set for machine types < 8.0 causes migration to fail if the target
    QEMU version is < 8.0.0:
    
    qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0
    qemu-system-x86_64: Failed to load PCIDevice:config
    qemu-system-x86_64: Failed to load e1000e:parent_obj
    qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e'
    qemu-system-x86_64: load of migration failed: Invalid argument
    
    The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0,
    with this cmdline:
    
    ./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]
    
    To fix this, the property x-pcie-err-unc-mask is introduced to
    control whether PCI_ERR_UNCOR_MASK is enabled. The property is
    enabled by default, but is disabled for machine types <= 7.2.
    
    Fixes: 010746a ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
    Suggested-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Leonardo Bras <leobras@redhat.com>
    Message-Id: <20230503002701.854329-1-leobras@redhat.com>
    Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576
    Tested-by: Fiona Ebner <f.ebner@proxmox.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    (cherry picked from commit 5ed3dab)
    Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
    Leonardo Bras authored and Michael Tokarev committed May 22, 2023
    Commit adc4975
  2. virtio-net: not enable vq reset feature unconditionally

    Commit 93a97dc ("virtio-net: enable vq reset feature") unconditionally
    enables the vq reset feature whenever the device is emulated. This
    makes it impossible to actually disable the feature, and it causes
    migration problems from QEMU versions earlier than 7.2.
    
    The commit is unneeded, as the device property system already enables
    or disables the feature properly.
    
    This reverts commit 93a97dc.
    Fixes: 93a97dc ("virtio-net: enable vq reset feature")
    Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
    
    Message-Id: <20230504101447.389398-1-eperezma@redhat.com>
    Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    (cherry picked from commit 1fac00f)
    Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
    eugpermar authored and Michael Tokarev committed May 22, 2023
    Commit 302ac06
  3. virtio-crypto: fix NULL pointer dereference in virtio_crypto_free_request
    
    Ensure op_info is not NULL in case of QCRYPTODEV_BACKEND_ALG_SYM algtype.
    
    Fixes: 0e660a6 ("crypto: Introduce RSA algorithm")
    Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
    Reported-by: Yiming Tao <taoym@zju.edu.cn>
    Message-Id: <20230509075317.1132301-1-mcascell@redhat.com>
    Reviewed-by: Gonglei <arei.gonglei@huawei.com>
    Reviewed-by: zhenwei pi <pizhenwei@bytedance.com>
    Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    (cherry picked from commit 3e69908)
    Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
    Mauro Matteo Cascella authored and Michael Tokarev committed May 22, 2023
    Commit 81d13aa
  4. aio-posix: do not nest poll handlers

    QEMU's event loop supports nesting, which means that event handler
    functions may themselves call aio_poll(). The condition that triggered a
    handler must be reset before the nested aio_poll() call, otherwise the
    same handler will be called and immediately re-enter aio_poll. This
    leads to an infinite loop and stack exhaustion.
    
    Poll handlers are especially prone to this issue, because they typically
    reset their condition by finishing the processing of pending work.
    Unfortunately it is during the processing of pending work that nested
    aio_poll() calls typically occur and the condition has not yet been
    reset.
    
    Disable a poll handler during ->io_poll_ready() so that a nested
    aio_poll() call cannot invoke ->io_poll_ready() again. As a result, the
    disabled poll handler and its associated fd handler do not run during
    the nested aio_poll(). Calling aio_set_fd_handler() from inside nested
    aio_poll() could cause it to run again. If the fd handler is pending
    inside nested aio_poll(), then it will also run again.
    
    In theory fd handlers can be affected by the same issue, but they are
    more likely to reset the condition before calling nested aio_poll().
    
    This is a special case and it's somewhat complex, but I don't see a way
    around it as long as nested aio_poll() is supported.
    
    Cc: qemu-stable@nongnu.org
    Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2186181
    Fixes: c382706 ("block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK")
    Cc: Kevin Wolf <kwolf@redhat.com>
    Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>
    Cc: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    Message-Id: <20230502184134.534703-2-stefanha@redhat.com>
    Reviewed-by: Kevin Wolf <kwolf@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    (cherry picked from commit 6d740fb)
    Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
    Stefan Hajnoczi authored and Michael Tokarev committed May 22, 2023
    Commit a91defe
  5. tests: add test for nested aio_poll() in poll handlers

    Cc: qemu-stable@nongnu.org
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    Message-Id: <20230502184134.534703-3-stefanha@redhat.com>
    [kwolf: Restrict to CONFIG_POSIX, Windows doesn't support polling]
    Tested-by: Kevin Wolf <kwolf@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    (cherry picked from commit 844a12a)
    Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
    Stefan Hajnoczi authored and Michael Tokarev committed May 22, 2023
    Commit a0b89ba
  6. block: compile out assert_bdrv_graph_readable() by default

    reader_count() is a performance bottleneck because the global
    aio_context_list_lock mutex causes thread contention. Put this debugging
    assertion behind a new ./configure --enable-debug-graph-lock option and
    disable it by default.
    
    The --enable-debug-graph-lock option is also enabled by the more general
    --enable-debug option.
    
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
    Message-Id: <20230501173443.153062-1-stefanha@redhat.com>
    Reviewed-by: Kevin Wolf <kwolf@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    (cherry picked from commit 58a2e3f)
    Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
    Stefan Hajnoczi authored and Michael Tokarev committed May 22, 2023
    Commit 9b56be5
  7. graph-lock: Disable locking for now

    In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They
    come from callers that hold an AioContext lock, which is not allowed
    during polling. In theory, we could temporarily release the lock, but
    callers are inconsistent about whether they hold a lock, and if they do,
    some are also confused about which one they hold. While all of this is
    fixable, it's not trivial, and the best course of action for 8.0.1 is
    probably just disabling the graph locking code temporarily.
    
    We don't currently rely on graph locking yet. It is supposed to replace
    the AioContext lock eventually to enable multiqueue support, but as long
    as we still have the AioContext lock, it is sufficient without the graph
    lock. Once the AioContext lock goes away, the deadlock doesn't exist any
    more either and this commit can be reverted. (Of course, it can also be
    reverted while the AioContext lock still exists if the callers have been
    fixed.)
    
    Cc: qemu-stable@nongnu.org
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Message-Id: <20230517152834.277483-2-kwolf@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    (cherry picked from commit 80fc5d2)
    Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
    Kevin Wolf authored and Michael Tokarev committed May 22, 2023
    Commit 03e7a4e
  8. nbd/server: Fix drained_poll to wake coroutine in right AioContext

    nbd_drained_poll() generally runs in the main thread, not whatever
    iothread the NBD server coroutine is meant to run in, so it can't
    directly reenter the coroutines to wake them up.
    
    The code seems to have the right intention: it specifies the correct
    AioContext when it calls qemu_aio_coroutine_enter(). However, this
    function doesn't schedule the coroutine to run in that AioContext; it
    assumes it is already called in the home thread of the AioContext.
    
    To fix this, add a new thread-safe qio_channel_wake_read() that can be
    called in the main thread to wake up the coroutine in its AioContext,
    and use this in nbd_drained_poll().
    
    Cc: qemu-stable@nongnu.org
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    Message-Id: <20230517152834.277483-3-kwolf@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Signed-off-by: Kevin Wolf <kwolf@redhat.com>
    (cherry picked from commit 7c1f51b)
    Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
    Kevin Wolf authored and Michael Tokarev committed May 22, 2023
    Commit cde0704