Commits
virtio-blk-dat…
Name already in use
Commits on Dec 17, 2012
-
virtio-blk: add x-data-plane=on|off performance feature
The virtio-blk-data-plane feature is easy to integrate into hw/virtio-blk.c. The data plane can be started and stopped similar to vhost-net. Users can take advantage of the virtio-blk-data-plane feature using the new -device virtio-blk-pci,x-data-plane=on property. The x-data-plane name was chosen because at this stage the feature is experimental and likely to see changes in the future. If the VM configuration does not support virtio-blk-data-plane an error message is printed. Although we could fall back to regular virtio-blk, I prefer the explicit approach since it prompts the user to fix their configuration if they want the performance benefit of virtio-blk-data-plane. Limitations: * Only format=raw is supported * Live migration is not supported * Block jobs, hot unplug, and other operations fail with -EBUSY * I/O throttling limits are ignored * Only Linux hosts are supported due to Linux AIO usage Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
dataplane: add virtio-blk data plane code
virtio-blk-data-plane is a subset implementation of virtio-blk. It only handles read, write, and flush requests. It does this using a dedicated thread that executes an epoll(2)-based event loop and processes I/O using Linux AIO. This approach performs very well but can be used for raw image files only. The number of IOPS achieved has been reported to be several times higher than the existing virtio-blk implementation. Eventually it should be possible to unify virtio-blk-data-plane with the main body of QEMU code once the block layer and hardware emulation is able to run outside the global mutex. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
virtio-blk: restore VirtIOBlkConf->config_wce flag
Two slightly different versions of a patch to conditionally set VIRTIO_BLK_F_CONFIG_WCE through the "config-wce" qdev property have been applied (ea776ab and eec7f96). David Gibson <david@gibson.dropbear.id.au> noticed that the "config-wce" property is broken as a result and fixed it recently. The fix sets the host_features VIRTIO_BLK_F_CONFIG_WCE bit from a qdev property. Unfortunately, the virtio device then has no chance to test for the presence of the feature bit during virtio_blk_init(). Therefore, reinstate the VirtIOBlkConf->config_wce flag. Drop the duplicate qdev property to set the host_features bit. The VirtIOBlkConf->config_wce flag will be used by virtio-blk-data-plane in a later patch. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
iov: add qemu_iovec_concat_iov()
The qemu_iovec_concat() function copies a subset of a QEMUIOVector. The new qemu_iovec_concat_iov() function does the same for a iov/cnt pair. It is easy to define qemu_iovec_concat() in terms of qemu_iovec_concat_iov(). The existing code is mostly unchanged, except for the assertion src->size >= soffset, which cannot be efficiently checked upfront on a iov/cnt pair. Instead we assert upon hitting the end of src with an unsatisfied soffset. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
test-iov: add iov_discard_front/back() testcases
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
iov: add iov_discard_front/back() to remove data
The iov_discard_front/back() functions remove data from the front or back of the vector. This is useful when peeling off header/footer structs. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
dataplane: add Linux AIO request queue
The IOQueue has a pool of iocb structs and a function to add new read/write requests. Multiple requests can be added before calling the submit function to actually tell the host kernel to begin I/O. This allows callers to batch requests and submit them in one go. The actual I/O is performed using Linux AIO. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
Outside the safety of the global mutex we need to poll on file descriptors. I found epoll(2) is a convenient way to do that, although other options could replace this module in the future (such as an AioContext-based loop or glib's GMainLoop). One important feature of this small event loop implementation is that the loop can be terminated in a thread-safe way. This allows QEMU to stop the data plane thread cleanly. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
dataplane: add virtqueue vring code
The virtio-blk-data-plane cannot access memory using the usual QEMU functions since it executes outside the global mutex and the memory APIs are this time are not thread-safe. This patch introduces a virtqueue module based on the kernel's vhost vring code. The trick is that we map guest memory ahead of time and access it cheaply outside the global mutex. Once the hardware emulation code can execute outside the global mutex it will be possible to drop this code. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
dataplane: add host memory mapping code
The data plane thread needs to map guest physical addresses to host pointers. Normally this is done with cpu_physical_memory_map() but the function assumes the global mutex is held. The data plane thread does not touch the global mutex and therefore needs a thread-safe memory mapping mechanism. Hostmem registers a MemoryListener similar to how vhost collects and pushes memory region information into the kernel. There is a fine-grained lock on the regions list which is held during lookup and when installing a new regions list. When the physical memory map changes the MemoryListener callbacks are invoked. They build up a new list of memory regions which is finally installed when the list has been completed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
configure: add CONFIG_VIRTIO_BLK_DATA_PLANE
The virtio-blk-data-plane feature only works with Linux AIO. Therefore add a ./configure option and necessary checks to implement this dependency. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012 -
raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane
The raw_get_aio_fd() function allows virtio-blk-data-plane to get the file descriptor of a raw image file with Linux AIO enabled. This interface is really a layering violation that can be resolved once the block layer is able to run outside the global mutex - at that point virtio-blk-data-plane will switch from custom Linux AIO code to using the block layer. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi committedDec 17, 2012
Commits on Dec 16, 2012
-
exec: refactor cpu_restore_state
Refactor common code around calls to cpu_restore_state(). tb_find_pc() has now no external users, make it static. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
exec: move TB handling to translate-all.c
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
exec: extract TB watchpoint check
Will be moved by the next patch. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
Fix coding style in areas to be moved by later patches. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Commits on Dec 15, 2012
-
Merge branch 'ppc-for-upstream' of git://repo.or.cz/qemu/agraf
* 'ppc-for-upstream' of git://repo.or.cz/qemu/agraf: (40 commits) pseries: Increase default NVRAM size target-ppc: Don't use hwaddr to represent hardware state PPC: e500: pci: Export slot2irq calculation PPC: E500plat: Make a lot of PCI slots available PPC: E500: Move PCI slot information into params PPC: E500: Generate dt pci irq map dynamically PPC: E500: PCI: Make IRQ calculation more generic PPC: E500: PCI: Make first slot qdev settable openpic: Accelerate pending irq search openpic: fix minor coding style issues MSI-X: Fix endianness PPC: e500: Declare pci bridge as bridge PPC: e500: Add MSI support openpic: add Shared MSI support openpic: make brr1 model specific openpic: convert to qdev openpic: remove irq_out openpic: rename openpic_t to OpenPICState openpic: convert simple reg operations to builtin bitops openpic: remove unused type variable ...
-
target-xtensa: fix ITLB/DTLB page protection flags
With MMU option xtensa architecture has two TLBs: ITLB and DTLB. ITLB is only used for code access, DTLB is only for data. However TLB entries in both TLBs have attribute field controlling write and exec access. These bits need to be properly masked off depending on TLB type before being used as tlb_set_page prot argument. Otherwise the following happens: (1) ITLB entry for some PFN gets invalidated (2) DTLB entry for the same PFN gets updated, attributes allow code execution (3) code at the page with that PFN is executed (possible due to step 2), entry for the TB is written into the jump cache (4) QEMU TLB entry for the PFN gets replaced with an entry for some other PFN (5) code in the TB from step 3 is executed (possible due to jump cache) and it accesses data, for which there's no DTLB entry, causing DTLB miss exception (6) re-translation of the TB from step 5 is attempted, but there's no QEMU TLB entry nor xtensa ITLB entry for that PFN, which causes ITLB miss exception at the TB start address (7) ITLB miss exception is handled by the guest, but execution is resumed from the beginning of the faulting TB (the point where ITLB miss occured), not from the point where DTLB miss occured, which is wrong. With that fix the above scenario causes ITLB miss exception (that used to be step 7) at step 3, right at the beginning of the TB. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Cc: qemu-stable@nongnu.org Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Commits on Dec 14, 2012
-
console: clip update rectangle
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
pixman: fix vnc tight png/jpeg support
This patch adds an x argument to qemu_pixman_linebuf_fill so it can also be used to convert a partial scanline. Then fix tight + png/jpeg encoding by passing in the x+y offset, so the data is read from the correct screen location instead of the upper left corner. Cc: 1087974@bugs.launchpad.net Cc: qemu-stable@nongnu.org Reported-by: Tim Hardeneck <thardeck@suse.de> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
pixman: update internal copy to pixman-0.28.2
Some w64 fixes by Stefan Weil found their way into 0.28.2, so update the internal copy to that version to improve windows support. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
Revert "pixman: require 0.18.4 or newer"
This reverts commit 288fa40. The only reason old pixman versions didn't work was the missing PIXMAN_TYPE_BGRA, which is properly #ifdef'ed now. So we don't have to require a minimum pixman version. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
pixman: fix version check for PIXMAN_TYPE_BGRA
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
-
pseries: Increase default NVRAM size
If no image file for NVRAM is specified, the pseries machine currently creates a 16K non-persistent NVRAM by default. This basically works, but is not large enough for current firmware and guest kernels to create all the NVRAM partitions they would like to. Increasing the default size to 64K addresses this and stops the guest generating error messages. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
-
target-ppc: Don't use hwaddr to represent hardware state
The hwaddr type is somewhat vaguely defined as being able to contain bus addresses on the widest possible bus in the system. For that reason it's discouraged for representing specific pieces of persistent hardware state, which should instead use an explicit width type that matches the bits available in real hardware. In particular, because of the possibility that the size of hwaddr might change if different buses are added to the target in future, it's not suitable for use in vm state descriptions for savevm and migration. This patch purges such unwise uses of hwaddr from the ppc target code, which turns out to be just one. The ppcemb_tlb_t struct, used on a number of embedded ppc models to represent a TLB entry contains a hwaddr for the real address field. This patch changes it to be a fixed uint64_t which is suitable enough for all machine types which use this structure. Other uses of hwaddr in CPUPPCState turn out not to be problematic: htab_base and htab_mask are just used for the convenience of the TCG code; the underlying machine state is the SDR1 register, which is stored with a suitable type already. Likewise the mpic_cpu_base field is only used internally and does not represent fundamental hardware state which needs to be saved. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alexander Graf <agraf@suse.de>
-
PPC: e500: pci: Export slot2irq calculation
We need the calculation method to get from a PCI slot ID to its respective interrupt line twice. Once in the internal map function and once when assembling the device tree. So let's extract the calculation to a separate function that can be called by both users. Signed-off-by: Alexander Graf <agraf@suse.de>
-
PPC: E500plat: Make a lot of PCI slots available
The ppce500 machine doesn't have to stick to hardware limitations, as it's defined as being fully device tree based. Thus we can change the initial PCI slot ID to 0x1 which gives us a whopping 31 PCI devices we can support with this machine now! Signed-off-by: Alexander Graf <agraf@suse.de>
-
PPC: E500: Move PCI slot information into params
We have a params struct that allows us to expose differences between e500 machine models. Include PCI slot information there, so we can have different machines with different PCI slot topology. Signed-off-by: Alexander Graf <agraf@suse.de>
-
PPC: E500: Generate dt pci irq map dynamically
Today we're hardcoding the PCI interrupt map in the e500 machine file. Instead, let's write it dynamically so that different machine types can have different slot properties. Signed-off-by: Alexander Graf <agraf@suse.de>
-
PPC: E500: PCI: Make IRQ calculation more generic
The IRQ line calculation is more or less hardcoded today. Instead, let's write it as an algorithmic function that theoretically allows an arbitrary number of PCI slots. Signed-off-by: Alexander Graf <agraf@suse.de>
-
PPC: E500: PCI: Make first slot qdev settable
Today the first slot id in our e500 pci implementation is hardcoded to 0x11. Keep it there as default, but allow users to change the default to a different id. Signed-off-by: Alexander Graf <agraf@suse.de>
-
openpic: Accelerate pending irq search
When we're done with one interrupt, we need to search for the next pending interrupt in the queue. This search has grown quite big now that we have more than 256 possible irq lines. So let's memorize how many interrupts we have pending in our bitmaps, so that we can always bail out in the usual case - the one where we're all done. Signed-off-by: Alexander Graf <agraf@suse.de>
-
openpic: fix minor coding style issues
This patch removes all remaining occurences of spaces before function parameter indicating parenthesis. Signed-off-by: Alexander Graf <agraf@suse.de>
-
The MSI-X vector tables are usually stored in little endian in memory, so let's mark the accessors as such. This fixes MSI-X on e500 for me. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
-
PPC: e500: Declare pci bridge as bridge
The new PCI host bridge device needs to identify itself as PCI host bridge. Declare it as such. Signed-off-by: Alexander Graf <agraf@suse.de>