IBM Virtual I/O acceleration technologies for KVM
Copyright (C) IBM Corporation, 2013
The commits in this repository implement a set of virtual I/O acceleration technologies for KVM, both for PCI pass-through (based on KVM assigned-dev, not VFIO) and para-virtual I/O (based on vhost).
Please note that this work is under development and more effort is required to make it upstream-ready. If you would like to join the upstreaming effort, please don't hesitate to contact us.
Source-code contributors (in alphabetical order):
- Nadav Amit nadav.amit@gmail.com
- Muli Ben-Yehuda mulix@mulix.org
- Abel Gordon abelg@il.ibm.com
- Nadav Har'El nyh@math.technion.ac.il
- Alex Landau landau.alex@gmail.com
These commits (patches) were created and maintained using Patchouli, a patch creator: http://patchouli.sourceforge.net/
The code supports the following independent major features:
- Exitless interrupt delivery and completion for assigned devices
- Shared vhost-thread. A single vhost-thread can now handle I/O requests of multiple virtio devices/queues
- Vhost fine-grained I/O scheduling. Heuristics and configuration parameters used by the shared vhost-thread to control when and for how long each virtio queue is handled
- Virtio exitless guest-to-host notifications. A polling-based mechanism that removes PIO exit-based notifications
- Para-virtual posted interrupts. A mechanism to inject virtual interrupts without causing exits
More detailed information and performance evaluation of these technologies can be found in the following papers:
- ELI: Bare-metal Performance for I/O Virtualization -- ASPLOS 2012
- Efficient and Scalable Paravirtual I/O System -- USENIX ATC 2013
General usage notes / guidelines:
- PCI pass-through: physical interrupts of PCI pass-through devices should be pinned to the core(s) responsible for running the virtual machines that own the device.
- vhost: we recommend dedicating a core to each vhost worker thread (an "I/O core"). To maximize performance, we suggest using at least one I/O core per CPU socket (NUMA awareness) and pinning physical interrupts to the I/O cores (see the pinning sketch after this list).
- This code includes the vhost-blk back-end implemented by Asias He (https://github.com/asias/linux/tree/blk.vhost-blk).
- To achieve maximum performance we recommend dedicating a core per VCPU, disabling yield_on_hlt, using huge pages, using x2APIC (on both the host and the guest, to enable exitless EOI), and co-locating the VCPU threads and the VM memory on the same NUMA node (see the host/QEMU setup sketch after this list). If you enable exitless EOI, don't forget to disable para-virtual EOI (-cpu qemu64,+x2apic,-kvm_pv_eoi).
- To enable the vhost-blk back-end you need a modified QEMU version that supports this feature. Asias He has already published one at https://github.com/asias/qemu/tree/blk.vhost-blk
- Exitless interrupts and para-virtual posted interrupts are enabled/disabled via hypercalls. Please read the documentation in the commits for more details.
- The number of devices per vhost-thread, the polling mode and the fine-grained I/O scheduling are controlled via vhost kernel module parameters (see the module-parameter sketch after this list). See the documentation of the commits and parameters for more details.
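As a concrete illustration of the pinning guidelines above, here is a minimal shell sketch. The IRQ number, core number and vhost thread PID are placeholders; substitute the values for your own topology:

```sh
# Assumed example topology: IRQ 59 belongs to the pass-through NIC,
# core 3 is the dedicated I/O core, PID 1234 is the vhost worker thread.

# Route the device's physical interrupt to the I/O core
# (core 3 => hex CPU mask 0x8):
echo 8 > /proc/irq/59/smp_affinity

# Pin the vhost worker thread to the same core:
taskset -cp 3 1234
```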
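The host and QEMU setup described above might look roughly as follows on an Intel host. The memory sizes, disk image name and NUMA node are placeholders, and VCPU threads can afterwards be pinned per core with taskset as in the previous sketch:

```sh
# Reload kvm-intel with yield_on_hlt disabled:
modprobe -r kvm_intel
modprobe kvm_intel yield_on_hlt=0

# Reserve huge pages and mount hugetlbfs:
echo 2048 > /proc/sys/vm/nr_hugepages
mkdir -p /dev/hugepages
mount -t hugetlbfs hugetlbfs /dev/hugepages

# Start the guest on NUMA node 0, backed by huge pages, with x2APIC
# enabled and para-virtual EOI disabled (as recommended above):
numactl --cpunodebind=0 --membind=0 \
    qemu-system-x86_64 -enable-kvm -m 4096 -smp 2 \
    -mem-path /dev/hugepages \
    -cpu qemu64,+x2apic,-kvm_pv_eoi \
    -drive file=guest.img,if=virtio
```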
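The vhost module parameters are set when loading the module (or via /sys/module at runtime). The parameter names in this sketch are purely illustrative placeholders, not the real ones; please look up the actual names in the commit and parameter documentation:

```sh
# Illustrative only: devs_per_worker and poll_mode are hypothetical
# placeholder names -- see the commit documentation for the real ones.
modprobe vhost_net devs_per_worker=4 poll_mode=1

# Module parameters can also be inspected at runtime:
cat /sys/module/vhost_net/parameters/poll_mode   # hypothetical name
```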