pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt
Add a small intro + minimal documentation for how to implement SR/IOV support for an emulated device.

Signed-off-by: Knut Omang <knuto@ifi.uio.no>
Message-Id: <20220217174504.1051716-3-lukasz.maniak@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
PCI SR/IOV EMULATION SUPPORT | ||
============================ | ||
|
||
Description | ||
=========== | ||
SR/IOV (Single Root I/O Virtualization) is an optional extended capability | ||
of a PCI Express device. It allows a single physical function (PF) to appear as multiple | ||
virtual functions (VFs) for the main purpose of eliminating software | ||
overhead in I/O from virtual machines. | ||
|
||
QEMU now implements the basic common functionality needed for an emulated | ||
device to support SR/IOV. No fully implemented device exists in QEMU yet, but a | ||
proof-of-concept hack of the Intel igb can be found here: | ||
|
||
    git://github.com/knuto/qemu.git sriov_patches_v5 | ||
|
||
Implementation | ||
============== | ||
Implementing emulation of an SR/IOV capable device typically consists of | ||
implementing support for two device classes: the "normal" physical device | ||
(PF) and the virtual device (VF). From QEMU's perspective, the VFs are just | ||
like other devices, except that some of their properties are derived from | ||
the PF. | ||
|
||
A virtual function differs from a physical function in that the BAR | ||
space for all VFs is defined by the BAR registers in the PF's SR/IOV | ||
capability. All VFs have the same BARs and BAR sizes. | ||
|
||
The address of an access to one of these virtual BARs is then computed as | ||
|
||
    <VF BAR start> + <VF number> * <BAR sz> + <offset> | ||
|
||
From our emulation perspective this means that there is a separate call for | ||
setting up a BAR for a VF. | ||
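The address computation above can be sketched in C as follows. This is only an illustration, not code from QEMU: the function name vf_bar_access_addr() and its parameters are hypothetical, and the VF number is assumed to be zero-based, so that VF 0 starts exactly at <VF BAR start>.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU API): guest physical address of an
 * access at 'offset' into the BAR of VF number 'vf_number'.
 * Assumption: vf_number is zero-based.
 */
static inline uint64_t vf_bar_access_addr(uint64_t vf_bar_start,
                                          unsigned vf_number,
                                          uint64_t bar_size,
                                          uint64_t offset)
{
    /* An offset at or beyond the BAR size would land in the next VF's BAR */
    assert(offset < bar_size);

    /* <VF BAR start> + <VF number> * <BAR sz> + <offset> */
    return vf_bar_start + (uint64_t)vf_number * bar_size + offset;
}
```

For example, with a VF BAR start of 0xfe000000 and a BAR size of 0x4000, an access at offset 0x10 in VF 2 would resolve to 0xfe008010.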
|
||
1) To enable SR/IOV support in the PF, it must be a PCI Express device, so | ||
you need to add a PCI Express capability to the normal PCI | ||
capability list. You might also want to add an ARI (Alternative | ||
Routing-ID Interpretation) capability to indicate that your device | ||
supports functions beyond its "own" function space (0-7). | ||
This is necessary to support more than 7 functions, or | ||
if functions extend beyond offset 7 because they are placed at an | ||
offset > 1 or have stride > 1. | ||
|
||
    ... | ||
    #include "hw/pci/pcie.h" | ||
    #include "hw/pci/pcie_sriov.h" | ||
|
||
    pci_your_pf_dev_realize( ... ) | ||
    { | ||
        ... | ||
        int ret = pcie_endpoint_cap_init(d, 0x70); | ||
        ... | ||
        pcie_ari_init(d, 0x100, 1); | ||
        ... | ||
|
||
        /* Add and initialize the SR/IOV capability */ | ||
        pcie_sriov_pf_init(d, 0x200, "your_virtual_dev", | ||
                           vf_devid, initial_vfs, total_vfs, | ||
                           fun_offset, stride); | ||
|
||
        /* Set up individual VF BARs (parameters as for normal BARs) */ | ||
        pcie_sriov_pf_init_vf_bar( ... ); | ||
        ... | ||
    } | ||
|
||
For cleanup, you simply call: | ||
|
||
    pcie_sriov_pf_exit(device); | ||
|
||
which will delete all the virtual functions and associated resources. | ||
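Whether the ARI capability mentioned in step 1 is needed can be estimated from the values you pass as fun_offset, stride and total_vfs. The following is a rough sketch, not part of the QEMU API: both function names are hypothetical, the VF index is assumed to be zero-based, and the PF is assumed to sit at a conventional function number (0-7) on device 0.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not a QEMU API): function number of the VF with
 * zero-based index 'vf_index', for a PF at function number 'pf_fn'.
 */
static inline unsigned vf_function_number(unsigned pf_fn, unsigned fun_offset,
                                          unsigned stride, unsigned vf_index)
{
    return pf_fn + fun_offset + vf_index * stride;
}

/*
 * Returns true if the highest numbered VF falls outside the conventional
 * function space (0-7), in which case ARI is needed to address it.
 */
static inline bool sriov_needs_ari(unsigned pf_fn, unsigned fun_offset,
                                   unsigned stride, unsigned total_vfs)
{
    assert(pf_fn <= 7); /* assumption: PF uses a conventional function number */

    if (total_vfs == 0) {
        return false;
    }
    return vf_function_number(pf_fn, fun_offset, stride, total_vfs - 1) > 7;
}
```

With a PF at function 0, fun_offset 1 and stride 1, seven VFs still fit in functions 1-7, while an eighth VF, or a stride of 2 with the same seven VFs, would go beyond function 7.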
|
||
2) Similarly, in the implementation of the virtual function, you need to | ||
make it a PCI Express device and add a similar set of capabilities, | ||
except for the SR/IOV capability. Then you need to set up the VF BARs as | ||
subregions of the PF's SR/IOV VF BARs by calling | ||
pcie_sriov_vf_register_bar() instead of the normal pci_register_bar() call: | ||
|
||
pci_your_vf_dev_realize( ... ) | ||
{ | ||
    ... | ||
    int ret = pcie_endpoint_cap_init(d, 0x60); | ||
    ... | ||
    pcie_ari_init(d, 0x100, 1); | ||
    ... | ||
    memory_region_init(mr, ... ); | ||
    pcie_sriov_vf_register_bar(d, bar_nr, mr); | ||
    ... | ||
} | ||
|
||
Testing on Linux guest | ||
====================== | ||
Testing is easiest if your device driver supports sysfs-based SR/IOV | ||
enabling. Support for this was added in kernel v3.8, so not all drivers | ||
support it yet. | ||
|
||
To enable 4 VFs for a device at 01:00.0: | ||
|
||
    modprobe yourdriver | ||
    echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs | ||
|
||
You should now see 4 VFs with lspci. | ||
The standard requires you to turn SR/IOV off before you can enable | ||
another VF count, and the emulation enforces this. To turn it off again: | ||
|
||
    echo 0 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs | ||
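If you drive such tests from a C program instead of a shell, the same sequence can be sketched as below. This is only an illustration under stated assumptions: write_int() and set_sriov_numvfs() are hypothetical helper names, and the path you pass in is the sriov_numvfs attribute of your own device. The helper always writes 0 first, so changing from one VF count to another follows the rule described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write a decimal integer plus newline to 'path'. */
static inline int write_int(const char *path, int value)
{
    FILE *f = fopen(path, "w");
    if (!f) {
        return -1;
    }
    int ok = fprintf(f, "%d\n", value) > 0;
    /* The buffered write is flushed here, so check fclose() as well */
    if (fclose(f) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}

/*
 * Hypothetical helper: disable SR/IOV, then enable 'num_vfs' VFs.
 * 'numvfs_path' is the sriov_numvfs sysfs attribute of the PF.
 * Returns 0 on success, -1 on failure.
 */
static inline int set_sriov_numvfs(const char *numvfs_path, int num_vfs)
{
    assert(num_vfs >= 0);

    /* Always turn SR/IOV off before requesting a new VF count */
    if (write_int(numvfs_path, 0) != 0) {
        return -1;
    }
    if (num_vfs == 0) {
        return 0;
    }
    return write_int(numvfs_path, num_vfs);
}
```

A call such as set_sriov_numvfs("/sys/bus/pci/devices/0000:01:00.0/sriov_numvfs", 4) would then correspond to the two echo commands shown above, run in the order "0" then "4".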
|
||
Older drivers typically provide a max_vfs module parameter | ||
to enable SR/IOV at load time: | ||
|
||
    modprobe yourdriver max_vfs=4 | ||
|
||
To disable the VFs again, simply unload the driver: | ||
|
||
    rmmod yourdriver