
Testing VFIO with GPU

liujing2 edited this page Oct 17, 2018 · 1 revision

Testing VFIO passthrough with NEMU

Because the current CI framework runs in a VM, it does not test VFIO passthrough. This document shows how to test it manually, using a GPU as an example.

Determine Segment:Bus:Device.Function, Vendor and Device ID

$ lspci -nn -D | grep VGA
0000:17:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK107 [GeForce GTX 650] [10de:0fc6] (rev a1)

0000:17:00.0 is the Segment:Bus:Device.Function

10de:0fc6 is the Vendor and Device ID
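
The two values can also be pulled out of the lspci line with a small script. The following is a minimal sketch, not part of NEMU; the function name parse_lspci_line is hypothetical, and it assumes the line has the `lspci -nn -D` layout shown above.

```shell
# Hypothetical helper: split one line of `lspci -nn -D` output into
# its Segment:Bus:Device.Function and its vendor:device ID.
parse_lspci_line() {
    line=$1
    # The address is the first space-delimited field.
    bdf=${line%% *}
    # The vendor:device pair is the last [xxxx:xxxx] group on the line;
    # the class code such as [0300] has no colon and is skipped.
    ids=$(printf '%s\n' "$line" |
        sed -n 's/.*\[\([0-9a-fA-F]\{4\}:[0-9a-fA-F]\{4\}\)\].*/\1/p')
    printf '%s %s\n' "$bdf" "$ids"
}

# Example (on a host with the GPU present):
#   parse_lspci_line "$(lspci -nn -D | grep VGA | head -n 1)"
```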

Find the IOMMU group for the device

$ BDF="0000:17:00.0"
$ readlink -e /sys/bus/pci/devices/$BDF/iommu_group
/sys/kernel/iommu_groups/24

Find all other devices that are present in the same IOMMU group

$ find /sys/kernel/iommu_groups/ -type l | grep "0000:17:00"
/sys/kernel/iommu_groups/24/devices/0000:17:00.0
/sys/kernel/iommu_groups/24/devices/0000:17:00.1

Note: In this case the graphics card is a multifunction device; 0000:17:00.0 and 0000:17:00.1 are the graphics and audio functions of the card.
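
The group members can also be listed by following the device's iommu_group link in sysfs instead of searching every group. This is a sketch under assumptions: the function name iommu_group_members is hypothetical, and the optional second argument (the sysfs mount point, default /sys) exists only so the function can be tried against a copy of the tree.

```shell
# Hypothetical helper: print every PCI address that shares an IOMMU
# group with the given device.
#   $1  Segment:Bus:Device.Function, e.g. 0000:17:00.0
#   $2  sysfs mount point (optional, defaults to /sys)
iommu_group_members() {
    bdf=$1
    sysfs=${2:-/sys}
    group_dir="$sysfs/bus/pci/devices/$bdf/iommu_group/devices"
    if [ ! -d "$group_dir" ]; then
        echo "no IOMMU group found for $bdf" >&2
        return 1
    fi
    # Each entry in the devices directory is named after a PCI address.
    for dev in "$group_dir"/*; do
        basename "$dev"
    done
}

# Example: iommu_group_members 0000:17:00.0
```

Every address printed here needs to be bound to vfio-pci (or left without a driver) before the group can be used for passthrough.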

Determine the Device and Vendor IDs for the remaining devices in the IOMMU group

$ lspci -nn -D | grep "0000:17:00.1"
0000:17:00.1 Audio device [0403]: NVIDIA Corporation GK107 HDMI Audio Controller [10de:0e1b] (rev a1)

Set up the host not to use the GPU

To avoid potential graphics instability on the platform, it is advisable to bind to vfio-pci at boot time rather than after the host system is up.

Note: This is not typically needed for other devices. Other PCI devices can typically be unbound post boot.

Grub

  1. Edit /etc/default/grub to append the following to GRUB_CMDLINE_LINUX_DEFAULT

intel_iommu=on rd.driver.pre=vfio-pci video=vesafb:off,efifb:off

  • intel_iommu=on enables VT-d.
  • rd.driver.pre=vfio-pci ensures that the vfio-pci driver is loaded early.
  • video=vesafb:off,efifb:off disables the EFI/VESA framebuffers.

  2. Update the grub configuration
sudo update-grub
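
After the next reboot, it may be worth confirming that the parameters actually reached the kernel. The sketch below checks a kernel command line for a given parameter; the function name cmdline_has is hypothetical, and the optional second argument lets a command line be supplied as a string instead of being read from /proc/cmdline.

```shell
# Hypothetical helper: succeed if a parameter appears as a whole word
# on the kernel command line.
#   $1  parameter to look for, e.g. intel_iommu=on
#   $2  command line to search (optional, defaults to /proc/cmdline)
cmdline_has() {
    param=$1
    cmdline=${2:-$(cat /proc/cmdline)}
    # Pad with spaces so only complete, space-delimited words match.
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   cmdline_has intel_iommu=on && echo "VT-d requested on the command line"
```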

/etc/modules

Append the following to /etc/modules to ensure that the vfio drivers are loaded at boot

vfio-pci
vfio_iommu_type1

/etc/modprobe.d/vfio.conf

Set up the device(s) in the IOMMU group to bind to vfio-pci rather than to their default drivers. Append the following to /etc/modprobe.d/vfio.conf to bind the two functions to vfio-pci.

options vfio-pci ids=10de:0fc6,10de:0e1b
options vfio-pci disable_vga=1
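
When a group holds more than a couple of devices, the ids= line can be assembled from the vendor:device pairs gathered earlier. This is only an illustrative sketch; the function name vfio_ids_option is hypothetical, and the output still has to be appended to /etc/modprobe.d/vfio.conf by hand.

```shell
# Hypothetical helper: join vendor:device pairs into the
# `options vfio-pci ids=...` line used in /etc/modprobe.d/vfio.conf.
vfio_ids_option() {
    ids=""
    for id in "$@"; do
        # Add a comma separator before every ID except the first.
        ids="${ids:+$ids,}$id"
    done
    printf 'options vfio-pci ids=%s\n' "$ids"
}

# Example: vfio_ids_option 10de:0fc6 10de:0e1b
```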

Blacklist the graphics drivers

Graphics device binding, particularly with the nouveau driver, happens early enough that the graphics and audio drivers must be explicitly blacklisted for vfio-pci to bind successfully.

Create /etc/modprobe.d/blacklist-nouveau.conf

blacklist nouveau
options nouveau modeset=0
blacklist snd_hda_intel

Apply the changes

For these changes to take effect, the initramfs needs to be updated and a reboot is required:

sudo update-initramfs -u
sudo reboot

Launching NEMU with VFIO

Check vfio-pci binding

  1. After the host has rebooted, check that the vfio-pci driver is bound to both functions of the device
$ lspci -vvv -s 17:00.0
17:00.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GTX 650] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: eVga.com. Corp. GK107 [GeForce GTX 650]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 55
        NUMA node: 0
        Region 0: Memory at b4000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at a0000000 (64-bit, prefetchable) [size=256M]
        Region 3: Memory at b0000000 (64-bit, prefetchable) [size=32M]
        Region 5: I/O ports at 7000 [size=128]
        Expansion ROM at b5000000 [disabled] [size=512K]
        Capabilities: <access denied>
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau

$ lspci -vvv -s 17:00.1
17:00.1 Audio device: NVIDIA Corporation GK107 HDMI Audio Controller (rev a1)
        Subsystem: eVga.com. Corp. GK107 HDMI Audio Controller
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin B routed to IRQ 10
        NUMA node: 0
        Region 0: Memory at b5080000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
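
The same check can be scripted by reading the driver link in sysfs rather than scanning the lspci output. The following is a sketch under assumptions: the function name bound_driver is hypothetical, and the optional second argument (the sysfs mount point, default /sys) exists only so the function can be tried against a copy of the tree.

```shell
# Hypothetical helper: print the name of the kernel driver bound to a
# PCI device, or "none" if no driver is bound.
#   $1  Segment:Bus:Device.Function, e.g. 0000:17:00.0
#   $2  sysfs mount point (optional, defaults to /sys)
bound_driver() {
    bdf=$1
    sysfs=${2:-/sys}
    link="$sysfs/bus/pci/devices/$bdf/driver"
    if [ ! -L "$link" ]; then
        echo "none"
        return 0
    fi
    # The link target ends in the driver name, e.g. .../drivers/vfio-pci
    basename "$(readlink "$link")"
}

# Example:
#   for f in 0000:17:00.0 0000:17:00.1; do
#       echo "$f: $(bound_driver "$f")"
#   done
```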

Launch NEMU with the device passed in

$ qemu-system-x86_64 \
     -bios ovmf.fd.virt \
     -nographic \
     -nodefaults \
     -L . \
     -net none \
     -machine virt,accel=kvm,kernel_irqchip \
     -smp 4 \
     -m 512M,slots=3,maxmem=4G \
     -device virtio-serial-pci,id=virtio-serial0 -device virtconsole,chardev=charconsole0,id=console0 -chardev stdio,id=charconsole0 \
     -device virtio-blk-pci,drive=image,bus=pcie.0 -drive if=none,id=image,file=clear.img,format=raw \
     -monitor telnet:127.0.0.1:55555,server,nowait \
     -netdev user,id=mynet0,hostfwd=tcp::10022-:22 -device virtio-net-pci,netdev=mynet0 \
     -device sysbus-debugcon,iobase=0x402,chardev=debugcon -chardev file,path=/tmp/debug-log,id=debugcon \
     -device vfio-pci,host=17:00.0 

Note: -device vfio-pci,host=17:00.0 passes in only the graphics device, not the other devices or functions in its IOMMU group.

Verify that the device is visible and available in the VM

$ lspci -v
...
00:04.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GTX 650] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: eVga.com. Corp. Device 2652
        Physical Slot: 4
        Flags: bus master, fast devsel, latency 0, IRQ 7
        Memory at 90000000 (32-bit, non-prefetchable) [size=16M]
        [virtual] Memory at 800000000 (64-bit, prefetchable) [size=256M]
        Memory at 810000000 (64-bit, prefetchable) [size=32M]
        I/O ports at c000 [size=128]
        Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nouveau
$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  1:          0          0          0          0   IO-APIC  17-fasteoi   ACPI:Ged
  2:          0          0          0          0   IO-APIC  19-fasteoi   ACPI:Ged
  3:          0          0          0          0   IO-APIC  18-fasteoi   ACPI:Ged
  4:          0          0          0          0   IO-APIC  16-fasteoi   ACPI:Ged
  5:          0          0          0          0   PCI-MSI 16384-edge      virtio0-config
  6:          0         60          0          0   PCI-MSI 16385-edge      virtio0-virtqueues
  7:          0          0        110          0   PCI-MSI 65536-edge      nvkm
$ ls /dev/dri/*

/dev/dri/card0  /dev/dri/renderD128

/dev/dri/by-path:
pci-0000:00:04.0-card  pci-0000:00:04.0-render