This repository has been archived by the owner on Apr 14, 2021. It is now read-only.

Simplified PCI

Sebastien Boeuf edited this page Feb 6, 2019 · 1 revision

Goal

The current amount of PCI emulation code in QEMU is about 20,000 lines by cloc's count:

 cloc pci pci-bridge pci-host
      79 text files.
      75 unique files.
       9 files ignored.

github.com/AlDanial/cloc v 1.74  T=0.38 s (182.1 files/s, 77158.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C                               38           2229           2113          11946
D                               26           4442              0           8346
C/C++ Header                     6             80              2            497
-------------------------------------------------------------------------------
SUM:                            70           6751           2115          20789
-------------------------------------------------------------------------------

This code emulates all elements of a PCI and PCIe topology, including Host Bridges, Root Ports, Switches, etc. However, as we are moving towards an ACPI-defined platform with ACPI hotplug, a lot of this emulation may not be needed. A rough analysis suggests that about 50% of this code can be eliminated easily.

One of the challenges with PCIe is that you need to define a pretty complex topology to support a large number of devices.

Background

Microsoft's GEN2 Hypervisor supports an x86 platform without PCI. This is possible because their PV drivers (equivalent to our virtio drivers) are instantiated by the VMBus. However, when they do need to expose a PCI device to the guest operating system, they do the following:

  • Use VMBus to create a new PCIe Domain
  • Instantiate a BUS under that PCIe Domain (Bus 0)
  • Directly add the device to this bus
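The three steps above can be pictured with a toy data model. This is only a sketch of the resulting topology, not of any Hyper-V or VMBus API; the names `PciDomain`, `PciBus` and `expose_device` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PciBus:
    number: int
    # Maps (device, function) to a free-form device description.
    devices: Dict[Tuple[int, int], str] = field(default_factory=dict)


@dataclass
class PciDomain:
    number: int
    buses: Dict[int, PciBus] = field(default_factory=dict)


def expose_device(domains: Dict[int, PciDomain], domain_number: int,
                  description: str) -> str:
    """Create a new domain holding only bus 0 and put the device at 00.0.

    Hypothetical helper: there is no host bridge, root port or switch in
    the model, which is the point of the flat topology described above.
    """
    if domain_number in domains:
        raise ValueError(f"domain {domain_number:#x} already exists")
    bus0 = PciBus(number=0)
    bus0.devices[(0, 0)] = description
    domains[domain_number] = PciDomain(number=domain_number, buses={0: bus0})
    # Address in the domain:bus:device.function form used by lspci.
    return f"{domain_number:04x}:00:00.0"


domains: Dict[int, PciDomain] = {}
address = expose_device(domains, 0x85BA, "NVMe controller")
```

Each exposed device gets its own domain, so the number of devices is not limited by the slots available on a single bus.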

So lspci on a GEN2 guest will show something along the lines of:

85ba:00:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01) (prog-if 02 [NVM Express])
	Subsystem: Intel Corporation DC P3700 SSD
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 0
	NUMA node: 0
	Region 0: Memory at fe0000000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI-X: Enable+ Count=32 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=0 offset=00003000
	...
	Capabilities: [2a0 v1] #19
	Kernel driver in use: nvme
	Kernel modules: nvme
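The address at the start of that listing has the form domain:bus:device.function, with every field in hexadecimal. The following minimal sketch splits such an address into its parts to make it clear that the device sits at slot 0 of bus 0 in its own domain. The `PciAddress` type and `parse_pci_address` function are hypothetical names, and the accepted domain width (1 to 8 hex digits) is an assumption.

```python
import re
from typing import NamedTuple


class PciAddress(NamedTuple):
    domain: int
    bus: int
    device: int
    function: int


# [domain:]bus:device.function, all fields hexadecimal.
# The domain is optional; when absent it is taken to be 0.
_ADDR_RE = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{1,8}):)?"
    r"(?P<bus>[0-9a-fA-F]{1,2}):"
    r"(?P<device>[0-9a-fA-F]{1,2})\."
    r"(?P<function>[0-7])$"
)


def parse_pci_address(text: str) -> PciAddress:
    """Parse an lspci-style address string into its numeric fields."""
    match = _ADDR_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {text!r}")
    device = int(match["device"], 16)
    if device > 0x1F:
        # A bus has at most 32 device slots.
        raise ValueError(f"device number out of range: {text!r}")
    return PciAddress(
        domain=int(match["domain"] or "0", 16),
        bus=int(match["bus"], 16),
        device=device,
        function=int(match["function"], 16),
    )


gen2_device = parse_pci_address("85ba:00:00.0")
```

Here `gen2_device.domain` is `0x85ba`, while bus, device and function are all 0.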

In our case, we need a PCI bus so that the virtio devices used to boot the virtual machine are presented to the guest operating system in a standard manner.

Note: You can support virtio via virtio-mmio, as demonstrated by crosvm, but that relies on the kernel command line to pass the device information to the guest. This model will not work well for a typical cloud workload that boots from a disk image. The relevant kernel parameter is documented as follows:

        virtio_mmio.device=
                        [VMMIO] Memory mapped virtio (platform) device.

                                <size>@<baseaddr>:<irq>[:<id>]
                        where:
                                <size>     := size (can use standard suffixes
                                                like K, M and G)
                                <baseaddr> := physical base address
                                <irq>      := interrupt number (as passed to
                                                request_irq())
                                <id>       := (optional) platform device id
                        example:
                                virtio_mmio.device=1K@0x100b0000:48:7
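To make the format concrete, here is a minimal sketch that parses and rebuilds the value of `virtio_mmio.device=`. It only approximates the kernel's own parsing (for example, it does not treat a leading 0 as octal), and the names `VirtioMmioDevice`, `parse_virtio_mmio_device` and `format_virtio_mmio_device` are hypothetical.

```python
from typing import NamedTuple, Optional


class VirtioMmioDevice(NamedTuple):
    size: int                     # size of the MMIO region in bytes
    base: int                     # physical base address
    irq: int                      # interrupt number
    dev_id: Optional[int] = None  # optional platform device id


_SUFFIXES = (("G", 1 << 30), ("M", 1 << 20), ("K", 1 << 10))


def _parse_size(text: str) -> int:
    """Parse a size with an optional K, M or G suffix."""
    for suffix, factor in _SUFFIXES:
        if text.upper().endswith(suffix):
            return int(text[:-1], 0) * factor
    return int(text, 0)


def _format_size(size: int) -> str:
    """Use the largest suffix that divides the size exactly."""
    for suffix, factor in _SUFFIXES:
        if size and size % factor == 0:
            return f"{size // factor}{suffix}"
    return str(size)


def parse_virtio_mmio_device(value: str) -> VirtioMmioDevice:
    """Parse '<size>@<baseaddr>:<irq>[:<id>]'."""
    size_text, sep, rest = value.partition("@")
    if not sep:
        raise ValueError(f"missing '@' in {value!r}")
    fields = rest.split(":")
    if len(fields) not in (2, 3):
        raise ValueError(f"expected <baseaddr>:<irq>[:<id>] in {value!r}")
    dev_id = int(fields[2], 0) if len(fields) == 3 else None
    return VirtioMmioDevice(
        size=_parse_size(size_text),
        base=int(fields[0], 0),
        irq=int(fields[1], 0),
        dev_id=dev_id,
    )


def format_virtio_mmio_device(dev: VirtioMmioDevice) -> str:
    """Build the parameter value for one device."""
    value = f"{_format_size(dev.size)}@{dev.base:#x}:{dev.irq}"
    if dev.dev_id is not None:
        value += f":{dev.dev_id}"
    return value


example = parse_virtio_mmio_device("1K@0x100b0000:48:7")
cmdline_fragment = "virtio_mmio.device=" + format_virtio_mmio_device(example)
```

Every device needs its own fragment of this kind on the command line, which is why the approach is awkward when the guest boots from a disk image and the VMM does not control the kernel arguments.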

Note: ARM/Linaro seem to have done some work on using ACPI to perform virtio-mmio discovery.