Skip to content

Conversation

@scholzp
Copy link

@scholzp scholzp commented Nov 3, 2025

Configurable BDF Design

Background

Configurable PCI BDF (Bus, Device, Function) allows for choosing the address that a guest can see a PCI device. Qemu, for example, implements the addr option for the device argument. The addr argument allows defining the device and function part of a device's BDF in the form addr=DD.F, where D denotes a device ID in the range of [0..32] and F a function ID in a range of [0..7]. Qemu denotes the bus number with the option busnr, where a valid expression is busnr=pci.2 to denote the second PCI bus, for example. [0]

With a series of pull requests, we try to mimic the command-line interface of Qemu and introduce new command line parameters to CHV, where applicable to similarly achieve configurable BDFs.

An example of a BDF for Bus=1, Device=3, and Function=5 is 1:03.5. [1]

Current status

CHV allows defining multiple PCI domains via the pci_segments option. For all PCI segments, CHV creates a dedicated host bridge. Via the option argument pci_segment, one can define to which domain a device is added.

Other than that, CHV currently has no support to influence what exact BDF is assigned to a device, other than the order in which devices are specified in the VM configuration. This is done by calling the next_device_bdf function of the
struct PciSegment for each PCI device to create. next_device_bdf returns the a BDF with the following free device ID and a fixed function ID of 0. PciSegment queries the underlying PciBus to obtain the following device ID, which we also must modify.

Currently, there is no support for multi-function devices in CHV.

Proposed changes

We split the configurable BDF feature into separate PRs with the goals defined in The following subsections.

The BAR address allocation is able to handle the changes listed below. So I don't expect the need for changes there.

1. Definition of Device IDs

To define a device ID, we first must implement some notion for specifying them via the API or command line. For this, we introduce the addr option argument, similar to Qemu. The expected format is addr=DD with DD being in the range
of [0..32]. This argument is stored in a device's configuration and must be propagated to next_device_bdf. This function will be replaced by specific_device_bdf to allow for specifying a concrete device identifier. The steps for implementing configurable device IDs are as follows:

  • Introduce function allocate_device_id to struct PciBus to reserve a device ID, depending on the argument given. If no device ID is specified, use the current mechanism to obtain an ID. This replaces next_device_id.
  • Introduce function allocate_device_bdf to struct PciSegment to allocate a BDF depending on function parameters. If no device parameter is specified, fall back to the algorithm currently used by next_device_bdf. Replaces next_device_bdf.
  • Introduce addr parameter to all forms of VirtIo device configurations
    and adjust parsers
  • Add unit tests covering allocate_device_id in PciBus
  • Add unit tests covering allocate_device_bdf in PciSegment
  • Add unit tests covering the new argument and its parsing

2. Definition of Function IDs

Builds on the changes in 1. For this, we extend the addr option argument. Similar to Qemu, the addr option argument is expected in the form of DD.F, where F denotes the function number in a range of [0..7]. We need to modify configurations and their parsers further. Moreover, next_device_bdf in PciSegment must be adjusted so that it does not fix the function part to 0. The steps for implementing configurable function IDs are as follows:

  • Modify allocate_device_bdf in struct PciSegment to use the given function ID instead of fixing the function ID to 0.
  • Enhance addr parameter in all forms of VirtIo device configurations and adjust parsers to accept the new DD.F syntaxs`
  • Add unit tests covering allocate_device_bdf in PciSegment
  • Add unit tests covering the new argument and its parsing

3. Definition of Bus IDs

Builds on the changes in 2. For this, similar to Qemu, we introduce the busnr option argument. The busnr option argument is expected in the form of BUS.ID, where BUS denotes the bus type, e.g., pci, and ID denotes the ID of the bus of the respective type. We need to modify configurations and their further parsers to accept the new option argument. Moreover, next_device_bdf in PciSegment must be adjusted to not fix the bus number to 0. The steps for implementing configurable bus IDs is as follows:

  • Modify PciSegment to enable it to store a reference to more than one PCI bus
  • Add a new function to PciSegment that allows adding new instances of PciBus to it
  • Modify specific_device_bdf in struct PciSegment to use the given bus ID instead of fixing the function ID to 0.
  • Implement busnr parameter in all forms of VirtIo device configurations and adjust parsers to accept it
  • Add unit tests to features added to PciSegment
  • Add unit tests to cover the new option and its parsing

Coming next

From here on we can look into implementing multi-function devices. With the definition of function IDs, the majority of work should be done for this.

References

[0] https://qemu-project.gitlab.io/qemu/system/device-emulation.html#device-buses

[1] https://wiki.xenproject.org/wiki/Bus:Device.Function_(BDF)_Notation

Copy link
Member

@phip1611 phip1611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on a first glance. This however must be done at least also for virtio-net devices. So far I only see this is done for virtio-blk

@scholzp scholzp force-pushed the bdf_device_id branch 4 times, most recently from 7a45389 to 232e8c7 Compare November 7, 2025 09:06
Allocating a device ID is crucial for assigning a specific ID to a
device. We need this to implement configurable PCI BDF.

Signed-off-by: Pascal Scholz <pascal.scholz@cyberus-technology.de>
On-behalf-of: SAP pascal.scholz@sap.com
@scholzp scholzp force-pushed the bdf_device_id branch 2 times, most recently from 2ddd68b to 93ee822 Compare November 7, 2025 10:56
We replace `next_device_bdf` with `allocate_device_bdf`. The new
function optionally accepts a device ID to use when allocating a BDF.
If not provided, we fall back to allocating the next available BDF.
This allows for creating specific BDFs on the respective segment.

Signed-off-by: Pascal Scholz <pascal.scholz@cyberus-technology.de>
On-behalf-of: SAP pascal.scholz@sap.com
Next to tests for `allocate_device_bdf`, we introduce a new constructor
`new_without_address_manager`, only available in the test build. As
there is no way to instantiate an `AddressManager` in the tests, we use
this constructor to work around this.

Signed-off-by: Pascal Scholz <pascal.scholz@cyberus-technology.de>
On-behalf-of: SAP pascal.scholz@sap.com
Updates all config structs in order to make the new config option
available to all PCI device. Additionally update the parser so the new
option becomes available on the CLI.

Signed-off-by: Pascal Scholz <pascal.scholz@cyberus-technology.de>
On-behalf-of: SAP pascal.scholz@sap.com
Signed-off-by: Pascal Scholz <pascal.scholz@cyberus-technology.de>
On-behalf-of: SAP pascal.scholz@sap.com
Currently we do not support multi function PCI devices, so we only
support assigning function IDs equal to 0.
Copy link
Member

@phip1611 phip1611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the workshop! For the purpose of our workshop (if it works), please leave it as is and focus on libvirt and libvirt-tests.

Before we merge this, we need to address a few minor things.

pci/src/bus.rs Outdated
/// Allocates a PCI device ID on the bus.
///
/// - `id`: ID to allocate on the bus. If [`None`], the next free
/// device ID on the bus is allocated, else the ID given is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not pass cargo fmt (or cargo clippy?). device ID should start on the same level as the ` from the line above


#[cfg(test)]
mod tests {
// Note this useful idiom: importing names from outer (for mod tests) scope.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment from ChatGPT? :D

// Note this useful idiom: importing names from outer (for mod tests) scope.
use super::*;

mod pci_bus_tests {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are already in pci::bus::tests, no need for another nesting level


#[test]
// Test that requesting specific ID work
fn allocate_device_id_request_id() -> std::result::Result<(), Box<dyn std::error::Error>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please return result::Result<(), Box<dyn error::Error>> or even Result<(), Box<dyn Error>>

/// - `id`: ID to allocate on the bus. If [`None`], the next free
/// device ID on the bus is allocated, else the ID given is
/// allocated
/// device ID on the bus is allocated, else the ID given is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this changes should be merged into the commit where you introduced these lines, please


#[cfg(test)]
mod tests {
// Note this useful idiom: importing names from outer (for mod tests) scope.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again this comment; please remove

}
}

fn parse_addr(s: &str) -> std::result::Result<(u8, u8), OptionParserError> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please return result::Result or just Result<(u8, u8)>. Further, please add inline comments, like this: Result<(u8 /* device */, u8 /* function */)>

id=<device_id>,pci_segment=<segment_id>,rate_limit_group=<group_id>,\
queue_affinity=<list_of_queue_indices_with_their_associated_cpuset>,\
serial=<serial_number>";
serial=<serial_number>, addr=<DD.F>";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • there should be no additional space (as for the other params)
  • I think DD.F doesn't really help that much. What about addr=<DD.F, e.g. 13.0>? But I am also not sure. Maybe your solution is the best. hm.

bw_size=<bytes>,bw_one_time_burst=<bytes>,bw_refill_time=<ms>,\
ops_size=<io_ops>,ops_one_time_burst=<io_ops>,ops_refill_time=<ms>,pci_segment=<segment_id>\
offload_tso=on|off,offload_ufo=on|off,offload_csum=on|off\"";
offload_tso=on|off,offload_ufo=on|off,offload_csum=on|off, addr=DD.F\"";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary space!

.parse::<u8>()
.map_err(|_| OptionParserError::InvalidSyntax(s.to_owned()))?;
Ok((a, b))
let f = if b != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hard to read.

Please add a check on a dedicated line and return Err in the corresponding if cond {}. After that, please return Ok(b)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants