VM deployment fails with an error "XML error: Invalid PCI address 0000:01:01.0. slot must be <= 0" if MACHINE=q35 #6492

Closed
3 tasks done
OpenNebulaSupport opened this issue Feb 2, 2024 · 1 comment

OpenNebulaSupport commented Feb 2, 2024

Description
https://forum.opennebula.io/t/cannot-deploy-virtual-machine-using-q35-chipset-with-pci-passthrough/8832

VM deployment fails with an error "XML error: Invalid PCI address 0000:01:01.0. slot must be <= 0" if MACHINE=q35.

vm.log:

Fri Feb  2 17:22:23 2024 [Z0][VM][I]: New state is ACTIVE
Fri Feb  2 17:22:23 2024 [Z0][VM][I]: New LCM state is PROLOG
Fri Feb  2 17:22:24 2024 [Z0][VM][I]: New LCM state is BOOT
Fri Feb  2 17:22:24 2024 [Z0][VMM][I]: Generating deployment file: /var/lib/one/vms/296699/deployment.0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_context.
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: ExitCode: 0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: ExitCode: 0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/mkdir -p.
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: ExitCode: 0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/cat - >/var/lib/one//datastores/0/296699/vm.xml.
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: ExitCode: 0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Successfully execute virtualization driver operation: /bin/cat - >/var/lib/one//datastores/0/296699/ds.xml.
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Command execution fail (exit code: 255): cat << 'EOT' | /var/tmp/one/vmm/kvm/deploy '/var/lib/one//datastores/0/296699/deployment.0' 'localhost' 296699 localhost
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: error: Failed to create domain from /var/lib/one//datastores/0/296699/deployment.0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: error: XML error: Invalid PCI address 0000:01:01.0. slot must be <= 0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Could not create domain from /var/lib/one//datastores/0/296699/deployment.0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: ExitCode: 255
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: ExitCode: 0
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Successfully execute network driver operation: clean.
Fri Feb  2 17:22:25 2024 [Z0][VMM][I]: Failed to execute virtualization driver operation: deploy.
Fri Feb  2 17:22:25 2024 [Z0][VMM][E]: DEPLOY: error: Failed to create domain from /var/lib/one//datastores/0/296699/deployment.0 error: XML error: Invalid PCI address 0000:01:01.0. slot must be <= 0 Could not create domain from /var/lib/one//datastores/0/296699/deployment.0 ExitCode: 255
Fri Feb  2 17:22:25 2024 [Z0][VM][I]: New LCM state is BOOT_FAILURE

Corresponding part of onevm show -j output

onevm show -j [vmid]|jq .VM.TEMPLATE.PCI
{
  "ADDRESS": "0000:41:00:0",
  "BUS": "41",
  "CLASS": "0302",
  "DEVICE": "2236",
  "DOMAIN": "0000",
  "FUNCTION": "0",
  "NUMA_NODE": "-",
  "PCI_ID": "0",
  "SHORT_ADDRESS": "41:00.0",
  "SLOT": "00",
  "VENDOR": "10de",
  "VM_ADDRESS": "01:01.0",
  "VM_BUS": "0x01",
  "VM_DOMAIN": "0x0000",
  "VM_FUNCTION": "0",
  "VM_SLOT": "0x01"
}
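
A quick way to spot the offending entry is to scan the PCI attributes for a non-zero guest slot. The sketch below is illustrative only: the helper name `q35_slot_problems` is hypothetical, and it assumes the `jq` output is either a single object (one PCI device, as above) or a list of such objects.

```python
import json

def q35_slot_problems(pci):
    """Return (VM_ADDRESS, slot) pairs whose guest slot is not 0.

    Assumption: on a q35 machine each passthrough device sits behind its
    own pcie-root-port, which only accepts slot 0.
    """
    entries = [pci] if isinstance(pci, dict) else list(pci)
    problems = []
    for entry in entries:
        # VM_SLOT appears as "0x01" or "0000" in this issue; both parse as hex.
        slot = int(entry["VM_SLOT"], 16)
        if slot != 0:
            problems.append((entry["VM_ADDRESS"], slot))
    return problems

# Subset of the attributes shown above.
sample = json.loads('{"VM_ADDRESS": "01:01.0", "VM_BUS": "0x01", "VM_SLOT": "0x01"}')
print(q35_slot_problems(sample))
```

For the entry above this reports the `01:01.0` address with slot 1, which is the address libvirt rejects.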

Corresponding part of deployment.0

<devices>
...
                <hostdev mode='subsystem' type='pci' managed='yes'>
                        <source>
                                <address  domain='0x0000' bus='0x41' slot='0x00' function='0x0'/>
                        </source>
                                <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0'/>
                </hostdev>
        </devices>
        <devices>
                <controller index='0' type='pci' model='pcie-root'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-root-port'/>
                <controller type='pci' model='pcie-to-pci-bridge'/>
        </devices>

The following manual change allows the VM to be deployed from the deployment.0 file:

diff /var/lib/one/datastores/0/<vmid>/deployment.0 /tmp/<vmid>.deployment.0.orig
43c43
<                               <address type='pci' domain='0x0000' bus='0x01' slot='0' function='0'/>
---
>                               <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0'/>

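The same manual edit can be sketched as a small script. This is only a workaround illustration, not the fix that was later committed: the function name `zero_hostdev_guest_slots` is hypothetical, and it assumes a well-formed `<devices>` fragment in which the guest-side address is the direct `<address>` child of `<hostdev>`, as in the excerpt above.

```python
import xml.etree.ElementTree as ET

def zero_hostdev_guest_slots(devices_xml):
    """Set slot='0' on guest-side PCI addresses of hostdev devices.

    Returns the rewritten XML and the number of addresses changed.
    The host address nested inside <source> is left untouched.
    """
    root = ET.fromstring(devices_xml)
    changed = 0
    for hostdev in root.iter("hostdev"):
        # findall("address") matches direct children only, so the
        # <source><address/></source> host address is skipped.
        for addr in hostdev.findall("address"):
            # Only zero vs. non-zero matters here, so hex parsing is enough.
            if addr.get("type") == "pci" and int(addr.get("slot", "0"), 16) != 0:
                addr.set("slot", "0")
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed

sample = """<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x41' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0'/>
  </hostdev>
</devices>"""

fixed, count = zero_hostdev_guest_slots(sample)
print(count)
print(fixed)
```

Only the guest address is rewritten; the source address `0000:41:00.0` of the host device stays as generated.
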
lspci output from inside the VM:

00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
00:01.0 VGA compatible controller: Cirrus Logic GD 5446
00:02.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:02.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:02.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:02.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:02.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:02.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:02.6 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:02.7 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.6 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:03.7 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:1f.0 ISA bridge: Intel Corporation 82801IB (ICH9) LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 02)
01:00.0 3D controller: NVIDIA Corporation GA102GL [A10] (rev a1)
02:00.0 PCI bridge: Red Hat, Inc. Device 000e
04:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
05:00.0 SCSI storage controller: Red Hat, Inc. Virtio SCSI (rev 01)
06:00.0 USB controller: Red Hat, Inc. QEMU XHCI Host Controller (rev 01)
07:00.0 Communication controller: Red Hat, Inc. Virtio console (rev 01)
08:00.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon (rev 01)

To Reproduce
Create a VM template with a PCI device and MACHINE type 'q35', then try to deploy it.

Expected behavior
VM should be deployed successfully.

Details

  • Affected Component: [Core]
  • Hypervisor: [KVM]
  • Version: [development]

Progress Status

  • Code committed
  • Testing - QA
  • Documentation (Release notes - resolved issues, compatibility, known issues)
rsmontero self-assigned this Feb 5, 2024
rsmontero added this to the Release 6.8.2 milestone Feb 5, 2024
rsmontero added a commit that referenced this issue Feb 8, 2024
If the q35 machine type is detected, the slot of the PCI device is set to 0
and the bus to pci_id + 1.

Q35 models use pcie-root-ports to attach PCI devices. Each port is
selected by the bus parameter of the PCI address and does not accept a
slot number greater than 0.
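
The rule can be sketched in a few lines. This is an illustration of the mapping described in this commit message, not the OpenNebula implementation: the function name `q35_vm_address` is hypothetical, and the output formats are taken from the attribute values shown in this issue.

```python
def q35_vm_address(pci_id):
    """Guest PCI address attributes for a passthrough device on q35.

    Slot is always 0 and the bus is pci_id + 1, so each device lands
    on its own pcie-root-port.
    """
    bus = pci_id + 1
    return {
        "VM_DOMAIN": "0x0000",
        "VM_BUS": f"0x{bus:02x}",
        "VM_SLOT": "0000",
        "VM_FUNCTION": "0",
        "VM_ADDRESS": f"{bus:02x}:00.0",
    }

for pci_id in (0, 1):
    print(pci_id, q35_vm_address(pci_id)["VM_ADDRESS"])
```

With PCI_ID 0 and 1 this yields the guest addresses `01:00.0` and `02:00.0`.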

Example:

A VM with 2 X710 VFs is defined in OpenNebula as:

PCI=[
  ADDRESS="0000:44:0a:0",
  BUS="44",
  CLASS="0200",
  DEVICE="154c",
  DOMAIN="0000",
  FUNCTION="0",
  NUMA_NODE="0",
  PCI_ID="0",
  SHORT_ADDRESS="44:0a.0",
  SLOT="0a",
  VENDOR="8086",
  VM_ADDRESS="01:00.0",
  VM_BUS="0x01",
  VM_DOMAIN="0x0000",
  VM_FUNCTION="0",
  VM_SLOT="0000" ]

PCI=[
  ADDRESS="0000:44:0a:1",
  BUS="44",
  CLASS="0200",
  DEVICE="154c",
  DOMAIN="0000",
  FUNCTION="1",
  NUMA_NODE="0",
  PCI_ID="1",
  SHORT_ADDRESS="44:0a.1",
  SLOT="0a",
  VENDOR="8086",
  VM_ADDRESS="02:00.0",
  VM_BUS="0x02",
  VM_DOMAIN="0x0000",
  VM_FUNCTION="0",
  VM_SLOT="0000" ]

Each PCI VF is attached to a different pcie-root-port, selected with the
VM_BUS parameter:

00:02.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
00:02.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port

The PCI topology is:

-[0000:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
           +-01.0  Cirrus Logic GD 5446
           +-02.0-[01]----00.0  Intel Corporation Ethernet Virtual Function 700 Series
           +-02.1-[02]----00.0  Intel Corporation Ethernet Virtual Function 700 Series
           +-02.2-[03-04]----00.0-[04]--
           +-02.3-[05]----00.0  Red Hat, Inc. Virtio network device
           +-02.4-[06]----00.0  Red Hat, Inc. Virtio SCSI
           +-02.5-[07]----00.0  Red Hat, Inc. QEMU XHCI Host Controller
           +-02.6-[08]----00.0  Red Hat, Inc. Virtio console
           +-02.7-[09]----00.0  Red Hat, Inc. Virtio memory balloon
           +-03.0-[0a]--
           +-03.1-[0b]--
           +-03.2-[0c]--
           +-03.3-[0d]--
           +-03.4-[0e]--
           +-03.5-[0f]--
           +-03.6-[10]--
           +-03.7-[11]--
           +-1f.0  Intel Corporation 82801IB (ICH9) LPC Interface Controller
           +-1f.2  Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
           \-1f.3  Intel Corporation 82801I (ICH9 Family) SMBus Controller
@rsmontero
Member

Docs commits
Code commits

rsmontero added a commit that referenced this issue Mar 11, 2024

(cherry picked from commit 4ce7340)