OpenShell MicroVM GPU passthrough fails on DGX Spark / GB10 due to ARM QEMU path and VFIO 1:1 IOMMU mapping requirement #1780

@hugoverjus

Agent Diagnostic

Skills Loaded

Per CONTRIBUTING.md, the agent cloned https://github.com/NVIDIA/OpenShell.git and loaded/used the relevant repo skills:

  • debug-openshell-cluster
  • openshell-cli
  • create-github-issue
  • create-spike

It also used a local project runbook skill from prior investigation:

  • openshell-gpu-vm

Environment Investigated

Target host: DGX Spark / GB10, aarch64

Observed:

hostname: spark-404e
OS: Ubuntu 24.04.3 LTS
kernel: 6.14.0-1013-nvidia
arch: aarch64
GPU: NVIDIA GB10
OpenShell: 0.0.56

GPU / IOMMU state:

GPU PCI BDF: 000f:01:00.0
IOMMU group: 20
IOMMU group type: DMA
GPU is alone in its IOMMU group
/dev/kvm exists
CPU OpenShell VM sandbox works

What The Agent Found In OpenShell Source

The VM GPU/QEMU path is currently x86-specific in crates/openshell-driver-vm/src/runtime.rs:

let mut qemu_cmd = StdCommand::new("qemu-system-x86_64");

qemu_cmd
    .arg("-machine")
    .arg("q35,accel=kvm")

The same path attaches the GPU as:

-device vfio-pci,host=<BDF>,bus=gpu_root

The VM driver GPU inventory logic worked correctly on this host:

GPU inventory initialized gpu_count=1
assigned GPU to sandbox bdf=000f:01:00.0 iommu_group=20

So the initial OpenShell-side discovery/assignment is working once the GPU is bound to vfio-pci.
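
For reference, a bind like this is usually done through the PCI sysfs driver_override interface. The exact commands the agent ran are not recorded in this issue, so the Rust sketch below is an assumption. rebind_steps and apply_steps are hypothetical helpers, not OpenShell code.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Build the ordered sysfs writes that hand a PCI device to `driver`.
/// Hypothetical helper: `sysfs_root` is normally "/sys" and is a parameter
/// only so the logic can be exercised against a scratch directory.
fn rebind_steps(sysfs_root: &Path, bdf: &str, driver: &str) -> Vec<(PathBuf, String)> {
    let dev = sysfs_root.join("bus/pci/devices").join(bdf);
    vec![
        // 1. Tell the PCI core which driver may claim this device.
        (dev.join("driver_override"), driver.to_string()),
        // 2. Detach whatever driver currently owns it (nvidia on this host).
        (dev.join("driver/unbind"), bdf.to_string()),
        // 3. Ask the PCI core to re-probe, which now selects `driver`.
        (sysfs_root.join("bus/pci/drivers_probe"), bdf.to_string()),
    ]
}

/// Perform the writes in order. The unbind step is skipped when no driver
/// is bound, because the `driver` symlink does not exist in that case.
fn apply_steps(steps: &[(PathBuf, String)]) -> io::Result<()> {
    for (path, value) in steps {
        if path.ends_with("driver/unbind") && !path.exists() {
            continue;
        }
        fs::write(path, value)?;
    }
    Ok(())
}
```

Run as root with the vfio-pci module loaded, apply_steps(&rebind_steps(Path::new("/sys"), "000f:01:00.0", "vfio-pci")) would be the equivalent of the bind in this report.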

What The Agent Tried

  1. Stopped the running GPU workloads, which were Docker containers using the GB10.

  2. Installed OpenShell 0.0.56 arm64 package.

  3. Verified CPU VM sandbox works.

  4. Bound 000f:01:00.0 to vfio-pci.

  5. Started root OpenShell VM GPU gateway.

  6. Confirmed gateway sees GPU:

    GPU inventory initialized gpu_count=1
    
  7. Tried openshell sandbox create --gpu.

Initial failure:

VM kernel not found: /root/.local/share/openshell/vm-runtime/0.0.56/vmlinux

  1. Staged a temporary QEMU runtime with host kernel:

    /root/.local/share/openshell/qemu-runtime-host-kernel/vmlinux
    
  2. Patched cached VM rootfs with matching host modules/user-space pieces, similar to previous x86_64 bring-up:

    • overlay.ko
    • veth.ko
    • NVIDIA kernel modules
    • NVIDIA firmware
    • libcuda.so
    • libnvidia-ml.so
    • nvidia-smi
    • policy/device-node fixes
  3. Cloned OpenShell source at v0.0.56.

  4. Patched VM runtime to use ARM QEMU on aarch64:

    qemu-system-aarch64
    -machine virt,accel=kvm,gic-version=3
    
  5. Built a custom openshell-driver-vm and configured the gateway to use it.

This got past the hardcoded x86 QEMU issue. QEMU started, but VFIO then failed.
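
The aarch64 change described above could be generalized into an architecture switch rather than a second hardcoded binary. The following is a minimal sketch, not the actual patch. qemu_target and base_qemu_command are hypothetical names, and std::process::Command stands in for the StdCommand alias seen in runtime.rs (assumed to be the same type).

```rust
use std::process::Command;

/// Map a target architecture to (QEMU binary, -machine argument).
/// The x86_64 values are the ones currently hardcoded in runtime.rs;
/// the aarch64 values are the ones used in the local patch.
fn qemu_target(arch: &str) -> Option<(&'static str, &'static str)> {
    match arch {
        "x86_64" => Some(("qemu-system-x86_64", "q35,accel=kvm")),
        "aarch64" => Some(("qemu-system-aarch64", "virt,accel=kvm,gic-version=3")),
        // Unknown architectures are reported to the caller instead of
        // silently falling back to the x86 binary.
        _ => None,
    }
}

/// Start building the QEMU command line for `arch`.
/// Device topology (for example the GPU root port) is left to the caller,
/// since the ARM-compatible layout is still an open question in this issue.
fn base_qemu_command(arch: &str) -> Option<Command> {
    let (binary, machine) = qemu_target(arch)?;
    let mut cmd = Command::new(binary);
    cmd.arg("-machine").arg(machine);
    Some(cmd)
}
```

A caller would pass std::env::consts::ARCH, which is "aarch64" on the DGX Spark and "x86_64" on the hosts where the current path already works.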

Current Blocker

QEMU/VFIO fails opening the GB10 device:

qemu-system-aarch64: -device vfio-pci,host=000f:01:00.0,bus=gpu_root:
vfio 000f:01:00.0: error getting device from group 20: Invalid argument
Verify all devices in group 20 are bound to vfio-<bus> or pci-stub and not already in use

Kernel log shows the more specific platform/IOMMU reason:

vfio-pci 000f:01:00.0: Firmware has requested this device have a 1:1 IOMMU mapping,
rejecting configuring the device without a 1:1 mapping. Contact your platform vendor.

IOMMU group reserved regions include direct mappings:

/sys/kernel/iommu_groups/20/type: DMA

reserved_regions:
0x0000000008000000 0x00000000080fffff msi
0x00000000a1600000 0x00000000b97fffff direct
0x0000000200000000 0x0000000302ffffff direct
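
A driver-side preflight could read the same file and flag direct regions before QEMU is launched, so the user gets a clear message instead of the generic VFIO error. The sketch below assumes that direct entries are what triggers the kernel's 1:1 mapping refusal, which is inferred from the logs in this report rather than confirmed. All names are hypothetical.

```rust
use std::fs;
use std::io;

/// One line of /sys/kernel/iommu_groups/<N>/reserved_regions.
#[derive(Debug, Clone, PartialEq)]
struct ReservedRegion {
    start: u64,
    end: u64, // inclusive end address
    kind: String,
}

impl ReservedRegion {
    /// Size of the region in bytes.
    fn len(&self) -> u64 {
        self.end - self.start + 1
    }
}

fn parse_hex(s: &str) -> Option<u64> {
    u64::from_str_radix(s.strip_prefix("0x").unwrap_or(s), 16).ok()
}

/// Parse "<start> <end> <type>" lines; lines that do not match are skipped.
fn parse_reserved_regions(text: &str) -> Vec<ReservedRegion> {
    text.lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            let start = parse_hex(fields.next()?)?;
            let end = parse_hex(fields.next()?)?;
            let kind = fields.next()?.to_string();
            if end < start {
                return None;
            }
            Some(ReservedRegion { start, end, kind })
        })
        .collect()
}

/// Regions that firmware asked to be identity-mapped.
fn direct_regions(regions: &[ReservedRegion]) -> Vec<&ReservedRegion> {
    regions.iter().filter(|r| r.kind == "direct").collect()
}

/// Read and parse the reserved regions of one IOMMU group from sysfs.
fn read_group_regions(group: u32) -> io::Result<Vec<ReservedRegion>> {
    let path = format!("/sys/kernel/iommu_groups/{}/reserved_regions", group);
    Ok(parse_reserved_regions(&fs::read_to_string(path)?))
}
```

For group 20 on this host, direct_regions would return two entries of 0x18200000 bytes (386 MiB) and 0x103000000 bytes (4144 MiB).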

The agent also tried to set the IOMMU group type to identity, but the kernel rejected it:

echo identity > /sys/kernel/iommu_groups/20/type
write error: Operation not permitted

The agent also tried an explicit QEMU iommufd object, but this distro QEMU build does not support it:

qemu-system-aarch64: -object iommufd,id=iommufd0:
Parameter 'qom-type' does not accept value 'iommufd'
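
Whether a given QEMU build knows the iommufd object can be probed up front instead of discovered through a failed launch. The sketch below assumes that `-object help` prints one object type name per line, which matches common QEMU builds but is not verified against the distro build used here. Both function names are hypothetical, and this probe does not address the 1:1 mapping refusal itself.

```rust
use std::io;
use std::process::Command;

/// True if the `-object help` listing contains an `iommufd` entry.
/// Matching on the trimmed line avoids false hits on longer names
/// that merely contain the substring.
fn lists_iommufd(object_help: &str) -> bool {
    object_help.lines().any(|line| line.trim() == "iommufd")
}

/// Run `<qemu_binary> -object help` and check for iommufd support.
/// Returns an error if the binary cannot be executed.
fn qemu_supports_iommufd(qemu_binary: &str) -> io::Result<bool> {
    let output = Command::new(qemu_binary)
        .args(&["-object", "help"])
        .output()?;
    Ok(lists_iommufd(&String::from_utf8_lossy(&output.stdout)))
}
```

On the host in this report, qemu_supports_iommufd("qemu-system-aarch64") would be expected to return Ok(false), consistent with the qom-type error above.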

Conclusion

The agent confirmed two separate issues:

  1. OpenShell VM GPU QEMU path is currently x86-specific and needs an aarch64 path:

    • qemu-system-aarch64
    • virt machine
    • ARM-compatible device topology
  2. After patching that locally, DGX Spark / GB10 passthrough is still blocked by platform firmware/IOMMU behavior:

    • firmware requires 1:1 IOMMU mapping
    • kernel refuses VFIO setup without that mapping
    • userspace/OpenShell cannot enable the required mapping at runtime

The host was cleaned up after testing:

  • GPU restored to nvidia
  • OpenShell test gateways stopped
  • No GPU VM sandbox left running

Description

OpenShell VM GPU passthrough on DGX Spark / GB10 fails: after patching the VM driver to use ARM QEMU, QEMU reaches VFIO but cannot open the GPU.
The kernel rejects VFIO setup because firmware requires a 1:1 IOMMU mapping; the kernel log reports "rejecting configuring the device without a 1:1 mapping".
Expected: openshell sandbox create --gpu should boot an aarch64 QEMU MicroVM with the GB10 passed through, or clearly document that Spark/GB10 VFIO passthrough is unsupported.

Reproduction Steps

  1. Use a DGX Spark / GB10 aarch64 host where the GPU is visible and alone in its IOMMU group:

    • GPU BDF: 000f:01:00.0
    • IOMMU group: 20
    • /dev/kvm exists
  2. Install OpenShell 0.0.56 and verify a non-GPU VM sandbox works.

  3. Stop GPU workloads, then bind the GPU to vfio-pci.

  4. Patch/build openshell-driver-vm so the QEMU path uses:

    • qemu-system-aarch64
    • virt,accel=kvm,gic-version=3
  5. Start a root OpenShell VM GPU gateway with OPENSHELL_VM_GPU=true.

  6. Run:

    openshell sandbox create -g spark-vm-gpu --from base --gpu --no-keep --no-tty -- uname -a
  7. QEMU reaches VFIO but fails opening the device with the 1:1 IOMMU mapping error.

Environment

OS: Ubuntu 24.04.3 LTS (noble), aarch64

Kernel: 6.14.0-1013-nvidia

Docker: installed and running on the host

OpenShell: 0.0.56

Hardware: DGX Spark / GB10

GPU: NVIDIA GB10, PCI BDF 000f:01:00.0

IOMMU: GPU is alone in IOMMU group 20; group type is DMA

KVM: /dev/kvm exists

NVIDIA driver: 580.95.05, open kernel module

QEMU: the stock OpenShell VM GPU path invokes qemu-system-x86_64; for this test it was locally patched to use qemu-system-aarch64 on the aarch64 host

Logs

Agent-First Checklist

  • I pointed my agent at the repo and had it investigate this issue
  • I loaded relevant skills (e.g., debug-openshell-cluster, debug-inference, openshell-cli)
  • My agent could not resolve this — the diagnostic above explains why
