nvidia: make Resizable BAR resize failure non-fatal, and skip proactively on Thunderbolt #1109
Open
orkineric wants to merge 2 commits into NVIDIA:main from
nv_resize_pcie_bars() is an optimization: it tries to grow BAR1 to the
largest size the hardware advertises so the CPU can address the full
VRAM directly. When the resize fails -- typically because the upstream
bridge's prefetchable MMIO window is too small to accommodate the
requested size -- the driver currently treats this as a fatal probe
error and bails out via err_zero_dev, preventing the GPU from binding
at all.
This is overly aggressive. The GPU is still perfectly usable with its
existing (un-resized) BAR allocation; that is the entire point of
Resizable BAR being an optional enhancement rather than a hard
requirement. Systems that cannot accommodate the full resize include:
- Thunderbolt / USB4 eGPU enclosures, where the hotplug PCIe bridge
prefetchable window is typically hundreds of MiB, not tens of GiB.
With a modern GPU advertising a maximum BAR1 size of 16-32 GiB,
pci_resize_resource() returns -ENOENT and nv_pci_probe()
fails the whole device, so the eGPU silently never appears in
nvidia-smi.
- Hypervisor guests where the host has passed a constrained MMIO
window through to the guest.
- Older chipsets with small prefetchable windows.
- Platforms where the firmware has locked resources conservatively
(preserve_config set). The existing code already detects
preserve_config and returns early without failure -- this patch
extends the same "skip but keep going" principle to all other
failure modes.
Replace the goto err_zero_dev with a warning print and continue probe.
The device will bind with whatever BAR size was allocated at PCI
enumeration time, which for constrained bridges is already the largest
size that fits.
Tested on an RTX 5090 in a Gigabyte Aorus RTX 5090 AI Box (TB5) on
Fedora 43 / kernel 7.0.0-rc7 + open-gpu-kernel-modules 595.58.03.
Without this patch, the eGPU fails to bind during probe with the
"Fatal Error while attempting to resize PCIe BARs" message and no
further action is possible. With this patch, the eGPU binds
successfully with its initial 256 MiB BAR1 (the largest that fits the
Thunderbolt hotplug bridge prefetch window) and works normally for
CUDA compute workloads.
Signed-off-by: Eric Christenson <eric@neuralnetwork.media>
The previous commit ("nvidia: make Resizable BAR resize failure
non-fatal") is the primary bug fix: it ensures that a failed resize no
longer prevents device binding. This commit is a complementary
optimization on top of that fix.
Thunderbolt / USB4 hotplug PCIe bridges fundamentally cannot host a
GiB-scale prefetchable MMIO window: the bridge prefetchable allocation
on these buses is typically bounded to hundreds of MiB, which is far
smaller than the multi-GiB BAR1 a modern NVIDIA GPU advertises.
Attempting the resize on such a device wastes probe time, emits an
uninformative ENOENT in the kernel log, and then takes the failure
path (now softened to a warning by the previous commit).
Avoid all of that by detecting Thunderbolt attachment up front via
pci_is_thunderbolt_attached(), which checks the device and walks its
upstream bridge chain looking for any bridge with is_thunderbolt set
(the PCI core sets this flag during enumeration when it finds Intel's
Thunderbolt vendor-specific capability on the device). The
helper has been available in <linux/pci.h> since Linux v4.15
(2017-12-04). For older kernels, the code is gated behind a conftest
check (NV_PCI_IS_THUNDERBOLT_ATTACHED_PRESENT) and the original resize
attempt is used unchanged; older kernels also predate most of the
hardware this optimization targets, so the protection is low-value
there but the guard keeps the driver build-clean on ancient trees.
Non-Thunderbolt devices are unaffected: pci_is_thunderbolt_attached()
returns false for any GPU on a native PCIe slot (CPU root complex or
chipset downstream port), so normal ReBAR continues to run and
GPUs keep their full resized BAR1.
Tested together with the previous commit on an RTX 5090 in a Gigabyte
Aorus RTX 5090 AI Box (TB5), next to an internal RTX 5090 in a PCIe
5.0 x16 slot. Result: the internal card keeps its full 32 GiB resized
BAR1 (verified by lspci and /sys/bus/pci/devices/.../resource);
the eGPU stays at 256 MiB BAR1 (the largest that fits the TB5 hotplug
bridge window) and binds cleanly without the resize attempt.
Signed-off-by: Eric Christenson <eric@neuralnetwork.media>
nvidia: make Resizable BAR resize failure non-fatal, and skip proactively on Thunderbolt
Problem
nv_resize_pcie_bars() is invoked during nv_pci_probe() as an
optimization: it tries to grow BAR1 to the largest size the hardware
advertises so the CPU can address the full VRAM directly. When the
resize fails, the driver currently treats this as a fatal probe error
and bails out via err_zero_dev, preventing the GPU from binding at
all.
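To make "the largest size the hardware advertises" concrete, here is a minimal userspace sketch, not the driver's code. It assumes the advertised sizes arrive as a bitmask in which bit n means "size index n is supported", and it uses the Resizable BAR encoding in which index n corresponds to 2^(n + 20) bytes (index 0 = 1 MiB). The helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: index of the highest set bit, or -1 if mask is 0. */
static int highest_set_bit(uint32_t mask)
{
    int bit = -1;

    while (mask) {
        mask >>= 1;
        bit++;
    }
    return bit;
}

/* Resizable BAR size index n encodes 2^(n + 20) bytes (index 0 = 1 MiB). */
static uint64_t rebar_index_to_bytes(int index)
{
    return (uint64_t)1 << (index + 20);
}

/* Largest advertised BAR size in bytes, or 0 if nothing is advertised. */
static uint64_t largest_advertised_bar_bytes(uint32_t possible_sizes)
{
    int index = highest_set_bit(possible_sizes);

    return index < 0 ? 0 : rebar_index_to_bytes(index);
}
```

With this encoding, index 8 is 256 MiB and index 15 is 32 GiB, which are the two BAR1 sizes discussed in this PR.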
This is the code path in question (kernel-open/nvidia/nv-pci.c,
595.58.03 line numbers):
The "fatal error" framing is too strong. Resizable BAR is an optional
PCI Express capability, not a correctness requirement. A GPU
with a non-resized BAR1 is still fully functional for CUDA, graphics,
and everything else the driver supports -- it just uses the DMA path
instead of direct CPU-mapped access to the full VRAM.
Motivating case: Thunderbolt 5 eGPU enclosures
This limitation is particularly visible with Thunderbolt / USB4 eGPU
enclosures, which have become much more common with products like the
Gigabyte Aorus RTX 5090 AI Box, Razer Core X, Sonnet Breakaway Box,
OneXGPU, etc. TB/USB4 hotplug PCIe bridges have a prefetchable MMIO
window on the order of hundreds of MiB, which cannot accommodate the
GiB-scale BAR1 that modern NVIDIA GPUs advertise.
pci_resize_resource() returns -ENOENT, nv_resize_pcie_bars() returns
non-zero, the existing code path fails, and the eGPU silently never
appears in nvidia-smi.
A reproduction:
Thunderbolt 5 (Intel JHL9580 "Barlow Ridge" host controller on
motherboard, JHL9480 hub in enclosure)
pci=assign-busses,hpbussize=0x10,hpmmiosize=64M,hpmmioprefsize=384M,realloc pcie_port_pm=off pcie_aspm.policy=performance intel_iommu=off thunderbolt.clx=0
(The kernel cmdline sets the hotplug bridge prefetchable window to
384 MiB, which fits BAR1 at the default 256 MiB size but cannot fit
a 32 GiB resized BAR1.)
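The arithmetic behind that parenthetical can be sketched as follows. This is a simplified userspace model: it ignores alignment and any other prefetchable BARs that share the bridge window. The function name and the size bitmask used in the usage note are hypothetical, not values read from the hardware.

```c
#include <stdint.h>

/*
 * Return the largest Resizable BAR size index (2^(n + 20) bytes) that is
 * both advertised in possible_sizes and no larger than the bridge window.
 * Returns -1 if no advertised size fits.
 */
static int largest_fitting_index(uint32_t possible_sizes, uint64_t window_bytes)
{
    for (int n = 31; n >= 0; n--) {
        if (!(possible_sizes & (1u << n)))
            continue;                       /* size not advertised */
        if (((uint64_t)1 << (n + 20)) <= window_bytes)
            return n;                       /* largest size that fits */
    }
    return -1;
}
```

For a hypothetical mask 0xFF00 (256 MiB through 32 GiB), a 384 MiB window yields index 8 (256 MiB): the next size up, 512 MiB, already exceeds the window, so a 32 GiB request cannot succeed.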
Observed without this PR:
Observed with this PR:
Beyond eGPUs, the same non-fatal semantics help in other constrained
topologies:
window from the host.
tuned for ReBAR.
host->preserve_config is set but the existing guard didn't cover every failure path.
Fix (two commits)
Commit 1: "nvidia: make Resizable BAR resize failure non-fatal"
This is the correctness fix. Replace the goto err_zero_dev with a
warning print and continue probe. The device will bind with whatever
BAR size was allocated at PCI enumeration time. This commit alone
fixes the reported symptom.
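The behavioral change in commit 1 can be modeled in userspace as below. This is only a sketch of the control flow, not the patch itself: probe_before(), probe_after(), and the resize callbacks are hypothetical stand-ins for nv_pci_probe() and nv_resize_pcie_bars(), and fprintf() stands in for the driver's warning print.

```c
#include <errno.h>
#include <stdio.h>

enum probe_result { PROBE_BOUND, PROBE_FAILED };

/* Hypothetical stand-in for the resize step: 0 on success, negative errno on failure. */
typedef int (*resize_fn)(void);

/* Before the patch: any resize failure aborts the probe (the err_zero_dev path). */
static enum probe_result probe_before(resize_fn resize)
{
    if (resize() != 0)
        return PROBE_FAILED;
    return PROBE_BOUND;
}

/* After the patch: a resize failure is logged and the probe continues. */
static enum probe_result probe_after(resize_fn resize)
{
    int rc = resize();

    if (rc != 0)
        fprintf(stderr, "warning: BAR resize failed (%d), keeping current BAR size\n", rc);
    return PROBE_BOUND;
}

/* Example callbacks: a successful resize and a bridge window that is too small. */
static int resize_ok(void) { return 0; }
static int resize_window_too_small(void) { return -ENOENT; }
```

In this model probe_before(resize_window_too_small) fails the device, while probe_after() binds it with the BAR size it already had; the success path is identical in both.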
Commit 2: "nvidia: skip Resizable BAR for Thunderbolt-attached devices"
This is a complementary optimization on top of commit 1. Detect
Thunderbolt attachment via pci_is_thunderbolt_attached() (in
<linux/pci.h> since Linux v4.15 / 2017) and skip the resize attempt
proactively, rather than attempting it and falling through to the
warning from commit 1. This saves some probe time and keeps the
kernel log free of the uninformative -ENOENT from
pci_resize_resource().
The helper is gated behind a new conftest
(NV_PCI_IS_THUNDERBOLT_ATTACHED_PRESENT) in kernel-open/conftest.sh
and declared in kernel-open/nvidia/nvidia.Kbuild, so older kernels
without the helper still build cleanly and fall back to the generic
commit-1 behavior. Non-Thunderbolt devices are unaffected --
pci_is_thunderbolt_attached() returns false for a GPU on a native
PCIe slot, so normal ReBAR continues to run.
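The topology check can be illustrated with a small userspace model. The struct and function names here are hypothetical; the real helper operates on struct pci_dev in the kernel, and the real patch additionally wraps the call in the NV_PCI_IS_THUNDERBOLT_ATTACHED_PRESENT conftest guard, which this sketch omits.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PCI device: a flag plus a link to its upstream bridge. */
struct model_dev {
    bool is_thunderbolt;
    const struct model_dev *upstream;   /* NULL at the root */
};

/* True if the device itself or any bridge above it is a Thunderbolt device. */
static bool model_is_thunderbolt_attached(const struct model_dev *dev)
{
    for (; dev != NULL; dev = dev->upstream) {
        if (dev->is_thunderbolt)
            return true;
    }
    return false;
}

/* Commit 2's policy in this model: only attempt the resize off Thunderbolt. */
static bool should_attempt_resize(const struct model_dev *gpu)
{
    return !model_is_thunderbolt_attached(gpu);
}
```

A GPU whose upstream chain contains a Thunderbolt bridge skips the resize, while a GPU attached directly to a root port does not, which matches the "internal card keeps its full BAR1, eGPU stays at 256 MiB" result reported above.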
Compatibility and review notes
Commit 2 can be dropped if reviewers prefer a minimal change. But
the two together are tidier because commit 2 keeps the kernel log
clean on what is now the most common failure case (TB eGPUs).
behavioral fix; nothing to document in nv-reg.h.
topology-based and only matches GPUs behind TB bridges.
fatal" path: nv_pci_validate_bars() still bails out hard on BAR0
checks elsewhere in nv_pci_probe(), and err_zero_dev is still used
for the failures that are genuinely fatal (e.g.
rm_init_private_state()). This PR only changes the one specific
failure mode that had false-positive fatality.
pci_rebar_get_possible_sizes case immediately above the new entry).
Related
#984 ("nvidia: add RmForceExternalGpu registry key" by @roger-pmta).
#984 teaches the driver to treat a
specified GPU as external; this PR makes sure the driver can
actually bind a TB-attached GPU in the first place, which is the
prerequisite for #984's registry key to do anything useful on a
TB5 eGPU like the Aorus AI Box. We've been running both patches
together in production for a dual-RTX-5090 workstation (internal
https://egpu.io/forums/builds/2023-14-lenovo-thinkpad-x1-carbon-gen-11-13th10cu-rtx-5080-32gbps-tb4-sonnet-breakaway-box-850-t5-linux-rocky-10-1/
(5080 / Sonnet) and in the companion configuration notes for the
RTX 5090 + Aorus AI Box setup this PR was developed against.