
fix(install): cargo-features.sh detects ROCm + Vulkan + DirectML, not just CUDA#1002

Merged
joelteply merged 2 commits into canary from mac/cargo-features-rocm-directml-vulkan-detection on May 2, 2026

Conversation

@joelteply
Contributor

Companion to #1001 (ORT EP cfg branches). cargo-features.sh now picks the right --features per detected GPU runtime. Windows-native gets directml (CUDA too if Nvidia present). Linux+AMD picks rocm or vulkan. Tested on Mac → unchanged.

🤖 Generated with Claude Code

Test and others added 2 commits May 1, 2026 21:46
Per Joel's "OOTB on all architectures from Docker" + "5090 Windows box
available later." Extends the ORT GPU EP coverage from #985 (Mac/CUDA
only) to the full Carl-OOTB matrix:

  --features rocm     → AMD GPU (Linux). ROCmExecutionProvider.
  --features directml → Windows-native, any DX12 GPU (Nvidia/AMD/Intel).
  --features openvino → Intel CPU/GPU/VPU (Linux + Windows).

Each is a cfg-gated branch in build_ort_gpu_execution_providers(). The
no-GPU-EP-configured error message now lists all 5 features so a
contributor on a new arch sees the right --features incantation.

Cargo.toml feature definitions added at lines ~199-207. Per Joel's
"GPU 100%" rule the EPs only activate when explicitly built with the
matching feature flag — no runtime CPU fallback.

Build verified: cargo check --features metal,accelerate is clean (the
new cfg branches don't fire on this Mac, so there's no compile cost).

Validation needed on real hardware:
  - BigMama or 5090 Windows box: --features cuda + --features directml
  - Linux+AMD box (when available): --features rocm
  - Intel-Arc Linux box (rarer): --features openvino

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(install): cargo-features.sh detects ROCm + Vulkan + DirectML, not
just CUDA

Per Joel's "OOTB on all architectures from Docker" and the ORT EP
coverage added in #1001. Before this fix the script only mapped
Mac→metal and Linux+Nvidia→cuda; ROCm was commented out, Vulkan was
absent, and Windows-native was unhandled entirely.

Detection order on Linux:
  1. nvidia-smi → cuda (highest priority — full ORT/llama.cpp/Candle)
  2. rocminfo  → rocm (AMD with ROCm runtime, full ORT EP)
  3. vulkaninfo → vulkan (AMD/Intel without ROCm; llama.cpp Vulkan
                  path; ORT EPs absent — will hard-fail at session
                  create per #985's helper, surfacing the gap clearly)
  4. else: empty → continuum-core panics at startup per #998 (no CPU
     fallback per architectural rule)
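The Linux cascade above can be sketched as follows. This is a minimal illustration, not the actual contents of cargo-features.sh: the function name `detect_linux_gpu_features` is invented for this sketch, though the probe commands (nvidia-smi, rocminfo, vulkaninfo) and the resulting feature strings are the ones the commit message describes.

```shell
#!/usr/bin/env sh
# Sketch of the Linux detection cascade: cuda > rocm > vulkan > nothing.
# Each probe requires the tool to both exist on PATH and run successfully,
# since a binary can be installed without a working driver/runtime.
detect_linux_gpu_features() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo "cuda,load-dynamic-ort"     # full ORT/llama.cpp/Candle coverage
  elif command -v rocminfo >/dev/null 2>&1 && rocminfo >/dev/null 2>&1; then
    echo "rocm,load-dynamic-ort"     # AMD with ROCm runtime, full ORT EP
  elif command -v vulkaninfo >/dev/null 2>&1 && vulkaninfo >/dev/null 2>&1; then
    echo "vulkan,load-dynamic-ort"   # llama.cpp Vulkan path; no ORT EP
  else
    echo ""                          # empty: continuum-core panics at startup
  fi
}
```

The empty-string branch is deliberate per the no-CPU-fallback rule: the script surfaces "no GPU runtime" rather than silently degrading.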

Windows-native (MINGW/MSYS/CYGWIN):
  - DirectML always (DX12 universal on Win10+)
  - +CUDA if nvidia-smi present (ORT picks CUDA first, DirectML for
    non-CUDA-supported ops)
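A sketch of the Windows-native branch, under the same caveat (the function name is illustrative, not the script's real code). MINGW/MSYS/Cygwin environments report a `uname -s` like `MINGW64_NT-10.0`, which is how a POSIX-ish shell script can tell it is running Windows-native:

```shell
#!/usr/bin/env sh
# Sketch of the Windows-native branch: DirectML always, CUDA prepended
# when an Nvidia GPU is visible via nvidia-smi.
detect_windows_gpu_features() {
  case "$(uname -s)" in
    MINGW*|MSYS*|CYGWIN*)
      features="directml"              # DX12 is universal on Win10+
      if command -v nvidia-smi >/dev/null 2>&1; then
        features="cuda,$features"      # ORT picks CUDA first; DirectML
      fi                               # covers ops CUDA doesn't support
      echo "$features"
      ;;
    *)
      echo ""                          # not Windows-native; other branches apply
      ;;
  esac
}
```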

Tested on this Mac: still resolves to "--features metal,accelerate"
(unchanged — Darwin branch).

Validation needed on real hardware:
  - 5090 Windows box: should resolve to "--features cuda,directml"
  - BigMama Linux+Nvidia: still "--features cuda,load-dynamic-ort"
    (unchanged)
  - Future Linux+AMD: will resolve to "--features rocm,load-dynamic-ort"
  - Future Linux+Intel-Arc with Vulkan loader: "--features vulkan,
    load-dynamic-ort"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply merged commit 4ebc556 into canary May 2, 2026
3 checks passed
@joelteply joelteply deleted the mac/cargo-features-rocm-directml-vulkan-detection branch May 2, 2026 02:50