bug: --device npu resolves to QNN on AMD machines (should use VitisAI) #429

@DingmaomaoBJTU

Description

Summary

When running winml perf --device npu (or any command that resolves device → EP) on an AMD machine, the tool hardcodes qnn as the NPU provider. Because QNNExecutionProvider is not present on AMD hardware, the session logs a warning, falls through to policy-based selection, and likely ends up running on CPU while still reporting npu as the device.

Context

The NPU EP is platform-dependent:

  • Qualcomm / Intel NPU → QNNExecutionProvider (short name qnn)
  • AMD NPU (Ryzen AI) → VitisAIExecutionProvider (short name vitisai)

The current code has "npu": "qnn" hardcoded as the single NPU provider, so any AMD NPU machine hits the wrong EP.

Observed behavior (screenshot)

winml perf -m openai/clip-vit-base-patch32 --device npu --iterations 100
...
[2026-04-30T15:16:37] WARNING: EP 'qnn' (QNNExecutionProvider) not found in available devices

The model runs and the reported device is still npu, but the QNN warning indicates that the session fell back to policy-based selection, which likely means CPU.

Root Cause

Three locations share the same hardcoded assumption:

  1. src/winml/modelkit/config/precision.py:66, the _DEVICE_TO_PROVIDER dict:

    _DEVICE_TO_PROVIDER: dict[str, str | None] = {
        "npu": "qnn",   # ← wrong for AMD
        "gpu": "dml",
        "cpu": None,
    }

    Used at precision.py:296 to set compile_provider during resolve_precision().

  2. src/winml/modelkit/commands/compile.py:280-283, in _resolve_compile_provider():

    provider = _DEVICE_TO_PROVIDER.get(device.lower())
    if provider is None:
        return "cpu" if device.lower() == "cpu" else "qnn"   # ← also hardcoded
    return provider

  3. src/winml/modelkit/session/session.py:438-451: once self._ep = "qnn" is propagated from the compile config, _build_session_options tries to find QNNExecutionProvider via _find_ep_device(). On AMD that lookup returns None, so the warning fires and the session falls through to policy-based selection.

The session already has the right mechanism for discovery (_find_ep_for_device, session.py:487-510), but it is bypassed when self._ep is pre-set to "qnn".

Desired State

When --device npu is requested:

  • On Qualcomm/Intel: resolves to qnn (QNNExecutionProvider) — current behavior, keep
  • On AMD Ryzen AI: resolves to vitisai (VitisAIExecutionProvider)

The EP selection for NPU should inspect which NPU EP is actually available at runtime (via ort.get_ep_devices() or _get_available_eps() from sysinfo/device.py) rather than hardcoding qnn.
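
A minimal sketch of that runtime lookup, under stated assumptions: pick_npu_provider, _NPU_EP_PRIORITY, and _NPU_EP_SHORT_NAMES are hypothetical names introduced here for illustration, and the function takes the list of available EP names as an argument instead of calling ort.get_ep_devices() or _get_available_eps() itself. In the real code the candidate list would presumably come from _DEVICE_EP_MAP["npu"] rather than a local tuple.

```python
from __future__ import annotations

# Assumed full-name -> short-name pairs, taken from the Context section above.
_NPU_EP_SHORT_NAMES: dict[str, str] = {
    "VitisAIExecutionProvider": "vitisai",
    "QNNExecutionProvider": "qnn",
}

# Assumed priority order; one possible ordering, not a confirmed design decision.
_NPU_EP_PRIORITY: tuple[str, ...] = (
    "VitisAIExecutionProvider",
    "QNNExecutionProvider",
)


def pick_npu_provider(available_eps: list[str]) -> str | None:
    """Return the short name of the first NPU EP that is actually available."""
    available = set(available_eps)
    for ep_name in _NPU_EP_PRIORITY:
        if ep_name in available:
            return _NPU_EP_SHORT_NAMES[ep_name]
    return None  # no NPU EP present; the caller decides how to fall back


# AMD-style machine: only the VitisAI EP is registered.
print(pick_npu_provider(["CPUExecutionProvider", "VitisAIExecutionProvider"]))
# Qualcomm/Intel-style machine: only the QNN EP is registered.
print(pick_npu_provider(["CPUExecutionProvider", "QNNExecutionProvider"]))
```

Returning None when no NPU EP is found leaves the fallback decision (error, warning, or CPU) to the caller instead of silently choosing qnn.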

Acceptance Criteria

  • winml perf --device npu on an AMD machine uses VitisAIExecutionProvider without warnings
  • winml perf --device npu on a Qualcomm/Intel machine continues to use QNNExecutionProvider
  • _DEVICE_TO_PROVIDER["npu"] is no longer a static string — it is either removed or replaced with a runtime lookup
  • The compile command (winml compile --device npu) resolves to vitisai on AMD
  • No new architecture-specific hardcoding is introduced (per CLAUDE.md Cardinal Rule 1)
  • Existing tests pass; new tests cover AMD-style EP discovery for NPU device

Technical Notes

  • sysinfo/device.py already has the right EP-to-device map (_EP_DEVICE_MAP) where both QNNExecutionProvider and VitisAIExecutionProvider map to "npu". The inverse _DEVICE_EP_MAP["npu"] already contains both.
  • resolve_device() (sysinfo/device.py:146) uses _DEVICE_EP_MAP and _get_available_eps() correctly — it returns the right device string. The problem is downstream: get_provider_for_device() then re-maps that device back to a single hardcoded EP.
  • The fix should replace get_provider_for_device("npu") with a function that inspects _get_available_eps() and picks the first available NPU EP from _DEVICE_EP_MAP["npu"] (suggested priority: VitisAIExecutionProvider, then QNNExecutionProvider, then any other NPU EP that is present).
  • precision.py is pure decision logic (no I/O, no imports of sysinfo at module level) — keep sysinfo calls in callers (build.py, compile.py) rather than inside precision.py to preserve testability.
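
One way to keep the decision logic pure, as the last note asks, is to have the caller pass in the already-discovered NPU provider. The sketch below is illustrative only: resolve_provider and resolve_for_command are hypothetical names, not functions from the repository, and discover_npu stands in for whatever sysinfo-based lookup build.py or compile.py would perform.

```python
from __future__ import annotations

from typing import Callable


def resolve_provider(device: str, npu_provider: str | None) -> str | None:
    """Pure decision logic: no I/O, the NPU provider is supplied by the caller."""
    device = device.lower()
    if device == "npu":
        return npu_provider
    if device == "gpu":
        return "dml"
    return None  # "cpu" maps to no provider, as in the original dict


def resolve_for_command(
    device: str, discover_npu: Callable[[], str | None]
) -> str | None:
    """Caller-side glue: runs runtime discovery only when NPU is requested."""
    npu_provider = discover_npu() if device.lower() == "npu" else None
    return resolve_provider(device, npu_provider)


# In a test, discovery is replaced by a plain lambda; no hardware is needed.
print(resolve_for_command("npu", lambda: "vitisai"))
print(resolve_for_command("gpu", lambda: "vitisai"))
```

Because resolve_provider never touches sysinfo, tests can cover AMD-style and Qualcomm-style machines by passing different strings, with no mocking of the ONNX Runtime.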

Related Files

  • src/winml/modelkit/config/precision.py:64-68: _DEVICE_TO_PROVIDER dict (primary fix location)
  • src/winml/modelkit/config/precision.py:296: compile_provider assignment in resolve_precision()
  • src/winml/modelkit/commands/compile.py:271-284: _resolve_compile_provider() secondary hardcode
  • src/winml/modelkit/session/session.py:438-451: EP matching in _build_session_options() (symptom surface)
  • src/winml/modelkit/sysinfo/device.py:37-58: _EP_DEVICE_MAP / _DEVICE_EP_MAP (already correct, use this)
  • src/winml/modelkit/sysinfo/device.py:112-143: _get_available_eps() (runtime EP discovery)

Metadata

Labels

  • 0430 bugbash (Bugs found during 0430 bug bash)
  • NPU (NPU specific)
  • bug (Something isn't working)
  • hardware (Hardware related)
  • need triage (Needs triage)
