wmk compile --ep qnn silently falls back to OpenVINO when QNN SDK is unavailable #186

@timenick

Description

Summary

wmk compile --ep qnn silently falls back to the OpenVINO EP when the QNN SDK is not installed, and emits no warning. The output file is misleadingly named *_qnn_ctx.onnx even though it is paired with an OpenVINO context binary (*_qnn_ctx_OpenVINOExecutionProvider.bin), which causes a downstream wmk perf run to crash.

Repro steps

# 1. On a machine without QNN SDK (Intel NPU + OpenVINO EP available)
wmk compile -m <quantized.onnx> --output-dir out/ --ep qnn

# Observe: no error, but logs show OpenVINO was used:
#   Session created with policy qnn, providers: ['OpenVINOExecutionProvider', 'CPUExecutionProvider']
# Output: out/resnet_quant_qnn_ctx.onnx + out/resnet_quant_qnn_ctx_OpenVINOExecutionProvider.bin

# 2. Run perf without --device (auto-device picks GPU, crashes)
wmk perf -m out/resnet_quant_qnn_ctx.onnx --iterations 10
# Error: Nv EP could not deserialize engine from cache: ...OpenVINOExecutionProvider.bin

# 3. Workaround: explicit --device npu
wmk perf -m out/resnet_quant_qnn_ctx.onnx --iterations 10 --device npu
# Works correctly

Root cause

compiler/stages/compile.py:69 passes EP name "qnn" as the device parameter to WinMLSession:

winml_session = session_cls(
    onnx_path=model_path,
    device=context.execution_provider,  # "qnn" is an EP name, not a device
)

In session.py:409-410, _build_session_options() does not find "qnn" in DEVICE_POLICY_MAP and silently defaults to PREFER_NPU:

policy = DEVICE_POLICY_MAP.get(
    device.lower(), ort.OrtExecutionProviderDevicePolicy.PREFER_NPU  # silent fallback
)

ORT's PREFER_NPU policy then walks the fallback chain (QNN → OpenVINO → DML → CPU) without error.
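
One way to remove the silent default in the lookup above is to reject unknown device strings outright. The following is a minimal sketch, not the project's actual code: DevicePolicy is a stand-in for ort.OrtExecutionProviderDevicePolicy, the contents of DEVICE_POLICY_MAP are assumed, and resolve_device_policy is a hypothetical helper name.

```python
from enum import Enum


class DevicePolicy(Enum):
    """Stand-in for ort.OrtExecutionProviderDevicePolicy (assumption)."""
    PREFER_CPU = "prefer_cpu"
    PREFER_GPU = "prefer_gpu"
    PREFER_NPU = "prefer_npu"


# Assumed contents; the real DEVICE_POLICY_MAP lives in session.py.
DEVICE_POLICY_MAP = {
    "cpu": DevicePolicy.PREFER_CPU,
    "gpu": DevicePolicy.PREFER_GPU,
    "npu": DevicePolicy.PREFER_NPU,
}


def resolve_device_policy(device: str) -> DevicePolicy:
    """Return the policy for a device name, raising on unknown names.

    Hypothetical replacement for the .get(..., PREFER_NPU) lookup: an
    unrecognized string such as "qnn" fails loudly instead of defaulting.
    """
    key = device.lower()
    if key not in DEVICE_POLICY_MAP:
        valid = ", ".join(sorted(DEVICE_POLICY_MAP))
        raise ValueError(
            f"Unknown device {device!r}; expected one of: {valid}. "
            "EP names such as 'qnn' are not device names."
        )
    return DEVICE_POLICY_MAP[key]
```

With this shape, passing "qnn" as the device raises a ValueError at session construction, so the mix-up between EP name and device name in compile.py would surface immediately.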

Expected behavior

  • If the user explicitly requests --ep qnn and the QNN SDK is not available, wmk compile should fail with an error (or at minimum warn loudly) rather than silently using a different EP.
  • Output file names should reflect the EP that was actually used.
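
Both expectations could be enforced with a check performed after the session is created. This is a sketch under assumptions: EP_PROVIDER_NAMES, check_requested_ep, and context_output_name are hypothetical names, and active_providers is a list like the one printed in the repro log (an ORT InferenceSession exposes such a list through get_providers()).

```python
from pathlib import Path

# Hypothetical mapping from --ep values to ORT provider names.
EP_PROVIDER_NAMES = {
    "qnn": "QNNExecutionProvider",
    "openvino": "OpenVINOExecutionProvider",
}


def check_requested_ep(requested_ep: str, active_providers: list[str]) -> str:
    """Return the provider name for requested_ep, or raise if it is inactive."""
    key = requested_ep.lower()
    if key not in EP_PROVIDER_NAMES:
        raise ValueError(f"Unknown EP {requested_ep!r}")
    expected = EP_PROVIDER_NAMES[key]
    if expected not in active_providers:
        raise RuntimeError(
            f"--ep {requested_ep} was requested, but the session uses "
            f"{active_providers}; refusing to fall back silently."
        )
    return expected


def context_output_name(model_path: str, provider: str) -> str:
    """Derive the context file name from the EP that was actually used."""
    tag = provider.removesuffix("ExecutionProvider").lower()
    return f"{Path(model_path).stem}_{tag}_ctx.onnx"
```

Given the provider list from the repro (['OpenVINOExecutionProvider', 'CPUExecutionProvider']), check_requested_ep("qnn", ...) raises instead of continuing, and naming the file from the active provider yields resnet_quant_openvino_ctx.onnx rather than a *_qnn_ctx.onnx name.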

Environment

  • wmk sys output: Intel AI Boost NPU, RTX 5080 GPU, QNN SDK "Not found", OpenVINO EP available
  • OS: Windows 11, ORT 1.23.3 (windowsml)

🤖 Generated with Claude Code

Metadata

Labels

  • NPU: NPU specific
  • P0: Critical (blocking, crash, data loss)
  • bug: Something isn't working
  • dev experience: Developer experience improvements
  • need triage: Needs triage
