
fix: thread ep parameter through to WinMLSession #404

Merged
DingmaomaoBJTU merged 11 commits into main from fix/thread-ep-to-session
Apr 29, 2026

DingmaomaoBJTU commented Apr 27, 2026

Summary

  • The --ep flag was silently dropped in the model construction path: WinMLAutoModel received it but never forwarded it along the WinMLPreTrainedModel → WinMLSession chain
  • Added ep parameter to WinMLPreTrainedModel.__init__() and forwarded it to WinMLSession
  • Passed ep at all three winml_class(...) construction sites in WinMLAutoModel (from_pretrained, from_onnx skip-build path, from_onnx build path)
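
The forwarding chain described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the classes are stand-ins that share the real class names, and the constructor signatures (`model_path`, `device`, the `winml_class` keyword) are assumptions made for the example.

```python
from typing import Optional


class WinMLSession:
    """Stand-in for the real session class; it only records its arguments."""

    def __init__(self, model_path: str, device: str = "cpu", ep: Optional[str] = None):
        self.model_path = model_path
        self.device = device
        self.ep = ep


class WinMLPreTrainedModel:
    """Stand-in model wrapper (signature is an assumption)."""

    def __init__(self, model_path: str, device: str = "cpu", ep: Optional[str] = None):
        # The fix: accept ep here and forward it instead of dropping it.
        self.session = WinMLSession(model_path, device=device, ep=ep)


class WinMLAutoModel:
    """Stand-in factory with a single construction site for brevity."""

    @classmethod
    def from_onnx(cls, model_path: str, device: str = "cpu",
                  ep: Optional[str] = None, winml_class=WinMLPreTrainedModel):
        # Every winml_class(...) call site has to pass ep explicitly.
        return winml_class(model_path, device=device, ep=ep)


model = WinMLAutoModel.from_onnx("model.onnx", device="npu", ep="qnn")
```

With this shape, `model.session.ep` carries the value the caller supplied, which is the property the PR restores.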

Default-path behavior change

Previously, when no --ep was set, _build_session_options always fell through to set_provider_selection_policy(...) (PREFER_NPU/GPU/CPU) and let ORT pick the EP. This PR also changes that path: for non-CPU devices, it now first calls _find_ep_for_device and, if a match is found, uses add_provider_for_devices directly — bypassing the device policy. This is intentional: it gives us explicit control over which physical device is used (matching what we already do for the --ep path) and avoids WinML EP registration issues that the providers= string-based path cannot handle.
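
As a rough sketch of the default-path decision described above (no explicit --ep), assuming a hypothetical helper `choose_default_ep_path` and a lookup callable standing in for `_find_ep_for_device`; the returned strings merely name which ORT call would be made and are not real API invocations:

```python
from typing import Callable, Optional, Tuple


def choose_default_ep_path(
    device: str,
    find_ep_for_device: Callable[[str], Optional[str]],
) -> Tuple[str, str]:
    """Decide how to configure the session when no explicit ep is given."""
    if device != "cpu":
        match = find_ep_for_device(device)
        if match is not None:
            # Explicit device binding; the device policy is bypassed.
            return ("add_provider_for_devices", match)
    # CPU, or no EP device discovered: let ORT pick via the device policy.
    return ("set_provider_selection_policy", f"PREFER_{device.upper()}")


# Hypothetical discovery result: a GPU EP is registered, no NPU EP is.
discovered = {"gpu": "DmlExecutionProvider"}
gpu_path = choose_default_ep_path("gpu", discovered.get)
npu_path = choose_default_ep_path("npu", discovered.get)
cpu_path = choose_default_ep_path("cpu", discovered.get)
```

Here only the GPU request takes the explicit-device path; the NPU request finds no match and the CPU request skips discovery, so both fall through to the policy.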

Fixes #402

🤖 Generated with Claude Code

The `--ep` flag was silently dropped in the model construction path.
`WinMLAutoModel.from_pretrained()` and `from_onnx()` received the `ep`
value but never forwarded it to `WinMLPreTrainedModel`, which in turn
never passed it to `WinMLSession`. This caused `ModelCompiler` to fall
back to policy-based EP selection, which it does not support, resulting
in an empty provider type string and a runtime crash.

Fixes #402
DingmaomaoBJTU requested a review from a team as a code owner on April 27, 2026 10:14
Policy-based EP selection (set_provider_selection_policy) does not work
for InferenceSession on the current ORT build — nodes end up with an
empty provider type string. When no explicit ep is set, fall back to
_DEVICE_TO_EP to resolve device ("gpu"→"dml", "npu"→"qnn") and use
add_provider_for_devices instead.

Fixes #402
ort.get_ep_devices() may not list DML, in which case _find_ep_device
returns None and the session falls back to the broken policy-based
path. Instead, resolve the providers list directly from the EP name map
and pass it via the InferenceSession(providers=...) parameter, which
does not depend on get_ep_devices().

Fixes #402
_find_ep_device previously matched only on ep_name and returned the
first hit. When multiple devices share the same EP (e.g., integrated
+ discrete GPU both using DmlExecutionProvider), this could select the
wrong physical device. Now also matches on OrtHardwareDeviceType,
consistent with the pattern used in runtime_checker_query and winml.py.

Fixes #402
Replace the static _DEVICE_TO_EP mapping (gpu→dml, npu→qnn) with
runtime discovery via ort.get_ep_devices() filtered by device type.
This correctly handles machines with non-default EPs (e.g., CUDA or
MIGraphX on GPU instead of DML).

Fixes #402
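
A minimal sketch of discovery filtered by device type, as the commit message above describes. The `EpDevice` dataclass and the `registry` list are stand-ins for whatever `ort.get_ep_devices()` returns on a given machine, and the field names are assumptions; the real function is `_find_ep_for_device` in session.py, whose exact body is not shown in this conversation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EpDevice:
    """Stand-in for one discovered EP/device pair (field names assumed)."""
    ep_name: str
    device_type: str  # "CPU", "GPU" or "NPU"


def find_ep_for_device(device: str, ep_devices: Sequence[EpDevice]) -> Optional[EpDevice]:
    """Return the first discovered entry whose hardware type matches device."""
    wanted = device.upper()
    for ep_device in ep_devices:
        if ep_device.device_type == wanted:
            return ep_device  # first match wins, so registry order matters
    return None


# Hypothetical machine where CUDA is registered ahead of DML for the GPU.
registry = [
    EpDevice("CPUExecutionProvider", "CPU"),
    EpDevice("CUDAExecutionProvider", "GPU"),
    EpDevice("DmlExecutionProvider", "GPU"),
]
gpu_choice = find_ep_for_device("gpu", registry)
npu_choice = find_ep_for_device("npu", registry)
```

On this hypothetical registry the GPU request resolves to the CUDA entry rather than a hardcoded DML, and the NPU request yields `None` so the caller can fall back.
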
When e2e_eval builds a model and then benchmarks the resulting .onnx
file, it calls _run_onnx_benchmark, which created WinMLSession with
device but without ep. This was the actual failing path: config.ep was
available but never forwarded to the session.

Fixes #402
QNN supports GPU via Qualcomm Adreno backend, but _EP_DEVICE_MAP
hardcoded it to NPU only. Change to "npu/gpu" and update _DEVICE_EP_MAP
generation to split multi-device strings so QNN appears in both the NPU
and GPU device lists.

Fixes #402
Two issues fixed:

1. Explicit EP (--ep qnn) no longer filters by device type in
   _find_ep_device. QNN reports as NPU in get_ep_devices() but can
   target GPU — trust the user's choice.

2. InferenceSession now uses _build_session_options() (which calls
   add_provider_for_devices, working with WinML EP registry) instead
   of the providers= string parameter (which tries standard DLL
   loading and fails for WinML-registered EPs like QNN). Falls back
   to providers= only when _build_session_options returns policy-based
   options.

Fixes #402
Comment thread src/winml/modelkit/session/session.py Outdated
WinML-registered EPs (e.g. QNN) do not support the providers= parameter
in InferenceSession. Remove _resolve_providers and the conditional
providers= path entirely. EP is now configured exclusively via
add_provider_for_devices in _build_session_options, or left to ORT
device policy in the fallback case.

timenick left a comment

Code review: 4 issues found. The core fix (threading ep from WinMLAutoModel → WinMLPreTrainedModel → WinMLSession) looks correct. Comments below are on the supporting changes in session.py.

🤖 Generated with Claude Code

- Remove unused `device` parameter from `_find_ep_device`; the caller
  intentionally skips device-type filtering (QNN reports as NPU but can
  target GPU), so the parameter was dead code that contradicted the
  comment at the call site
- Update `_build_session_options` docstring to document that `"cpu"` is
  excluded from the add_provider_for_devices path and falls through to
  policy-based selection
- Document registry-order dependency in `_find_ep_for_device`: when
  multiple EPs match the same device type the first one wins; callers
  that need a specific EP should set `self._ep` to bypass discovery
DingmaomaoBJTU merged commit ef2db0b into main on Apr 29, 2026
9 checks passed
DingmaomaoBJTU deleted the fix/thread-ep-to-session branch on April 29, 2026 07:09

Successfully merging this pull request may close these issues.

Bug: --ep flag is silently dropped, never reaches WinMLSession