fix: thread ep parameter through to WinMLSession #404
Merged
The `--ep` flag was silently dropped in the model construction path. `WinMLAutoModel.from_pretrained()` and `from_onnx()` received the `ep` value but never forwarded it to `WinMLPreTrainedModel`, which in turn never passed it to `WinMLSession`. This caused `ModelCompiler` to fall back to policy-based EP selection, which it does not support, resulting in an empty provider type string and a runtime crash. Fixes #402
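The forwarding chain described above can be illustrated with a minimal sketch. The classes below are simplified stand-ins that share only their names with the real ones; the constructor signatures and the `device` default are assumptions made for illustration, not the project's actual code.

```python
from typing import Optional


class WinMLSession:
    """Stand-in session: records which EP (if any) the caller asked for."""

    def __init__(self, model_path: str, device: str = "cpu", ep: Optional[str] = None):
        self.model_path = model_path
        self.device = device
        self.ep = ep  # None means "no explicit EP requested"


class WinMLPreTrainedModel:
    """Stand-in model wrapper that must forward `ep` to the session."""

    def __init__(self, model_path: str, device: str = "cpu", ep: Optional[str] = None):
        # Accepting `ep` here but leaving it out of the WinMLSession call
        # would silently discard the user's choice.
        self.session = WinMLSession(model_path, device=device, ep=ep)


class WinMLAutoModel:
    """Stand-in factory; the real class has several construction sites."""

    @classmethod
    def from_onnx(cls, model_path: str, device: str = "cpu", ep: Optional[str] = None):
        # Forward `ep` at every construction site, not just one of them.
        return WinMLPreTrainedModel(model_path, device=device, ep=ep)


model = WinMLAutoModel.from_onnx("model.onnx", device="npu", ep="qnn")
```

With the value forwarded at each layer, `model.session.ep` holds the caller's choice, and it stays `None` when no EP was requested.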
Policy-based EP selection (set_provider_selection_policy) does not work
for InferenceSession on the current ORT build — nodes end up with an
empty provider type string. When no explicit ep is set, fall back to
_DEVICE_TO_EP to resolve device ("gpu"→"dml", "npu"→"qnn") and use
add_provider_for_devices instead.
Fixes #402
ort.get_ep_devices() may not list DML, in which case _find_ep_device returns None and the session falls back to the broken policy-based path. Instead, resolve the providers list directly from the EP name map and pass it via the InferenceSession(providers=...) parameter, which does not depend on get_ep_devices(). Fixes #402
_find_ep_device previously matched only on ep_name and returned the first hit. When multiple devices share the same EP (e.g., integrated + discrete GPU both using DmlExecutionProvider), this could select the wrong physical device. Now also matches on OrtHardwareDeviceType, consistent with the pattern used in runtime_checker_query and winml.py. Fixes #402
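A rough sketch of the name-plus-type matching this commit describes is shown below. `FakeEpDevice` is a hypothetical stand-in for the entries returned by `ort.get_ep_devices()`, the device type is a plain string rather than the `OrtHardwareDeviceType` enum, and the sample data and EP name are invented.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FakeEpDevice:
    """Hypothetical stand-in for one entry of ort.get_ep_devices()."""

    ep_name: str
    device_type: str  # "cpu", "gpu" or "npu"; the real API uses an enum
    label: str


def find_ep_device(
    devices: Iterable[FakeEpDevice],
    ep_name: str,
    device_type: Optional[str] = None,
) -> Optional[FakeEpDevice]:
    """Return the first device matching the EP name and, if given, the type."""
    for dev in devices:
        if dev.ep_name != ep_name:
            continue
        if device_type is not None and dev.device_type != device_type:
            continue
        return dev
    return None


# Invented sample: one EP registered for two different device types.
devices = [
    FakeEpDevice("ExampleExecutionProvider", "cpu", "cpu0"),
    FakeEpDevice("ExampleExecutionProvider", "gpu", "gpu0"),
]
by_name_only = find_ep_device(devices, "ExampleExecutionProvider")
by_name_and_type = find_ep_device(devices, "ExampleExecutionProvider", "gpu")
```

Matching on the name alone returns whichever entry is listed first (`cpu0` here), while adding the device type selects `gpu0`.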
Replace the static _DEVICE_TO_EP mapping (gpu→dml, npu→qnn) with runtime discovery via ort.get_ep_devices() filtered by device type. This correctly handles machines with non-default EPs (e.g., CUDA or MIGraphX on GPU instead of DML). Fixes #402
When e2e_eval builds a model and then benchmarks the resulting .onnx file, it calls _run_onnx_benchmark, which created WinMLSession with device but without ep. This was the actual failing path: config.ep was available but never forwarded to the session. Fixes #402
QNN supports GPU via Qualcomm Adreno backend, but _EP_DEVICE_MAP hardcoded it to NPU only. Change to "npu/gpu" and update _DEVICE_EP_MAP generation to split multi-device strings so QNN appears in both the NPU and GPU device lists. Fixes #402
Two issues fixed:
1. Explicit EP (--ep qnn) no longer filters by device type in _find_ep_device. QNN reports as NPU in get_ep_devices() but can target GPU, so trust the user's choice.
2. InferenceSession now uses _build_session_options() (which calls add_provider_for_devices, working with WinML EP registry) instead of the providers= string parameter (which tries standard DLL loading and fails for WinML-registered EPs like QNN). Falls back to providers= only when _build_session_options returns policy-based options.
Fixes #402
xieofxie
reviewed
Apr 28, 2026
WinML-registered EPs (e.g. QNN) do not support the providers= parameter in InferenceSession. Remove _resolve_providers and the conditional providers= path entirely. EP is now configured exclusively via add_provider_for_devices in _build_session_options, or left to ORT device policy in the fallback case.
timenick
reviewed
Apr 29, 2026
Collaborator
timenick
left a comment
Code review — 4 issues found. The core fix (threading ep from WinMLAutoModel → WinMLPreTrainedModel → WinMLSession) looks correct. Comments below are on the supporting changes in session.py.
🤖 Generated with Claude Code
- Remove unused `device` parameter from `_find_ep_device`; the caller intentionally skips device-type filtering (QNN reports as NPU but can target GPU), so the parameter was dead code that contradicted the comment at the call site
- Update `_build_session_options` docstring to document that `"cpu"` is excluded from the add_provider_for_devices path and falls through to policy-based selection
- Document registry-order dependency in `_find_ep_for_device`: when multiple EPs match the same device type the first one wins; callers that need a specific EP should set `self._ep` to bypass discovery
timenick
approved these changes
Apr 29, 2026
This was referenced May 6, 2026
Summary
- `--ep` flag was silently dropped in the model construction path: `WinMLAutoModel` received it but never forwarded it to `WinMLPreTrainedModel` → `WinMLSession`
- Added the `ep` parameter to `WinMLPreTrainedModel.__init__()` and forwarded it to `WinMLSession`
- Passed `ep` at all three `winml_class(...)` construction sites in `WinMLAutoModel` (`from_pretrained`, `from_onnx` skip-build path, `from_onnx` build path)

Default-path behavior change
Previously, when no `--ep` was set, `_build_session_options` always fell through to `set_provider_selection_policy(...)` (PREFER_NPU/GPU/CPU) and let ORT pick the EP. This PR also changes that path: for non-CPU devices, it now first calls `_find_ep_for_device` and, if a match is found, uses `add_provider_for_devices` directly, bypassing the device policy. This is intentional: it gives us explicit control over which physical device is used (matching what we already do for the `--ep` path) and avoids WinML EP registration issues that the `providers=` string-based path cannot handle.

Fixes #402
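The selection order described in this summary can be sketched as below. This is not the project's `_build_session_options`: `FakeSessionOptions` only records which configuration call would be made instead of calling ORT, `FakeEpDevice` stands in for the entries of `ort.get_ep_devices()`, and falling back to the policy when an explicit EP is not found is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FakeEpDevice:
    """Hypothetical stand-in for one entry of ort.get_ep_devices()."""

    ep_name: str
    device_type: str  # "cpu", "gpu" or "npu"


@dataclass
class FakeSessionOptions:
    """Records configuration calls instead of touching ONNX Runtime."""

    calls: List[Tuple[str, str]] = field(default_factory=list)

    def add_provider_for_devices(self, devices, provider_options):
        self.calls.append(("add_provider_for_devices", devices[0].ep_name))

    def set_provider_selection_policy(self, policy):
        self.calls.append(("set_provider_selection_policy", policy))


def build_session_options(
    device: str, ep: Optional[str], available: List[FakeEpDevice]
) -> FakeSessionOptions:
    opts = FakeSessionOptions()
    if ep is not None:
        # Explicit EP: match by name only and trust the user's choice.
        match = next((d for d in available if d.ep_name == ep), None)
    elif device != "cpu":
        # No explicit EP: the first registered EP for this device type wins.
        match = next((d for d in available if d.device_type == device), None)
    else:
        match = None  # "cpu" always goes through the policy path
    if match is not None:
        opts.add_provider_for_devices([match], {})
    else:
        opts.set_provider_selection_policy(f"PREFER_{device.upper()}")
    return opts


available = [
    FakeEpDevice("QNNExecutionProvider", "npu"),
    FakeEpDevice("DmlExecutionProvider", "gpu"),
]
gpu_default = build_session_options("gpu", None, available)
cpu_default = build_session_options("cpu", None, available)
```

In this sketch a GPU request without an explicit EP resolves to the first GPU-capable entry, while a CPU request, or a device with no matching entry, ends up on the selection policy.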
🤖 Generated with Claude Code