Skip to content

feat: use resolve device in session creation#708

Merged
xieofxie merged 7 commits into
mainfrom
hualxie/merge_resolve_device
May 25, 2026
Merged

feat: use resolve device in session creation#708
xieofxie merged 7 commits into
mainfrom
hualxie/merge_resolve_device

Conversation

@xieofxie
Copy link
Copy Markdown
Contributor

@xieofxie xieofxie commented May 22, 2026

Use resolve_device in session creation to align device ep resolution with other commands.

Currently we use a manual mapping in resolve_device / resolve_eps/

We could migrate to PREFER_XPU resolution with a temp model + real session later, but it is no harm to unify how we do it.

Closes #700
Closes #656

@xieofxie xieofxie requested a review from a team as a code owner May 22, 2026 06:15
Comment thread src/winml/modelkit/session/session.py Fixed
Comment thread tests/unit/session/test_winml_session.py
@xieofxie
Copy link
Copy Markdown
Contributor Author

after #712

Copy link
Copy Markdown
Collaborator

@DingmaomaoBJTU DingmaomaoBJTU left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good direction — unifying session creation through resolve_device/resolve_eps eliminates a lot of duplicated logic and makes the EP selection path consistent across all commands. The _find_ep_device removal is clean.

However, there are a few issues, one of which is a crash-level bug.

Also note: DEVICE_POLICY_MAP (lines 46–51) is now dead code — no remaining caller after the policy-based fallback was removed. Should be deleted (or kept only if you plan to re-introduce policy fallback). Similarly, the class docstring (lines 98–106) still says "Policy-based device selection" and "does NOT use explicit EP provider names" which is now the opposite of what the code does.

Comment thread src/winml/modelkit/session/session.py
Comment thread src/winml/modelkit/session/session.py
Comment thread src/winml/modelkit/session/session.py
Comment thread src/winml/modelkit/session/session.py Outdated
Comment thread src/winml/modelkit/session/session.py
Comment thread tests/unit/session/test_winml_session.py
timenick

This comment was marked as outdated.

Copy link
Copy Markdown
Collaborator

@timenick timenick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the resolve_device unification. The shape is right but I want to flag two behavior changes that may be unintentional, a missing regression-test anchor, and three minor / nits.

Comment thread src/winml/modelkit/session/session.py Outdated
Comment thread src/winml/modelkit/session/session.py
Comment thread src/winml/modelkit/session/session.py
Comment thread src/winml/modelkit/session/session.py
Comment thread tests/unit/session/test_winml_session.py
@xieofxie xieofxie merged commit 3cf0de5 into main May 25, 2026
9 checks passed
@xieofxie xieofxie deleted the hualxie/merge_resolve_device branch May 25, 2026 06:37
timenick added a commit that referenced this pull request May 25, 2026
…-less CI

After #708, WinMLSession with device='auto' resolves to a concrete EP
via resolve_device() and force-binds it through add_provider_for_devices.
On the Windows CI runner, the WinML EP registry advertises phantom
NPU/GPU EP devices even without real hardware — force-binding those EPs
crashes natively in InferenceSession creation (#726).

The previously-failing test_run_uses_epcontext_after_compile and the
adjacent device='auto' run/compile tests don't depend on auto-resolution
semantics; they exercise general session mechanics. Switch them to
device='cpu' so the tested behaviour is observable on CPU-only CI.

test_providers_are_valid_and_include_fallback asserted the pre-#708
'auto falls back to CPU' behaviour, which #708 intentionally removed.
Delete it — test_cpu_provider_always_available already covers the CPU
case explicitly.

Coverage note: there is no longer a unit/CI test for device='auto'
runtime behaviour. The remaining device='auto' tests cover only
constructor/io_config paths (no EP binding). End-to-end coverage on
real hardware should live behind an e2e marker.
timenick added a commit that referenced this pull request May 25, 2026
…o e2e

The CI flake on test_run_uses_epcontext_after_compile (and similar
device='auto' compile/run tests) traces to #708: after that PR,
device='auto' force-binds the first resolve_device-chosen EP via
add_provider_for_devices, which segfaults natively when the WinML EP
registry advertises phantom NPU/GPU EPs on a hardware-less Windows CI
runner (#726).

Audit the affected device='auto' tests against existing CPU-explicit
coverage in the same file:

  - test_run_uses_epcontext_after_compile  redundant with test_compile_is_idempotent
  - test_basic_inference                   redundant with test_explicit_cpu_provider + perf tests
  - test_inference_auto_compiles           implicit in every other run-without-compile test
  - test_state_transitions                 redundant with test_ep_name_is_none/after_compile
  - test_reset_returns_to_initialized      redundant with test_reset_clears_error_state
  - test_providers_are_valid_and_include_fallback  asserted pre-#708 fallback
                                                   behaviour that #708 removed

All six are redundant. Delete them rather than mechanically rewriting
to device='cpu'.

Keep test_inference_with_torch_tensor (switched to device='cpu'): only
test covering the torch.Tensor input-conversion path.

Add test_auto_device_runtime_smoke to tests/e2e/test_session.py under
the existing @e2e class marker. End-to-end coverage of the
resolve_device -> add_provider_for_devices -> InferenceSession path
now lives where it can rely on real hardware being present.
timenick added a commit that referenced this pull request May 25, 2026
…o e2e (#727)

## Summary

Closes #726.

After #708, `WinMLSession(device="auto")` resolves to a concrete EP via
`resolve_device()` and force-binds it through
`add_provider_for_devices`. On the Windows CI runner the WinML EP
registry advertises phantom NPU/GPU EP devices even without real
hardware — force-binding those EPs segfaults natively in
`InferenceSession` creation, surfacing as `Process completed with exit
code 1` with no pytest traceback.

The crash is non-deterministic (#719 happened to pass on a re-run while
#717 failed on the same commit), so every PR is exposed to a random
failure until main is fixed.

## Approach

I first considered rewriting the affected tests to `device="cpu"`. On
audit, almost all of them duplicate existing CPU-explicit coverage in
the same file — they used `device="auto"` as a convenience, not to
exercise auto-resolution semantics. Drop the redundant ones instead.

| Test | Overlap |
|---|---|
| `test_run_uses_epcontext_after_compile` | `test_compile_is_idempotent`
(compile()→COMPILED) |
| `test_basic_inference` | `test_explicit_cpu_provider` + 5 perf tests
already call `run(sample)` on cpu |
| `test_inference_auto_compiles` | Implicit in every other test that
calls `run()` without prior `compile()` |
| `test_state_transitions` | `test_ep_name_is_none_before_compile` +
`test_ep_name_after_compile` cover state transitions |
| `test_reset_returns_to_initialized` | `test_reset_clears_error_state`
exercises `reset()` |
| `test_providers_are_valid_and_include_fallback` | Asserted pre-#708
'auto falls back to CPU' behaviour that #708 intentionally removed;
`test_cpu_provider_always_available` covers the CPU-explicit case |

Six tests deleted, one kept and converted:

- **`test_inference_with_torch_tensor`** → `device="cpu"`. Sole test
covering `torch.Tensor` input → numpy conversion path.

## Restoring `device="auto"` runtime coverage

Added `test_auto_device_runtime_smoke` to
[tests/e2e/test_session.py](tests/e2e/test_session.py) under the
existing `@pytest.mark.e2e` class marker. End-to-end coverage of
`resolve_device → add_provider_for_devices → InferenceSession` now lives
where real hardware can be assumed.

## Verification

```
tests\unit\session\test_winml_session.py
=========== 33 passed, 6 skipped in 3.02s ===========
```

The 5 fewer-passing-than-before are exactly the deleted redundant tests;
nothing else moved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants