test(session): drop redundant device='auto' tests + move auto smoke to e2e by timenick · Pull Request #727 · microsoft/winml-cli

timenick · 2026-05-25T08:29:16Z

Summary

Closes #726.

After #708, WinMLSession(device="auto") resolves to a concrete EP via resolve_device() and force-binds it through add_provider_for_devices. On the Windows CI runner the WinML EP registry advertises phantom NPU/GPU EP devices even without real hardware — force-binding those EPs segfaults natively in InferenceSession creation, surfacing as Process completed with exit code 1 with no pytest traceback.

The crash is non-deterministic (#719 happened to pass on a re-run while #717 failed on the same commit), so every PR is exposed to a random failure until main is fixed.
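The "no pytest traceback" symptom follows from how native crashes terminate the interpreter. The sketch below is illustrative only: it uses `os.abort()` as a stand-in for the native segfault (the real crash happens inside InferenceSession creation, which is not reproduced here) and compares it with an ordinary Python exception.

```python
import subprocess
import sys


def run_child(code: str) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )


# An ordinary Python exception: the interpreter prints a traceback, then exits.
python_error = run_child("raise RuntimeError('boom')")

# Stand-in for a native crash: the process dies before Python can report anything.
native_crash = run_child("import os; os.abort()")

print("python error :", python_error.returncode, "Traceback" in python_error.stderr)
print("native crash :", native_crash.returncode, "Traceback" in native_crash.stderr)
```

Both children exit non-zero, but only the first leaves a traceback behind, which is why a CI log for the native case shows just the failing exit code.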

Approach

I first considered rewriting the affected tests to device="cpu". On audit, almost all of them duplicate existing CPU-explicit coverage in the same file — they used device="auto" as a convenience, not to exercise auto-resolution semantics. Drop the redundant ones instead.

| Test | Overlap |
| --- | --- |
| `test_run_uses_epcontext_after_compile` | `test_compile_is_idempotent` (compile()→COMPILED) |
| `test_basic_inference` | `test_explicit_cpu_provider` + 5 perf tests already call `run(sample)` on cpu |
| `test_inference_auto_compiles` | Implicit in every other test that calls `run()` without prior `compile()` |
| `test_state_transitions` | `test_ep_name_is_none_before_compile` + `test_ep_name_after_compile` cover state transitions |
| `test_reset_returns_to_initialized` | `test_reset_clears_error_state` exercises `reset()` |
| `test_providers_are_valid_and_include_fallback` | Asserted pre-#708 'auto falls back to CPU' behaviour that #708 intentionally removed; `test_cpu_provider_always_available` covers the CPU-explicit case |

Six tests deleted, one kept and converted:

test_inference_with_torch_tensor → device="cpu". Sole test covering torch.Tensor input → numpy conversion path.

Restoring `device="auto"` runtime coverage

Added test_auto_device_runtime_smoke to tests/e2e/test_session.py under the existing @pytest.mark.e2e class marker. End-to-end coverage of resolve_device → add_provider_for_devices → InferenceSession now lives where real hardware can be assumed.
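For readers unfamiliar with class-level markers, here is a minimal sketch of how a mark applied to a test class covers every method in it. The class name `TestSessionE2E` and the empty test body are hypothetical; only the `e2e` mark name and the test name come from this PR.

```python
import pytest


@pytest.mark.e2e  # class-level mark: applies to every test method in the class
class TestSessionE2E:  # hypothetical class name, for illustration only
    def test_auto_device_runtime_smoke(self):
        # Body elided in this sketch; the real test drives device="auto" end to end.
        pass


# The decorator stores the mark on the class, where pytest reads it at collection time.
mark_names = [mark.name for mark in TestSessionE2E.pytestmark]
print(mark_names)
```

With a mark like this in place, `pytest -m "not e2e"` deselects the whole class and `pytest -m e2e` selects only it. Whether this repository's CI uses exactly those invocations is an assumption, not something stated in the PR.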

Verification

tests\unit\session\test_winml_session.py
=========== 33 passed, 6 skipped in 3.02s ===========

The 5 fewer-passing-than-before are exactly the deleted redundant tests; nothing else moved.


…o e2e

The CI flake on test_run_uses_epcontext_after_compile (and similar device='auto' compile/run tests) traces to #708: after that PR, device='auto' force-binds the first resolve_device-chosen EP via add_provider_for_devices, which segfaults natively when the WinML EP registry advertises phantom NPU/GPU EPs on a hardware-less Windows CI runner (#726).

Audit the affected device='auto' tests against existing CPU-explicit coverage in the same file:

- test_run_uses_epcontext_after_compile redundant with test_compile_is_idempotent
- test_basic_inference redundant with test_explicit_cpu_provider + perf tests
- test_inference_auto_compiles implicit in every other run-without-compile test
- test_state_transitions redundant with test_ep_name_is_none/after_compile
- test_reset_returns_to_initialized redundant with test_reset_clears_error_state
- test_providers_are_valid_and_include_fallback asserted pre-#708 fallback behaviour that #708 removed

All six are redundant. Delete them rather than mechanically rewriting to device='cpu'.

Keep test_inference_with_torch_tensor (switched to device='cpu'): only test covering the torch.Tensor input-conversion path.

Add test_auto_device_runtime_smoke to tests/e2e/test_session.py under the existing @e2e class marker. End-to-end coverage of the resolve_device -> add_provider_for_devices -> InferenceSession path now lives where it can rely on real hardware being present.

Review feedback on #727: a few specific assertions from the deleted device='auto' tests weren't pinned elsewhere on device='cpu':

- not is_compiled -> run() -> is_compiled (implicit lazy-compile contract)
- outputs['C'].dtype == np.float32 (output dtype on a fp32 model)

Add both to test_cpu_provider_always_available so the contract stays covered in PR-level CI without resurrecting redundant test methods.

Review feedback on #727: the new auto-device e2e test should pin the assertions that the deleted unit tests covered, since this is now the home of device='auto' runtime coverage.

Expand test_auto_device_runtime_smoke to include:

- state == INITIALIZED before any work
- not is_compiled before run (lazy-compile contract)
- state == COMPILED after run
- outputs['C'].dtype == np.float32
- second run keeps COMPILED state
- reset() -> INITIALIZED + not is_compiled

Add test_auto_device_explicit_compile_writes_epcontext to replace the deleted test_run_uses_epcontext_after_compile (covers the explicit compile() + run() ordering).
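The lifecycle contract listed above can be sketched without any WinML dependency. Everything in this block is a hypothetical stand-in: `FakeSession`, `SessionState`, and the input names `A` and `B` are invented for illustration and are not the project's WinMLSession API. The dtype assertion is left out so the sketch needs no NumPy.

```python
from enum import Enum


class SessionState(Enum):
    INITIALIZED = "initialized"
    COMPILED = "compiled"


class FakeSession:
    """Hypothetical stand-in for WinMLSession; models only the state contract."""

    def __init__(self) -> None:
        self.state = SessionState.INITIALIZED

    @property
    def is_compiled(self) -> bool:
        return self.state is SessionState.COMPILED

    def compile(self) -> None:
        # Idempotent: compiling an already compiled session is a no-op.
        self.state = SessionState.COMPILED

    def run(self, inputs: dict) -> dict:
        # Lazy-compile contract: run() compiles first if nobody has done so yet.
        if not self.is_compiled:
            self.compile()
        return {"C": [a + b for a, b in zip(inputs["A"], inputs["B"])]}

    def reset(self) -> None:
        self.state = SessionState.INITIALIZED


def check_lifecycle(session) -> None:
    """Assert the state transitions named in the review feedback, in order."""
    sample = {"A": [1.0, 2.0], "B": [3.0, 4.0]}
    assert session.state is SessionState.INITIALIZED
    assert not session.is_compiled
    outputs = session.run(sample)
    assert session.state is SessionState.COMPILED
    assert "C" in outputs
    session.run(sample)  # second run only checks state; result left unbound
    assert session.state is SessionState.COMPILED
    session.reset()
    assert session.state is SessionState.INITIALIZED
    assert not session.is_compiled


check_lifecycle(FakeSession())
```

The second `run()` call deliberately discards its result, which keeps static analysers from reporting an unused local.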

timenick requested a review from a team as a code owner May 25, 2026 08:29

xieofxie reviewed May 25, 2026


Comment thread tests/e2e/test_session.py

timenick added 2 commits May 25, 2026 16:38

github-advanced-security AI found potential problems May 25, 2026


Comment thread tests/e2e/test_session.py Fixed

test(e2e): drop unused outputs assignment on second run

28215b8

CodeQL flagged the second 'outputs = session.run(...)' as an unused local. The second run only exercises state preservation; drop the binding rather than fabricate an extra assertion.

xieofxie approved these changes May 25, 2026


Merge branch 'main' into zhiwang/fix-session-auto-device-crash

737b1a9

timenick merged commit 6361251 into main May 25, 2026
9 checks passed

timenick deleted the zhiwang/fix-session-auto-device-crash branch May 25, 2026 09:11

timenick merged 5 commits into main from zhiwang/fix-session-auto-device-crash


3 participants
