Compile: Add e2e functional validation tests #645

Merged: zhenchaoni merged 5 commits into main from private/zhenni/compile_e2e on May 18, 2026

@zhenchaoni
Member

@zhenchaoni zhenchaoni commented May 15, 2026

fix #488

winml compile E2E coverage

EP-context EPs = qnn, openvino (produce an EPContext model).
Passthrough EPs = cpu, dml (CLI rejects with explicit error).
Note: vitisai and nv_tensorrt_rtx will be added later.

Tests use require_ep("<name>") for EP gating: absent EPs auto-skip. QAIRT tests additionally skip if neither QNN_SDK_ROOT nor QAIRT_SDK_ROOT resolves to an existing directory.
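The SDK gating described above could be sketched roughly as follows. This is a minimal illustration, not the helper from this PR: the function name `resolve_qairt_sdk_root` is hypothetical, and checking `QNN_SDK_ROOT` before `QAIRT_SDK_ROOT` is an assumption, since the description does not state a precedence.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Environment variables named in the PR description.
# The lookup order is an assumption made for this sketch.
SDK_ENV_VARS = ("QNN_SDK_ROOT", "QAIRT_SDK_ROOT")


def resolve_qairt_sdk_root(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first SDK root that names an existing directory, else None."""
    env = os.environ if env is None else env
    for name in SDK_ENV_VARS:
        value = env.get(name)
        # Unset, empty, or dangling values do not count as a usable SDK.
        if value and Path(value).is_dir():
            return Path(value)
    return None
```

A QAIRT test would then call `pytest.skip(...)` when this returns `None`, so hosts without the SDK report a skip rather than a failure.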

CLI surface (no EP runtime required)

  • winml compile --help — every documented option present in stdout
  • winml compile — fails with "missing option" / "model" error
  • winml compile --list — exit 0, lists ort
  • winml compile --list --device npu — lists both ort and qairt
  • winml compile -m FAKE_CTX.onnx (already-compiled EPContext model) — fails with "already a compiled" / "cannot be re-compiled"
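The CLI-surface checks above all follow one pattern: run the command, then assert on the exit code and on fragments of the output. A rough sketch of that pattern follows; `run_cli` and `assert_fails_with` are hypothetical names for this illustration, not helpers taken from the PR.

```python
import subprocess
from typing import Sequence


def run_cli(args: Sequence[str], timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run a command and capture stdout/stderr as text, without raising on failure."""
    return subprocess.run(
        list(args), capture_output=True, text=True, timeout=timeout, check=False
    )


def assert_fails_with(result: subprocess.CompletedProcess, *fragments: str) -> None:
    """Assert a non-zero exit and that at least one fragment appears in the output."""
    combined = (result.stdout + result.stderr).lower()
    assert result.returncode != 0, "expected a non-zero exit code"
    # Case-insensitive match; any one of the accepted fragments is enough.
    assert any(f.lower() in combined for f in fragments), combined
```

For the bare `winml compile` case this would read `assert_fails_with(run_cli(["winml", "compile"]), "missing option", "model")`. Checking stdout and stderr together keeps the test independent of which stream the CLI uses for errors.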

Happy path — EP-context EPs

For each EP in {qnn, openvino}:

  • winml compile -m MODEL --ep <EP> -o OUT/out.onnx
  • Asserts: exit 0, Success! banner, valid EPContext artifact with sidecar .bin

Unsupported EPs — explicit error

For each EP in {cpu, cuda, dml, nv_tensorrt_rtx, vitisai, migraphx}:

  • winml compile -m MODEL --ep <EP>
  • Asserts: exit ≠ 0, error contains does not support EPContext compilation, input ONNX byte-identical (sha256 unchanged)
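The "byte-identical" assertion can be checked by hashing the input model before and after the rejected compile. The sketch below uses only the standard library; the helper name `sha256_of` is an assumption for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A test would record `before = sha256_of(model)` ahead of the CLI call and assert `sha256_of(model) == before` afterwards, which catches any in-place modification of the input.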

Output paths — EP-context EPs

For each EP in {qnn, openvino}:

  • winml compile -m MODEL --ep <EP> -o OUT/custom_name.onnx — file at exact -o path
  • winml compile -m MODEL --ep <EP> --output-dir OUT — exactly one EPContext *.onnx produced in OUT/

--embed vs sidecar — EP-context EPs

For each EP in {qnn, openvino}:

  • winml compile -m MODEL --ep <EP> --embed -o OUT/embedded.onnx (asserts embed_mode=1, inlined bytes)
  • winml compile -m MODEL --ep <EP> -o OUT/external.onnx (asserts embed_mode=0, sidecar .bin exists alongside)

Validation toggle — EP-context EPs

For each EP in {qnn, openvino}:

  • winml compile -m MODEL --ep <EP> --no-validate --verbose -o OUT/no_validate.onnx — no validation log lines in stdout
  • winml compile -m MODEL --ep <EP> --verbose -o OUT/validated.onnx — validation log lines in stdout

QAIRT backend (qnn-only, requires QAIRT SDK)

EP: qnn. Skips when no QAIRT SDK directory is found.

  • winml compile -m MODEL --ep qnn --compiler qairt --qnn-sdk-root $SDK -o OUT/qairt.onnx --verbose — banner shows Compiler: qairt + SDK root, sidecar EPContext artifact, intermediate *_qnn_ctx_qnn.bin and *_cache_info.json exist
  • same plus --embed (asserts embed_mode=1, ~28KB QNN payload inlined into the EPContext node)

--device → provider resolution (no --ep)

Pins _resolve_compile_provider(ep=None) for every click-allowed --device. Banner is observable even when compile is rejected, so the gpu/cpu rows run on any host.

  • winml compile -m MODEL --device npu -o OUT/npu.onnx — banner Device: npu / Provider: qnn; EPContext artifact produced. EP: qnn.
  • winml compile -m MODEL --device gpu — banner Provider: dml (gpu maps to dml via _DEVICE_TO_PROVIDER, not qnn); fails with does not support EPContext compilation. EP-agnostic.
  • winml compile -m MODEL --device cpu — banner Provider: cpu; fails with does not support EPContext compilation. EP-agnostic.
  • winml compile -m MODEL --device auto -o OUT/auto.onnx — pins the only path that hits the resolver's else "qnn" fall-through (auto is in click's Choice but absent from _DEVICE_TO_PROVIDER); banner Provider: qnn; EPContext artifact produced. EP: qnn.
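The resolution behavior pinned by these four rows can be summarized in a few lines. The following is a hypothetical reconstruction based only on the rows above, not the real `_resolve_compile_provider`: the two-argument signature and the rule that an explicit `ep` wins are assumptions.

```python
from typing import Optional

# Only the device/provider pairs stated in the description are listed.
# "auto" is deliberately absent, so it takes the fall-through branch.
_DEVICE_TO_PROVIDER = {"npu": "qnn", "gpu": "dml", "cpu": "cpu"}


def resolve_compile_provider(ep: Optional[str], device: str) -> str:
    """Pick the provider for a compile run (illustrative stand-in only)."""
    if ep is not None:
        # Assumption: an explicit --ep takes priority over --device.
        return ep
    if device in _DEVICE_TO_PROVIDER:
        return _DEVICE_TO_PROVIDER[device]
    else:
        # Fall-through for devices missing from the map, such as "auto".
        return "qnn"
```

Under these assumptions, `auto` is the only allowed device that reaches the final branch, which is why it gets its own test row.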

Already-compiled rejection — real round-trip

Complements the hand-crafted EPContext fixture with the realistic user mistake. Parametrized over {qnn, openvino}.

  • winml compile -m MODEL --ep <EP> -o OUT/first.onnx then winml compile -m OUT/first.onnx --ep <EP> — second invocation fails with already a compiled / cannot be re-compiled.

--config (-c) precedence

Config-file format: JSON. compile block keys: execution_provider, compiler, embed_context, validate, verbose.

  • winml compile -m MODEL -c FULL.json -o OUT/cfg_full.onnx with config setting all 5 fields (provider=qnn, compiler=ort, embed_context=true, validate=false, verbose=true) — config values applied: Provider: qnn, embed=true, no validation logs. EP: qnn.
  • winml compile -m MODEL -c NO_EMBED.json --embed --validate --verbose -o OUT/cli_wins.onnx with config embed_context: false, validate: false — CLI flags override config: embed=true, validation logs present. Guards against "config dataclass default wins over explicit CLI flag" regression. EP: qnn.
  • winml compile -m MODEL -c CPU.json with config execution_provider: cpu — fails with does not support EPContext compilation (unsupported-EP gate fires for provider sourced from config). EP: cpu.
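The precedence these cases guard (defaults, then config file, then explicit CLI flags) might be modeled as below. This is a sketch under stated assumptions: the top-level `"compile"` key, the default values, and the function name `merge_compile_options` are all hypothetical; only the five key names come from the description.

```python
import json
from typing import Any, Dict, Optional

# Assumed defaults, for illustration only.
DEFAULTS: Dict[str, Any] = {
    "execution_provider": None,
    "compiler": "ort",
    "embed_context": False,
    "validate": True,
    "verbose": False,
}


def merge_compile_options(config_text: str, cli_flags: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Layer defaults < config file < CLI flags; a CLI value of None means 'not passed'."""
    config_block = json.loads(config_text).get("compile", {})
    merged = dict(DEFAULTS)
    merged.update({k: v for k, v in config_block.items() if k in DEFAULTS})
    # Only flags the user actually passed may override the config file.
    merged.update({k: v for k, v in cli_flags.items() if k in DEFAULTS and v is not None})
    return merged
```

Using `None` for "flag not passed" is what avoids the regression named above: if the CLI layer supplied a concrete default such as `False`, it could not be told apart from an explicit choice, and it would override the config value.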

@zhenchaoni zhenchaoni requested a review from a team as a code owner May 15, 2026 08:22
Comment thread tests/e2e/test_compile_e2e.py Fixed
Collaborator

@timenick timenick left a comment

Review of E2E compile coverage. The test scope is solid; a few items below.

Blocking — existing unit test will fail after merge

tests/unit/session/test_qairt_session.py:260 still asserts the old filename:

assert session._ctx_path == model_dir / f"{model_stem}_qnn_ctx.onnx"

This PR renames _ctx_path in qairt_session.py:63 from {stem}_qnn_ctx.onnx to {stem}_ctx.onnx, so this assertion needs to be updated in lockstep. Please include the fix in this PR; otherwise the unit suite breaks the moment this lands. (That file is not part of this diff so I couldn't leave an inline comment there.)

Remaining items are inline.

🤖 Generated with Claude Code

Comment thread tests/e2e/test_compile_e2e.py Outdated
Comment thread tests/e2e/test_compile_e2e.py
Comment thread tests/e2e/require_ep.py Outdated
Comment thread tests/e2e/require_ep.py Outdated
Comment thread tests/e2e/test_compile_e2e.py Outdated
@timenick
Collaborator

timenick commented May 15, 2026

@DingmaomaoBJTU could you please comment on the EP support list?

We might need to add new tests once this PR has been merged. #638

@DingmaomaoBJTU
Collaborator

> @DingmaomaoBJTU could you please comment on the EP support list?
>
> We might need to add new tests once this PR has been merged. #638

CUDA is not supported, and MIGraphX is not supported for now.
Other than CPU and DML, all execution providers support compile.
Be cautious: the compile result can differ depending on the EP/device. For example, OV-CPU and OV-GPU may produce different results.

@zhenchaoni
Member Author

> > @DingmaomaoBJTU could you please comment on the EP support list?
> > We might need to add new tests once this PR has been merged. #638
>
> CUDA is not supported, and MIGraphX is not supported for now. Other than CPU and DML, all execution providers support compile. Be cautious: the compile result can differ depending on the EP/device. For example, OV-CPU and OV-GPU may produce different results.

Updated: the current e2e tests cover qnn, openvino, cpu, and dml. vitisai and nv_tensorrt_rtx will be added later, once they are ready.

@zhenchaoni zhenchaoni merged commit 2515c55 into main May 18, 2026
9 checks passed
@zhenchaoni zhenchaoni deleted the private/zhenni/compile_e2e branch May 18, 2026 02:30

Development

Successfully merging this pull request may close these issues.

Identify functional scenarios and implement E2E tests for winml compile

4 participants