Compile: Add e2e functional validation tests #645
timenick left a comment:
Review of E2E compile coverage. The test scope is solid; a few items below.
Blocking — existing unit test will fail after merge
`tests/unit/session/test_qairt_session.py:260` still asserts the old filename:

    assert session._ctx_path == model_dir / f"{model_stem}_qnn_ctx.onnx"

This PR renames `_ctx_path` in `qairt_session.py:63` from `{stem}_qnn_ctx.onnx` to `{stem}_ctx.onnx`, so this assertion needs to be updated in lockstep. Please include the fix in this PR; otherwise the unit suite breaks the moment this lands. (That file is not part of this diff, so I couldn't leave an inline comment there.)
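For reference, the updated expectation would presumably follow the new naming. This is a minimal sketch, assuming the context path is still derived from the model's directory and stem. `expected_ctx_path` is a hypothetical helper used only for illustration, not a function in the repository, and the model path is made up.

```python
from pathlib import Path


def expected_ctx_path(model_path: Path) -> Path:
    """Hypothetical helper: build the renamed context path ({stem}_ctx.onnx)."""
    return model_path.parent / f"{model_path.stem}_ctx.onnx"


model_path = Path("models") / "resnet.onnx"  # illustrative path only
ctx = expected_ctx_path(model_path)

# The old "_qnn_ctx.onnx" suffix should no longer appear in the expected name.
assert ctx.name == "resnet_ctx.onnx"
assert "_qnn_ctx" not in ctx.name
```

In the unit test itself, the equivalent change would be to compare `session._ctx_path` against `model_dir / f"{model_stem}_ctx.onnx"`.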
Remaining items are inline.
🤖 Generated with Claude Code
@DingmaomaoBJTU could you please comment on the EP support list? We might need to add new tests once this PR has been merged. #638
CUDA is not supported, and MIGraphX is not supported for now.
Updated: The current e2e tests cover qnn, ov, cpu, and dml. VitisAI and TensorRT will be added later, once they are ready.
fix #488
`winml compile` E2E coverage

EP-context EPs = `qnn`, `openvino` (produce an EPContext model).
Passthrough EPs = `cpu`, `dml` (CLI rejects with explicit error).
Note: vitisai and nv_tensorrt_rtx will be added later.
Tests use `require_ep("<name>")` for EP gating: absent EPs auto-skip. QAIRT tests additionally skip if neither `QNN_SDK_ROOT` nor `QAIRT_SDK_ROOT` resolves to an existing directory.

CLI surface (no EP runtime required)
- `winml compile --help`: every documented option present in stdout
- `winml compile`: fails with "missing option" / "model" error
- `winml compile --list`: exit 0, lists `ort`
- `winml compile --list --device npu`: lists both `ort` and `qairt`
- `winml compile -m FAKE_CTX.onnx` (already-compiled EPContext model): fails with "already a compiled" / "cannot be re-compiled"

Happy path: EP-context EPs
For each EP in `{qnn, openvino}`:
- `winml compile -m MODEL --ep <EP> -o OUT/out.onnx`: `Success!` banner, valid EPContext artifact with sidecar `.bin`

Unsupported EPs: explicit error
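The byte-identical requirement for rejected compiles (input ONNX sha256 unchanged) can be checked by hashing the model before and after the CLI call. This is a minimal sketch using only the standard library. `sha256_of` is a hypothetical helper, and the placeholder file stands in for a real model; the actual tests may do this differently.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    model = Path(tmp) / "model.onnx"
    # Placeholder bytes; a real test would point at an actual ONNX file.
    model.write_bytes(b"placeholder bytes standing in for an ONNX model")
    before = sha256_of(model)
    # ... the rejected `winml compile` invocation would run here ...
    after = sha256_of(model)
    unchanged = before == after
```

Comparing digests rather than modification times avoids false positives when a tool rewrites a file with identical content.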
For each EP in `{cpu, cuda, dml, nv_tensorrt_rtx, vitisai, migraphx}`:
- `winml compile -m MODEL --ep <EP>`: `does not support EPContext compilation` error, input ONNX byte-identical (sha256 unchanged)

Output paths: EP-context EPs
For each EP in `{qnn, openvino}`:
- `winml compile -m MODEL --ep <EP> -o OUT/custom_name.onnx`: file at exact `-o` path
- `winml compile -m MODEL --ep <EP> --output-dir OUT`: exactly one EPContext `*.onnx` produced in `OUT/`

`--embed` vs sidecar: EP-context EPs

For each EP in `{qnn, openvino}`:
- `winml compile -m MODEL --ep <EP> --embed -o OUT/embedded.onnx`: `embed_mode=1`, inlined bytes
- `winml compile -m MODEL --ep <EP> -o OUT/external.onnx`: `embed_mode=0`, sidecar `.bin` exists alongside

Validation toggle: EP-context EPs
For each EP in `{qnn, openvino}`:
- `winml compile -m MODEL --ep <EP> --no-validate --verbose -o OUT/no_validate.onnx`: no validation log lines in stdout
- `winml compile -m MODEL --ep <EP> --verbose -o OUT/validated.onnx`: validation log lines in stdout

QAIRT backend (qnn-only, requires QAIRT SDK)
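The SDK gating for these tests (skip unless `QNN_SDK_ROOT` or `QAIRT_SDK_ROOT` resolves to an existing directory) might be sketched as follows. `find_qairt_sdk` is a hypothetical name, and the lookup order of the two variables is an assumption; the real fixture may differ.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_qairt_sdk(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the first SDK root that points at an existing directory, else None."""
    # Assumed order: QNN_SDK_ROOT is checked before QAIRT_SDK_ROOT.
    for var in ("QNN_SDK_ROOT", "QAIRT_SDK_ROOT"):
        value = env.get(var)
        if value and Path(value).is_dir():
            return Path(value)
    return None


# None here would mean the QAIRT tests should be skipped on this host.
sdk_root = find_qairt_sdk()
```

A test could then call `pytest.skip(...)` when the result is `None`, so hosts without the SDK report a skip rather than a failure.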
EP: `qnn`. Skips when no QAIRT SDK directory is found.
- `winml compile -m MODEL --ep qnn --compiler qairt --qnn-sdk-root $SDK -o OUT/qairt.onnx --verbose`: banner shows `Compiler: qairt` + SDK root, sidecar EPContext artifact, intermediate `*_qnn_ctx_qnn.bin` and `*_cache_info.json` exist
- `--embed`: `embed_mode=1`, ~28KB QNN payload inlined into the EPContext node

`--device` → provider resolution (no `--ep`)

Pins `_resolve_compile_provider(ep=None)` for every click-allowed `--device`. Banner is observable even when compile is rejected, so the gpu/cpu rows run on any host.
- `winml compile -m MODEL --device npu -o OUT/npu.onnx`: banner `Device: npu` / `Provider: qnn`; EPContext artifact produced. EP: `qnn`.
- `winml compile -m MODEL --device gpu`: banner `Provider: dml` (gpu maps to dml via `_DEVICE_TO_PROVIDER`, not qnn); fails with `does not support EPContext compilation`. EP-agnostic.
- `winml compile -m MODEL --device cpu`: banner `Provider: cpu`; fails with `does not support EPContext compilation`. EP-agnostic.
- `winml compile -m MODEL --device auto -o OUT/auto.onnx`: pins the only path that hits the resolver's `else "qnn"` fall-through (auto is in click's Choice but absent from `_DEVICE_TO_PROVIDER`); banner `Provider: qnn`; EPContext artifact produced. EP: `qnn`.

Already-compiled rejection: real round-trip
Complements the hand-crafted EPContext fixture with the realistic user mistake. Parametrized over `{qnn, openvino}`.
- `winml compile -m MODEL --ep <EP> -o OUT/first.onnx` then `winml compile -m OUT/first.onnx --ep <EP>`: second invocation fails with `already a compiled` / `cannot be re-compiled`.

`--config` (`-c`) precedence

Config-file format: JSON. `compile` block keys: `execution_provider`, `compiler`, `embed_context`, `validate`, `verbose`.
- `winml compile -m MODEL -c FULL.json -o OUT/cfg_full.onnx` with config setting all 5 fields (provider=qnn, compiler=ort, embed_context=true, validate=false, verbose=true): config values applied: `Provider: qnn`, embed=true, no validation logs. EP: `qnn`.
- `winml compile -m MODEL -c NO_EMBED.json --embed --validate --verbose -o OUT/cli_wins.onnx` with config `embed_context: false, validate: false`: CLI flags override config: embed=true, validation logs present. Guards against "config dataclass default wins over explicit CLI flag" regression. EP: `qnn`.
- `winml compile -m MODEL -c CPU.json` with config `execution_provider: cpu`: fails with `does not support EPContext compilation` (unsupported-EP gate fires for provider sourced from config). EP: `cpu`.
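The precedence rule these cases pin down (an explicit CLI flag beats the config value, and the config value applies otherwise) can be illustrated with a small merge function. This is a sketch under stated assumptions, not the CLI's actual implementation: `resolve_options` is a hypothetical name, the top-level `compile` block layout is inferred from the key list above, and `None` is assumed to mean "flag not given".

```python
import json
from typing import Any, Dict, Optional

# Assumed layout: a top-level "compile" block holding the five documented keys.
FULL_JSON = json.dumps({
    "compile": {
        "execution_provider": "qnn",
        "compiler": "ort",
        "embed_context": True,
        "validate": False,
        "verbose": True,
    }
})


def resolve_options(config_text: str, cli: Dict[str, Optional[Any]]) -> Dict[str, Any]:
    """Merge options: an explicit CLI value (not None) beats the config value."""
    merged = dict(json.loads(config_text).get("compile", {}))
    for key, value in cli.items():
        if value is not None:  # None means the flag was not given on the command line
            merged[key] = value
    return merged


# Mirrors `-c NO_EMBED.json --embed --validate --verbose`.
no_embed = json.dumps({"compile": {"embed_context": False, "validate": False}})
result = resolve_options(
    no_embed, {"embed_context": True, "validate": True, "verbose": True}
)
```

Treating `None` as "not given" keeps a flag's default from being mistaken for an explicit choice, which is the kind of regression the `cli_wins` case is meant to catch.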