bug: winml optimize crashes with NOT_IMPLEMENTED when input model is a QNN EPContext model #256

## Summary

`winml optimize` fails with an `OptimizationError` (ORT `NOT_IMPLEMENTED`) when the input `.onnx` is the output of `winml build`, i.e. a compiled QNN EPContext model. The `ort_graph` pipe tries to create an `InferenceSession` over the EPContext model without the QNN EP. Without that EP, ORT has no implementation for the `EPContext` nodes, so session creation fails.

## Repro

```bash
# Step 1: build a compiled QNN model
winml build -c config.json -m microsoft/resnet-50 -o resnet_build/

# Step 2: try to optimize the compiled output
winml optimize -m resnet_build/model.onnx -o resnet_build/model_optimized.onnx
```

## Error

```
2026-04-03 11:45:34,355 - winml.modelkit.optim.optimizer - INFO - ⚙ Executing ort_graph...
2026-04-03 11:45:34,373 - winml.modelkit.optim.optimizer - ERROR - ✗ ort_graph failed:
ONNX Runtime optimization failed: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED :
Could not find an implementation for EPContext(1) node with name
'QNNExecutionProvider_QNN_16222565441986465013_1_0'
```

Full traceback: `optim/pipes/graph.py:571 → onnxruntime_inference_collection.py:485 → OptimizationError`

## Root Cause

The `ort_graph` optimization pipe creates an ORT `InferenceSession` on the input model using only CPU/default providers. A QNN EPContext model contains `EPContext` nodes that are opaque blobs for the QNN execution provider — they cannot be executed (or optimized) by the standard ORT session without loading the QNN EP.

## Expected behavior

`winml optimize` should detect that the input is an EPContext model (e.g. by checking for `EPContext` op nodes) and either:
- Skip the `ort_graph` pipe (which is not applicable to pre-compiled models), or
- Emit a clear error: "Input model is a compiled EPContext artifact and cannot be re-optimized. Run `winml optimize` on the original ONNX model before compilation."
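
A minimal sketch of both options, for discussion only. It assumes that checking top-level graph nodes for the `EPContext` op type is enough to identify a compiled model, and that pipes are identified by name strings. `has_ep_context_nodes`, `plan_pipes`, `ensure_optimizable`, and `load_model` are hypothetical names, not existing `winml` functions.

```python
EP_CONTEXT_OP = "EPContext"  # op type reported in the error above


def has_ep_context_nodes(model) -> bool:
    """Return True if any top-level graph node is an EPContext node.

    `model` is expected to look like an onnx.ModelProto (model.graph.node).
    Nested subgraphs are not inspected in this sketch.
    """
    return any(node.op_type == EP_CONTEXT_OP for node in model.graph.node)


def plan_pipes(model, pipes):
    """Option 1: skip ort_graph when the input is a compiled EPContext model."""
    if has_ep_context_nodes(model):
        return [p for p in pipes if p != "ort_graph"]
    return list(pipes)


def ensure_optimizable(model) -> None:
    """Option 2: fail early with a clear message instead of NOT_IMPLEMENTED."""
    if has_ep_context_nodes(model):
        raise ValueError(
            "Input model is a compiled EPContext artifact and cannot be "
            "re-optimized. Run `winml optimize` on the original ONNX model "
            "before compilation."
        )


def load_model(path):
    """Load only the graph structure; external weight files are not read."""
    import onnx  # imported lazily so the helpers above work without onnx

    return onnx.load(path, load_external_data=False)
```

Either check would run before the `ort_graph` pipe creates its `InferenceSession`, for example `ensure_optimizable(load_model("resnet_build/model.onnx"))`.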

## Notes

- Reporter: AdinaTru
- Input model: output of `winml config` + `winml build` for ResNet-50 on Qualcomm device
- Same root cause as the `winml quantize` EPContext failure (see related issue)
