First public preview release. With the Windows ML 2.0 baseline now in place, this release shifts focus to polishing the CLI surface: faster winml inspect / winml eval, more accurate device & EP resolution, a real PyPI release pipeline, and a meaningful pass over sysinfo and quantization behavior.
🎉 Public preview
- Promoted to `Development Status :: 4 - Beta` in `pyproject.toml`.
- First release published to PyPI via the new ESRP-signed release pipeline (#473).
✨ Improvements
- `winml inspect`: banner + spinner during HF metadata fetch (#718, hidden in JSON mode #745); `--list-tasks` in <500 ms (#717); processor `Auto*` lookups gated (#719, #746).
- `winml eval`: lazy module loading drops cold-start latency (#711); inputs validated up-front with friendlier errors and a structured `--schema` output (#694).
- `winml export`: `model-id` and `task` validated before the export runs (#714).
- `winml analyze`: cleaner EP/device selection, clearer "op-check skipped" UI, merged optimization config (#702).
- `winml perf`: estimated model precision (QDQ / block-wise quant / dominant float dtype) is now reported by `WinMLSession` (#706); expanded perf e2e coverage across EPs and devices (#698).
- `winml monitor`: queries all NPU/GPU engines and reports the max utilization (#716).
- CLI-wide: did-you-mean suggestions on mistyped subcommands (#699); consistent option-vs-config-file value priority across commands (#720); `op_tracing` hidden from the public surface (#738).
- Adopted the official `windowsml` usage example: removed the redundant `WinML` singleton, fixing a benign "library already registered" traceback on `winml perf --device npu` (#729).
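
As an illustration of the did-you-mean behavior noted above (#699), here is a minimal sketch built on Python's standard `difflib`. It is not the CLI's actual implementation: the `SUBCOMMANDS` list is assembled from the command names mentioned in these notes, and `suggest_subcommand`, `unknown_command_message`, and the `0.6` cutoff are hypothetical names and values chosen for the example.

```python
import difflib

# Assumption: subcommand names collected from these release notes,
# not read from the real CLI's command registry.
SUBCOMMANDS = ["inspect", "eval", "export", "analyze", "perf", "monitor", "sys"]


def suggest_subcommand(typed, known=SUBCOMMANDS, cutoff=0.6):
    """Return the closest known subcommand to `typed`, or None if none is close."""
    # get_close_matches ranks candidates by similarity ratio and drops
    # anything scoring below `cutoff`.
    matches = difflib.get_close_matches(typed, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def unknown_command_message(typed):
    """Build the error text, appending a hint only when a close match exists."""
    base = f"Unknown command '{typed}'."
    hint = suggest_subcommand(typed)
    return f"{base} Did you mean '{hint}'?" if hint else base
```

With this sketch, `unknown_command_message("inspekt")` produces a hint pointing at `inspect`, while an input with no close match yields only the plain error text.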
🐛 Fixes
- Quantization (P0): `--precision` now rejects invalid values instead of silently falling back to `uint8/uint8`; default image calibration dataset streams rather than downloading ~5 GB; DETR-family object detection supports `pixel_mask` padding (#680).
- `winml eval`: pinned `pyarrow <24` to avoid an EP DLL load-order crash (#750).
- `winml perf`: QDQ precision detection fix (#753); NPU monitoring adds `3D` engine, device line shows requested vs. actual (#747).
- EP / device resolution: `resolve_device` / `resolve_eps` now use `get_registered_ep_devices` (#712); dropped misleading `ov` / `vitis` / `trtrtx` aliases (#690); `winml sys` raises when an EP isn't available on the host (#686); per-provider `ensure_ready` failures demoted to debug (#703); analyze regression caught during compile e2e (#740).
- Native ORT / WinML: suppressed ORT native stderr, fixed a HANDLE leak (#709); nulled the EP catalog handle after enumeration to prevent a QNN NPU crash on exit (#701); fixed the `onnxruntime` DLL search path (#689).
- `winml sys`: diagnostic sections gated behind `-v`, json-mode logs routed to stderr (#737); CPU/Mem scoped to the current process and PDH percent counters no longer artificially capped (#715); host arch reported via `IsWow64Process2` on Windows ARM64 (#705).
- OpenVINO: `is_npu` detection updated (#722).
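
The `--precision` fix above replaces a silent fallback with an explicit rejection. The sketch below shows that pattern with the standard `argparse` module. It rests on assumptions: it reads `uint8/uint8` as an activation/weight pair, the `ALLOWED_DTYPES` set is illustrative rather than the CLI's real list, and `parse_precision` and `build_parser` are hypothetical names.

```python
import argparse

# Assumption: illustrative dtype names only; the values the real CLI
# accepts are not listed in these notes.
ALLOWED_DTYPES = {"uint8", "int8", "uint16", "int16"}


def parse_precision(value):
    """Parse '<activation>/<weight>' and raise rather than fall back to a default."""
    parts = value.lower().split("/")
    if len(parts) != 2 or any(p not in ALLOWED_DTYPES for p in parts):
        allowed = ", ".join(sorted(ALLOWED_DTYPES))
        # argparse turns ArgumentTypeError into a usage error and exits non-zero.
        raise argparse.ArgumentTypeError(
            f"invalid precision {value!r}; expected '<activation>/<weight>' "
            f"using one of: {allowed}"
        )
    return parts[0], parts[1]


def build_parser():
    """Hypothetical parser; the default applies only when the flag is omitted."""
    parser = argparse.ArgumentParser(prog="quantize-sketch")
    parser.add_argument(
        "--precision", type=parse_precision, default=("uint8", "uint8")
    )
    return parser
```

The design point is that the default is used only when the flag is absent. A value that is present but malformed never reaches the default path, so a typo cannot quietly produce a `uint8/uint8` model.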
🔧 Internals & CI
- Added a `winml-cli` Copilot skill (#733).
📦 Assets
- `winml_cli-0.1.0-py3-none-any.whl`
- `rules-v0.1.0.zip`