Severity: P1
Command: winml inspect
Category: Time-to-First-Output — first byte to user must arrive within 500 ms of entry.
Repro:
Measure-Command { uv run winml inspect -m microsoft/resnet-50 }
Actual: ~24.3 s wall time on warm cache. The first ~14 s after pressing Enter shows nothing on screen, then the panels render almost instantly.
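Measure-Command reports only total wall time, so it cannot show the silent gap directly. A minimal Python sketch for timing the first byte of output separately from total runtime follows. The helper name time_to_first_byte is hypothetical, and because the child's stdout is a pipe here, a command that buffers its output may appear slower to first byte than it does on a real terminal.

```python
import subprocess
import sys
import time


def time_to_first_byte(cmd):
    """Return (seconds until first output byte or None, total seconds)."""
    start = time.perf_counter()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    first = proc.stdout.read(1)      # blocks until one byte arrives or EOF
    ttfb = time.perf_counter() - start
    proc.stdout.read()               # drain the remaining output
    proc.wait()
    total = time.perf_counter() - start
    return (ttfb if first else None), total


if __name__ == "__main__":
    # Example: python ttfb.py uv run winml inspect -m microsoft/resnet-50
    ttfb, total = time_to_first_byte(sys.argv[1:])
    print(f"first byte: {ttfb} s, total: {total:.1f} s")
```

Run against the repro command, the first value is the one to compare with the 500 ms budget.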
Expected: Print a banner immediately on entry (Inspecting microsoft/resnet-50 …) and show a rich.status spinner during the HF metadata fetch. inspect should never download model weights; only config.json is needed. With config.json cached, total time should drop to <2 s.
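The expected flow could look roughly like the sketch below. It is not winml's actual code: inspect_model and fetch_config_from_hub are hypothetical names, and the use of huggingface_hub's hf_hub_download to pull only config.json is an assumption about how the fetch might be done. The banner is printed and flushed before any heavy import or network work, and the spinner wraps only the fetch.

```python
import contextlib
import json
import sys
import time

try:
    from rich.console import Console
except ImportError:  # rich is treated as optional in this sketch
    Console = None


def fetch_config_from_hub(model_id):
    """Assumed fetch path: download config.json only, never the weights."""
    from huggingface_hub import hf_hub_download  # lazy import keeps startup cheap

    path = hf_hub_download(repo_id=model_id, filename="config.json")
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def inspect_model(model_id, fetch_config=fetch_config_from_hub, stream=sys.stdout):
    start = time.perf_counter()
    # Banner first, flushed, so the user sees output before any slow work.
    print(f"Inspecting {model_id} …", file=stream, flush=True)
    banner_latency = time.perf_counter() - start

    # Spinner only around the metadata fetch; no-op if rich is unavailable.
    if Console is not None:
        status = Console(file=stream).status("Fetching config.json …")
    else:
        status = contextlib.nullcontext()
    with status:
        config = fetch_config(model_id)

    return {"config": config, "banner_latency_s": banner_latency}
```

Passing fetch_config as a parameter keeps the banner-then-spinner ordering testable without network access, for example by supplying a stub that returns a fixed dict.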
Why it matters: A 14-second silence makes the user assume the command has hung and press Ctrl-C; they then re-run, hit the same silence, and the problem compounds. A spinner within the first 500 ms removes the ambiguity.