feat(monitor): query all NPU/GPU engines and take max utilization by xieofxie · Pull Request #716 · microsoft/winml-cli

xieofxie · 2026-05-25T03:33:46Z

Summary

PdhPoller previously locked onto a single engine per adapter (the alphabetically first Compute_* engine for NPU, the 3D engine for GPU). On NPUs with multiple Compute_* engines, work scheduled to a non-monitored engine read as 0% utilization; on GPUs where DML lands on a Compute engine, the 3D-only query missed the activity entirely.
build_adapter_query now takes engine_types: tuple[str, ...] and registers one util_<engtype> + running_time_<engtype> counter per matching engine. build_npu_query covers all Compute_*; build_gpu_query covers both 3D and Compute_*.
PdhPoller._poll_loop aggregates samples with max() across util_* keys — each counter is a per-engine ratio over the same sample window, so max reports the most-loaded engine. Summing percentages would exceed 100% on multi-engine adapters and duplicate the signal that running-time already provides.
running_time_delta_ns sums per-engine deltas — Running Time is wall-clock ns per engine, and engines are independent HW that can run in parallel, so total adapter compute time is additive.
Adapter resolution (resolve_adapter_luid) is untouched; this only changes how counters are registered and aggregated for the resolved adapter.
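
For readers who want the aggregation rule in concrete form, here is a minimal sketch. It is not the code in _pdh.py: `aggregate_sample` is a hypothetical helper, and the sample dict is assumed to already hold one utilization percentage and one running-time delta (in ns) per engine, keyed with the util_<engtype> / running_time_<engtype> prefixes described above.

```python
from typing import Mapping


def aggregate_sample(sample: Mapping[str, float]) -> tuple[float, float]:
    """Hypothetical helper: collapse one multi-engine sample into
    (utilization %, running-time delta in ns)."""
    utils = [v for k, v in sample.items() if k.startswith("util_")]
    deltas = [v for k, v in sample.items() if k.startswith("running_time_")]
    # Each util_* value is a per-engine ratio over the same sample window,
    # so the max is the most-loaded engine and stays within 0..100.
    utilization = max(utils, default=0.0)
    # Engines can run in parallel, so per-engine running time is additive.
    running_time_delta_ns = sum(deltas)
    return utilization, running_time_delta_ns


# Assumed example: a GPU where the work lands on a Compute engine, not 3D.
sample = {
    "util_3D": 10.0,
    "util_Compute_0": 85.0,
    "running_time_3D": 1_000_000.0,
    "running_time_Compute_0": 8_500_000.0,
}
utilization, running_time_delta_ns = aggregate_sample(sample)
```

With this sample, a 3D-only query would report 10% while the max across engines reports 85%, which is the case the PR targets. Summing the two utilization values instead would give 95% here and could exceed 100% on busier adapters.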

🤖 Generated with Claude Code

xieofxie · 2026-05-25T03:40:59Z

Also tested vitisa, migraphx

timenick

Two findings on the multi-engine PDH aggregation: the PR is about aggregation correctness, so I want it pinned by tests and documented for future readers.

feature: query mutiple pdh engine_types

02c90b7

xieofxie requested a review from a team as a code owner May 25, 2026 03:33

timenick reviewed May 25, 2026

Comment thread src/winml/modelkit/session/monitor/_pdh.py

Comment thread src/winml/modelkit/session/monitor/_pdh.py

hualxie added 2 commits May 25, 2026 13:16

doc: clarify util-max vs running-time-sum rationale in comments

57d0c6a

test(pdh): lock in max(util) and sum(running-time) aggregation

54e72e8

timenick approved these changes May 25, 2026

xieofxie merged commit 8d21fd2 into main May 25, 2026
9 checks passed

xieofxie deleted the hualxie/multi_pdh branch May 25, 2026 05:53

xieofxie mentioned this pull request May 25, 2026

feature: gpu/npu usage should monitor all pdh variations #640

Closed

xieofxie merged 3 commits into main from hualxie/multi_pdh
