P1-FEATURE-013: Profiling Integration — IHV Tools (QC / Intel / AMD) #158

@DingmaomaoBJTU

Summary

Integrate ModelKit with the profiling tools of three independent hardware vendors (IHVs), Qualcomm, Intel, and AMD, to enable operator-level performance analysis on hardware NPUs/GPUs.

Context

IHV profiling tools expose hardware execution details that generic ONNX Runtime profiling cannot, such as which NPU ops caused stalls, where memory bandwidth is the bottleneck, and how much time goes to kernel dispatch overhead. This integration is needed to make performance optimization actionable.

From plans/release/0501_release_plan/P0_CHECKLIST.md (P1-FEATURE-013). Builds on the base profiling work in #402.

Target tools:

  • Qualcomm: QNN profiling API (HTP backend profiling output)
  • Intel: OpenVINO Performance Analysis tool / VTune
  • AMD: ROCm profiler / Ryzen AI profiler

Current State

Desired State

  • wmk perf --profile enables IHV profiling output for supported EPs
  • Profiling data consumed from QNN/OpenVINO/AMD tools
  • Bottleneck analysis: identify top-N slowest operators per EP
  • Output: profiling summary in artifacts/profiling_report.json
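
The bottleneck analysis and report output listed above could be sketched roughly as follows. This is a minimal illustration, not the ModelKit implementation: the record fields (`ep`, `op_name`, `op_type`, `duration_us`), the function names `top_n_slowest` and `write_report`, and the report layout are all assumptions. Only the path `artifacts/profiling_report.json` comes from this issue.

```python
import json
from collections import defaultdict
from pathlib import Path


def top_n_slowest(records, n=5):
    """Group records by EP and return the n slowest operators for each EP."""
    by_ep = defaultdict(list)
    for rec in records:
        by_ep[rec["ep"]].append(rec)
    return {
        ep: sorted(ops, key=lambda r: r["duration_us"], reverse=True)[:n]
        for ep, ops in by_ep.items()
    }


def write_report(records, path="artifacts/profiling_report.json", n=5):
    """Write a JSON summary holding the top-n slowest operators per EP."""
    # Hypothetical report layout; the real schema is still to be defined.
    report = {"top_n": n, "bottlenecks": top_n_slowest(records, n)}
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(report, indent=2))
    return report
```

Sorting per EP rather than globally keeps slow operators on one backend from hiding those on another, which matches the "top-N slowest operators per EP" wording.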

Acceptance Criteria

  • QNN profiling output integrated (EP session profiling via QNN SDK)
  • OpenVINO profiling output integrated (per-layer timing via the OpenVINO Runtime API)
  • AMD profiling output integrated (best-effort, only if the SDK is available)
  • Bottleneck analysis: report top-N slowest ops per EP
  • artifacts/profiling_report.json generated with operator-level timing
  • Works on at least 5 P0 built-in models × 2 EPs (QNN + OpenVINO NPU)
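
The last criterion (at least 5 models × 2 EPs) lends itself to an automated check. A hedged sketch, assuming each profiled run is recorded as a `(model, ep)` pair; the function name `meets_coverage` and the EP labels `"QNN"` and `"OpenVINO-NPU"` are hypothetical placeholders, not identifiers from ModelKit.

```python
def meets_coverage(runs, required_eps=("QNN", "OpenVINO-NPU"), min_models=5):
    """Return True if at least min_models models were profiled on every required EP."""
    eps_by_model = {}
    for model, ep in runs:
        eps_by_model.setdefault(model, set()).add(ep)
    # A model counts only if it has a profiled run on all required EPs.
    covered = [m for m, eps in eps_by_model.items() if set(required_eps) <= eps]
    return len(covered) >= min_models
```

A model profiled on only one of the two EPs does not count toward the threshold, which is the stricter reading of "5 models × 2 EPs".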

Technical Notes

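  • QNN SDK profiling: enable via QNN EP provider options. The ONNX Runtime QNN EP documents profiling_level (off, basic, detailed) and profiling_file_path; confirm that qnn_context_enable_graphs_profiling is a valid option before relying on it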
  • OpenVINO profiling: InferRequest.get_profiling_info() method (requires performance counters to be enabled, i.e. PERF_COUNT, when the model is compiled)
  • AMD Ryzen AI profiling: available via Ryzen AI SDK; check access with hardware team
  • Normalize profiling output to a common schema across all IHV tools
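
One way the common schema might look is sketched below. Everything here is illustrative: the `OpTiming` record, the `normalize_row` dict layout (`name`, `type`, `time`, `unit`), and the function names are assumptions made for this example. `normalize_openvino` relies on the `node_name`, `node_type`, and `real_time` (a `timedelta`) attributes that OpenVINO's profiling entries are understood to expose; verify this against the installed OpenVINO version.

```python
from dataclasses import dataclass
from datetime import timedelta

# Conversion factors from a vendor-reported unit to microseconds.
_UNIT_TO_US = {"ns": 1e-3, "us": 1.0, "ms": 1e3, "s": 1e6}


@dataclass(frozen=True)
class OpTiming:
    """Hypothetical common record shared by all IHV adapters."""
    ep: str
    op_name: str
    op_type: str
    duration_us: float


def normalize_row(ep, row):
    """Convert one raw row (assumed dict layout) into an OpTiming."""
    scale = _UNIT_TO_US[row["unit"]]
    return OpTiming(
        ep=ep,
        op_name=row["name"],
        op_type=row.get("type", "unknown"),
        duration_us=float(row["time"]) * scale,
    )


def normalize_openvino(info):
    """Convert an object with node_name, node_type, and real_time (timedelta)."""
    return OpTiming(
        ep="OpenVINO",
        op_name=info.node_name,
        op_type=info.node_type,
        duration_us=info.real_time / timedelta(microseconds=1),
    )
```

Storing every duration in microseconds removes the need for per-vendor unit handling in the analysis step.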

Related Files
