docs: expand analyze/optimize content, rename hub→catalog by DingmaomaoBJTU · Pull Request #769 · microsoft/winml-cli

DingmaomaoBJTU · 2026-05-27T10:09:15Z

Summary

- Rewrote docs/concepts/analyze-and-optimize.md with source-verified content: SupportLevel classification table, lint vs autoconf outputs, analysis modes, optimizer pipe architecture (4 pipes, 43 capabilities, 5 rewrite groups / 12 rules), and autoconf loop SVG diagram
- Updated docs/commands/analyze.md with corrected EP aliases, exit-code table, and additional CLI examples
- Renamed hub.md → catalog.md and updated all cross-references (inspect, overview, sys, mkdocs.yml)
- Fixed check-yaml pre-commit hook to support !!python/name tags in mkdocs.yml (--unsafe)

🤖 Generated with Claude Code

Rewrite the analyze-and-optimize concept page with source-verified content: SupportLevel classification table, lint vs autoconf outputs, analysis modes, three finding classes, local op execution, HTP metadata, internal pipeline stages, optimizer pipe architecture (4 pipes, 43 capabilities across 13 categories, 5 rewrite groups / 12 rules), and the autoconf loop diagram (SVG). Update the analyze command reference with corrected EP aliases, exit-code table, and additional examples. Rename hub.md → catalog.md and update all cross-references. Also fix check-yaml pre-commit hook to support !!python/name tags in mkdocs.yml (--unsafe).

vortex-captain · 2026-05-28T05:28:41Z

+
+When a model is exported with hierarchy-preserving tags (HTP), the export produces a sidecar `_htp_metadata.json` that maps each ONNX node back to its source module (e.g., `encoder.layer.0.attention.self.GELUActivation`). Passing this file via `--htp-metadata` lets the `PatternExtractor` use the module hierarchy to match subgraph patterns more accurately than operator-level heuristics alone.
+
+HTP metadata is consumed at the pattern extraction stage — before any EP-specific runtime checking — so the enriched patterns benefit all target EPs equally (QNN, OpenVINO, VitisAI, etc.). Without HTP metadata, the analyzer falls back to attribute-based tag matching and then the general-purpose `PatternMatcher`; with it, the analyzer can correctly identify fused patterns (GELU, LayerNorm, Attention) that are difficult to detect from the raw operator graph. See the [analyze command reference](../commands/analyze.md) for usage examples.


The explanation of the relationship between HTP metadata and PatternMatcher seems a bit fuzzy, and it is unclear why both would be needed in our system. Maybe simply state the benefits of HTP metadata when it is available, and note that it can be obtained at the winml export/build stage by passing an HTP flag, without mentioning PatternMatcher?
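
To make the benefit under discussion concrete, here is a minimal sketch of how a node-to-module sidecar could be used to group ONNX nodes into fused-pattern candidates. The JSON layout, the node names, and the helpers `group_nodes_by_module` and `fused_candidates` are all assumptions for illustration; the real `_htp_metadata.json` schema and the `PatternExtractor` logic are not shown in this PR and may differ.

```python
import json
from collections import defaultdict

# Hypothetical sidecar content: ONNX node name -> source module path.
# The real _htp_metadata.json layout is not shown in this PR and may differ.
SIDECAR_JSON = """
{
  "/encoder/layer.0/intermediate/Div": "encoder.layer.0.intermediate.GELUActivation",
  "/encoder/layer.0/intermediate/Erf": "encoder.layer.0.intermediate.GELUActivation",
  "/encoder/layer.0/intermediate/Mul": "encoder.layer.0.intermediate.GELUActivation",
  "/encoder/layer.0/output/MatMul": "encoder.layer.0.output.dense"
}
"""


def group_nodes_by_module(node_to_module):
    """Invert the node -> module map into module -> sorted list of nodes."""
    groups = defaultdict(list)
    for node, module in node_to_module.items():
        groups[module].append(node)
    return {module: sorted(nodes) for module, nodes in groups.items()}


def fused_candidates(groups, leaf_names=("GELUActivation",)):
    """Keep only modules whose last path component names a known fused block."""
    return {
        module: nodes
        for module, nodes in groups.items()
        if module.rsplit(".", 1)[-1] in leaf_names
    }


groups = group_nodes_by_module(json.loads(SIDECAR_JSON))
candidates = fused_candidates(groups)
```

In this sketch the three nodes that share the `GELUActivation` module path end up in one group, which is the kind of grouping that is hard to recover from the raw operator graph alone.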

vortex-captain · 2026-05-28T05:43:02Z

+This granularity matters when a specific fusion breaks a downstream step or when you need an exact optimization profile for a given EP. Some capabilities declare dependencies (e.g., `bias-gelu-fusion` requires `gelu-fusion`); the optimizer resolves these automatically when you enable a flag.
+
+**Pattern rewrites** are a complementary mechanism: instead of folding nodes, rewrites replace one subgraph pattern with a structurally equivalent alternative. Rules are defined in JSON files (`default.json` for general rewrites, `qnn.json` for QNN-specific rewrites). The optimizer currently ships 5 rewrite groups containing 12 individual rules — for example, four GELU source variants can each be rewritten to a single `Gelu` op, and a MatMul+Add pattern can be rewritten to a GEMM or to a Conv2D for Qualcomm NPU targets. Run `--list-rewrites` to discover available families and their flag names. Flags follow the form `--enable-<source-slug>-<target-slug>`.
+


nit: an md file listing all the optimizer flags would also be convenient for users' reference.
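
One way such a reference page could be produced is to generate it from the rule definitions. The sketch below builds a markdown table from a list of (source, target) pairs using the `--enable-<source-slug>-<target-slug>` form quoted above. The `RULES` list, the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens), and the helper names are assumptions; the actual schema of `default.json` and `qnn.json` and the CLI's real slug rules are not shown in this PR.

```python
import re

# Hypothetical rewrite rules as (source pattern, target pattern) pairs.
# The real rules live in default.json / qnn.json; their schema is not shown here.
RULES = [
    ("MatMul+Add", "GEMM"),
    ("MatMul+Add", "Conv2D"),
]


def slug(name):
    """Assumed slug rule: lowercase, collapse non-alphanumeric runs to hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def flag_for(source, target):
    """Build a flag of the form --enable-<source-slug>-<target-slug>."""
    return f"--enable-{slug(source)}-{slug(target)}"


def flags_markdown(rules):
    """Render one markdown table row per rewrite rule."""
    lines = [
        "| Flag | Source pattern | Target pattern |",
        "| --- | --- | --- |",
    ]
    for source, target in rules:
        lines.append(f"| `{flag_for(source, target)}` | {source} | {target} |")
    return "\n".join(lines)


table = flags_markdown(RULES)
```

Generating the page from the same data the optimizer loads would keep the flag list from drifting out of date, though whether that fits the docs build here is a separate question.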

DingmaomaoBJTU requested a review from a team as a code owner May 27, 2026 10:09

DingmaomaoBJTU changed the base branch from main to docs/draft May 27, 2026 10:10

DingmaomaoBJTU merged commit 612d692 into docs/draft May 27, 2026
9 checks passed

DingmaomaoBJTU deleted the qiowu/docs/draft branch May 27, 2026 10:28

vortex-captain reviewed May 28, 2026

DingmaomaoBJTU merged 1 commit into docs/draft from qiowu/docs/draft

2 participants

