
Promote tensor_encoding comment to structured module attribute (P0-1 step 3) #477

@michalharakal

Description

Context

Follow-up to #473. The StableHLO emitter now surfaces `TensorSpec.tensorEncoding` as MLIR comments next to the operations that produce or consume encoded tensors:

```mlir
// tensor_encoding: role=result index=0 name=w encoding=Q8_0
```

Comments were the cheapest, most easily reversible first hop, but they have two shortcomings:

  1. They're string-matching territory for downstream consumers — no parser, no validation, no round-trip guarantee through tools that strip comments.
  2. They live next to ops instead of at the module level, so a downstream pass that wants to enumerate "every Q8_0 weight in this function" has to walk the whole IR instead of reading one structured attribute.

This PR

Promote the per-op comment annotations to a module-level MLIR attribute that enumerates every encoded tensor in one place. The emitted header becomes:

```mlir
module attributes {
  skainet.tensor_encodings = {
    w = "Q8_0",
    other_weight = "Q4_K"
  }
} {
  func.func @main(...) -> (...) {
    ...
  }
}
```

Concretely:

  1. Collect phase: before emitting `module {`, walk the `ComputeGraph` once, gather every `TensorSpec` with a non-null `tensorEncoding`, and build a map of `tensor_name → encoding_name`. No duplication: if the same name appears in multiple nodes, keep a single entry. Diagnostic: if the same name appears with two different encodings, drop the entry and emit a warning comment; that is a graph-level bug the caller should fix.
  2. Emit phase: if the map is non-empty, emit `module attributes { skainet.tensor_encodings = { ... } } {` instead of the bare `module {`. If it's empty, preserve the existing bare form exactly — dense graphs look identical to today.
  3. Keep the existing per-op comments — they're still useful for humans reading diffs and cost nothing now that the structured attribute is the source of truth for tools. A follow-up can remove them if we decide the attribute alone is sufficient.
  4. Unit test: a graph with a Q8_0 weight and a Q4_K weight must emit a module header containing `skainet.tensor_encodings = {` and both `= "Q8_0"` and `= "Q4_K"` entries. A dense graph must emit the bare `module {` header with no `attributes` block.
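The collect phase described in step 1 can be sketched as follows. The `TensorSpec`, `Node`, and `ComputeGraph` shapes below are simplified stand-ins for the real skainet types, and `collectEncodings` is a hypothetical helper name, not the actual implementation:

```kotlin
// Simplified stand-ins for the real skainet graph types.
data class TensorSpec(val name: String, val tensorEncoding: String?)
data class Node(val specs: List<TensorSpec>)
data class ComputeGraph(val nodes: List<Node>)

// One walk over the graph: gather name -> encoding, deduplicate repeated
// names, and drop names seen with two different encodings (with a warning).
fun collectEncodings(
    graph: ComputeGraph,
    warn: (String) -> Unit = {}
): Map<String, String> {
    val encodings = linkedMapOf<String, String>() // insertion order keeps output stable
    val conflicts = mutableSetOf<String>()
    for (node in graph.nodes) {
        for (spec in node.specs) {
            val enc = spec.tensorEncoding ?: continue
            val prev = encodings.put(spec.name, enc)
            if (prev != null && prev != enc) conflicts += spec.name
        }
    }
    // Conflicting entries are a graph-level bug upstream; drop them here.
    for (name in conflicts) {
        warn("// warning: tensor '$name' has conflicting encodings, dropping")
        encodings.remove(name)
    }
    return encodings
}
```

Duplicate-but-consistent mentions collapse to one entry, while a genuine conflict removes the name from the map entirely rather than silently picking a winner.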

Why a module attribute, not an op attribute

Real MLIR op attributes live inside the custom assembly of each `stablehlo.*` op (`stablehlo.dot_general %a, %b {skainet.encoding = "Q8_0"} : ...`). Emitting them correctly would require touching every converter in the registry (seven files today, all of which hand-build their op strings). A module-level attribute lives in exactly one place (`StableHloConverter.convert`) and costs one hook, which makes it the better value per line of code; per-op attributes can come in a later refactor, once the converter registry emits via a builder instead of string concatenation.
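The emit-phase hook can be sketched as a single function that renders the module header from the collected map. `moduleHeader` is a hypothetical name for illustration; the real hook lives in `StableHloConverter.convert`:

```kotlin
// Build the module header from the collected name -> encoding map.
// An empty map yields the bare form, so dense graphs emit exactly
// what they do today.
fun moduleHeader(encodings: Map<String, String>): String {
    if (encodings.isEmpty()) return "module {"
    val entries = encodings.entries
        .joinToString(",\n") { (name, enc) -> "    $name = \"$enc\"" }
    return "module attributes {\n" +
        "  skainet.tensor_encodings = {\n" +
        "$entries\n" +
        "  }\n" +
        "} {"
}
```

Keeping the empty-map branch as a literal `module {` preserves byte-for-byte output for graphs without encoded tensors, which is what the unit test in step 4 pins down.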

Out of scope

  • Per-op MLIR attributes. Bigger refactor.
  • Real `quant.` dialect emission (`!quant.uniform<i8:f32, 0.1:128>`). Requires structured quant parameters (scale, zero point) that `TensorEncoding` doesn't yet carry.
  • Teaching IREE to consume the attribute. That's downstream work.
  • Changing the shape of `TensorEncoding` or `TensorSpec`.
