Context
Follow-up to #473. The StableHLO emitter now surfaces `TensorSpec.tensorEncoding` as MLIR comments next to the operations that produce or consume encoded tensors:
```mlir
// tensor_encoding: role=result index=0 name=w encoding=Q8_0
```
Comments were the right first hop — the cheapest and most easily reversible option — but they have two shortcomings:
- They're string-matching territory for downstream consumers — no parser, no validation, no round-trip guarantee through tools that strip comments.
- They live next to ops instead of at the module level, so a downstream pass that wants to enumerate "every Q8_0 weight in this function" has to walk the whole IR instead of reading one structured attribute.
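To make the first shortcoming concrete, here is a hypothetical sketch of what a downstream consumer has to do today: regex-match the free-form comments out of raw IR text. Nothing in this sketch is real SKaiNET API — it only illustrates the fragility the structured attribute removes.

```kotlin
// Hypothetical downstream consumer: recover encodings by string-matching
// the per-op comments. No parser, no validation — if a tool strips
// comments or reorders the fields, this silently returns nothing.
val commentPattern = Regex(
    """//\s*tensor_encoding:\s*role=\S+\s+index=\d+\s+name=(\S+)\s+encoding=(\S+)"""
)

fun collectEncodingsFromComments(mlir: String): Map<String, String> =
    commentPattern.findAll(mlir)
        .associate { m -> m.groupValues[1] to m.groupValues[2] }
```

A module-level attribute replaces this whole class of scraping with one structured read.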
This PR
Promote the per-op comment annotations to a module-level MLIR attribute that enumerates every encoded tensor in one place. The emitted header becomes:
```mlir
module attributes {
skainet.tensor_encodings = {
w = "Q8_0",
other_weight = "Q4_K"
}
} {
func.func @main(...) -> (...) {
...
}
}
```
Concretely:
- Collect phase: before emitting `module {`, walk the `ComputeGraph` once, gather every `TensorSpec` with a non-null `tensorEncoding`, and build a `tensor_name → encoding_name` map. No duplication: if the same name appears in multiple nodes with the same encoding, keep a single entry. Diagnostic: if the same name appears with two different encodings, drop the entry and emit a warning comment — that's a graph-level bug the caller should fix.
- Emit phase: if the map is non-empty, emit `module attributes { skainet.tensor_encodings = { ... } } {` instead of the bare `module {`. If it's empty, preserve the existing bare form exactly — dense graphs look identical to today.
- Keep the existing per-op comments — they're still useful for humans reading diffs and cost nothing now that the structured attribute is the source of truth for tools. A follow-up can remove them if we decide the attribute alone is sufficient.
- Unit test: a graph with a Q8_0 weight and a Q4_K weight must emit a module header containing `skainet.tensor_encodings = {` and both `= "Q8_0"` and `= "Q4_K"` entries. A dense graph must emit the bare `module {` header with no `attributes` block.
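The collect and emit phases above can be sketched roughly as follows. The `TensorSpec`/`ComputeGraph` shapes and helper names here are illustrative stand-ins, not the real SKaiNET types:

```kotlin
// Illustrative stand-ins for the real graph types (fields assumed).
data class TensorSpec(val name: String, val tensorEncoding: String?)
data class ComputeGraph(val specs: List<TensorSpec>)

// Collect phase: one walk over the graph. Identical repeats collapse to a
// single entry; conflicting encodings for one name drop the entry entirely
// after surfacing a warning.
fun collectEncodings(graph: ComputeGraph, warn: (String) -> Unit): Map<String, String> {
    val out = linkedMapOf<String, String>()
    val conflicted = mutableSetOf<String>()
    for (spec in graph.specs) {
        val enc = spec.tensorEncoding ?: continue
        val prev = out[spec.name]
        when {
            prev == null && spec.name !in conflicted -> out[spec.name] = enc
            prev != null && prev != enc -> {
                warn("conflicting encodings for ${spec.name}: $prev vs $enc")
                out.remove(spec.name)
                conflicted += spec.name
            }
        }
    }
    return out
}

// Emit phase: structured attribute block when non-empty, bare header otherwise,
// so dense graphs are byte-identical to today's output.
fun moduleHeader(encodings: Map<String, String>): String =
    if (encodings.isEmpty()) "module {"
    else encodings.entries.joinToString(
        separator = ",\n",
        prefix = "module attributes {\n  skainet.tensor_encodings = {\n",
        postfix = "\n  }\n} {"
    ) { (name, enc) -> "    $name = \"$enc\"" }
```

The `linkedMapOf` keeps attribute entries in first-seen graph order, which keeps the emitted header deterministic across runs.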
Why a module attribute, not an op attribute
Real MLIR op attributes live inside the custom assembly of each `stablehlo.*` op (`stablehlo.dot_general %a, %b {skainet.encoding = "Q8_0"} : ...`). Emitting them correctly requires touching every converter in the registry — 7 files today, all of which hand-build their op strings. A module-level attribute lives in exactly one place (`StableHloConverter.convert`) and costs one hook. It's the bigger value-per-line-of-code; per-op attributes can come in a later refactor when the converter registry is reworked to emit via a builder instead of string concatenation.
Out of scope
- Per-op MLIR attributes. Bigger refactor.
- Real `quant.` dialect emission (`!quant.uniform<i8:f32, 0.1:128>`). Requires structured quant parameters (scale, zero point) that `TensorEncoding` doesn't yet carry.
- Teaching IREE to consume the attribute. That's downstream work.
- Changing the shape of `TensorEncoding` or `TensorSpec`.