
Emit skainet.tensor_encodings module attribute (#477)#478

Merged
michalharakal merged 2 commits into develop from feature/477-tensor-encodings-module-attribute
Apr 13, 2026
Conversation

@michalharakal
Contributor

Closes #477.

Summary

Promotes the per-op `tensor_encoding` annotation comments introduced in #473 to a structured module-level MLIR attribute that enumerates every encoded tensor in one place. Downstream tools (IREE, any MLIR pass reading SKaiNET output) can now read the encodings via one attribute lookup instead of string-matching against scattered comments.

Emitted output

For a graph with two encoded weight inputs (Q8_0 and Q4_K):

```mlir
module attributes {skainet.tensor_encodings = {w_q4 = "Q4_K", w_q8 = "Q8_0"}} {
  func.func @quant_chain(...) -> (...) {
    ...
  }
}
```

Dense graphs with no encoding metadata continue to emit the bare `module {` header byte-for-byte identically to today — the promotion only fires when the collect phase finds at least one non-null encoding.

Two commits

1. Failing test — `TensorEncodingsModuleAttributeTest`

  • `encoded_weights_produce_module_attributes_block` builds a graph with two weight inputs (`TensorEncoding.Q8_0` and `TensorEncoding.Q4_K` via `withTensorEncoding`), runs the converter, and asserts the module header contains `module attributes`, `skainet.tensor_encodings`, and both `w_q8 = "Q8_0"` / `w_q4 = "Q4_K"` entries. Red against pre-fix emitter.
  • `dense_graph_keeps_bare_module_header` asserts a dense graph preserves `module {` exactly and introduces no spurious attributes block. A `null` `tensorEncoding` is the unknown / not-carried state, intentionally distinct from `TensorEncoding.Dense`. Green baseline.

2. The fix

  • `StableHloConverter.collectTensorEncodings` walks the topologically-ordered node list once, pulling every non-null `tensorEncoding` off each node's input and output specs into a sorted `name -> TensorEncoding` map. First-writer-wins on duplicates.
  • `StableHloConverter.convert` emits `module attributes {skainet.tensor_encodings = {...}} {` when the map is non-empty, and `module {` otherwise.
  • `MlirValidator` learned three things: (a) module-header lines are not SSA assignments even when they contain ` = ` inside a dict (syntax pass skips them), (b) same for the semantic pass's defined-values extraction, (c) `validateModule` accepts both `module {` and `module attributes` as valid module declaration forms.
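The collect-then-emit flow above can be sketched in common Kotlin. The types and function bodies below are simplified stand-ins, not the real `StableHloConverter` internals — `TensorSpec`/`Node` here are minimal assumptions made for illustration:

```kotlin
// Hypothetical, simplified model of the converter's graph types.
enum class TensorEncoding { Q8_0, Q4_K }

data class TensorSpec(val name: String, val tensorEncoding: TensorEncoding? = null)
data class Node(val inputs: List<TensorSpec>, val outputs: List<TensorSpec>)

// One walk over the topologically ordered node list, pulling every non-null
// encoding off input and output specs. First-writer-wins on duplicate names;
// a sorted map keeps emission deterministic.
fun collectTensorEncodings(nodes: List<Node>): Map<String, TensorEncoding> {
    val result = sortedMapOf<String, TensorEncoding>()
    for (node in nodes) {
        for (spec in node.inputs + node.outputs) {
            val encoding = spec.tensorEncoding ?: continue
            if (spec.name !in result) result[spec.name] = encoding
        }
    }
    return result
}

// Emit the module header: the attributes block only appears when the
// collect phase found at least one encoding.
fun moduleHeader(encodings: Map<String, TensorEncoding>): String =
    if (encodings.isEmpty()) "module {"
    else encodings.entries.joinToString(
        prefix = "module attributes {skainet.tensor_encodings = {",
        postfix = "}} {"
    ) { (name, enc) -> "$name = \"${enc.name}\"" }
```

With two encoded weights this reproduces the header shown in the emitted-output example; with an empty map it degrades to the bare `module {`.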

Why a module attribute, not per-op

Real per-op MLIR attributes (`stablehlo.dot_general %a, %b {skainet.encoding = "Q8_0"} : ...`) would need every converter in the registry updated — 7 files today, all hand-building their op strings. A module-level attribute lives in exactly one hook and is the biggest value-per-line-of-code first move. Per-op attributes are a better follow-up once the registry is reworked to emit via a builder instead of string concatenation.

Test plan

  • `TensorEncodingsModuleAttributeTest` — both cases green
  • Full `:skainet-compile:skainet-compile-hlo:jvmTest` — green, no regressions from either the emitter or validator edits
  • CI: full multiplatform build

Out of scope

  • Per-op MLIR attributes (`stablehlo.*` op custom-assembly-format attributes).
  • Real `quant.` dialect emission (`!quant.uniform<i8:f32, 0.1:128>`). Requires structured scale / zero-point parameters that `TensorEncoding` doesn't yet carry.
  • Teaching IREE to consume the attribute. Downstream work.
  • Removing the per-op `tensor_encoding` comments introduced in #473 (Emit TensorEncoding into StableHLO output, P0-1 step 2). They cost nothing now and help human diff readers; they can be deleted in a follow-up if we decide the attribute alone suffices.

🤖 Generated with Claude Code

michalharakal and others added 2 commits April 13, 2026 12:55
Adds TensorEncodingsModuleAttributeTest with two cases:

1. encoded_weights_produce_module_attributes_block — builds a
   graph with two weight inputs carrying distinct encodings
   (TensorEncoding.Q8_0 and TensorEncoding.Q4_K) via
   withTensorEncoding, runs the converter, and asserts the
   emitted module header is `module attributes { ... }` with a
   single `skainet.tensor_encodings = { w_q4 = "Q4_K", w_q8 =
   "Q8_0" }` dictionary enumerating every encoded tensor in one
   place. Red against StableHloConverter today — the metadata
   is only exposed as scattered per-op comments.

2. dense_graph_keeps_bare_module_header — a dense FP32 graph
   with no encoding metadata must preserve the bare `module {`
   header and must not introduce a spurious `attributes` block.
   A `null` tensorEncoding is the unknown / not-carried state,
   intentionally distinct from TensorEncoding.Dense, and the
   emitter must stay silent. Green baseline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Teaches StableHloConverter to collect every TensorSpec with a
non-null tensorEncoding from the graph in a single pre-emit
walk and surface them as a structured MLIR attribute on the
module header:

    module attributes {skainet.tensor_encodings = {w_q4 = "Q4_K", w_q8 = "Q8_0"}} {
      func.func @quant_chain(...) -> (...) {
        ...
      }
    }

Downstream tools (IREE, any MLIR pass reading SKaiNET output)
can now enumerate every encoded tensor via one attribute lookup
instead of string-matching against the scattered `tensor_encoding`
comments introduced in #473. Dense graphs with no encoding
metadata continue to emit the bare `module {` header byte-for-byte
identically to today — the promotion only fires when the
collect phase finds at least one non-null encoding. The existing
per-op comments are left in place; a follow-up can remove them
if we decide the module attribute alone is sufficient.

Collect phase is first-writer-wins: if the same tensor name
surfaces in multiple nodes it collapses to a single map entry.
Walks both inputs and outputs of every node so weight-operand
references catch encodings that didn't originate as node outputs.

Also tweaks MlirValidator to accept the new header shape:

- validateSyntax treats `module` lines as module preamble and
  stops processing them at the header check, so the dict's
  `name = "value"` entries are no longer fed into the SSA
  assignment validator (which would fire on the absence of a
  `%` prefix).
- validateSemantics skips `module`-prefixed lines for the same
  reason in its defined-values extraction.
- validateModule accepts both `module {` and `module attributes`
  as valid module declaration forms.
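
The relaxed module-declaration check can be illustrated with a small sketch; the function name and shape here are assumptions based on the description above, not the actual `MlirValidator` API:

```kotlin
// Hypothetical predicate: is this line a module header (bare or attributed)?
// Header lines are skipped by the SSA-assignment checks even though the
// attribute dictionary contains ` = ` pairs without a `%` prefix.
fun isModuleHeader(line: String): Boolean {
    val trimmed = line.trim()
    return trimmed.startsWith("module {") || trimmed.startsWith("module attributes")
}
```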

Full :skainet-compile:skainet-compile-hlo:jvmTest suite stays
green — no existing test was asserting on `module {` in a way
that excludes the attributes form.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@michalharakal michalharakal merged commit 9edd5ec into develop Apr 13, 2026
2 of 4 checks passed
@michalharakal michalharakal deleted the feature/477-tensor-encodings-module-attribute branch April 13, 2026 10:57
michalharakal added a commit that referenced this pull request Apr 13, 2026
`MutableMap.putIfAbsent(key, value)` is a JVM-only extension (it
lives on `java.util.Map`), not part of Kotlin common-stdlib. The
usage introduced in #477's `collectTensorEncodings` compiled fine
on JVM but broke `compileKotlinWasmJs` (and by extension JS /
Native) with `Unresolved reference 'putIfAbsent'` at
StableHloConverter.kt:209,213. develop has been red on the
multiplatform build job since #478 merged.

Root cause: the local verification for #477 ran only
`:skainet-compile:skainet-compile-hlo:jvmTest`. KMP's per-target
compilation means a JVM-only stdlib call silently passes JVM
checks and only fails on the first non-JVM compile.

Replaces both call sites with the common-compatible idiom:

    if (spec.name !in result) result[spec.name] = encoding

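The swap can be shown in isolation. This is a minimal sketch, not the actual `collectTensorEncodings` call site; `firstWriterWins` and its string-valued map are illustrative stand-ins:

```kotlin
// JVM-only — `putIfAbsent` lives on java.util.Map and fails to compile
// on wasmJs / JS / Native targets:
//   result.putIfAbsent(name, encoding)
//
// Common-stdlib equivalent with the same first-writer-wins semantics:
fun firstWriterWins(pairs: List<Pair<String, String>>): Map<String, String> {
    val result = mutableMapOf<String, String>()
    for ((name, encoding) in pairs) {
        if (name !in result) result[name] = encoding  // keep the first value seen
    }
    return result
}
```

`result.getOrPut(name) { encoding }` is another common-stdlib option with the same effect when the computed value is discarded.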
Verified locally with
`./gradlew :skainet-compile:skainet-compile-hlo:allTests
 -x kotlinWasmStoreYarnLock` — green across jvmTest, wasmJsTest,
wasmJsBrowserTest, wasmWasiTest, wasmWasiNodeTest, macosArm64Test,
and iosSimulatorArm64Test (linuxX64Test skipped on macOS host).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
michalharakal added a commit that referenced this pull request Apr 13, 2026
Captures the goal, phases, non-goals, and risks of implementing
a new `skainet-backend-nnapi` module that runs SKaiNET models
on an Amlogic Android dev board's NPU via Android's NNAPI HAL.

Important placement notes in the PRD itself:
- The backend lives in a NEW sibling repo, not in mainline
  SKaiNET. Mainline stays general and IREE-focused.
- The backend builds on top of the already-merged
  skainet-backend-api module (#470) and the TensorEncoding
  metadata flow (#471 / #475 / #478) — no mainline code
  changes are required to ship Phase 1-3.
- Orthogonal to SKaiNET-transformers, which owns LLM modules.

Phases:
  0. Board bring-up + NNAPI device capability dump
  1. FP32 dense matmul end-to-end
  2. int8 quantization path hitting the NPU driver
  3. Target model (MobileNetV3 int8 or TinyLlama candidate)
  4. Optional production packaging

Also documents the known deprecation risk (Android 15 marked
NNAPI deprecated in favor of LiteRT) and captures this as
accepted: ship the Amlogic use case now, plan a LiteRT
successor later.

This file is a planning artifact; it will be moved / referenced
from the new backend repo once that repo exists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Linked issue: Promote tensor_encoding comment to structured module attribute (P0-1 step 3)