tensor-layouts 0.3.0

@jduprat jduprat released this 20 Apr 19:56
· 78 commits to main since this release

What's Changed

Composed layouts

  • ComposedLayout: the biggest feature of release v0.3. Previously, a Layout could accept a single Swizzle, but that pairing was hard-coded and did not compose. Layouts and Swizzles can now be composed arbitrarily: multi-stage compositions (outer ∘ preoffset ∘ inner) that cannot be collapsed into a single affine Layout are supported, and double-swizzle, affine-on-swizzled, and recursive compositions now preserve full mapping semantics
  • LayoutExpr type alias (Layout | ComposedLayout) — all public APIs that accept a layout now accept either form transparently
  • Layout traits: is_layout() widened to the CuTe-style trait; new is_affine_layout(), as_layout_expr(), as_affine_layout() for explicit trait boundaries between generic and affine-only code paths
  • Exact compose() — canonical compose(Swizzle, Layout) fast path preserved; non-canonical cases (double-swizzle, affine-on-swizzled, recursive through existing ComposedLayout) produce a ComposedLayout instead of silently losing information
  • CuTe-specific parity rules: Layout ∘ Swizzle composition, zero-preoffset collapse for Layout ∘ ComposedLayout, swizzled composed inverse support, and swizzle-aware max_common_layout() / max_common_vector()
  • Structural transforms forwarded: append, prepend, replace, group, flatten, sort, coalesce, logical_divide, logical_product operate on the inner domain of a ComposedLayout instead of dropping to affine-only assumptions
  • Slicing without offset leaks: slice_and_offset() generalized so fixed-coordinate contributions inside a nonlinear composition stay inside the resulting ComposedLayout (external offset 0) instead of being turned into an incorrect pointer offset
  • Swizzle.__hash__: Swizzle objects are now hashable, so ComposedLayout with a Swizzle outer works in sets and dicts
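
To make the outer ∘ preoffset ∘ inner idea concrete, here is a minimal, self-contained sketch. It does not use the tensor-layouts API: ToySwizzle, ToyLayout, and ToyComposed are hypothetical stand-ins, and the swizzle follows the usual CuTe-style XOR definition for a non-negative shift. It illustrates why a swizzled mapping is not affine, why a preoffset inside the composition cannot be moved out as a plain pointer offset, and why hashability matters for dict and set use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySwizzle:
    """Toy CuTe-style swizzle (bits, base, shift), for shift >= 0 only."""
    bits: int
    base: int
    shift: int

    def __call__(self, offset: int) -> int:
        # Take `bits` bits starting at position base + shift and XOR them
        # into the bits starting at position base.
        mask = ((1 << self.bits) - 1) << (self.base + self.shift)
        return offset ^ ((offset & mask) >> self.shift)


@dataclass(frozen=True)
class ToyLayout:
    """Toy flat affine layout: colexicographic coordinate dotted with stride."""
    shape: tuple
    stride: tuple

    def __call__(self, index: int) -> int:
        offset = 0
        for extent, step in zip(self.shape, self.stride):
            offset += (index % extent) * step
            index //= extent
        return offset


@dataclass(frozen=True)
class ToyComposed:
    """Toy three-stage composition: index -> outer(preoffset + inner(index))."""
    outer: object
    preoffset: int
    inner: object

    def __call__(self, index: int) -> int:
        return self.outer(self.preoffset + self.inner(index))


inner = ToyLayout(shape=(8, 8), stride=(8, 1))
swizzled = ToyComposed(ToySwizzle(3, 0, 3), 0, inner)

# Still a bijection onto 0..63, but no longer affine in the coordinates:
# the value at coordinate (1, 1) is not the sum of the values at (1, 0) and (0, 1).
print(swizzled(1), swizzled(8), swizzled(9))

# A nonzero preoffset is applied before the swizzle, so it cannot be pulled
# out and added to a base pointer afterwards.
with_preoffset = ToyComposed(ToySwizzle(3, 0, 3), 8, inner)
print(with_preoffset(0), swizzled(0) + 8)

# Frozen dataclasses are hashable, so a composed value can be a dict key.
cache = {swizzled: "shared-memory tile"}
```

The frozen dataclasses are only a convenient way to get value equality and hashing in the sketch; how the library itself implements Swizzle.__hash__ is not shown here.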

Tensor

  • Tensor now accepts LayoutExpr — indexing, slicing, storage validation, and address computation route through layout-expression-aware helpers; Tensor.stride remains deliberately affine-only and raises clearly on composed layouts

Analysis

  • Generic LayoutExpr consumers — image(), is_injective(), is_surjective(), is_bijective(), is_contiguous(), functionally_equal(), offset_table(), footprint(), bank_conflicts(), coalescing_efficiency(), segment_analysis(), per_group_bank_conflicts(), per_group_coalescing(), cycles(), fixed_points(), order() all accept ComposedLayout transparently
  • Affine-only helpers (to_F2_matrix, weakly_congruent, explain) now fail clearly on composed layouts via as_affine_layout()
  • max_common_vector() and max_common_layout() treat embedded-swizzle Layout(..., swizzle=...) and zero-preoffset ComposedLayout(Swizzle, inner) as the same semantic form
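
The split between generic consumers and affine-only helpers can be sketched as follows. This is an illustration under stated assumptions, not the library's code: ToyAffine, ToySwizzled, toy_image, toy_is_injective, and toy_as_affine are hypothetical names. Generic analysis only needs to evaluate the mapping, while affine-only code asks for an explicit conversion that fails with a clear error.

```python
from math import prod


class ToyAffine:
    """Toy affine layout with a flat shape and stride."""

    def __init__(self, shape, stride):
        self.shape, self.stride = shape, stride
        self.size = prod(shape)

    def __call__(self, index):
        offset = 0
        for extent, step in zip(self.shape, self.stride):
            offset += (index % extent) * step
            index //= extent
        return offset


class ToySwizzled:
    """Toy composed layout: a fixed XOR swizzle applied after an inner layout."""

    def __init__(self, inner):
        self.inner = inner
        self.size = inner.size

    def __call__(self, index):
        offset = self.inner(index)
        return offset ^ ((offset & 0b111000) >> 3)


def toy_image(layout):
    # Generic: only evaluates the mapping, never inspects strides.
    return sorted({layout(i) for i in range(layout.size)})


def toy_is_injective(layout):
    return len(toy_image(layout)) == layout.size


def toy_as_affine(layout):
    # Affine-only boundary: refuse anything that is not a plain affine layout.
    if not isinstance(layout, ToyAffine):
        raise TypeError(f"affine-only operation got {type(layout).__name__}")
    return layout


row_major = ToyAffine((8, 8), (8, 1))
swizzled = ToySwizzled(row_major)
broadcast = ToyAffine((4, 2), (1, 0))  # stride 0 repeats offsets

print(toy_is_injective(row_major), toy_is_injective(swizzled), toy_is_injective(broadcast))
```

Calling toy_as_affine(swizzled) raises TypeError, which mirrors the "fail clearly" behavior described above without claiming anything about the real error type or message.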

Inverse helpers

  • right_inverse() and left_inverse() preserve embedded swizzles by inverting the affine inner layout and recomposing the original swizzle
  • right_inverse() now skips noncontiguous sorted modes instead of terminating immediately, matching CuTe on broadcast-unit examples
  • Composition divisibility tightened: stricter truncation gate restored for partially fitting non-divisible strides while preserving valid §3.3.3 truncation cases
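
The idea of inverting the affine part and recomposing the swizzle can be checked on a small case. The sketch below is self-contained and hypothetical (plain functions, not the library's right_inverse()): it uses an 8×8 layout with strides (8, 1) and an XOR swizzle whose source bits are left untouched, which makes the swizzle its own inverse.

```python
def swizzle(offset):
    # XOR bits 3..5 into bits 0..2; bits 3..5 are unchanged, so applying
    # the swizzle twice returns the original offset.
    return offset ^ ((offset & 0b111000) >> 3)


def inner(index):
    # Affine inner layout: shape (8, 8), stride (8, 1), colexicographic index.
    return (index % 8) * 8 + index // 8


def inner_right_inverse(offset):
    # For this square transpose-like layout the inverse has the same form.
    return (offset % 8) * 8 + offset // 8


def composed(index):
    return swizzle(inner(index))


def composed_right_inverse(offset):
    # Undo the swizzle first, then undo the affine inner layout.
    return inner_right_inverse(swizzle(offset))


# Right-inverse property: composed(composed_right_inverse(k)) == k.
round_trip_ok = all(composed(composed_right_inverse(k)) == k for k in range(64))
print(round_trip_ok)
```

This only covers a compact, bijective inner layout; the broadcast and noncontiguous cases mentioned above need the more careful mode handling that the library implements.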

Visualization

  • draw_layout(), draw_slice(), and multi-panel rendering accept ComposedLayout directly
  • draw_slice titles reflect internal composed preoffset instead of leaking external offset
  • Parameter types widened to LayoutExpr in docs/viz_api.md

Examples & notebooks

  • examples/composed.py — runnable example covering canonical swizzle fast path, exact composed fallback, slicing/tensor offset split, and optional --draw figures
  • examples/gemm.ipynb — fully explained GEMM kernel walkthrough with layout algebra
  • examples/viz.ipynb — composed-layout discoverability note added
  • tests/paper_examples.py — full coverage of all figures (1–12) and tables (1–7) in arXiv 2603.02298, with --draw support for rendering paper figures

Bug fixes

  • oracle_cute_cpp skips gracefully when nvidia package is absent instead of crashing
  • Defensive assertions in _compose_with_tiler() and _logical_divide_with_tiler() guard against composed results leaking into affine-only rebuild paths

Docs & build

  • Composed-layout sections added to docs/layout_api.md, docs/tensor_api.md, docs/analysis_api.md, docs/viz_api.md
  • Composed-layout figures (composed_exact.png, composed_slice.png) generated and checked in
  • Makefile clean target updated for tests/figures/
  • pytest --draw conftest hook renders paper figures into tests/figures/
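
For readers unfamiliar with how a custom flag such as --draw is wired into pytest, a conftest.py along these lines is one common pattern. This is a sketch, not the project's actual conftest: pytest_addoption and config.getoption are real pytest hooks and APIs, while figure_path and its arguments are hypothetical helpers introduced for illustration.

```python
from pathlib import Path


def pytest_addoption(parser):
    # Standard pytest hook: registers a boolean command-line flag.
    parser.addoption(
        "--draw",
        action="store_true",
        default=False,
        help="render figures while running the tests",
    )


def figure_path(config, name, root=Path("tests/figures")):
    """Hypothetical helper: where to save a figure, or None without --draw."""
    if not config.getoption("--draw"):
        return None
    root.mkdir(parents=True, exist_ok=True)
    return root / f"{name}.png"
```

A test would then call figure_path(request.config, "fig1") and skip rendering when it returns None, so the default test run stays free of file output.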

Tests

  • tests/composed.py — 45 regression tests covering representation contract, trait behavior, exact composition, divide/product cascades, recursive chains, hierarchical inners, full-slice identity, multi-mode, Tensor.view, and generic analysis coverage
  • tests/viz.py — cell-value and panel-color correctness tests for composed layouts
  • tests/oracle_cute_cpp.py — nonzero-preoffset composition, recursive composed chains, composed logical_divide/logical_product, make_tensor with ComposedLayout
  • tests/paper_examples.py — full arXiv 2603.02298 coverage with exact offset-value assertions

Full Changelog: v0.2.1...v0.3.0