tensor-layouts 0.3.0
What's Changed
Composed layouts
- ComposedLayout: the biggest feature of release v0.3. Previously, Layout could accept a single Swizzle, but that support was hard-coded and did not compose. Layouts and Swizzles can now be composed arbitrarily, including multi-stage compositions (outer ∘ preoffset ∘ inner) that cannot be collapsed into a single affine Layout. Double-swizzle, affine-on-swizzled, and recursive compositions now preserve full mapping semantics
- LayoutExpr type alias (Layout | ComposedLayout): all public APIs that accept a layout now accept either form transparently
- Layout traits: is_layout() widened to the CuTe-style trait; new is_affine_layout(), as_layout_expr(), and as_affine_layout() mark explicit trait boundaries between generic and affine-only code paths
- Exact compose(): the canonical compose(Swizzle, Layout) fast path is preserved; non-canonical cases (double-swizzle, affine-on-swizzled, recursive through an existing ComposedLayout) produce a ComposedLayout instead of silently losing information
- CuTe-specific parity rules: Layout ∘ Swizzle composition, zero-preoffset collapse for Layout ∘ ComposedLayout, swizzled composed inverse support, and swizzle-aware max_common_layout() / max_common_vector()
- Structural transforms forwarded: append, prepend, replace, group, flatten, sort, coalesce, logical_divide, and logical_product operate on the inner domain of a ComposedLayout instead of dropping to affine-only assumptions
- Slicing without offset leaks: slice_and_offset() generalized so that fixed-coordinate contributions inside a nonlinear composition stay inside the resulting ComposedLayout (external offset 0) instead of being turned into an incorrect pointer offset
- Swizzle.__hash__: Swizzle objects are now hashable, so a ComposedLayout with a Swizzle outer works in sets and dicts
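To make the composition and slicing notes above concrete, here is a minimal self-contained sketch. It does not use the tensor-layouts API: `XorSwizzle`, `affine`, and `composed` are hypothetical stand-ins written for this illustration, and the swizzle follows the usual CuTe-style XOR convention as an assumption. It shows that a swizzled row-major layout is not affine, and why a fixed-coordinate contribution has to stay inside the composition as a preoffset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XorSwizzle:
    """Hypothetical stand-in for a CuTe-style swizzle (not the library class)."""
    bits: int
    base: int
    shift: int

    def __call__(self, offset: int) -> int:
        # Take `bits` bits starting at bit (base + shift) and XOR them
        # into the `bits` bits starting at bit `base`.
        mask = ((1 << self.bits) - 1) << (self.base + self.shift)
        return offset ^ ((offset & mask) >> self.shift)


def affine(shape, stride):
    """Affine layout: coordinate tuple -> sum(coord[i] * stride[i])."""
    def layout(coord):
        assert len(coord) == len(shape)
        assert all(0 <= c < s for c, s in zip(coord, shape))
        return sum(c * d for c, d in zip(coord, stride))
    return layout


def composed(outer, preoffset, inner):
    """Three-stage map: coord -> outer(preoffset + inner(coord))."""
    return lambda coord: outer(preoffset + inner(coord))


swz = XorSwizzle(bits=3, base=0, shift=3)
full = composed(swz, 0, affine((8, 8), (8, 1)))  # swizzled row-major 8x8

# Row 1 of the full layout: the offsets are not an arithmetic progression,
# so no single (shape, stride) pair can describe them.
row1 = [full((1, j)) for j in range(8)]
print(row1)  # [9, 8, 11, 10, 13, 12, 15, 14]

# Slicing at row 1: the fixed coordinate contributes 1 * 8 = 8.
# Keeping that 8 as the preoffset inside the composition is exact ...
sliced = composed(swz, 8, affine((8,), (1,)))
kept_inside = [sliced((j,)) for j in range(8)]

# ... whereas hoisting it out as a pointer offset is wrong, because the
# swizzle is not additive: swz(8 + j) != 8 + swz(j) in general.
hoisted = composed(swz, 0, affine((8,), (1,)))
leaked = [8 + hoisted((j,)) for j in range(8)]
print(kept_inside == row1, leaked == row1)  # True False
```

The frozen dataclass is hashable by construction, which is the same property the Swizzle.__hash__ item relies on for use in sets and dicts.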
Tensor
- Tensor now accepts LayoutExpr: indexing, slicing, storage validation, and address computation route through layout-expression-aware helpers; Tensor.stride remains deliberately affine-only and raises a clear error on composed layouts
Analysis
- Generic LayoutExpr consumers: image(), is_injective(), is_surjective(), is_bijective(), is_contiguous(), functionally_equal(), offset_table(), footprint(), bank_conflicts(), coalescing_efficiency(), segment_analysis(), per_group_bank_conflicts(), per_group_coalescing(), cycles(), fixed_points(), and order() all accept ComposedLayout transparently
- Affine-only helpers (to_F2_matrix, weakly_congruent, explain) now fail clearly on composed layouts via as_affine_layout()
- max_common_vector() and max_common_layout() treat embedded-swizzle Layout(..., swizzle=...) and zero-preoffset ComposedLayout(Swizzle, inner) as the same semantic form
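As a rough illustration of why generic analysis matters for composed layouts, the sketch below brute-forces injectivity and a simple bank-conflict count for a plain and a swizzled 32x32 tile. Everything here is an assumption made for the example: the helper names are hypothetical and are not the signatures of the library's is_injective() or bank_conflicts(), and the bank model (32 banks, one element per bank word) is deliberately simplified.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 banks, one 4-byte element per bank word


def swizzle(offset, bits=5, base=0, shift=5):
    """XOR `bits` bits at position base + shift into the bits at `base`."""
    mask = ((1 << bits) - 1) << (base + shift)
    return offset ^ ((offset & mask) >> shift)


def plain(row, col, width=32):
    """Row-major 32x32 tile: (row, col) -> row * width + col."""
    return row * width + col


def swizzled(row, col):
    """The same tile with the swizzle applied on top of the affine map."""
    return swizzle(plain(row, col))


def is_injective(fn, coords):
    """True if no two coordinates map to the same offset."""
    offsets = [fn(*c) for c in coords]
    return len(set(offsets)) == len(offsets)


def worst_bank_conflict(offsets):
    """Largest number of distinct offsets that fall into one bank."""
    per_bank = Counter(off % NUM_BANKS for off in set(offsets))
    return max(per_bank.values())


tile = [(r, c) for r in range(32) for c in range(32)]
column = [(r, 3) for r in range(32)]  # 32 threads read column 3

print(is_injective(plain, tile), is_injective(swizzled, tile))  # True True
print(worst_bank_conflict([plain(*c) for c in column]))         # 32
print(worst_bank_conflict([swizzled(*c) for c in column]))      # 1
```

Because both checks only evaluate the mapping point by point, they work the same way whether or not the layout is affine.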
Inverse helpers
- right_inverse() and left_inverse() preserve embedded swizzles by inverting the affine inner layout and recomposing the original swizzle
- right_inverse() now skips noncontiguous sorted modes instead of terminating immediately, matching CuTe on broadcast-unit examples
- Composition divisibility tightened: stricter truncation gate restored for partially fitting non-divisible strides while preserving valid §3.3.3 truncation cases
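One way to read the first item is sketched below, under the assumption that the swizzle is an XOR swizzle whose source and target bit fields do not overlap, so that it is its own inverse. The functions are hypothetical illustrations, not the library's right_inverse(): the inner affine layout is inverted on its own, and the swizzle is then recomposed with that inverse.

```python
def swizzle(offset, bits=3, base=0, shift=3):
    """XOR swizzle; an involution when the bit fields do not overlap."""
    assert shift >= bits, "source and target bit fields must not overlap"
    mask = ((1 << bits) - 1) << (base + shift)
    return offset ^ ((offset & mask) >> shift)


def inner(coord):
    """Affine inner layout: row-major 8x8, (row, col) -> row * 8 + col."""
    row, col = coord
    return row * 8 + col


def inner_right_inverse(offset):
    """Right inverse of `inner`: offset -> (row, col)."""
    return divmod(offset, 8)


def composed(coord):
    """Swizzled layout: coord -> swizzle(inner(coord))."""
    return swizzle(inner(coord))


def composed_right_inverse(offset):
    # Undo the swizzle first (it is its own inverse), then invert the
    # affine part.
    return inner_right_inverse(swizzle(offset))


round_trip = [composed(composed_right_inverse(k)) for k in range(64)]
print(round_trip == list(range(64)))  # True

# Dropping the swizzle from the inverse does not round-trip.
naive = [composed(inner_right_inverse(k)) for k in range(64)]
print(naive == list(range(64)))  # False
```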
Visualization
- draw_layout(), draw_slice(), and multi-panel rendering accept ComposedLayout directly
- draw_slice titles reflect the internal composed preoffset instead of leaking the external offset
- Parameter types widened to LayoutExpr in docs/viz_api.md
Examples & notebooks
- examples/composed.py: runnable example covering the canonical swizzle fast path, exact composed fallback, slicing/tensor offset split, and optional --draw figures
- examples/gemm.ipynb: fully explained GEMM kernel walkthrough with layout algebra
- examples/viz.ipynb: composed-layout discoverability note added
- tests/paper_examples.py: full coverage of all figures (1–12) and tables (1–7) in arXiv 2603.02298, with --draw support for rendering paper figures
Bug fixes
- oracle_cute_cpp skips gracefully when the nvidia package is absent instead of crashing
- Defensive assertions in _compose_with_tiler() and _logical_divide_with_tiler() guard against composed results leaking into affine-only rebuild paths
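The graceful-skip behavior can be sketched generically as follows. This is only an illustration of the pattern using the standard library, not the code in oracle_cute_cpp; `has_module` and `run_oracle_checks` are hypothetical names.

```python
import importlib.util


def has_module(name: str) -> bool:
    """True if a top-level module or package called `name` is importable."""
    return importlib.util.find_spec(name) is not None


def run_oracle_checks() -> str:
    """Hypothetical entry point; reports a skip instead of raising."""
    if not has_module("nvidia"):
        return "skipped: nvidia package not installed"
    # The real comparison against the CuTe C++ oracle would go here.
    return "ran"


print(run_oracle_checks())
```

In a pytest suite, `pytest.importorskip("nvidia")` is a common alternative that marks the test as skipped rather than returning a status.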
Docs & build
- Composed-layout sections added to docs/layout_api.md, docs/tensor_api.md, docs/analysis_api.md, and docs/viz_api.md
- Composed-layout figures (composed_exact.png, composed_slice.png) generated and checked in
- Makefile clean target updated for tests/figures/
- pytest --draw conftest hook renders paper figures into tests/figures/
Tests
- tests/composed.py: 45 regression tests covering representation contract, trait behavior, exact composition, divide/product cascades, recursive chains, hierarchical inners, full-slice identity, multi-mode, Tensor.view, and generic analysis coverage
- tests/viz.py: cell-value and panel-color correctness tests for composed layouts
- tests/oracle_cute_cpp.py: nonzero-preoffset composition, recursive composed chains, composed logical_divide / logical_product, and make_tensor with ComposedLayout
- tests/paper_examples.py: full arXiv 2603.02298 coverage with exact offset-value assertions
Full Changelog: v0.2.1...v0.3.0