tensor-layouts 0.2
What's Changed
Tensor class
- Storage-backed tensors with coordinate indexing (tensor[i, j]), write-through, view semantics on slicing, None as free-dimension marker, `Tensor.view(layout)` for same-storage reinterpretation, and `__str__` with offset notation
- `size()`, `rank()`, `cosize()`, `depth()`, `mode()`, `flatten()`, `image()` accept Tensors transparently
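
To make the view and write-through semantics concrete, here is a minimal, self-contained sketch. It does not use the library: `MiniTensor` and its methods are hypothetical names invented for illustration, it handles flat (non-hierarchical) shapes only, and the real Tensor class may behave differently in its details.

```python
class MiniTensor:
    """Toy storage-backed tensor: a shared list plus (shape, stride, offset).

    Hypothetical illustration only; not the tensor-layouts Tensor class.
    """

    def __init__(self, storage, shape, stride, offset=0):
        self.storage = storage          # shared between views, never copied
        self.shape = tuple(shape)
        self.stride = tuple(stride)
        self.offset = offset

    def _split(self, key):
        """Return (offset, free_shape, free_stride) for a coordinate key."""
        key = key if isinstance(key, tuple) else (key,)
        if len(key) != len(self.shape):
            raise IndexError("expected one entry per mode")
        offset, free_shape, free_stride = self.offset, [], []
        for c, n, d in zip(key, self.shape, self.stride):
            if c is None:               # None marks a free (kept) dimension
                free_shape.append(n)
                free_stride.append(d)
            else:
                if not 0 <= c < n:
                    raise IndexError("coordinate out of range")
                offset += c * d
        return offset, free_shape, free_stride

    def __getitem__(self, key):
        offset, shape, stride = self._split(key)
        if shape:                       # some modes left free: return a view
            return MiniTensor(self.storage, shape, stride, offset)
        return self.storage[offset]

    def __setitem__(self, key, value):
        offset, shape, _ = self._split(key)
        if shape:
            raise IndexError("assignment needs a full coordinate")
        self.storage[offset] = value    # write-through to the shared storage

    def view(self, shape, stride):
        """Reinterpret the same storage with a different shape and stride."""
        return MiniTensor(self.storage, shape, stride, self.offset)


data = list(range(12))
t = MiniTensor(data, shape=(3, 4), stride=(4, 1))   # row-major 3x4
row = t[1, None]                 # view of row 1, shares `data`
row[2] = 99                      # writes data[1*4 + 2]
tt = t.view(shape=(4, 3), stride=(1, 4))            # transposed reinterpretation
```

After the write through `row`, the same element is visible as `t[1, 2]`, as `tt[2, 1]`, and as `data[6]`, because all three objects index one underlying list.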
GPU atom definitions
- Intel AMX tile matrix multiply atoms
- Intel Xe GPU DPAS atoms
- AMD RDNA3/RDNA4 WMMA atoms
- `MMAAtom.__str__` / `CopyAtom.__str__` for concise display
- Community feedback request notice added to all atom definition files
Analysis
- `to_F2_matrix()`: convert power-of-2 layouts to binary matrix representation over GF(2); validated against Triton's LinearLayoutConversionsTest.cpp
- TV-aware vectorized access modeling: `bank_conflicts()`, `coalescing_efficiency()`, and `segment_analysis()` now iterate all values per thread for multi-mode (TV) layouts, correctly modeling vectorized loads
- `is_contiguous()` as an alias for `is_bijective()`
- `weakly_congruent()` for partial-order profile matching
- element_bytes now required for `bank_conflicts()`, `coalescing_efficiency()`, etc.
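
As an illustration of what a binary matrix over GF(2) means for a layout, the following standalone sketch builds one for a flat power-of-2 layout. It is not the library's `to_F2_matrix()` implementation: the helper names (`layout_offset`, `f2_matrix`, `apply_f2`) are hypothetical, only flat shapes are handled, and the representation is exact only when summing the per-bit contributions produces no carries (for example, power-of-2 strides that occupy disjoint bit ranges).

```python
def layout_offset(index, shape, stride):
    """Colexicographic (leftmost mode fastest) index -> offset, flat shapes only."""
    offset = 0
    for n, d in zip(shape, stride):
        offset += (index % n) * d
        index //= n
    return offset


def f2_matrix(shape, stride):
    """Return a 0/1 matrix: rows are output bits, columns are input bits."""
    size = 1
    for n in shape:
        if n < 1 or n & (n - 1):
            raise ValueError("every mode size must be a power of two")
        size *= n
    in_bits = size.bit_length() - 1
    # Column b is the image of the basis index 1 << b.
    cols = [layout_offset(1 << b, shape, stride) for b in range(in_bits)]
    out_bits = max((c.bit_length() for c in cols), default=0)
    return [[(c >> r) & 1 for c in cols] for r in range(out_bits)]


def apply_f2(matrix, index):
    """Matrix-vector product over GF(2): AND multiplies, XOR adds."""
    offset = 0
    for r, row in enumerate(matrix):
        bit = 0
        for b, entry in enumerate(row):
            bit ^= entry & (index >> b) & 1
        offset |= bit << r
    return offset


shape, stride = (4, 8), (8, 1)        # 4x8 tile with row-major offsets
M = f2_matrix(shape, stride)

# The matrix reproduces the layout on every index because no carries occur.
assert all(apply_f2(M, i) == layout_offset(i, shape, stride) for i in range(32))
```

For this layout the matrix is a permutation matrix: the two low index bits move to offset bits 3 and 4, and the three high index bits move to offset bits 0 through 2. A stride such as 3 would break the equivalence, since integer addition and XOR then disagree.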
Visualization
- `draw_gemm()` for matmul spatial arrangement of A, B, C operand panels
- Hierarchical layout support in draw_composite, with auto-computed panel_size and rendering options passed through **kwargs
- cell_labels parameter for user-supplied per-cell text
- interleave_colors option for hue-grouped palette
- transpose option for rank-1 column vectors
- precision parameter for float cell labels
- Remove `show_*()` functions: `draw_*(filename=None)` handles inline display, fixing double-render in Jupyter
- `Layout.__repr__` now returns eval-safe constructor string, distinct from `Layout.__str__`
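
The distinction between an eval-safe `__repr__` and a display-oriented `__str__` can be sketched with a toy class. `MiniLayout` is a hypothetical stand-in, and the exact strings the library produces are not shown here; the `shape:stride` form below is only an assumed display style.

```python
class MiniLayout:
    """Hypothetical stand-in used only to contrast __repr__ and __str__."""

    def __init__(self, shape, stride):
        self.shape = shape
        self.stride = stride

    def __eq__(self, other):
        return (isinstance(other, MiniLayout)
                and (self.shape, self.stride) == (other.shape, other.stride))

    def __repr__(self):
        # Constructor call that eval() can turn back into an equal object.
        return f"MiniLayout({self.shape!r}, {self.stride!r})"

    def __str__(self):
        # Compact display form (assumed style), not meant to be evaluated.
        return f"{self.shape}:{self.stride}"


a = MiniLayout((4, 8), (8, 1))
b = eval(repr(a))        # round-trips through the constructor string
```

Here `repr(a)` is a constructor expression, so `b == a` holds, while `str(a)` is a shorter human-readable form that is not valid Python.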
Notebooks
- algorithms.ipynb — COPY, GEMM, Grouped GEMM, REDUCE, Epilogue Fusion, and Online Softmax visualized with layout algebra
- applications.ipynb — six layout algebra patterns from arXiv:2603.02298v1
Bug fixes
- `rank()` for single-mode Layouts: `rank(Layout(32))` returns 1, not 0
- `idx2crd()` coordinate wrapping for scalar shapes
- `idx2crd()` / `crd2flat()` to accept Layout objects as shape argument
- `crd2crd()` to thread src_shape through per-mode recursion
- `explain()` crash with tuple tilers
- `draw_slice()` for 1D layouts
- `draw_composite()` auto-sizing to respect grid_rows/grid_cols overrides
- Rank≥3 panel splitting to match CuTe convention
- slice_modes to preserve hierarchical mode boundaries
- Tensor slicing for hierarchical specs with nested Nones
- Trailing comma in `Layout.__str__` for 1-tuple shapes
- per_group analysis iteration for TV layouts
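
For readers unfamiliar with the conventions behind the `rank()` and `idx2crd()` fixes, here is a small standalone sketch. The functions below are simplified reimplementations written for this note, not the library's code: they take plain ints and flat tuples rather than Layout objects, and the modulo wrapping for scalar shapes is an assumption about the intended behavior.

```python
def rank(shape):
    """Number of top-level modes; a bare int counts as one mode."""
    return len(shape) if isinstance(shape, tuple) else 1


def idx2crd(index, shape):
    """Colexicographic index -> coordinate for an int or flat-tuple shape."""
    if isinstance(shape, int):
        return index % shape            # assumed wrapping for scalar shapes
    coord = []
    for n in shape:
        coord.append(index % n)         # leftmost mode varies fastest
        index //= n
    return tuple(coord)


def crd2idx(coord, shape):
    """Inverse of idx2crd for in-range coordinates."""
    if isinstance(shape, int):
        return coord
    index, scale = 0, 1
    for c, n in zip(coord, shape):
        index += c * scale
        scale *= n
    return index


# Every in-range index survives a round trip through its coordinate.
assert all(crd2idx(idx2crd(i, (4, 8)), (4, 8)) == i for i in range(32))
```

With these definitions, `rank(32)` is 1 and `rank((4, 8))` is 2, `idx2crd(5, (4, 8))` is `(1, 1)`, and an out-of-range index on a scalar shape wraps, so `idx2crd(35, 32)` is 3.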
Robustness
- Type validation for Layout shape and stride arguments
- Grid overflow warning when panels exceed capacity instead of silently dropping
- Duck-type Tensor detection in viz instead of `isinstance()`
Docs & build
- Missing license headers added to all source files
- examples and check targets in Makefile
Full Changelog: v0.1.1...v0.2.0