tensor-layouts 0.2

@jduprat released this 06 Apr 19:02

What's Changed

Tensor class

  • Storage-backed tensors with coordinate indexing (tensor[i, j]), write-through, view semantics on slicing, None as free-dimension marker, Tensor.view(layout) for same-storage reinterpretation, and __str__ with offset notation
  • size(), rank(), cosize(), depth(), mode(), flatten(), image() accept Tensors transparently
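
The view and write-through semantics described above can be illustrated with a small self-contained sketch. ToyLayout and ToyTensor are hypothetical stand-ins written for this example, not the tensor-layouts classes, and the sketch only handles flat (non-hierarchical) shapes. It assumes a tensor is a (storage, offset, layout) triple and that None in a coordinate leaves that dimension free.

```python
class ToyLayout:
    """Hypothetical flat layout: offset = sum(coord[i] * stride[i])."""
    def __init__(self, shape, stride):
        self.shape = tuple(shape)
        self.stride = tuple(stride)

    def __call__(self, *coord):
        return sum(c * s for c, s in zip(coord, self.stride))


class ToyTensor:
    """Hypothetical (storage, offset, layout) triple; slices share storage."""
    def __init__(self, storage, layout, offset=0):
        self.storage = storage
        self.layout = layout
        self.offset = offset

    def __getitem__(self, coord):
        coord = coord if isinstance(coord, tuple) else (coord,)
        if any(c is None for c in coord):
            # None marks a free dimension: fold the fixed coordinates into
            # the offset and keep only the free modes in the sub-layout.
            modes = list(zip(coord, self.layout.shape, self.layout.stride))
            fixed = sum(c * s for c, _, s in modes if c is not None)
            free = [(n, s) for c, n, s in modes if c is None]
            sub = ToyLayout([n for n, _ in free], [s for _, s in free])
            return ToyTensor(self.storage, sub, self.offset + fixed)
        return self.storage[self.offset + self.layout(*coord)]

    def __setitem__(self, coord, value):
        coord = coord if isinstance(coord, tuple) else (coord,)
        self.storage[self.offset + self.layout(*coord)] = value

    def view(self, layout):
        # Same storage and offset, reinterpreted through a different layout.
        return ToyTensor(self.storage, layout, self.offset)


storage = list(range(12))
t = ToyTensor(storage, ToyLayout((3, 4), (4, 1)))  # 3x4, row-major
row = t[1, None]   # view of row 1; no data is copied
row[2] = 99        # writes through to the shared storage
transposed = t.view(ToyLayout((4, 3), (1, 4)))
```

After the write, t[1, 2], storage[6], and transposed[2, 1] all refer to the same element, which is the behavior "view semantics" implies.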

GPU atom definitions

  • Intel AMX tile matrix multiply atoms
  • Intel Xe GPU DPAS atoms
  • AMD RDNA3/RDNA4 WMMA atoms
  • MMAAtom.__str__ / CopyAtom.__str__ for concise display
  • Community feedback request notice added to all atom definition files

Analysis

  • to_F2_matrix() — convert power-of-2 layouts to binary matrix representation over GF(2); validated against Triton's LinearLayoutConversionsTest.cpp
  • TV-aware vectorized access modeling — bank_conflicts(), coalescing_efficiency(), and segment_analysis() now iterate all values per thread for multi-mode (TV) layouts, correctly modeling vectorized loads
  • is_contiguous() as an alias for is_bijective()
  • weakly_congruent() for partial-order profile matching
  • element_bytes now required for bank_conflicts(), coalescing_efficiency(), etc.
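
To make the GF(2) representation concrete, here is a minimal sketch of how a flat layout with power-of-2 shapes and strides can be written as a binary matrix. The helpers f2_matrix and apply_f2 are hypothetical names for this illustration and are not the library's to_F2_matrix(); the sketch also assumes power-of-2 (or zero) strides and the colexicographic index order used by CuTe, where mode 0 varies fastest.

```python
def f2_matrix(shape, stride):
    """Return M (rows = offset bits, columns = index bits) over GF(2)."""
    def is_pow2(n):
        return n > 0 and n & (n - 1) == 0

    assert all(is_pow2(s) for s in shape), "shapes must be powers of 2"
    assert all(d == 0 or is_pow2(d) for d in stride), "assumed for this sketch"

    # Bit k of mode i is a basis vector whose image is stride[i] << k.
    columns = []
    for s, d in zip(shape, stride):
        for k in range(s.bit_length() - 1):
            columns.append(d << k)

    n_rows = max((c.bit_length() for c in columns), default=0)
    return [[(c >> r) & 1 for c in columns] for r in range(n_rows)]


def apply_f2(matrix, index):
    """Compute the offset as a matrix-vector product over GF(2)."""
    n_cols = len(matrix[0]) if matrix else 0
    bits = [(index >> j) & 1 for j in range(n_cols)]
    out = 0
    for r, row in enumerate(matrix):
        parity = sum(a & b for a, b in zip(row, bits)) % 2  # XOR of products
        out |= parity << r
    return out


# Shape (4, 8) with stride (8, 1): index = i + 4*j, offset = 8*i + j.
M = f2_matrix((4, 8), (8, 1))
offsets = [apply_f2(M, idx) for idx in range(32)]
```

Because every basis image here is a distinct power of two, XOR and ordinary addition agree, so the matrix product reproduces the layout's offsets exactly.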

Visualization

  • draw_gemm() for matmul spatial arrangement of A, B, C operand panels
  • Hierarchical layout support in draw_composite, with auto-computed panel_size and rendering options passed through **kwargs
  • cell_labels parameter for user-supplied per-cell text
  • interleave_colors option for hue-grouped palette
  • transpose option for rank-1 column vectors
  • precision parameter for float cell labels
  • Removed show_*() functions; draw_*(filename=None) now handles inline display, which fixes a double render in Jupyter
  • Layout.__repr__ now returns an eval-safe constructor string, distinct from Layout.__str__
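
The difference between an eval-safe repr and a display string can be shown with a toy class. ToyLayout below is hypothetical and its output formats are assumptions chosen for illustration; the library's actual strings may differ.

```python
class ToyLayout:
    """Hypothetical class contrasting repr (round-trips) with str (display)."""
    def __init__(self, shape, stride):
        self.shape, self.stride = shape, stride

    def __repr__(self):
        # Constructor expression: eval(repr(x)) rebuilds an equal object.
        return f"ToyLayout({self.shape!r}, {self.stride!r})"

    def __str__(self):
        # Compact shape:stride form meant for humans, not for eval().
        return f"{self.shape}:{self.stride}"

    def __eq__(self, other):
        return (isinstance(other, ToyLayout)
                and (self.shape, self.stride) == (other.shape, other.stride))


layout = ToyLayout((4, 8), (8, 1))
rebuilt = eval(repr(layout))  # round-trips through the constructor string
```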

Notebooks

  • algorithms.ipynb — COPY, GEMM, Grouped GEMM, REDUCE, Epilogue Fusion, and Online Softmax visualized with layout algebra
  • applications.ipynb — six layout algebra patterns from arXiv:2603.02298v1

Bug fixes

  • rank() for single-mode Layouts: rank(Layout(32)) returns 1, not 0
  • idx2crd() coordinate wrapping for scalar shapes
  • idx2crd() / crd2flat() to accept Layout objects as shape argument
  • crd2crd() to thread src_shape through per-mode recursion
  • explain() crash with tuple tilers
  • draw_slice() for 1D layouts
  • draw_composite() auto-sizing to respect grid_rows/grid_cols overrides
  • Rank≥3 panel splitting to match CuTe convention
  • slice_modes to preserve hierarchical mode boundaries
  • Tensor slicing for hierarchical specs with nested Nones
  • Trailing comma in Layout.__str__ for 1-tuple shapes
  • per_group analysis iteration for TV layouts
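
For readers unfamiliar with the conventions behind the rank() and idx2crd() fixes, the sketch below shows one plausible reading of them for flat shapes. toy_rank and toy_idx2crd are hypothetical helpers, not the library's functions. The sketch assumes that a bare integer shape counts as a single mode, that coordinates are generated in colexicographic order (mode 0 fastest, as in CuTe), and that out-of-range indices wrap modulo each extent.

```python
def toy_rank(shape):
    """A bare int is one mode; a tuple has one mode per top-level entry."""
    return 1 if isinstance(shape, int) else len(shape)


def toy_idx2crd(idx, shape):
    """Map a linear index to a coordinate, mode 0 varying fastest."""
    if isinstance(shape, int):
        return idx % shape          # scalar shape: wrap into [0, shape)
    crd = []
    for extent in shape:
        crd.append(idx % extent)    # coordinate within this mode
        idx //= extent              # carry the remainder to the next mode
    return tuple(crd)


def toy_crd2idx(crd, shape):
    """Inverse of toy_idx2crd for in-range coordinates."""
    idx, scale = 0, 1
    for c, extent in zip(crd, shape):
        idx += c * scale
        scale *= extent
    return idx


coords = [toy_idx2crd(i, (2, 3)) for i in range(6)]
```

In this sketch toy_rank(32) is 1, matching the fixed behavior described above for rank(Layout(32)).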

Robustness

  • Type validation for Layout shape and stride arguments
  • Grid overflow warning when panels exceed capacity instead of silently dropping
  • Duck-type Tensor detection in viz instead of isinstance()

Docs & build

  • Missing license headers added to all source files
  • examples and check targets in Makefile

Full Changelog: v0.1.1...v0.2.0