Alpha Release

Latest

@vhrabar released this 01 Jul 11:20
90630b4

Morphottention v0.2.0 (Alpha)

Added

  • Backward pass: MorphoAttention is now fully differentiable. A fused CUDA backward kernel computes gradients for all inputs (x, W_phi, gate_q, gate_k, W_V), wired into autograd via MorphoAttentionFunction.
    • Backward pass signature and autograd wiring.
    • Shared-memory carve-out and data load/store paths for the backward kernel.
    • Central K/V loop ported from the forward pass, with on-the-fly log-sum-exp (LSE) recompute so that no attention matrix is saved.
    • Backward Phi projection GEMM.
    • Stage-1 and stage-2 gradient computation with the full SMEM carve and scratch layout.
  • Matmul kernels: runtime-dynamic (RT-dyn) matmul and transpose support for frag_a, backing the backward GEMMs.
  • Packaging: prebuilt wheels for additional CPython versions (3.12–3.14) and a build/release workflow.
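
To illustrate the "on-the-fly LSE recompute (no saved attention matrix)" idea from the backward-pass notes, here is a minimal pure-Python sketch. It assumes plain, unscaled softmax attention over explicit q, k, v lists. It is not MorphoAttention's actual math or kernel: the Phi projection, the gates and the SMEM layout are not modelled, and the names `attention_forward` and `attention_backward` are hypothetical. The backward function recomputes each row's scores and log-sum-exp instead of reading stored attention probabilities.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def row_lse(scores):
    """Numerically stable log-sum-exp of one row of attention scores."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


def attention_forward(q, k, v):
    """Unscaled softmax attention; returns only the output rows."""
    out = []
    for qi in q:
        scores = [dot(qi, kj) for kj in k]
        lse = row_lse(scores)
        probs = [math.exp(s - lse) for s in scores]
        out.append([sum(p * vj[d] for p, vj in zip(probs, v))
                    for d in range(len(v[0]))])
    return out


def attention_backward(q, k, v, d_out):
    """Gradients w.r.t. q, k, v without a saved attention matrix.

    Each query row's scores and LSE are recomputed here, so only one
    row of probabilities exists at any time.
    """
    dq = [[0.0] * len(q[0]) for _ in q]
    dk = [[0.0] * len(k[0]) for _ in k]
    dv = [[0.0] * len(v[0]) for _ in v]
    for i, qi in enumerate(q):
        scores = [dot(qi, kj) for kj in k]            # recomputed, not loaded
        lse = row_lse(scores)
        probs = [math.exp(s - lse) for s in scores]   # P_ij = exp(S_ij - LSE_i)
        d_probs = [dot(d_out[i], vj) for vj in v]     # dP_ij = dO_i . V_j
        delta = dot(probs, d_probs)                   # sum_j P_ij * dP_ij
        for j, (p, dp) in enumerate(zip(probs, d_probs)):
            ds = p * (dp - delta)                     # softmax backward
            for d in range(len(v[0])):
                dv[j][d] += p * d_out[i][d]
            for d in range(len(q[0])):
                dq[i][d] += ds * k[j][d]
                dk[j][d] += ds * qi[d]
    return dq, dk, dv
```

The trade-off in this sketch is extra compute in the backward pass in exchange for not keeping an n-by-m probability matrix in memory between passes. How the fused CUDA kernel balances this is not described in these notes.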