
Pinned

  1. flash-linear-attention Public

    šŸš€ Efficient implementations of state-of-the-art linear attention models in Torch and Triton (a plain-PyTorch sketch of the underlying recurrence follows this list)

    Python · 2.2k stars · 144 forks

  2. flame Public

    šŸ”„ A minimal training framework for scaling FLA models

    Python · 94 stars · 15 forks

  3. native-sparse-attention Public

    šŸ³ Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

    Python · 608 stars · 30 forks
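
For orientation on the pinned flash-linear-attention project: its Triton kernels implement chunked, parallel forms of the linear-attention recurrence. The sketch below is a plain-PyTorch reference of that recurrence only, not the library's API; the function name and the elu+1 feature map are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def causal_linear_attention(q, k, v, eps=1e-6):
    """Reference recurrence for causal linear attention (illustrative sketch):
    state S_t = S_{t-1} + k_t v_t^T gives O(N) time in sequence length,
    versus softmax attention's O(N^2)."""
    B, H, N, D = q.shape
    q = F.elu(q) + 1                         # positive feature map (one common choice)
    k = F.elu(k) + 1
    S = q.new_zeros(B, H, D, v.shape[-1])    # running sum of k_t v_t^T
    z = q.new_zeros(B, H, D)                 # running sum of k_t (normalizer)
    out = []
    for t in range(N):
        S = S + torch.einsum('bhd,bhe->bhde', k[:, :, t], v[:, :, t])
        z = z + k[:, :, t]
        num = torch.einsum('bhd,bhde->bhe', q[:, :, t], S)
        den = torch.einsum('bhd,bhd->bh', q[:, :, t], z).unsqueeze(-1) + eps
        out.append(num / den)
    return torch.stack(out, dim=2)           # (B, H, N, D_v)

# Toy usage: tiny shapes just to exercise the recurrence.
q, k, v = (torch.randn(1, 2, 8, 16) for _ in range(3))
print(causal_linear_attention(q, k, v).shape)   # torch.Size([1, 2, 8, 16])
```

The per-step Python loop is exactly what the Triton kernels avoid: they process the sequence in chunks so the state update is computed with matrix multiplies rather than a sequential scan.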

Repositories

Showing 8 of 8 repositories
  • flash-linear-attention Public

    šŸš€ Efficient implementations of state-of-the-art linear attention models in Torch and Triton

    Python · 2,206 stars · MIT license · 144 forks · 27 issues · 5 pull requests · Updated Apr 4, 2025
  • fla-zoo Public

    Flash-Linear-Attention models beyond language

    Python · 9 stars · 1 fork · 0 issues · 0 pull requests · Updated Apr 3, 2025
  • flame Public

    šŸ”„ A minimal training framework for scaling FLA models

    Python · 94 stars · MIT license · 15 forks · 0 issues · 1 pull request · Updated Apr 1, 2025
  • fla-rl Public

    A minimal RL framework for scaling FLA models on long-horizon reasoning and agentic scenarios.

    4 stars · MIT license · 0 forks · 0 issues · 0 pull requests · Updated Apr 1, 2025
  • ThunderKittens Public (forked from HazyResearch/ThunderKittens)

    Tile primitives for speedy kernels

    Cuda · 2 stars · MIT license · 133 forks · 0 issues · 0 pull requests · Updated Mar 28, 2025
  • native-sparse-attention Public

    šŸ³ Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

    Python · 608 stars · MIT license · 30 forks · 9 issues · 0 pull requests · Updated Mar 19, 2025
  • 7 stars · 0 forks · 0 issues · 0 pull requests · Updated Mar 5, 2025
  • flash-bidirectional-linear-attention Public

    Triton implementation of bi-directional (non-causal) linear attention (a minimal sketch follows this list)

    Python · 44 stars · MIT license · 1 fork · 0 issues · 0 pull requests · Updated Feb 4, 2025
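
For the native-sparse-attention repo, here is a toy, non-causal PyTorch sketch of the block top-k selection idea from the NSA paper: each query scores key blocks, keeps only the best few, and attends densely within them. The function name, mean-pool scoring, and shapes are assumptions for illustration; the repo's actual Triton kernels are hardware-aligned and also include compression and sliding-window branches.

```python
import torch

def blockwise_topk_attention(q, k, v, block_size=4, topk=2):
    """Toy sketch of NSA-style block selection (non-causal, illustrative):
    score key blocks by their mean-pooled key, keep the top-k blocks per
    query, and run dense softmax attention only over those tokens."""
    B, H, N, D = q.shape
    nb = N // block_size                          # number of key/value blocks
    kb = k.view(B, H, nb, block_size, D)
    vb = v.view(B, H, nb, block_size, D)
    # Score blocks per query, then select the top-k block indices.
    block_scores = torch.einsum('bhnd,bhmd->bhnm', q, kb.mean(dim=3))  # (B,H,N,nb)
    sel = block_scores.topk(topk, dim=-1).indices                      # (B,H,N,topk)
    # Gather the selected key/value blocks for every query position.
    idx = sel[..., None, None].expand(-1, -1, -1, -1, block_size, D)
    k_sel = kb.unsqueeze(2).expand(-1, -1, N, -1, -1, -1).gather(3, idx)
    v_sel = vb.unsqueeze(2).expand(-1, -1, N, -1, -1, -1).gather(3, idx)
    k_sel = k_sel.reshape(B, H, N, topk * block_size, D)
    v_sel = v_sel.reshape(B, H, N, topk * block_size, D)
    # Dense softmax attention restricted to the selected tokens only.
    attn = (torch.einsum('bhnd,bhnkd->bhnk', q, k_sel) / D ** 0.5).softmax(dim=-1)
    return torch.einsum('bhnk,bhnkd->bhnd', attn, v_sel)

# Toy usage.
q, k, v = (torch.randn(1, 2, 16, 8) for _ in range(3))
print(blockwise_topk_attention(q, k, v).shape)   # torch.Size([1, 2, 16, 8])
```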
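
And for flash-bidirectional-linear-attention, the non-causal case is simpler still, because one key/value summary can be shared by every query position; a minimal sketch, where the feature map and names are illustrative assumptions rather than the repo's API:

```python
import torch
import torch.nn.functional as F

def bidirectional_linear_attention(q, k, v, eps=1e-6):
    """Non-causal linear attention sketch: all positions attend through one
    shared summary, O(N * d^2) rather than softmax attention's O(N^2 * d)."""
    q = F.elu(q) + 1                              # positive feature map
    k = F.elu(k) + 1
    kv = torch.einsum('bhnd,bhne->bhde', k, v)    # sum_n k_n v_n^T, shared by all queries
    z = k.sum(dim=2)                              # sum_n k_n (normalizer)
    num = torch.einsum('bhnd,bhde->bhne', q, kv)
    den = torch.einsum('bhnd,bhd->bhn', q, z).unsqueeze(-1) + eps
    return num / den
```

Unlike the causal recurrence sketched above, there is no per-step state here, which is why the bidirectional kernels can be written as a couple of batched matrix multiplies.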