
Pinned

  1. flash-linear-attention Public

    🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

    Python · 2.2k stars · 142 forks

  2. flame Public

    🔥 A minimal training framework for scaling FLA models

    Python · 90 stars · 14 forks

  3. native-sparse-attention Public

    🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

    Python · 601 stars · 30 forks
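
For orientation, here is a minimal PyTorch sketch of the causal linear-attention recurrence that kernels like those in flash-linear-attention accelerate. It is a plain illustration under assumed names, shapes, and an assumed elu+1 feature map, not the library's API; the repository's Triton kernels compute the same quantity in fused, chunked form.

    # Sketch only: assumed function name, shapes, and feature map; not the fla API.
    import torch
    import torch.nn.functional as F

    def causal_linear_attention(q, k, v, eps=1e-6):
        # q, k, v: (batch, heads, seq_len, head_dim)
        q, k = F.elu(q) + 1, F.elu(k) + 1              # positive feature map
        b, h, n, d = q.shape
        s = q.new_zeros(b, h, d, v.shape[-1])          # running sum of k_t v_t^T
        z = q.new_zeros(b, h, d)                       # running sum of k_t (normalizer)
        out = torch.empty_like(v)
        for t in range(n):                             # O(n) recurrence, no n x n matrix
            qt, kt, vt = q[:, :, t], k[:, :, t], v[:, :, t]
            s = s + kt.unsqueeze(-1) * vt.unsqueeze(-2)
            z = z + kt
            num = torch.einsum('bhd,bhde->bhe', qt, s)
            den = torch.einsum('bhd,bhd->bh', qt, z).clamp_min(eps)
            out[:, :, t] = num / den.unsqueeze(-1)
        return out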

Repositories

Showing 7 of 7 repositories
  • ThunderKittens Public (forked from HazyResearch/ThunderKittens)

    Tile primitives for speedy kernels

    Cuda · 1 star · MIT License · 130 forks · 0 issues · 0 PRs · Updated Mar 27, 2025
  • fla-zoo Public

    Flash-Linear-Attention models beyond language

    Python · 9 stars · 1 fork · 0 issues · 0 PRs · Updated Mar 27, 2025
  • flash-linear-attention Public

    🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

    Python · 2,173 stars · MIT License · 142 forks · 23 issues · 2 PRs · Updated Mar 27, 2025
  • flame Public

    🔥 A minimal training framework for scaling FLA models

    Python · 90 stars · MIT License · 14 forks · 0 issues · 0 PRs · Updated Mar 22, 2025
  • native-sparse-attention Public

    🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

    Python · 601 stars · MIT License · 30 forks · 7 issues · 0 PRs · Updated Mar 19, 2025
  • 7 stars · 0 forks · 0 issues · 0 PRs · Updated Mar 5, 2025
  • flash-bidirectional-linear-attention Public

    Triton implementation of bi-directional (non-causal) linear attention

    Python · 44 stars · MIT License · 1 fork · 0 issues · 0 PRs · Updated Feb 4, 2025
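
As a rough illustration of the block-selection idea behind native-sparse-attention: each query is scored against mean-pooled key-block representatives and attends only within its top-k blocks. The toy sketch below is dense-masked, non-causal, and covers only the selection step; all names and parameters are assumptions, and it is not the paper's algorithm or this repository's Triton implementation.

    # Toy sketch of top-k block selection; all names and parameters are assumptions.
    import torch

    def topk_block_sparse_attention(q, k, v, block_size=64, top_k=4):
        # q, k, v: (batch, heads, seq_len, dim); seq_len assumed divisible by block_size
        b, h, n, d = q.shape
        nb = n // block_size
        k_blocks = k.view(b, h, nb, block_size, d).mean(dim=3)   # block representatives
        block_scores = torch.einsum('bhqd,bhnd->bhqn', q, k_blocks)
        sel = block_scores.topk(min(top_k, nb), dim=-1).indices  # top-k blocks per query
        mask = torch.zeros(b, h, n, nb, dtype=torch.bool, device=q.device)
        mask.scatter_(-1, sel, True)                             # expose selected blocks only
        mask = mask.repeat_interleave(block_size, dim=-1)        # expand to per-token mask
        scores = torch.einsum('bhqd,bhkd->bhqk', q, k) / d ** 0.5
        scores = scores.masked_fill(~mask, float('-inf'))
        return torch.einsum('bhqk,bhkd->bhqd', scores.softmax(dim=-1), v)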
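
For flash-bidirectional-linear-attention, the non-causal case is simpler: with no causal mask, the key-value sum factorizes once over the whole sequence, giving cost linear in sequence length. A minimal sketch under the same assumed elu+1 feature map (illustrative only, not the repository's kernels):

    # Sketch only: assumed names and feature map; not this repository's API.
    import torch
    import torch.nn.functional as F

    def noncausal_linear_attention(q, k, v, eps=1e-6):
        # q, k, v: (batch, heads, seq_len, dim)
        q, k = F.elu(q) + 1, F.elu(k) + 1
        kv = torch.einsum('bhnd,bhne->bhde', k, v)     # global sum of k_n v_n^T
        z = k.sum(dim=2)                               # global sum of k_n
        num = torch.einsum('bhnd,bhde->bhne', q, kv)
        den = torch.einsum('bhnd,bhd->bhn', q, z).clamp_min(eps)
        return num / den.unsqueeze(-1)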

Top languages

Python, Cuda
