Pinned repositories
Showing 7 of 7 repositories
- flash-linear-attention Public
🚀 Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
- native-sparse-attention Public
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
- flash-hybrid-attention Public
- flash-bidirectional-linear-attention Public
Triton implementation of bidirectional (non-causal) linear attention
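To make the shared theme of these repositories concrete, here is a minimal PyTorch sketch of non-causal linear attention: by applying a positive feature map to queries and keys, the softmax is dropped and the key-value product can be contracted first, reducing cost from O(n²·d) to O(n·d²). This is an illustrative sketch only (the `elu(x) + 1` feature map is one common choice), not the actual kernels from these repos, which are fused Triton implementations.

```python
import torch


def linear_attention(q, k, v):
    """Non-causal linear attention sketch.

    Computes phi(Q) @ (phi(K)^T @ V), normalized by phi(Q) @ sum(phi(K)),
    instead of softmax(Q K^T) @ V. Shapes: (batch, seq, dim).
    """
    # Positive feature map (illustrative choice, not the repos' exact one).
    phi = lambda x: torch.nn.functional.elu(x) + 1.0
    q, k = phi(q), phi(k)

    # Contract keys with values first: (batch, dim, dim) summary.
    kv = k.transpose(-2, -1) @ v

    # Normalizer: phi(Q) dotted with the sum of phi(K) over the sequence.
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1)  # (batch, seq, 1)

    return (q @ kv) / (z + 1e-6)


# Toy usage: batch=2, seq=8, dim=4.
q = torch.randn(2, 8, 4)
k = torch.randn(2, 8, 4)
v = torch.randn(2, 8, 4)
out = linear_attention(q, k, v)
assert out.shape == (2, 8, 4)
```

The key design point is associativity: because no row-wise softmax couples queries to all keys, `(Q K^T) V` can be regrouped as `Q (K^T V)`, which is what makes the sequence-length cost linear.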