Starred repositories
A lightweight data processing framework built on DuckDB and 3FS.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
DuckDB is an analytical in-process SQL database management system
The official repository of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
Toolkit for linearizing PDFs for LLM datasets/training
SGLang is a fast serving framework for large language models and vision language models.
verl: Volcano Engine Reinforcement Learning for LLMs
Search-R1: an efficient, scalable RL training framework, based on veRL, for LLMs that interleave reasoning with search engine calls
Sparse Decentralized Collaborative Simultaneous Localization and Mapping Framework for Multi-Robot Systems
[ETH Course] Exercise materials for the Machine Learning on Microcontrollers course
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek-V3/R1 training.
Quantized attention that achieves speedups of 2.1-3.1x and 2.7-5.1x over FlashAttention2 and xformers, respectively, without degrading end-to-end metrics across various models.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
Source code for "Building Cryptographic Proofs from Hash Functions"
Runtime for executing procedural macros as WebAssembly
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Pretraining code for a large-scale depth-recurrent language model
An open-source deep research clone: an AI agent that reasons over large amounts of web data extracted with Firecrawl