Stars
Good-lookin' diffs. Actually… nah… The best-lookin' diffs. 🎉
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Analyze computation-communication overlap in V3/R1.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Fast and memory-efficient exact attention
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Flax is a neural network library for JAX that is designed for flexibility.
Fully open reproduction of DeepSeek-R1
Use PEFT or full-parameter training to fine-tune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…
YaRN: Efficient Context Window Extension of Large Language Models
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App: [MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
TensorFlow code and pre-trained models for BERT
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A PyTorch implementation of the Transformer model in "Attention is All You Need".
Master programming by recreating your favorite technologies from scratch.
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, and more.
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
A best practices guide for day 2 operations, including operational excellence, security, reliability, performance efficiency, and cost optimization.