Understanding R1-Zero-Like Training: A Critical Perspective
Python 621 25
Forked from NVIDIA/Megatron-LM
Zero Bubble Pipeline Parallelism
Python 373 22
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Python 622 38
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
C++ 1.1k 108
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
Python 3.4k 196
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Python 782 66
Exchange-correlation functionals translated from libxc to JAX
Automatic Functional Differentiation in JAX
A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
A JAX-based Differentiable Density Functional Theory Framework for Materials
🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
[NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs".
🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
Megatron for Sailor2/Qwen2.5
The official repository for "SkyLadder: Better and Faster Pretraining via Context Window Scheduling"