- Beijing, China
Stars
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Lightweight version of MAPPO to help you quickly migrate to your local environment.
This is the official implementation of Multi-Agent PPO (MAPPO).
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
Align Anything: Training All-modality Model with Feedback
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Scalable toolkit for efficient model alignment
Accelerating new GitHub Actions workflows
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Official Repo for Open-Reasoner-Zero
Code for the paper "Fine-Tuning Language Models from Human Preferences"
Fully open reproduction of DeepSeek-R1
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
verl: Volcano Engine Reinforcement Learning for LLMs
Janus-Series: Unified Multimodal Understanding and Generation Models
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
This framework provides out-of-the-box implementations of Referential Games variants in order to study the emergence of artificial languages using deep learning, relying on PyTorch (https://www.pyt…
EGG: Emergence of lanGuage in Games
Stable Diffusion web UI
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.