Starred repositories
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"
verl: Volcano Engine Reinforcement Learning for LLMs
R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
Knowledge-Reasoning Synergy Reinforcement Learning.
Make websites accessible for AI agents
Use PEFT or Full-parameter to finetune 500+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Llama3.2-Vi…
Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji.
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
A live-streamed development of RL tuning for LLM agents
No fortress, purely open ground. OpenManus is Coming.
Official repo of "Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs"
A repository that provides the power of mixture of workflows, a concept inspired by the mixture of agents.
A series of technical reports on Slow Thinking with LLMs
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
Official Repo for Open-Reasoner-Zero
Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas…
Tree-of-Debate converts scientific papers into LLM personas that debate their respective novelties. To emphasize structured, critical reasoning rather than focusing solely on outcomes, Tree-of-Deba…
[ICML2024] Official repo for paper "Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning"
The official PyTorch implementation of the paper "MotionGPT: Finetuned LLMs are General-Purpose Motion Generators"