RL
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
✨✨Latest Advances on Multimodal Large Language Models
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
verl: Volcano Engine Reinforcement Learning for LLMs
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)
📖 A repository for organizing papers, code, and other resources related to Visual Reinforcement Learning.
An Open-Source Large-Scale Reinforcement Learning Project for Search Agents
A version of verl that supports diverse tool use
A fully open-source framework for creating RL training swarms over the internet.
A Survey of Reinforcement Learning for Large Reasoning Models
Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.
Verlog: A Multi-turn RL framework for LLM agents
The absolute trainer to light up AI agents.
