ZZfive
Stars

RL

17 repositories

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Python · 16,650 stars · 4,953 forks · Updated Aug 1, 2024

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

Python · 2,310 stars · 421 forks · Updated Jul 9, 2024

✨✨Latest Advances on Multimodal Large Language Models

17,347 stars · 1,112 forks · Updated Feb 7, 2026

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python · 2,004 stars · 127 forks · Updated Nov 4, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python · 19,288 stars · 3,267 forks · Updated Feb 20, 2026

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python · 1,346 stars · 111 forks · Updated Jan 16, 2026

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python · 3,558 stars · 295 forks · Updated Feb 15, 2026

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python · 2,842 stars · 220 forks · Updated Feb 21, 2026

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)

Python · 9,012 stars · 882 forks · Updated Feb 21, 2026

📖 This is a repository for organizing papers, codes and other resources related to Visual Reinforcement Learning.

416 stars · 20 forks · Updated Feb 10, 2026

An Open-Source Large-Scale Reinforcement Learning Project for Search Agents

Python · 557 stars · 36 forks · Updated Nov 26, 2025

A version of verl to support diverse tool use

Python · 875 stars · 75 forks · Updated Feb 19, 2026

A fully open source framework for creating RL training swarms over the internet.

Python · 1,696 stars · 638 forks · Updated Jan 5, 2026

A Survey of Reinforcement Learning for Large Reasoning Models

TeX · 2,334 stars · 129 forks · Updated Nov 9, 2025

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.

Python · 595 stars · 61 forks · Updated Feb 15, 2026

Verlog: A Multi-turn RL framework for LLM agents

Python · 67 stars · 7 forks · Updated Feb 9, 2026

The absolute trainer to light up AI agents.

Python · 15,082 stars · 1,283 forks · Updated Feb 11, 2026