nano-verl

A lightweight verl-style RL training framework implemented from scratch.

Core Features

Readability: nano-verl is about 6K lines of code, compared with 90K+ lines in verl.
Distributed training: uses FSDP as the training backend and vLLM as the inference backend, with Ray for distributed management. Supports rollout load balancing, dynamic batching, padding removal, and more.
Asynchronous support: supports one-step-off-policy asynchronous training, enabled by setting trainer.mode=one_step_off.
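
To make the one-step-off-policy idea concrete, here is a toy sketch of the scheduling pattern only. It is not nano-verl's implementation: `generate_rollouts`, `train_step`, and `one_step_off_loop` are hypothetical stand-ins, and a thread pool replaces the real Ray, FSDP, and vLLM workers. The point it illustrates is that the batch for the next step is generated while the current step trains, so from the second step onward each batch comes from weights that are one update old.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_rollouts(policy_version: int, batch_id: int) -> dict:
    """Hypothetical stand-in for the inference engine.

    Tags the batch with the policy version that produced it.
    """
    return {"batch_id": batch_id, "generated_with": policy_version}


def train_step(policy_version: int, rollouts: dict) -> int:
    """Hypothetical stand-in for the trainer: returns the updated version."""
    return policy_version + 1


def one_step_off_loop(num_steps: int) -> list[tuple[int, int]]:
    """Return (trainer_version, rollout_version) for every training step."""
    log = []
    policy_version = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Prime the pipeline with the first batch.
        pending = pool.submit(generate_rollouts, policy_version, 0)
        for step in range(num_steps):
            rollouts = pending.result()
            if step + 1 < num_steps:
                # Launch generation of the NEXT batch with the current
                # weights, before this step's update is applied.
                pending = pool.submit(generate_rollouts, policy_version, step + 1)
            log.append((policy_version, rollouts["generated_with"]))
            policy_version = train_step(policy_version, rollouts)
    return log


if __name__ == "__main__":
    for trainer_v, rollout_v in one_step_off_loop(4):
        print(f"trainer at v{trainer_v}, batch generated with v{rollout_v}")
```

The first step is on-policy because the pipeline has just been primed; every later step trains on a batch that is exactly one version stale, which is the trade made to keep generation and training busy at the same time.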

Installation

Clone the code:

git clone https://github.com/kidding-404/nano-verl.git
cd nano-verl

Install dependencies with uv:

uv sync

Find the flash-attn wheel that matches your Python, PyTorch, and CUDA versions, then install it separately:

uv run pip install <flash_attn_wheel_url>
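
Prebuilt flash-attn wheels are built per Python, PyTorch, CUDA, and C++ ABI combination, so it helps to know those four values before searching. The helper below is a small sketch, not part of nano-verl: `wheel_tags` and `detect_tags` are hypothetical names, and the exact tag format in wheel filenames differs between flash-attn releases (some abbreviate the CUDA version), so treat the output as search hints rather than an exact filename.

```python
import sys


def wheel_tags(python_version, torch_version, cuda_version, cxx11_abi):
    """Build the tag fragments that typically appear in a flash-attn wheel name.

    python_version: (major, minor) tuple, e.g. (3, 10)
    torch_version:  string such as "2.3.1+cu121"
    cuda_version:   string such as "12.1", or None for CPU-only builds
    cxx11_abi:      bool, whether PyTorch was built with the C++11 ABI
    """
    major_minor = torch_version.split("+")[0].split(".")[:2]
    return {
        "python": f"cp{python_version[0]}{python_version[1]}",
        "torch": "torch" + ".".join(major_minor),
        "cuda": ("cu" + cuda_version.replace(".", "")) if cuda_version else "cpu",
        "abi": "cxx11abi" + ("TRUE" if cxx11_abi else "FALSE"),
    }


def detect_tags():
    """Read the values from the current environment; None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    return wheel_tags(
        sys.version_info[:2],
        torch.__version__,
        torch.version.cuda,
        torch.compiled_with_cxx11_abi(),
    )


if __name__ == "__main__":
    print(detect_tags())
```

Saved under a name of your choice and run with uv run python inside the project environment, it prints the tags for the PyTorch build that uv sync installed.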

Quick Start

Train Qwen3-0.6B on the GSM8K dataset on a single GPU:

uv run python main.py --config configs/gsm8k-qwen3-0.6b-single-gpu.yaml

You can also train Qwen3-1.7B asynchronously on two GPUs:

uv run python main.py --config configs/gsm8k-qwen3-1.7b-1p1-async.yaml

Benchmark

Test configuration:

Model: Qwen3-4B
Training set: DAPO-17K
Reward: accuracy reward (+1 if correct, -1 otherwise)
Steps: 150
Global batch size: 64
Rollout n: 8
Prompt length: 1024
Response length: 8192
Hardware: 1 node, 8 x NVIDIA H100 80GB HBM3
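
For a sense of scale, the settings above can be turned into a per-step budget. The sketch below rests on two assumptions that the list does not state: that the global batch size counts prompts and "Rollout n" is the number of responses sampled per prompt, and that the accuracy reward compares a final answer by exact string match. The function `accuracy_reward` is a hypothetical illustration, not the repository's reward code.

```python
def accuracy_reward(predicted: str, reference: str) -> float:
    """+1 for a correct answer, -1 otherwise.

    Exact string match after stripping whitespace is an assumption;
    the real answer-checking rule may be more lenient.
    """
    return 1.0 if predicted.strip() == reference.strip() else -1.0


# Values copied from the test configuration.
GLOBAL_BATCH_SIZE = 64   # assumed to count prompts per step
ROLLOUT_N = 8            # assumed to be responses sampled per prompt
PROMPT_LEN = 1024
RESPONSE_LEN = 8192
STEPS = 150

responses_per_step = GLOBAL_BATCH_SIZE * ROLLOUT_N          # 512
max_tokens_per_sequence = PROMPT_LEN + RESPONSE_LEN         # 9216
max_tokens_per_step = responses_per_step * max_tokens_per_sequence
total_responses = responses_per_step * STEPS                # 76800

if __name__ == "__main__":
    print(f"responses per step:        {responses_per_step}")
    print(f"max tokens per sequence:   {max_tokens_per_sequence}")
    print(f"upper bound tokens / step: {max_tokens_per_step}")
    print(f"responses over the run:    {total_responses}")
```

The token figure is an upper bound: it is reached only if every prompt and response fills its full length.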

Reward curve: [figure]

Performance comparison:

Setting	AIME24 avg16	AIME24 pass@16	AIME25 avg16	AIME25 pass@16
Qwen3-4B Base	0.4333	0.7000	0.3563	0.5333
Qwen3-4B + verl	0.5313	0.8333	0.4417	0.6667
Qwen3-4B + nano-verl	0.535	0.8333	0.429	0.6667
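
The table does not define its two metrics, so the sketch below spells out the usual reading as an assumption: avg16 is the accuracy over 16 sampled responses per problem, averaged over problems, and pass@16 is the fraction of problems where at least one of the 16 responses is correct. The function names are hypothetical and the data in the example is made up purely to show the calculation.

```python
def avg_at_k(results: list[list[bool]]) -> float:
    """Mean per-problem accuracy, where each inner list holds k sample outcomes."""
    per_problem = [sum(samples) / len(samples) for samples in results]
    return sum(per_problem) / len(per_problem)


def pass_at_k(results: list[list[bool]]) -> float:
    """Fraction of problems with at least one correct sample among the k."""
    solved = [any(samples) for samples in results]
    return sum(solved) / len(solved)


if __name__ == "__main__":
    # Made-up outcomes: 3 problems, 4 samples each (k=4 instead of 16).
    toy = [
        [True, False, False, False],   # solved once
        [False, False, False, False],  # never solved
        [True, True, True, True],      # always solved
    ]
    print(f"avg@4  = {avg_at_k(toy):.4f}")
    print(f"pass@4 = {pass_at_k(toy):.4f}")
```

Under this reading, the pass@16 values in the table are consistent with a 30-problem test set: 0.7000 is 21/30 and 0.8333 is 25/30.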
