v0.0.1
Features
Self-Contained Single-Node RL Training
AReno ships as a single Python package with its own CUDA kernels, tensor-parallel inference engine, and OpenAI-compatible serving — no external training/inference backend to wire together. The RL loop is a short cycle of Trainer calls:
from areno.api import Trainer, ArenoConfig, Areno, SamplingParams, gspo_loss_fn
trainer = Trainer(
world_size=1,
model_path="Qwen/Qwen3-0.6B",
backend_type=Areno,
custom_config=ArenoConfig(tp_size=1),
)
trainer.init()Or from the CLI:
areno train --ckpt Qwen/Qwen3-0.6B --dataset-path gsm8k:main \
--reward-fn-path examples/math/math_verify_reward.py --algo gspo --tp-size 4Swap algorithms via --algo: sft, dpo, gspo, grpo, ppo.
Agentic RL with Tool-Calling Trajectories
Built-in support for training agents that call tools and produce multi-turn trajectories. The trainer provides a local OpenAI-compatible proxy during rollout, parses tool calls, logs message-level trajectories, and assigns rewards at token boundaries — no external agent framework needed.
Key design choices:
- Continuous batching: new samples enter as completions finish, maximizing GPU utilization during multi-turn rollout
- Async rollout: separate event loop eliminates GIL contention with the training autograd engine
- Shared tool parser: same parsing logic in training rollout and
areno serve, ensuring consistent behavior
OpenAI-Compatible Serving
areno serve provides /v1/chat/completions and /v1/completions with tensor-parallel inference:
areno serve --model-path /path/to/model --tp-size 1 --port 8000Multi-Model Support
Per-family model adapters for Qwen3, Qwen3.5, LLaMA, Gemma4, Bailing, and MiniCPM-V, registered through areno/models/registry.py. Add a new model family by creating areno/models/<family>/ — no core changes needed.
PyPI Distribution
Install via pip install areno --no-build-isolation. Automated sdist release workflow publishes from git tags.
Fixes
- Unified reward function contract across all algorithms
- Fixed PPO loss function export missing from public API
- Improved agentic rollout batching and trajectory coalescing
- Fixed CUDA extension build for sdist/metadata-only installs
- Aligned serve sampling defaults with training rollout
Documentation and Examples
- README with installation guide, quick start, and feature highlights
- Developer contributing guide
- CLI and SDK operation guides for training, serving, and agentic rollout
- Agentic rollout examples with rollout sessions
What's Changed
- ci: publish pypi from release tag by @xsuler in 0c476db
- fix: unify reward function contract by @xsuler in 9b46a1e
- docs: update README add acknowledgements and split CLI section by @adohe in 53bd862
- fix: align serve sampling defaults by @xsuler in 1c10882
- feat: share tool parsing in serve by @xsuler in 260c424
- docs: fix agentic rollout examples by @xsuler in dda2ccf
- feat: add agentic trajectories and continuous batching by @xsuler in b745e42
- docs: use rollout session in examples by @xsuler in 1c266c8
- fix: improve agentic rollout batching by @xsuler in f9cb223
- docs: update agentic cli and sdk usage by @xsuler in 53a5c2d
- fix: log agentic message trajectories by @xsuler in 094c840
- feat: improve agentic rollout trajectories by @xsuler in 82b85da
- fix: restore agentic rollout coalesce wait by @xsuler in 5c27fd2
- feat: improve async rollout batching by @xsuler in 30c3064
- feat(agentic): add agentic functionality by @xsuler in 3c71838
- fix: issue and pr template format by @adohe in ebc990f
- chore: add various issue and pr templates by @adohe in 6fb6a32
- chore: update pyproject.toml to add more description and missing dep by @adohe in b47a26a
- chore: enhance git ignore file by @adohe in 1f4756a
- fix(api): add missing ppo loss fn export by @adohe in 07cdc2c
- docs: add developer contributing guide by @adohe in 10f8473
- docs: add operation guides for various agent by @adohe in 8b7ab43
- ci: fix sdist release packaging by @xsuler in 5d9ff5c
- fix(build): skip cuda extensions for sdist metadata by @xsuler in 72cefa0
- docs: add README.md and LICENSE file by @adohe in f2b487c
- ci: add pypi sdist release workflow by @xsuler in 9e002b3
Full Changelog: 3447d53...0c476db