Release v0.0.1 · inclusionAI/AReno

Features

Self-Contained Single-Node RL Training

AReno ships as a single Python package with its own CUDA kernels, tensor-parallel inference engine, and OpenAI-compatible serving — no external training/inference backend to wire together. The RL loop is a short cycle of Trainer calls:

from areno.api import Trainer, ArenoConfig, Areno, SamplingParams, gspo_loss_fn

trainer = Trainer(
    world_size=1,
    model_path="Qwen/Qwen3-0.6B",
    backend_type=Areno,
    custom_config=ArenoConfig(tp_size=1),
)
trainer.init()

Or from the CLI:

areno train --ckpt Qwen/Qwen3-0.6B --dataset-path gsm8k:main \
  --reward-fn-path examples/math/math_verify_reward.py --algo gspo --tp-size 4

Swap algorithms via --algo: sft, dpo, gspo, grpo, ppo.

Agentic RL with Tool-Calling Trajectories

Built-in support for training agents that call tools and produce multi-turn trajectories. The trainer provides a local OpenAI-compatible proxy during rollout, parses tool calls, logs message-level trajectories, and assigns rewards at token boundaries — no external agent framework needed.

Key design choices:

Continuous batching: new samples enter as completions finish, maximizing GPU utilization during multi-turn rollout
Async rollout: separate event loop eliminates GIL contention with the training autograd engine
Shared tool parser: same parsing logic in training rollout and areno serve, ensuring consistent behavior

OpenAI-Compatible Serving

areno serve provides /v1/chat/completions and /v1/completions with tensor-parallel inference:

areno serve --model-path /path/to/model --tp-size 1 --port 8000

Multi-Model Support

Per-family model adapters for Qwen3, Qwen3.5, LLaMA, Gemma4, Bailing, and MiniCPM-V, registered through areno/models/registry.py. Add a new model family by creating areno/models/<family>/ — no core changes needed.

PyPI Distribution

Install via pip install areno --no-build-isolation. Automated sdist release workflow publishes from git tags.

Fixes

Unified reward function contract across all algorithms
Fixed PPO loss function export missing from public API
Improved agentic rollout batching and trajectory coalescing
Fixed CUDA extension build for sdist/metadata-only installs
Aligned serve sampling defaults with training rollout

Documentation and Examples

README with installation guide, quick start, and feature highlights
Developer contributing guide
CLI and SDK operation guides for training, serving, and agentic rollout
Agentic rollout examples with rollout sessions

What's Changed

ci: publish pypi from release tag by @xsuler in 0c476db
fix: unify reward function contract by @xsuler in 9b46a1e
docs: update README add acknowledgements and split CLI section by @adohe in 53bd862
fix: align serve sampling defaults by @xsuler in 1c10882
feat: share tool parsing in serve by @xsuler in 260c424
docs: fix agentic rollout examples by @xsuler in dda2ccf
feat: add agentic trajectories and continuous batching by @xsuler in b745e42
docs: use rollout session in examples by @xsuler in 1c266c8
fix: improve agentic rollout batching by @xsuler in f9cb223
docs: update agentic cli and sdk usage by @xsuler in 53a5c2d
fix: log agentic message trajectories by @xsuler in 094c840
feat: improve agentic rollout trajectories by @xsuler in 82b85da
fix: restore agentic rollout coalesce wait by @xsuler in 5c27fd2
feat: improve async rollout batching by @xsuler in 30c3064
feat(agentic): add agentic functionality by @xsuler in 3c71838
fix: issue and pr template format by @adohe in ebc990f
chore: add various issue and pr templates by @adohe in 6fb6a32
chore: update pyproject.toml to add more description and missing dep by @adohe in b47a26a
chore: enhance git ignore file by @adohe in 1f4756a
fix(api): add missing ppo loss fn export by @adohe in 07cdc2c
docs: add developer contributing guide by @adohe in 10f8473
docs: add operation guides for various agent by @adohe in 8b7ab43
ci: fix sdist release packaging by @xsuler in 5d9ff5c
fix(build): skip cuda extensions for sdist metadata by @xsuler in 72cefa0
docs: add README.md and LICENSE file by @adohe in f2b487c
ci: add pypi sdist release workflow by @xsuler in 9e002b3

Full Changelog: 3447d53...0c476db

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

v0.0.1

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Features

Self-Contained Single-Node RL Training

Agentic RL with Tool-Calling Trajectories

OpenAI-Compatible Serving

Multi-Model Support

PyPI Distribution

Fixes

Documentation and Examples

What's Changed

Contributors

Uh oh!