Skip to content

Releases: inclusionAI/AReno

v0.0.2

22 Jun 03:51
36487e9

Choose a tag to compare

Features

Setup and Training Diagnostics

AReno now has first-class setup diagnostics for local development and release
validation. The new diagnostic commands help users inspect their Python, CUDA,
PyTorch, package, and source-build environment before starting training or
opening an install issue.

The training CLI also prints a resolved configuration summary before backend
initialization, making it easier to confirm the selected model, dataset, reward
function, algorithm, rollout settings, and backend options before expensive
runtime work begins.

Native Attention Backend

This release adds a native attention backend path so AReno can support more GPU
environments without requiring FlashAttention in every setup. Source builds also
now target visible CUDA architectures by default, reducing build surprises on
mixed or older NVIDIA environments.

CLI Usability

areno train --help is grouped by user intent, so common setup, data, reward,
algorithm, rollout, and backend flags are easier to scan. CPU tests now cover
CLI config validation to keep those ergonomics stable.

Fixes

  • Exposed Trainer from the top-level areno package for the public SDK import path.
  • Added preflight validation for malformed dataset, reward, and agent task hooks.
  • Fixed Qwen3.5 angle-style tool-call parsing.
  • Fixed Gemma tool-call parsing for nested action arrays.
  • Explained the psutil build dependency for --no-build-isolation installs.
  • Improved setup guardrails and source-build diagnostics.

Documentation and Release Hygiene

  • Added a tiny official training sanity path.
  • Documented Docker as a setup-DX escape hatch.
  • Documented FlashAttention installation ordering and CUDA build tuning with
    TORCH_CUDA_ARCH_LIST and MAX_JOBS.
  • Fixed public SDK import examples in docs.
  • Renamed repository URLs from asystem-areno to AReno.
  • Added the v0.0.2 release hygiene checklist, including the tag convention,
    sdist-only publishing expectation, milestone-driven release notes, and required
    release checks.

What's Changed

  • Add pull request unit test CI by @xsuler in e7c4c00
  • chore: remove benchmark script by @xsuler in 80f8dc8
  • ci: enhance ci workflow for better code quality by @adohe in 3f0d061
  • chore: fix existing lint issue to make ci happy by @adohe in e234dc9
  • style: fix code formatting and lint issues across the repo by @adohe in #17
  • fix: explain psutil build dependency by @xsuler in #14
  • docs: install flash attention dependencies first by @xsuler in #18
  • fix: target visible CUDA architectures during source builds by @xsuler in #16
  • docs: add training smoke sanity path by @xsuler in #19
  • feat: add setup diagnostics commands by @xsuler in #20
  • fix: expose trainer from top-level package by @xsuler in #22
  • ci: enforce lint and commit message checks before commit by @adohe in #21
  • test: cover train cli config validation by @xsuler in #23
  • feat: print resolved train config summary by @xsuler in #26
  • fix: preflight train task hook validation by @xsuler in #25
  • fix: support qwen angle tool calls by @xsuler in #28
  • docs: rename repo URLs from asystem-areno to AReno by @adohe in 0581e49
  • feat: add native attention backend by @xsuler in #31
  • fix: improve setup guardrails by @xsuler in #33
  • docs: add Docker setup check path by @xsuler in #34
  • fix: parse nested Gemma tool calls by @xsuler in #36
  • feat(cli): group areno train --help by user intent by @adohe in f4b34c7
  • chore(release): define v0.0.2 release hygiene by @adohe in #37

Full Changelog: v0.0.1...v0.0.2

v0.0.1

16 Jun 03:38

Choose a tag to compare

Features

Self-Contained Single-Node RL Training

AReno ships as a single Python package with its own CUDA kernels, tensor-parallel inference engine, and OpenAI-compatible serving — no external training/inference backend to wire together. The RL loop is a short cycle of Trainer calls:

from areno.api import Trainer, ArenoConfig, Areno, SamplingParams, gspo_loss_fn

trainer = Trainer(
    world_size=1,
    model_path="Qwen/Qwen3-0.6B",
    backend_type=Areno,
    custom_config=ArenoConfig(tp_size=1),
)
trainer.init()

Or from the CLI:

areno train --ckpt Qwen/Qwen3-0.6B --dataset-path gsm8k:main \
  --reward-fn-path examples/math/math_verify_reward.py --algo gspo --tp-size 4

Swap algorithms via --algo: sft, dpo, gspo, grpo, ppo.

Agentic RL with Tool-Calling Trajectories

Built-in support for training agents that call tools and produce multi-turn trajectories. The trainer provides a local OpenAI-compatible proxy during rollout, parses tool calls, logs message-level trajectories, and assigns rewards at token boundaries — no external agent framework needed.

Key design choices:

  • Continuous batching: new samples enter as completions finish, maximizing GPU utilization during multi-turn rollout
  • Async rollout: separate event loop eliminates GIL contention with the training autograd engine
  • Shared tool parser: same parsing logic in training rollout and areno serve, ensuring consistent behavior

OpenAI-Compatible Serving

areno serve provides /v1/chat/completions and /v1/completions with tensor-parallel inference:

areno serve --model-path /path/to/model --tp-size 1 --port 8000

Multi-Model Support

Per-family model adapters for Qwen3, Qwen3.5, LLaMA, Gemma4, Bailing, and MiniCPM-V, registered through areno/models/registry.py. Add a new model family by creating areno/models/<family>/ — no core changes needed.

PyPI Distribution

Install via pip install areno --no-build-isolation. Automated sdist release workflow publishes from git tags.

Fixes

  • Unified reward function contract across all algorithms
  • Fixed PPO loss function export missing from public API
  • Improved agentic rollout batching and trajectory coalescing
  • Fixed CUDA extension build for sdist/metadata-only installs
  • Aligned serve sampling defaults with training rollout

Documentation and Examples

  • README with installation guide, quick start, and feature highlights
  • Developer contributing guide
  • CLI and SDK operation guides for training, serving, and agentic rollout
  • Agentic rollout examples with rollout sessions

What's Changed

Full Changelog: 3447d53...0c476db