🚀 Key Highlights
Micro-service Architecture
The architecture comprises four services: training, inference, agent, and weight update.
Hermes Agent Example
An end-to-end agent built on the agent service and wired into the 2.0 training loop. See the Hermes RL training example for details.
CLI Commands
A unified areal CLI to launch, manage, and operate each micro-service. See the per-service guides: training, inference, agent.
What's Changed
- perf(infra): pipeline controller initialization with background threads by @garrett4wade in #1294
- ci: add nightly workflow for long-running tests by @garrett4wade in #1312
- feat: support pp for Sglang by @TaoZex in #1162
- perf(archon): parallelize inference controller initialization by @garrett4wade in #1314
- fix(archon): improve HTTP client reliability and reduce logging verbosity by @garrett4wade in #1315
- refactor(experimental): remove redundant capacity grant from inference service by @garrett4wade in #1318
- refactor(experimental): add unified group semantics to inference service session lifecycle by @garrett4wade in #1321
- chore: migrate repo references from InclusionAI to areal-project by @garrett4wade in #1325
- fix(proxy): refuse default admin API key on non-loopback bind by @sebastiondev in #1323
- refactor: consolidate admin key validation into shared helper by @garrett4wade in #1328
- fix: Add error detection function and test for ZeroDivisionError and other errors alike by @chenzhiyi021 in #1332
- docs: Add OpenSSF Best Practices badge to README 📛 by @mingcheng in #1348
- gov: add new maintainer by @sitabulaixizawaluduo in #1349
- fix(utils): mask 2d sequence advantages by @haoyang9804 in #1346
- feat(awex): add colocated CUDA IPC weight transfer by @garrett4wade in #1310
- fix(infra): correct staleness capacity inflation after recovery by @daihaowz in #1345
- fix(test): add missing colocate field to wu_controller connect tests by @sitabulaixizawaluduo in #1357
- refactor(infra): backport rl infra cleanup by @sitabulaixizawaluduo in #1353
- feat(megatron): implement async_save with AsyncCallsQueue by @dingzhiqiang in #1356
- feat: controller v2 refactor by @sitabulaixizawaluduo in #1354
- fix(fsdp): maintain fp32 master weights for AdamW (#1292) by @guozhihao-224 in #1369
- chore: add 0516 community meeting materials and update agenda for the next biweekly sync by @sitabulaixizawaluduo in #1371
- docs: fix some typos by @jeis4wpi in #1352
- feat:enable v2 training pipeline with controller parity by @sitabulaixizawaluduo in #1363
- fix(trainer): skip controller-side CUDA sync in single-controller mode by @Adiactive in #1377
- fix[v2]: localize RTensor trajectories before reading on controller by @sitabulaixizawaluduo in #1387
- chore: migrate community governance files to external repository 📋 by @mingcheng in #1386
- feat(megatron): Qwen3.5 dense + MoE training/inference support via megatron-bridge by @Adiactive in #1384
- feat(distillation): add on-policy distillation using RolloutEngine by @zahrayousefijamarani in #1376
- docs(roadmap): add 2026 Q2 and H2 milestones by @sitabulaixizawaluduo in #1390
- Fix LoRA model training by @lifeiteng in #1385
- fix(v2/awex): unblock weight-update bring-up by @sitabulaixizawaluduo in #1401
- fix(engine): default shard_ids=None on clear_batches across all engines by @Adiactive in #1402
- docs: add community section with WeChat QR code to README 📢 by @mingcheng in #1409
- Supporting features for IcePop and KPop by @guojiapub in #1405
- fix: per-sample version tracking with loss_mask filter and multi-turn… by @pyq623 in #1408
- docs: add IcePop/KPop feature introduction by @guojiapub in #1424
- fix(vllm): forward frequency_penalty and stop in generation requests by @EazyReal in #1429
- fix(awex): allow disabling batch_send_recv use_group via AWEX_WU_USE_GROUP by @sitabulaixizawaluduo in #1414
- feat: disable megatron grad buffers CPU backup to save host memory by @HT-Yuan in #1393
- fix(network): find_free_ports ignores out-of-range exclude_ports by @EazyReal in #1436
- fix(reward): guard clevr_count_70k_reward_fn against scoring failures by @EazyReal in #1430
- feat(megatron): make MTP head opt-in to support Qwen3.6 MoE RL by @Adiactive in #1403
- feat(ppo): add CISPO loss surrogate (MiniMax-M1) by @EazyReal in #1412
- fix(CI): fix vlm_grpo CI OOM bug by @sitabulaixizawaluduo in #1438
- feat(cli): add experimental cli scaffold for service-style subcommands by @sitabulaixizawaluduo in #1440
- refactor: move 5 experimental modules into areal/v2 for 2.0 release by @sitabulaixizawaluduo in #1448
- feat: support v2 weight update disk mode for lora RL by @sitabulaixizawaluduo in #1450
- feat(cli): add training service cli by @sitabulaixizawaluduo in #1446
- feat(cli): add agent service cli by @sitabulaixizawaluduo in #1447
- feat(cli): add inference service cli by @sitabulaixizawaluduo in #1434
- feat(agent_service): add agent service with OpenClaw and Hermes examples by @IF007 in #1383
- feat(ppo): add reuse_train_logp proximal logp method by @Le8r0nJames in #1453
- fix(ppo): coerce ppo_n_minibatches to 1 for reuse_train_logp instead of raising by @Le8r0nJames in #1457
- Chore/add areal 2 report paper by @sitabulaixizawaluduo in #1461
- feat(launcher): add best-effort post-exit hook by @Le8r0nJames in #1459
- feat(megatron): add CP-safe vocab stats and MoE config support by @Le8r0nJames in #1460
- fix(ppo): handle variable-size trajectory groups in reward normalization by @Le8r0nJames in #1454
- fix(openai): parse tool_call arguments from JSON string to dict before chat template by @Le8r0nJames in #1463
- feat(swe): add SWE-bench RL training workflow example by @Le8r0nJames in #1462
- fix: fix safe-to-test CI workflow by @sitabulaixizawaluduo in #1456
- feat(openai): add proxy preprocessors and Qwen tool-call parsing by @Le8r0nJames in #1458
- test(ppo): update singleton leave-one-out group expectation by @Le8r0nJames in #1464
- chore: update AReaL 2.0 report paper by @sitabulaixizawaluduo in #1465
- chore: bump v2.0.0 by @sitabulaixizawaluduo in #1466
New Contributors
- @sebastiondev made their first contribution in #1323
- @mingcheng made their first contribution in #1348
- @haoyang9804 made their first contribution in #1346
- @jeis4wpi made their first contribution in #1352
- @zahrayousefijamarani made their first contribution in #1376
- @lifeiteng made their first contribution in #1385
- @guojiapub made their first contribution in #1405
- @pyq623 made their first contribution in #1408
- @EazyReal made their first contribution in #1429
- @IF007 made their first contribution in #1383
- @Le8r0nJames made their first contribution in #1453
Full Changelog: v1.0.4...v2.0.0