Releases: Infini-AI-Lab/astraflow

AstraFlow v0.1.1

05 Jun 18:51
316756b

AstraFlow v0.1.1 broadens the system from RLVR math/code training toward multi-agent and recursive-agent RL, adds a second training backend, and modernizes the toolchain. Recipes, docs, and Docker images are updated to match.

Major Updates

  • Megatron-LM training backend. Adds a second trainer backend alongside FSDP. Weights sync through an HF-space transfer buffer that supports tensor, pipeline, expert, and virtual-pipeline parallelism, so the RaaS receive path is identical to the FSDP one. Also includes a direct-DMA weight-offload path (~23× faster) and a streaming Megatron→HF per-tensor exporter.
  • Recursive-agent workflows. New recursive_agent-style workflows where an agent recursively spawns sub-agents that share state under a team reward: TextCraft (+ recipe variants), Oolong, and DeepDive, each with a Qwen3-4B recipe.
  • Spawn-sub-agents workflow for math RL. Runs parallel sub-agent rollouts under a single trajectory.
  • Offline training support. Offline math datasets and recipes for clusters without internet access at train time.
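
To illustrate the recursive-agent bullet above, here is a minimal sketch of an agent that recursively spawns sub-agents sharing one state object, after which every trajectory receives the same team reward. All names here (`SharedState`, `Trajectory`, `run_agent`, `assign_team_reward`, `max_depth`, `fanout`) are hypothetical and are not AstraFlow's API; the real workflows may structure state and reward routing differently.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    agent_id: str
    depth: int
    reward: float = 0.0


@dataclass
class SharedState:
    # State visible to every agent in the team (hypothetical layout).
    notes: list[str] = field(default_factory=list)
    trajectories: list[Trajectory] = field(default_factory=list)


def run_agent(task, state, agent_id="root", depth=0, max_depth=2, fanout=2):
    """Record this agent's trajectory, then recursively spawn sub-agents."""
    state.trajectories.append(Trajectory(agent_id, depth))
    if depth < max_depth:
        for i in range(fanout):
            run_agent(f"{task}/sub{i}", state, f"{agent_id}.{i}",
                      depth + 1, max_depth, fanout)
    # Each agent writes its result into the shared state after its children finish.
    state.notes.append(f"{agent_id} handled {task}")


def assign_team_reward(state, team_reward):
    """Give every trajectory in the team the same scalar reward."""
    for traj in state.trajectories:
        traj.reward = team_reward


state = SharedState()
run_agent("craft", state)
assign_team_reward(state, 1.0)
print(len(state.trajectories), {t.reward for t in state.trajectories})
```

The sketch runs sub-agents sequentially for clarity; a real system would typically run them concurrently and cap the number of in-flight rollouts.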

Other Notable Changes

  • astraEnv tooling: a minimal LLM-as-judge library, a reward_mode selector, a CMU RAG search client, and an AI-rubric checklist grader.
  • Workflows can now opt out of the producer's default group-reward statistics; sub-agent reward routing is fixed.
  • Stability fixes: hardened Megatron weight offload, capped TextCraft recursive max_concurrent_rollouts, normalized apply_chat_template output to token ids for transformers 5, and GPU-arch-aware SGLang attention-backend / norm-path selection.
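
The chat-template fix in the stability bullet above can be pictured with a small normalization helper. This is a sketch, not AstraFlow's code: it assumes a tokenizer's `apply_chat_template` result may arrive either as a plain list of token ids or as a dict-like object carrying an `input_ids` entry, possibly batched with a single row. The helper name `to_token_ids` is hypothetical.

```python
from collections.abc import Mapping, Sequence


def to_token_ids(output):
    """Return a flat list of ints from several possible tokenizer output shapes."""
    # Dict-like outputs: pull out the ids (assumed to live under "input_ids").
    if isinstance(output, Mapping):
        output = output["input_ids"]
    # Array or tensor outputs: convert to nested Python lists.
    if hasattr(output, "tolist"):
        output = output.tolist()
    # A batch containing exactly one row: unwrap it.
    if (
        isinstance(output, Sequence)
        and len(output) == 1
        and isinstance(output[0], Sequence)
        and not isinstance(output[0], (str, bytes))
    ):
        output = output[0]
    # Anything that is still not a flat sequence of ints is rejected.
    if not all(isinstance(t, int) for t in output):
        raise TypeError("expected a flat sequence of integer token ids")
    return list(output)


print(to_token_ids([1, 2, 3]))
print(to_token_ids({"input_ids": [[4, 5]]}))
```

Downstream code can then treat the result as a plain list of token ids whichever shape the tokenizer returned.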

Environment and CI

  • Toolchain bump: CUDA 13 image, SGLang 0.5.12, transformers 5 support (with kernels<0.13 pin and a relaxed RaaS health watchdog).
  • Pre-built Megatron Docker image (astraflowai/astraflow:v0.1.1.megatron) documented; FSDP image remains astraflowai/astraflow:v0.1.1.
  • New docs: TextCraft recursive-agent and offline-math recipe pages, a Megatron weight-sync architecture page, and CUDA 13 install steps.

Full Changelog: v0.1.0...v0.1.1

AstraFlow v0.1.0

20 May 06:44

First public release of AstraFlow, a dataflow-oriented RL system for (multi-)agentic LLMs.

Highlights

  • Fully async multi-policy collaborative RL
  • Elastic heterogeneous cross-region rollouts (RaaS)
  • Substitutable rollout and trainer services
  • Composable data algorithms (GRESO, dynamic sampling, buffer replay)
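
As an illustration of the "dynamic sampling" item above, the sketch below uses the common formulation in which prompt groups whose rollouts all received the same reward are dropped, since they carry no group-relative learning signal. This is an assumption about what the term means here; AstraFlow's implementation may differ, and the function name `dynamic_sample` and the input layout are hypothetical.

```python
def dynamic_sample(groups):
    """Keep only prompt groups whose rollout rewards are not all identical.

    `groups` maps a prompt id to the list of rewards of its rollouts.
    """
    kept = {}
    for prompt_id, rewards in groups.items():
        # More than one distinct reward means the group has non-zero variance.
        if len(set(rewards)) > 1:
            kept[prompt_id] = rewards
    return kept


groups = {
    "p0": [1.0, 1.0, 1.0, 1.0],  # all correct: dropped
    "p1": [0.0, 1.0, 0.0, 1.0],  # mixed: kept
    "p2": [0.0, 0.0, 0.0, 0.0],  # all wrong: dropped
}
print(dynamic_sample(groups))
```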

Recipes

math/, math-multi-agent/, math-efficient-data/, code/, code-multi-agent/, search/, alfworld/, webshop/. See the docs for details.

Install

docker run --gpus all --net=host --shm-size=512g -it astraflowai/astraflow:v0.1.0

Or from source — see the installation guide. Requires Linux, Python 3.10–3.12, CUDA 12.9.

Status

Alpha — runs end-to-end, but 0.x APIs may evolve. Issues welcome: https://github.com/Infini-AI-Lab/astraflow/issues