Release AstraFlow v0.1.1 · Infini-AI-Lab/astraflow

AstraFlow v0.1.1 broadens the system from RLVR math/code training toward multi-agent and recursive-agent RL, adds a second training backend, and modernizes the toolchain. Recipes, docs, and Docker images are updated to match.

Major Updates

Megatron-LM training backend. A second trainer backend alongside FSDP. Weights sync through an HF-space transfer buffer (supporting tensor, pipeline, expert, and virtual-pipeline parallelism), so the RaaS receive path is identical to the FSDP one. Also includes a direct-DMA weight-offload path (~23× faster) and a streaming Megatron→HF per-tensor exporter.
Recursive-agent workflows. New recursive_agent-style workflows where an agent recursively spawns sub-agents that share state under a team reward: TextCraft (+ recipe variants), Oolong, and DeepDive, each with a Qwen3-4B recipe.
Spawn-sub-agents workflow for math RL. Parallel sub-agent rollouts under a single trajectory.
Offline training support. Offline math datasets and recipes for clusters without internet access at train time.
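
The recursive-agent idea above (an agent spawns sub-agents that share state and are scored by one team reward) can be pictured with a toy sketch. This is not AstraFlow's API: `SharedState`, `Trajectory`, `run_agent`, and `assign_team_reward` are hypothetical names invented for illustration, and a real workflow would call a model and an environment where this sketch only records visits.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Trajectory:
    # One agent's rollout record (hypothetical structure).
    agent_id: str
    depth: int
    reward: float = 0.0


@dataclass
class SharedState:
    # State visible to every agent in the tree (hypothetical structure).
    notes: Dict[str, str] = field(default_factory=dict)
    trajectories: List[Trajectory] = field(default_factory=list)


def run_agent(agent_id: str, depth: int, max_depth: int,
              fanout: int, state: SharedState) -> None:
    """Record this agent's trajectory, then recursively spawn sub-agents."""
    state.trajectories.append(Trajectory(agent_id, depth))
    state.notes[agent_id] = f"visited at depth {depth}"
    if depth >= max_depth:
        return  # Leaf agents do not spawn further.
    for i in range(fanout):
        run_agent(f"{agent_id}.{i}", depth + 1, max_depth, fanout, state)


def assign_team_reward(state: SharedState, team_reward: float) -> None:
    """Give every trajectory in the tree the same team-level reward."""
    for traj in state.trajectories:
        traj.reward = team_reward


state = SharedState()
run_agent("root", depth=0, max_depth=2, fanout=2, state=state)
assign_team_reward(state, team_reward=1.0)
```

With a fan-out of 2 and a maximum depth of 2, the tree holds 1 + 2 + 4 = 7 trajectories, all of which carry the same reward after `assign_team_reward` runs.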

Other Notable Changes

astraEnv tooling: a minimal LLM-as-judge library, a reward_mode selector, a CMU RAG search client, and an AI-rubric checklist grader.
Workflows can opt out of the producer's default group-reward statistics; sub-agent reward routing is fixed.
Stability fixes: hardened Megatron weight offload, capped TextCraft recursive max_concurrent_rollouts, normalized apply_chat_template output to token ids for transformers 5, and GPU-arch-aware SGLang attention-backend / norm-path selection.
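
As an illustration of the `apply_chat_template` normalization mentioned above, here is a minimal sketch. It assumes the tokenized result may arrive either as a plain list of ids or as a dict-like object holding them under `input_ids`; the helper name `to_token_ids` is hypothetical and is not taken from the AstraFlow codebase.

```python
from collections.abc import Mapping, Sequence
from typing import Any, List


def _is_seq(obj: Any) -> bool:
    """True for list-like sequences, excluding str and bytes."""
    return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes))


def to_token_ids(encoded: Any) -> List[int]:
    """Normalize a tokenized chat-template result to a flat list of token ids."""
    # Dict-like results carry the ids under "input_ids" (assumed key).
    if isinstance(encoded, Mapping):
        encoded = encoded["input_ids"]
    # Tensors and arrays expose tolist(); plain lists pass through unchanged.
    if hasattr(encoded, "tolist"):
        encoded = encoded.tolist()
    if not _is_seq(encoded):
        raise TypeError(f"cannot interpret {type(encoded).__name__} as token ids")
    # Unwrap a single-item batch such as [[1, 2, 3]].
    if len(encoded) == 1 and _is_seq(encoded[0]):
        encoded = encoded[0]
    return [int(t) for t in encoded]


flat = to_token_ids([1, 2, 3])
from_dict = to_token_ids({"input_ids": [1, 2, 3]})
from_batch = to_token_ids({"input_ids": [[4, 5]]})
```

Funnelling every call site through one helper like this keeps downstream code independent of which return shape the installed transformers version produces.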

Environment and CI

Toolchain bump: CUDA 13 image, SGLang 0.5.12, transformers 5 support (with kernels<0.13 pin and a relaxed RaaS health watchdog).
Pre-built Megatron Docker image (astraflowai/astraflow:v0.1.1.megatron) documented; FSDP image remains astraflowai/astraflow:v0.1.1.
New docs: TextCraft recursive-agent and offline-math recipe pages, a Megatron weight-sync architecture page, and CUDA 13 install steps.

What's Changed

Core reorg by @haizhongzheng in #2
Core reorg by @haizhongzheng in #3
feat: support offline mode for math training datasets by @haizhongzheng in #4
chore: bump sglang 0.5.5.post1 -> 0.5.12.post1 (FSDP path) by @WWWjiahui in #5
Feat/megatron weight sync dev by @jsw-zorro in #9
Spawn solution by @haizhongzheng in #12
fix(megatron): build Transformer Engine from source for CUDA 13 by @jsw-zorro in #14
AstraFlow v0.1.1: Megatron backend, offline math, recursive/spawn agents by @haizhongzheng in #16
docs: fix textcraft recipe docs URL by @haizhongzheng in #17

New Contributors

@WWWjiahui made their first contribution in #5
@jsw-zorro made their first contribution in #9

Full Changelog: v0.1.0...v0.1.1
