Release v1.1.0 · NVIDIA/nvflow

Highlights

Two-environment GRPO pipeline with split configs to prevent cross-environment leaks.

Full GRPO config for Qwen3-30B-A3B MoE with curriculum ordering, dynamic sampling, and context parallelism.
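For readers unfamiliar with the terms, the following is a minimal sketch of group-relative advantage normalization as used in GRPO, combined with one common reading of "dynamic sampling" (dropping prompt groups whose rewards are all identical, since they carry no learning signal). It assumes that reading; the release notes do not spell out the behaviour. The function names and the `eps` parameter are hypothetical and are not taken from this repository's config.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's group of sampled responses.

    Each response's advantage is its reward minus the group mean,
    divided by the group standard deviation (eps avoids division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def dynamic_sample(groups):
    """Keep only groups with at least two distinct reward values.

    Assumption: a group where every response scored the same yields
    all-zero advantages, so it is filtered out before the update.
    """
    return [g for g in groups if max(g) != min(g)]


# Two prompt groups of four responses each (illustrative rewards).
groups = [[1.0, 0.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
kept = dynamic_sample(groups)            # the all-1.0 group is dropped
advs = [group_advantages(g) for g in kept]
```

In this sketch the second group is discarded because every response received the same reward, and the surviving group's advantages sum to zero by construction.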

Scale-independent rollout pipeline with multi-node vLLM, logical chunking, and fault-tolerant multi-seed execution via dependent_jobs.
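One way to picture logical chunking with multi-seed execution is sketched below. It is an illustration under stated assumptions, not this project's implementation: the prompt set is split into fixed-size index ranges that do not depend on the node count, each (seed, chunk) pair is an independent unit of work, and a restart only re-runs the units that have not completed. All function names here are hypothetical, and the sketch does not model the `dependent_jobs` mechanism itself.

```python
def logical_chunks(num_prompts, chunk_size):
    """Split prompt indices [0, num_prompts) into half-open (start, end) ranges.

    The ranges depend only on the dataset size and chunk size,
    not on how many nodes happen to be available.
    """
    return [
        (start, min(start + chunk_size, num_prompts))
        for start in range(0, num_prompts, chunk_size)
    ]


def work_units(num_prompts, chunk_size, seeds):
    """Enumerate one unit of work per (seed, chunk) pair."""
    return [
        (seed, lo, hi)
        for seed in seeds
        for lo, hi in logical_chunks(num_prompts, chunk_size)
    ]


def pending(units, done):
    """Return the units that still need to run after a failure or restart."""
    return [u for u in units if u not in done]


# 10 prompts, chunks of 4, two seeds: 3 chunks per seed, 6 units in total.
units = work_units(10, 4, seeds=[0, 1])
# Suppose only seed 0's first chunk finished before an interruption.
remaining = pending(units, done={(0, 0, 4)})
```

Because each unit is identified by its seed and index range, the same plan can be executed on any number of workers, and completed units can be skipped on retry.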

DTensor v2 safetensors checkpoint conversion with .hf_metadata auto-recreation
Separate per-environment eval output directories
Standalone eval support (cross-session Slurm dependency handling)