v0.3.6
What's Changed
- fix: Further fixes for training strategy consistency and metrics display by @lfengad in #356
- Fix: make qwen3_vl_moe encoder use FlashAttnMeta by @kane-vln in #358
- Support TP for HFModel by @kane-vln in #355
- fix: Fix regression for more metrics in validation case by @lfengad in #360
- feat: Add post process for rollout generation in data packer by @lfengad in #361
- feat: SFT training with DDP loads the model only on the master rank by @lfengad in #359
- Support video input for qwen3-vl/hf vlm datapacker by @kane-vln in #365
- Update tests for datapacker by @kane-vln in #367
- Enable local dataset loading and fetching for Policy and Rollout. by @foreverlms in #354
- feat: Decoupled loss for async RL by @lfengad in #368
- Remove prompt_idxs, which is no longer needed by @foreverlms in #371
- Fix: add tp_slice_dim initialization in state dict conversion by @kane-vln in #372
- [RFC] Couple tokenizer with data packer by @heslami in #311
- Support Nemotron-Nano SFT by @kane-vln in #373
- Support sequence packing for HFModel by @kane-vln in #369
- feat: Move rollout filter into rollout worker for the DAPO case by @lfengad in #387
- Add expandable segments for the PyTorch allocator by @yy-code-nv in #388
- Add DeepEP support for Qwen3-MoE models by @yufanhuangNV in #389
- Fix: resolve version incompatibility between FA3 and TE by @kane-vln in #391
- Add sanity check for parallelism by @foreverlms in #390
- fix: Fix hf gradient checking by @lfengad in #394
- Enable FP4 dynamic quantization of linear layers for policy training by @yufanhuangNV in #374
- rfc: Restructure some commonly used logic in parallel map by @lfengad in #395
- Fix: make FP4 compatible with the Python environment by @lfengad in #398
- fix: qwen2.5 vl case execution by @lfengad in #399
- RFC: Refactor rollout worker part by @lfengad in #396
- fix: hf buffer handling when resuming from ckpt by @lfengad in #401
- Fix qwen2-5 modeling by @yy-code-nv in #404
- fix: stop issue due to validation by @lfengad in #405
- Fix qwen3-moe and qwen3-vl-moe safetensors export by @foreverlms in #406
- Support allgather moe dispatcher by @kane-vln in #402
- Force FSDP wrap to ensure consistent mixed-precision training behavior; fix qwen3-moe DeepEP bug by @yy-code-nv in #409
New Contributors
- @yy-code-nv made their first contribution in #388
- @yufanhuangNV made their first contribution in #389
Full Changelog: v0.3.5...v0.3.6