v0.3.6

@lfengad lfengad released this 27 Nov 03:25
· 241 commits to main since this release
490bf93

What's Changed

  • fix: More for training strategy consistency and metrics demonstration by @lfengad in #356
  • Fix: qwen3_vl_moe encoder use FlashAttnMeta by @kane-vln in #358
  • Support TP for HFModel by @kane-vln in #355
  • fix: Fix regression for more metrics in validation case by @lfengad in #360
  • feat: Add post process for rollout generation in data packer by @lfengad in #361
  • feat: SFT training with DDP to load model at only master rank by @lfengad in #359
  • Support video input for qwen3-vl/hf vlm datapacker by @kane-vln in #365
  • Update tests for datapacker by @kane-vln in #367
  • Enable local dataset loading and fetching for Policy and Rollout. by @foreverlms in #354
  • feat: Decoupled loss for async RL by @lfengad in #368
  • Remove prompt_idxs which is not needed now. by @foreverlms in #371
  • Fix: add tp_slice_dim initialization in state dict conversion by @kane-vln in #372
  • [RFC] Couple tokenizer with data packer by @heslami in #311
  • Support Nemotron-Nano SFT by @kane-vln in #373
  • Support sequence packing for HFModel by @kane-vln in #369
  • feat: dapo case move rollout filter into rollout worker by @lfengad in #387
  • Add expandable segments for PyTorch allocator by @yy-code-nv in #388
  • Add the deepep support for Qwen3-MoE models by @yufanhuangNV in #389
  • Fix: resolve version incompatibility between FA3 and TE by @kane-vln in #391
  • Add sanity check for parallelism by @foreverlms in #390
  • fix: Fix hf gradient checking by @lfengad in #394
  • Enable FP4 dynamic quantization of linear layers for policy training by @yufanhuangNV in #374
  • rfc: restructure some commonly used logic in parallel map by @lfengad in #395
  • Fix: fp4 compatible with python env by @lfengad in #398
  • fix: qwen2.5 vl case execution fix by @lfengad in #399
  • RFC: Refactor rollout worker part by @lfengad in #396
  • fix: resume from ckpt of hf buffer handling by @lfengad in #401
  • Fix qwen2-5 modeling by @yy-code-nv in #404
  • fix: stop issue due to validation by @lfengad in #405
  • Fix qwen3-moe and qwen3-vl-moe safetensors export by @foreverlms in #406
  • Support allgather moe dispatcher by @kane-vln in #402
  • Force FSDP wrap to ensure consistent mixed-precision training behavior, fix qwen3-moe deepep bug by @yy-code-nv in #409

Full Changelog: v0.3.5...v0.3.6