Release v0.3.6 · nvidia-cosmos/cosmos-rl

What's Changed

fix: More for training strategy consistency and metrics demonstration by @lfengad in #356
Fix: qwen3_vl_moe encoder use FlashAttnMeta by @kane-vln in #358
Support TP for HFModel by @kane-vln in #355
fix: Fix regression for more metrics in validation case by @lfengad in #360
feat: Add post process for rollout generation in data packer by @lfengad in #361
feat: SFT training with DDP to load model at only master rank by @lfengad in #359
Support video input for qwen3-vl/hf vlm datapacker by @kane-vln in #365
Update tests for datapacker by @kane-vln in #367
Enable local dataset loading and fetching for Policy and Rollout. by @foreverlms in #354
feat: Decoupled loss for async RL by @lfengad in #368
Remove prompt_idxs which is not needed now. by @foreverlms in #371
Fix: add tp_slice_dim initialization in state dict conversion by @kane-vln in #372
[RFC] Couple tokenizer with data packer by @heslami in #311
Support Nemotron-Nano SFT by @kane-vln in #373
Support sequence packing for HFModel by @kane-vln in #369
feat: dapo case move rollout filter into rollout worker by @lfengad in #387
Add expandable segmentation for pytorch allocator by @yy-code-nv in #388
Add the deepep support for Qwen3-MoE models by @yufanhuangNV in #389
Fix: resolve version incompatibility between FA3 and TE by @kane-vln in #391
Add sanity check for parallelism by @foreverlms in #390
fix: Fix hf gradient checking by @lfengad in #394
Enable FP4 dynamic quantization of linear layers for policy training by @yufanhuangNV in #374
rfc: restructure some commonly used logic in parallel map by @lfengad in #395
Fix: fp4 compatible with python env by @lfengad in #398
fix: qwen2.5 vl case execution fix by @lfengad in #399
RFC: Refactor rollout worker part by @lfengad in #396
fix: resume from ckpt of hf buffer handling by @lfengad in #401
Fix qwen2-5 modeling by @yy-code-nv in #404
fix: stop issue due to validation by @lfengad in #405
Fix qwen3-moe and qwen3-vl-moe safetensors export by @foreverlms in #406
Support allgather moe dispatcher by @kane-vln in #402
Force FSDP wrap to ensure consistent mixed-precision training behavior; fix qwen3-moe deepep bug by @yy-code-nv in #409

New Contributors

@yy-code-nv made their first contribution in #388
@yufanhuangNV made their first contribution in #389

Full Changelog: v0.3.5...v0.3.6
