v0.1.11

@Luosuu Luosuu released this 26 May 07:42
· 32 commits to main since this release
f90b3dc

Highlights

  • Fused top-k kernel for OPD
  • Precompute metadata in the dataloader for VLMs
  • Qwen-Image support with checkpoint, LoRA, and inference
  • Reworked MoE load-balance monitor
  • NPU OpSlot fixes

What's Changed

  • [model] feat: support qwen-image by @FoolPlayer in #770
  • [model, data, trainer] feat: precompute multimodal forward metadata in dataloader by @TimYangst in #772
  • [model, ops] feat: chunked fused-linear top-k forward-KL distillation kernel by @Luosuu in #771
  • [omni] chore: add Omni Model inference script by @FoolPlayer in #777
  • [model, data, ci] refactor: collapse multimodal metadata to a single grid_thw key by @TimYangst in #778
  • [parallel] feat: disable HSDP gradient all-reduce during gradient accumulation by @nono-Sang in #781
  • [ops, model] refactor: shorten loss-wrapper return to (loss, logits, fused_linear_aux) by @Luosuu in #780
  • [model, ops] refactor: add NPU support and OpSlot guard for Qwen3/VL/MoE, Qwen3.5/MoE by @yanghw116 in #710
  • [model, omni] feat: add Qwen-Image lora config by @FoolPlayer in #784
  • [model, ci, agent] feat: wire qwen2-family ViT to the multimodal metadata precompute hook by @TimYangst in #779
  • [trainer, ops] feat: rework MoE load-balance monitor (model-agnostic, EP/DP-aware) by @Luosuu in #787
  • [ckpt, lora] feat: Save lora ckpt and add omni-infer with lora by @FoolPlayer in #785
  • [model, omni] feat: Update Qwen-Image & Add veomni fsdp state API by @FoolPlayer in #786
  • [model, ops] fix: complete #780 3-tuple loss-wrapper migration by @TimYangst in #790
  • [release] chore: bump version to 0.1.11 by @Luosuu in #792
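
For readers unfamiliar with the objective named in #771, the following is a minimal sketch of a top-k forward-KL distillation loss for a single token position. It is not the kernel from that PR: the function names are hypothetical, the chunking and fused-linear parts are omitted, and it assumes the loss is KL(teacher || student) summed over the teacher's k most likely tokens without renormalization, which the release notes do not confirm.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def topk_forward_kl(teacher_logits, student_logits, k):
    """Forward KL, KL(teacher || student), restricted to the teacher's top-k tokens.

    Hypothetical helper for illustration only. Assumes the truncated sum is
    used as-is (no renormalization over the top-k set).
    """
    t_logp = log_softmax(teacher_logits)
    s_logp = log_softmax(student_logits)
    # Indices of the k tokens the teacher considers most likely.
    topk = sorted(range(len(t_logp)), key=lambda i: t_logp[i], reverse=True)[:k]
    # Sum p_teacher * (log p_teacher - log p_student) over those tokens only.
    return sum(math.exp(t_logp[i]) * (t_logp[i] - s_logp[i]) for i in topk)
```

With `k` equal to the vocabulary size this reduces to the ordinary forward KL divergence. Smaller `k` only touches the teacher's most likely tokens, which is presumably what makes a fused, chunked implementation attractive for large vocabularies.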

Full Changelog: v0.1.10...v0.1.11