v0.1.6

Pre-release

@Luosuu Luosuu released this 28 Jan 18:42
· 274 commits to main since this release
aa88bf4

Highlights

  • VeOmni modeling is gradually migrating to patches applied on top of the HuggingFace modeling code, so that users can easily see what differs from the original HuggingFace implementation. Dense models are being migrated first. @Coach257 @FoolPlayer
    • Next, we will work on generating modeling code in the HuggingFace style, in step with the HuggingFace Transformers v5 upgrade. @piyifan123
  • Support Qwen3-Omni-MoE by @Crystal-jiang.
  • VeOmni no longer overrides the HuggingFace Transformers ALL_ATTENTION_FUNCTIONS registry. For backward compatibility, when flash_attention_2 or flash_attention_3 is passed in the model arguments, it is replaced by the corresponding VeOmni flash attention key name. @Luosuu
  • Support padding packed input to a fixed shape when rmpad_with_pos_ids is enabled, which eliminates expensive Triton compilation for varying input sizes and paves the way for torch.compile integration.
  • Support torchcodec-based video preprocessing by @TimYangst
  • Many fixes.
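
To illustrate the packed-input padding highlight above, here is a minimal sketch of padding a packed sample to a fixed length. It is not VeOmni's implementation: the function names, the `pad_token_id` default, and the choice to give the padding its own position-id run starting at 0 (so that it looks like one extra segment) are all assumptions made for illustration. Plain Python lists stand in for tensors.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # label value conventionally skipped by cross-entropy losses


def pad_packed_to_fixed_length(
    input_ids: List[int],
    position_ids: List[int],
    labels: List[int],
    target_len: int,
    pad_token_id: int = 0,  # hypothetical default, not taken from VeOmni
) -> Tuple[List[int], List[int], List[int]]:
    """Pad one packed sample so that every batch has the same shape."""
    if not (len(input_ids) == len(position_ids) == len(labels)):
        raise ValueError("input_ids, position_ids and labels must have equal length")
    pad = target_len - len(input_ids)
    if pad < 0:
        raise ValueError("packed sample is longer than target_len")
    return (
        input_ids + [pad_token_id] * pad,
        # Assumption: padding restarts position ids at 0, forming its own segment.
        position_ids + list(range(pad)),
        # Padding never contributes to the loss.
        labels + [IGNORE_INDEX] * pad,
    )


def segment_starts(position_ids: List[int]) -> List[int]:
    """Recover segment boundaries from position ids (a segment starts at 0)."""
    return [i for i, p in enumerate(position_ids) if p == 0]


# Two sequences of length 3 and 2, packed into one sample and padded to 8.
ids, pos, lab = pad_packed_to_fixed_length(
    input_ids=[11, 12, 13, 21, 22],
    position_ids=[0, 1, 2, 0, 1],
    labels=[11, 12, 13, 21, 22],
    target_len=8,
)
print(ids)                  # [11, 12, 13, 21, 22, 0, 0, 0]
print(pos)                  # [0, 1, 2, 0, 1, 0, 1, 2]
print(segment_starts(pos))  # [0, 3, 5]
```

Because every padded sample has the same length, shape-specialized kernels see a single input shape instead of a new one per batch.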

What's Changed

  • [docs] fix: Optimize document links in Markdown rendering by @Crystal-jiang in #380
  • [config] feat: add MFU calculation for qwen3_vl_moe by @ZhuYajun-AI in #385
  • [data, model] feat: support Qwen3-VL textual token-based time encoding by @Coach257 in #386
  • [data,ci] test: enhance video_utils test suite with robust validation and benchmarks by @TimYangst in #375
  • [data,ci,docs] feat: add torchcodec-based video processing with ffmpeg support and comprehensive testing by @TimYangst in #221
  • [perf, dist] feat: add zero2 in fsdp1 and use_orig_params configurable by @zhtao303 in #382
  • [model] fix: Fused operator fix for qwen3vl by @phdddd in #378
  • [docs] feat: add async doc in ulysses.md by @zbian99 in #388
  • [data] feat: add data collators for embedding classification. by @yiwzhao in #376
  • [model] chore: add moe split script by @FoolPlayer in #390
  • [task] feat: train qwen2.5 omni by @Coach257 in #396
  • [model] refactor: change to model patch in qwen3 by @FoolPlayer in #392
  • [docker] fix: update npu dockerfile with ffmpeg by @FoolPlayer in #398
  • [config] feat: refactor args to support multi-level config by @FoolPlayer in #397
  • [docs] fix: update async ulysses document by @zbian99 in #394
  • [model] fix: add fa3 for Qwen3VL vision attention SP path by @Luosuu in #400
  • [model] feat: patch qwen3vlmoe; qwen3vl; qwen25vl by @Coach257 in #399
  • [model] fix: qwen25vl_config by @Coach257 in #404
  • [model] fix: fsdp1 load model weights by @Coach257 in #403
  • [config] fix: remove None option from choice type args by @FoolPlayer in #405
  • [misc] feat: add live star history in readme by @Luosuu in #406
  • [config, model] chore: avoid polluting huggingface transformer attention registry while keeping job config backward compatible by @Luosuu in #407
  • [docs] fix: correct typos in documentation by @zwhe99 in #411
  • [data] feat: decouple rmpad and dyn_bsz by @yangtian6781 in #408
  • [model] feat: refactor the rot_pos_emb and fast_pos_embed_interpolate funcs in modeling_qwen_vl by @lipengfei1409 in #391
  • [ci] fix: use wheel URLs to avoid build dependencies in CI by @TimYangst in #417
  • [model] fix: remove the check that self.training==True for SP by @A1waysBeenHere in #421
  • [model] feat: Add pure HuggingFace version of Qwen3-Omni-MoE by @Crystal-jiang in #422
  • [data, perf, ops] feat: option to pad packed input to a fixed shape for text-only models by @Luosuu in #410
  • [ops, perf] feat: Liger-Kernel is now available for NPU by @zheliuyu in #415
  • [model] feat: Support qwen3-omni-moe model by @Crystal-jiang in #409
  • [dist] feat: No resharding enabled for accelerated small model training by @yangtian6781 in #413
  • [model] fix: qwen_vl rope index by @Coach257 in #430
  • [model] refactor: change dense llm to patch style by @FoolPlayer in #431
  • [perf] fix: The H200 compute power is being recognized as H20 by @HSYZhang in #428
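
The backward-compatibility behavior described for #407 can be pictured with a small sketch. The key names on the right-hand side of the mapping and the helper `resolve_attn_implementation` are hypothetical and chosen only for illustration; the release notes do not state the actual VeOmni key names.

```python
# Hypothetical mapping from legacy job-config values to VeOmni-specific keys.
# The real VeOmni key names may differ; these are placeholders.
LEGACY_TO_VEOMNI = {
    "flash_attention_2": "veomni_flash_attention_2",
    "flash_attention_3": "veomni_flash_attention_3",
}


def resolve_attn_implementation(name: str) -> str:
    """Translate legacy flash-attention names; leave everything else untouched."""
    return LEGACY_TO_VEOMNI.get(name, name)


print(resolve_attn_implementation("flash_attention_2"))  # veomni_flash_attention_2
print(resolve_attn_implementation("sdpa"))               # sdpa
```

Translating the name at configuration time means existing job configs keep working, while the stock HuggingFace entries stay untouched in the registry.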

Full Changelog: v0.1.5...v0.1.6