v1.3.0
New Features
- Added model_type support: kimi_k25, hy_v3, llava_onevision.
- mlp_padding_free is now compatible with Sequence Parallelism.
- Dropped support for megatron-core versions 0.12 - 0.14.
What's Changed
- [docs] update readme by @Jintao-Huang in #49
- update requirements by @Jintao-Huang in #51
- npu qwen3.5 megatron padding_free fix by @addsubmuldiv in #50
- [model] support kimi_k25 by @Jintao-Huang in #52
- [model] support hy_v3 by @Jintao-Huang in #53
- Add support for LLaVA-OneVision-1.5 model by @randydl in #54
- [bugfix] fix torch_dtype by @Jintao-Huang in #57
- fix qwen3_next by @Jintao-Huang in #58
- remove mcore0.12-mcore0.14 by @Jintao-Huang in #59
- fix kwargs by @Jintao-Huang in #61
- [megatron] support mlp_padding_free & sp; refactor TransformerLayer by @Jintao-Huang in #62
- [bugfix] fix gather_from_sp by @Jintao-Huang in #63
- update transformers by @Jintao-Huang in #65
- update requirements by @Jintao-Huang in #66
New Contributors
Full Changelog: v1.2.0...v1.3.0