New Features
- Added model_type support for gemma4_unified; added multimodal support for kimi_k25.
- Added the language_model_only parameter. When enabled, only the language model component is created, and only language model weights are loaded and saved.
- Fixed several bugs.
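
As a rough illustration of what language_model_only implies for weight loading and saving, the sketch below filters a weight dictionary down to the language model entries. It is not the project's implementation: the "language_model." key prefix, the filter_language_model_weights helper, and the sample weight names are all hypothetical and only show the idea.

```python
from typing import Any, Dict

# Hypothetical key prefix for language model weights (an assumption for
# illustration; the real naming scheme may differ).
LANGUAGE_MODEL_PREFIX = "language_model."


def filter_language_model_weights(
    state_dict: Dict[str, Any],
    language_model_only: bool,
    prefix: str = LANGUAGE_MODEL_PREFIX,
) -> Dict[str, Any]:
    """Return only the language model weights when language_model_only is set."""
    if not language_model_only:
        # Keep every weight, including vision or other multimodal parts.
        return dict(state_dict)
    return {name: w for name, w in state_dict.items() if name.startswith(prefix)}


# Hypothetical multimodal checkpoint with one non-language component.
weights = {
    "language_model.embed.weight": [0.1, 0.2],
    "language_model.layers.0.weight": [0.3],
    "vision_tower.patch_embed.weight": [0.4],
}

kept = filter_language_model_weights(weights, language_model_only=True)
print(sorted(kept))
```

With the flag disabled, the same call returns all three entries unchanged.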
What's Changed
- [bugfix] fix: clamp num_tokens=0 in MTP loss & add normalized scale for MTP per token loss by @YaoweiFan in #104
- [bugfix] fix tie_word_embeddings by @Jintao-Huang in #105
- [bugfix] fix deepseek-v4 dev branch by @Jintao-Huang in #107
- [model] support gemma4_unified by @Jintao-Huang in #108
- update batch_p2p_comm by @Jintao-Huang in #111
- support language_model_only by @Jintao-Huang in #112
- support kimi_k25 mm by @Jintao-Huang in #113
- update mla rope mcore>=0.18 (0.15-0.18 compat) by @Jintao-Huang in #114
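
For context on the num_tokens=0 fix in #104, the sketch below shows the general clamping technique: when a batch contributes no valid tokens, the divisor is clamped to 1 so the per-token loss does not divide by zero. The per_token_loss function is hypothetical, is not taken from the PR, and does not cover the normalized scale change.

```python
def per_token_loss(loss_sum: float, num_tokens: int) -> float:
    """Average a summed loss over tokens, clamping the divisor to at least 1."""
    # With num_tokens == 0 the summed loss is expected to be 0 as well,
    # so clamping yields 0.0 instead of raising ZeroDivisionError.
    return loss_sum / max(num_tokens, 1)


print(per_token_loss(6.0, 3))
print(per_token_loss(0.0, 0))
```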
New Contributors
- @YaoweiFan made their first contribution in #104
Full Changelog: v1.4.2...v1.4.3