v1.4.1
New Features
- Added support for two new model_type values: gemma4 and deepseek_v4.
- Added a minimal example to the README showing how to create a model with Mcore-Bridge, run a forward pass, and compute the loss.
- Added compatibility with both the main and dev branches of megatron-core.
What's Changed
- [model] Support gemma4 by @Jintao-Huang in #56
- [docs] update readme by @Jintao-Huang in #84
- compat megatron dev branch by @Jintao-Huang in #87
- [model] support gemma4 padding_free by @Jintao-Huang in #88
- [docs] update docs by @Jintao-Huang in #89
- update gemma4 rope by @Jintao-Huang in #90
- refactor MLA by @Jintao-Huang in #91
- compat mtp megatron_core main branch by @Jintao-Huang in #92
- [model] Support deepseek-v4 by @Jintao-Huang in #86
- [bugfix] fix bugs by @Jintao-Huang in #95
- [model] support deepseek v4 mtp by @Jintao-Huang in #93
- Support fp4 blockwise load by @Jintao-Huang in #96
- [bugfix] fix gdn conv1d by @Jintao-Huang in #97
- update lora add by @Jintao-Huang in #98
Full Changelog: v1.4.0...v1.4.1