[megatron] remove megatron core 0.12-0.14 #9260
Code Review
This pull request raises the minimum required versions to megatron-core 0.15 and mcore-bridge 1.2.0, and removes legacy version checks and patches for older Megatron-Core versions. Feedback identifies a NameError in base.py, where track_moe_kwargs was removed but is still referenced, and a logic regression in trainer.py, where the context-parallel all-reduce is now incorrectly enabled for newer mcore versions. The documentation updates also contain possible typos in the CUDA version labels.
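The NameError called out above is the kind of failure that stays hidden until the affected code path actually runs, because Python resolves a global name when the function body executes, not when the function is defined. The following is a minimal sketch of that behaviour only; `build_kwargs` and its contents are hypothetical and are not the actual base.py code.

```python
# Hypothetical illustration of a dangling reference; not the real ms-swift code.
# The helper `track_moe_kwargs` is deliberately NOT defined anywhere,
# mimicking a helper that was deleted while a call site was left behind.

def build_kwargs():
    # The name is looked up only when this line executes.
    return {"lr": 1e-4, **track_moe_kwargs()}

# Defining the function succeeded, so importing the module would not fail.
defined_ok = callable(build_kwargs)

try:
    build_kwargs()
    error = None
except NameError as exc:
    # The failure appears only once the function is called.
    error = str(exc)
```

A static checker that reports undefined names (pyflakes is one example) would typically flag this before runtime, which is why removals like this one benefit from a lint pass or a search for remaining references.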
# transformer_engine
# If an installation error occurs, you can refer to this issue for resolution: https://github.com/modelscope/ms-swift/issues/3793
pip install --no-build-isolation transformer-engine[pytorch] --no-cache-dir
# cuda13
# transformer_engine
# If an installation error occurs, you can refer to this issue for resolution: https://github.com/modelscope/ms-swift/issues/3793
pip install --no-build-isolation transformer-engine[pytorch] --no-cache-dir
# cuda13
/gemini review
Code Review
This pull request updates the minimum version requirements for megatron-core to 0.15 and mcore-bridge to 1.2.0. Consequently, it removes legacy version checks and patches for older Megatron-Core versions (0.12 to 0.14) across the codebase, simplifying the logic in trainers and utility modules. Documentation has been updated to reflect these new requirements and includes a command to install TransformerEngine from its stable branch. A typo was noted in the documentation where # cuda13 is used instead of # cuda12.
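With the per-version branches removed, the stated minimums (megatron-core 0.15, mcore-bridge 1.2.0) become the only guard. The sketch below shows one way such a floor could be checked. The helper names `release_tuple` and `meets_minimum` are hypothetical, and this is not how ms-swift itself performs the check; a real project would more likely rely on `packaging.version`. Pre-release suffixes such as `rc1` are ignored here for simplicity.

```python
import re

# Minimum versions as stated in the review summary.
REQUIRED = {"megatron-core": "0.15", "mcore-bridge": "1.2.0"}

def release_tuple(version: str) -> tuple:
    """Extract leading numeric components, e.g. '0.15.0rc1' -> (0, 15, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numerically, padding the shorter version with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b

# Hypothetical installed versions, for illustration only.
ok_new = meets_minimum("0.15.0", REQUIRED["megatron-core"])
ok_old = meets_minimum("0.14.2", REQUIRED["megatron-core"])
ok_bridge = meets_minimum("1.2.0", REQUIRED["mcore-bridge"])
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.9" sorts after "0.15", which would wrongly accept an older release.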
# transformer_engine
# If an installation error occurs, you can refer to this issue for resolution: https://github.com/modelscope/ms-swift/issues/3793
pip install --no-build-isolation transformer-engine[pytorch] --no-cache-dir
# cuda13
# transformer_engine
# If an installation error occurs, you can refer to this issue for resolution: https://github.com/modelscope/ms-swift/issues/3793
pip install --no-build-isolation transformer-engine[pytorch] --no-cache-dir
# cuda13
No description provided.