[megatron] compat megatron-core 0.12-0.14 #6599

Jintao-Huang · 2025-11-14T04:31:20Z

No description provided.

gemini-code-assist · 2025-11-14T04:31:38Z

Summary of Changes

This pull request makes Megatron-SWIFT compatible with megatron-core versions 0.12 through 0.14. It adjusts API calls and argument handling according to the detected megatron-core version, so the same code runs across these releases. The documentation now also gives clearer guidance on dependencies and optional configurations, such as running without apex.

Highlights

Megatron-Core Compatibility: Implemented version checks and conditional logic to ensure compatibility with megatron-core versions from 0.12 to 0.14, addressing potential API changes in newer releases.
Documentation Updates: Updated quick-start guides (both English and Chinese) to specify the megatron_core dependency as >=0.12 and added a note about running Megatron-SWIFT without apex by setting --no_gradient_accumulation_fusion true.
Optimizer CPU Offload Requirement: Added a version check to enforce megatron-core>=0.13 when the optimizer_cpu_offload feature is enabled, ensuring proper functionality with this specific optimization.
Dynamic Argument Handling: Adjusted how vp_stage, qk_l2_norm, and use_kitchen arguments are passed to megatron-core components based on the detected megatron-core version, accommodating changes in their API signatures.
Robust Metric Reduction: Improved the _all_reduce_metric function in the base trainer to use torch.stack for creating the reporting_metric tensor, enhancing robustness during metric aggregation.
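
The version-gating pattern these highlights describe can be sketched roughly as follows. This is an illustration, not the PR's code: the helper names (`release_tuple`, `at_least`, `gated_kwargs`, `require_for_feature`) are hypothetical, and the minimum version attached to `vp_stage` in the usage line is a placeholder chosen only for the example. The one threshold taken from the summary is that `optimizer_cpu_offload` needs megatron-core>=0.13. The sketch compares only the numeric release segment using the standard library, which is a simplification of a full version parser.

```python
import re


def release_tuple(ver: str, width: int = 3) -> tuple:
    """Return the leading numeric release segment, zero-padded to `width`.

    '0.13.0rc0' -> (0, 13, 0) and '0.13' -> (0, 13, 0).
    """
    match = re.match(r"\d+(?:\.\d+)*", ver)
    if match is None:
        raise ValueError(f"unparseable version: {ver!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def at_least(installed: str, minimum: str) -> bool:
    """True if `installed` is at or above `minimum`.

    Pre-release tags are ignored, so a release candidate counts as its release.
    """
    return release_tuple(installed) >= release_tuple(minimum)


def gated_kwargs(installed: str, optional: dict) -> dict:
    """Keep only the keyword arguments whose minimum version is satisfied.

    `optional` maps an argument name to a (minimum_version, value) pair.
    """
    return {
        name: value
        for name, (minimum, value) in optional.items()
        if at_least(installed, minimum)
    }


def require_for_feature(installed: str, feature: str, minimum: str) -> None:
    """Fail early when an enabled feature needs a newer megatron-core."""
    if not at_least(installed, minimum):
        raise ValueError(
            f"{feature} requires megatron-core>={minimum}, found {installed}")


# Hypothetical usage: the '0.13' minimum for vp_stage is a placeholder.
kwargs = gated_kwargs("0.12.1", {"vp_stage": ("0.13", 0)})
print(kwargs)
```

Building the keyword dictionary in one place keeps call sites free of repeated `if` branches; an older megatron-core simply never receives arguments it does not accept.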

gemini-code-assist

Code Review

This pull request introduces compatibility for megatron-core versions 0.12 through 0.14. The changes primarily consist of conditional logic based on version checks to handle API differences, particularly in model and layer specification initialization. The approach is sound and the implementation appears correct. The documentation has also been updated accordingly. I have one suggestion to improve maintainability by centralizing the version check logic.

gemini-code-assist · 2025-11-14T04:33:50Z

swift/megatron/model/gpt_model.py


 logger = get_logger()

+megatron_core_013 = version.parse(megatron.core.__version__) >= version.parse('0.13.0rc0')


This version check logic is also present in swift/megatron/model/model_provider.py. To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, consider centralizing this version check into a shared utility file (e.g., within swift/megatron/utils/). This would prevent potential inconsistencies and make future version-dependent changes easier to manage.
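
One way to centralize the check, sketched under assumptions: a single helper computes the flags once, and other modules import the result instead of re-parsing the version. The `McoreFlags` class and `compute_flags` function are hypothetical names, and treating `mcore_014` as "0.14 or newer" is an assumption based on the flag name. The sketch compares only the major.minor prefix with the standard library, whereas the snippet above uses `version.parse` against `'0.13.0rc0'`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class McoreFlags:
    """Version flags computed once and shared by every caller (hypothetical)."""
    mcore_013: bool
    mcore_014: bool


def compute_flags(installed: str) -> McoreFlags:
    """Derive the flags from a version string such as '0.13.0rc0'.

    Only the major.minor prefix is compared, so a 0.13 release candidate
    counts as 0.13, in line with the '0.13.0rc0' threshold.
    """
    match = re.match(r"(\d+)\.(\d+)", installed)
    if match is None:
        raise ValueError(f"unparseable megatron-core version: {installed!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return McoreFlags(
        mcore_013=major_minor >= (0, 13),
        mcore_014=major_minor >= (0, 14),
    )


# In a shared module this would typically run once at import time, e.g.
# FLAGS = compute_flags(megatron.core.__version__)
flags = compute_flags("0.13.0rc0")
print(flags.mcore_013, flags.mcore_014)
```

Callers such as gpt_model.py and model_provider.py would then read the same flags object, so a change to a threshold happens in one place.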

Jintao-Huang · 2025-11-14T05:32:44Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces compatibility for megatron-core versions 0.12 through 0.14. The changes primarily consist of adding version checks and conditional logic to handle API differences across these versions, along with some variable renames for conciseness. However, I've identified a critical bug in swift/megatron/trainers/utils.py where the logic for handling pipeline parallelism stages is inverted, which would break the training pipeline. This needs to be addressed.

swift/megatron/trainers/utils.py

Jintao-Huang · 2025-11-14T07:12:40Z

/gemini review

gemini-code-assist

Code Review

This pull request primarily focuses on enhancing compatibility with different versions of megatron-core (0.12-0.14). The changes involve introducing version-specific conditional logic, refactoring variable names for better readability, and updating documentation and example scripts to reflect these compatibility adjustments. The use of mcore_013 and mcore_014 flags throughout the codebase is a robust approach to manage API differences across megatron-core versions, ensuring the system remains functional and adaptable.

swift/megatron/model/gpt/qwen3_next.py

swift/megatron/model/gpt_model.py

examples/models/qwen3_next/mcore.sh

swift/megatron/model/gpt/qwen3_next.py

swift/megatron/model/gpt_model.py

docs/source/Megatron-SWIFT/Command-line-parameters.md

docs/source/Megatron-SWIFT/Quick-start.md

swift/megatron/__init__.py

swift/megatron/trainers/base.py

swift/megatron/trainers/reward_trainer.py

update

3a8289c

gemini-code-assist bot reviewed Nov 14, 2025

update

3a85e8b

gemini-code-assist bot reviewed Nov 14, 2025

swift/megatron/trainers/utils.py (outdated)

fix

e2bb8bf

hjh0119 approved these changes Nov 14, 2025

Jintao-Huang added 2 commits November 14, 2025 14:01

fix

7ec2d87

update

ff0e0a8

tastelikefeet approved these changes Nov 14, 2025

Jintao-Huang added 3 commits November 14, 2025 14:21

fix qwen3_next

f9e9231

fix

3fdcdfc

update

193786e

gemini-code-assist bot reviewed Nov 14, 2025

Jintao-Huang merged commit 82a2b22 into modelscope:main Nov 14, 2025
1 of 2 checks passed

vx120 pushed a commit to vx120/ms-swift that referenced this pull request Nov 19, 2025

[megatron] compat megatron-core 0.12-0.14 (modelscope#6599)

1ee2cd4


3 participants