@Jintao-Huang
Collaborator

No description provided.

@gemini-code-assist
Contributor

Summary of Changes

Hello @Jintao-Huang, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the compatibility of Megatron-SWIFT with a broader range of megatron-core versions, specifically targeting 0.12 through 0.14. It introduces dynamic version-based adjustments to API calls and argument handling within the codebase, ensuring smoother operation across different megatron-core environments. Additionally, the documentation has been refined to provide clearer guidance on dependencies and optional configurations, such as running without apex.

Highlights

  • Megatron-Core Compatibility: Implemented version checks and conditional logic to ensure compatibility with megatron-core versions from 0.12 to 0.14, addressing potential API changes in newer releases.
  • Documentation Updates: Updated quick-start guides (both English and Chinese) to specify the megatron_core dependency as >=0.12 and added a note about running Megatron-SWIFT without apex by setting --no_gradient_accumulation_fusion true.
  • Optimizer CPU Offload Requirement: Added a version check to enforce megatron-core>=0.13 when the optimizer_cpu_offload feature is enabled, ensuring proper functionality with this specific optimization.
  • Dynamic Argument Handling: Adjusted how vp_stage, qk_l2_norm, and use_kitchen arguments are passed to megatron-core components based on the detected megatron-core version, accommodating changes in their API signatures.
  • Robust Metric Reduction: Improved the _all_reduce_metric function in the base trainer to use torch.stack for creating the reporting_metric tensor, enhancing robustness during metric aggregation.
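
The version-gated argument handling and the optimizer_cpu_offload requirement described above can be sketched roughly as follows. This is an illustration, not the PR's actual code: the helper names and the per-argument minimum versions in MIN_VERSION_FOR_KWARG are hypothetical (the summary does not say which release introduced which argument). Only the megatron-core>=0.13 requirement for optimizer_cpu_offload comes from the summary.

```python
from typing import Any, Dict, Tuple

# Hypothetical mapping from keyword name to the minimum (major, minor)
# megatron-core version assumed to accept it. These thresholds are
# illustrative assumptions, not values taken from the PR diff.
MIN_VERSION_FOR_KWARG: Dict[str, Tuple[int, int]] = {
    'vp_stage': (0, 13),
    'qk_l2_norm': (0, 13),
    'use_kitchen': (0, 14),
}


def filter_kwargs_for_version(kwargs: Dict[str, Any],
                              installed: Tuple[int, int]) -> Dict[str, Any]:
    """Drop keyword arguments the installed version is assumed not to accept."""
    return {
        name: value
        for name, value in kwargs.items()
        # Keywords without an entry are treated as always supported.
        if installed >= MIN_VERSION_FOR_KWARG.get(name, (0, 0))
    }


def check_optimizer_cpu_offload(enabled: bool, installed: Tuple[int, int]) -> None:
    """Raise if optimizer CPU offload is requested on megatron-core older than 0.13."""
    if enabled and installed < (0, 13):
        raise ValueError('optimizer_cpu_offload requires megatron-core>=0.13')
```

Keeping the thresholds in one table means a caller can build a single kwargs dict and let the filter decide what reaches the megatron-core constructor, instead of scattering if/else branches at every call site.
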

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces compatibility for megatron-core versions 0.12 through 0.14. The changes primarily consist of conditional logic based on version checks to handle API differences, particularly in model and layer specification initialization. The approach is sound and the implementation appears correct. The documentation has also been updated accordingly. I have one suggestion to improve maintainability by centralizing the version check logic.


logger = get_logger()

megatron_core_013 = version.parse(megatron.core.__version__) >= version.parse('0.13.0rc0')

medium

This version check logic is also present in swift/megatron/model/model_provider.py. To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, consider centralizing this version check into a shared utility file (e.g., within swift/megatron/utils/). This would prevent potential inconsistencies and make future version-dependent changes easier to manage.
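
A minimal sketch of what such a shared helper might look like, assuming the packaging library (the version.parse call in the reviewed line matches its API). The function name is_at_least and the constant names are hypothetical. The installed version is passed in as a string so the sketch runs without megatron-core; in real use it would come from megatron.core.__version__.

```python
from packaging import version

# Thresholds use an 'rc0' suffix, as in the reviewed line, so that release
# candidates of a series (e.g. 0.13.0rc2) already count as that series.
MCORE_013_MIN = '0.13.0rc0'
MCORE_014_MIN = '0.14.0rc0'  # assumption: same convention for the 0.14 check


def is_at_least(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at or above `minimum` under PEP 440 ordering."""
    return version.parse(installed) >= version.parse(minimum)


# Hypothetical usage in one shared module, imported everywhere else:
#   import megatron.core
#   mcore_013 = is_at_least(megatron.core.__version__, MCORE_013_MIN)
#   mcore_014 = is_at_least(megatron.core.__version__, MCORE_014_MIN)
```

Under PEP 440 ordering a pre-release sorts before its final release, so comparing against '0.13.0' alone would classify 0.13.0rc2 as older than 0.13, which is why the threshold is written as '0.13.0rc0'.
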

@Jintao-Huang
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces compatibility for megatron-core versions 0.12 through 0.14. The changes primarily consist of adding version checks and conditional logic to handle API differences across these versions, along with some variable renames for conciseness. However, I've identified a critical bug in swift/megatron/trainers/utils.py where the logic for handling pipeline parallelism stages is inverted, which would break the training pipeline. This needs to be addressed.

@Jintao-Huang
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request primarily focuses on enhancing compatibility with different versions of megatron-core (0.12-0.14). The changes involve introducing version-specific conditional logic, refactoring variable names for better readability, and updating documentation and example scripts to reflect these compatibility adjustments. The use of mcore_013 and mcore_014 flags throughout the codebase is a robust approach to manage API differences across megatron-core versions, ensuring the system remains functional and adaptable.

@Jintao-Huang Jintao-Huang merged commit 82a2b22 into modelscope:main Nov 14, 2025
1 of 2 checks passed
vx120 pushed a commit to vx120/ms-swift that referenced this pull request Nov 19, 2025