-
Notifications
You must be signed in to change notification settings - Fork 994
[megatron] compat megatron-core 0.12-0.14 #6599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[megatron] compat megatron-core 0.12-0.14 #6599
Conversation
Summary of ChangesHello @Jintao-Huang, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the compatibility of Megatron-SWIFT with a broader range of Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces compatibility for megatron-core versions 0.12 through 0.14. The changes primarily consist of conditional logic based on version checks to handle API differences, particularly in model and layer specification initialization. The approach is sound and the implementation appears correct. The documentation has also been updated accordingly. I have one suggestion to improve maintainability by centralizing the version check logic.
swift/megatron/model/gpt_model.py
Outdated
|
|
||
| logger = get_logger() | ||
|
|
||
| megatron_core_013 = version.parse(megatron.core.__version__) >= version.parse('0.13.0rc0') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version check logic is also present in swift/megatron/model/model_provider.py. To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, consider centralizing this version check into a shared utility file (e.g., within swift/megatron/utils/). This would prevent potential inconsistencies and make future version-dependent changes easier to manage.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces compatibility for megatron-core versions 0.12 through 0.14. The changes primarily consist of adding version checks and conditional logic to handle API differences across these versions, along with some variable renames for conciseness. However, I've identified a critical bug in swift/megatron/trainers/utils.py where the logic for handling pipeline parallelism stages is inverted, which would break the training pipeline. This needs to be addressed.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request primarily focuses on enhancing compatibility with different versions of megatron-core (0.12-0.14). The changes involve introducing version-specific conditional logic, refactoring variable names for better readability, and updating documentation and example scripts to reflect these compatibility adjustments. The use of mcore_013 and mcore_014 flags throughout the codebase is a robust approach to manage API differences across megatron-core versions, ensuring the system remains functional and adaptable.
No description provided.