[docs] update megatron docs #9249

Merged: Jintao-Huang merged 5 commits into modelscope:main from Jintao-Huang:update_megatron_docs_0439 on Apr 30, 2026

@Jintao-Huang
Collaborator

No description provided.

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request adds PR submission guidelines and updates the documentation index in both Chinese and English. Feedback points out that the referenced 'Megatron-SWIFT/Custom-Model.md' file is missing, which will result in broken links. It is also recommended to update the pre-commit instructions to include 'pre-commit install' for setting up automated git hooks.
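
The pre-commit recommendation amounts to roughly the following steps. This is a minimal sketch rather than the project's documented procedure: `setup_pre_commit` is a hypothetical wrapper name, and it assumes `pip` is available and that it is called from a repository root containing a `.pre-commit-config.yaml`.

```shell
# Hypothetical helper (not part of the repository): install pre-commit,
# register its git hook, and run every configured hook once.
# Assumes the current directory is the repository root.
setup_pre_commit() {
  # Install the tool only if it is not already on PATH.
  command -v pre-commit >/dev/null 2>&1 || pip install pre-commit || return 1
  # Register the git hook so the checks run on every `git commit`.
  pre-commit install || return 1
  # Check the whole tree once, not only the staged files.
  pre-commit run --all-files
}
```

Wrapping the commands in a function keeps the snippet free of side effects when it is sourced; calling `setup_pre_commit` runs the three steps in order and stops at the first failure.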

Comment thread docs/source/index.rst
Megatron-SWIFT/GRPO.md
Megatron-SWIFT/GKD.md
Megatron-SWIFT/Ascend.md
Megatron-SWIFT/Custom-Model.md

Priority: high

The file Megatron-SWIFT/Custom-Model.md is being added to the table of contents, but this file does not appear to be included in the pull request's changes or existing files. Please ensure that this file exists in the correct path to avoid broken links in the documentation.

Comment thread docs/source_en/index.rst
Megatron-SWIFT/GRPO.md
Megatron-SWIFT/GKD.md
Megatron-SWIFT/Ascend.md
Megatron-SWIFT/Custom-Model.md

Priority: high

The file Megatron-SWIFT/Custom-Model.md is being added to the table of contents, but this file does not appear to be included in the pull request's changes or existing files. Please ensure that this file exists in the correct path to avoid broken links in the documentation.
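
A missing toctree target like this one can be caught before review with a small check. The sketch below is illustrative only: `check_toctree` is a hypothetical helper, and it assumes that toctree entries are indented lines ending in `.md` and that they resolve relative to the directory containing `index.rst`.

```shell
# Hypothetical helper (not part of the repository): list toctree entries
# in an index.rst that have no matching file on disk.
# Returns 0 when every entry exists, 1 otherwise.
check_toctree() {
  index_file="$1"
  base_dir=$(dirname "$index_file")
  # Keep only indented lines that consist of a single path ending in .md,
  # then report each path that does not exist under base_dir.
  missing=$(
    sed -n -E 's/^[[:space:]]+([^[:space:]]+\.md)[[:space:]]*$/\1/p' "$index_file" |
    while IFS= read -r entry; do
      [ -f "$base_dir/$entry" ] || echo "missing: $entry"
    done
  )
  if [ -n "$missing" ]; then
    echo "$missing"
    return 1
  fi
  return 0
}
```

For this pull request it would be run as `check_toctree docs/source/index.rst` and `check_toctree docs/source_en/index.rst`; a non-zero exit status means at least one entry has no file behind it.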

Comment thread docs/source/BestPractices/MLLM-Registration.md
Comment thread docs/source_en/BestPractices/MLLM-Registration.md
@Jintao-Huang
Collaborator Author

/gemini review

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request introduces a new documentation guide, "Megatron-SWIFT Custom Model," in both Chinese and English, detailing the process of registering and training custom models using Mcore-Bridge. It also updates existing registration guides with instructions for using pre-commit hooks before submitting pull requests. Review feedback suggests replacing hardcoded local file paths and specific checkpoint directories with generic placeholders in the shell command examples to improve clarity and usability for end-users.

NPROC_PER_NODE=2 \
CUDA_VISIBLE_DEVICES=0,1 \
megatron sft \
--model /root/.cache/modelscope/models/MiniMax/MiniMax-M2.7 \

Priority: medium

It is recommended to use a placeholder like <model_path> instead of a hardcoded local path to make the example more general and easier for users to follow.

Suggested change:
- --model /root/.cache/modelscope/models/MiniMax/MiniMax-M2.7 \
+ --model <model_path> \

```shell
CUDA_VISIBLE_DEVICES=0 \
swift infer \
--model megatron_output/v3-20260430-143926/checkpoint-310 \
```

Priority: medium

The checkpoint path is hardcoded to a specific timestamped directory. Using a placeholder like <checkpoint_path> would be clearer for users.

Suggested change:
- --model megatron_output/v3-20260430-143926/checkpoint-310 \
+ --model <checkpoint_path> \

NPROC_PER_NODE=2 \
CUDA_VISIBLE_DEVICES=0,1 \
megatron sft \
--model /root/.cache/modelscope/models/MiniMax/MiniMax-M2.7 \

Priority: medium

It is recommended to use a placeholder like <model_path> instead of a hardcoded local path to make the example more general and easier for users to follow.

Suggested change:
- --model /root/.cache/modelscope/models/MiniMax/MiniMax-M2.7 \
+ --model <model_path> \

```shell
CUDA_VISIBLE_DEVICES=0 \
swift infer \
--model megatron_output/v3-20260430-143926/checkpoint-310 \
```

Priority: medium

The checkpoint path is hardcoded to a specific timestamped directory. Using a placeholder like <checkpoint_path> would be clearer for users.

Suggested change:
- --model megatron_output/v3-20260430-143926/checkpoint-310 \
+ --model <checkpoint_path> \
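
Where an example still needs a concrete value for `<checkpoint_path>`, one option is to resolve it at run time instead of hardcoding a timestamped directory. The following is a minimal sketch, assuming the `<output_root>/<run>/checkpoint-<step>` layout implied by the path in the comment above; `latest_checkpoint` is a hypothetical helper, not a SWIFT command.

```shell
# Hypothetical helper (not part of SWIFT): print the most recently
# modified checkpoint directory under an output root.
# Assumes the layout <output_root>/<run>/checkpoint-<step>.
latest_checkpoint() {
  output_root="${1:-megatron_output}"
  newest=""
  for dir in "$output_root"/*/checkpoint-*; do
    # Skip the unexpanded glob pattern and anything that is not a directory.
    [ -d "$dir" ] || continue
    # -nt compares modification times; bash and dash support it,
    # although it is not strictly POSIX.
    if [ -z "$newest" ] || [ "$dir" -nt "$newest" ]; then
      newest="$dir"
    fi
  done
  # Fail when no checkpoint directory was found.
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}
```

The result can then be passed on the command line, for example `--model "$(latest_checkpoint megatron_output)"`, alongside whatever other flags the example uses.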

@Jintao-Huang merged commit e79c3d5 into modelscope:main on Apr 30, 2026
1 check passed