feat: support override HF model name in convert_megatron_to_hf #2202
Merged
yuki-97 merged 1 commit on Apr 4, 2026
yuki-97
reviewed
Apr 3, 2026
Contributor
Thanks @dhineshkumar-r, changes lgtm.
Could you add what you mentioned in the PR description to docs/design-docs/checkpointing.md? This will help people who run into the same situation.
5951c90 to
bed5b3b
4d22f5a to
d162d13
Contributor
Author
Done. Please take a look.
yuki-97
previously approved these changes
Apr 4, 2026
Contributor
/ok to test d162d13
Contributor
hi @dhineshkumar-r, there's a lint check failure, could you use
auto-merge was automatically disabled
April 4, 2026 14:26
Head branch was pushed to by a user without write access
d162d13 to
33bdeec
…F format. Signed-off-by: Dhineshkumar Ramasubbu <dhineshkumar.ramasubbu@gmail.com>
33bdeec to
3ea26b7
Contributor
Author
Yes, I don't see it fail anymore. Please let me know if anything else is needed.
yuki-97
approved these changes
Apr 4, 2026
Contributor
yuki-97
left a comment
thanks, let me re-trigger CI.
Contributor
/ok to test 3ea26b7
…F format.
What does this PR do?
Adds a way to override hf_model_name when converting checkpoints from Megatron to HF format. This is useful for models like GPT-OSS, whose base checkpoint precision (mxfp4) differs from the export precision supported in Megatron-Bridge (bfloat16) (see Ref).
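The override logic described here can be pictured with a minimal sketch. This is not the code from this PR: the helper name resolve_hf_model_name and the config layout (a "policy" / "model_name" entry) are assumptions made for illustration only. The idea is simply that an explicitly passed HF model name takes precedence over the one recorded with the Megatron checkpoint.

```python
from typing import Optional


def resolve_hf_model_name(config: dict, override: Optional[str] = None) -> str:
    """Pick the HF model name used to look up the export config.

    `config` stands in for the training config saved with the Megatron
    checkpoint. Its key layout is hypothetical, not taken from the PR.
    """
    if override:
        # An explicit override wins, e.g. a bf16 variant of the base model.
        return override
    # Otherwise fall back to the model name recorded at training time.
    return config["policy"]["model_name"]


# Example: the checkpoint was trained from the mxfp4 base model, but the
# export should use the config of a bf16 variant.
saved_config = {"policy": {"model_name": "openai/gpt-oss-20b"}}
default_name = resolve_hf_model_name(saved_config)
export_name = resolve_hf_model_name(saved_config, "unsloth/gpt-oss-20b-BF16")
```

Treating an empty or missing override as "no override" keeps existing conversions unchanged, which is likely why an optional argument suits this case.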
Issues
List issues that this PR closes (syntax):
closes #2124
Usage
If openai/gpt-oss-20b is finetuned in bfloat16 precision and the checkpoints are stored in Megatron format, the override argument can be used to pass the supported unsloth/gpt-oss-20b-BF16 HF model name, so that the config corresponding to bf16 precision is used.
Before your PR is "Ready for review"
Pre checks:
Additional Information