
[Fix] Fix some bugs/clean up #1756

Merged
zhuzilin merged 1 commit into THUDM:main from coding-famer:bug_fix on Mar 29, 2026

Conversation

@coding-famer
Contributor

  1. Use load_path instead of args.hf_checkpoint in _load_checkpoint_hf.
  2. Remove multimodal_num_items, which is used for FSDP.
  3. In newer Transformers versions, the processor returns mm_token_type_ids by default. It is not a tensor and is not used by Megatron, so set return_mm_token_type_ids to False.
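
To illustrate item 3, here is a minimal sketch of the opt-out. It is not the code from this PR: `StubProcessor` and `encode` are hypothetical names, and the stub only imitates the behavior described above (an extra non-tensor `mm_token_type_ids` field returned unless `return_mm_token_type_ids` is set to False). It does not call Transformers.

```python
class StubProcessor:
    """Hypothetical stand-in for a multimodal processor (not the Transformers API)."""

    def __call__(self, text, return_mm_token_type_ids=True):
        # Placeholder token ids; a real processor would tokenize `text`.
        out = {"input_ids": [[1, 2, 3]]}
        if return_mm_token_type_ids:
            # Imitates the default described above: an extra, non-tensor field.
            out["mm_token_type_ids"] = [[0, 1, 0]]
        return out


def encode(processor, text):
    # Opt out explicitly so the unused field never reaches the training batch.
    return processor(text, return_mm_token_type_ids=False)


processor = StubProcessor()
default_batch = processor("describe the image")
batch = encode(processor, "describe the image")
print(sorted(default_batch))  # includes mm_token_type_ids
print(sorted(batch))          # input_ids only
```

Passing the flag at the call site keeps the batch limited to fields the training backend consumes, instead of filtering the output afterwards.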

@zhuzilin zhuzilin merged commit 6f70479 into THUDM:main Mar 29, 2026
SmallMelon-L pushed a commit to fnlp-agentRL/slime that referenced this pull request Mar 31, 2026
@coding-famer coding-famer deleted the bug_fix branch April 13, 2026 12:59
AstroSkape pushed a commit to Andrew-Koulogeorge/slime that referenced this pull request Apr 16, 2026
