[qwen3.5] compat transformers 5.4.0#18

Merged
Jintao-Huang merged 2 commits into modelscope:main from Jintao-Huang:compat_transformers_540
Apr 8, 2026

Conversation

@Jintao-Huang (Collaborator)

No description provided.


@gemini-code-assist bot left a comment


Code Review

This pull request introduces an is_mtp flag to several methods in gpt_bridge.py so that Multi-Token Prediction (MTP) layers are handled correctly during model conversion, particularly for MoE architectures. It also adds a safety check for labels in the _postprocess method of gpt_model.py. The review suggests improving _get_hf_experts_attr to better support multimodal models built on Qwen 3.5 MoE, and removing redundant entries so that MTP layers correctly fall through the lookup logic.

@Jintao-Huang Jintao-Huang merged commit 975c8e7 into modelscope:main Apr 8, 2026
1 check passed
