[model] Support Marco #9137

Merged
Jintao-Huang merged 4 commits into modelscope:main from Jintao-Huang:support_maro
Apr 17, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the Qwen3.5 best practices documentation by adding links to the Megatron-LM repository and introduces support for Marco-Nano and Marco-Mini models. However, these dense models were incorrectly registered under a Mixture-of-Experts (MoE) architecture block, which will cause loading errors as they do not match the expected architecture.

Comment thread swift/model/models/qwen.py Outdated
Comment on lines +611 to +617
ModelGroup([
    Model('AIDC-AI/Marco-Nano-Base', 'AIDC-AI/Marco-Nano-Base'),
    Model('AIDC-AI/Marco-Nano-Instruct', 'AIDC-AI/Marco-Nano-Instruct'),
    Model('AIDC-AI/Marco-Mini-Base', 'AIDC-AI/Marco-Mini-Base'),
    Model('AIDC-AI/Marco-Mini-Instruct', 'AIDC-AI/Marco-Mini-Instruct'),
    Model('AIDC-AI/Marco-Mini-Global-Base', 'AIDC-AI/Marco-Mini-Global-Base'),
], TemplateType.qwen3),
medium

The Marco-Nano and Marco-Mini models are dense models, but they are currently registered within the LLMModelType.qwen3_moe block, which is configured with architectures=['Qwen3MoeForCausalLM']. Since these models do not have a Mixture-of-Experts architecture, loading them using this registration will fail. They should be moved to a dense model registration group, such as LLMModelType.qwen3 (which uses Qwen3ForCausalLM) or LLMModelType.qwen2 (which uses Qwen2ForCausalLM), depending on their underlying architecture.
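
One way to confirm which registration group a checkpoint belongs in is to compare the `architectures` field of its `config.json` against the architectures each group expects. The sketch below is only illustrative: `EXPECTED_ARCHITECTURES` and `matching_groups` are hypothetical names, not part of ms-swift, and the table simply mirrors the three group/architecture pairs quoted in this comment.

```python
from __future__ import annotations

from typing import Mapping

# Hypothetical lookup table (an assumption for illustration, not ms-swift's registry).
# It mirrors the group -> architecture pairs named in the review comment.
EXPECTED_ARCHITECTURES: dict[str, list[str]] = {
    "qwen2": ["Qwen2ForCausalLM"],
    "qwen3": ["Qwen3ForCausalLM"],
    "qwen3_moe": ["Qwen3MoeForCausalLM"],
}


def matching_groups(
    config: Mapping,
    expected: Mapping[str, list[str]] = EXPECTED_ARCHITECTURES,
) -> list[str]:
    """Return the group names whose expected architectures overlap
    with the `architectures` list declared in a model config dict."""
    declared = set(config.get("architectures") or [])
    return [group for group, archs in expected.items() if declared & set(archs)]


# Example with made-up config dicts (not the real Marco configs):
dense_config = {"architectures": ["Qwen3ForCausalLM"]}
moe_config = {"architectures": ["Qwen3MoeForCausalLM"]}
print(matching_groups(dense_config))
print(matching_groups(moe_config))
```

In practice the dict would come from loading the checkpoint's own `config.json` (for example with `json.load`), and an empty result would indicate that none of the listed groups fits.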

Jintao-Huang merged commit a977837 into modelscope:main on Apr 17, 2026
3 checks passed
Jintao-Huang added a commit that referenced this pull request Apr 18, 2026
2 participants