Fix conversion mapping for Qwen3VL-MoE #43953

Closed
zucchini-nlp wants to merge 1 commit into huggingface:main from zucchini-nlp:fix-qwen2-vl-moe

Conversation

@zucchini-nlp
Member

Fixes #43931. No transpose is needed after standardizing the model implementation to inherit from Qwen3-MoE.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Cyrilvallez
Member

Nope, this is not the right conversion. The weights are already merged into the final experts, but they need transposition, see #43913!

@Cyrilvallez
Member

Though if all weights ONLY need the transpose, that PR would be much simpler actually - gonna check

@zucchini-nlp
Member Author

> Weights are already merged

Ah right, didn't see that. Indeed, we shouldn't need to merge. In that case no conversion is needed, I guess.

@Cyrilvallez
Member

They still need to be transposed 🥲 Finally fixed it in #44037
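
To make the distinction in this thread concrete (merging per-expert weights versus only transposing an already merged tensor), here is a minimal sketch. It uses plain nested lists rather than tensors so it runs without dependencies. The helper `transpose_last_two`, the name `merged_experts`, and the shapes are hypothetical illustrations, not the transformers conversion code, and the thread does not state which axis order the real checkpoint uses.

```python
def transpose_last_two(experts):
    """Swap the last two axes of a [num_experts][rows][cols] nested list."""
    return [[list(col) for col in zip(*matrix)] for matrix in experts]


# Hypothetical, already merged expert weights: 2 experts, each a 2x3 matrix.
# No per-expert merging step is needed because the expert axis already exists.
merged_experts = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

# The only conversion applied is a transpose of each expert's matrix.
converted = transpose_last_two(merged_experts)

# Each expert is now 3x2; the expert axis itself is untouched.
shape = (len(converted), len(converted[0]), len(converted[0][0]))
```

With real tensors the same operation would normally be a single transpose of the last two dimensions. Skipping it typically shows up as a shape mismatch at load time, which is the symptom named in the linked issue.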

Development

Successfully merging this pull request may close these issues.

Model loading error: weight shapes mismatch of Qwen3-VL-30B-A3B-Instruct
