@ywang96 ywang96 commented Sep 20, 2025

Purpose

Previously, the self.load_fused_expert_weights check was too strict and prevented the server from launching with --enable-expert-parallel. This PR relaxes the check so the server launches correctly.

Test Plan

MMMU results from servers launched with and without --enable-expert-parallel matched.

Test Result



Signed-off-by: Roger Wang <hey@rogerw.io>
@ywang96 ywang96 requested a review from sighingnow as a code owner September 20, 2025 03:52
@mergify mergify bot added the qwen Related to Qwen models label Sep 20, 2025
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request fixes a bug in the weight loading logic for Qwen3-VL-MoE models when using expert parallelism (EP). The previous implementation of load_fused_expert_weights was too strict, causing weight loading to fail if any expert was not present on a given rank. The change correctly modifies the logic to consider the loading successful if at least one expert's weights are loaded on the rank, which is the expected behavior for expert parallelism. The changes are correct and effectively address the issue. There is also a minor whitespace change to add a newline at the end of a file.
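
The difference between the strict and relaxed checks described above can be illustrated with a minimal sketch. This is not the vLLM implementation: the function names load_fused_strict, load_fused_relaxed, and make_local_loader are hypothetical, and the per-expert loader is reduced to a membership test on the set of experts owned by the current rank.

```python
from typing import Callable, Iterable, Set


def make_local_loader(local_experts: Set[int]) -> Callable[[int], bool]:
    """Hypothetical stand-in for a per-expert weight loader.

    Returns True only when the expert lives on this rank, mimicking
    expert parallelism where each rank owns a subset of the experts.
    """
    def load_one(expert_id: int) -> bool:
        return expert_id in local_experts
    return load_one


def load_fused_strict(expert_ids: Iterable[int],
                      load_one: Callable[[int], bool]) -> bool:
    """Assumed old behavior: fail as soon as any expert is not loaded."""
    for expert_id in expert_ids:
        if not load_one(expert_id):
            return False
    return True


def load_fused_relaxed(expert_ids: Iterable[int],
                       load_one: Callable[[int], bool]) -> bool:
    """Assumed new behavior: succeed if at least one expert was loaded."""
    loaded_any = False
    for expert_id in expert_ids:
        if load_one(expert_id):
            loaded_any = True
    return loaded_any


if __name__ == "__main__":
    # A rank that owns experts 0-3 out of 8 under expert parallelism.
    loader = make_local_loader({0, 1, 2, 3})
    print("strict: ", load_fused_strict(range(8), loader))
    print("relaxed:", load_fused_relaxed(range(8), loader))
```

Under these assumptions, a rank that owns only half of the experts fails the strict check but passes the relaxed one, which matches the failure mode the review describes for expert parallelism.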

@ywang96 ywang96 requested a review from Isotr0py September 20, 2025 03:55
@Isotr0py Isotr0py enabled auto-merge (squash) September 20, 2025 04:05
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 20, 2025
@vllm-bot vllm-bot merged commit be874c0 into vllm-project:main Sep 20, 2025
56 of 58 checks passed
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: charlifu <charlifu@amd.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: yewentao256 <zhyanwentao@126.com>