@JJJYmmm JJJYmmm commented Sep 28, 2025

Purpose

fix QwenLM/Qwen3-VL#1523

Test Plan

Tested with the following command:

python3 -m vllm.entrypoints.openai.api_server \
  --model /path/to/Qwen3-VL-235B-A22B-Instruct \
  --served-model-name Qwen3-VL-235B-A22B-Instruct \
  --tensor-parallel-size 4 \
  --pipeline-parallel-size 2 \
  --mm-encoder-tp-mode data \
  --limit-mm-per-prompt.video 0 \
  --mm-processor-cache-type shm \
  --enable-expert-parallel \
  --host 0.0.0.0 \
  --port 22002 \
  --dtype bfloat16 \
  --gpu-memory-utilization 0.95 \
  --distributed-executor-backend mp \
  --max-model-len 40960
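Once the server from the launch command above is up, a quick sanity request can confirm that the model loads and serves. This is a minimal standard-library sketch, not part of the original test plan; the host, port, and served model name mirror the launch command and are assumptions if your deployment differs.

```python
import json
import urllib.request

def build_chat_request(host: str, port: int, model: str, prompt: str):
    """Build an OpenAI-compatible chat completion request for the vLLM server."""
    url = f"http://{host}:{port}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 32,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("0.0.0.0", 22002, "Qwen3-VL-235B-A22B-Instruct", "Hello")
print(req.full_url)  # http://0.0.0.0:22002/v1/chat/completions

# Sending the request requires the server launched above to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```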

Test Result

before

(EngineCore_DP0 pid=9735) ERROR 09-28 20:05:29 [core.py:712] param = params_dict[name]
(EngineCore_DP0 pid=9735) ERROR 09-28 20:05:29 [core.py:712] ~~~~~~~~~~~^^^^^^
(EngineCore_DP0 pid=9735) ERROR 09-28 20:05:29 [core.py:712] KeyError: 'layers.0.mlp.experts.w2_weight'

now fixed!

Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
@JJJYmmm JJJYmmm requested a review from sighingnow as a code owner September 28, 2025 14:23
@JJJYmmm JJJYmmm changed the title [Bugfix] fix Qwen3VLMoe load when pp > 2 [Bugfix] fix Qwen3VLMoe load when pp > 1 Sep 28, 2025
@mergify mergify bot added the qwen Related to Qwen models label Sep 28, 2025
@gemini-code-assist gemini-code-assist bot left a comment
Code Review

This pull request addresses a KeyError that occurs when loading Qwen3VLMoe models with pipeline parallelism (pp > 1). The issue stems from the is_pp_missing_parameter check being placed inside the non-fused-experts branch only, so it was skipped for fused experts. The fix relocates the check to the common path before the branch between fused and non-fused experts, ensuring that weights belonging to experts on other pipeline stages are correctly skipped. The change is logical, well-targeted, and effectively fixes the reported bug.
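The relocation described above can be illustrated with a minimal, self-contained sketch. The helper and loop below are simplified stand-ins for vLLM's actual loader code (the real is_pp_missing_parameter and load_weights have different signatures); the point is only the control flow: the pipeline-parallel ownership check must run on the common path, before the fused/non-fused branch, or fused-expert weights for layers on other stages hit the KeyError from the trace.

```python
def is_pp_missing_parameter(name: str, local_layers: set[int]) -> bool:
    """Simplified stand-in: True if the named layer lives on another PP stage."""
    if name.startswith("layers."):
        layer_idx = int(name.split(".")[1])
        return layer_idx not in local_layers
    return False

def load_expert_weights(weights, params_dict, local_layers, fused_experts=True):
    loaded = []
    for name, tensor in weights:
        # The fix: skip other-stage weights here, on the common path.
        # Previously only the non-fused branch performed this check, so
        # fused-expert weights reached the params_dict lookup below and
        # raised KeyError for layers owned by other pipeline stages.
        if is_pp_missing_parameter(name, local_layers):
            continue
        if fused_experts:
            _ = params_dict[name]  # lookup that raised KeyError before the fix
        loaded.append(name)
    return loaded

# This stage owns only layer 1, so the layer-0 expert weight must be skipped.
weights = [
    ("layers.0.mlp.experts.w2_weight", "t0"),
    ("layers.1.mlp.experts.w2_weight", "t1"),
]
params_dict = {"layers.1.mlp.experts.w2_weight": "p1"}
print(load_expert_weights(weights, params_dict, {1}))
# → ['layers.1.mlp.experts.w2_weight']
```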

@Isotr0py Isotr0py left a comment
Thanks for fixing!

@Isotr0py Isotr0py enabled auto-merge (squash) September 28, 2025 16:09
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 28, 2025
@Isotr0py Isotr0py added this to the v0.11.0 Cherry Picks milestone Sep 28, 2025
@Isotr0py Isotr0py merged commit 471997a into vllm-project:main Sep 28, 2025
54 checks passed
baonudesifeizhai pushed a commit to baonudesifeizhai/vllm that referenced this pull request Sep 28, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
simon-mo pushed a commit that referenced this pull request Sep 29, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: simon-mo <simon.mo@hey.com>
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: simon-mo <simon.mo@hey.com>
shyeh25 pushed a commit to shyeh25/vllm that referenced this pull request Oct 14, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: simon-mo <simon.mo@hey.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
Signed-off-by: liuye.hj <liuye.hj@alibaba-inc.com>
Co-authored-by: liuye.hj <liuye.hj@alibaba-inc.com>

Development

Successfully merging this pull request may close these issues.

Error in qwen3_vl_moe.py during multi-node, multi-GPU deployment of Qwen3vl

2 participants