Conversation

@whx-sjtu whx-sjtu commented Sep 27, 2025

Purpose

This PR is a supplement to PR #24862, passing to LLMHead some prefix parameters that were previously missed in that PR.
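
The motivation can be sketched as follows (a minimal illustration with hypothetical helper names, not vLLM's actual code): quantization configs typically match modules by their dotted path, so an LM head only gets the right quantization treatment if its constructor receives a prefix that matches its real path in the model.

```python
# Hypothetical sketch: quantization configs often match modules by their
# dotted path, so the prefix passed to a layer must match the attribute
# name it is assigned to in the parent module.

def maybe_prefix(prefix: str, name: str) -> str:
    """Join a parent prefix and a child module name with a dot."""
    return name if not prefix else f"{prefix}.{name}"

def is_layer_skipped(prefix: str, ignored_layers: list[str]) -> bool:
    """Return True if the layer's full path appears in the ignore list."""
    return prefix in ignored_layers

# Building the LM head's full path from its parent prefix:
lm_head_prefix = maybe_prefix("model", "lm_head")
assert lm_head_prefix == "model.lm_head"

# A config that excludes "model.lm_head" from quantization only takes
# effect if the layer was constructed with that exact prefix; passing an
# empty or mismatched prefix silently breaks the lookup.
assert is_layer_skipped(lm_head_prefix, ["model.lm_head"])
assert not is_layer_skipped("lm_head", ["model.lm_head"])
```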

Test Plan

No extra tests are needed.

Test Result

All current CI tests should pass.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added deepseek Related to DeepSeek models qwen Related to Qwen models speculative-decoding labels Sep 27, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to pass the prefix parameter to ParallelLMHead instances across various models, which is a crucial step for ensuring correct quantization. While the intent is correct, the implementation introduces inconsistencies in several models where the provided prefix does not match the module's attribute name. This could lead to problems with quantization configurations that rely on accurate module paths. Furthermore, in models that utilize a ModuleList for multiple LM heads, the same prefix is incorrectly applied to all heads, which would prevent them from being quantized differently if needed. I've added specific comments and suggestions to address these inconsistencies.
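
The ModuleList concern above can be illustrated with a small sketch (hypothetical helper, not the PR's actual code): each head in a list of LM heads should receive its own indexed prefix, because reusing one prefix makes the heads indistinguishable to path-based quantization rules.

```python
# Hypothetical sketch: per-head prefixes for a ModuleList of LM heads.
# Quantization rules that target modules by path can only distinguish
# heads if each one is constructed with its own indexed prefix.

def head_prefixes(parent: str, num_heads: int) -> list[str]:
    """Return one dotted, index-qualified prefix per LM head."""
    return [f"{parent}.lm_heads.{i}" for i in range(num_heads)]

prefixes = head_prefixes("model", 3)
assert prefixes == ["model.lm_heads.0",
                    "model.lm_heads.1",
                    "model.lm_heads.2"]

# Passing the same prefix to every head (as the review flags) would
# collapse the paths, so the heads could not be quantized differently:
shared = ["model.lm_head"] * 3
assert len(set(prefixes)) == 3
assert len(set(shared)) == 1
```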

@whx-sjtu whx-sjtu force-pushed the adapt_mtp_quant branch 3 times, most recently from 92c39e0 to d18f880 on September 27, 2025

mergify bot commented Oct 1, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @whx-sjtu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 1, 2025
@whx-sjtu whx-sjtu force-pushed the adapt_mtp_quant branch 2 times, most recently from 6bddb18 to 5163bb6 on October 1, 2025
@mergify mergify bot removed the needs-rebase label Oct 1, 2025
@whx-sjtu whx-sjtu requested a review from Isotr0py October 1, 2025 17:37
@Isotr0py Isotr0py enabled auto-merge (squash) October 2, 2025 06:43
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 2, 2025
Signed-off-by: whx-sjtu <2952154980@qq.com>
auto-merge was automatically disabled October 3, 2025 11:06

Head branch was pushed to by a user without write access

@DarkLight1337 DarkLight1337 merged commit cbf9221 into vllm-project:main Oct 3, 2025
54 checks passed
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: whx-sjtu <2952154980@qq.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Oct 3, 2025