Skip to content

Conversation

yma11
Copy link
Contributor

@yma11 yma11 commented Sep 18, 2025

Purpose

The query of models like OpenGVLab/InternVL3_5-38B is 4 dimension so will pop error:

^[[1;36m(Worker_TP2 pid=946)^[[0;0m ERROR 09-15 13:21:14 [multiproc_executor.py:654]   File "/workspace/vllm/vllm/attention/layer.py", line 383, in forward

^[[1;36m(Worker_TP2 pid=946)^[[0;0m ERROR 09-15 13:21:14 [multiproc_executor.py:654]     bsz, q_len, _ = query.size()

^[[1;36m(Worker_TP2 pid=946)^[[0;0m ERROR 09-15 13:21:14 [multiproc_executor.py:654]     ^^^^^^^^^^^^^

^[[1;36m(Worker_TP2 pid=946)^[[0;0m ERROR 09-15 13:21:14 [multiproc_executor.py:654] ValueError: too many values to unpack (expected 3)

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@yma11
Copy link
Contributor Author

yma11 commented Sep 18, 2025

@Isotr0py can you help review this?

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a ValueError that occurs in MultiHeadAttention when processing input query tensors with more than three dimensions, as seen with models like OpenGVLab/InternVL3_5-38B. The fix correctly extracts the batch size and sequence length by slicing the shape tuple, making the implementation more robust and compatible with higher-dimensional inputs. The change is correct and effectively resolves the bug. I've added one high-severity comment regarding an outdated docstring, which should be updated to reflect the new capability and ensure long-term maintainability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While this change correctly handles tensors with more than 3 dimensions, the method's docstring on line 433 was not updated. It still states Input shape: batch_size x seq_len x hidden_size, which is now misleading. Inaccurate documentation for a core component like this can lead to incorrect usage and bugs in the future. Please update the docstring to reflect that the input can have more than three dimensions, for example, by mentioning that additional dimensions are allowed after seq_len.

Copy link
Member

@Isotr0py Isotr0py left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Input shape: batch_size x seq_len x hidden_size"""
"""Input shape:
(batch_size x seq_len x hidden_size) or
(batch_size x seq_len x num_heads x head_size)
"""

Let's also update the docstring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also remove TODO(Isotr0py): Use existing backend implementations and support FA3 as FA3 supported by #24337?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

Signed-off-by: Yan Ma <yan.ma@intel.com>
Signed-off-by: Yan Ma <yan.ma@intel.com>
@Isotr0py Isotr0py enabled auto-merge (squash) September 19, 2025 06:43
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 19, 2025
@Isotr0py Isotr0py merged commit a684c01 into vllm-project:main Sep 19, 2025
52 checks passed
debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025
…ect#25146)

Signed-off-by: Yan Ma <yan.ma@intel.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
…ect#25146)

Signed-off-by: Yan Ma <yan.ma@intel.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
…ect#25146)

Signed-off-by: Yan Ma <yan.ma@intel.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: charlifu <charlifu@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready ONLY add when PR is ready to merge/full CI is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants