Remove Redundant Assignment in Qwen3_VisionPatchMerger #25224
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs will not trigger a full CI run by default; only a select subset of tests runs automatically. You can ask your reviewers to trigger additional CI tests on top of that subset. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review

This pull request aims to refactor `Qwen3_VisionPatchMerger` to remove redundant assignments. However, the current change introduces a critical bug by removing the assignment to `self.use_postshuffle_norm`, which is used in the `forward` method. This will lead to an `AttributeError` at runtime. Additionally, the logic for initializing the normalization layer is now incorrect when `use_postshuffle_norm` is true, which would cause a tensor shape mismatch. I've provided a suggestion to fix this.
```diff
- self.use_postshuffle_norm = use_postshuffle_norm
- self.norm = norm_layer(
-     self.hidden_size if use_postshuffle_norm else context_dim)
+ self.norm = norm_layer(context_dim)
```
This change introduces a bug. By removing the assignment to `self.use_postshuffle_norm`, the `forward` method will raise an `AttributeError` because it relies on this attribute.

Furthermore, the normalization layer `self.norm` is now always initialized with `context_dim`. However, when `use_postshuffle_norm` is `True`, it should be initialized with `self.hidden_size` to match the tensor shape in the `forward` method. The current implementation will cause a shape mismatch error.

To fix this, you should restore the assignment and the conditional logic for the normalization dimension. Here is a suggested correction that keeps the logic correct while improving readability slightly.
```python
self.use_postshuffle_norm = use_postshuffle_norm
norm_dim = self.hidden_size if use_postshuffle_norm else context_dim
self.norm = norm_layer(norm_dim)
```
`self.use_postshuffle_norm` is already initialized at L220, and `context_dim` will already be equal to `self.hidden_size` at L221 when `self.use_postshuffle_norm` is true.
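To make the redundancy argument concrete, here is a minimal sketch of the inheritance pattern being described. The class and parameter names below (other than `use_postshuffle_norm`) are simplified assumptions for illustration, not the actual vLLM code: the point is only that an attribute set by an earlier/base initializer does not need to be re-assigned by the subclass, and that the norm dimension is already resolved there.

```python
class PatchMergerBase:
    """Stands in for the earlier initializer (around L220-221 in the
    reviewed file): it stores use_postshuffle_norm and picks the
    normalization dimension before the subclass code runs."""

    def __init__(self, context_dim, hidden_size, use_postshuffle_norm=False):
        self.hidden_size = hidden_size
        self.use_postshuffle_norm = use_postshuffle_norm
        # When the flag is set, the norm dimension is hidden_size,
        # so a later conditional on the flag adds nothing new.
        self.norm_dim = hidden_size if use_postshuffle_norm else context_dim


class PatchMerger(PatchMergerBase):
    def __init__(self, context_dim, hidden_size, use_postshuffle_norm=False):
        super().__init__(context_dim, hidden_size, use_postshuffle_norm)
        # Re-assigning self.use_postshuffle_norm here would be redundant:
        # the base initializer above has already set it, so forward()
        # can read it without any AttributeError.


m = PatchMerger(context_dim=1024, hidden_size=4096, use_postshuffle_norm=True)
print(m.use_postshuffle_norm)  # True -- inherited from the base initializer
print(m.norm_dim)              # 4096 -- already equals hidden_size
```

Under these assumptions, deleting the subclass re-assignment changes nothing observable, which is why the PR's simplification is safe.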
Ah thanks for finding this out!
Thanks!
…25224) Signed-off-by: Junhong <liujunhong11@huawei.com> Co-authored-by: Junhong <liujunhong11@huawei.com> Co-authored-by: Roger Wang <hey@rogerw.io>
…25224) Signed-off-by: Junhong <liujunhong11@huawei.com> Co-authored-by: Junhong <liujunhong11@huawei.com> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: charlifu <charlifu@amd.com>
Purpose
This PR refactors the `Qwen3_VisionPatchMerger` class to remove redundant assignments related to `use_postshuffle_norm` and the initialization of the normalization layer.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model.