Merged

4 changes: 1 addition & 3 deletions vllm/model_executor/models/qwen3_vl.py
@@ -223,9 +223,7 @@ def __init__(

         if norm_layer is None:
             norm_layer = partial(nn.LayerNorm, eps=1e-6)
-        self.use_postshuffle_norm = use_postshuffle_norm
-        self.norm = norm_layer(
-            self.hidden_size if use_postshuffle_norm else context_dim)
+        self.norm = norm_layer(context_dim)
Contributor

critical

This change introduces a bug. Removing the assignment to self.use_postshuffle_norm means the forward method, which relies on that attribute, will raise an AttributeError.

Furthermore, the normalization layer self.norm is now always initialized with context_dim. When use_postshuffle_norm is True, it should instead be initialized with self.hidden_size to match the tensor shape in the forward method; as written, this will cause a shape mismatch error.

To fix this, restore the assignment and the conditional choice of normalization dimension. Here is a suggested correction that keeps the logic correct while slightly improving readability.

        self.use_postshuffle_norm = use_postshuffle_norm
        norm_dim = self.hidden_size if use_postshuffle_norm else context_dim
        self.norm = norm_layer(norm_dim)

Contributor Author

self.use_postshuffle_norm is already initialized on L220, and on L221 context_dim is set equal to self.hidden_size when use_postshuffle_norm is true, so norm_layer(context_dim) is correct in both cases.
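
For context, a minimal sketch of the constructor as this reply describes it. The class name, signature, and defaults here are reconstructed from the thread, not copied from the actual file:

    from functools import partial

    import torch.nn as nn

    class Qwen3_VisionPatchMerger(nn.Module):
        # Sketch only: the signature and defaults below are assumptions.
        def __init__(self, context_dim, spatial_merge_size=2,
                     use_postshuffle_norm=False, norm_layer=None):
            super().__init__()
            self.hidden_size = context_dim * (spatial_merge_size**2)

            # The two lines the reply points at (L220/L221): the flag is
            # stored, and context_dim is rebound to hidden_size when the
            # postshuffle norm is enabled...
            self.use_postshuffle_norm = use_postshuffle_norm
            if self.use_postshuffle_norm:
                context_dim = self.hidden_size

            if norm_layer is None:
                norm_layer = partial(nn.LayerNorm, eps=1e-6)
            # ...so norm_layer(context_dim) is equivalent to the removed
            # `self.hidden_size if use_postshuffle_norm else context_dim`.
            self.norm = norm_layer(context_dim)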

        self.linear_fc1 = ColumnParallelLinear(self.hidden_size,
                                               self.hidden_size,
                                               bias=True,