
Conversation

@zhoukezi (Contributor) commented on Sep 30, 2025

Purpose

In PR #25854, we addressed an issue in MiDashengLM’s audio encoder attention. The initial fix used a hand-written attention implementation, which, per review feedback, was then replaced with an SDPA-based implementation before merge.

SDPA interprets boolean attention masks in the opposite sense from our original hand-written code: `scaled_dot_product_attention` expects `True` at positions that are allowed to attend. Switching to SDPA therefore required removing the logical inversion the hand-written path applied to the mask, but that removal was inadvertently omitted from the merged change, causing the audio encoder to produce incorrect embeddings.

This PR removes the leftover inversion and restores correct outputs from the MiDashengLM encoder.
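For reference, here is a minimal, runnable sketch of the two mask conventions (illustrative shapes and tensors only; this is not the MiDashengLM encoder code). The hand-written convention fills scores with `-inf` where the inverted mask is `True`, while SDPA takes the mask directly, with `True` marking positions that may attend:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = k = v = torch.randn(1, 4, 10, 64)            # (batch, heads, frames, head_dim)
valid = torch.tensor([True] * 7 + [False] * 3)   # True = real frame, False = padding
mask = valid.view(1, 1, 1, -1)                   # broadcast over heads and queries

# Hand-written convention: invert the mask, then fill masked scores with -inf.
scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
scores = scores.masked_fill(mask.logical_not(), float("-inf"))
manual = torch.softmax(scores, dim=-1) @ v

# SDPA convention: a boolean attn_mask is True where attention IS allowed,
# so the same mask must be passed WITHOUT the inversion.
sdpa = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

print(torch.allclose(manual, sdpa, atol=1e-6))   # True: the conventions agree
```

Keeping the inversion while switching to SDPA flips which positions are masked, which is exactly the bug this PR fixes.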

Test Plan

Test Result



@gemini-code-assist (Contributor, bot) left a comment


Code Review

This pull request provides a crucial bugfix for the MiDashengLM audio encoder. The change correctly removes an erroneous `logical_not()` call on the attention mask. The inversion was a leftover from a previous implementation and is incompatible with `scaled_dot_product_attention`, which expects `True` values at unmasked positions. Removing it ensures the mask is interpreted correctly, fixing the incorrect audio embeddings. The change is precise and well explained; I find no further issues.
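To make the failure mode concrete, here is a hedged sketch of what the leftover inversion does under SDPA (hypothetical tensors and names, not the actual vLLM code):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = k = v = torch.randn(1, 4, 10, 64)                            # illustrative shapes only
mask = torch.tensor([True] * 7 + [False] * 3).view(1, 1, 1, -1)  # True = real frame

# Fixed behavior: padding keys (False) are excluded from attention.
fixed = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

# Buggy behavior: the leftover logical_not() lets ONLY the padding keys attend.
broken = F.scaled_dot_product_attention(q, k, v, attn_mask=mask.logical_not())

print(torch.allclose(fixed, broken))  # False: the inversion changes the embeddings
```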

@Isotr0py Isotr0py enabled auto-merge (squash) September 30, 2025 06:29
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 30, 2025
@Isotr0py Isotr0py merged commit 2e1b8bc into vllm-project:main Sep 30, 2025
52 checks passed
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
…ect `logical_not` (vllm-project#25925)

Signed-off-by: zhoukz <me@zhoukz.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
…ect `logical_not` (#25925)

Signed-off-by: zhoukz <me@zhoukz.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>