fix: avoid GPU-CPU sync in MTP by AlpinDale · Pull Request #1558 · dphnAI/sonar

AlpinDale · 2025-11-03T22:57:59Z

No description provided.

gemini-code-assist

Code Review

This pull request aims to fix a performance issue by avoiding a GPU-CPU synchronization in the MTP forward pass. The change replaces an indexed assignment with torch.where. While this is a valid approach, I've suggested an alternative using masked_fill_ which is more memory-efficient as it performs an in-place modification, similar to the original code's intent.
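For readers without the diff at hand, the three approaches named in the review can be compared in a minimal sketch. The actual change is not shown on this page, so the tensor names (`inputs_embeds`, `positions`), the shapes, and the helper function names below are illustrative assumptions, not the code in `deepseek_mtp.py`. The usual motivation is that boolean-mask indexed assignment may need to materialize the selected indices, which can force a device-to-host sync on GPU, whereas `torch.where` and `masked_fill_` are plain elementwise ops.

```python
import torch


def zero_indexed(inputs_embeds: torch.Tensor, positions: torch.Tensor) -> torch.Tensor:
    # Hypothetical "before" pattern: boolean-mask indexed assignment.
    # Works on a clone here so the caller's tensor is left untouched.
    out = inputs_embeds.clone()
    out[positions == 0] = 0
    return out


def zero_where(inputs_embeds: torch.Tensor, positions: torch.Tensor) -> torch.Tensor:
    # Out-of-place alternative: allocates a new tensor of the same shape.
    mask = (positions == 0).unsqueeze(-1)  # (N,) -> (N, 1), broadcasts over hidden dim
    zero = inputs_embeds.new_zeros(())     # 0-dim zero with matching dtype and device
    return torch.where(mask, zero, inputs_embeds)


def zero_masked_fill_(inputs_embeds: torch.Tensor, positions: torch.Tensor) -> torch.Tensor:
    # In-place alternative suggested in the review: modifies inputs_embeds directly.
    mask = (positions == 0).unsqueeze(-1)
    return inputs_embeds.masked_fill_(mask, 0.0)


# Toy data (assumed shapes): 5 tokens, hidden size 4.
positions = torch.tensor([0, 1, 2, 0, 1])
inputs_embeds = torch.ones(5, 4)

a = zero_indexed(inputs_embeds, positions)
b = zero_where(inputs_embeds, positions)
c = zero_masked_fill_(inputs_embeds.clone(), positions)
```

All three produce the same values; they differ in whether a new tensor is allocated and in how the mask is consumed. Whether the in-place form is safe in a given model depends on whether `inputs_embeds` is reused elsewhere, which this sketch does not address.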

fix: avoid GPU-CPU sync in MTP

8b9b770

gemini-code-assist Bot reviewed Nov 3, 2025

Comment thread aphrodite/modeling/models/deepseek_mtp.py

AlpinDale merged commit b72fba8 into main Nov 3, 2025
0 of 4 checks passed

AlpinDale deleted the mtp-d2h branch November 3, 2025 23:08
