Skip to content

Conversation

@i3hz
Copy link
Contributor

@i3hz i3hz commented Oct 14, 2025

What does this PR do?

This PR fixes the performance regression due to bidirectional masks.

Fixes # (issue): 41566

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@vasqu @Cyrilvallez

Copy link
Contributor

@vasqu vasqu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like a rubber band patch but it's a good solution without breaking too much of the existing code. Checked locally for perf and we are like ~ -5% slower so imo a good balance. Thanks a lot!

Want to have a final opinion from @Cyrilvallez tho

@i3hz i3hz requested a review from vasqu October 14, 2025 16:54
Copy link
Contributor

@vasqu vasqu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, this mostly nits to just simplify things. Cyril will take a look tomorrow!

i3hz and others added 2 commits October 15, 2025 14:46
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
@Cyrilvallez
Copy link
Member

I think you performed a bad rebase or merge, so I just force pushed to restore the branch in correct state (I was pushing on it as well)

@Cyrilvallez
Copy link
Member

Thanks a lot for the PR! I will commit a bit more to make it slightly more robust then we can merge!

@i3hz
Copy link
Contributor Author

i3hz commented Oct 15, 2025

Thanks a lot idk why the qualify test keeps failing , even tho there is no blank space D: , but i'm also blind so i won't be surprised if i missed it

@Cyrilvallez
Copy link
Member

@vasqu I cleaned the PR a bit and made it more general, could you have a last look before we merge? Also, we'll need to revert the changes you made in executorch integration, as it won't be needed anymore!

@vasqu
Copy link
Contributor

vasqu commented Oct 15, 2025

LGTM, just a bit confused why we need another slicing 🤔 let me open a follow up PR for executorch, keeping things a bit cleaner

@Cyrilvallez Cyrilvallez merged commit 56a727d into huggingface:main Oct 15, 2025
21 checks passed
@Cyrilvallez
Copy link
Member

Merged, thanks a lot for the PR @i3hz!

@i3hz i3hz deleted the bidirectional_masks branch October 15, 2025 14:14
ngazagna-qc pushed a commit to ngazagna-qc/transformers that referenced this pull request Oct 23, 2025
…ingface#41586)

* fixed performance regression

* also fixed the older_torch function

* Update src/transformers/masking_utils.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* fix

* more general

* fix slicing

* fix data dependent

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants