Add fast path for bidirectional mask creation to fix regression #41586
vasqu
left a comment
It feels like a rubber band patch, but it's a good solution that doesn't break too much of the existing code. Checked perf locally and we are roughly 5% slower, so imo a good balance. Thanks a lot!
Want to have a final opinion from @Cyrilvallez tho
vasqu
left a comment
LGTM, these are mostly nits to simplify things. Cyril will take a look tomorrow!
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Force-pushed from 773a90e to 6f81af8
I think you performed a bad rebase or merge, so I just force-pushed to restore the branch to the correct state (I was pushing to it as well).
Thanks a lot for the PR! I will commit a bit more to make it slightly more robust, then we can merge!
Thanks a lot! Idk why the quality check keeps failing, even tho there is no blank space D:, but I won't be surprised if I missed it.
@vasqu I cleaned the PR a bit and made it more general, could you have a last look before we merge? Also, we'll need to revert the changes you made in
LGTM, just a bit confused why we need another slicing 🤔 Let me open a follow-up PR for executorch, keeping things a bit cleaner.
Merged, thanks a lot for the PR @i3hz!
…ingface#41586)
* fixed performance regression
* also fixed the older_torch function
* Update src/transformers/masking_utils.py
  Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
* fix
* more general
* fix slicing
* fix data dependent
---------
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
What does this PR do?
This PR fixes the performance regression due to bidirectional masks.
Fixes #41566
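For readers unfamiliar with the idea, here is a minimal sketch of what a "fast path" for bidirectional mask creation can look like. It is an illustration only, not the code in src/transformers/masking_utils.py: the function name `bidirectional_mask_sketch` is hypothetical, plain Python lists stand in for tensors to keep the example dependency-free, and the assumption is that a bidirectional (non-causal) mask only ever needs to hide padding tokens.

```python
from typing import List, Optional

# Hypothetical stand-in types: a (batch, kv_len) padding mask where True marks
# a real token, and the expanded (batch, q_len, kv_len) attention mask.
PaddingMask = List[List[bool]]
ExpandedMask = List[List[List[bool]]]


def bidirectional_mask_sketch(
    padding_mask: Optional[PaddingMask], q_len: int
) -> Optional[ExpandedMask]:
    """Illustrative only; not the transformers implementation."""
    # Fast path: with no padding, every query may attend to every key,
    # so no mask has to be built at all.
    if padding_mask is None or all(all(row) for row in padding_mask):
        return None
    # General path: a bidirectional mask does not depend on the query position,
    # so each padding row is simply repeated for every query index.
    return [[list(row) for _ in range(q_len)] for row in padding_mask]


# Usage: the unpadded batch skips mask creation, the padded one is expanded.
print(bidirectional_mask_sketch([[True, True]], q_len=2))
print(bidirectional_mask_sketch([[True, False]], q_len=2))
```

The design point is that the expensive step (materializing a full mask) is skipped whenever the cheap check shows it is unnecessary. In a tensor setting, a check like this depends on the data itself, which may need separate handling when the model is compiled or exported.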
Who can review?
@vasqu @Cyrilvallez