fix: BART attention fusion for key with bias 🐛 #25046
Description
With #24857, attention fusion for Whisper (and BART) was revamped. 💯 This PR extends that work and adds support for attention fusion in BART encoders where the key projection has a bias term.
Minimal reproducible example:
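The original repro snippet is not reproduced here; the following is a rough sketch of what such a repro could look like. The config values, file names, and opset are placeholders, and it assumes `optimize_model` accepts `model_type="bart"`:

```python
# Hypothetical repro sketch: export a small BART encoder to ONNX and run the
# ORT transformers optimizer with BART fusion rules. All sizes are arbitrary.
import torch
from transformers import BartConfig, BartModel

from onnxruntime.transformers import optimizer

config = BartConfig(
    d_model=64,
    encoder_layers=1,
    decoder_layers=1,
    encoder_attention_heads=4,
    decoder_attention_heads=4,
    encoder_ffn_dim=128,
    decoder_ffn_dim=128,
    vocab_size=1000,
    use_cache=False,
    return_dict=False,
)
encoder = BartModel(config).get_encoder().eval()

input_ids = torch.randint(0, config.vocab_size, (1, 8))
torch.onnx.export(
    encoder,
    (input_ids,),
    "bart_encoder.onnx",
    input_names=["input_ids"],
    output_names=["last_hidden_state"],
    dynamic_axes={"input_ids": {0: "batch", 1: "seq"}},
    opset_version=14,
)

# Run the graph optimizer; before this PR, encoder self-attention with a
# biased key projection was not fused into an Attention node.
optimized = optimizer.optimize_model(
    "bart_encoder.onnx",
    model_type="bart",
    num_heads=config.encoder_attention_heads,
    hidden_size=config.d_model,
)
optimized.save_model_to_file("bart_encoder_opt.onnx")
print(optimized.get_fused_operator_statistics())
```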
Output after PR:
Motivation and Context
Extends #24857. Closes #23864.
@kunal-vaishnavi @justinchuby Could you please review? I'd also like to add a test case. Could you provide some guidance on where it should go? Should the modelling code be added to `onnxruntime/test/python/transformers/test_bart.py`? Any feedback is greatly appreciated.