gpt2 with output_attentions=True has different attentions shape between flash and eager #33417
System Info
transformers version: 4.44.2

Who can help?
@ArthurZucker @gante
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
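The original reproduction script was not captured here; below is a minimal sketch of the comparison described in the title, assuming a CUDA machine with flash-attn installed (the prompt and dtype are illustrative):

```python
# Minimal reproduction sketch (illustrative, not the reporter's original script).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Hello world", return_tensors="pt").to("cuda")

for impl in ("eager", "flash_attention_2"):
    model = AutoModelForCausalLM.from_pretrained(
        "gpt2",
        attn_implementation=impl,
        torch_dtype=torch.float16,  # flash attention requires fp16/bf16
    ).to("cuda")
    out = model(**inputs, output_attentions=True)
    # Eager yields per-layer tensors of shape (batch, num_heads, seq_len, seq_len);
    # the flash path reportedly returns tensors with a different shape.
    print(impl, [None if a is None else tuple(a.shape) for a in out.attentions])
```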
Expected behavior
output_attentions=True should result in an error for GPT2FlashAttention2.

Additionally, I'd like to understand what's being returned here.
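Flash attention computes attention blockwise and never materializes the full (seq_len, seq_len) attention matrix, so there are no true attention weights to return from that path. A guard along these lines at the top of GPT2FlashAttention2.forward would match the expectation above (a hypothetical sketch, not the library's actual code):

```python
# Hypothetical guard (illustrative; not actual transformers code).
def _reject_output_attentions(output_attentions: bool) -> None:
    if output_attentions:
        raise ValueError(
            "output_attentions=True is not supported by GPT2FlashAttention2: "
            "flash kernels never materialize the attention matrix. "
            "Load the model with attn_implementation='eager' to inspect attentions."
        )
```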