First of all, nice work! And thank you for sharing the code.
I noticed that the code uses ALiBi in encoder-decoder attention but not in the transformer's self-attention. Have you tried ALiBi in the transformer's self-attention? Is there a reason you didn't use it for the self-attention layer?
Thanks!
We've currently implemented ALiBi for self-attention (causally masked) only. We have not implemented it for encoder-decoder attention or non-masked encoder attention yet.
What is your question?
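For context, a minimal sketch of what the causally masked ALiBi bias looks like, following the construction described in the ALiBi paper (this is illustrative NumPy, not the repo's actual implementation; the function names are mine, and the slope formula assumes the number of heads is a power of 2):

```python
import numpy as np

def alibi_slopes(n_heads):
    # Geometric sequence of per-head slopes from the ALiBi paper,
    # assuming n_heads is a power of 2: slope_h = 2^(-8h / n_heads).
    start = 2 ** (-8 / n_heads)
    return np.array([start ** (h + 1) for h in range(n_heads)])

def causal_alibi_bias(n_heads, seq_len):
    # bias[h, i, j] = slope_h * (j - i) for j <= i (zero on the diagonal,
    # increasingly negative for more distant past positions), and -inf
    # above the diagonal so future positions are masked out.
    slopes = alibi_slopes(n_heads)
    pos = np.arange(seq_len)
    rel = pos[None, :] - pos[:, None]            # rel[i, j] = j - i
    bias = slopes[:, None, None] * rel[None, :, :]
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    return np.where(future, -np.inf, bias)       # shape (n_heads, L, L)
```

The resulting tensor would be added to the attention scores before the softmax; because the causal mask and the linear penalty live in the same additive bias, no separate masking step is needed.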