
Flash attention #28

Closed
Taytay opened this issue Jan 22, 2024 · 2 comments

Comments

Taytay commented Jan 22, 2024

Firstly, thank you so much for this repo! I'm a huge fan of T5, and these results are extremely impressive.

I saw that you experimented with different positional embeddings like ALiBi in order to facilitate FlashAttention (FA) down the line. Was that because FA doesn't support an additive attention bias? If so, there is a PR to add it that is making progress:

Dao-AILab/flash-attention#617

It would be fun to see this repo get even faster.

PiotrNawrot (Owner) commented
@Taytay Thanks for the nice comments, I'm glad you like the repo! Please accept my apologies for the late reply. I've been very busy lately with the ICML submission.

Yes, exactly. FA didn't support backpropagation through the extra additive bias (applied after the dot products, before the softmax). I've just noticed this PR and it looks great - I'm sure that backprop through this bias would help well beyond the T5 case! Can't wait to have it merged into FA. I'll definitely test it soon after that : ).
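
For anyone reading along, here is a minimal sketch (not code from this repo, names are illustrative) of what that bias term looks like in T5-style attention and why a fused kernel has to know about it: the bias is added to the raw attention scores before the softmax, so any kernel that fuses softmax(QK^T)V without materializing the score matrix must accept the bias as an input and differentiate through it.

```python
import torch
import torch.nn.functional as F

def t5_style_attention(q, k, v, rel_pos_bias):
    # q, k, v: (batch, heads, seq_len, head_dim)
    # rel_pos_bias: (1, heads, seq_len, seq_len), learned relative-position bias
    # Note: T5 does not scale the scores by 1/sqrt(head_dim).
    scores = torch.matmul(q, k.transpose(-1, -2))   # raw dot products
    scores = scores + rel_pos_bias                  # extra additive bias, before softmax
    probs = F.softmax(scores, dim=-1)
    return torch.matmul(probs, v)
```

A fused FlashAttention kernel never materializes `scores`, which is why the bias (and its gradient) has to be handled inside the kernel itself - that is what the linked PR adds.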

Closing for now

harish-kamath commented
Someone has started a repo based on this one with FA2 support (@catie-aq):

https://github.com/catie-aq/flashT5
