
Future support for attention bias or masking #242

Open
subercui opened this issue May 26, 2023 · 5 comments

@subercui

Hi, I noticed that the plan for customized attention bias was recently removed. Is this still planned for any time soon? I feel Flash-attention is such a great project, and having this feature would make it perfect 😄

40a25c8#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L80

@tridao (Contributor) commented May 26, 2023

As mentioned in the README, we have an experimental implementation in Triton that supports attention bias (e.g. ALiBi):
https://github.com/HazyResearch/flash-attention/blob/main/flash_attn/flash_attn_triton.py
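
For reference, a minimal usage sketch, assuming the `flash_attn_func(q, k, v, bias=None, causal=False, softmax_scale=None)` interface and the (batch, seqlen, nheads, headdim) layout described in that file; check the file itself for the exact signature and supported head dimensions:

```python
import torch
from flash_attn.flash_attn_triton import flash_attn_func

# Shapes and dtypes below are assumptions based on the docstring in flash_attn_triton.py:
# q, k, v: (batch, seqlen, nheads, headdim) in fp16/bf16 on CUDA,
# bias: broadcastable to (batch, nheads, seqlen_q, seqlen_k).
batch, seqlen, nheads, headdim = 2, 1024, 8, 64
q, k, v = (torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
           for _ in range(3))

# An ALiBi-style additive bias: one slope per head, penalty grows with distance.
slopes = 2.0 ** (-8.0 * torch.arange(1, nheads + 1, device="cuda") / nheads)
pos = torch.arange(seqlen, device="cuda")
rel = (pos[None, :] - pos[:, None]).abs()                # (seqlen, seqlen)
bias = (-slopes[:, None, None] * rel).to(torch.float16)  # (nheads, seqlen, seqlen)
bias = bias.unsqueeze(0)                                  # broadcast over batch

out = flash_attn_func(q, k, v, bias=bias)  # (batch, seqlen, nheads, headdim)
```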

@subercui (Author)

Thank you! I'll give it a try.

@robflynnyh

Are there plans to implement gradients for the bias, i.e. for learnt attention biases? Or how difficult do you think this would be to implement?

@tridao (Contributor) commented Jun 2, 2023

Are there plans to implement gradients for the bias, i.e. for learnt attention biases? Or how difficult do you think this would be to implement?

I'm not planning to work on it since I don't use attention bias in my work. Implementing it in Triton is probably not hard; all the necessary ingredients are already there in the Triton backward pass implementation. I suspect one just has to add some code to save the gradient.
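
For anyone wanting a reference point before touching the kernel: in a plain PyTorch version of attention with an additive bias, the bias gradient is simply the gradient of the pre-softmax scores, reduced over whatever dimensions the bias was broadcast along, so autograd on a naive implementation gives the values a kernel would need to write out. A minimal sketch (names and shapes are illustrative, not the Triton kernel):

```python
import torch

def attention_with_bias_reference(q, k, v, bias, softmax_scale=None):
    # q, k, v: (batch, nheads, seqlen, headdim); bias broadcastable to
    # (batch, nheads, seqlen_q, seqlen_k). Plain PyTorch reference, not the kernel.
    softmax_scale = softmax_scale or q.shape[-1] ** -0.5
    scores = torch.einsum("bhqd,bhkd->bhqk", q, k) * softmax_scale + bias
    attn = scores.softmax(dim=-1)
    return torch.einsum("bhqk,bhkd->bhqd", attn, v)

# d(bias) equals d(scores), so autograd already produces the bias gradient here;
# in the Triton kernel one would store dS (already computed for dQ/dK) into a
# dbias buffer, summing over any dimensions the bias was broadcast along.
q, k, v = (torch.randn(2, 4, 128, 64, requires_grad=True) for _ in range(3))
bias = torch.zeros(1, 4, 128, 128, requires_grad=True)  # e.g. a learnt per-head bias
out = attention_with_bias_reference(q, k, v, bias)
out.sum().backward()
print(bias.grad.shape)  # torch.Size([1, 4, 128, 128])
```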

@Zhang-Shubo

Hi, I find the Triton version of flash-attn is still very experimental and only supports a head dimension of 64. Has this changed?
