Future support for attention bias or masking #242
As mentioned in the README, we have an experimental implementation in Triton that supports attention bias.
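For reference, here is a minimal sketch of how that experimental Triton interface can be invoked with a bias. It assumes the `flash_attn_func` entry point in `flash_attn/flash_attn_triton.py`; the exact signature and the set of supported bias shapes may differ, so check the source:

```python
# Sketch only: assumes flash_attn_func from flash_attn/flash_attn_triton.py
# accepts an additive attention bias; verify signature against the repo.
import torch
from flash_attn.flash_attn_triton import flash_attn_func

batch, seqlen, nheads, headdim = 2, 128, 8, 64  # Triton path expects headdim 64
q = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Additive bias, broadcast over heads: (batch, 1, seqlen_q, seqlen_k).
bias = torch.randn(batch, 1, seqlen, seqlen, device="cuda", dtype=torch.float16)

# The bias is added to scale * (q @ k^T) before the softmax.
out = flash_attn_func(q, k, v, bias)
```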
Thank you! I'll give it a try.
Are there plans to implement gradients for the bias, i.e. for learnt attention biases? Or how difficult do you think this would be to implement?
I'm not planning to work on it, as I don't use attention bias in my work. Implementing it in Triton is probably not hard; all the necessary ingredients are there in the Triton backward pass implementation. I suspect one just has to add some code to save the gradient.
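For anyone attempting this, a plain-PyTorch reference (not the Triton kernel) shows why the ingredients are already there: with S = scale · QKᵀ + bias, the gradient with respect to the bias is exactly dS, the pre-softmax score gradient that the backward pass already materializes. The names and shapes below are illustrative, not from the repo:

```python
import torch

def attention_with_bias(q, k, v, bias, scale):
    # q, k, v: (batch, seqlen, nheads, headdim); bias: (batch, nheads, sq, sk)
    s = torch.einsum("bthd,bshd->bhts", q, k) * scale + bias  # pre-softmax scores S
    p = torch.softmax(s, dim=-1)
    return torch.einsum("bhts,bshd->bthd", p, v)

batch, seqlen, nheads, headdim = 2, 16, 4, 64
q, k, v = (torch.randn(batch, seqlen, nheads, headdim, requires_grad=True)
           for _ in range(3))
bias = torch.randn(batch, nheads, seqlen, seqlen, requires_grad=True)

out = attention_with_bias(q, k, v, bias, headdim ** -0.5)
out.sum().backward()

# bias.grad equals dS (reduced over any broadcast dims). A Triton backward
# pass that already computes dS per block only needs to write it out,
# accumulating if the bias is broadcast over batch or heads.
print(bias.grad.shape)  # torch.Size([2, 4, 16, 16])
```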
Hi, I find the Triton version of FlashAttention is very experimental and only supports a head dimension of 64. Has this changed?
Hi, I noticed this plan for customized attention bias was recently removed. Do you still intend to work on it at some point? I feel FlashAttention is such a great project, and having this feature would make it perfect 😄
40a25c8#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L80