
add the attention mask for flash-attention. #76

Closed
wants to merge 5 commits

Conversation

MayDomine

No description provided.

@MayDomine mentioned this pull request on Nov 16, 2022
@ZhongYingMatrix

Could we use a (-inf, 0) bias to mock the mask feature?

@MayDomine
Author

"Could we use a (-inf, 0) bias to mock the mask feature?"

Yes, that would be another way to implement the mask. But if the input mask has a whole row of False, handling the NaN values in softmax(qk) becomes quite tricky. I wrote this because I have a use case that a bias cannot satisfy.
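For illustration, here is a minimal plain-PyTorch sketch of the two points above (not the fused flash-attention kernel; shapes and names are made up): turning a boolean mask into a (0, -inf) additive bias, and the NaN that shows up in softmax(qk) when a whole row of the mask is False.

```python
import torch

# Minimal sketch (plain PyTorch, illustrative shapes only).
q = torch.randn(1, 4, 8)                      # (batch, seq_q, head_dim)
k = torch.randn(1, 4, 8)                      # (batch, seq_k, head_dim)

mask = torch.ones(1, 4, 4, dtype=torch.bool)  # True = "may attend"
mask[0, 2, :] = False                         # query 2 attends to nothing

bias = torch.zeros(1, 4, 4)
bias.masked_fill_(~mask, float("-inf"))       # the (-inf, 0) bias trick

scores = q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5 + bias
attn = torch.softmax(scores, dim=-1)

print(attn[0, 2])                             # all NaN: softmax over only -inf
```

Whatever convention is chosen for such a fully-masked row, it has to be handled explicitly; the bias alone just yields NaN.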

@tridao force-pushed the main branch 4 times, most recently from fa580a4 to 4a6eaa9 on Nov 29, 2022
@Zyriix

Zyriix commented Mar 16, 2023

Does this work? Why was it closed?

@nofreewill42

"But if the input mask has a whole row of false, it will be quite tricky to handle the nan value in the softmax(qk)."

What about handling it with an assertion?
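As a sketch of that suggestion (illustrative only, not code from this PR; the helper name is hypothetical), the mask could be validated on the host side before launching the kernel, so a fully-masked row fails loudly instead of silently producing NaN:

```python
import torch

def check_attn_mask(mask: torch.Tensor) -> None:
    # Hypothetical guard, not part of this PR: mask is (batch, seq_q, seq_k)
    # boolean, True meaning "may attend". Reject masks with a row that is
    # entirely False, since softmax over such a row is all -inf -> NaN.
    assert mask.any(dim=-1).all(), \
        "every query position must be allowed to attend to at least one key"
```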
