Issues: Dao-AILab/flash-attention
Does flash_attn_with_kvcache return block_lse or attention_score?
#1301 · opened Oct 28, 2024 by NonvolatileMemory
FlashSelfAttention and SelfAttention in flash_attn.modules.mha give different results
#1300 · opened Oct 28, 2024 by senxiu-puleya
In the unit tests, how is the dropout_fraction diff tolerance selected?
#1286 · opened Oct 18, 2024 by muoshuosha
FlashAttention installation error: "CUDA 11.6 and above" requirement issue
#1282 · opened Oct 17, 2024 by 21X5122
Unable to import my new kernel function after successful compilation
#1278 · opened Oct 15, 2024 by jpli02
Why does the flash_attn_varlen_func method increase GPU memory usage?
#1277 · opened Oct 15, 2024 by shaonan1993
Is there a way to install flash-attention without a specific CUDA version?
#1276 · opened Oct 14, 2024 by HuangChiEn