Issues: Dao-AILab/flash-attention
Does flash_attn_with_kvcache return block_lse or attention_score?
#1301 · opened Oct 28, 2024 by NonvolatileMemory
FlashSelfAttention and SelfAttention in flash_attn.modules.mha give different results
#1300 · opened Oct 28, 2024 by senxiu-puleya
In the unit tests, how is the dropout_fraction diff tolerance selected?
#1286 · opened Oct 18, 2024 by muoshuosha
FlashAttention installation error: "CUDA 11.6 and above" requirement issue
#1282 · opened Oct 17, 2024 by 21X5122
Unable to import my new kernel function after successful compilation
#1278 · opened Oct 15, 2024 by jpli02
Why does the flash_attn_varlen_func method increase GPU memory usage?
#1277 · opened Oct 15, 2024 by shaonan1993
Is there a way to install flash-attention without a specific CUDA version?
#1276 · opened Oct 14, 2024 by HuangChiEn