[feat] add flash attention #1762
Conversation
@triton.jit
def _fwd_kernel(
    Q, K, V, sm_scale,
    TMP, L, M,  # NOTE: TMP is a scratchpad buffer to workaround a compiler bug
Can you add type hints for each parameter?
For example:
TMP: torch.Tensor, L: int
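For illustration, here is a minimal, hedged sketch of what an annotated signature could look like. The parameter roles are assumed from the upstream Triton fused-attention tutorial that this excerpt mirrors, and BLOCK_M / BLOCK_N are hypothetical tile-size constants not taken from this diff:

```python
import torch
import triton
import triton.language as tl


@triton.jit
def _fwd_kernel(
    Q: torch.Tensor,        # queries passed from the host-side launcher
    K: torch.Tensor,        # keys
    V: torch.Tensor,        # values
    sm_scale: float,        # softmax scale, typically 1 / sqrt(head_dim)
    TMP: torch.Tensor,      # scratchpad buffer (compiler-bug workaround)
    L: torch.Tensor,        # per-row softmax normalizer accumulator
    M: torch.Tensor,        # per-row running max accumulator
    BLOCK_M: tl.constexpr,  # query tile size (hypothetical, not from the PR)
    BLOCK_N: tl.constexpr,  # key/value tile size (hypothetical, not from the PR)
):
    # Body intentionally omitted; this sketch only illustrates the annotation
    # style requested above. Inside a triton.jit kernel, Q/K/V/TMP/L/M arrive
    # as device pointers, so the torch.Tensor hints mainly document what the
    # host-side launcher is expected to pass in.
    pass
```

The tl.constexpr annotations are the only ones Triton needs for compilation (they mark compile-time constants); the remaining hints are there for readers of the kernel signature.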
Not sure about some of the params. Will update this with the AlphaFold version of flash attention.
OK
Add CUDA and Triton flash attention.