
Memory-efficient attention does not gain speedups on A10 and V100 #762

Open
lucasjinreal opened this issue Jun 7, 2023 · 5 comments

Comments

@lucasjinreal

I'm using diffusers with enable_xformers_memory_efficient_attention enabled, but the speed didn't change at all. Why?

@yjhong89

yjhong89 commented Jun 8, 2023

Using xformers on V100 doesn't gain any speedup for me either.
Time per iteration actually increases when using xformers on V100.

  • Above: using xformers; below: vanilla cross attention, when training Stable Diffusion
    [screenshot: per-iteration timing comparison]

@danthe3rd
Contributor

Hi,
What version of xFormers are you using? Is xFormers using less GPU memory?
It might be because "vanilla" diffusers now uses the xFormers kernels that were integrated into PyTorch, but I'm not sure about this. It might be better to open an issue in diffusers (please tag me if you do).
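A quick way to check whether PyTorch's built-in fused attention is available is to call `scaled_dot_product_attention` directly (added in PyTorch 2.0, which is what diffusers dispatches to when xFormers is not enabled). A minimal sketch, runnable even on CPU:

```python
import torch
import torch.nn.functional as F

# Since PyTorch 2.0, F.scaled_dot_product_attention dispatches to fused
# backends (flash / memory-efficient / math) depending on device and dtype.
q = torch.randn(2, 8, 128, 64)  # (batch, heads, seq_len, head_dim)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```

If diffusers is already routing attention through this function, enabling xFormers on top of it may make little difference.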

@yjhong89

yjhong89 commented Jun 8, 2023

  • My xFormers version is 0.0.20.
  • I checked that xFormers uses less GPU memory than the vanilla version.
  • But the elapsed time per iteration increased slightly compared to vanilla attention.
  • Do I have to use fp16? (My current setting uses fp32.)

@danthe3rd
Contributor

  • Do I have to use fp16? (My current setting uses fp32.)

Oh yes - good catch! We have kernels for f32 but they are not really efficient. You should use f16 or bf16 if possible to get the best speed. In fact, it's very likely that xFormers induces a slow-down when training in f32.
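To illustrate the dtype point: fp16 halves the memory per element compared to fp32, and the fast xFormers/flash attention kernels are written for f16/bf16. A minimal sketch (the tensor shapes are arbitrary examples):

```python
import torch

# fp16 uses 2 bytes per element vs 4 for fp32; the fused attention kernels
# are optimized for f16/bf16, which is why f32 sees little or no speedup.
x32 = torch.randn(1024, 1024, dtype=torch.float32)
x16 = x32.half()

print(x32.element_size())  # 4 bytes per element
print(x16.element_size())  # 2 bytes per element
```

For training, mixed precision via `torch.autocast("cuda", dtype=torch.float16)` (or `torch.bfloat16`) is the usual way to get f16 attention kernels without converting the whole model.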

@yjhong89

yjhong89 commented Jun 8, 2023

Okay, thank you! I'll try using fp16.
