PyTorch 2.0 will automatically select the most appropriate attention implementation based on your system specs.
All implementations are enabled by default. Scaled dot product attention attempts to automatically select the optimal implementation based on the inputs.
scaled_dot_product_attention is used in the repo. If you have Flash set to true but do not have an A100, it should fall back to memory-efficient attention, the math implementation, or CPU.
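A minimal sketch of how that selection and fallback can be exercised, assuming PyTorch 2.0's torch.backends.cuda.sdp_kernel context manager (the tensor shapes here are purely illustrative):

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Illustrative (batch, heads, seq_len, head_dim) tensors.
q = torch.randn(2, 8, 128, 64, device=device, dtype=dtype)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Default: all backends (flash, memory-efficient, math) are enabled and
# PyTorch picks one based on the inputs and the hardware.
out = F.scaled_dot_product_attention(q, k, v)

if device == "cuda":
    # Restrict the call to the flash backend only; on a GPU that does not
    # support it, this should raise a RuntimeError rather than silently
    # falling back, which is one way to check what your hardware supports.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=True, enable_math=False, enable_mem_efficient=False
    ):
        try:
            out = F.scaled_dot_product_attention(q, k, v)
            print("flash attention was used")
        except RuntimeError:
            print("flash attention is not supported on this GPU")
```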
Or if I want to use memory-efficient attention, must I call scaled_dot_product_attention?
PyTorch 2.0 includes an optimized and memory-efficient attention implementation through the torch.nn.functional.scaled_dot_product_attention function.
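To the question above: memory-efficient attention is used through that same function; there is no separate call. A hedged sketch of forcing the memory-efficient backend, again assuming the torch.backends.cuda.sdp_kernel context manager and a CUDA device:

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Disable the flash and math backends so only memory-efficient attention
# can be chosen for this call.
with torch.backends.cuda.sdp_kernel(
    enable_flash=False, enable_math=False, enable_mem_efficient=True
):
    out = F.scaled_dot_product_attention(q, k, v)
```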