PyTorch 2.0 will automatically select the most appropriate attention implementation based on your system specs.
All implementations are enabled by default. Scaled dot product attention attempts to automatically select the optimal implementation based on the inputs.
scaled_dot_product_attention is used in the repo. If you have Flash set to true but do not have an A100, it should fall back to memory-efficient attention, the math implementation, or CPU.
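A minimal sketch of how that selection and fallback can be exercised, assuming PyTorch 2.0's torch.backends.cuda.sdp_kernel context manager (the tensor shapes here are purely illustrative):

```python
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Illustrative (batch, heads, seq_len, head_dim) tensors.
q = torch.randn(2, 8, 128, 64, device=device, dtype=dtype)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Default: all backends (flash, memory-efficient, math) are enabled and
# PyTorch picks one based on the inputs and the hardware.
out = F.scaled_dot_product_attention(q, k, v)

if device == "cuda":
    # Restrict the call to the flash backend only; on a GPU that does not
    # support it, this should raise a RuntimeError rather than silently
    # falling back, which is one way to check what your hardware supports.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=True, enable_math=False, enable_mem_efficient=False
    ):
        try:
            out = F.scaled_dot_product_attention(q, k, v)
            print("flash attention was used")
        except RuntimeError:
            print("flash attention is not supported on this GPU")
```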
Or if I want to use memory-efficient attention, must I call scaled_dot_product_attention?
PyTorch 2.0 includes an optimized and memory-efficient attention implementation through the torch.nn.functional.scaled_dot_product_attention function.
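To the question above: memory-efficient attention is used through that same function; there is no separate call. A hedged sketch of forcing the memory-efficient backend, again assuming the torch.backends.cuda.sdp_kernel context manager and a CUDA device:

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Disable the flash and math backends so only memory-efficient attention
# can be chosen for this call.
with torch.backends.cuda.sdp_kernel(
    enable_flash=False, enable_math=False, enable_mem_efficient=True
):
    out = F.scaled_dot_product_attention(q, k, v)
```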