Switch to using max for attention slicing in all cases for the time being #2569
Merged
At some point in the future, we can switch to a dynamic approach that chooses either no slicing or [...]. All of this may be obsoleted by the inclusion of sub-quadratic attention in diffusers or our own implementation.
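As a rough illustration of what such a dynamic approach might look like, here is a minimal sketch that estimates the size of the attention score tensor for a requested image size and falls back to slicing only when that estimate exceeds the free memory. Everything in it is an assumption made for illustration and not code from this PR: the function names are hypothetical, the choice between no slicing and max is assumed, and the defaults (8 attention heads, a batch of 2, 2-byte half-precision elements, an 8x latent downscale) are guesses about a typical Stable Diffusion setup. The estimate ignores model weights and all other activations.

```python
from typing import Optional


def estimate_attention_bytes(
    width: int,
    height: int,
    heads: int = 8,              # assumed number of attention heads
    batch: int = 2,              # assumed batch (conditioned + unconditioned)
    bytes_per_element: int = 2,  # assumed half precision
    latent_downscale: int = 8,   # assumed pixel-to-latent downscale factor
) -> int:
    """Rough size of the full (unsliced) self-attention score tensor."""
    tokens = (width // latent_downscale) * (height // latent_downscale)
    # The score matrix is tokens x tokens per head and per batch element,
    # which is why memory grows quadratically with the pixel count.
    return tokens * tokens * heads * batch * bytes_per_element


def choose_attention_slicing(width: int, height: int, free_bytes: int) -> Optional[str]:
    """Hypothetical chooser: None means no slicing, "max" means maximum slicing."""
    if estimate_attention_bytes(width, height) <= free_bytes:
        return None
    return "max"


if __name__ == "__main__":
    free = 8 * 1024**3  # pretend 8 GiB are free after loading the model
    for size in (512, 768, 1024, 1280):
        print(size, choose_attention_slicing(size, size, free))
```

A real implementation would have to query the actual free device memory and account for the rest of the model's working set, so the threshold here should be treated as a placeholder.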
The justification for this change is that it enables very large image generation while imposing only a slight performance penalty over no attention slicing. The current choice of auto has the same performance characteristics as max, and doesn't give the benefit of large image generation.

Stats on a 12GB CUDA card:
no slicing:
1280x1280: OOM
1024x1024: OOM
768x768: >> 1 image(s) generated in 23.67s
512x512: >> 1 image(s) generated in 8.41s

auto:
1280x1280: OOM
1024x1024: >> 1 image(s) generated in 70.30s
768x768: >> 1 image(s) generated in 23.64s
512x512: >> 1 image(s) generated in 8.55s

max:
1280x1280: >> 1 image(s) generated in 147.21s
1024x1024: >> 1 image(s) generated in 65.62s
768x768: >> 1 image(s) generated in 23.19s
512x512: >> 1 image(s) generated in 9.06s
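
To make the trade-off in these numbers easier to read, the following sketch computes the relative overhead of each mode against no slicing and the largest size each mode completed. The timings are copied from the stats above, with OOM recorded as None. The helper names are hypothetical and exist only for this illustration.

```python
from typing import Optional

# Seconds per image from the stats above, keyed by square image size.
# None marks a run that ended in OOM.
TIMINGS = {
    "no slicing": {1280: None, 1024: None, 768: 23.67, 512: 8.41},
    "auto": {1280: None, 1024: 70.30, 768: 23.64, 512: 8.55},
    "max": {1280: 147.21, 1024: 65.62, 768: 23.19, 512: 9.06},
}


def overhead_percent(mode: str, size: int, baseline: str = "no slicing") -> Optional[float]:
    """Percent slowdown of `mode` relative to `baseline`, or None if either run hit OOM."""
    t = TIMINGS[mode][size]
    b = TIMINGS[baseline][size]
    if t is None or b is None:
        return None
    return (t - b) / b * 100.0


def largest_size(mode: str) -> int:
    """Largest image size that completed without OOM for the given mode."""
    return max(size for size, t in TIMINGS[mode].items() if t is not None)


if __name__ == "__main__":
    for mode in TIMINGS:
        print(f"{mode}: largest size {largest_size(mode)}x{largest_size(mode)}")
        for size in (512, 768):
            pct = overhead_percent(mode, size)
            print(f"  {size}x{size}: {pct:+.1f}% vs no slicing")
```

On these figures, max is roughly 8% slower than no slicing at 512x512 and essentially even at 768x768, while being the only mode that completes 1280x1280.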