Hey folks, first of all thanks for all the effort on building this amazing open source community. Here's my two cents: I may be mistaken, but in the spatial transformers we seem to be using attention without positional encodings. Is that correct? The attention has no mechanism to know the original order of the pixels, so could that be impacting performance?
Relevant code:

- SpatialSelfAttention: stable-diffusion/ldm/modules/attention.py, lines 99 to 149 in ce05de2
- SpatialTransformer: stable-diffusion/ldm/modules/attention.py, lines 218 to 261 in ce05de2
- BasicTransformerBlock: stable-diffusion/ldm/modules/attention.py, lines 196 to 215 in ce05de2
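For what it's worth, here is a minimal sketch of what injecting positional information could look like, assuming one added it right after the feature map is flattened to a `(b, h*w, c)` sequence in `SpatialTransformer.forward`. The class name, the `max_hw` cap, and the zero-initialization are my own assumptions for illustration, not anything that exists in the repo:

```python
import torch
import torch.nn as nn


class LearnedPositionalEmbedding2D(nn.Module):
    """Hypothetical: a learned per-position embedding added to the flattened
    spatial tokens before the transformer blocks. Zero-initialized so a
    pretrained checkpoint would behave identically at the start of fine-tuning."""

    def __init__(self, inner_dim: int, max_hw: int = 64 * 64):
        super().__init__()
        # one learnable vector per spatial position (assumed upper bound max_hw)
        self.pos_emb = nn.Parameter(torch.zeros(1, max_hw, inner_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (b, h*w, inner_dim) -- the sequence after proj_in and the rearrange
        seq_len = x.shape[1]
        return x + self.pos_emb[:, :seq_len, :]


if __name__ == "__main__":
    # toy usage: batch of 2, a 32x32 latent grid, 320 channels
    tokens = torch.randn(2, 32 * 32, 320)
    pe = LearnedPositionalEmbedding2D(inner_dim=320, max_hw=64 * 64)
    print(pe(tokens).shape)  # torch.Size([2, 1024, 320])
```

This is only meant to make the question concrete; whether the conv projections and the surrounding U-Net already leak enough spatial information to make this unnecessary is exactly what I'm asking about.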