Blank black image output with stable diffusion 2.1 using autocast #1614

fralumz · 2022-12-08T17:54:44Z

Describe the bug

Using the Stable Diffusion pipeline with torch.autocast and the stabilityai/stable-diffusion-2-1 model, the generated images are all blank black images.

Reproduction

import torch
from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

model_id = "stabilityai/stable-diffusion-2-1"

scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"

image = pipe(prompt, height=768, width=768).images[0]

image.save("astronaut_rides_horse.png") # works fine

with torch.autocast("cuda"):
    image = pipe(prompt, height=768, width=768).images[0] # generates blank image
    image.save("astronaut_rides_horse_autocast.png")

Logs

Python 3.10.8 | packaged by conda-forge | (main, Nov 24 2022, 14:07:00) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler
>>>
>>> model_id = "stabilityai/stable-diffusion-2-1"
>>>
>>> scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
>>> pipe = StableDiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
Fetching 12 files: 100%|█████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 6015.50it/s]
>>> pipe = pipe.to("cuda")
>>>
>>> prompt = "a photo of an astronaut riding a horse on mars"
>>>
>>> image = pipe(prompt, height=768, width=768).images[0]
100%|██████████████████████████████████████████████████████████████████████████████████| 50/50 [00:14<00:00,  3.57it/s]
>>>
>>> image.save("astronaut_rides_horse.png") # works fine
>>>
>>> with torch.autocast("cuda"):
...     image = pipe(prompt, height=768, width=768).images[0] # generates blank image
...     image.save("astronaut_rides_horse_autocast.png")
...
100%|██████████████████████████████████████████████████████████████████████████████████| 50/50 [00:12<00:00,  3.95it/s]

System Info

diffusers version: 0.10.0.dev0
Platform: Windows-10-10.0.19044-SP0
Python version: 3.10.8
PyTorch version (GPU?): 1.13.0 (True)
Huggingface_hub version: 0.11.1
Transformers version: 4.25.1
Using GPU in script?: yes
Using distributed or parallel set-up in script?: no


patrickvonplaten · 2022-12-08T23:50:01Z

Hey @fralumz,

It is not recommended to use autocast: it is both slower than pure fp16 and more unstable and harder to control.

HughPH · 2022-12-23T21:15:43Z

After removing the with torch.autocast("cuda"): block, I get images again with 0.10.1, but it's back to black output with 0.10.2 or above, including 0.12.0dev0.

patrickvonplaten · 2023-01-03T11:31:53Z

Hey @HughPH,

Could you please post a reproducible code snippet, including your current diffusers, PyTorch, etc. versions? (You can get them with diffusers-cli env.)

HughPH · 2023-01-03T13:24:36Z

Yes, when I finish work I'll duplicate my environment, get a minimal script together, and then make sure it breaks with 0.10.2

patrickvonplaten · 2023-01-17T09:31:15Z

Since a couple of people seem to be following this issue, I'd like to clarify a couple of things regarding the 2.1 model.

The 2.1 model was trained with xformers flash attention. Upon release, it was noticed that without xformers, the model produces black images with both torch.autocast and pure fp16, as noted by the authors here: Stability-AI/stablediffusion@c12d960
The reason is that with xformers flash attention (explained here: https://deepai.org/publication/flashattention-fast-and-memory-efficient-exact-attention-with-io-awareness),
values are much less prone to overflow since there are fewer additions.
=> So we strongly recommend always using xformers if possible when using SD 2.1. (It's a pity that there are no clean xformers wheels yet, but this should change very soon with PyTorch 2.0.)

If you cannot use xformers, then there is a chance that you run into precision problems in half precision.
To fix this, we found that upcasting the query and key states just before the attention allows the model to be run
in inference in fp16. See PR here: #1590
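To make the overflow and the upcasting fix concrete, here is a minimal NumPy sketch (not the diffusers implementation from the PR). The function name attention_probs, the upcast flag, and the toy query/key values are all assumptions chosen for illustration. It only shows that float16 attention scores can exceed the largest float16 value (65504) and turn the softmax into NaN, while upcasting the query and key to float32 first keeps everything finite.

```python
import numpy as np


def attention_probs(q, k, upcast=False):
    """Softmax over scaled q @ k^T (hypothetical helper, illustration only).

    With upcast=True, q and k are cast to float32 before the matmul,
    mimicking the idea of upcasting query and key states.
    """
    if upcast:
        q, k = q.astype(np.float32), k.astype(np.float32)
    scale = q.shape[-1] ** -0.5
    with np.errstate(over="ignore", invalid="ignore"):
        # In float16 the raw dot products can exceed 65504 and become inf.
        scores = (q @ k.T) * scale
        # Standard max-subtraction; inf - inf yields NaN.
        scores = scores - scores.max(axis=-1, keepdims=True)
        probs = np.exp(scores)
        return probs / probs.sum(axis=-1, keepdims=True)


# Toy float16 inputs (assumed values): 64 * 40 * 40 = 102400 > 65504.
q = np.full((2, 64), 40.0, dtype=np.float16)
k = np.stack([np.full(64, 40.0), np.full(64, 20.0)]).astype(np.float16)

print(np.isnan(attention_probs(q, k)).any())               # True
print(np.isnan(attention_probs(q, k, upcast=True)).any())  # False
```

The real model's activations are of course not constant vectors; the point is only that the overflow happens in the query-key product, which is why upcasting just those two tensors can be enough for pure fp16 inference.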

Note: This fixes inference only for pure fp16, not for autocast. Autocast is much harder to control, as it will automatically downcast weights to fp16 before the softmax computation, which will then lead to black images.
Also, 2.1 cannot be fine-tuned in mixed precision because of this: the layer is prone to overflow.

Conclusion:

Please do not use torch.autocast(...) with diffusers SD 2.1 (actually, don't use it at all; we only see disadvantages compared to pure fp16).
Please use xformers if you can.
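As a rough sketch of what these two recommendations look like in practice, the snippet below mirrors the reproduction script from the report, drops the autocast context, and turns on xformers attention. The wrapper function build_sd21_pipeline is a hypothetical name introduced here. It assumes a CUDA GPU and an installed xformers package, and has not been checked against every diffusers version.

```python
def build_sd21_pipeline(model_id="stabilityai/stable-diffusion-2-1", device="cuda"):
    """Load SD 2.1 in pure fp16 and enable xformers attention (no autocast)."""
    # Imported lazily so that defining this helper does not need a GPU stack.
    import torch
    from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler

    scheduler = EulerDiscreteScheduler.from_pretrained(model_id, subfolder="scheduler")
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16
    )
    pipe = pipe.to(device)
    # Requires the separate xformers package to be installed.
    pipe.enable_xformers_memory_efficient_attention()
    return pipe


# Usage (needs a CUDA GPU; note there is no torch.autocast wrapper):
#   pipe = build_sd21_pipeline()
#   image = pipe("a photo of an astronaut riding a horse on mars",
#                height=768, width=768).images[0]
#   image.save("astronaut_rides_horse.png")
```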

Also trying to make this a bit clearer in the docs: #2021

github-actions · 2023-02-10T15:03:34Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

fralumz added the bug Something isn't working label Dec 8, 2022

keturn mentioned this issue Jan 15, 2023

[bug]: SD 2.x models return only black in float16 invoke-ai/InvokeAI#2329

Closed

Sanster mentioned this issue Jan 16, 2023

[bug] Stable diffusion aitemplate img2img precision error facebookincubator/AITemplate#141

Open

github-actions bot added the stale Issues that haven't received updates label Feb 10, 2023

github-actions bot closed this as completed Feb 19, 2023

phillipinseoul mentioned this issue Sep 30, 2023

Bad generation results on "stabilityai/stable-diffusion-2-1" KAIST-Visual-AI-Group/SyncDiffusion#2

Closed

bghira mentioned this issue Apr 2, 2024

[mps] training / inference dtype issues #7563

Open
