Describe the bug
The AutoencoderTiny (TAESD) decoder seems to work fine. Encoding, on the other hand, produces poor results, and an encode/decode round trip through AutoencoderTiny comes out badly degraded; see the reproduction below.
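For context, the "decoder works fine" observation comes from the usual drop-in usage of TAESD as a fast decoder for a Stable Diffusion pipeline. A minimal sketch of that usage (checkpoint names as in the reproduction; the prompt is arbitrary):

import torch
from diffusers import AutoencoderTiny, DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
# Swap in the tiny autoencoder; only its decoder is exercised by the pipeline's final decode step
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float16)
pipe = pipe.to("cuda")
image = pipe("a photo of a vintage car", num_inference_steps=25).images[0]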
Reproduction
See https://gist.github.com/keturn/b0a10a3b388e1e49cdf38567b76eb30c
import diffusers, torch
from PIL.Image import Image, open as image_open

device = torch.device("cuda:0")

with torch.inference_mode():
    taesd = diffusers.AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float16).to(device=device)
    vaesd = diffusers.AutoencoderKL.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="vae", variant="fp16", torch_dtype=torch.float16).to(device=device)

from diffusers.utils.testing_utils import load_image

image = load_image(
    "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/versatile_diffusion/benz.jpg"
)

from diffusers.image_processor import VaeImageProcessor

vae_processor = VaeImageProcessor()
# preprocess() converts the PIL image to a normalized ([-1, 1]) tensor batch
image_tensor: torch.FloatTensor = vae_processor.preprocess(image).to(dtype=torch.float16, device=device)
print(f"image tensor range: {image_tensor.min()} < {image_tensor.mean()} < {image_tensor.max()}")

with torch.inference_mode():
    # Encode the same image with both the tiny autoencoder and the full SD VAE
    taesd_latents = taesd.encode(image_tensor).latents
    print(f"taesd-encoded latent range: {taesd_latents.min()} < {taesd_latents.mean()} (σ={taesd_latents.std()}) < {taesd_latents.max()}")
    vaesd_latents = vaesd.encode(image_tensor).latent_dist.sample()
    print(f"vaesd-encoded latent range: {vaesd_latents.min()} < {vaesd_latents.mean()} (σ={vaesd_latents.std()}) < {vaesd_latents.max()}")

with torch.inference_mode():
    # Round trip: decode the TAESD-encoded latents with the TAESD decoder
    redecoded_tensor = taesd.decode(taesd_latents).sample

redecoded_image = vae_processor.postprocess(redecoded_tensor)
display(image, redecoded_image[0])  # IPython display; run in a notebook

from diffusers.commands import env
env.EnvironmentCommand().run()
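For a quantitative view of the degradation (not in the original gist), the round-trip reconstruction error of the two autoencoders can be compared directly. This sketch reuses the taesd, vaesd, and image_tensor objects defined above:

import torch.nn.functional as F

with torch.inference_mode():
    # Encode/decode round trip through the tiny autoencoder...
    taesd_roundtrip = taesd.decode(taesd.encode(image_tensor).latents).sample
    # ...and through the full SD VAE for comparison (no scaling_factor handling is needed
    # for a same-model encode/decode round trip)
    vaesd_roundtrip = vaesd.decode(vaesd.encode(image_tensor).latent_dist.sample()).sample

print(f"TAESD round-trip MSE:  {F.mse_loss(taesd_roundtrip.float(), image_tensor.float()).item():.4f}")
print(f"SD VAE round-trip MSE: {F.mse_loss(vaesd_roundtrip.float(), image_tensor.float()).item():.4f}")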
System Info
- diffusers version: 0.20.0
- Platform: Linux-5.15.0-79-generic-x86_64-with-glibc2.35
- Python version: 3.11.4
- PyTorch version (GPU?): 2.0.1+cu118 (True)
- Huggingface_hub version: 0.16.4
- Transformers version: 4.31.0
- Accelerate version: 0.21.0
- xFormers version: 0.0.21
- Using GPU in script?: yes
- Using distributed or parallel set-up in script?: no
Who can help?
No response