
[bug]: Tiled decoding ruins the image #6144

Open
seinan9 opened this issue Apr 4, 2024 · 7 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@seinan9

seinan9 commented Apr 4, 2024

Is there an existing issue for this problem?

  • I have searched the existing issues

Operating system

Windows

GPU vendor

Nvidia (CUDA)

GPU model

RTX 4060 TI

GPU VRAM

16GB

Version number

4.0.2

Browser

Google Chrome 123.0.6312.105

Python dependencies

{
"accelerate": "0.28.0",
"compel": "2.0.2",
"cuda": "12.1",
"diffusers": "0.27.2",
"numpy": "1.26.4",
"opencv": "4.9.0.80",
"onnx": "1.15.0",
"pillow": "10.3.0",
"python": "3.10.6",
"torch": "2.2.1+cu121",
"torchvision": "0.17.1+cu121",
"transformers": "4.39.1",
"xformers": "0.0.25"
}

What happened

The image quality is significantly worse when force_tiled_decode is set to true. This is slightly noticeable during the initial generation, and much more so when upscaling. Certain parts look oversaturated. A clear difference can be seen in the images below (first with tiled decode off, second with tiled decode on). In the second image, the upper part (probably the first tile) is also less affected than the bottom part (probably the second tile).

tiled_decode_off
tiled_decode_on

What you expected to happen

I expected the images to be decoded normally without oversaturation on certain parts.

How to reproduce the problem

Set force_tiled_decode option in invokeai.yaml to true, start the application and generate an image (easier to spot with realistic ones and upscaling).
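For reference, the toggle in question is a single key; a minimal fragment (v4 uses a flat config schema, surrounding keys omitted):

```yaml
# invokeai.yaml (v4 flat schema; other keys omitted)
force_tiled_decode: true
```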

Additional context

The examples were generated with epicrealism (SD 1.5), but this is reproducible with other models as well; it is easier to spot in realistic ones. This does not happen in InvokeAI version 3.7.

The initial image was generated with the following parameters:

  • Generation Mode: txt2img
  • Positive Prompt: photo of a viking, short hair, oversized sweater, close up, fierce, male
  • Negative Prompt: (low quality)1.4
  • Model: epicrealism (SD-1)
  • Width: 512
  • Height: 768
  • Seed: 2926731161
  • Steps: 25
  • Scheduler: dpmpp_2m_k
  • CFG scale: 8
  • CFG Rescale Multiplier: 0

Afterwards it was upscaled to 640x960 via img2img with a denoising strength of 0.55. All other parameters stayed the same.

Discord username

seinan9

@seinan9 seinan9 added the bug Something isn't working label Apr 4, 2024
@psychedelicious
Collaborator

The handling of tiled decode hasn't changed in some time - several months. This functionality is handled wholly by diffusers, and it appears their implementation also hasn't changed in months.

It's possible there was some change in another area of diffusers or invoke that indirectly affects how tiled decoding is handled.

However, slight changes like this are known effects of tiled decoding. Because the model doesn't have the full context of the image, tiled decoding is expected to produce measurable, and sometimes visible, differences. There's some discussion here, though the example images appear to be missing now.

A more convincing comparison would be between a v3.7.0 image with tiled decode vs v4 image with tiled decode (no upscaling please, that adds another variable to the equation). Ideally a few comparisons.
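To illustrate why tiled decoding drifts from a full decode even with seam blending, here is a minimal numpy stand-in for the tile/blend/crop stitching that diffusers uses (names like `fake_decode`, `TILE`, and `OVERLAP` are illustrative, not diffusers API). The stand-in "decoder" normalizes by whole-input statistics, the way GroupNorm layers in a real VAE decoder do, so each tile is decoded with statistics the full image doesn't share:

```python
import numpy as np

TILE, OVERLAP = 64, 16        # illustrative tile size / overlap
STRIDE = TILE - OVERLAP       # 48: step between tile origins
CROP = TILE - OVERLAP         # 48: how much of each blended tile is kept

def fake_decode(x):
    # Stand-in for the VAE decoder. Like GroupNorm in a real decoder, its
    # output depends on whole-input statistics, so decoding a tile in
    # isolation gives a different result than decoding the full image.
    return (x - x.mean()) / (x.std() + 1e-5)

def blend_v(a, b, extent):
    # Linear crossfade along height over the overlap region
    # (mirrors the structure of diffusers' blend_v).
    for y in range(extent):
        w = y / extent
        b[y, :] = a[a.shape[0] - extent + y, :] * (1 - w) + b[y, :] * w
    return b

def blend_h(a, b, extent):
    # Linear crossfade along width (mirrors blend_h).
    for x in range(extent):
        w = x / extent
        b[:, x] = a[:, a.shape[1] - extent + x] * (1 - w) + b[:, x] * w
    return b

def tiled_decode(latents):
    # Decode overlapping tiles, blend the seams, then crop and stitch,
    # following the structure of diffusers' tiled_decode.
    rows = [[fake_decode(latents[i:i + TILE, j:j + TILE])
             for j in range(0, latents.shape[1], STRIDE)]
            for i in range(0, latents.shape[0], STRIDE)]
    out = []
    for i, row in enumerate(rows):
        pieces = []
        for j, tile in enumerate(row):
            if i > 0:
                tile = blend_v(rows[i - 1][j], tile, OVERLAP)
            if j > 0:
                tile = blend_h(row[j - 1], tile, OVERLAP)
            pieces.append(tile[:CROP, :CROP])
        out.append(np.concatenate(pieces, axis=1))
    return np.concatenate(out, axis=0)

rng = np.random.default_rng(0)
latents = rng.standard_normal((96, 96))   # 96 = exactly 2 tiles at stride 48
full = fake_decode(latents)
tiled = tiled_decode(latents)
print(float(np.abs(full - tiled).max()))  # nonzero: per-tile statistics differ
```

The crossfade hides hard seams, but it cannot correct the per-tile statistics, which is consistent with the tone/saturation shifts reported above.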

@seinan9
Author

seinan9 commented Apr 5, 2024

It is less visible during the first pass, since there is typically only a single tile (two at most if the resolution is set a bit higher). Still, here are two more examples without upscaling (512x768). Images 1 and 3 were generated with InvokeAI 3.7, while 2 and 4 were generated with InvokeAI 4.0.2. The parameters were the same for all images.

invoke37_tiled_decode_on_0
invoke402_tiled_decode_on_0
invoke37_tiled_decode_on_1
invoke402_tiled_decode_on_1

@psychedelicious
Collaborator

Thanks for those examples. It's still very noticeable. I think we need to test this with diffusers directly (i.e. via a separate script, not within Invoke) to confirm where the problem is.

@seinan9
Author

seinan9 commented Apr 5, 2024

You're welcome. And thank you for looking into it!

@psychedelicious psychedelicious added the help wanted Extra attention is needed label Apr 29, 2024
@RyanJDick
Collaborator

I tried to reproduce this today. It turns out there was no regression in VAE tiling behavior itself. However, during the switch from tiled_decode to force_tiled_decode there was a period when the force_tiled_decode config wasn't being applied at all.

For example, look at the v3.6.2 tag:

This was eventually fixed in 897fe49.

I tested VAE tiling in older versions of Invoke via workflows and saw the same bad VAE tiling artifacts as in the latest version of Invoke. Unfortunately, these tiling artifacts are expected in the current diffusers implementation of VAE tiling, as discussed on the original PR: huggingface/diffusers#1441

I'm going to do a little experimentation to see if I can improve things by modifying the tile dimensions/overlaps. But a proper fix would be a bigger project.

@seinan9
Author

seinan9 commented Jun 27, 2024

It's unfortunate that the problem lies in the diffusers implementation. For me it is not an urgent problem, but I am still grateful that you are looking into it. Thanks!

@ufuksarp
Contributor

I had the same issue with EasyDiffusion last year. They added a switch to disable VAE tiling. It seems that's the only fix right now. easydiffusion/easydiffusion#1442
