
[bug]: Tiled decoding ruins the image #6144

Open
seinan9 opened this issue Apr 4, 2024 · 7 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@seinan9

seinan9 commented Apr 4, 2024

Is there an existing issue for this problem?

  • I have searched the existing issues

Operating system

Windows

GPU vendor

Nvidia (CUDA)

GPU model

RTX 4060 TI

GPU VRAM

16GB

Version number

4.0.2

Browser

Google Chrome 123.0.6312.105

Python dependencies

{
"accelerate": "0.28.0",
"compel": "2.0.2",
"cuda": "12.1",
"diffusers": "0.27.2",
"numpy": "1.26.4",
"opencv": "4.9.0.80",
"onnx": "1.15.0",
"pillow": "10.3.0",
"python": "3.10.6",
"torch": "2.2.1+cu121",
"torchvision": "0.17.1+cu121",
"transformers": "4.39.1",
"xformers": "0.0.25"
}

What happened

The image quality is significantly worse when force_tiled_decode is set to true. This is slightly noticeable during the initial generation, and much more so when upscaling. Certain parts look oversaturated. A clear difference can be seen in the images below (first with tiled decode off, second with tiled decode on). In the second image, the upper part (probably the first tile) is also less affected than the bottom part (probably the second tile).

tiled_decode_off
tiled_decode_on

What you expected to happen

I expected the images to be decoded normally without oversaturation on certain parts.

How to reproduce the problem

Set force_tiled_decode option in invokeai.yaml to true, start the application and generate an image (easier to spot with realistic ones and upscaling).
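For reference, the toggle in question is a single key; a minimal fragment (v4 uses a flat config schema, surrounding keys omitted):

```yaml
# invokeai.yaml (v4 flat schema; other keys omitted)
force_tiled_decode: true
```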

Additional context

The examples were generated with epicrealism (SD 1.5), but this is reproducible with other models as well; it is easier to spot in realistic ones. This does not happen in InvokeAI version 3.7.

The initial image was generated with the following parameters:

  • Generation Mode: txt2img
  • Positive Prompt: photo of a viking, short hair, oversized sweater, close up, fierce, male
  • Negative Prompt: (low quality)1.4
  • Model: epicrealism (SD-1)
  • Width: 512
  • Height: 768
  • Seed: 2926731161
  • Steps: 25
  • Scheduler: dpmpp_2m_k
  • CFG scale: 8
  • CFG Rescale Multiplier: 0

Afterwards it was upscaled to 640x960 via img2img with a denoising strength of 0.55. All other parameters stayed the same.

Discord username

seinan9

@seinan9 seinan9 added the bug Something isn't working label Apr 4, 2024
@psychedelicious
Collaborator

The handling of tiled decode hasn't changed in some time - several months. This functionality is handled wholly by diffusers, and it appears their implementation also hasn't changed in months.

It's possible there was some change in another area of diffusers or invoke that indirectly affects how tiled decoding is handled.

However, slight changes like this are known effects of tiled decoding. Because the model doesn't have the full context of the image, tiled decoding is expected to produce measurable, and sometimes visible, differences. There's some discussion here, though the example images appear to be missing now.

A more convincing comparison would be between a v3.7.0 image with tiled decode vs v4 image with tiled decode (no upscaling please, that adds another variable to the equation). Ideally a few comparisons.
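To illustrate why tiled decoding drifts from a full decode even with seam blending, here is a minimal numpy stand-in for the tile/blend/crop stitching that diffusers uses (names like `fake_decode`, `TILE`, and `OVERLAP` are illustrative, not diffusers API). The stand-in "decoder" normalizes by whole-input statistics, the way GroupNorm layers in a real VAE decoder do, so each tile is decoded with statistics the full image doesn't share:

```python
import numpy as np

TILE, OVERLAP = 64, 16        # illustrative tile size / overlap
STRIDE = TILE - OVERLAP       # 48: step between tile origins
CROP = TILE - OVERLAP         # 48: how much of each blended tile is kept

def fake_decode(x):
    # Stand-in for the VAE decoder. Like GroupNorm in a real decoder, its
    # output depends on whole-input statistics, so decoding a tile in
    # isolation gives a different result than decoding the full image.
    return (x - x.mean()) / (x.std() + 1e-5)

def blend_v(a, b, extent):
    # Linear crossfade along height over the overlap region
    # (mirrors the structure of diffusers' blend_v).
    for y in range(extent):
        w = y / extent
        b[y, :] = a[a.shape[0] - extent + y, :] * (1 - w) + b[y, :] * w
    return b

def blend_h(a, b, extent):
    # Linear crossfade along width (mirrors blend_h).
    for x in range(extent):
        w = x / extent
        b[:, x] = a[:, a.shape[1] - extent + x] * (1 - w) + b[:, x] * w
    return b

def tiled_decode(latents):
    # Decode overlapping tiles, blend the seams, then crop and stitch,
    # following the structure of diffusers' tiled_decode.
    rows = [[fake_decode(latents[i:i + TILE, j:j + TILE])
             for j in range(0, latents.shape[1], STRIDE)]
            for i in range(0, latents.shape[0], STRIDE)]
    out = []
    for i, row in enumerate(rows):
        pieces = []
        for j, tile in enumerate(row):
            if i > 0:
                tile = blend_v(rows[i - 1][j], tile, OVERLAP)
            if j > 0:
                tile = blend_h(row[j - 1], tile, OVERLAP)
            pieces.append(tile[:CROP, :CROP])
        out.append(np.concatenate(pieces, axis=1))
    return np.concatenate(out, axis=0)

rng = np.random.default_rng(0)
latents = rng.standard_normal((96, 96))   # 96 = exactly 2 tiles at stride 48
full = fake_decode(latents)
tiled = tiled_decode(latents)
print(float(np.abs(full - tiled).max()))  # nonzero: per-tile statistics differ
```

The crossfade hides hard seams, but it cannot correct the per-tile statistics, which is consistent with the tone/saturation shifts reported above.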

@seinan9
Author

seinan9 commented Apr 5, 2024

It is less visible during the first pass, since there is typically only a single tile (two at most if the resolution is set a bit higher). Still, here are two more examples without upscaling (512x768). Images 1 and 3 were generated with InvokeAI 3.7, while 2 and 4 were generated with InvokeAI 4.0.2. The parameters were the same for all images.

invoke37_tiled_decode_on_0
invoke402_tiled_decode_on_0
invoke37_tiled_decode_on_1
invoke402_tiled_decode_on_1

@psychedelicious
Collaborator

Thanks for those examples. It's still very noticeable. I think we need to test this with diffusers directly (i.e. via a separate script, not within Invoke) to confirm where the problem is.

@seinan9
Author

seinan9 commented Apr 5, 2024

You're welcome. And thank you for looking into it!

@psychedelicious psychedelicious added the help wanted Extra attention is needed label Apr 29, 2024
@RyanJDick
Collaborator

I tried to reproduce this today. It turns out there was no regression in VAE tiling behavior itself. However, during the switch from tiled_decode to force_tiled_decode there was a period when the force_tiled_decode config wasn't being applied at all.

For example, look at the v3.6.2 tag:

This was eventually fixed in 897fe49.

I tested VAE tiling in older versions of Invoke via workflows and saw the same bad VAE tiling artifacts as in the latest version of Invoke. Unfortunately, these tiling artifacts are expected in the current diffusers implementation of VAE tiling, as discussed on the original PR: huggingface/diffusers#1441

I'm going to do a little experimentation to see if I can improve things by modifying the tile dimensions/overlaps. But a proper fix would be a bigger project.

@seinan9
Author

seinan9 commented Jun 27, 2024

It's unfortunate that the problem lies in the diffusers implementation. For me it is not an urgent problem, but I am still grateful that you are looking into it. Thanks!

@ufuksarp
Contributor

I had the same issue with EasyDiffusion last year. They added a switch to disable VAE tiling. It seems that's the only fix right now. easydiffusion/easydiffusion#1442
