[bug]: Tiled decoding ruins the image #6144
Comments
The handling of tiled decode hasn't changed in some time - several months. This functionality is handled wholly by It's possible there was some change in another area of However, slight changes like this are known effects of tiled decoding. The model doesn't have the full context of the image, so it's expected that the tiled decode has measurable and sometimes visual differences. There's some discussion here, though the example images appear to be missing now. A more convincing comparison would be between a v3.7.0 image with tiled decode vs a v4 image with tiled decode (no upscaling please, that adds another variable to the equation). Ideally a few comparisons.
It is less visible during the first pass, since there is typically only a single tile (2 at most if the resolution is set a bit higher). Still, here are two more examples without upscaling (512x768). Images 1 and 3 were generated with InvokeAI 3.7, while 2 and 4 were generated with InvokeAI 4.0.2. Same parameters for all images.
Thanks for those examples. It's still very noticeable. I think we need to test this with diffusers (i.e. via a separate script, not within Invoke) to confirm where the problem is.
You're welcome. And thank you for looking into it!
I tried to reproduce this today. It turns out that there was no regression in VAE tiling behavior. There was a period of time during the switch from For example, look at the v3.6.2 tag:
This was eventually fixed in 897fe49. I tested VAE tiling in older versions of Invoke via workflows and saw the same bad VAE tiling artifacts as in the latest version of Invoke. Unfortunately, these tiling artifacts are expected in the current diffusers implementation of VAE tiling, as discussed on the original PR: huggingface/diffusers#1441. I'm going to do a little experimentation to see if I can improve things by modifying the tile dimensions/overlaps. But a proper fix would be a bigger project.
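To make the effect of tile size and overlap concrete, here is a toy 1-D sketch in plain Python. It is not the diffusers implementation: `toy_decode`, `tiled_decode`, and `mean_abs_error` are hypothetical names, and the "decoder" is a stand-in that only centres its input on its own mean, assumed here as a rough analogy for a decoder whose output depends on everything it is given. Tiles are decoded independently and overlapping regions are combined with a weighted average.

```python
def toy_decode(latents):
    # Hypothetical stand-in for a decoder: the output depends on the mean of
    # whatever input it sees, so a tile and the full input decode differently.
    mean = sum(latents) / len(latents)
    return [x - mean for x in latents]


def tiled_decode(latents, tile, overlap):
    # Decode overlapping tiles independently, then combine them with a
    # weighted average whose weights fall off towards each tile's edges.
    assert 0 <= overlap < tile
    n = len(latents)
    stride = tile - overlap
    total = [0.0] * n
    weight = [0.0] * n
    start = 0
    while True:
        end = min(start + tile, n)
        decoded = toy_decode(latents[start:end])
        m = len(decoded)
        for i, value in enumerate(decoded):
            w = min(i + 1, m - i)  # triangular weight within the tile
            total[start + i] += w * value
            weight[start + i] += w
        if end == n:
            break
        start += stride
    return [t / w for t, w in zip(total, weight)]


def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# A "dark" half followed by a "bright" half.
latents = [0.0] * 8 + [10.0] * 8
full = toy_decode(latents)

print(mean_abs_error(full, tiled_decode(latents, tile=16, overlap=0)))
print(mean_abs_error(full, tiled_decode(latents, tile=8, overlap=0)))
print(mean_abs_error(full, tiled_decode(latents, tile=8, overlap=4)))
```

In this toy setup a single tile reproduces the full decode exactly, two non-overlapping tiles give a mean absolute error of 5.0, and adding an overlap of 4 lowers it to 3.75 without removing it. That is consistent with the observation in this thread that the artifacts are hard to see when there is only one tile, and it suggests why tuning overlaps alone may soften the problem rather than fix it.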
It's unfortunate that the problem lies within the diffusers implementation. For me it is not an urgent problem, but I am still grateful that you are looking into it. Thanks!
I had the same issue with EasyDiffusion last year. They added a switch to disable VAE tiling. It seems that's the only fix right now: easydiffusion/easydiffusion#1442
Is there an existing issue for this problem?
Operating system
Windows
GPU vendor
Nvidia (CUDA)
GPU model
RTX 4060 TI
GPU VRAM
16GB
Version number
4.0.2
Browser
Google Chrome 123.0.6312.105
Python dependencies
{
"accelerate": "0.28.0",
"compel": "2.0.2",
"cuda": "12.1",
"diffusers": "0.27.2",
"numpy": "1.26.4",
"opencv": "4.9.0.80",
"onnx": "1.15.0",
"pillow": "10.3.0",
"python": "3.10.6",
"torch": "2.2.1+cu121",
"torchvision": "0.17.1+cu121",
"transformers": "4.39.1",
"xformers": "0.0.25"
}
What happened
The image quality is significantly worse when force_tiled_decode is set to true. This is slightly noticeable during the initial generation and much more so when upscaling. Certain parts look oversaturated. There is a clear difference between the images below (first with tiled decode off, second with tiled decode on). In the second image, the upper part (probably the first tile) is also less affected than the bottom part (probably the second tile).
What you expected to happen
I expected the images to be decoded normally without oversaturation on certain parts.
How to reproduce the problem
Set the force_tiled_decode option in invokeai.yaml to true, start the application, and generate an image (the problem is easier to spot with realistic images and upscaling).
Additional context
The examples are generated with epicrealism (SD 1.5), but this is reproducible with other models as well. In realistic ones it is easier to spot. This does not happen in InvokeAI version 3.7.
The initial image was generated with the following parameters:
Afterwards it was upscaled to 640x960 via img2img with a denoise of 0.55. The parameters stayed the same.
Discord username
seinan9