Tiled vae parameter validation #6

wbruna · 2025-09-13T10:57:57Z

I fixed the tiling parameter processing so that both the direct tile dimensions and the relative factors are limited, and added limits to the overlap factor (as I recall, we shouldn't allow an overlap factor larger than 0.5).

I also noticed the tile size bump for the encoding path wasn't covered by that limit, so I pulled it into the auxiliary function too.

And since different rel_size ranges already have different effects on the calculation, I replaced the explicit relative boolean with implicit rel_size > 0 tests.
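To make the validation rules above concrete, here is a minimal sketch of how such limits could be applied. It is not the PR's code: `TilingParams`, `clamp_overlap`, `resolve_tile_size`, the default of 32 and the minimum of 4 are all assumptions made for illustration, and the sketch only treats `rel_size` as a simple fraction of the latent dimension.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical parameter bundle; the field names are illustrative only.
struct TilingParams {
    int   tile_size      = 0;     // absolute tile size in latent units, 0 = use the default
    float rel_size       = 0.0f;  // > 0 selects relative sizing (no separate boolean)
    float target_overlap = 0.5f;  // requested overlap factor
};

// Assumed constants for this sketch, not taken from the PR.
constexpr int kDefaultTileSize = 32;
constexpr int kMinTileSize     = 4;

// Keep the overlap factor inside [0, 0.5].
inline float clamp_overlap(float overlap) {
    return std::max(0.0f, std::min(overlap, 0.5f));
}

// Resolve one tile dimension. encoding_factor models the "tile size bump"
// for the encoding path, applied before the final limit so it is covered too.
inline int resolve_tile_size(const TilingParams& p, int latent_size, float encoding_factor = 1.0f) {
    int tile = kDefaultTileSize;
    if (p.rel_size > 0.0f) {
        // Relative sizing: a fraction of the latent dimension.
        tile = static_cast<int>(std::round(latent_size * p.rel_size));
    } else if (p.tile_size >= kMinTileSize) {
        // Absolute sizing: accepted only when it is not degenerate.
        tile = p.tile_size;
    }
    tile = static_cast<int>(tile * encoding_factor);
    // Never smaller than the minimum, and otherwise never larger than the latent.
    return std::max(kMinTileSize, std::min(tile, latent_size));
}
```

With these assumptions, an invalid absolute size (for example 2) falls back to the default, and an oversized request is limited to the latent dimension even after the encoding bump.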

wbruna · 2025-09-13T11:12:51Z

And it's not a problem with the PR itself, but another thing I've noticed:

[INFO ] stable-diffusion.cpp:2154 - decoding 1 latents
[DEBUG] stable-diffusion.cpp:1493 - VAE Tile size: 48x48
[DEBUG] ggml_extend.hpp:811  - num tiles : 3, 5 
[DEBUG] ggml_extend.hpp:812  - optimal overlap : 0.416667, 0.500000 (targeting 0.500000)
[DEBUG] ggml_extend.hpp:845  - tile work buffer size: 1.72 MB
[INFO ] ggml_extend.hpp:858  - processing 15 tiles
[DEBUG] ggml_extend.hpp:1540 - vae compute buffer size: 360.04 MB(VRAM)
  |===>                                              | 1/15 - 2.07it/s[DEBUG] ggml_extend.hpp:1540 - vae compute buffer size: 360.04 MB(VRAM)
  |======>                                           | 2/15 - 2.11it/s[DEBUG] ggml_extend.hpp:1540 - vae compute buffer size: 360.04 MB(VRAM)
  |==========>                                       | 3/15 - 2.11it/s[DEBUG] ggml_extend.hpp:1540 - vae compute buffer size: 360.04 MB(VRAM)
  |=============>                                    | 4/15 - 2.12it/s[DEBUG] ggml_extend.hpp:1540 - vae compute buffer size: 360.04 MB(VRAM)
  |================>                                 | 5/15 - 2.12it/s[DEBUG] ggml_extend.hpp:1540 - vae compute buffer size: 360.04 MB(VRAM)

We don't seem to be reusing the context across tiles.

It looks like that'd be controlled by the free_compute_buffer_immediately boolean from the GGMLRunner::compute function, but we actually call the VAE through the AutoEncoderKL::compute, so we'd need to either add a boolean to that overload, or fix it to false and free the buffers explicitly on each call site. Am I on the right track? :-)
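A toy model of the two options, only to illustrate the trade-off. `ToyRunner` and `process_tiles` are hypothetical stand-ins and do not reproduce the real GGMLRunner or AutoEncoderKL interfaces; only the flag name is borrowed from the discussion.

```cpp
#include <cstddef>
#include <vector>

// Assumed buffer size for the toy model.
constexpr std::size_t kToyBufferBytes = 1024;

// Toy stand-in for a runner that owns a compute buffer.
class ToyRunner {
public:
    int alloc_count = 0;  // how many times the buffer had to be allocated

    void compute(bool free_compute_buffer_immediately) {
        if (buffer_.empty()) {
            buffer_.resize(kToyBufferBytes);  // allocate on demand
            ++alloc_count;
        }
        // ... the graph for one tile would run here ...
        if (free_compute_buffer_immediately) {
            free_compute_buffer();
        }
    }

    void free_compute_buffer() {
        buffer_.clear();
        buffer_.shrink_to_fit();
    }

private:
    std::vector<unsigned char> buffer_;
};

// Process all tiles. With reuse_buffer set, the buffer survives between
// tiles and is released once, explicitly, by the call site.
inline int process_tiles(ToyRunner& runner, int num_tiles, bool reuse_buffer) {
    for (int i = 0; i < num_tiles; ++i) {
        runner.compute(/*free_compute_buffer_immediately=*/!reuse_buffer);
    }
    if (reuse_buffer) {
        runner.free_compute_buffer();
    }
    return runner.alloc_count;
}
```

In this model, 15 tiles cost 15 allocations when the buffer is freed after every call, and a single allocation when the call site frees it after the loop.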

stduhpf · 2025-09-13T11:45:50Z

> It looks like that'd be controlled by the free_compute_buffer_immediately boolean from the GGMLRunner::compute function, but we actually call the VAE through the AutoEncoderKL::compute, so we'd need to either add a boolean to that overload, or fix it to false and free the buffers explicitly on each call site. Am I on the right track? :-)

Yes, I think so too. It may be worth investigating that in a separate PR; it might make VAE tiling a bit faster.

stduhpf · 2025-09-13T12:17:54Z

Did a few tests, LGTM

wbruna · 2025-09-13T13:57:41Z

I just noticed the decoding tile size became too big when I used a relative factor 😕

W and H are being multiplied by 8 at the beginning of decode_first_stage, so:

diff --git a/stable-diffusion.cpp b/stable-diffusion.cpp
index 9085bd3..b1fd72d 100644
--- a/stable-diffusion.cpp
+++ b/stable-diffusion.cpp
@@ -1488,7 +1492,7 @@ public:
         if (!use_tiny_autoencoder) {
             float tile_overlap;
             int tile_size_x, tile_size_y;
-            get_tile_sizes(tile_size_x, tile_size_y, tile_overlap, vae_tiling_params, W, H);
+            get_tile_sizes(tile_size_x, tile_size_y, tile_overlap, vae_tiling_params, W / 8, H / 8);
 
             LOG_DEBUG("VAE Tile size: %dx%d", tile_size_x, tile_size_y);

Or maybe we should use x->ne instead, since the latent size factor is different for Wan?
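The two alternatives can be sketched side by side. `ToyTensor`, `LatentSize` and both helper functions are hypothetical; the only thing borrowed from ggml is the convention that a tensor carries its extents in an `ne[]` array. The scale factor is passed as a parameter because it is model-dependent.

```cpp
#include <cstdint>

// Minimal stand-in for a tensor with ggml-style extents
// (assumed here: ne[0] = width, ne[1] = height).
struct ToyTensor {
    int64_t ne[4];
};

struct LatentSize {
    int w;
    int h;
};

// Option A: undo the multiplication by dividing the output size by a
// scale factor. This is only correct if the factor matches the model.
inline LatentSize latent_from_output(int W, int H, int vae_scale) {
    return {W / vae_scale, H / vae_scale};
}

// Option B: read the size straight from the latent tensor, which does
// not depend on any particular scale factor.
inline LatentSize latent_from_tensor(const ToyTensor& x) {
    return {static_cast<int>(x.ne[0]), static_cast<int>(x.ne[1])};
}
```

Both options agree when the factor is right; option B simply has no factor to get wrong.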

* implement tiling vae encode support
* Tiling (vae/upscale): adaptative overlap
* Tiling: fix edge case
* Tiling: fix crash when less than 2 tiles per dim
* remove extra dot
* Tiling: fix edge cases for adaptative overlap
* tiling: fix edge case
* set vae tile size via env var
* vae tiling: refactor again, base on smaller buffer for alignment
* Use bigger tiles for encode (to match compute buffer size)
* Fix edge case when tile is bigger than latent
* non-square VAE tiling (#3)
* refactor tile number calculation
* support non-square tiles
* add env var to change tile overlap
* add safeguards and better error messages for SD_TILE_OVERLAP
* add safeguards and include overlapping factor for SD_TILE_SIZE
* avoid rounding issues when specifying SD_TILE_SIZE as a factor
* lower SD_TILE_OVERLAP limit
* zero-init empty output buffer
* Fix decode latent size
* fix encode
* tile size params instead of env
* Tiled vae parameter validation (#6)
* avoid crash with invalid tile sizes, use 0 for default
* refactor default tile size, limit overlap factor
* remove explicit parameter for relative tile size
* limit encoding tile to latent size
* unify code style and format code
* update docs
* fix get_tile_sizes in decode_first_stage

---------

Co-authored-by: Wagner Bruna <wbruna@users.noreply.github.com>
Co-authored-by: leejet <leejet714@gmail.com>

wbruna added 4 commits September 12, 2025 14:51

avoid crash with invalid tile sizes, use 0 for default

99616b9

refactor default tile size, limit overlap factor

570e26a

remove explicit parameter for relative tile size

06b4130

limit encoding tile to latent size

0e56bc7

stduhpf merged commit 2995c92 into stduhpf:tiled-vae-encode Sep 13, 2025
9 checks passed
