
VAE Tiling not supported with SD3 for non power of 2 images? #8788

@Teriks

Describe the bug

VAE tiling works for SD3 when the image dimensions are powers of 2, but fails for every other alignment.

The mentioned issues with VAE tiling are due to vae/config.json having:

"use_post_quant_conv": false,
"use_quant_conv": false

These flags leave self.quant_conv and self.post_quant_conv set to None, so the call in AutoencoderKL.tiled_encode:

tile = self.quant_conv(tile)

and the corresponding call in AutoencoderKL.tiled_decode:

tile = self.post_quant_conv(tile)

fail because None is not callable.

Perhaps at the moment the model is simply not entirely compatible with the tiling in AutoencoderKL, as the state dict does not contain the keys quant_conv.weight, quant_conv.bias, post_quant_conv.weight, post_quant_conv.bias.

Is this intended?
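
For reference, a minimal sketch of the kind of guard the tiled paths would presumably need, mirroring how the non-tiled encode/decode already respect these config flags (this is an assumption about a possible fix, not the current diffusers code):

# Sketch only: apply the optional conv layers only when the config enables them.
# In tiled_encode:
if self.config.use_quant_conv:
    tile = self.quant_conv(tile)

# In tiled_decode:
if self.config.use_post_quant_conv:
    tile = self.post_quant_conv(tile)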

Reproduction

import diffusers
import PIL.Image
import os

os.environ['HF_TOKEN'] = 'your token'

cn = diffusers.SD3ControlNetModel.from_pretrained('InstantX/SD3-Controlnet-Canny')

pipe = diffusers.StableDiffusion3ControlNetPipeline.from_pretrained(
    'stabilityai/stable-diffusion-3-medium-diffusers',
    controlnet=cn)

pipe.enable_sequential_cpu_offload()

pipe.vae.enable_tiling()

width = 1376
height = 920

# aligned by 16, but alignment by 64 also fails
output_size = (width-(width % 16), height-(height % 16))

not_pow_2 = PIL.Image.new('RGB', output_size)

args = {
    'guidance_scale': 8.0,
    'num_inference_steps': 30,
    'width': output_size[0],
    'height': output_size[1],
    'control_image': not_pow_2,
    'prompt': 'test prompt'
}

pipe(**args)
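
A possible temporary workaround (untested sketch, assuming identity layers are equivalent here): since the non-tiled path skips these convs entirely when the config flags are false, replacing the missing layers with identity modules before enabling tiling avoids calling None:

import torch.nn as nn

# Hypothetical workaround: substitute no-op layers so tiled_encode / tiled_decode
# have something callable. This should only be equivalent because
# use_quant_conv / use_post_quant_conv are false for this VAE, meaning the
# non-tiled path would not apply these convs anyway.
if pipe.vae.quant_conv is None:
    pipe.vae.quant_conv = nn.Identity()
if pipe.vae.post_quant_conv is None:
    pipe.vae.post_quant_conv = nn.Identity()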

Logs

REDACT\venv\Lib\site-packages\diffusers\models\attention_processor.py:1584: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  hidden_states = F.scaled_dot_product_attention(
Traceback (most recent call last):
  File "REDACT\test.py", line 35, in <module>
    pipe(**args)
  File "REDACT\venv\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "REDACT\venv\Lib\site-packages\diffusers\pipelines\controlnet_sd3\pipeline_stable_diffusion_3_controlnet.py", line 912, in __call__
    control_image = self.vae.encode(control_image).latent_dist.sample()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "REDACT\venv\Lib\site-packages\diffusers\utils\accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "REDACT\venv\Lib\site-packages\diffusers\models\autoencoders\autoencoder_kl.py", line 258, in encode
    return self.tiled_encode(x, return_dict=return_dict)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "REDACT\venv\Lib\site-packages\diffusers\models\autoencoders\autoencoder_kl.py", line 363, in tiled_encode
    tile = self.quant_conv(tile)
           ^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not callable

System Info

Windows

diffusers 0.29.2

Who can help?

@yiyixuxu @sayakpaul @DN6 @asomoza
