[Bug] Encoder in diffusers.models.autoencoders.vae's forward method return type mismatch leads to AttributeError #10382

### Describe the bug

**Issue Description:**
When using `Encoder` from the `diffusers.models.autoencoders.vae` module with `DownBlock2D` down blocks, calling its `forward` method raises an `AttributeError`. The cause is a return type mismatch: `Encoder.forward` passes each down block's output directly to the next block, which expects a tensor, but `DownBlock2D.forward` returns a tuple.

### Reproduction

Use the following code to reproduce the issue:
```python
from diffusers.models.autoencoders.vae import Encoder
import torch

encoder = Encoder(
    down_block_types=["DownBlock2D", "DownBlock2D"],
    block_out_channels=[64, 64],
)

encoder(torch.randn(1, 3, 256, 256)).shape
```

**Expected Behavior:**
The Encoder's forward method in `diffusers.models.autoencoders.vae` should return a tensor for further processing.

**Actual Behavior:**
Running the above code results in the following error:
```txt
AttributeError: 'tuple' object has no attribute 'dim'
```

**Additional Information:**
- Error log:
```txt
Traceback (most recent call last):
  File "main.py", line 9, in <module>
    encoder(torch.randn(1, 3, 256, 256)).shape
    ...
  File "python3.11/site-packages/diffusers/models/autoencoders/vae.py", line 172, in forward
    sample = down_block(sample)
    ...
  File "python3.11/site-packages/diffusers/models/autoencoders/vae.py", line 172, in forward
    hidden_states = resnet(hidden_states, temb)
    ...
  File "python3.11/site-packages/diffusers/models/autoencoders/vae.py", line 172, in forward
    hidden_states = self.norm1(hidden_states)
  File "python3.11/site-packages/torch/nn/modules/normalization.py", line 313, in forward
    return F.group_norm(input, self.num_groups, self.weight, self.bias, self.eps)
  File "python3.11/site-packages/torch/nn/functional.py", line 2947, in group_norm
    if input.dim() < 2:
AttributeError: 'tuple' object has no attribute 'dim'
```
- **Relevant code snippets:**
  - In `diffusers/models/autoencoders/vae.py`, lines 171-173:
  ```python
  for down_block in self.down_blocks:
      sample = down_block(sample)
  ```

  - `DownBlock2D`'s `forward` method declaration:
  ```python
  def forward(
      self, hidden_states: torch.Tensor, temb: Optional[torch.Tensor] = None, *args, **kwargs
  ) -> Tuple[torch.Tensor, Tuple[torch.Tensor, ...]]:
  ```
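
To make the mechanism behind the two snippets above concrete, here is a minimal, torch-free sketch. Everything in it is a hypothetical stand-in and not diffusers code: `FakeTensor` plays the role of a tensor, `TupleReturningBlock` mimics a block that returns `(hidden_states, output_states)`, and `run_chained` mirrors the loop that feeds each block's raw output into the next. `run_unpacked` shows one possible way to make the loop tolerate tuple outputs; whether that is the appropriate fix for the library is an assumption, not something this report establishes.

```python
from typing import Tuple


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; only implements dim()."""

    def __init__(self, ndim: int) -> None:
        self.ndim = ndim

    def dim(self) -> int:
        return self.ndim


class TupleReturningBlock:
    """Hypothetical stand-in for a block returning (hidden_states, output_states)."""

    def __call__(self, hidden_states) -> Tuple[FakeTensor, Tuple[FakeTensor, ...]]:
        # Like group_norm in the traceback, this call needs a tensor-like input.
        if hidden_states.dim() < 2:
            raise ValueError("expected at least 2 dimensions")
        return hidden_states, (hidden_states,)


def run_chained(blocks, sample):
    """Feed each block's raw output into the next block, as the encoder loop does."""
    for block in blocks:
        sample = block(sample)
    return sample


def run_unpacked(blocks, sample):
    """Same loop, but keep only the first element when a block returns a tuple."""
    for block in blocks:
        out = block(sample)
        sample = out[0] if isinstance(out, tuple) else out
    return sample


blocks = [TupleReturningBlock(), TupleReturningBlock()]

# The first block succeeds and returns a tuple; the second block then
# receives that tuple and fails when it calls .dim() on it.
try:
    run_chained(blocks, FakeTensor(4))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

# With unpacking, the second block receives a tensor-like object again.
result = run_unpacked(blocks, FakeTensor(4))
print(type(result).__name__)
```

The sketch fails only at the second block, which matches the traceback: the first `down_block` call returns normally, and the error surfaces once its tuple output reaches the next block's normalization layer.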



### Logs

_No response_

### System Info

- 🤗 Diffusers version: 0.31.0
- Platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.39
- Running on Google Colab?: No
- Python version: 3.11.11
- PyTorch version (GPU?): 2.5.1 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Huggingface_hub version: 0.26.5
- Transformers version: 4.47.0
- Accelerate version: 1.2.1
- PEFT version: 0.14.0
- Bitsandbytes version: not installed
- Safetensors version: 0.4.5
- xFormers version: not installed
- Accelerator: NVIDIA GeForce RTX 3090, 24576 MiB
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No

### Who can help?

@DN6 @sayakpaul
