
the attention_head_dim argument for UNet2DConditionModel #2011


Description

@yiyixuxu

The attention_head_dim argument in UNet2DConditionModel seems to be passed down to CrossAttnDownBlock2D and CrossAttnUpBlock2D as the number of attention heads, rather than as the dimension of each attention head:

from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel(attention_head_dim=16)

# this prints 16
print(unet.down_blocks[0].attentions[0].transformer_blocks[0].attn1.heads)

This definition is not consistent with the other up/down blocks, which treat the same value as the per-head channel count:

down_block_types = ("AttnDownBlock2D",)
up_block_types = ("AttnUpBlock2D",)
unet = UNet2DConditionModel(
    attention_head_dim=16,
    down_block_types=down_block_types,
    up_block_types=up_block_types,
)

# this prints 20
print(unet.down_blocks[0].attentions[0].num_heads)
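(For reference, a quick sanity check of where the 20 comes from, assuming the default block_out_channels=(320, 640, 1280, 1280): AttnDownBlock2D interprets the value as the per-head channel count, so the number of heads becomes block_out_channels[0] // attention_head_dim.)

# hypothetical arithmetic, assuming the default block_out_channels
block_out_channels_0 = 320
attention_head_dim = 16
print(block_out_channels_0 // attention_head_dim)  # 20 heads, each 16 channels wide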

Is this intended or not? If not, we could probably swap the positions of the two arguments passed to Transformer2DModel from CrossAttnDownBlock2D, but I'm not sure whether there is any config somewhere that would need to be updated accordingly.
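To illustrate the swap I have in mind, here is a minimal sketch, assuming Transformer2DModel's first two positional arguments are the number of heads and the per-head dimension (the import path and the values below are assumptions based on my reading of the code and the default block_out_channels, so they may differ between versions):

from diffusers import Transformer2DModel  # import path may vary by diffusers version

out_channels = 320
head_dim = 16

# current behaviour (as I read CrossAttnDownBlock2D): head_dim is passed first,
# so it ends up being interpreted as the number of heads
current = Transformer2DModel(head_dim, out_channels // head_dim, in_channels=out_channels)
print(current.transformer_blocks[0].attn1.heads)  # 16

# with the two arguments swapped, 16 would be the per-head dimension instead
swapped = Transformer2DModel(out_channels // head_dim, head_dim, in_channels=out_channels)
print(swapped.transformer_blocks[0].attn1.heads)  # 20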
