
the attention_head_dim argument for UNet2DConditionModel #2011


Description

@yiyixuxu

The attention_head_dim argument in UNet2DConditionModel seems to be passed down to CrossAttnDownBlock2D and CrossAttnUpBlock2D as the number of attention heads, rather than as the dimension of each attention head:

from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel(attention_head_dim=16)

# this prints 16
print(unet.down_blocks[0].attentions[0].transformer_blocks[0].attn1.heads)

This definition is not consistent with the other up/down blocks, which treat the same value as the per-head channel count:

down_block_types = ("AttnDownBlock2D",)
up_block_types = ("AttnUpBlock2D",)
unet = UNet2DConditionModel(
    attention_head_dim=16,
    down_block_types=down_block_types,
    up_block_types=up_block_types,
)

# this prints 20
print(unet.down_blocks[0].attentions[0].num_heads)
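(For reference, a quick sanity check of where the 20 comes from, assuming the default block_out_channels=(320, 640, 1280, 1280): AttnDownBlock2D interprets the value as the per-head channel count, so the number of heads becomes block_out_channels[0] // attention_head_dim.)

# hypothetical arithmetic, assuming the default block_out_channels
block_out_channels_0 = 320
attention_head_dim = 16
print(block_out_channels_0 // attention_head_dim)  # 20 heads, each 16 channels wide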

Is this intended or not? If not, we could probably swap the positions of the two arguments passed to Transformer2DModel from CrossAttnDownBlock2D, but I'm not sure whether there is any config somewhere that would need to be updated accordingly.
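To illustrate the swap I have in mind, here is a minimal sketch, assuming Transformer2DModel's first two positional arguments are the number of heads and the per-head dimension (the import path and the values below are assumptions based on my reading of the code and the default block_out_channels, so they may differ between versions):

from diffusers import Transformer2DModel  # import path may vary by diffusers version

out_channels = 320
head_dim = 16

# current behaviour (as I read CrossAttnDownBlock2D): head_dim is passed first,
# so it ends up being interpreted as the number of heads
current = Transformer2DModel(head_dim, out_channels // head_dim, in_channels=out_channels)
print(current.transformer_blocks[0].attn1.heads)  # 16

# with the two arguments swapped, 16 would be the per-head dimension instead
swapped = Transformer2DModel(out_channels // head_dim, head_dim, in_channels=out_channels)
print(swapped.transformer_blocks[0].attn1.heads)  # 20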
