correct attention_head_dim for JointTransformerBlock
#8608
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```diff
  dim=self.inner_dim,
  num_attention_heads=num_attention_heads,
- attention_head_dim=self.inner_dim,
+ attention_head_dim=attention_head_dim,
```
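For context, this hunk sits where `SD3ControlNetModel` builds its `JointTransformerBlock`s. A minimal, self-contained sketch of that wiring (the class name `ControlNetSketch`, the `num_layers` argument, and the stub block are illustrative assumptions, not the diffusers source):

```python
import torch.nn as nn


class JointTransformerBlock(nn.Module):
    # Stub standing in for diffusers' real block; it only records its config.
    def __init__(self, dim, num_attention_heads, attention_head_dim):
        super().__init__()
        self.dim = dim
        self.num_attention_heads = num_attention_heads
        self.attention_head_dim = attention_head_dim


class ControlNetSketch(nn.Module):
    # Hypothetical stand-in for SD3ControlNetModel's constructor.
    def __init__(self, num_layers=2, num_attention_heads=24, attention_head_dim=64):
        super().__init__()
        self.inner_dim = num_attention_heads * attention_head_dim
        self.transformer_blocks = nn.ModuleList(
            [
                JointTransformerBlock(
                    dim=self.inner_dim,
                    num_attention_heads=num_attention_heads,
                    attention_head_dim=attention_head_dim,  # was: self.inner_dim
                )
                for _ in range(num_layers)
            ]
        )
```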
Should this also be `self.config.attention_head_dim` to match `transformer_sd3.py`?
```python
query_dim=dim,
cross_attention_dim=None,
added_kv_proj_dim=dim,
dim_head=attention_head_dim // num_attention_heads,
```
This won't break? Wouldn't the value of `dim_head` be computed differently?
Well, no. Currently `dim_head = attention_head_dim // num_attention_heads`, with `attention_head_dim` and `num_attention_heads` passed from `SD3ControlNetModel` like this:

* `attention_head_dim=self.inner_dim`, where
* `self.inner_dim = self.config.num_attention_heads * self.config.attention_head_dim`

So basically, inside the block:

* `attention_head_dim` is `num_attention_heads * attention_head_dim`
* `num_attention_heads` is `num_attention_heads`

-> so `dim_head` here is just the `attention_head_dim` we used to configure the model, and if we pass it down correctly, we can use it directly (see the sketch below).
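A quick arithmetic sketch of that argument (values are illustrative assumptions, not from the repo):

```python
# Illustrative config values (assumptions, not from the repo).
num_attention_heads = 24
attention_head_dim = 64

# SD3ControlNetModel computes:
inner_dim = num_attention_heads * attention_head_dim   # 1536

# Before this PR: the block received attention_head_dim=inner_dim and divided.
old_dim_head = inner_dim // num_attention_heads        # 1536 // 24 = 64

# After this PR: the configured per-head size is passed down and used directly.
new_dim_head = attention_head_dim                      # 64

assert old_dim_head == new_dim_head  # same result, computed more directly
```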
Ahh I see. Thanks for explaining! 🙏🏽
* add
* update sd3 controlnet
* Update src/diffusers/models/controlnet_sd3.py

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
No description provided.