questions about provided conditional detr model #31

Closed
xz-123-new opened this issue Oct 15, 2022 · 3 comments

Comments

@xz-123-new

Thanks for your excellent work!
I have a question about the provided model. In the provided Conditional DETR model "conditional detr resnet50", transformer.decoder.layer.cross_attn.out_proj.weight and its bias have shapes 256x256 and 256, respectively. But since the input to this cross-attention is the concatenation of two 256-d queries, it seems they should be 512x512 and 512. This really confuses me. Looking forward to your help, thanks!

@charlesCXK
Member

Hi,
the function out_proj is applied to the value (v = self.ca_v_proj(memory)), which is 256-d.
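
To make the shape argument concrete, here is a minimal single-head sketch. It is not the repository's code: the tensor names and sizes are illustrative assumptions, and it only shows that the attention output inherits the width of the value, not of the query and key.

```python
import torch

torch.manual_seed(0)
num_queries, num_keys = 3, 5
qk_dim, v_dim = 512, 256  # assumed: concatenated 2 x 256-d query/key, 256-d value

q = torch.randn(num_queries, qk_dim)
k = torch.randn(num_keys, qk_dim)
v = torch.randn(num_keys, v_dim)

# Attention weights depend only on q and k: shape (num_queries, num_keys).
attn = torch.softmax(q @ k.T / qk_dim ** 0.5, dim=-1)

# The output is a weighted sum of the rows of v, so it is v_dim wide.
out = attn @ v

# A 256 -> 256 output projection is therefore sufficient here.
out_proj = torch.nn.Linear(v_dim, v_dim)
projected = out_proj(out)

print(tuple(attn.shape), tuple(out.shape), tuple(out_proj.weight.shape))
```

The 512-d query and key only affect the attention weights; everything after the weighted sum is 256-d in this sketch.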

@xz-123-new
Author

Sorry to disturb you again. My question is about the out_proj of cross_attn, i.e.
self.cross_attn = nn.MultiheadAttention(d_model * 2, nhead, dropout=dropout, vdim=d_model)
rather than the out_proj you mentioned. In the source code of nn.MultiheadAttention, out_proj is defined as
self.out_proj = NonDynamicallyQuantizableLinear(embed_dim, embed_dim, bias=bias, **factory_kwargs)
where, in your code, embed_dim is set to d_model * 2, i.e. 512. So I think out_proj should be 512-d instead of 256-d, but the provided model is 256-d throughout.
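
A quick way to check this reasoning against the stock module, assuming PyTorch is installed: construct torch.nn.MultiheadAttention with the same arguments as the snippet above and inspect its parameter shapes. The values d_model = 256, nhead = 8 and dropout = 0.1 are assumptions chosen to match the 256-d model discussed here.

```python
import torch.nn as nn

d_model, nhead = 256, 8  # assumed values for the 256-d model in this thread

# Stock PyTorch attention, built with the same arguments as the line quoted above.
stock = nn.MultiheadAttention(d_model * 2, nhead, dropout=0.1, vdim=d_model)

stock_shapes = {
    "out_proj.weight": tuple(stock.out_proj.weight.shape),
    "out_proj.bias": tuple(stock.out_proj.bias.shape),
    # Present because vdim differs from embed_dim, so q/k/v weights are stored separately.
    "v_proj_weight": tuple(stock.v_proj_weight.shape),
}
for name, shape in stock_shapes.items():
    print(name, shape)
```

With the stock class, out_proj is sized by embed_dim (512 here) regardless of vdim, so a 256x256 out_proj in a checkpoint suggests the model was built with a different attention implementation.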

@xz-123-new
Author

Sorry, I mistakenly imported MultiheadAttention from torch.nn instead of your modified version. The problem is solved now, thanks.
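
For anyone hitting the same symptom, one way to spot it early is to compare parameter shapes between the constructed model and the checkpoint before loading. The sketch below is only an illustration: find_shape_mismatches is a hypothetical helper, and the key names and shapes are made-up examples rather than the real checkpoint contents. With PyTorch, the two dictionaries could be built as {k: tuple(v.shape) for k, v in model.state_dict().items()} and likewise for the loaded checkpoint.

```python
def find_shape_mismatches(model_shapes, checkpoint_shapes):
    """Return {name: (model_shape, checkpoint_shape)} for shared keys whose shapes differ."""
    shared = sorted(model_shapes.keys() & checkpoint_shapes.keys())
    return {
        name: (tuple(model_shapes[name]), tuple(checkpoint_shapes[name]))
        for name in shared
        if tuple(model_shapes[name]) != tuple(checkpoint_shapes[name])
    }


# Hypothetical example: a model built with the stock attention class
# versus a checkpoint saved from a model with a 256-d out_proj.
model_shapes = {
    "cross_attn.out_proj.weight": (512, 512),
    "cross_attn.out_proj.bias": (512,),
    "linear1.weight": (2048, 256),
}
checkpoint_shapes = {
    "cross_attn.out_proj.weight": (256, 256),
    "cross_attn.out_proj.bias": (256,),
    "linear1.weight": (2048, 256),
}

mismatches = find_shape_mismatches(model_shapes, checkpoint_shapes)
for name, (got, expected) in mismatches.items():
    print(f"{name}: model {got} vs checkpoint {expected}")
```

Printing type(layer.cross_attn).__module__ for a decoder layer (layer being whatever variable holds it) is another quick check of which MultiheadAttention class was actually imported.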
