
The diagonal matrix meaning? #16

Closed

JosonChan1998 opened this issue Mar 15, 2022 · 4 comments

Comments

@JosonChan1998

Hi, thank you for your nice work on Transformers in object detection. I have some questions from reading the paper and the code, and I hope you can give me some answers.

  1. What's the insight behind the pos_transformation T in Section 3.3?

  2. What is the meaning of the diagonal vector \lambda_q described in Section 3.3? I can't find any code for a diagonal operator in this repo; I only see pos_transformation generated by learnable weights:

    pos_transformation = self.query_scale(output)

  3. I can't figure out the difference between "Block", "Full", and "Diagonal" in Fig. 5.

The above are all my questions. I sincerely hope I can get your help. Thanks!

@DeppMeng
Collaborator

DeppMeng commented Apr 1, 2022

Sorry for the late reply.

  1. About T: T is a learnable linear projection, obtained by applying an FFN to the decoder embedding f. Since f contains displacement information of the distinct regions w.r.t. the reference point, we expect T to act as a displacement transformation in the embedding space of p. T could be a full matrix, a block matrix, or a diagonal matrix; we empirically studied these options and chose the diagonal one.
  2. \lambda_q is the vector of diagonal elements of the matrix T. It is `pos_transformation` in our code.
  3. For details, please refer to the paragraph "The effect of linear projections T forming the transformation." in our paper.
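To make points 1 and 2 concrete, here is a minimal sketch of the diagonal variant: an FFN predicts the diagonal vector \lambda_q from the decoder embedding, and applying T = diag(\lambda_q) then reduces to an element-wise product. Only the names `query_scale` and `pos_transformation` come from the snippet quoted above; the FFN depth and width here are assumptions for illustration.

```python
import torch
import torch.nn as nn

d_model = 256  # embedding dimension (assumed)

# FFN mapping the decoder embedding f to the diagonal vector lambda_q.
# A "full" variant would predict all d_model * d_model entries of T and
# a "block" variant a block-diagonal structure; the diagonal variant
# predicts only d_model values.
query_scale = nn.Sequential(
    nn.Linear(d_model, d_model),
    nn.ReLU(),
    nn.Linear(d_model, d_model),
)

f = torch.randn(1, d_model)  # decoder embedding for one query
p = torch.randn(1, d_model)  # sinusoidal positional embedding of the query

pos_transformation = query_scale(f)  # lambda_q, shape (1, d_model)

# T = diag(lambda_q), so applying T to p is element-wise scaling;
# the matrix itself is never materialized.
p_transformed = p * pos_transformation
```

Because only the diagonal is predicted, applying T costs one element-wise product per query rather than a full matrix-vector multiply.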

@JosonChan1998
Author

Thanks for your reply!

@WYHZQ

WYHZQ commented Dec 1, 2022

> \lambda_q is the vector of diagonal elements of the matrix T. It is `pos_transformation` in our code.

Thank you for your reply. You said \lambda_q is the diagonal elements of matrix T, but the `pos_transformation` obtained from the FFN does not extract diagonal elements; it is multiplied element-wise with `query_sine_embed` directly, i.e. `query_sine_embed = query_sine_embed * pos_transformation`. Can you explain the principle?
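The principle being asked about is the linear-algebra identity diag(\lambda) p = \lambda \odot p: the FFN output is already the diagonal of T, so there are no off-diagonal elements to extract, and the element-wise product is exactly the diagonal matrix applied to the embedding. A standalone check of the identity (an illustrative sketch, not the repo's code):

```python
import torch

d_model = 8
lam = torch.randn(d_model)  # lambda_q: the diagonal entries of T
p = torch.randn(d_model)    # positional embedding of one query

# Applying the explicit diagonal matrix...
via_matrix = torch.diag(lam) @ p
# ...matches element-wise multiplication by the diagonal vector.
via_elementwise = lam * p

assert torch.allclose(via_matrix, via_elementwise)
```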

@Vincent-luo

@WYHZQ Have you figured it out? I have the same confusion.
