
Implementation of Consistency Loss #13

Closed
michaelku1 opened this issue Jul 12, 2022 · 5 comments

Comments

@michaelku1

michaelku1 commented Jul 12, 2022

Hello, I noticed that there is a slight difference between the paper's formulation of the consistency loss and the actual implementation. Please correct me if I am wrong. The paper's formulations are as follows:

[screenshots: Eq. 12 and Eq. 13 from the paper]

However, for the final consistency loss, the actual implementation uses a sum across decoder layers instead of the average indicated in equation 12 (the weight_dict entry for this loss type is 1 by default, so the notion of an average is perhaps not incorporated). Conversely, for the per-layer consistency loss, the actual implementation uses an average over all M object queries instead of the sum indicated in equation 13.

[screenshots: the corresponding lines in the implementation]

I am wondering which formulations are the right ones to follow? Thanks.
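The discrepancy described above only changes the loss by a constant factor, which the following minimal sketch illustrates (numpy arrays stand in for torch tensors here; the shapes and variable names are hypothetical, not taken from the repository):

```python
import numpy as np

# Hypothetical sizes: 6 decoder layers, batch_size * 300 = 600 object queries.
num_layers, M = 6, 600
rng = np.random.default_rng(0)
# Per-query consistency terms for each decoder layer,
# e.g. a distance between source and target box predictions.
per_query = rng.random((num_layers, M))

# Eq. 13 as printed in the paper: sum over the M queries of one layer.
per_layer_sum = per_query.sum(axis=1)    # shape (num_layers,)
# What the implementation reportedly does: mean over the M queries.
per_layer_mean = per_query.mean(axis=1)  # shape (num_layers,)

# Eq. 12: average the per-layer losses over the decoder layers.
loss_avg_layers = per_layer_mean.mean()
# Summing over layers instead differs only by the factor num_layers.
loss_sum_layers = per_layer_mean.sum()
assert np.isclose(loss_sum_layers, loss_avg_layers * num_layers)
```

Since sum and average differ by a fixed constant, the choice is absorbed by the loss weight; the question is only which normalization the paper should state.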

@encounter1997
Owner

encounter1997 commented Jul 13, 2022

Hi, as shown in Eq. 12, the loss is divided by the number of decoder layers, which equals len(pred_boxes_all).

As for the summation in Eq. 13, we are sorry for the mistake; it should be an average over the M predictions in an individual decoder layer.

@encounter1997
Owner

Thanks for pointing it out; we will correct it and upload a revised version to arXiv.

@michaelku1
Author

Thanks for your reply. Equation 12 shows that the loss is normalised over the number of decoder layers, but I think len(pred_boxes_all) gives the number of object queries (batch_size*300), as suggested here:

[screenshot of the relevant code, 2022-07-13]

@encounter1997
Owner

encounter1997 commented Jul 13, 2022

Each element in the list pred_boxes_all has a shape of (B*300, 4). Afterward, they are stacked into a tensor with a shape of (num_layers, B*300, 4), as shown in this line.
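The shape argument above can be checked with a quick sketch (numpy stands in for torch; `np.stack` behaves like `torch.stack` for this purpose, and the sizes are illustrative):

```python
import numpy as np

# Illustrative sizes: 6 decoder layers, batch of 2, 300 queries each.
num_layers, B, num_queries = 6, 2, 300

# One (B*300, 4) box tensor per decoder layer, as described above.
pred_boxes_per_layer = [np.zeros((B * num_queries, 4)) for _ in range(num_layers)]

# Stacking the list adds a leading layer axis: (num_layers, B*300, 4).
pred_boxes_all = np.stack(pred_boxes_per_layer)
assert pred_boxes_all.shape == (num_layers, B * num_queries, 4)

# len() of the stacked tensor is the size of its first axis,
# i.e. the number of decoder layers, not the number of queries.
assert len(pred_boxes_all) == num_layers
```

So dividing by len(pred_boxes_all) after stacking does divide by the number of decoder layers, consistent with Eq. 12.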

@michaelku1
Author

michaelku1 commented Jul 13, 2022

I think you are right. Sorry, I have not run the code myself yet and was just reading along. Thanks.
