
About the KD Loss on the RetinaNet One-Stage Object Detectors #4

Closed
JCZ404 opened this issue Nov 25, 2022 · 2 comments

JCZ404 commented Nov 25, 2022

Hi~, thanks for such great work! I saw you released baseline performance for vanilla KD on the one-stage detector RetinaNet, and I wonder how this method is applied. Since RetinaNet's classification predictions are activated by sigmoid and formulated as multiple binary classification problems solved with focal loss, it seems we cannot use vanilla KD on these classification outputs. An output processed by sigmoid, for example [0.4, 0.7, 0.3, 0.2], obviously does not sum to 1. So how is vanilla KD with the KLDiv loss applied in such a situation? Thanks.

hunto (Owner) commented Nov 26, 2022

Hi @Zhangjiacheng144 ,

Thanks for your attention to our work. Distillation on sigmoid probabilistic distributions can also be conducted with a KL divergence loss (see Kullback–Leibler divergence; it does not require the vector to sum to 1). In our RetinaNet experiments we use Equation (1) as the form of the sigmoid KD loss, simply replacing the softmax function with sigmoid.
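To make the idea concrete, here is a minimal sketch (not the repository's actual implementation) of a sigmoid KD loss: each class is treated as an independent Bernoulli distribution (p, 1−p), and the binary KL divergences are summed over classes, so the output vector never needs to sum to 1.

```python
import math

def sigmoid_kd_loss(p_teacher, q_student, eps=1e-7):
    """Sum of per-class binary KL divergences KL(p || q).

    Each sigmoid output p is read as a two-point distribution (p, 1 - p),
    so KL is well defined per class even though the full vector does not
    sum to 1. eps clamps probabilities away from 0/1 for numerical safety.
    """
    total = 0.0
    for p, q in zip(p_teacher, q_student):
        p = min(max(p, eps), 1.0 - eps)
        q = min(max(q, eps), 1.0 - eps)
        total += p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total

# The example vector from the question; neither input sums to 1.
teacher = [0.4, 0.7, 0.3, 0.2]
student = [0.5, 0.6, 0.2, 0.3]
loss = sigmoid_kd_loss(teacher, student)
```

The loss is zero exactly when the student matches the teacher class-wise, and positive otherwise, mirroring the behavior of softmax KD.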

BTW, for detectors trained with sigmoid focal loss, it is more practical to use the above sigmoid KL divergence loss together with a focal weight.
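The thread does not spell out the focal weighting, so the factor below, |p − q|**gamma, is only an assumed illustration of the idea: down-weight classes where student and teacher already agree, in the spirit of focal loss.

```python
import math

def focal_sigmoid_kd_loss(p_teacher, q_student, gamma=2.0, eps=1e-7):
    """Binary KL divergence per class, scaled by a focal-style weight.

    NOTE: the weight |p - q| ** gamma is a hypothetical choice, not the
    paper's definition; it shrinks the contribution of classes where the
    student already matches the teacher.
    """
    total = 0.0
    for p, q in zip(p_teacher, q_student):
        p = min(max(p, eps), 1.0 - eps)
        q = min(max(q, eps), 1.0 - eps)
        kl = p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
        total += abs(p - q) ** gamma * kl
    return total

teacher = [0.4, 0.7, 0.3, 0.2]
student = [0.5, 0.6, 0.2, 0.3]
loss = focal_sigmoid_kd_loss(teacher, student)
```

Because |p − q| < 1 here, raising gamma suppresses easy (well-matched) classes more strongly, analogous to how focal loss suppresses easy examples.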

JCZ404 (Author) commented Jan 4, 2023

Ok, I got it, thanks for your kind reply!

JCZ404 closed this as completed Jan 4, 2023