A question about model training #22
Hello, thanks for such a wonderful work. After reading this paper, I have a question regarding model training.

According to the code, ground-truth annotations are still required during knowledge distillation to compute the classification loss, the regression loss, and the DFL. The paper mentions that removing the regression loss and the DFL causes only a small decrease in mAP.

But if I want to completely remove the dependency on ground-truth labels, how should I deal with the classification loss? Does this term affect the final mAP a lot? Would you please share some insights or results about this?
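For concreteness, here is a minimal sketch (PyTorch) of the three supervised terms I mean; each of them consumes targets derived from the GT boxes. All names here are placeholders, not the repository's actual code.

```python
# Minimal sketch of the three supervised loss terms; all names are
# placeholders, not the repository's actual code.
import torch.nn.functional as F

def supervised_losses(cls_logits, cls_targets, bbox_pred, bbox_target,
                      dist_pred, dfl_target):
    # Classification loss: needs per-location GT class labels.
    # (GFL actually uses quality focal loss; plain CE for brevity.)
    loss_cls = F.cross_entropy(cls_logits, cls_targets)

    # Bbox regression loss: needs GT box coordinates.
    # (The paper uses an IoU-based loss; L1 for brevity.)
    loss_bbox = F.l1_loss(bbox_pred, bbox_target)

    # DFL: cross-entropy between the predicted edge distribution and the
    # two integer bins bracketing the continuous GT offset (assumes
    # 0 <= dfl_target < n_bins - 1).
    tl = dfl_target.long()            # left bin index
    tr = tl + 1                       # right bin index
    wl = tr.float() - dfl_target      # weight on the left bin
    wr = dfl_target - tl.float()      # weight on the right bin
    loss_dfl = (F.cross_entropy(dist_pred, tl, reduction="none") * wl +
                F.cross_entropy(dist_pred, tr, reduction="none") * wr).mean()

    return loss_cls, loss_bbox, loss_dfl
```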
Comments

I removed the classification loss and it is training now.
Thanks for your reply, but I can't quite understand this sentence. Please let me express my question more clearly. Currently, the implementation of
I'm trying an experiment without cls_loss. By the way, why do KD methods remove the GT annotations? Is there any literature on this?

You can of course disable cls_loss, bbox_loss, and DFL; however, the label assignment still leverages the GT information (i.e., it decides where to distill). If you remove these three losses and distill on the full-map locations, then no GT information is used. But note that even if you do so, the teacher detector was itself trained with GT annotations.
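To make the two options concrete, here is a rough sketch (PyTorch; illustrative names, not the actual implementation) of the classification KD term applied either at GT-assigned positives or over the full map:

```python
# Rough sketch of *where* the classification KD term is applied; all
# names are illustrative. `pos_mask` comes from GT-based label
# assignment, so passing it still leaks GT information into training.
import torch.nn.functional as F

def cls_kd_loss(student_logits, teacher_logits, pos_mask=None, T=2.0):
    # pos_mask is None -> distill on the full feature map (no GT used);
    # pos_mask given   -> distill only at GT-assigned positive locations.
    if pos_mask is not None:
        student_logits = student_logits[pos_mask]
        teacher_logits = teacher_logits[pos_mask]
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
```

Under this sketch, a fully GT-free setting would call `cls_kd_loss(student_logits, teacher_logits)` with no mask and drop the three supervised terms entirely.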
The training settings are bbox_loss on positive locations and classification KD on full-map locations. Removing cls_loss causes a significant AP drop (7.4 points).
All my doubts are cleared. Thanks a lot!