Why is the mAP lower after distillation on VOC? #88

Open
cmjkqyl opened this issue Jun 11, 2023 · 1 comment

cmjkqyl commented Jun 11, 2023

I used RetinaNet to train and test on the VOC dataset, and I found that the accuracy of RetinaNet after training with FGD is even lower than that of the original RetinaNet. It seems that the dark knowledge has had a negative impact on the network.
A related observation: when I do not use pre-trained weights to initialize the student network, FGD gives a significant improvement in accuracy (compared to the original RetinaNet, also trained with pre-training turned off).
The teacher network is ResNeXt-101 (rx101) and the student network is ResNet-50 (r50).
The configuration file and training log are as follows. I have modified train.py and detection_distiller.py according to MGD's files.

Could you give me some suggestions about this? Thanks!

mmdet==2.18
mmcv-full==1.4.3
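
For context, here is a minimal sketch of the generic FPN feature-imitation term that FGD builds on (FGD itself adds focal and global weighting on top). The function name, the extract_feat-style calls, and the 5e-4 weight are illustrative assumptions, not code from this repo:

```python
import torch
import torch.nn.functional as F

def feature_distill_loss(student_feats, teacher_feats, weight=1.0):
    """Plain MSE between student and teacher FPN feature maps, one term per level.

    This is only the generic feature-imitation idea; FGD adds focal and global
    terms on top of it.
    """
    loss = 0.0
    for s, t in zip(student_feats, teacher_feats):
        # The teacher is frozen: detach so no gradient flows into it.
        loss = loss + F.mse_loss(s, t.detach())
    return weight * loss

# Rough usage inside a training step (names are hypothetical):
#   student_feats = student.extract_feat(images)
#   with torch.no_grad():
#       teacher_feats = teacher.extract_feat(images)
#   total_loss = detection_loss + feature_distill_loss(student_feats, teacher_feats, weight=5e-4)
```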

@xiaobiaodu

Hi, when you don't use pre-trained weights in the student model with FGD, do you obtain better results than when you do use them?
I suspect that pre-trained weights may have a negative impact on knowledge distillation.
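
To run that comparison, the usual switch in an mmdet 2.x config is the backbone's init_cfg. The snippet below is a sketch against a standard RetinaNet-R50 config; the field layout follows mmdet 2.x conventions, so double-check it against your own config file:

```python
# Variant A: student backbone initialized from ImageNet pre-trained weights.
model = dict(
    backbone=dict(
        type='ResNet',
        depth=50,
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')))

# Variant B: student backbone trained from scratch (keep only one variant in the config).
model = dict(
    backbone=dict(
        type='ResNet',
        depth=50,
        init_cfg=None))
```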
