How to run my own dataset using the object detection example? #123
Hi @Coldfire93, can you confirm that your converted VOC files work with their reference code? Also, another question: how does
Hi @yoshitomo-matsubara ,
It seems the problem occurred because I didn't resize the image. I will try the `transforms_params` defined in the yaml file to perform the resize.
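For reference, torchvision's detection models resize inputs by scaling the shorter side up to a minimum size while capping the longer side at a maximum; a minimal sketch of that size computation (function name and default values are illustrative, not torchdistill's API):

```python
def compute_resize(height, width, min_size=800, max_size=1333):
    """Scale so the shorter side becomes min_size, capped so the
    longer side does not exceed max_size (torchvision-style resize)."""
    scale = min_size / min(height, width)
    if max(height, width) * scale > max_size:
        scale = max_size / max(height, width)
    return int(round(height * scale)), int(round(width * scale))
```

For example, a 600x800 image scales to 800x1067, while a very wide 500x2000 image is capped by the longer side instead.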
Hi @Coldfire93, thank you for confirming that! Since the bug is in the package, I'll soon release a new version of torchdistill so that you can update the package.
OK. Thanks~
Hi @Coldfire93,
Hi @yoshitomo-matsubara, but I have checked the contents of the parameter `targets`:
Could you put your 1) yaml config and 2) executed command in text instead of screenshot? I didn't get such errors when using my config files and example code in this repo
I forgot to answer this: do you want to use the weights of my customized student model, or the weights of the original Faster R-CNN?
OK.
2) The executed command:
And I want to use the weights of your customized student model.
The command looks fine; I'd suggest you add
Your yaml config file still contains the teacher model in the training loop and attempts to use head network distillation, which requires a teacher model. To use the weights of my customized student model, download the checkpoints available here and specify the file path in, e.g.,
Hi @yoshitomo-matsubara, the log file and error message are below:
Hi @yoshitomo-matsubara,
Hi @Coldfire93 ,
I found
The teacher model weights are from torchvision. I should have asked you this: what training method would you like to try with torchdistill? P.S.,
Hi @yoshitomo-matsubara ,
That solves the issue and training starts. It looks good. Thank you very much. I will learn more about how to configure it.
I did the experiment, but it seems that the torchvision pretrained model was not loaded. The loss is very large, as shown below:
(torchdistill) songhongguang@elcnlhdc-41-239:~/lwh/torchdistill$ python examples/object_detection.py --config configs/official/coco2017/yoshitomo-matsubara/rrpr2020/ghnd-custom_fasterrcnn_resnet50_fpn_from_fasterrcnn_resnet50_fpn.yaml --log logs/ghnd-custom_fasterrcnn_resnet50_fpn_from_fasterrcnn_resnet50_fpn.log
I want to use torchdistill to do knowledge distillation. My method contains three steps:
I want to compare the performance of the two models trained by step 2 and step 3; the performance of the model trained by step 3 is supposed to be better than that of step 2. I wonder if the above steps are reasonable. Thanks for your patience.
OK. I understand. 😊
Hi @Coldfire93 ,
Great to hear that :)
I believe you're using a pretrained teacher model, and the loss values you showed above are not that large for GHND (generalized head network distillation), since the loss is the sum of squared errors as shown in Fig. 1 and Eq. (2) of the paper. Note that the above log says
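The sum-of-squared-errors loss mentioned above can be sketched as follows; this is a minimal NumPy illustration of the loss's shape, not torchdistill's actual implementation:

```python
import numpy as np

def sse_distillation_loss(student_feats, teacher_feats):
    """Sum of squared errors over matched intermediate feature maps.

    Because the errors are summed (not averaged) over all elements,
    the raw loss value can look very large even for a reasonable fit.
    """
    return sum(float(np.sum((s - t) ** 2))
               for s, t in zip(student_feats, teacher_feats))
```

With real detection backbones the feature maps have millions of elements, which is why the logged loss values appear large.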
Step 3 looks like it builds on step 2, i.e., pretrained on COCO -> end-to-end training on VOC (step 2) -> GHND on VOC (step 3). If not, and you simply want to compare end-to-end training vs. GHND, I'd suggest the following three separate experiments:
so that you can compare the performance of step 2 with that of step 3. Note that the student model in the 3rd experiment is partially initialized with the teacher model obtained in the 1st experiment, not with the student model from the 2nd experiment. To leverage GHND, you should initialize the weights of layers in the student at step 2 with those of the teacher model fine-tuned on VOC (step 1), since HND and GHND reuse the pretrained teacher model's tail portion for that of the student model (i.e., the first k layers in the student are trained by HND or GHND, and all the remaining layers are fixed and identical to those in the teacher in terms of architecture and learned parameters).
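The partial initialization described above can be sketched as a state-dict copy; the helper name and prefix list are hypothetical, not torchdistill's API, and plain dicts stand in for PyTorch state dicts to keep the sketch dependency-free:

```python
def copy_teacher_tail(student_state, teacher_state, shared_prefixes):
    """Overwrite student parameters whose names start with any shared
    prefix (the fixed tail layers reused from the teacher) with the
    corresponding teacher values; leave all other parameters untouched."""
    for name, value in teacher_state.items():
        if name in student_state and any(name.startswith(p) for p in shared_prefixes):
            student_state[name] = value
    return student_state
```

With real models you would pass `model.state_dict()` contents and then call `student.load_state_dict(...)` on the result.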
Hi @yoshitomo-matsubara ,
I'm confused about step 2. I thought the student network was designed by you (you modified the structure of the backbone), and there is no corresponding pretrained model in torchvision. Maybe I should learn more about GHND? I'd like your advice. Thank you~
@Coldfire93
The sizes of your teacher model and student model are almost the same (about 160 MB). I wonder why? I expected the student model to be smaller than the teacher model.
Hi @Coldfire93 ,
The student models in the example are from our ICPR paper (preprint ver.). While the overall student model size is almost the same as the teacher model's, the student model with a bottleneck can achieve shorter end-to-end latency by splitting the inference for resource-constrained edge computing systems. Read the above paper for more details.
As described in the torchdistill paper, I did all the experiments to reproduce experimental results reported in prior studies.
Hi @yoshitomo-matsubara, actually I want to get a smaller student model through knowledge distillation, which the GHND method obviously cannot do. But the training time is shortened by using GHND (60 hours vs. 24 hours). That's good. Thank you again.
For object detection, applying knowledge distillation in an end-to-end manner is pretty difficult, as I answered in #117. FYI, torchvision recently introduced SSD object detection models.
Hi @yoshitomo-matsubara , Thank you again. |
Hi @Coldfire93 , |
@Coldfire93 Closing this issue as I haven't seen any follow-up for a while.
Hi,
I want to run the experiment with another dataset, such as the VOC dataset. What should I do before executing the examples/object_detection.py script?
I converted the VOC annotations to COCO format and modified the yaml configuration file (figure 1), but got the error shown in the second figure.
Could you please tell me the reason? Thank you!
figure 1:
figure 2:
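For reference, the VOC-to-COCO annotation conversion mentioned above can be sketched as follows; the field names follow the standard COCO detection format, but the helper itself and its arguments are illustrative:

```python
import xml.etree.ElementTree as ET

def voc_xml_to_coco_anns(xml_text, image_id, category_ids, start_ann_id=1):
    """Convert the <object> entries of one VOC annotation file into
    COCO-style annotation dicts. COCO bboxes are [x, y, width, height]."""
    root = ET.fromstring(xml_text)
    anns = []
    for ann_id, obj in enumerate(root.iter("object"), start=start_ann_id):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (float(box.find(k).text)
                                  for k in ("xmin", "ymin", "xmax", "ymax"))
        anns.append({
            "id": ann_id,
            "image_id": image_id,
            "category_id": category_ids[obj.find("name").text],
            "bbox": [xmin, ymin, xmax - xmin, ymax - ymin],
            "area": (xmax - xmin) * (ymax - ymin),
            "iscrowd": 0,
        })
    return anns
```

A full converter would also build the COCO `images` and `categories` lists and dump everything to a single JSON file, since the COCO-style loaders expect one annotation file per split.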