I want to ask some details about the training process.
CMKD uses a pretrained SECOND network as the teacher.
When training CMKD's student network (CMKD-Mono):
First, you use the ~bev.yaml file to train the model with the feature distillation loss only.
Is no detection loss computed in this stage? If so, is the purpose of this stage solely to train the BEV image features to match the patterns of the BEV LiDAR features?
Second, you use the ~V2.yaml file to train the model with the feature distillation loss + detection loss.
Is this stage used for the final 3D object detection? Were 20 epochs enough for training?
+) Also, do you freeze the teacher network (SECOND) and update only the student network (CMKD-Mono)?
Thank you!
Hi, for the first question, there is a subsection in our paper explaining this that you may want to take a look at.
Note that in this setting we load the weights from the teacher model, so it only needs to be fine-tuned, and 20 epochs are enough.
For the second question, the answer is yes. We freeze the teacher model and use it as a fixed feature extractor during training.
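To summarize, the two-stage loss composition described above can be sketched as follows. This is a minimal NumPy stand-in, not the actual CMKD code: the loss weight `lam`, the feature shapes, and the function names are all assumptions for illustration. The teacher's BEV features are treated as constants, which mirrors the frozen teacher (in a real PyTorch setup they would be computed under `torch.no_grad()`).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical BEV feature maps (channels x H x W); shapes are illustrative.
teacher_bev = rng.standard_normal((4, 8, 8))  # from the frozen SECOND teacher
student_bev = rng.standard_normal((4, 8, 8))  # from the CMKD-Mono student

def feature_distillation_loss(student_feat, teacher_feat):
    """Stage 1 (~bev.yaml): MSE between student and teacher BEV features.
    The teacher features are constants here, i.e. the teacher is frozen
    and receives no gradient updates."""
    return float(np.mean((student_feat - teacher_feat) ** 2))

def stage2_loss(student_feat, teacher_feat, detection_loss, lam=1.0):
    """Stage 2 (~V2.yaml): feature distillation loss + detection loss.
    The weight lam is an assumed placeholder, not a value from the paper."""
    return feature_distillation_loss(student_feat, teacher_feat) + lam * detection_loss
```

Stage 1 trains the student so its image-derived BEV features match the teacher's LiDAR-derived BEV features; stage 2 adds the detection loss on top of the same distillation term for the final 3D detection fine-tuning.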