I want to ask some details about the training process.
CMKD uses a pretrained SECOND network as the teacher.
When training CMKD's student network (CMKD-Mono):
First, you use the ~bev.yaml file to train the model with the feature distillation loss only.
Is no detection loss computed in this stage? If so, is the purpose of this stage solely to train the BEV image features to match the patterns of the BEV LiDAR features?
Second, you use the ~V2.yaml file to train the model with the feature distillation loss + detection loss.
Is this stage used for the final 3D object detection? Were 20 epochs enough for training?
+) Also, do you freeze the teacher network (SECOND) and update only the student network (CMKD-Mono)?
Thank you!
Hi, for the first question, there is a subsection in our paper explaining this that you may want to take a look at.
Note that in this setting we load the weights from the teacher model, so it only needs to be fine-tuned, and 20 epochs are enough.
For the second question, the answer is yes. We freeze the teacher model and use it as a fixed feature extractor during training.
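To summarize, the two-stage loss composition described above can be sketched as follows. This is a minimal NumPy stand-in, not the actual CMKD code: the loss weight `lam`, the feature shapes, and the function names are all assumptions for illustration. The teacher's BEV features are treated as constants, which mirrors the frozen teacher (in a real PyTorch setup they would be computed under `torch.no_grad()`).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical BEV feature maps (channels x H x W); shapes are illustrative.
teacher_bev = rng.standard_normal((4, 8, 8))  # from the frozen SECOND teacher
student_bev = rng.standard_normal((4, 8, 8))  # from the CMKD-Mono student

def feature_distillation_loss(student_feat, teacher_feat):
    """Stage 1 (~bev.yaml): MSE between student and teacher BEV features.
    The teacher features are constants here, i.e. the teacher is frozen
    and receives no gradient updates."""
    return float(np.mean((student_feat - teacher_feat) ** 2))

def stage2_loss(student_feat, teacher_feat, detection_loss, lam=1.0):
    """Stage 2 (~V2.yaml): feature distillation loss + detection loss.
    The weight lam is an assumed placeholder, not a value from the paper."""
    return feature_distillation_loss(student_feat, teacher_feat) + lam * detection_loss
```

Stage 1 trains the student so its image-derived BEV features match the teacher's LiDAR-derived BEV features; stage 2 adds the detection loss on top of the same distillation term for the final 3D detection fine-tuning.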