This is the official implementation of MonoTAKD, built on OpenPCDet for the KITTI dataset.
MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection (arXiv, Sup. included)
[2025/2/27]: MonoTAKD has been accepted by CVPR 2025 🔥🔥🔥
- Release code and pre-trained models for the KITTI dataset.
- Visualization utilities are provided to render detection results in both the camera perspective and the BEV perspective.
- MonoTAKD DEMO images & videos are included in this release.
Notice: Due to a tight schedule, instructions and pre-trained models will be released gradually. Please let us know if you run into any issues or bugs.
- Detection in CAMERA perspective
- Detection with CAMERA & BEV Side-By-Side
AP_3D performance on the KITTI test set for the car category.
| Method | Teacher | TA | Student | Easy | Moderate | Hard |
|---|---|---|---|---|---|---|
| MonoTAKD | SECOND | CaDDN | model | 27.91 | 19.43 | 16.51 |
| MonoTAKD_Raw | SECOND | CaDDN | model | 29.86 | 21.26 | 18.27 |

| Method | NDS | mAP |
|---|---|---|
| BEVFormer-R50 + TAKD | 49.0 | 39.2 |
| BEVFormer-R101 + TAKD | 55.8 | 45.1 |
| BEVDepth-R50 + TAKD | 53.7 | 43.0 |
| BEVDepth-R101 + TAKD | 56.4 | 46.6 |
Please follow INSTALL to install MonoTAKD.
Please follow GETTING_START to train or evaluate the models.
Please follow VISUALIZE_DETECTION to draw detection bounding boxes on the 3D perspective view and the BEV view.
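For readers unfamiliar with the BEV view, the sketch below shows how the ground-plane footprint of one KITTI-style 3D box could be computed before drawing it. It is not the repository's visualization code. The function name `bev_corners` and its parameters are hypothetical, and it assumes the standard KITTI camera convention (x right, z forward, `ry` as rotation around the camera y axis).

```python
import math


def bev_corners(x, z, length, width, ry):
    """Return the four (x, z) footprint corners of a 3D box in the BEV plane.

    Hypothetical helper for illustration only. Assumes KITTI camera
    coordinates: x points right, z points forward, and ry rotates the
    box around the camera y axis.
    """
    c, s = math.cos(ry), math.sin(ry)
    # Corner offsets in the box's local frame (length along local x, width along local z).
    local = [
        (length / 2, width / 2),
        (length / 2, -width / 2),
        (-length / 2, -width / 2),
        (-length / 2, width / 2),
    ]
    # Rotate each offset around the y axis, then translate to the box center.
    return [(c * dx + s * dz + x, -s * dx + c * dz + z) for dx, dz in local]


if __name__ == "__main__":
    # A 4 m x 2 m box centered 10 m in front of the camera, 1 m to the right.
    for corner in bev_corners(x=1.0, z=10.0, length=4.0, width=2.0, ry=0.0):
        print(corner)
```

The returned corners can be passed to any 2D polygon drawing routine after scaling from meters to pixels.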
Please follow KITTI_TEST_UPLOAD_GUIDELINES to upload results to the KITTI benchmark for evaluation.



