Our work is built on the open-source code of KM3D and RTM3D.
All experiments were run on Ubuntu 20.04 with PyTorch 1.0.0, CUDA 10.0, and Python 3.6, on a single NVIDIA GTX 1070 GPU.
Please refer to INSTALL.md for installation instructions.
Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:
PlaneCons
├── kitti_format
│   ├── data
│   │   ├── kitti
│   │   │   ├── annotations_split_1/kitti_train_4_points.json
│   │   │   ├── annotations_split_2/
│   │   │   ├── training
│   │   │   │   ├── calib/000000.txt ...
│   │   │   │   ├── image_2/000000.png ... 007480.png
│   │   │   │   ├── label_2/000000.txt ... 007480.txt
│   │   │   ├── testing
│   │   │   │   ├── calib/000000.txt ...
│   │   │   │   ├── image_2/000000.png ... 007480.png
│   │   │   │   ├── image_3/000000.png ... 007480.png
│   │   │   ├── train.txt  val.txt  train_split_1.txt  val_split_1.txt
├── src
├── demo_kitti_format
├── readme
├── requirements.txt
cd ./src/tools
python kitti.py
You can modify the resolution parameter in kitti.py (choosing one of the three values [0.5, 1, 2]) to generate preprocessed labels with different numbers of key points. We set the distance between the vertices to 2.
When resolution=2, only the vertices of the 3D box are obtained.
When resolution=1, key points are sampled on each plane of the 3D box at an interval of 1 unit, i.e., 9 key points.
When resolution=0.5, key points are sampled on each plane of the 3D box at an interval of 1/4 unit.
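As a rough illustration of how the resolution value controls sampling density, here is a minimal sketch that samples a uniform grid over one face of a canonical box whose vertices are 2 units apart. The helper `face_keypoints` is hypothetical (the actual sampling lives in kitti.py), and the sketch assumes the grid step equals the resolution value, which matches the 9-key-point case above; the exact mapping for resolution=0.5 is defined by kitti.py itself.

```python
def face_keypoints(resolution, side=2.0):
    """Sample a uniform grid of key points on one face of a canonical
    3D box with the given side length (hypothetical helper, assuming
    the grid step equals the resolution value)."""
    # Number of grid points along one edge: 0, resolution, ..., side
    n = int(round(side / resolution)) + 1
    ticks = [i * resolution for i in range(n)]
    # Cartesian product of the ticks gives the grid on one face
    return [(u, v) for u in ticks for v in ticks]

for res in (2, 1, 0.5):
    print(f"resolution={res}: {len(face_keypoints(res))} key points per face")
```

Under this assumption, resolution=2 keeps only the 4 face corners and resolution=1 yields the 3x3 grid of 9 key points per face.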
To train models with different numbers of key points, first set the n_num_joints parameter in kittihp.py and the corresponding network parameters in opts.py.
Then run the following command to train a model with the ResNet-18 backbone.
sh ./train_res.sh
Run the following command to train a model with the DLA-34 backbone.
sh ./train_dla.sh
Run the following command to generate results.
sh ./generate.sh
Run the following command for evaluation.
sh ./eval.sh
Run the following command for visualization.
sh ./vis.sh
You can modify the parameters in these .sh files.
We provide our pre-trained models via the following link: Google Cloud Drive
Plane Constraints is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from KM3D, CenterNet, dla (DLA network), DCNv2 (deformable convolutions), iou3d, and kitti_eval (KITTI dataset evaluation). Please refer to the original licenses of these projects (see NOTICE).
If you find this project useful for your research, please use the following BibTeX entry.
@article{yao2023vertex,
title={Vertex points are not enough: Monocular 3D object detection via intra-and inter-plane constraints},
author={Yao, Hongdou and Chen, Jun and Wang, Zheng and Wang, Xiao and Chai, Xiaoyu and Qiu, Yansheng and Han, Pengfei},
journal={Neural Networks},
volume={162},
pages={350--358},
year={2023},
publisher={Elsevier}
}