lilanxiao/VoteNet-Keras

Introduction

This repository is an unofficial Keras/Tensorflow implementation of VoteNet (official implementation here).

Environments

The code is tested with:

Ubuntu 18.04
Keras 2.2.0
Tensorflow 1.10
CUDA 9.2

*For personal reasons, I have to develop with these old versions of Keras and Tensorflow.

The following Python dependencies are also required:

numpy
scipy
open3d # for visualization

Compile Custom Tensorflow Operators

The VoteNet backbone uses some custom TF operators. The corresponding code is copied from the PointNet++ repo. To compile them, run the following scripts:

tf_ops/3d_interpolation/tf_interpolate_compile.sh
tf_ops/grouping/tf_grouping_compile.sh
tf_ops/sampling/tf_sampling_compile.sh

Configure TF_ROOT and CUDA_ROOT in these scripts according to your own environment. If necessary, read the instructions in the PointNet++ repo for reference.
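For orientation, the compile scripts follow the pattern used in the PointNet++ repo; below is a sketch of what tf_grouping_compile.sh typically looks like. The CUDA path and compiler flags are assumptions for the tested TF 1.10 / CUDA 9.2 setup and must be adapted to your machine:

```shell
#!/bin/bash
# Sketch of tf_ops/grouping/tf_grouping_compile.sh -- adjust paths for your setup.
CUDA_ROOT=/usr/local/cuda-9.2   # assumed CUDA installation path
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

# Compile the CUDA kernel first ...
$CUDA_ROOT/bin/nvcc tf_grouping_g.cu -o tf_grouping_g.cu.o -c -O2 \
    -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC

# ... then link everything into a shared library TF can load.
g++ -std=c++11 tf_grouping.cpp tf_grouping_g.cu.o -o tf_grouping_so.so \
    -shared -fPIC -I "$TF_INC" -I "$CUDA_ROOT/include" \
    -L "$CUDA_ROOT/lib64" -L "$TF_LIB" -ltensorflow_framework -lcudart -O2
```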

Prepare the Training Data

This repo ONLY implements the SUN RGB-D version of VoteNet. Follow the instructions in the official VoteNet repo here to prepare the training data. Move the generated data and labels into the folder data. The folder should have the following subfolders:

sunrgbd/sunrgbd_pc_bbox_votes_50k_v1_train
sunrgbd/sunrgbd_pc_bbox_votes_50k_v1_val
sunrgbd/sunrgbd_pc_bbox_votes_50k_v2_train
sunrgbd/sunrgbd_pc_bbox_votes_50k_v2_val
sunrgbd/sunrgbd_trainval

*Optional: to train the Keras implementation of PointNet++, download ModelNet40 from here and move the data to the folder data.
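To catch path mistakes early, a small helper (hypothetical, not part of the repo) can verify that all expected subfolders exist before you start training:

```python
import os

# Subfolders expected under the data folder, as listed above.
EXPECTED = [
    "sunrgbd/sunrgbd_pc_bbox_votes_50k_v1_train",
    "sunrgbd/sunrgbd_pc_bbox_votes_50k_v1_val",
    "sunrgbd/sunrgbd_pc_bbox_votes_50k_v2_train",
    "sunrgbd/sunrgbd_pc_bbox_votes_50k_v2_val",
    "sunrgbd/sunrgbd_trainval",
]

def missing_subfolders(data_root="data"):
    """Return the expected SUN RGB-D subfolders that are absent under data_root."""
    return [d for d in EXPECTED
            if not os.path.isdir(os.path.join(data_root, d))]

if __name__ == "__main__":
    missing = missing_subfolders()
    if missing:
        print("Missing subfolders:", missing)
    else:
        print("Data layout looks complete.")
```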

Evaluate Pretrained Model

Run the following script to evaluate VoteNet. The pretrained weights in the folder logs/votenet will be loaded automatically. NOTE: this model is trained on v2 labels.

python evaluate_votenet.py

The inference results for the first batch will be visualized via Open3D. To close the pop-up window, press ESC. After that, the model is evaluated on the whole validation set, and the AP and AR are printed.

Here is an example. The red and green boxes illustrate predicted and ground-truth bounding boxes, respectively. The red, green and blue points represent seeds, votes and object centers, respectively.

prediction_1

To view more options, run python evaluate_votenet.py -h

-h, --help            show this help message and exit
--log_dir LOG_DIR     Path to load the pretrained checkpoints [default
                    logs/votenet]
--checkpoint CHECKPOINT
                    File name of the check point used for inferencing.
                    Load last check point in log_dir if not specified
                    [default None]
--num_points NUM_POINTS
                    Number of input points [default: 20000]
--batch_size BATCH_SIZE
                    Batch size [default: 8]
--vote_factor VOTE_FACTOR
                    Numbers of votes each seed generates [default: 1]
--num_proposals NUM_PROPOSALS
                    Number of proposals [default: 256]
--conf_thresh CONF_THRESH
                    Confidence threshold for NMS [default 0.05]
--nms_iou NMS_IOU     IoU threshold for NMS [default 0.25]
--ap_iou AP_IOU       IoU threshold for calculating the AP [default 0.25]
--thresh_viz THRESH_VIZ
                    Confidence threshold for visualization [default 0.5]
--no_viz              Do NOT use the visualization [default False]
--use_color           Use RGB color as features
--no_height           Do NOT use height as features
--use_v1              Use v1 labels for SUN RGB-D dataset
--random_proposal     Use random sampling instead of FPS in proposal module
                    [default False]
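To make the NMS-related options concrete: conf_thresh first discards low-confidence proposals, then boxes are kept greedily in score order while any candidate whose IoU with an already-kept box exceeds nms_iou is suppressed. A minimal sketch on axis-aligned 2D boxes (the repo itself works with 3D boxes, so this is illustrative only):

```python
import numpy as np

def nms_axis_aligned(boxes, scores, conf_thresh=0.05, nms_iou=0.25):
    """Greedy NMS on axis-aligned 2D boxes [x1, y1, x2, y2]."""
    keep = scores >= conf_thresh                # conf_thresh: drop weak proposals
    boxes, scores = boxes[keep], scores[keep]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(-scores)                 # highest confidence first
    picked = []
    while order.size:
        i, rest = order[0], order[1:]
        picked.append(i)
        # Intersection of box i with every remaining candidate.
        lt = np.maximum(boxes[i, :2], boxes[rest, :2])
        rb = np.minimum(boxes[i, 2:], boxes[rest, 2:])
        inter = np.prod(np.clip(rb - lt, 0, None), axis=1)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= nms_iou]            # nms_iou: suppress heavy overlaps
    return boxes[picked], scores[picked]
```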

Train from Scratch

Run the following script to train VoteNet on SUN RGB-D. The model takes about 12 hours to train (with multiprocessing for data generation) on a single Nvidia RTX 2080 Ti. Since several loss values are displayed during training, it's recommended to maximize the terminal; otherwise the display becomes unstable and hard to read.

python train_votenet.py

Run the following script to train PointNet++ on ModelNet40:

python train_pointnet2.py

To check the hyperparameters and other options, run

python train_votenet.py -h

or

python train_pointnet2.py -h 

for more details.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{qi2019deep,
    author = {Qi, Charles R and Litany, Or and He, Kaiming and Guibas, Leonidas J},
    title = {Deep Hough Voting for 3D Object Detection in Point Clouds},
    booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
    year = {2019}
}

@article{qi2017pointnetplusplus,
  title={PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space},
  author={Qi, Charles R and Yi, Li and Su, Hao and Guibas, Leonidas J},
  journal={arXiv preprint arXiv:1706.02413},
  year={2017}
}

Known Issues

1. Performance Gap

I've fixed some bugs and changed the hyperparameters slightly. The pretrained model makes decent predictions, but the overall performance is still a little worse than the original implementation (59.9% mAP@0.25IoU with v2 labels). Here is the best result I got:

With a threshold of 0.25 3D IoU:
eval bed Average Precision: 0.845506
eval table Average Precision: 0.497235
eval sofa Average Precision: 0.665012
eval chair Average Precision: 0.777190
eval toilet Average Precision: 0.863728
eval desk Average Precision: 0.258113
eval dresser Average Precision: 0.266407
eval night_stand Average Precision: 0.598512
eval bookshelf Average Precision: 0.305829
eval bathtub Average Precision: 0.675479
eval mAP: 0.575301
eval bed Recall: 0.951456
eval table Recall: 0.860307
eval sofa Recall: 0.913876
eval chair Recall: 0.897664
eval toilet Recall: 0.972414
eval desk Recall: 0.832094
eval dresser Recall: 0.743119
eval night_stand Recall: 0.898039
eval bookshelf Recall: 0.677305
eval bathtub Recall: 0.816327
eval AR: 0.856260
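As a sanity check, the reported mAP is simply the mean of the ten per-class APs listed above:

```python
# Per-class APs copied from the evaluation log above.
aps = {
    "bed": 0.845506, "table": 0.497235, "sofa": 0.665012, "chair": 0.777190,
    "toilet": 0.863728, "desk": 0.258113, "dresser": 0.266407,
    "night_stand": 0.598512, "bookshelf": 0.305829, "bathtub": 0.675479,
}
mAP = sum(aps.values()) / len(aps)
print(round(mAP, 6))  # -> 0.575301, matching the logged "eval mAP"
```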

Any suggestions and insights about this issue would be welcome.

2. Duplicate random numbers in data augmentation with multiprocessing

The SunrgbdDetectionVotesDataset gets duplicate random numbers when use_multiprocessing=True, which makes the data augmentation less random than intended and might hurt performance. Without multiprocessing, the model needs more time to train, and GPU utilization is low, especially during the validation phase.

with multiprocessing: ~250 s/epoch
without multiprocessing: ~320 s/epoch
Hardware: Nvidia RTX 2080Ti, Intel i7-8700

This is a known issue with NumPy (see here). PyTorch's DataLoader provides worker_init_fn, which allows re-seeding the RNG in each worker, but Keras has no equivalent.
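A common workaround (sketched below; not part of this repo) is to re-seed NumPy once per worker process, e.g. from the process id, which mimics what worker_init_fn does in PyTorch:

```python
import multiprocessing as mp
import os

import numpy as np

def _seed_worker():
    # Forked workers inherit the parent's NumPy RNG state, so without this
    # they all produce the same "random" augmentation parameters.
    # Re-seeding from the process id gives each worker its own stream.
    np.random.seed(os.getpid() % (2 ** 32))

def _draw(_):
    # Stand-in for a randomized augmentation (e.g. a random rotation angle).
    return float(np.random.rand())

def parallel_draws(n_tasks=4, n_workers=2):
    with mp.Pool(n_workers, initializer=_seed_worker) as pool:
        return pool.map(_draw, range(n_tasks))

if __name__ == "__main__":
    # With per-worker seeding, the draws are all distinct.
    print(parallel_draws())
```

In a Keras Sequence-based pipeline, the same re-seeding call could be made lazily inside `__getitem__` the first time each worker process touches the dataset.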