MV3D_TF (In progress)

This is an experimental TensorFlow implementation of MV3D, a ConvNet for object detection using Lidar and a monocular camera.

For details about MV3D, please refer to the paper "Multi-View 3D Object Detection Network for Autonomous Driving" by Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, and Tian Xia.

Requirements: software

Requirements for TensorFlow 1.0 (see: TensorFlow)
Python packages you might not have: cython, python-opencv, easydict

Requirements: hardware

For training the end-to-end version of Faster R-CNN with VGG16, 3 GB of GPU memory is sufficient (using cuDNN).

Installation

Clone the MV3D_TF repository

  # Make sure to clone with --recursive
  git clone --recursive https://github.com/RyannnG/MV3D_TF.git

Build the Cython modules
```
 cd $MV3D/lib
 make
```
Download the KITTI object dataset.

 % Specify KITTI data path so that the structure is like

 % {kitti_dir}/object/training/image_2
 %                            /image_3
 %                            /calib
 %                            /lidar_bv
 %                            /velodyne

 % {kitti_dir}/object/testing/image_2
 %                           /image_3
 %                           /calib
 %                           /lidar_bv
 %                           /velodyne
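The layout above can be checked before going further. The following is a minimal sketch, not part of this repository: `missing_kitti_dirs` is a hypothetical helper, and it assumes that `lidar_bv` is generated later by `tools/read_lidar.py`, so that directory may legitimately be reported as missing at this stage.

```python
from pathlib import Path

# Sub-directories taken from the layout above. lidar_bv is assumed to be
# created by tools/read_lidar.py, so it can be absent before that step.
EXPECTED_SUBDIRS = ("image_2", "image_3", "calib", "lidar_bv", "velodyne")
SPLITS = ("training", "testing")


def missing_kitti_dirs(kitti_dir):
    """Return the expected directories that do not exist under kitti_dir."""
    root = Path(kitti_dir) / "object"
    return [
        str(root / split / sub)
        for split in SPLITS
        for sub in EXPECTED_SUBDIRS
        if not (root / split / sub).is_dir()
    ]

# Example use (the path is a placeholder for your own {kitti_dir}):
#   for path in missing_kitti_dirs("/path/to/kitti_dir"):
#       print("missing:", path)
```

An empty list means every directory in the layout is present.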

Generate the Lidar bird view data

    # Edit kitti_path in tools/read_lidar.py,
    # then generate the bird view data
    python tools/read_lidar.py
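To give a rough idea of what a bird view encoding involves, here is a simplified sketch. It is not the code in `tools/read_lidar.py`: the function names, the crop ranges, the 0.1 m resolution, the image orientation, and the use of a single height channel are all illustrative assumptions. It relies only on the KITTI velodyne format (float32 rows of x, y, z, reflectance) and produces three channels per cell: height of the highest point, reflectance of that point, and a normalized point density.

```python
import numpy as np


def load_velodyne(bin_path):
    """Read a KITTI velodyne scan: float32 rows of (x, y, z, reflectance)."""
    return np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)


def lidar_to_bird_view(points, x_range=(0.0, 70.4), y_range=(-40.0, 40.0),
                       z_range=(-2.0, 0.4), res=0.1):
    """Rasterize points into an (H, W, 3) map: height, intensity, density.

    All default ranges are assumptions made for this sketch.
    """
    points = np.asarray(points, dtype=np.float64)
    x, y, z, r = points[:, 0], points[:, 1], points[:, 2], points[:, 3]
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z, r = x[keep], y[keep], z[keep], r[keep]

    height = int(round((x_range[1] - x_range[0]) / res))
    width = int(round((y_range[1] - y_range[0]) / res))
    # Assumed orientation: row 0 is farthest ahead, column 0 is farthest left.
    rows = height - 1 - ((x - x_range[0]) / res).astype(np.int64)
    cols = width - 1 - ((y - y_range[0]) / res).astype(np.int64)
    rows = np.clip(rows, 0, height - 1)
    cols = np.clip(cols, 0, width - 1)

    # Sort by height so that, for repeated cells, the highest point is
    # written last and therefore wins the assignment.
    order = np.argsort(z, kind="stable")
    rows, cols, z, r = rows[order], cols[order], z[order], r[order]

    bv = np.zeros((height, width, 3), dtype=np.float32)
    bv[rows, cols, 0] = z - z_range[0]  # height above the lower z bound
    bv[rows, cols, 1] = r               # reflectance of the highest point

    counts = np.zeros((height, width), dtype=np.float64)
    np.add.at(counts, (rows, cols), 1.0)  # points per cell
    bv[:, :, 2] = np.minimum(1.0, np.log(counts + 1.0) / np.log(64.0))
    return bv
```

With the default ranges the output has shape (704, 800, 3). The MV3D paper splits the height range into several slices, each with its own channel; this sketch keeps one slice for brevity.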

Create symlinks for the KITTI dataset

   cd $MV3D/data/KITTI
   ln -s {kitti_dir}/object object

Download pre-trained ImageNet models

Download the pre-trained ImageNet model ([Google Drive] or [Dropbox]) and move it into place:

    mv VGG_imagenet.npy $MV3D/data/pretrain_model/VGG_imagenet.npy

Run script to train model

 cd $MV3D
 ./experiments/scripts/mv3d.sh $DEVICE $DEVICE_ID ${.npy/ckpt.meta} kitti_train

DEVICE is either cpu or gpu.

Network Structure

Key idea: Use the Lidar bird view to generate anchor boxes, then project those boxes onto the image for classification.
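The projection step can be illustrated with a short sketch. This is not the repository's implementation: `box3d_corners` and `project_to_image` are hypothetical helpers, and `proj` is assumed to be a single 3x4 matrix that maps Lidar coordinates to image pixels (for KITTI this would be the product of the camera projection, rectification, and velodyne-to-camera matrices from the calibration files). Points behind the camera are not handled.

```python
import numpy as np


def box3d_corners(center, size, yaw):
    """Eight corners (8, 3) of a box with center (x, y, z), size (l, w, h),
    rotated by yaw radians about the z axis."""
    l, w, h = size
    # Corner offsets in the box's own frame.
    dx = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * l / 2.0
    dy = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * w / 2.0
    dz = np.array([-1, -1, -1, -1, 1, 1, 1, 1]) * h / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    x = c * dx - s * dy
    y = s * dx + c * dy
    return np.stack([x, y, dz], axis=1) + np.asarray(center, dtype=np.float64)


def project_to_image(corners, proj):
    """Project (N, 3) points with a 3x4 matrix and return the enclosing
    axis-aligned 2D box as (x1, y1, x2, y2)."""
    homo = np.hstack([corners, np.ones((corners.shape[0], 1))])
    uvw = homo @ proj.T
    uv = uvw[:, :2] / uvw[:, 2:3]  # perspective divide
    return np.array([uv[:, 0].min(), uv[:, 1].min(),
                     uv[:, 0].max(), uv[:, 1].max()])


# Toy pinhole camera looking along the Lidar x axis (illustrative values only).
f, cx, cy = 100.0, 50.0, 40.0
toy_proj = np.array([[cx, -f, 0.0, 0.0],
                     [cy, 0.0, -f, 0.0],
                     [1.0, 0.0, 0.0, 0.0]])
corners = box3d_corners(center=(10.0, 0.0, 0.0), size=(2.0, 2.0, 2.0), yaw=0.0)
box2d = project_to_image(corners, toy_proj)
```

The resulting 2D box is the image region from which features would be pooled for the projected anchor.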

Examples

Image and corresponding Lidar map

Note:

In image:

Boxes without regression

In Lidar:

white box: without regression (corresponds to the boxes in the image)
purple box: with regression

Existing Errors

Mostly due to regression error

(errors in boxes 5, 6, 9)

(errors in boxes 8, 9, 10)

References

Lidar Birds Eye Views

part.2: Didi Udacity Challenge 2017 — Car and pedestrian Detection using Lidar and RGB

Faster_RCNN_TF

Faster R-CNN caffe version

TFFRCNN
