Motion R-CNN: Mask R-CNN with support for 3D motion estimation (prototype)

Motion R-CNN

This repository contains the TensorFlow prototype implementation of my bachelor thesis Motion R-CNN: Instance-level 3D Motion Estimation with Region-based CNNs.

In addition to the functionality provided by the TensorFlow Object Detection API (at the time of writing), the code supports:

  • prediction of instance masks for detected objects
  • Feature Pyramid Networks
  • prediction of inter-frame 3D camera ego-motion (translation and rotation) given a second image temporally consecutive to the first image
  • prediction of 3D motions (translation and rotation) between the two frames for all objects detected in the first frame
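As a rough illustration of what a predicted 3D motion (translation and rotation) means geometrically, the sketch below applies a rigid motion to a single 3D point. The Euler-angle convention (R = Rz · Ry · Rx) and the function names are assumptions made for this example and are not taken from the repository, whose motion parametrization may differ.

```python
import math


def euler_to_rotation(rx, ry, rz):
    """Build a 3x3 rotation matrix R = Rz * Ry * Rx from angles in radians.

    The axis order is an assumption for illustration only.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def apply_rigid_motion(point, rotation, translation):
    """Return p' = R p + t for a 3D point given as a sequence of 3 floats."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Rotate a point by 90 degrees about the z-axis, then shift it along z.
rotation = euler_to_rotation(0.0, 0.0, math.pi / 2)
moved = apply_rigid_motion((1.0, 0.0, 0.0), rotation, (0.0, 0.0, 1.0))
```

The same form covers both camera ego-motion and per-object motion: each is a rotation plus a translation between the two frames.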

Note that the code only supports training on the Virtual KITTI dataset, but it is easy to adapt to other datasets. Motion prediction and frame pair input are fully optional, so the code can also be used as a Mask R-CNN implementation with single image input. Support for Cityscapes is implemented, but using the created records may require adapting the data_decoder or the record writing, as the record interface has changed.


License

Motion R-CNN is released under the MIT License (refer to the LICENSE file for details).



Requirements

  • tensorflow (>= 1.3.0) with GPU support.
  • sudo apt-get install protobuf-compiler
  • pip install opencv-python pandas pillow lxml matplotlib


Setup

  • from the project root directory, run export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
  • download and extract the pre-trained ResNet-50 model to object_detection/data
  • download all of the Virtual KITTI ground truth and extract the folders into a directory named vkitti.
  • cd to the project root directory
  • run protoc object_detection/protos/*.proto --python_out=.
  • run python --data_dir=<data_parent_dir> --output_dir=data/records --set val
  • run python --data_dir=<data_parent_dir> --output_dir=data/records --set train

Note that <data_parent_dir> is the directory containing the vkitti directory.
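Before creating the records, it can help to check the directory layout described above. The following is a minimal sketch, not part of the repository: the helper names `pythonpath_entries` and `find_vkitti_dir` are hypothetical, and the only assumptions are those stated in the steps (the project root and its slim subdirectory go on PYTHONPATH, and the vkitti directory sits directly inside <data_parent_dir>).

```python
import os


def pythonpath_entries(project_root):
    """Return the two paths the export command appends to PYTHONPATH."""
    root = os.path.abspath(project_root)
    return [root, os.path.join(root, "slim")]


def find_vkitti_dir(data_parent_dir):
    """Return <data_parent_dir>/vkitti, or raise if the directory is missing."""
    path = os.path.join(data_parent_dir, "vkitti")
    if not os.path.isdir(path):
        raise FileNotFoundError(
            "expected a 'vkitti' directory inside %r" % data_parent_dir
        )
    return path
```

Calling `find_vkitti_dir` with the value you intend to pass as --data_dir fails early if the ground truth was extracted to the wrong place.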

Training & evaluating


  • python --logtostderr --pipeline_config_path=data/configs/motion_rcnn_vkitti_cam.config --train_dir=output/train/motion_rcnn_vkitti_cam --gpu 0
  • python --logtostderr --pipeline_config_path=data/configs/motion_rcnn_vkitti_cam.config --checkpoint_dir=output/train/motion_rcnn_vkitti_cam --eval_dir=output/eval/motion_rcnn_vkitti_cam

These commands train and evaluate a model with camera and instance motion prediction. You can adapt the configurations found in data/configs/. For a description of the configuration parameters, see object_detection/protos.

Navigating the code

The following files were added or modified from the original Object Detection API code

Additionally, some proto params and builders were modified, and several other files were extended.

The following tests were added or modified:

  • object_detection.core.box_predictor_test
  • object_detection.core.post_processing_test
  • object_detection.models.faster_rcnn_resnet_v1_feature_extractor_test
  • object_detection.models.faster_rcnn_resnet_v1_fpn_feature_extractor_test
  • object_detection.anchor_generators.multiple_grid_anchor_generator_test
  • object_detection.meta_architectures.faster_rcnn_meta_arch_test


This repository is based on the TensorFlow Object Detection API.
