PlaneRCNN detects and reconstructs piece-wise planar surfaces from a single RGB image
Latest commit e773eb3 Jun 15, 2019

License CC BY-NC-SA 4.0 Python 3.7

PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image


By Chen Liu, Kihwan Kim, Jinwei Gu, Yasutaka Furukawa, and Jan Kautz

This paper will be presented (Oral) at IEEE CVPR 2019.


This paper proposes a deep neural architecture, PlaneRCNN, that detects an arbitrary number of planes and reconstructs piecewise planar surfaces from a single RGB image. For more details, please refer to our paper and video, or visit the project website. The code is implemented in PyTorch.

Project members


Copyright (c) 2018 NVIDIA Corp. All Rights Reserved. This work is licensed under the Creative Commons Attribution NonCommercial ShareAlike 4.0 License.

Getting Started

Clone repository:

git clone

Please use Python 3. Create an Anaconda environment and install the dependencies:

conda create --name planercnn
conda activate planercnn
conda install -y pytorch=0.4.1
conda install pip
pip install -r requirements.txt

Equivalently, you can use a Python virtual environment to manage the dependencies:

pip install virtualenv
python -m virtualenv planercnn
source planercnn/bin/activate
pip install -r requirements.txt

Now, we compile nms and roialign as explained in the installation section of pytorch-mask-rcnn. To be specific, you can build these two functions using the following commands with the right --arch option:

GPU                        arch
TitanX                     sm_52
GTX 960M                   sm_50
GTX 1070                   sm_61
GTX 1080 (Ti), Titan XP    sm_61

More details on compute capability are available from NVIDIA.
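
If your GPU is not listed, one option is to derive the -arch value from the compute capability that PyTorch reports. The sketch below is only an illustration and is not part of this repository: arch_flag is a hypothetical helper, and the script assumes torch.cuda.get_device_capability is available in the installed PyTorch build.

```python
def arch_flag(major, minor):
    """Format a compute capability (major, minor) as an nvcc -arch value."""
    return "sm_{}{}".format(major, minor)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None  # PyTorch not installed; fall back to the manual lookup

    if torch is not None and torch.cuda.is_available():
        # Query the first visible CUDA device.
        major, minor = torch.cuda.get_device_capability(0)
        print("Use -arch=" + arch_flag(major, minor))
    else:
        print("No CUDA device visible; look up the GPU in the table above.")
```

For example, a device reporting compute capability (6, 1) corresponds to sm_61, which matches the GTX 1070 row in the table.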

cd nms/src/cuda/
nvcc -c -o -x cu -Xcompiler -fPIC -arch=[arch]
cd ../../
cd ../

cd roialign/roi_align/src/cuda/
nvcc -c -o -x cu -Xcompiler -fPIC -arch=[arch]
cd ../../
cd ../../

Please note that the Mask R-CNN backbone does not support CUDA 10.0 or gcc versions higher than 6.


Models are saved under checkpoint/. You can download our trained model from here and put it under checkpoint/ if you want to fine-tune it or run inference.

Run the inference code with an example

python --methods=f --suffix=warping_refine --dataset=inference --customDataFolder=example_images

Results are saved under "test/inference/". Besides visualizations, plane parameters (#planes x 3) are saved in "*_plane_parameters_0.npy" and plane masks (#planes x 480 x 640) are saved in "*_plane_masks_0.npy".
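
The saved arrays can be inspected with NumPy. The following is a minimal sketch, not code from this repository: load_plane_outputs and masks_to_segmentation are hypothetical helpers, image_prefix stands for whatever precedes "_plane_parameters_0.npy" in the file name, and the 0.5 threshold assumes the masks hold binary values or per-pixel scores in [0, 1].

```python
import os

import numpy as np


def load_plane_outputs(result_dir, image_prefix):
    """Load the plane parameters and plane masks saved for one image."""
    params = np.load(os.path.join(result_dir, image_prefix + "_plane_parameters_0.npy"))
    masks = np.load(os.path.join(result_dir, image_prefix + "_plane_masks_0.npy"))
    if params.ndim != 2 or params.shape[1] != 3:
        raise ValueError("expected parameters of shape (#planes, 3), got %s" % (params.shape,))
    if masks.ndim != 3 or masks.shape[0] != params.shape[0]:
        raise ValueError("expected masks of shape (#planes, H, W), got %s" % (masks.shape,))
    return params, masks


def masks_to_segmentation(masks, threshold=0.5):
    """Collapse per-plane masks into one label map; -1 marks non-planar pixels.

    The threshold is an assumption about how mask values are stored.
    """
    num_planes, height, width = masks.shape
    segmentation = np.full((height, width), -1, dtype=np.int64)
    if num_planes == 0:
        return segmentation
    covered = (masks > threshold).any(axis=0)  # pixels claimed by at least one plane
    segmentation[covered] = np.argmax(masks, axis=0)[covered]
    return segmentation
```

A typical call would be load_plane_outputs("test/inference", prefix), followed by masks_to_segmentation(masks) to obtain a 480 x 640 map of plane indices.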

Using custom data

Please put your images (.png or .jpg files) and camera intrinsics in one folder ($YOUR_IMAGE_FOLDER). The camera parameters should be stored in a .txt file containing 6 values (fx, fy, cx, cy, image_width, image_height) separated by spaces. If the camera intrinsics are the same for all images, please put these parameters in camera.txt. Otherwise, please add a separate intrinsics file for each image, with the same name as the image (changing the file extension to .txt). Then run:

python --methods=f --suffix=warping_refine --dataset=inference --customDataFolder=$YOUR_IMAGE_FOLDER
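
The intrinsics file format described above can be written and checked with a few lines of Python. This is a sketch under stated assumptions rather than the repository's own loader: write_camera_file, read_camera_file and camera_file_for are hypothetical helpers, and any numeric values you pass in are your own camera's, not values from this project.

```python
import os

import numpy as np


def write_camera_file(path, fx, fy, cx, cy, width, height):
    """Write the six intrinsics values on one line, separated by spaces."""
    with open(path, "w") as f:
        f.write(" ".join(str(v) for v in (fx, fy, cx, cy, width, height)) + "\n")


def read_camera_file(path):
    """Parse an intrinsics file into a 3x3 matrix K plus the image size."""
    with open(path) as f:
        values = [float(v) for v in f.read().split()]
    if len(values) != 6:
        raise ValueError("expected 6 values, found %d" % len(values))
    fx, fy, cx, cy, width, height = values
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])
    return K, int(width), int(height)


def camera_file_for(image_path):
    """Prefer a per-image .txt file; otherwise fall back to camera.txt."""
    per_image = os.path.splitext(image_path)[0] + ".txt"
    if os.path.exists(per_image):
        return per_image
    return os.path.join(os.path.dirname(image_path), "camera.txt")
```

Reading the file back before running inference is a cheap way to catch a wrong value count or a swapped width and height.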


Training data preparation

Please first download the ScanNet dataset (v2), unzip it to "$ROOT_FOLDER/scans/", and extract image frames from the .sens file using the official reader.

Then download our plane annotation from here, and merge the "scans/" folder with "$ROOT_FOLDER/scans/". (If you prefer other locations, please change the paths in datasets/

After the above steps, ground truth plane annotations are stored under "$ROOT_FOLDER/scans/scene*/annotation/". Among the annotations, planes.npy stores the plane parameters which are represented in the global frame. Plane segmentation for each image view is stored under segmentation/.
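
Because planes.npy is expressed in the global frame, the planes have to be moved into a camera frame before they can be compared with a single view. The sketch below shows one way to do that under two assumptions that this README does not confirm: each row encodes a plane as unit normal times offset (points X on the plane satisfy n . X = d), and the camera pose maps global points as X_cam = R X_global + t. The function name global_to_camera_planes is hypothetical.

```python
import numpy as np


def global_to_camera_planes(planes_global, rotation, translation):
    """Move planes from the global frame into a camera frame.

    ASSUMPTIONS (not confirmed by this README):
      * each row of planes_global is normal * offset, so points X on the
        plane satisfy n . X = d with unit normal n and offset d;
      * the pose maps global points as X_cam = rotation @ X_global + translation.
    """
    planes_global = np.asarray(planes_global, dtype=np.float64)
    rotation = np.asarray(rotation, dtype=np.float64)
    translation = np.asarray(translation, dtype=np.float64)

    # Split each row into a unit normal and an offset.
    offsets = np.linalg.norm(planes_global, axis=1, keepdims=True)
    normals = planes_global / np.maximum(offsets, 1e-8)

    # n_cam = R n, and d_cam = d + n_cam . t follows from substituting
    # X_global = R^T (X_cam - t) into n . X_global = d.
    normals_cam = normals @ rotation.T
    offsets_cam = offsets + normals_cam @ translation.reshape(3, 1)
    return normals_cam * offsets_cam
```

Note that a normal-times-offset encoding cannot represent a plane passing through the origin, so rows with near-zero norm should be treated as invalid under this assumption.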

To generate such training data on your own, please refer to data_prep/ Please refer to the README under data_prep/ for compilation.

Besides scene-specific annotation under each scene folder, please download global metadata from here, and unzip it to "$ROOT_FOLDER". Metadata includes the normal anchors ( and invalid image indices caused by tracking issues (invalid_indices_*.txt).

Training script

python --restore=2 --suffix=warping_refine


--restore:
- 0: training from scratch (not tested)
- 1 (default): resume training from saved checkpoint
- 2: training from pre-trained mask-rcnn model

--suffix (the arguments below can be concatenated):
- '': training the basic version
- 'warping': with the warping loss
- 'refine': with the refinement network
- 'refine_only': train only the refinement network
- 'warping_refine_after': add the warping loss after the refinement network instead of appending both independently

--anchorType:
- 'normal' (default): regress normal using 7 anchors
- 'normal[k]' (e.g., normal5): regress normal using k anchors, normal0 will regress normal directly without anchors
- 'joint': regress final plane parameters directly instead of predicting normals and depthmap separately
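
The option values listed above can be mirrored in a small argparse sketch. This is an illustration of the documented flags and defaults only, not the repository's actual option parsing: build_parser and num_normal_anchors are hypothetical names.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags and defaults described above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--restore", type=int, choices=[0, 1, 2], default=1)
    parser.add_argument("--suffix", type=str, default="")
    parser.add_argument("--anchorType", type=str, default="normal")
    return parser


def num_normal_anchors(anchor_type):
    """Return the anchor count for 'normal' or 'normal[k]', and None for 'joint'."""
    if anchor_type == "joint":
        return None  # plane parameters are regressed directly
    if anchor_type == "normal":
        return 7  # documented default
    count = anchor_type[len("normal"):]
    if anchor_type.startswith("normal") and count.isdigit():
        return int(count)  # 'normal0' means direct regression without anchors
    raise ValueError("unrecognized anchor type: " + anchor_type)
```

For instance, parsing ["--restore=2", "--suffix=warping_refine"] leaves --anchorType at its default, which corresponds to 7 anchors.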

Temporary results are written under test/ for debugging purposes.


To evaluate the performance against existing methods, please run:

python --methods=f --suffix=warping_refine


--methods:
- f: evaluate PlaneRCNN (use --suffix and --anchorType to specify configuration as explained above)
- p: evaluate PlaneNet
- e: evaluate PlaneRecover
- t: evaluate MWS (--suffix=gt for MWS-G)

Statistics are printed in terminal and saved in logs/global.txt for later analysis.

Note that PlaneNet and PlaneRecover are under the MIT license.


If you have any questions, please contact the primary author, Chen Liu, or Kihwan Kim.


Our implementation uses the nms/roialign code from the Mask R-CNN implementation in pytorch-mask-rcnn, which is licensed under the MIT License.
