shurans/sscnet

Semantic Scene Completion from a Single Depth Image

This repo contains training and testing code for our paper on semantic scene completion, the task of producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation. More information about the project can be found in our paper and on the project website.

If you find SSCNet useful in your research, please cite:

@article{song2016ssc,
  author     = {Song, Shuran and Yu, Fisher  and Zeng, Andy and Chang, Angel X and Savva, Manolis and Funkhouser, Thomas},
  title      = {Semantic Scene Completion from a Single Depth Image},
  journal    = {arXiv preprint arXiv:1611.08974},
  year       = {2016},
}

Contents

  1. Organization
  2. Installation
  3. Quick Demo
  4. Testing
  5. Training
  6. Visualization and Evaluation
  7. Data Preparation

Organization

The code and data are organized as follows:

    sscnet
         |-- matlab_code
         |-- caffe_code
                    |-- caffe3d_suncg
                    |-- script
                         |-train
                         |-test   
         |-- data
                |-- depthbin
                    |-- NYUtrain 
                        |-- xxxxx_0000.png
                        |-- xxxxx_0000.bin
                    |-- NYUtest
                    |-- NYUCADtrain
                    |-- NYUCADtest
                    |-- SUNCGtest
                    |-- SUNCGtrain01
                    |-- SUNCGtrain02
                    |-- ...
                |-- eval
                    |-- NYUtest
                    |-- NYUCADtest
                    |-- SUNCGtest
            |-- models
            |-- results
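
After downloading the data, it can help to confirm that each depth image in a split has its matching volume file. The sketch below lists the base names that have both a `.png` and a `.bin` file, as in the `xxxxx_0000.png` / `xxxxx_0000.bin` pair shown above. The helper name `list_samples` is hypothetical and not part of this repo, and the example path assumes the script is run from the `sscnet` root.

```python
import os


def list_samples(split_dir):
    """Return sorted base names that have both a .png and a .bin file."""
    names = os.listdir(split_dir)
    pngs = {os.path.splitext(n)[0] for n in names if n.endswith(".png")}
    bins = {os.path.splitext(n)[0] for n in names if n.endswith(".bin")}
    # Keep only samples where the depth map and the volume file are both present.
    return sorted(pngs & bins)


if __name__ == "__main__":
    # Path taken from the layout above; adjust it to where the data was unpacked.
    split = os.path.join("data", "depthbin", "NYUtrain")
    if os.path.isdir(split):
        print("%d paired samples in %s" % (len(list_samples(split)), split))
    else:
        print("Directory not found: %s" % split)
```

A sample that appears in only one of the two sets usually points to an incomplete download or extraction.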

Download

  1. Download the data: download_data.sh (1.1 GB, updated on Sep 27, 2017)
  2. Download the pretrained models: download_models.sh (9.9 MB)
  3. [optional] Download the training data: download_suncgTrain.sh (16 GB)
  4. [optional] Download the results: download_results.sh (8.2 GB)

Installation

  1. Software Requirements:

    1. Requirements for Caffe and pycaffe (see: Caffe installation instructions)
    2. Matlab 2016a or above with the vision toolbox
    3. OpenCV
  2. Hardware Requirements: at least 12 GB of GPU memory.

  3. Install caffe and pycaffe.

    1. Modify the config files based on your system. You can use Makefile.config.sscnet_example as a reference.
    2. Compile:
    cd caffe_code/caffe3d_suncg
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    make -j8 && make pycaffe
  4. Export the library and Python paths (adjust them to your own build and install locations):

    export LD_LIBRARY_PATH=~/build_master_release/lib:/usr/local/cudnn/v5/lib64:~/anaconda2/lib:$LD_LIBRARY_PATH
    export PYTHONPATH=~/build_master_release/python:$PYTHONPATH
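
Once the paths are exported, a quick import check can confirm that pycaffe is visible to the interpreter before running the demo. This is a minimal sketch, not part of the repo: the helper name `module_available` is hypothetical, and the check only tells you whether `import caffe` succeeds in the current environment. It avoids Python-3-only features, since the paths above point at an anaconda2 install.

```python
import importlib
import os


def module_available(name):
    """Return True if the named module can be imported in this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Show the variables set in the export step so a wrong path is easy to spot.
    for var in ("LD_LIBRARY_PATH", "PYTHONPATH"):
        print("%s=%s" % (var, os.environ.get(var, "")))
    if module_available("caffe"):
        print("pycaffe import OK")
    else:
        print("pycaffe not importable; check the paths above")
```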

Quick Demo

cd demo
python demotest_model.py

This demo runs semantic scene completion on one NYU depth map using our pretrained model and outputs a '.ply' visualization of the result.

Testing

  1. Run the testing script:

    cd caffe_code/script/test
    python test_model.py
  2. The output is stored in the results folder in .hdf5 format.
  3. To test on other test sets (e.g. suncg, nyu, nyucad), modify the paths in test_model.py.

Training

  1. Finetuning on NYU:

    cd caffe_code/train/ftnyu
    ./train.sh
  2. Training from scratch:

    cd caffe_code/train/trainsuncg
    ./train.sh
  3. To get more training data from SUNCG, please refer to the SUNCG toolbox.

Visualization and Evaluation

  1. After testing, the results should be stored in the results/ folder.

  2. You can also download our precomputed results: ./download_results.sh

  3. Run the evaluation code in matlab:

    matlab &
    cd matlab_code
    evaluation_script('../results/','nyucad')
  4. The visualizations of the results will be stored in results/nyucad as .ply files.

Data

  1. Data format
    1. Depth map: 16-bit PNG with bit shifting. Please refer to ./matlab_code/utils/readDepth.m for more information about the depth format.
    2. 3D volume: The first three floats store the origin of the 3D volume in world coordinates. The next 16 floats store the camera pose in world coordinates. The rest of the file is the 3D volume, stored with run-length encoding. Please refer to ./matlab_code/utils/readRLEfile.m for more details.
  2. Example code to convert NYU ground truth data: matlab_code/perpareNYUCADdata.m. This function provides an example of how to convert the NYU ground truth from the 3D CAD model annotations provided by Guo, Ruiqi, Chuhang Zou, and Derek Hoiem, "Predicting complete 3d models of indoor scenes." You need to download the original annotations by running download_UIUCCAD.sh.
  3. Example code to generate testing data without ground truth and room boundary: matlab_code/perpareDataTest.m. This function provides an example of how to generate your own testing data without ground truth labels. It will generate a .bin file with the camera pose and an empty volume, without room boundary.
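
For readers who want to inspect the data outside Matlab, the two formats above can be sketched in Python. This is an illustration under stated assumptions, not a port of the repo's readers: the function names are hypothetical; the depth decoding assumes a 3-bit rotation and millimetre units; the .bin reader assumes little-endian 32-bit floats for the header and little-endian uint32 (value, count) pairs for the run-length data. Only the 3-float origin and 16-float camera pose header comes from the description above. readDepth.m and readRLEfile.m remain the authoritative references.

```python
import struct

HEADER_BYTES = (3 + 16) * 4  # 3 origin floats + 16 camera-pose floats, 4 bytes each


def decode_depth(raw):
    """Undo the bit shift on one 16-bit PNG value and return metres.

    ASSUMPTION: 3-bit rotation and millimetre units; see readDepth.m.
    """
    rotated = ((raw >> 3) | (raw << 13)) & 0xFFFF
    return rotated / 1000.0


def read_volume_file(path):
    """Return (origin, cam_pose, rle) from a .bin file.

    ASSUMPTION: little-endian float32 header and uint32 run-length data.
    """
    with open(path, "rb") as f:
        data = f.read()
    origin = struct.unpack_from("<3f", data, 0)
    cam_pose = struct.unpack_from("<16f", data, 12)
    n_words = (len(data) - HEADER_BYTES) // 4
    rle = struct.unpack_from("<%dI" % n_words, data, HEADER_BYTES)
    return origin, cam_pose, rle


def rle_decode(rle):
    """Expand (value, count) pairs into a flat list of voxel values.

    ASSUMPTION: the pair order is value first, then repeat count.
    """
    voxels = []
    for value, count in zip(rle[0::2], rle[1::2]):
        voxels.extend([value] * count)
    return voxels
```

The decoded list is flat. Reshaping it into a 3D grid needs the volume dimensions and axis order, which are defined in the Matlab code rather than in the file header described here.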

Generating training data from SUNCG

You can generate more training data from SUNCG by following these steps:

  1. Download SUNCG data and toolbox from: https://github.com/shurans/SUNCGtoolbox
  2. Compile the toolbox.
  3. Download the voxel data for objects (download_objectvox.sh) and move the folder under the SUNCG data directory.
  4. Run the script genSUNCGdataScript(). You may need to modify the following paths: suncgDataPath, SUNCGtoolboxPath, outputdir.

License

Code is released under the MIT License (refer to the LICENSE file for details).