Cognitive Mapping and Planning for Visual Navigation

Saurabh Gupta, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik

Computer Vision and Pattern Recognition (CVPR) 2017.

ArXiv, Project Website


If you find this code base and models useful in your research, please consider citing the following paper:

  title={Cognitive Mapping and Planning for Visual Navigation},
  author={Gupta, Saurabh and Davidson, James and Levine, Sergey and
    Sukthankar, Rahul and Malik, Jitendra},


  1. Requirements: software
  2. Requirements: data
  3. Test Pre-trained Models
  4. Train Your Own Models

Requirements: software

  1. Python Virtual Env Setup: All code is implemented in Python, but it depends on a small number of Python packages and a couple of C libraries. We recommend using a virtual environment for installing these Python packages and the Python bindings for these C libraries.

    pip install virtualenv
    virtualenv $VENV_DIR
    source $VENV_DIR/bin/activate
    # You may need to upgrade pip to install opencv-python.
    pip install --upgrade pip
    # Install simple dependencies.
    pip install -r requirements.txt
    # Patch bugs in dependencies.
    sh patches/
  2. Install TensorFlow inside this virtual environment. You will need one of the latest nightly builds (see instructions here).

  3. Swiftshader: We use Swiftshader, a CPU-based renderer, to render the meshes. To use a different renderer, replace SwiftshaderRenderer in render/ with bindings to your renderer.

    mkdir -p deps
    git clone --recursive deps/swiftshader-src
    cd deps/swiftshader-src && git checkout 91da6b00584afd7dcaed66da88e2b617429b3950
    mkdir build && cd build && cmake .. && make -j 16 libEGL libGLESv2
    cd ../../../
    cp deps/swiftshader-src/build/libEGL*
    cp deps/swiftshader-src/build/libGLESv2*
  4. PyAssimp: We use PyAssimp to load meshes. To use a different mesh-loading library, replace Shape in render/ with bindings to your library.

    mkdir -p deps
    git clone deps/assimp-src
    cd deps/assimp-src
    git checkout 2afeddd5cb63d14bc77b53740b38a54a97d94ee8
    cmake CMakeLists.txt -G 'Unix Makefiles' && make -j 16
    cd port/PyAssimp && python install
    cd ../../../..
    cp deps/assimp-src/lib/libassimp* .
  5. graph-tool: We use the graph-tool library for graph processing.

    mkdir -p deps
    # If the following git clone command fails, you can also download the source
    # from
    git clone deps/graph-tool-src
    cd deps/graph-tool-src && git checkout 178add3a571feb6666f4f119027705d95d2951ab
    ./configure --disable-cairo --disable-sparsehash --prefix=$HOME/.local
    make -j 16
    make install
    cd ../../
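
Once the steps above have run, a quick sanity check can confirm that the virtual environment is active and that the Python bindings import. The following is a minimal sketch, not part of this code base. The module names (`tensorflow`, `cv2` for opencv-python, `pyassimp`, `graph_tool`) are assumptions about what each dependency installs, and the helper names are hypothetical.

```python
import importlib
import sys

# Assumed import names for the dependencies installed above.
REQUIRED_MODULES = ["tensorflow", "cv2", "pyassimp", "graph_tool"]


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def missing_modules(names):
    """Return the names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # An import can also fail because a shared library cannot be
            # loaded, so any exception is treated as a failure.
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("virtualenv active: %s" % in_virtualenv())
    print("missing modules: %s" % missing_modules(REQUIRED_MODULES))
```

An empty list of missing modules only shows that the imports succeed. It does not verify that the checked-out versions match the commits pinned above.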

Requirements: data

  1. Download the Stanford 3D Indoor Spaces Dataset (S3DIS Dataset) and the ImageNet pre-trained models used to initialize the different models. Follow the instructions in data/

Test Pre-trained Models

  1. Download pre-trained models. See output/

  2. Test models using scripts/

Train Your Own Models

All models were trained asynchronously with 16 workers, each worker using data from a single floor. The default hyper-parameters correspond to this setting. See distributed training with TensorFlow for setting up distributed training. Training with a single worker is possible with the current code base, but it requires some minor changes so that the worker loads all training environments.
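
The split of training environments across workers described above can be sketched as follows. This is an illustration under stated assumptions, not the code base's actual loading logic: the function name, its arguments, and the floor names are all hypothetical. Environments are dealt out round-robin, so a single worker receives every environment.

```python
def environments_for_worker(task_id, num_workers, environments):
    """Return the training environments that worker `task_id` should load.

    Environments are assigned round-robin. With as many workers as
    environments, each worker gets exactly one; one worker gets them all.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if not 0 <= task_id < num_workers:
        raise ValueError("task_id must be in [0, num_workers)")
    return [env for i, env in enumerate(environments)
            if i % num_workers == task_id]


# Hypothetical floor names, for illustration only.
floors = ["floor_%d" % i for i in range(4)]
print(environments_for_worker(1, 4, floors))  # one floor for this worker
print(environments_for_worker(0, 1, floors))  # a single worker loads all floors
```

A real single-worker setup would also need the training loop to sample among the loaded environments, which this sketch does not cover.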


For questions or problems, open an issue on the tensorflow/models issue tracker. Please assign issues to @s-gupta.


This code was written by Saurabh Gupta (@s-gupta).
