Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation in CVPR 2018 Spotlight

Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation-CVPR 2018 Spotlight

This repository is the official implementation of the paper.
Links: [Paper][Oral Presentation]
By Dan Xu, Wei Wang, Hao Tang, Hong Liu, Nicu Sebe, Elisa Ricci

Installation & Setup

The code is built on the Caffe framework. Please first download and install the modified Caffe version provided in this repository. The code has been tested with CUDA 8.0, cuDNN 5.1, and Python 2.7. To install, follow the instructions below.
First clone the repository:

git clone https://github.com/danxuhk/StructuredAttentionDepthEstimation.git 

Then build caffe and pycaffe:

cd $Caffe_ROOT
cp Makefile.config.example Makefile.config
vim Makefile.config ### edit the necessary lines to configure dependencies
sh install.sh

Data Preparation

First download the KITTI raw data from the official website http://www.cvlibs.net/datasets/kitti/ into the folder ./StructuredAttentionDepthEstimation/data/KITTI. To generate the training data, run the following commands:

cd ./StructuredAttentionDepthEstimation/data
python save_16bitpng_gt.py

The script generates a text file of training pairs, 'eigen_train_pairs.txt', under ./utils/filenames for use in the training phase.
For testing, the Eigen split of 697 images is used.
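
The script name save_16bitpng_gt.py suggests that ground-truth depth is stored as 16-bit PNG images. A minimal sketch of one common encoding follows, assuming the convention of the KITTI depth devkit (stored value = depth in metres * 256, with 0 marking pixels without ground truth). Whether the script uses exactly this scaling is an assumption, and the helper names encode_depth and decode_depth are hypothetical.

```python
MAX_UINT16 = 65535
SCALE = 256.0  # assumed scaling factor (KITTI depth devkit convention)


def encode_depth(depth_m):
    """Convert a depth in metres to a 16-bit integer value.

    Non-positive depths are treated as missing ground truth and map to 0.
    Values that would overflow 16 bits are clipped.
    """
    if depth_m <= 0:
        return 0
    return min(int(round(depth_m * SCALE)), MAX_UINT16)


def decode_depth(value):
    """Convert a stored 16-bit value back to metres (None if invalid)."""
    if value <= 0:
        return None
    return value / SCALE
```

With this scaling the depth resolution is 1/256 m and the largest representable depth is just under 256 m.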

Testing and Evaluation

Please first download the trained model from Google Drive and put it under ./StructuredAttentionDepthEstimation/models. The saved testing results can also be downloaded from the same link. To test the trained model, run:

cd ./StructuredAttentionDepthEstimation/prototxt
python gen_deploy_prototxt.py ### generating a network definition for the deploy network
sh test.sh ### testing and evaluating the model

We refine and fuse the multi-scale features derived from different deep semantic layers (e.g. res3d, res4f, res5c layers) using the proposed MeanFieldUpdate module as follows:

    #the first meanfield updating
    MeanFieldUpdate(n, n.res3d_dec, n.res5c_dec, 1, 1, feat_num)
    MeanFieldUpdate(n, n.res4f_dec, n.updated_f1_mf1, 2, 1, feat_num)
    MeanFieldUpdate(n, n.res5c_dec, n.updated_f2_mf1, 3, 1, feat_num)
    #the second meanfield updating
    MeanFieldUpdate(n, n.res3d_dec, n.updated_f3_mf1, 1, 2, feat_num)
    MeanFieldUpdate(n, n.res4f_dec, n.updated_f1_mf2, 2, 2, feat_num)
    MeanFieldUpdate(n, n.res5c_dec, n.updated_f2_mf2, 3, 2, feat_num)
    #the third meanfield updating
    MeanFieldUpdate(n, n.res3d_dec, n.updated_f3_mf2, 1, 3, feat_num)
    MeanFieldUpdate(n, n.res4f_dec, n.updated_f1_mf3, 2, 3, feat_num)
    MeanFieldUpdate(n, n.res5c_dec, n.updated_f2_mf3, 3, 3, feat_num)
    #the fourth meanfield updating
    MeanFieldUpdate(n, n.res3d_dec, n.updated_f3_mf3, 1, 4, feat_num)
    MeanFieldUpdate(n, n.res4f_dec, n.updated_f1_mf4, 2, 4, feat_num)
    MeanFieldUpdate(n, n.res5c_dec, n.updated_f2_mf4, 3, 4, feat_num)
    #the fifth meanfield updating
    MeanFieldUpdate(n, n.res3d_dec, n.updated_f3_mf4, 1, 5, feat_num)
    MeanFieldUpdate(n, n.res4f_dec, n.updated_f1_mf5, 2, 5, feat_num)
    MeanFieldUpdate(n, n.res5c_dec, n.updated_f2_mf5, 3, 5, feat_num)
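
The fifteen calls above follow a regular pattern: five iterations over three scales, where each update receives the output of the previous update as its message source. The sketch below only reproduces that call schedule as plain tuples, without touching Caffe. It assumes that each call writes its result to a blob named updated_f{scale}_mf{iteration}, which is inferred from the names used above. The helper mean_field_schedule is hypothetical and not part of the repository.

```python
def mean_field_schedule(num_iters=5,
                        targets=("res3d_dec", "res4f_dec", "res5c_dec")):
    """Return (target, source, scale_index, iteration) for every update.

    The first update takes the deepest feature map as its source; every
    later update takes the blob produced by the update just before it.
    """
    schedule = []
    source = targets[-1]  # res5c_dec feeds the very first update
    for iteration in range(1, num_iters + 1):
        for scale, target in enumerate(targets, start=1):
            schedule.append((target, source, scale, iteration))
            # Assumed output blob name of the update that was just scheduled.
            source = "updated_f{}_mf{}".format(scale, iteration)
    return schedule


if __name__ == "__main__":
    for target, source, scale, iteration in mean_field_schedule():
        print("MeanFieldUpdate(n, n.{}, n.{}, {}, {}, feat_num)".format(
            target, source, scale, iteration))
```

Running the sketch prints the same fifteen call lines, which makes it easier to change the number of mean-field iterations in one place.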

Testing is fast, running at around 8 fps (near real time), which is significantly faster than previous graphical-model-based approaches to single-image depth estimation. The test results on KITTI, using both the Eigen and the Garg crops, are shown in the table below. The table and the figure below show the quantitative and the qualitative results, respectively. These results are not exactly the same as those reported in our paper, because we have further improved the accuracy.
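
For readers unfamiliar with the two crops, the sketch below computes the evaluation region for a given image size. The fractional bounds are the values commonly used in public KITTI depth evaluation scripts; that test.sh in this repository uses exactly these values is an assumption, and the helper crop_box is hypothetical.

```python
# Fractional crop bounds as (top, bottom, left, right).
# Assumption: values from widely used public KITTI evaluation code,
# not read from this repository.
CROPS = {
    "garg": (0.40810811, 0.99189189, 0.03594771, 0.96405229),
    "eigen": (0.3324324, 0.91351351, 0.0359477, 0.96405229),
}


def crop_box(height, width, crop="garg"):
    """Return integer pixel bounds (top, bottom, left, right) of the crop."""
    top, bottom, left, right = CROPS[crop]
    return (int(top * height), int(bottom * height),
            int(left * width), int(right * width))


if __name__ == "__main__":
    # 375 x 1242 is a typical KITTI image size.
    print(crop_box(375, 1242, "garg"))
    print(crop_box(375, 1242, "eigen"))
```

Only ground-truth pixels inside the returned box would contribute to the error metrics, so the two crops generally yield slightly different numbers for the same predictions.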

The produced visualization results can be downloaded from here.

Training

To retrain the model, first download the ResNet-50 model pretrained on ImageNet, put it under the folder ./StructuredAttentionDepthEstimation/models/pretrained_model, and rename it to ResNet-50-pratrained-model.caffemodel. It is used to initialize our backbone network. To train the whole model, run:

cd ./StructuredAttentionDepthEstimation/prototxt
python gen_train_prototxt.py ### generate a network definition for the training network 
python train.py

Training supports multi-GPU speedup. To change the overall batch size, modify iter_size in ./prototxt/solver.prototxt, batch_size in gen_train_prototxt.py, and the number of GPUs in train.py.
Overall batch size = number of GPUs * batch_size * iter_size.
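
As a quick check of the formula above, the following sketch computes the overall batch size from the three settings. The function name effective_batch_size and the example values are illustrative only and do not come from the repository.

```python
def effective_batch_size(num_gpus, batch_size, iter_size):
    """Overall batch size = number of GPUs * batch_size * iter_size."""
    for name, value in (("num_gpus", num_gpus),
                        ("batch_size", batch_size),
                        ("iter_size", iter_size)):
        if not isinstance(value, int) or value < 1:
            raise ValueError("{} must be a positive integer".format(name))
    return num_gpus * batch_size * iter_size


if __name__ == "__main__":
    # Illustrative values: 2 GPUs, 4 images per GPU, gradients accumulated
    # over 2 iterations before each weight update.
    print(effective_batch_size(num_gpus=2, batch_size=4, iter_size=2))
```

If GPU memory is limited, lowering batch_size and raising iter_size by the same factor keeps the overall batch size unchanged.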

PyTorch Implementation

A PyTorch implementation of our model can be found here:
https://github.com/dontLoveBugs/StructuredAttentionDepthEstimation_pytorch

Citation

Please consider citing the following paper if the code is helpful in your research work:

@inproceedings{xu2018structured,
  title={Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation},
  author={Xu, Dan and Wang, Wei and Tang, Hao and Liu, Hong and Sebe, Nicu and Ricci, Elisa},
  booktitle={CVPR},
  year={2018}
}