Recurrent Scene Parsing with Perspective Understanding in the Loop

Objects may appear at arbitrary scales in perspective images of a scene, posing a challenge for recognition systems that process an image at a fixed resolution. We propose a depth-aware gating module that adaptively chooses the pooling field size in a convolutional network architecture according to the object scale (inversely proportional to the depth) so that small details can be preserved for objects at distance and a larger receptive field can be used for objects nearer to the camera. The depth gating signal is provided from stereo disparity (when available) or estimated directly from a single image. We integrate this depth-aware gating into a recurrent convolutional neural network trained in an end-to-end fashion to perform semantic segmentation. Our recurrent module iteratively refines the segmentation results, leveraging the depth estimate and output prediction from the previous loop. Through extensive experiments on three popular large-scale RGB-D datasets, we demonstrate our approach achieves competitive semantic segmentation performance using more compact model than existing methods. Interestingly, we find segmentation performance improves when we estimate depth directly from the image rather than using "ground-truth" and the model produces state-of-the-art results for quantitative depth estimation from a single image.

For details, please refer to our project page

To download our models, please go google drive and put the models in directory 'models'.

MatConvNet is used in our project, some functions are changed. So it might be required to re-compile. Useful commands are --

LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:local matlab 

path_to_matconvnet = '../matconvnet';
run(fullfile(path_to_matconvnet, 'matlab', 'vl_setupnn'));
addpath(fullfile(path_to_matconvnet, 'matlab'));
vl_compilenn('enableGpu', true, ...
               'cudaRoot', '/usr/local/cuda-7.5', ...
               'cudaMethod', 'nvcc', ...
               'enableCudnn', true, ...
               'cudnnRoot', '/usr/local/cuda-7.5/cudnn-v5') ;

If you find the code useful, please cite our work

@article{kong2017depthsegRNN,
  title={Recurrent Scene Parsing with Perspective Understanding in the Loop},
  author={Kong, Shu and Fowlkes, Charless},
  journal={arXiv preprint arXiv:1705.07238},
  year={2017}
}

05/24/2017 Shu Kong @ UCI

Name		Name	Last commit message	Last commit date
Latest commit History 54 Commits
depthSegRNN_Cityscapes		depthSegRNN_Cityscapes
depthSegRNN_NYUv2		depthSegRNN_NYUv2
depthSegRNN_SUNRGBD		depthSegRNN_SUNRGBD
libs		libs
matconvnet		matconvnet
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Recurrent Scene Parsing with Perspective Understanding in the Loop

About

Releases

Packages

Languages

aimerykong/Recurrent-Scene-Parsing-with-Perspective-Understanding-in-the-loop

Folders and files

Latest commit

History

Repository files navigation

Recurrent Scene Parsing with Perspective Understanding in the Loop

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages