MR-CNNs for Large-Scale Scene Recognition

Multi-Resolution CNNs for Large-Scale Scene Recognition

Here we provide the code and models for the following paper (arXiv preprint):

Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs
Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, and Yu Qiao 
in IEEE Transactions on Image Processing, 2017


  • February 21st, 2017
    • Release the code and models
  • January 3rd, 2017
    • Initialize the repo


We have made two efforts to exploit CNNs for large-scale scene recognition:

  • We design a modular framework that captures multi-level visual information for scene understanding by training CNNs at different resolutions.
  • We propose a knowledge disambiguation strategy that uses soft labels from extra networks to deal with the label ambiguity issue in scene recognition.
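
The knowledge disambiguation idea can be sketched as a two-term objective: the usual cross-entropy on the hard scene label, plus a KL-divergence term that pulls a second output head toward the soft labels produced by an extra network. This is a minimal illustration in plain Python, not the released Caffe code. The two-head layout, the function names, and the weight `lam` are assumptions made for the example; the exact formulation in the paper may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy of a predicted distribution against one hard label."""
    return -math.log(probs[label])


def kl_divergence(target, probs):
    """KL(target || probs); terms with zero target mass contribute nothing."""
    return sum(t * (math.log(t) - math.log(p))
               for t, p in zip(target, probs) if t > 0)


def knowledge_guided_loss(scene_logits, knowledge_logits,
                          hard_label, soft_label, lam=1.0):
    """Hypothetical combined loss: hard-label term + lam * soft-label term.

    scene_logits     -- scores of the scene-classification head
    knowledge_logits -- scores of the (assumed) head that predicts soft labels
    hard_label       -- ground-truth scene class index
    soft_label       -- distribution produced by the extra network
    """
    p = softmax(scene_logits)
    f = softmax(knowledge_logits)
    return cross_entropy(p, hard_label) + lam * kl_divergence(soft_label, f)


if __name__ == "__main__":
    soft = softmax([1.0, 2.0, 3.0])  # stand-in for an extra network's output
    print(knowledge_guided_loss([2.0, 0.5, 0.1], [3.0, 2.0, 1.0], 0, soft, lam=0.5))
```

When the second head reproduces the soft label exactly, the KL term vanishes and the loss reduces to plain cross-entropy, so `lam` only controls how strongly the extra network's knowledge shapes training.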

These two efforts are the core part of team "SIAT_MMLAB"'s approach to the following large-scale scene recognition challenges.

Challenge | Rank | Performance
Places2 challenge 2015 | 2nd place | 0.1736 top5-error
Places2 challenge 2016 | 4th place | 0.1042 top5-error
LSUN challenge 2015 | 2nd place | 0.9030 top1-accuracy
LSUN challenge 2016 | 1st place | 0.9161 top1-accuracy
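
For readers unfamiliar with the metrics in the table, the sketch below shows how a top-k error is typically computed: a sample counts as an error when its true label is not among the k highest-scoring classes. The function name and the toy scores are made up for illustration; the released evaluation code lives in matlab/ and is not reproduced here.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k top-scoring classes.

    scores -- list of per-sample score lists (one score per class)
    labels -- list of ground-truth class indices
    """
    wrong = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            wrong += 1
    return wrong / len(labels)


if __name__ == "__main__":
    # Toy scores for three samples over six classes (illustrative only).
    scores = [
        [0.10, 0.20, 0.30, 0.05, 0.25, 0.10],
        [0.50, 0.10, 0.10, 0.10, 0.10, 0.10],
        [0.30, 0.40, 0.10, 0.10, 0.05, 0.05],
    ]
    labels = [3, 0, 0]
    print("top-5 error:", top_k_error(scores, labels, k=5))
    print("top-1 accuracy:", 1 - top_k_error(scores, labels, k=1))
```

Top-1 accuracy, as reported for LSUN, is simply one minus the top-1 error.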

Places365 Models

We first release the learned models on the Places365 dataset.

  • Models learned at resolution of 256 * 256
Model | Top5 Error Rate
(A0) Normal BN-Inception | 0.143
(A1) Normal BN-Inception + object networks | 0.141
(A2) Normal BN-Inception + scene networks | 0.134
  • Models learned at resolution of 384 * 384
Model | Top5 Error Rate
(B0) Deeper BN-Inception | 0.140
(B1) Deeper BN-Inception + object networks | 0.136
(B2) Deeper BN-Inception + scene networks | 0.130
  • Download initialization and reference models

We release the download scripts in the scripts/ directory.

Try bash scripts/ to download knowledge models.

Try bash scripts/ to download reference models.

Testing Code

We release the testing code for the Places365 validation dataset in the matlab/ directory.

We also release demo code, in the same matlab/ directory, that uses our Places365 models as generic feature extractors and performs scene recognition on the MIT Indoor67 dataset.

Training Code

We release the models in the models/ directory and the training scripts in the scripts/ directory.

Try bash scripts/ to train standard CNNs.

Try bash scripts/ to train knowledge disambiguation networks (by object network).

Try bash scripts/ to train knowledge disambiguation networks (by scene network).

The training code is based on our modified Caffe toolbox, an efficient parallel Caffe with an MPI implementation. We also implement a new KL-divergence loss layer for our knowledge disambiguation methods.
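
To give a sense of what a KL-divergence loss layer has to compute, here is a small Python sketch of the forward and backward passes for KL(q || softmax(z)), where q is a soft label and z are the layer's input scores. It illustrates the math only. The actual layer in the modified Caffe is not shown here, and the function names below are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_forward(logits, soft_label):
    """Forward pass: KL(soft_label || softmax(logits))."""
    p = softmax(logits)
    return sum(q * (math.log(q) - math.log(pi))
               for q, pi in zip(soft_label, p) if q > 0)


def kl_backward(logits, soft_label):
    """Backward pass: gradient of the loss with respect to the logits.

    When soft_label sums to 1, the gradient simplifies to softmax(logits) - soft_label.
    """
    p = softmax(logits)
    return [pi - q for pi, q in zip(p, soft_label)]


if __name__ == "__main__":
    z = [0.3, -1.2, 2.0]
    q = softmax([1.0, 0.0, -1.0])  # stand-in soft label
    print("loss:", kl_forward(z, q))
    print("grad:", kl_backward(z, q))
```

The gradient has the same form as that of softmax cross-entropy with a one-hot target, with the soft label taking the place of the one-hot vector.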