wanglimin/MRCNN-Scene-Recognition

Multi-Resolution CNNs for Large-Scale Scene Recognition

Here we provide the code and models for the following paper (Arxiv Preprint):

Knowledge Guided Disambiguation for Large-Scale Scene Classification with Multi-Resolution CNNs
Limin Wang, Sheng Guo, Weilin Huang, Yuanjun Xiong, and Yu Qiao 
in IEEE Transactions on Image Processing, 2017

Updates

  • February 21st, 2017
    • Release the code and models
  • January 3rd, 2017
    • Initialize the repo

Overview

We have made two efforts to exploit CNNs for large-scale scene recognition:

  • We design a modular framework that captures multi-level visual information for scene understanding by training CNNs at different resolutions.
  • We propose a knowledge disambiguation strategy that uses soft labels from extra networks to deal with the label ambiguity issue of scene recognition.
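
To make the second point concrete, here is a minimal NumPy sketch of training against both a hard label and a soft label. It assumes the network has one output head for the scene classes and a second head that predicts the soft labels produced by an extra network. The function names, the two-head layout, and the `weight` parameter are illustrative assumptions, not the released implementation.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def disambiguation_loss(hard_logits, hard_labels, soft_logits, soft_labels, weight=0.5):
    """Cross-entropy on hard labels plus a weighted KL term toward soft labels.

    hard_logits: (N, C) scores for the C scene classes.
    hard_labels: (N,) integer ground-truth scene labels.
    soft_logits: (N, K) scores of the assumed second head.
    soft_labels: (N, K) probabilities from the extra network, rows sum to 1.
    weight:      hypothetical balancing factor, not a value from the paper.
    """
    eps = 1e-12
    n = hard_logits.shape[0]
    p = softmax(hard_logits)
    f = softmax(soft_logits)
    # Standard cross-entropy against the single annotated class.
    ce = -np.log(p[np.arange(n), hard_labels] + eps).mean()
    # KL(soft_labels || f): zero when the second head reproduces the soft labels.
    kl = (soft_labels * (np.log(soft_labels + eps) - np.log(f + eps))).sum(axis=1).mean()
    return ce + weight * kl
```

The soft-label term only penalizes the network when its second head disagrees with the extra network, so an ambiguous image is not forced entirely onto its single annotated class.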

These two efforts are the core of the "SIAT_MMLAB" team's entries in the following large-scale scene recognition challenges.

| Challenge | Rank | Performance |
| --- | --- | --- |
| Places2 challenge 2015 | 2nd place | 0.1736 top5-error |
| Places2 challenge 2016 | 4th place | 0.1042 top5-error |
| LSUN challenge 2015 | 2nd place | 0.9030 top1-accuracy |
| LSUN challenge 2016 | 1st place | 0.9161 top1-accuracy |
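
For readers unfamiliar with the two metrics reported here, the sketch below shows how a top-k error rate is commonly computed from a matrix of class scores. Top-1 accuracy is one minus the top-1 error. The function name `topk_error` is a hypothetical helper, not part of the released MATLAB testing code.

```python
import numpy as np

def topk_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores.

    scores: (N, C) array of class scores, with C >= k.
    labels: (N,) array of integer ground-truth labels.
    """
    # argpartition places the k largest scores in the first k columns (unordered).
    topk = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hits = (topk == labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()
```

Calling `topk_error(scores, labels, k=1)` gives the top-1 error, so `1 - topk_error(scores, labels, k=1)` is the top-1 accuracy.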

Places365 Models

We first release the learned models on the Places365 dataset.

  • Models learned at resolution of 256 × 256

| Model | Top-5 error rate |
| --- | --- |
| (A0) Normal BN-Inception | 0.143 |
| (A1) Normal BN-Inception + object networks | 0.141 |
| (A2) Normal BN-Inception + scene networks | 0.134 |

  • Models learned at resolution of 384 × 384

| Model | Top-5 error rate |
| --- | --- |
| (B0) Deeper BN-Inception | 0.140 |
| (B1) Deeper BN-Inception + object networks | 0.136 |
| (B2) Deeper BN-Inception + scene networks | 0.130 |

  • Download initialization and reference models

The download scripts are in the scripts/ directory.

Try bash scripts/get_init_models.sh to download the knowledge models.

Try bash scripts/get_reference_models.sh to download reference models.

Testing Code

The testing code for the Places365 validation dataset is in the matlab/ directory.

The same directory also contains demo code that uses our Places365 model as a generic feature extractor and performs scene recognition on the MIT Indoor67 dataset.

Training Code

The model definitions are in the models/ directory and the training scripts are in the scripts/ directory.

Try bash scripts/256_inception2_train.sh to train standard CNNs.

Try bash scripts/256_kd_object_inception2_train.sh to train knowledge disambiguation networks (by object network).

Try bash scripts/256_kd_scene_inception2_train.sh to train knowledge disambiguation networks (by scene network).

The training code is based on our modified Caffe toolbox, an efficient parallel Caffe with an MPI implementation. It also includes a new KL-divergence loss layer for our knowledge disambiguation methods:

https://github.com/yjxiong/caffe/tree/kd
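
As a rough illustration of what a KL-divergence loss layer computes, the NumPy sketch below implements a forward pass and the gradient with respect to the input logits. It assumes the layer applies a softmax to its input and that each row of `soft_labels` sums to 1. The function name and interface are hypothetical, and this is not the code of the Caffe layer in the repository linked above.

```python
import numpy as np

def kl_loss_forward_backward(logits, soft_labels):
    """Return (loss, grad) for mean KL(soft_labels || softmax(logits)).

    logits:      (N, K) raw scores from the network.
    soft_labels: (N, K) target probabilities, each row assumed to sum to 1.
    """
    eps = 1e-12
    n = logits.shape[0]
    # Log-softmax computed in a numerically stable way.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_p = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Forward pass: KL divergence averaged over the batch.
    loss = (soft_labels * (np.log(soft_labels + eps) - log_p)).sum() / n
    # Backward pass: for normalized targets the gradient is softmax minus target.
    grad = (np.exp(log_p) - soft_labels) / n
    return loss, grad
```

The gradient has the same form as that of softmax cross-entropy, with the one-hot target replaced by the soft label distribution.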

Questions

Contact
