R-FCN birth
A matlab version of R-FCN which supports both Windows and Linux.
daijifeng001 committed Jun 17, 2016
0 parents commit 750e534
Showing 82 changed files with 46,843 additions and 0 deletions.
17 changes: 17 additions & 0 deletions .gitattributes
@@ -0,0 +1,17 @@
# Auto detect text files and perform LF normalization
* text=auto

# Custom for Visual Studio
*.cs diff=csharp

# Standard to msysgit
*.doc diff=astextplain
*.DOC diff=astextplain
*.docx diff=astextplain
*.DOCX diff=astextplain
*.dot diff=astextplain
*.DOT diff=astextplain
*.pdf diff=astextplain
*.PDF diff=astextplain
*.rtf diff=astextplain
*.RTF diff=astextplain
58 changes: 58 additions & 0 deletions .gitignore
@@ -0,0 +1,58 @@
# Windows image file caches
Thumbs.db
ehthumbs.db

# Folder config file
Desktop.ini

# Recycle Bin used on file shares
$RECYCLE.BIN/

# User Ignore
models/fast_rcnn_prototxts/
models/pre_trained_model/
models/rpn_prototxts/
data/
datasets/
output/
cachedir/
imdb/cache
bin/
external/caffe/matlab
fetch_data/*.zip
*.caffemodel
*.mat

# Windows Installer files
*.cab
*.msi
*.msm
*.msp

# Windows shortcuts
*.lnk

# =========================
# Operating System Files
# =========================

# OSX
# =========================

.DS_Store
.AppleDouble
.LSOverride

# Thumbnails
._*

# Files that might appear on external disk
.Spotlight-V100
.Trashes

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
4 changes: 4 additions & 0 deletions .gitmodules
@@ -0,0 +1,4 @@
[submodule "external/caffe"]
path = external/caffe
url = https://github.com/ShaoqingRen/caffe.git
branch = faster-R-CNN
57 changes: 57 additions & 0 deletions LICENSE
@@ -0,0 +1,57 @@
Faster R-CNN

The MIT License (MIT)

Copyright (c) 2015 Microsoft Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

************************************************************************

THIRD-PARTY SOFTWARE NOTICES AND INFORMATION

This project, Faster R-CNN, incorporates material from the project(s) listed below (collectively, "Third Party Code"). Microsoft is not the original author of the Third Party Code. The original copyright notice and license under which Microsoft received such Third Party Code are set out below. This Third Party Code is licensed to you under their original license terms set forth below. Microsoft reserves all other rights not expressly granted, whether by implication, estoppel or otherwise.

1. Caffe, version 0.9, (https://github.com/BVLC/caffe/)

COPYRIGHT

All contributions by the University of California:
Copyright (c) 2014, 2015, The Regents of the University of California (Regents)
All rights reserved.

All other contributions:
Copyright (c) 2014, 2015, the respective contributors
All rights reserved.

Caffe uses a shared copyright model: each contributor holds copyright over their contributions to Caffe. The project versioning records all such contribution and copyright details. If a contributor wants to further mark their specific copyright on a particular contribution, they should indicate their copyright solely in the commit message of the change when it is committed.

The BSD 2-Clause License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

************END OF THIRD-PARTY SOFTWARE NOTICES AND INFORMATION**********


83 changes: 83 additions & 0 deletions README.md
@@ -0,0 +1,83 @@
# *R-FCN*: Object Detection via Region-based Fully Convolutional Networks

By Jifeng Dai, Yi Li, Kaiming He, Jian Sun

### Introduction

**R-FCN** is an accurate and efficient region-based object detection framework built on deep fully convolutional networks. In contrast to previous region-based detectors such as Fast/Faster R-CNN, which apply a costly per-region sub-network hundreds of times, our region-based detector is fully convolutional, with almost all computation shared on the entire image. R-FCN can naturally adopt powerful fully convolutional image classifier backbones, such as [ResNets](https://github.com/KaimingHe/deep-residual-networks), for object detection.

R-FCN was initially described in an [arXiv tech report](https://arxiv.org/abs/1605.06409).

This code has been tested on 64-bit Windows 7/8, Windows Server 2012 R2, and Ubuntu 14.04, with MATLAB 2014a.

### License

R-FCN is released under the MIT License (refer to the LICENSE file for details).

### Citing R-FCN

If you find R-FCN useful in your research, please consider citing:

@article{dai16rfcn,
Author = {Jifeng Dai and Yi Li and Kaiming He and Jian Sun},
Title = {{R-FCN}: Object Detection via Region-based Fully Convolutional Networks},
Journal = {arXiv preprint arXiv:1605.06409},
Year = {2016}
}

### Main Results
|                    | training data      | test data   | mAP   | time/img (K40) | time/img (Titan X) |
|:-------------------|:------------------:|:-----------:|:-----:|:--------------:|:------------------:|
| R-FCN, ResNet-50L  | VOC 07+12 trainval | VOC 07 test | 77.0% | 0.12sec        | 0.09sec            |
| R-FCN, ResNet-101L | VOC 07+12 trainval | VOC 07 test | 79.5% | 0.17sec        | 0.12sec            |


### Requirements: software

0. `Caffe` build for R-FCN (included in this repository, see `external/caffe`)
    - If you are using Windows, you may download a compiled mex file by running `fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m`.
    - If you are using Linux, or you want to compile for Windows yourself, please compile [our Caffe branch](https://github.com/daijifeng001/caffe-rfcn).
0. MATLAB 2014a or later


### Requirements: hardware

GPU: Titan, Titan X, K40, K80.


### Preparation:
0. Run `fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m` to download a compiled Caffe mex (for Windows only).
0. Run `fetch_data/fetch_model_ResNet50.m` to download an ImageNet-pre-trained ResNet-50L net.
0. Run `fetch_data/fetch_model_ResNet101.m` to download an ImageNet-pre-trained ResNet-101L net.
0. Run `fetch_data/fetch_region_proposals.m` to download the pre-computed region proposals.
0. Download VOC 2007 and 2012 data to ./datasets.
0. Run `rfcn_build.m`.
0. Run `startup.m`.
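
The steps above can be chained from the repository root. The following is a minimal sketch, not part of the repository: `prepare_rfcn`, `preparation_steps`, and `run_isolated` are hypothetical names, and it assumes each listed file can be executed with MATLAB's `run`. The VOC data is downloaded by hand, so the sketch only checks that the devkit folders exist.

```matlab
function prepare_rfcn()
% PREPARE_RFCN  Hypothetical helper: run the README preparation steps in order.
% Call it from the repository root. This file is not part of the repository.

    % The VOC data is downloaded manually, so only warn if it is absent.
    devkits = {'./datasets/VOCdevkit2007', './datasets/VOCdevkit2012'};
    for k = 1:numel(devkits)
        if exist(devkits{k}, 'dir') ~= 7
            warning('VOC devkit not found at %s', devkits{k});
        end
    end

    steps = preparation_steps(ispc);
    for k = 1:numel(steps)
        if exist(steps{k}, 'file') ~= 2
            error('Missing %s; call prepare_rfcn from the repository root.', steps{k});
        end
        fprintf('Running %s\n', steps{k});
        run_isolated(steps{k});
    end
end

function steps = preparation_steps(is_windows)
% Ordered list of the files named in the Preparation section.
    steps = { ...
        'fetch_data/fetch_model_ResNet50.m', ...
        'fetch_data/fetch_model_ResNet101.m', ...
        'fetch_data/fetch_region_proposals.m', ...
        'rfcn_build.m', ...
        'startup.m'};
    if is_windows
        % The pre-compiled Caffe mex is provided for Windows only.
        steps = [{'fetch_data/fetch_caffe_mex_windows_vs2013_cuda75.m'}, steps];
    end
end

function run_isolated(file)
% run() evaluates a script in its caller's workspace; calling it from this
% small function keeps script variables away from the loop in prepare_rfcn.
    run(file);
end
```

Stopping at the first missing file keeps a wrong working directory from producing a half-prepared checkout.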


### Training & Testing:
0. Run `experiments/script_rfcn_VOC0712_ResNet50_OHEM_ss.m` to train a model using ResNet-50L net with online hard example mining (OHEM), leveraging selective search proposals. The accuracy should be ~75.4% in mAP.
    - **Note**: the training time is ~13 hours on Titan X.
0. Run `experiments/script_rfcn_VOC0712_ResNet50_OHEM_rpn.m` to train a model using ResNet-50L net with OHEM, leveraging RPN proposals (using ResNet-50L net). The accuracy should be ~77.0% in mAP.
    - **Note**: the training time is ~13 hours on Titan X.
0. Run `experiments/script_rfcn_VOC0712_ResNet101_OHEM_rpn.m` to train a model using ResNet-101L net with OHEM, leveraging RPN proposals (using ResNet-101L net). The accuracy should be ~79.5% in mAP.
    - **Note**: the training time is ~19 hours on Titan X.
0. Check other scripts in `./experiments` for more settings.

**Note:** In all the experiments, training is performed on VOC 07+12 trainval, and testing is performed on VOC 07 test.

### Resources

0. Experiment logs: [DropBox](https://www.dropbox.com/s/is2gatfdxs1tcls/experiment_log.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1mhFYejI)

If the automatic "fetch_data" fails, you may manually download resources from:

0. Pre-compiled Caffe mex (Windows):
    - [DropBox](https://www.dropbox.com/s/n1x2bybd6d03s7c/caffe_mex.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1i4OlG7z)
0. ImageNet-pretrained networks:
    - ResNet-50L net [DropBox](https://www.dropbox.com/s/0uzh90f6jx9l0yf/models_ResNet-50L.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1kVm4ly3)
    - ResNet-101L net [DropBox](https://www.dropbox.com/s/ev91ss0pyd5h9ix/models_ResNet-101L.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1nvgu1pJ)
0. Pre-computed region proposals:
    - [DropBox](https://www.dropbox.com/s/gagkulgcif6k1dd/proposals.zip?dl=0), [BaiduYun](http://pan.baidu.com/s/1nv1tkH7)


3 changes: 3 additions & 0 deletions experiments/+Dataset/private/voc2007_devkit.m
@@ -0,0 +1,3 @@
function path = voc2007_devkit()
path = './datasets/VOCdevkit2007';
end
3 changes: 3 additions & 0 deletions experiments/+Dataset/private/voc2012_devkit.m
@@ -0,0 +1,3 @@
function path = voc2012_devkit()
path = './datasets/VOCdevkit2012';
end
21 changes: 21 additions & 0 deletions experiments/+Dataset/voc0712_trainval_sp.m
@@ -0,0 +1,21 @@
function dataset = voc0712_trainval_sp(dataset, usage, use_flip, extension)
% Pascal voc 0712 trainval set with *pre-computed* RPN proposals (trained with ResNet50 or ResNet101)
% extension = "resnet50" or "resnet101" for specifying pre-computed RPN proposals
% set opts.imdb_train opts.roidb_train

% change to point to your devkit install
devkit2007 = voc2007_devkit();
devkit2012 = voc2012_devkit();

switch usage
case {'train'}
dataset.imdb_train = { imdb_from_voc(devkit2007, 'trainval', '2007', use_flip), ...
imdb_from_voc(devkit2012, 'trainval', '2012', use_flip)};
dataset.roidb_train = cellfun(@(x) x.roidb_func(x, 'with_self_proposal', true, 'extension', extension), dataset.imdb_train, 'UniformOutput', false);
case {'test'}
error('only supports one source test currently');
otherwise
error('usage = ''train'' or ''test''');
end

end
21 changes: 21 additions & 0 deletions experiments/+Dataset/voc0712_trainval_ss.m
@@ -0,0 +1,21 @@
function dataset = voc0712_trainval_ss(dataset, usage, use_flip)
% Pascal voc 0712 trainval set with selective search
% set opts.imdb_train opts.roidb_train
% or set opts.imdb_test opts.roidb_test

% change to point to your devkit install
devkit2007 = voc2007_devkit();
devkit2012 = voc2012_devkit();

switch usage
case {'train'}
dataset.imdb_train = { imdb_from_voc(devkit2007, 'trainval', '2007', use_flip), ...
imdb_from_voc(devkit2012, 'trainval', '2012', use_flip)};
dataset.roidb_train = cellfun(@(x) x.roidb_func(x, 'with_selective_search', true), dataset.imdb_train, 'UniformOutput', false);
case {'test'}
error('only supports one source test currently');
otherwise
error('usage = ''train'' or ''test''');
end

end
21 changes: 21 additions & 0 deletions experiments/+Dataset/voc2007_test_sp.m
@@ -0,0 +1,21 @@
function dataset = voc2007_test_sp(dataset, usage, use_flip, extension)
% Pascal voc 2007 test set with *pre-computed* RPN proposals (trained with ResNet50 or ResNet101)
% extension = "resnet50" or "resnet101" for specifying pre-computed RPN proposals
% set opts.imdb_train opts.roidb_train
% or set opts.imdb_test opts.roidb_test


% change to point to your devkit install
devkit = voc2007_devkit();

switch usage
case {'train'}
dataset.imdb_train = { imdb_from_voc(devkit, 'test', '2007', use_flip) };
dataset.roidb_train = cellfun(@(x) x.roidb_func(x, 'with_self_proposal', true, 'extension', extension), dataset.imdb_train, 'UniformOutput', false);
case {'test'}
dataset.imdb_test = imdb_from_voc(devkit, 'test', '2007', use_flip);
dataset.roidb_test = dataset.imdb_test.roidb_func(dataset.imdb_test, 'with_self_proposal', true, 'extension', extension);
otherwise
error('usage = ''train'' or ''test''');
end

end
20 changes: 20 additions & 0 deletions experiments/+Dataset/voc2007_test_ss.m
@@ -0,0 +1,20 @@
function dataset = voc2007_test_ss(dataset, usage, use_flip)
% Pascal voc 2007 test set with selective search
% set opts.imdb_train opts.roidb_train
% or set opts.imdb_test opts.roidb_test

% change to point to your devkit install
devkit = voc2007_devkit();

switch usage
case {'train'}
dataset.imdb_train = { imdb_from_voc(devkit, 'test', '2007', use_flip) };
dataset.roidb_train = cellfun(@(x) x.roidb_func(x, 'with_selective_search', true), dataset.imdb_train, 'UniformOutput', false);
case {'test'}
dataset.imdb_test = imdb_from_voc(devkit, 'test', '2007', use_flip) ;
dataset.roidb_test = dataset.imdb_test.roidb_func(dataset.imdb_test, 'with_selective_search', true);
otherwise
error('usage = ''train'' or ''test''');
end

end
10 changes: 10 additions & 0 deletions experiments/+Model/ResNet101_for_RFCN_VOC0712.m
@@ -0,0 +1,10 @@
function model = ResNet101_for_RFCN_VOC0712(model)
% ResNet 101layers (finetuned from res3a)

model.solver_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-101L_res3a', 'solver_80k110k_lr1_3.prototxt');
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-101L_res3a', 'test.prototxt');

model.net_file = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-101L', 'ResNet-101-model.caffemodel');
model.mean_image = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-101L', 'mean_image');

end
10 changes: 10 additions & 0 deletions experiments/+Model/ResNet101_for_RFCN_VOC0712_OHEM.m
@@ -0,0 +1,10 @@
function model = ResNet101_for_RFCN_VOC0712_OHEM(model)
% ResNet 101layers with OHEM training (finetuned from res3a)

model.solver_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-101L_OHEM_res3a', 'solver_80k110k_lr1_3.prototxt');
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-101L_OHEM_res3a', 'test.prototxt');

model.net_file = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-101L', 'ResNet-101-model.caffemodel');
model.mean_image = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-101L', 'mean_image');

end
10 changes: 10 additions & 0 deletions experiments/+Model/ResNet50_for_RFCN_VOC0712.m
@@ -0,0 +1,10 @@
function model = ResNet50_for_RFCN_VOC0712(model)
% ResNet 50layers (finetuned from res3a)

model.solver_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-50L_res3a', 'solver_80k110k_lr1_3.prototxt');
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-50L_res3a', 'test.prototxt');

model.net_file = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-50L', 'ResNet-50-model.caffemodel');
model.mean_image = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-50L', 'mean_image');

end
10 changes: 10 additions & 0 deletions experiments/+Model/ResNet50_for_RFCN_VOC0712_OHEM.m
@@ -0,0 +1,10 @@
function model = ResNet50_for_RFCN_VOC0712_OHEM(model)
% ResNet 50layers with OHEM training (finetuned from res3a)

model.solver_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-50L_OHEM_res3a', 'solver_80k110k_lr1_3.prototxt');
model.test_net_def_file = fullfile(pwd, 'models', 'rfcn_prototxts', 'ResNet-50L_OHEM_res3a', 'test.prototxt');

model.net_file = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-50L', 'ResNet-50-model.caffemodel');
model.mean_image = fullfile(pwd, 'models', 'pre_trained_models', 'ResNet-50L', 'mean_image');

end
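
The four `+Model` files above differ only in the network depth and the `_OHEM` suffix of the prototxt folder. As a sketch of that shared pattern (the helper `rfcn_model_paths` and its `root` argument are hypothetical and not part of this commit), the same path fields could be built by one function:

```matlab
function model = rfcn_model_paths(depth, use_ohem, root)
% RFCN_MODEL_PATHS  Hypothetical helper mirroring the +Model files above.
%   depth    : 50 or 101 (the only depths present in this commit)
%   use_ohem : true to select the *_OHEM_res3a prototxt folder
%   root     : repository root (the +Model files use pwd)

    assert(ismember(depth, [50 101]), 'depth must be 50 or 101');

    name = sprintf('ResNet-%dL', depth);          % e.g. 'ResNet-50L'
    if use_ohem
        variant = [name '_OHEM_res3a'];
    else
        variant = [name '_res3a'];
    end

    proto_dir = fullfile(root, 'models', 'rfcn_prototxts', variant);
    pre_dir   = fullfile(root, 'models', 'pre_trained_models', name);

    % Same four fields that each +Model function sets.
    model = struct();
    model.solver_def_file   = fullfile(proto_dir, 'solver_80k110k_lr1_3.prototxt');
    model.test_net_def_file = fullfile(proto_dir, 'test.prototxt');
    model.net_file          = fullfile(pre_dir, sprintf('ResNet-%d-model.caffemodel', depth));
    model.mean_image        = fullfile(pre_dir, 'mean_image');
end
```

For example, `rfcn_model_paths(101, false, pwd)` yields the same four path fields that `ResNet101_for_RFCN_VOC0712` sets, although it returns a fresh struct instead of extending an existing `model`.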
