Commit
0 parents
commit 5979f51
Showing
80 changed files
with
29,918 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
Faster R-CNN | ||
|
||
The MIT License (MIT) | ||
|
||
Copyright (c) 2015 Microsoft Corporation | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
|
||
************************************************************************ | ||
|
||
THIRD-PARTY SOFTWARE NOTICES AND INFORMATION | ||
|
||
This project, Faster R-CNN, incorporates material from the project(s) listed below (collectively, "Third Party Code"). Microsoft is not the original author of the Third Party Code. The original copyright notice and license under which Microsoft received such Third Party Code are set out below. This Third Party Code is licensed to you under their original license terms set forth below. Microsoft reserves all other rights not expressly granted, whether by implication, estoppel or otherwise. | ||
|
||
1. Caffe, version 0.9, (https://github.com/BVLC/caffe/) | ||
|
||
COPYRIGHT | ||
|
||
All contributions by the University of California: | ||
Copyright (c) 2014, 2015, The Regents of the University of California (Regents) | ||
All rights reserved. | ||
|
||
All other contributions: | ||
Copyright (c) 2014, 2015, the respective contributors | ||
All rights reserved. | ||
|
||
Caffe uses a shared copyright model: each contributor holds copyright over their contributions to Caffe. The project versioning records all such contribution and copyright details. If a contributor wants to further mark their specific copyright on a particular contribution, they should indicate their copyright solely in the commit message of the change when it is committed. | ||
|
||
The BSD 2-Clause License | ||
|
||
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: | ||
|
||
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. | ||
|
||
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
|
||
************END OF THIRD-PARTY SOFTWARE NOTICES AND INFORMATION********** | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,155 @@ | ||
# Instance-aware Semantic Segmentation via Multi-task Network Cascades | ||
|
||
By Jifeng Dai, Kaiming He, Jian Sun | ||
|
||
This Python version was implemented by [Haozhi Qi](https://github.com/Oh233) during his internship at Microsoft Research. | ||
|
||
### Introduction | ||
|
||
MNC is an instance-aware semantic segmentation system based on deep convolutional networks. It won first place in the COCO 2015 segmentation challenge and runs at a fraction of a second per image at test time. We decompose the task of instance-aware semantic segmentation into related sub-tasks, which are solved by multi-task network cascades (MNC) with shared features. The entire MNC network is trained end-to-end, with error gradients propagated across the cascaded stages. | ||
|
||
|
||
<img src='data/readme_img/example.png' width='800'> | ||
|
||
|
||
MNC was initially described in a [CVPR 2016 oral paper](http://arxiv.org/abs/1512.04412). | ||
|
||
This repository contains a Python implementation of MNC, which is ~10% slower than the original MATLAB implementation. | ||
|
||
This repository includes a bilinear RoI warping layer, which enables gradient back-propagation with respect to RoI coordinates. | ||
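
To illustrate why bilinear sampling makes gradients with respect to coordinates possible, here is a minimal pure-Python sketch. It is not the repository's Caffe layer: the function name `bilinear_sample` and the list-of-lists feature map are assumptions made for illustration. The sampled value is a smooth function of the real-valued position `(x, y)`, so its partial derivatives with respect to `x` and `y` exist. In a warping layer the sampling positions would themselves be computed from the RoI box coordinates, so the chain rule could carry these derivatives back to the box.

```python
import math


def bilinear_sample(feat, x, y):
    """Sample a 2-D feature map (list of rows, at least 2x2) at a real-valued
    position (x, y). Returns the interpolated value and its partial
    derivatives with respect to x and y."""
    height, width = len(feat), len(feat[0])
    # Top-left corner of the interpolation cell, clamped to stay inside the map.
    x0 = min(max(int(math.floor(x)), 0), width - 2)
    y0 = min(max(int(math.floor(y)), 0), height - 2)
    x1, y1 = x0 + 1, y0 + 1
    dx, dy = x - x0, y - y0

    f00, f01 = feat[y0][x0], feat[y0][x1]
    f10, f11 = feat[y1][x0], feat[y1][x1]

    # Weighted average of the four neighbouring values.
    value = (f00 * (1 - dx) * (1 - dy) + f01 * dx * (1 - dy)
             + f10 * (1 - dx) * dy + f11 * dx * dy)
    # Analytic derivatives of the expression above.
    dvalue_dx = (f01 - f00) * (1 - dy) + (f11 - f10) * dy
    dvalue_dy = (f10 - f00) * (1 - dx) + (f11 - f01) * dx
    return value, dvalue_dx, dvalue_dy
```

For example, calling `bilinear_sample([[0.0, 1.0], [2.0, 3.0]], 0.5, 0.5)` samples the centre of a 2x2 map. By contrast, a pooling scheme that snaps coordinates to integer bins is piecewise constant in `x` and `y`, so its gradient with respect to the coordinates is zero almost everywhere.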
|
||
### Misc. | ||
|
||
This code has been tested on Linux (Ubuntu 14.04), using K40/Titan X GPUs. | ||
|
||
The code is built based on [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn). | ||
|
||
MNC is released under the MIT License (refer to the LICENSE file for details). | ||
|
||
|
||
### Citing MNC | ||
|
||
If you find MNC useful in your research, please consider citing: | ||
|
||
@inproceedings{dai2015instance, | ||
title={Instance-aware Semantic Segmentation via Multi-task Network Cascades}, | ||
author={Dai, Jifeng and He, Kaiming and Sun, Jian}, | ||
booktitle={CVPR}, | ||
year={2016} | ||
} | ||
|
||
|
||
### Installation guide | ||
|
||
1. Clone the MNC repository: | ||
```Shell | ||
# Make sure to clone with --recursive | ||
git clone --recursive https://github.com/daijifeng001/MNC.git | ||
``` | ||
|
||
2. Install Python packages: `numpy`, `scipy`, `cython`, `python-opencv`, `easydict`, `yaml`. | ||
|
||
3. Build the Cython modules and the `gpu_nms` and `gpu_mask_voting` modules: | ||
```Shell | ||
cd $MNC_ROOT/lib | ||
make | ||
``` | ||
|
||
4. Install `Caffe` and `pycaffe` dependencies (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html) for official installation guide) | ||
|
||
**Note:** Caffe *must* be built with support for Python layers! | ||
|
||
```make | ||
# In your Makefile.config, make sure to have this line uncommented | ||
WITH_PYTHON_LAYER := 1 | ||
# CUDNN is recommended in building to reduce memory footprint | ||
USE_CUDNN := 1 | ||
``` | ||
|
||
5. Build Caffe and pycaffe: | ||
```Shell | ||
cd $MNC_ROOT/caffe-mnc | ||
# If you have all of the requirements installed | ||
# and your Makefile.config in place, then simply do: | ||
make -j8 && make pycaffe | ||
``` | ||
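
Before running the build, it can help to check the prerequisites from the steps above. The following is a hedged sketch, not part of the repository: the helper names are hypothetical, the import names (`Cython`, `cv2`, `yaml`) are assumed to correspond to the packages listed in step 2, and the code assumes a Python 3.4+ interpreter for `importlib.util.find_spec`.

```python
import importlib.util
import re

# Import names assumed to match the packages listed in step 2.
REQUIRED_MODULES = ["numpy", "scipy", "Cython", "cv2", "easydict", "yaml"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_layer_enabled(makefile_config_text):
    """True if the text has an uncommented 'WITH_PYTHON_LAYER := 1' line."""
    pattern = re.compile(r"^[ \t]*WITH_PYTHON_LAYER[ \t]*:=[ \t]*1\b",
                         re.MULTILINE)
    return pattern.search(makefile_config_text) is not None
```

A possible use is to print `missing_modules()` and to pass the contents of your `Makefile.config` to `python_layer_enabled` before running `make`. A commented-out line such as `# WITH_PYTHON_LAYER := 1` is deliberately not counted as enabled.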
|
||
### Demo | ||
|
||
First, download the trained MNC model. | ||
```Shell | ||
./data/scripts/fetch_mnc_model.sh | ||
``` | ||
|
||
Run the demo: | ||
```Shell | ||
cd $MNC_ROOT | ||
./tools/demo.py | ||
``` | ||
The resulting demo images are saved to ```data/demo/```. | ||
|
||
The demo performs instance-aware semantic segmentation with a trained MNC model (using VGG-16 net). The model is pre-trained on ImageNet and fine-tuned on the VOC 2012 train set with additional annotations from [SBD](http://www.cs.berkeley.edu/~bharath2/codes/SBD/download.html). The mAP^r of the model is 65.0% on the VOC 2012 validation set. The test speed per image is ~0.33sec on Titan X and ~0.42sec on K40. | ||
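
As background on the metric: mAP^r is a mean average precision in which a predicted instance is matched to the ground truth by the overlap of their masks (regions) rather than of their bounding boxes. The sketch below shows only that mask-level intersection-over-union, with a hypothetical function name and masks given as nested lists of 0/1 values. It is not the repository's evaluation code, and the IoU threshold used for the numbers reported here is not stated in this README.

```python
def mask_iou(mask_a, mask_b):
    """Intersection-over-union of two binary masks of equal shape,
    given as lists of rows containing 0/1 (or False/True) values."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks have no overlap to speak of; report 0.0.
    return float(intersection) / union if union else 0.0
```

Two masks with identical bounding boxes can still have a low mask IoU, which is why a mask-based metric is stricter for instance segmentation than a box-based one.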
|
||
### Training | ||
|
||
This repository contains code to train MNC **end-to-end** for instance-aware semantic segmentation, with gradients propagated across the cascaded stages during training. | ||
|
||
#### Preparation: | ||
|
||
0. Run `./data/scripts/fetch_imagenet_models.sh` to download the ImageNet pre-trained VGG-16 net. | ||
0. Download the VOC 2007 dataset to ./data/VOCdevkit2007 | ||
0. Run `./data/scripts/fetch_sbd_data.sh` to download the additional segmentation annotations in [SBD](http://www.cs.berkeley.edu/~bharath2/codes/SBD/download.html) to ./data/VOCdevkitSDS. | ||
|
||
#### 1. End-to-end training of Faster-RCNN for object detection | ||
|
||
Faster-RCNN can be viewed as a 2-stage cascade composed of a region proposal network (RPN) and an object detection network. We first present end-to-end training results on this relatively simple network cascade. | ||
|
||
Run script `experiments/scripts/faster_rcnn_end2end.sh` to train a Faster-RCNN model on VOC 2007 trainval. Final mAP^b should be ~69.1% on VOC 2007 test. | ||
|
||
```Shell | ||
cd $MNC_ROOT | ||
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] VGG16 [--set ...] | ||
# GPU_ID is the GPU you want to train on | ||
# --set ... allows you to specify fast_rcnn.config options, e.g. | ||
# --set EXP_DIR seed_rng1701 RNG_SEED 1701 | ||
``` | ||
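
The `--set` option takes a flat list of `KEY VALUE` pairs. The sketch below shows one way such pairs could be applied to a nested configuration dictionary. It is a simplified illustration with a hypothetical function name, not the repository's actual config code, and the dotted-key convention for nested sections is an assumption.

```python
import ast


def apply_overrides(config, tokens):
    """Apply a flat [KEY, VALUE, KEY, VALUE, ...] list to a nested dict.
    Dotted keys such as 'TRAIN.BATCH_SIZE' address nested sections."""
    if len(tokens) % 2 != 0:
        raise ValueError("expected KEY VALUE pairs, got an odd number of tokens")
    for key, raw in zip(tokens[0::2], tokens[1::2]):
        parts = key.split(".")
        node = config
        for name in parts[:-1]:
            node = node[name]
        leaf = parts[-1]
        if leaf not in node:
            raise KeyError(key)
        try:
            # Interpret numbers, booleans, lists and similar literals.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything else is kept as a plain string.
            value = raw
        node[leaf] = value
    return config
```

Because the shell splits arguments on whitespace, a value containing a space arrives as two tokens and breaks the pairing, so values such as an experiment directory name should be written as a single token.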
|
||
#### 2. End-to-end training of MNC for instance-aware semantic segmentation | ||
|
||
To end-to-end train a 5-stage MNC model (on VOC 2012 train), use `experiments/scripts/mnc_5stage.sh`. Final mAP^r should be ~65.0% on VOC 2012 validation. | ||
|
||
```Shell | ||
cd $MNC_ROOT | ||
./experiments/scripts/mnc_5stage.sh [GPU_ID] VGG16 [--set ...] | ||
# GPU_ID is the GPU you want to train on | ||
# --set ... allows you to specify fast_rcnn.config options, e.g. | ||
# --set EXP_DIR seed_rng1701 RNG_SEED 1701 | ||
``` | ||
|
||
#### 3. Training of CFM for instance-aware semantic segmentation | ||
|
||
The code also includes an entry point to train a convolutional feature masking (CFM) model for instance-aware semantic segmentation. | ||
|
||
##### 3.1. Download pre-computed MCG proposals | ||
|
||
Download and process the pre-computed MCG proposals. | ||
|
||
```Shell | ||
cd $MNC_ROOT | ||
./data/scripts/fetch_mcg_data.sh | ||
python ./tools/prepare_mcg_maskdb.py --para_job 24 --db train --output data/cache/voc_2012_train_mcg_maskdb/ | ||
python ./tools/prepare_mcg_maskdb.py --para_job 24 --db val --output data/cache/voc_2012_val_mcg_maskdb/ | ||
``` | ||
The resulting proposals are stored in the ```data/MCG/``` folder. | ||
|
||
##### 3.2. Train the model | ||
|
||
Run `experiments/scripts/cfm.sh` to train on VOC 2012 train set. Final mAP^r should be ~60.5% on VOC 2012 validation. | ||
|
||
```Shell | ||
cd $MNC_ROOT | ||
./experiments/scripts/cfm.sh [GPU_ID] VGG16 [--set ...] | ||
# GPU_ID is the GPU you want to train on | ||
# --set ... allows you to specify fast_rcnn.config options, e.g. | ||
# --set EXP_DIR seed_rng1701 RNG_SEED 1701 | ||
``` |