MNC birth
HaozhiQi committed Jun 22, 2016
0 parents commit 5979f51
Showing 80 changed files with 29,918 additions and 0 deletions.
57 changes: 57 additions & 0 deletions LICENSE
@@ -0,0 +1,57 @@
Faster R-CNN

The MIT License (MIT)

Copyright (c) 2015 Microsoft Corporation

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

************************************************************************

THIRD-PARTY SOFTWARE NOTICES AND INFORMATION

This project, Faster R-CNN, incorporates material from the project(s) listed below (collectively, "Third Party Code"). Microsoft is not the original author of the Third Party Code. The original copyright notice and license under which Microsoft received such Third Party Code are set out below. This Third Party Code is licensed to you under their original license terms set forth below. Microsoft reserves all other rights not expressly granted, whether by implication, estoppel or otherwise.

1. Caffe, version 0.9, (https://github.com/BVLC/caffe/)

COPYRIGHT

All contributions by the University of California:
Copyright (c) 2014, 2015, The Regents of the University of California (Regents)
All rights reserved.

All other contributions:
Copyright (c) 2014, 2015, the respective contributors
All rights reserved.

Caffe uses a shared copyright model: each contributor holds copyright over their contributions to Caffe. The project versioning records all such contribution and copyright details. If a contributor wants to further mark their specific copyright on a particular contribution, they should indicate their copyright solely in the commit message of the change when it is committed.

The BSD 2-Clause License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

************END OF THIRD-PARTY SOFTWARE NOTICES AND INFORMATION**********


155 changes: 155 additions & 0 deletions README.md
@@ -0,0 +1,155 @@
# Instance-aware Semantic Segmentation via Multi-task Network Cascades

By Jifeng Dai, Kaiming He, Jian Sun

This Python version was implemented by [Haozhi Qi](https://github.com/Oh233) during his internship at Microsoft Research.

### Introduction

MNC is an instance-aware semantic segmentation system based on deep convolutional networks. It won first place in the COCO 2015 segmentation challenge and runs at a fraction of a second per image at test time. We decompose the task of instance-aware semantic segmentation into related sub-tasks, which are solved by multi-task network cascades (MNC) with shared features. The entire MNC network is trained end-to-end, with error gradients propagated across the cascaded stages.


<img src='data/readme_img/example.png' width='800'>


MNC was initially described in a [CVPR 2016 oral paper](http://arxiv.org/abs/1512.04412).

This repository contains a Python implementation of MNC, which is ~10% slower than the original MATLAB implementation.

This repository includes a bilinear RoI warping layer, which enables gradient back-propagation with respect to RoI coordinates.
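
To illustrate why bilinear sampling makes coordinates differentiable, here is a minimal pure-Python sketch. It is not the repository's Caffe layer; the function names `bilinear_sample` and `bilinear_grad` are hypothetical, and the feature map is assumed to be a 2-D grid (at least 2x2) indexed as `feat[y][x]`. Because the sampled value is a smooth function of the real-valued location, its partial derivatives with respect to `x` and `y` exist, which is what lets gradients flow back to RoI coordinates.

```python
import math


def _corners(feat, x, y):
    """Return the top-left corner (x0, y0) of the 2x2 cell containing (x, y)
    and the fractional offsets (ax, ay) inside that cell."""
    h, w = len(feat), len(feat[0])
    x0 = min(max(int(math.floor(x)), 0), w - 2)
    y0 = min(max(int(math.floor(y)), 0), h - 2)
    return x0, y0, x - x0, y - y0


def bilinear_sample(feat, x, y):
    """Interpolate the feature map at a real-valued location (x, y)."""
    x0, y0, ax, ay = _corners(feat, x, y)
    top = (1 - ax) * feat[y0][x0] + ax * feat[y0][x0 + 1]
    bottom = (1 - ax) * feat[y0 + 1][x0] + ax * feat[y0 + 1][x0 + 1]
    return (1 - ay) * top + ay * bottom


def bilinear_grad(feat, x, y):
    """Partial derivatives of the sampled value with respect to x and y."""
    x0, y0, ax, ay = _corners(feat, x, y)
    f00, f01 = feat[y0][x0], feat[y0][x0 + 1]
    f10, f11 = feat[y0 + 1][x0], feat[y0 + 1][x0 + 1]
    d_dx = (1 - ay) * (f01 - f00) + ay * (f11 - f10)
    d_dy = (1 - ax) * (f10 - f00) + ax * (f11 - f01)
    return d_dx, d_dy
```

On a map whose values are linear in the coordinates (for example `feat[y][x] = 4 * y + x`), the interpolation is exact and the gradient equals the slopes of that linear function.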

### Misc.

This code has been tested on Linux (Ubuntu 14.04) with K40 and Titan X GPUs.

The code is built on [py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn).

MNC is released under the MIT License (refer to the LICENSE file for details).


### Citing MNC

If you find MNC useful in your research, please consider citing:

    @inproceedings{dai2015instance,
        title={Instance-aware Semantic Segmentation via Multi-task Network Cascades},
        author={Dai, Jifeng and He, Kaiming and Sun, Jian},
        booktitle={CVPR},
        year={2016}
    }


### Installation guide

1. Clone the MNC repository:
```Shell
# Make sure to clone with --recursive
git clone --recursive https://github.com/daijifeng001/MNC.git
```

2. Install Python packages: `numpy`, `scipy`, `cython`, `python-opencv`, `easydict`, `yaml`.

3. Build the Cython modules and the `gpu_nms` and `gpu_mask_voting` modules:
```Shell
cd $MNC_ROOT/lib
make
```

4. Install `Caffe` and `pycaffe` dependencies (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html) for official installation guide)

**Note:** Caffe *must* be built with support for Python layers!

```make
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Building with cuDNN is recommended to reduce the memory footprint
USE_CUDNN := 1
```

5. Build Caffe and pycaffe:
```Shell
cd $MNC_ROOT/caffe-mnc
# If you have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
```
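
Before building, it can help to confirm that the Python packages from step 2 are importable. The following is a small sketch, not part of the repository. The helper `missing_modules` is hypothetical, and the module names are assumptions about what each package installs (`cython` provides the module `Cython`, `python-opencv` provides `cv2`).

```python
import importlib

# Assumed importable module names for the packages listed in step 2.
REQUIRED_MODULES = ["numpy", "scipy", "Cython", "cv2", "easydict", "yaml"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules: " + ", ".join(missing))
    else:
        print("All required Python modules were found.")
```

Run it with the same interpreter that will be used for training, since a package installed for a different Python version will still be reported as missing.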

### Demo

First, download the trained MNC model.
```Shell
./data/scripts/fetch_mnc_model.sh
```

Run the demo:
```Shell
cd $MNC_ROOT
./tools/demo.py
```
The resulting demo images are saved to ```data/demo/```.

The demo performs instance-aware semantic segmentation with a trained MNC model (using the VGG-16 net). The model is pre-trained on ImageNet and fine-tuned on the VOC 2012 train set with additional annotations from [SBD](http://www.cs.berkeley.edu/~bharath2/codes/SBD/download.html). The mAP^r of the model is 65.0% on the VOC 2012 validation set. The test time per image is ~0.33 sec on a Titan X and ~0.42 sec on a K40.
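
For readers unfamiliar with the metric: mAP^r is mean average precision where a predicted instance is matched to a ground-truth instance by the overlap of their masks (regions) rather than their bounding boxes. The sketch below shows only the mask overlap computation, assuming binary masks of equal shape. It is an illustration, not the repository's evaluation code, and the function name `mask_iou` is hypothetical.

```python
import numpy as np


def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks with the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Both masks are empty; define the overlap as zero.
        return 0.0
    intersection = np.logical_and(a, b).sum()
    return float(intersection) / float(union)
```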

### Training

This repository contains code to train MNC **end-to-end** for instance-aware semantic segmentation, where gradients across the cascaded stages are taken into account during training.

#### Preparation:

0. Run `./data/scripts/fetch_imagenet_models.sh` to download the ImageNet pre-trained VGG-16 net.
0. Download the VOC 2007 dataset to `./data/VOCdevkit2007`.
0. Run `./data/scripts/fetch_sbd_data.sh` to download the additional segmentation annotations in [SBD](http://www.cs.berkeley.edu/~bharath2/codes/SBD/download.html) to `./data/VOCdevkitSDS`.

#### 1. End-to-end training of Faster-RCNN for object detection

Faster-RCNN can be viewed as a 2-stage cascade composed of a region proposal network (RPN) and an object detection network. We first present end-to-end training results on this relatively simple network cascade.

Run script `experiments/scripts/faster_rcnn_end2end.sh` to train a Faster-RCNN model on VOC 2007 trainval. Final mAP^b should be ~69.1% on VOC 2007 test.

```Shell
cd $MNC_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```
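
The `--set` option takes alternating `KEY VALUE` tokens, so the number of tokens after `--set` must be even. The sketch below illustrates that pairing under the assumption that values are interpreted as Python literals where possible. It is not the repository's own parser, and `parse_set_args` is a hypothetical name.

```python
import ast


def parse_set_args(tokens):
    """Pair up alternating KEY VALUE tokens into a dict of overrides."""
    if len(tokens) % 2 != 0:
        raise ValueError(
            "--set expects KEY VALUE pairs, got %d tokens" % len(tokens))
    overrides = {}
    for key, raw in zip(tokens[0::2], tokens[1::2]):
        try:
            # Numbers, booleans, lists, etc. become Python values.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything else (such as a directory name) stays a string.
            value = raw
        overrides[key] = value
    return overrides
```

For example, `parse_set_args("EXP_DIR seed_rng1701 RNG_SEED 1701".split())` pairs `EXP_DIR` with the string `seed_rng1701` and `RNG_SEED` with the integer `1701`, whereas a stray space inside a value yields an odd token count and is rejected.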

#### 2. End-to-end training of MNC for instance-aware semantic segmentation

To end-to-end train a 5-stage MNC model (on VOC 2012 train), use `experiments/scripts/mnc_5stage.sh`. Final mAP^r should be ~65.0% on VOC 2012 validation.

```Shell
cd $MNC_ROOT
./experiments/scripts/mnc_5stage.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```

#### 3. Training of CFM for instance-aware semantic segmentation

The code also includes an entry point to train a convolutional feature masking (CFM) model for instance-aware semantic segmentation.

##### 3.1. Download pre-computed MCG proposals

Download and process the pre-computed MCG proposals.

```Shell
cd $MNC_ROOT
./data/scripts/fetch_mcg_data.sh
python ./tools/prepare_mcg_maskdb.py --para_job 24 --db train --output data/cache/voc_2012_train_mcg_maskdb/
python ./tools/prepare_mcg_maskdb.py --para_job 24 --db val --output data/cache/voc_2012_val_mcg_maskdb/
```
The resulting proposals are stored in the folder ```data/MCG/```.

##### 3.2. Train the model

Run `experiments/scripts/cfm.sh` to train on VOC 2012 train set. Final mAP^r should be ~60.5% on VOC 2012 validation.

```Shell
cd $MNC_ROOT
./experiments/scripts/cfm.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
# --set EXP_DIR seed_rng1701 RNG_SEED 1701
```
