Instance-aware Semantic Segmentation via Multi-task Network Cascades

By Jifeng Dai, Kaiming He, Jian Sun

This Python version was re-implemented by Haozhi Qi while he was an intern at Microsoft Research.

Fork Information

This fork of the original repository contains a few small improvements to the demo script, so that it can batch-process all the data in a given folder.
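To illustrate the batch-processing idea, here is a minimal sketch of collecting the images in a folder before running inference on each one. The function name `list_images` and the set of accepted extensions are assumptions made for this example; they are not taken from the demo script itself.

```python
import os

# Hypothetical set of extensions; the actual demo script may accept others.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def list_images(folder):
    """Return the sorted paths of all image files directly inside `folder`."""
    paths = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Skip sub-directories and files with other extensions.
        if os.path.isfile(path) and name.lower().endswith(IMAGE_EXTENSIONS):
            paths.append(path)
    return paths
```

Each returned path could then be passed to whatever per-image routine the demo uses.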

It also comes with an install script which can be used to set this code up on a machine without requiring root privileges. The script is tailored for users of ETH Zurich's Euryale mini-cluster (which uses Slurm and Linux environment modules), but it can easily be adapted to other systems.

Moreover, this fork's Caffe is more up-to-date than the original's, allowing it to work with cuDNN 5, leading to improved performance (at least in terms of inference speed, reaching about 170ms for a 1242x375 image, as compared to the 300ms mentioned in the original paper).
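For a rough sense of what these numbers mean, the arithmetic can be sketched as below. The helper names are made up for this example, and the two timings may not have been measured on identical hardware or image sizes, so the ratio is only indicative.

```python
def speedup(old_ms, new_ms):
    """Ratio between the old and the new per-image latency."""
    return old_ms / new_ms


def images_per_second(ms_per_image):
    """Throughput implied by a per-image latency in milliseconds."""
    return 1000.0 / ms_per_image


print("speedup: %.2fx" % speedup(300.0, 170.0))                   # speedup: 1.76x
print("throughput: %.1f images/s" % images_per_second(170.0))     # throughput: 5.9 images/s
```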

Introduction

MNC is an instance-aware semantic segmentation system based on deep convolutional networks. It won first place in the COCO 2015 segmentation challenge and runs at a fraction of a second per image at test time. We decompose the task of instance-aware semantic segmentation into related sub-tasks, which are solved by multi-task network cascades (MNC) with shared features. The entire MNC network is trained end-to-end, with error gradients propagated across the cascaded stages.

MNC was initially described in a CVPR 2016 oral paper.

This repository contains a Python implementation of MNC, which is ~10% slower than the original MATLAB implementation.

This repository includes a bilinear RoI warping layer, which enables gradient back-propagation with respect to RoI coordinates.
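The sketch below illustrates why bilinear interpolation makes a sampled value differentiable with respect to the sampling coordinates, which is the property that lets gradients reach the RoI coordinates. It samples a single point from a small 2-D map in plain Python; it is an illustration under that simplification, not the layer's actual implementation, and `bilinear_sample` is a name chosen for this example.

```python
import math


def bilinear_sample(feature, x, y):
    """Return (value, d_value/dx, d_value/dy) at the continuous point (x, y).

    `feature` is a list of rows and must be at least 2x2. The gradients are
    meaningful for points strictly inside the map; coordinates outside it are
    clamped to the border.
    """
    height, width = len(feature), len(feature[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    # Top-left corner of the cell containing (x, y).
    x0 = min(int(math.floor(x)), width - 2)
    y0 = min(int(math.floor(y)), height - 2)
    dx, dy = x - x0, y - y0
    f00, f01 = feature[y0][x0], feature[y0][x0 + 1]
    f10, f11 = feature[y0 + 1][x0], feature[y0 + 1][x0 + 1]
    value = (1 - dy) * ((1 - dx) * f00 + dx * f01) + dy * ((1 - dx) * f10 + dx * f11)
    # The value is linear in dx and in dy, so both partial derivatives exist.
    grad_x = (1 - dy) * (f01 - f00) + dy * (f11 - f10)
    grad_y = (1 - dx) * (f10 - f00) + dx * (f11 - f01)
    return value, grad_x, grad_y


value, grad_x, grad_y = bilinear_sample([[0.0, 1.0], [2.0, 3.0]], 0.5, 0.5)
print(value, grad_x, grad_y)  # 1.5 1.0 2.0
```

A warping layer would evaluate many such points on a grid inside each RoI; since the grid positions depend on the box coordinates, the chain rule carries these partial derivatives back to the box.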

Misc.

This code has been tested on Linux (Ubuntu 14.04), using K40/Titan X GPUs.

The code is built on py-faster-rcnn.

MNC is released under the MIT License (refer to the LICENSE file for details).

Citing MNC

If you find MNC useful in your research, please consider citing:

@inproceedings{dai2016instance,
    title={Instance-aware Semantic Segmentation via Multi-task Network Cascades},
    author={Dai, Jifeng and He, Kaiming and Sun, Jian},
    booktitle={CVPR},
    year={2016}
}

Main Results

Model	Training data	Test data	mAP^r@0.5	mAP^r@0.7	Time (K40)	Time (Titan X)
MNC, VGG-16	VOC 12 train	VOC 12 val	65.0%	46.3%	0.42 sec/img	0.33 sec/img
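The @0.5 and @0.7 suffixes refer to the mask overlap (intersection over union, IoU) a predicted instance needs to reach with a ground-truth instance to count as correct. A minimal sketch of that overlap test follows; the helper names are chosen for this example, the masks are plain nested lists, and treating the threshold as a strict inequality is an assumption rather than something taken from the evaluation code.

```python
def mask_iou(mask_a, mask_b):
    """IoU of two binary masks given as equally sized lists of rows of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    return float(intersection) / union if union else 0.0


def is_match(pred_mask, gt_mask, threshold):
    """True if the predicted mask overlaps the ground truth enough."""
    return mask_iou(pred_mask, gt_mask) > threshold


pred = [[1, 1, 1, 1, 0]]
gt = [[0, 1, 1, 1, 1]]
print(mask_iou(pred, gt))        # 0.6
print(is_match(pred, gt, 0.5))   # True
print(is_match(pred, gt, 0.7))   # False
```

The same prediction counts as a hit at 0.5 but a miss at 0.7, which is why the second column of results is lower.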

Installation guide

Clone the MNC repository:

# Make sure to clone with --recursive
git clone --recursive https://github.com/daijifeng001/MNC.git

Install Python packages: numpy, scipy, cython, python-opencv, easydict, yaml.
Build the Cython modules and the gpu_nms, gpu_mask_voting modules by:

cd $MNC_ROOT/lib
make
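If the build or a later import fails, it can help to check that the Python packages listed above are importable. The following is a small sketch of such a check; the mapping from package names to import names (for example `cv2` for python-opencv) reflects the usual module names, and `missing_packages` is a helper invented for this example.

```python
import importlib

# Package name from the list above -> module name used in `import` statements.
REQUIRED_MODULES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "cython": "Cython",
    "python-opencv": "cv2",
    "easydict": "easydict",
    "yaml": "yaml",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the sorted package names whose module cannot be imported."""
    missing = []
    for package, module in modules.items():
        try:
            importlib.import_module(module)
        except ImportError:
            missing.append(package)
    return sorted(missing)


if __name__ == "__main__":
    print("missing packages: %s" % (", ".join(missing_packages()) or "none"))
```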

Install the Caffe and pycaffe dependencies (see the official Caffe installation instructions).

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure this line is uncommented
WITH_PYTHON_LAYER := 1
# Building with cuDNN is recommended, as it reduces the memory footprint
USE_CUDNN := 1
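A commented-out flag is an easy mistake to miss. As a sketch, the two settings above could be verified by scanning the text of Makefile.config for uncommented assignments; `enabled_flags` and `check_caffe_config` are hypothetical helpers written for this example and are not part of Caffe or of this repository.

```python
import re


def enabled_flags(makefile_text):
    """Return {name: value} for every uncommented `NAME := value` line."""
    flags = {}
    for line in makefile_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Accept both `:=` and `=`, and ignore a trailing comment.
        match = re.match(r"^(\w+)\s*:?=\s*(.*?)\s*(?:#.*)?$", stripped)
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


def check_caffe_config(makefile_text):
    """Report whether Python layers and cuDNN are switched on."""
    flags = enabled_flags(makefile_text)
    return {
        "python_layer": flags.get("WITH_PYTHON_LAYER") == "1",
        "cudnn": flags.get("USE_CUDNN") == "1",
    }


example = "# USE_CUDNN := 1\nWITH_PYTHON_LAYER := 1\n"
print(check_caffe_config(example))
```

In the example text the cuDNN line is still commented out, so only the Python-layer check passes.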

Build Caffe and pycaffe:

cd $MNC_ROOT/caffe-mnc
# If you have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe

Demo

First, download the trained MNC model.

./data/scripts/fetch_mnc_model.sh

Run the demo:

cd $MNC_ROOT
./tools/demo.py

The resulting demo images are stored in data/demo/.

The demo performs instance-aware semantic segmentation with a trained MNC model (using the VGG-16 net). The model is pre-trained on ImageNet and fine-tuned on the VOC 2012 train set with additional annotations from SBD. The mAP^r@0.5 of the model is 65.0% on the VOC 2012 validation set. The test speed per image is ~0.33 sec on a Titan X and ~0.42 sec on a K40.

Training

This repository contains code for end-to-end training of MNC for instance-aware semantic segmentation, in which the gradients across the cascaded stages are taken into account.
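What it means for gradients to cross cascaded stages can be seen in a toy scalar example, sketched below. It has two stages with one weight each, where the second stage consumes the output of the first; everything here (names, losses, numbers) is made up for illustration and is unrelated to the actual MNC layers.

```python
def total_loss(w1, w2, x, t1, t2):
    """Sum of the per-stage squared errors of a two-stage cascade."""
    a = w1 * x   # stage 1 output
    b = w2 * a   # stage 2 consumes the stage 1 output
    return (a - t1) ** 2 + (b - t2) ** 2


def grad_w1(w1, w2, x, t1, t2, through_cascade=True):
    """Gradient of the total loss with respect to the stage 1 weight."""
    a = w1 * x
    b = w2 * a
    grad = 2.0 * (a - t1) * x               # direct term from the stage 1 loss
    if through_cascade:
        grad += 2.0 * (b - t2) * w2 * x     # term back-propagated from stage 2
    return grad


args = (0.5, 2.0, 3.0, 1.0, 4.0)  # w1, w2, x, t1, t2
print(grad_w1(*args))                          # -9.0
print(grad_w1(*args, through_cascade=False))   # 3.0
```

Ignoring the term that flows back from the second stage changes both the size and the sign of the update here, which is the effect end-to-end training is meant to capture.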

Preparation:

Run ./data/scripts/fetch_imagenet_models.sh to download the ImageNet pre-trained VGG-16 net.
Download the VOC 2007 dataset to ./data/VOCdevkit2007
Run ./data/scripts/fetch_sbd_data.sh to download the VOC 2012 dataset together with the additional segmentation annotations in SBD to ./data/VOCdevkitSDS.
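Before starting a long training run, it may be worth confirming that the dataset folders named above are in place. A minimal sketch, assuming it is given the repository root; `missing_data_dirs` is a hypothetical helper, and only the two directories mentioned in the steps above are checked.

```python
import os

# Paths taken from the preparation steps above, relative to the repository root.
EXPECTED_DIRS = ("data/VOCdevkit2007", "data/VOCdevkitSDS")


def missing_data_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected dataset directories that do not exist under `root`."""
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]


if __name__ == "__main__":
    missing = missing_data_dirs(os.getcwd())
    print("missing: %s" % (", ".join(missing) or "none"))
```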

1. End-to-end training of MNC for instance-aware semantic segmentation

To end-to-end train a 5-stage MNC model (on VOC 2012 train), use experiments/scripts/mnc_5stage.sh. Final mAP^r@0.5 should be ~65.0% (mAP^r@0.7 should be ~46.3%), on VOC 2012 validation.

cd $MNC_ROOT
./experiments/scripts/mnc_5stage.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701
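The --set arguments are read as KEY VALUE pairs, so each value has to be a single token. The sketch below shows one way such pairs could be parsed; `parse_set_overrides` is a hypothetical helper written for this example, not a function from this repository, and the use of `ast.literal_eval` to recover numbers is an assumption about how the values are interpreted.

```python
import ast


def parse_set_overrides(tokens):
    """Turn ['KEY', 'value', ...] into {'KEY': parsed_value, ...}."""
    if len(tokens) % 2 != 0:
        raise ValueError("--set expects KEY VALUE pairs, got %d tokens" % len(tokens))
    overrides = {}
    for key, raw in zip(tokens[0::2], tokens[1::2]):
        try:
            value = ast.literal_eval(raw)   # numbers, booleans, lists, ...
        except (ValueError, SyntaxError):
            value = raw                     # fall back to a plain string
        overrides[key] = value
    return overrides


print(parse_set_overrides(["EXP_DIR", "seed_rng1701", "RNG_SEED", "1701"]))
```

With this scheme, EXP_DIR stays a string while RNG_SEED becomes an integer, and an odd number of tokens is rejected instead of being silently misread.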

2. Training of CFM for instance-aware semantic segmentation

The code also includes an entry point for training a convolutional feature masking (CFM) model for instance-aware semantic segmentation.

@inproceedings{dai2015convolutional,
    title={Convolutional Feature Masking for Joint Object and Stuff Segmentation},
    author={Dai, Jifeng and He, Kaiming and Sun, Jian},
    booktitle={CVPR},
    year={2015}
}

2.1. Download pre-computed MCG proposals

Download and process the pre-computed MCG proposals.

cd $MNC_ROOT
./data/scripts/fetch_mcg_data.sh
python ./tools/prepare_mcg_maskdb.py --para_job 24 --db train --output data/cache/voc_2012_train_mcg_maskdb/
python ./tools/prepare_mcg_maskdb.py --para_job 24 --db val --output data/cache/voc_2012_val_mcg_maskdb/

The resulting proposals are stored in the folder data/MCG/.

2.2. Train the model

Run experiments/scripts/cfm.sh to train on VOC 2012 train set. Final mAP^r@0.5 should be ~60.5% (mAP^r@0.7 should be ~42.6%), on VOC 2012 validation.

cd $MNC_ROOT
./experiments/scripts/cfm.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701

3. End-to-end training of Faster-RCNN for object detection

Faster-RCNN can be viewed as a 2-stage cascade composed of a region proposal network (RPN) and an object detection network. Run the script experiments/scripts/faster_rcnn_end2end.sh to train a Faster-RCNN model on VOC 2007 trainval. Final mAP^b should be ~69.1% on VOC 2007 test.

cd $MNC_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] VGG16 [--set ...]
# GPU_ID is the GPU you want to train on
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701
