
Faster R-CNN / Mask R-CNN on COCO

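This example provides a minimal (2k lines) and faithful implementation of the following object detection / instance segmentation papers:

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Feature Pyramid Networks for Object Detection
Mask R-CNN
Cascade R-CNN: Delving into High Quality Object Detection

with the support of: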

Multi-GPU / multi-node distributed training, multi-GPU evaluation
Cross-GPU BatchNorm (aka Sync-BN, from MegDet: A Large Mini-Batch Object Detector)
Group Normalization
Training from scratch (from Rethinking ImageNet Pre-training)

This is likely the best-performing open source TensorFlow reimplementation of the above papers.

Dependencies

OpenCV, TensorFlow ≥ 1.6
pycocotools/scipy: `for i in cython 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI' scipy; do pip install $i; done`
Pre-trained ImageNet ResNet model from tensorpack model zoo
COCO data. It needs to have the following directory structure:

COCO/DIR/
  annotations/
    instances_train201?.json
    instances_val201?.json
  train201?/
    # image files that are mentioned in the corresponding json
  val201?/
    # image files that are mentioned in the corresponding json

You can use either the 2014 version or the 2017 version of the dataset. To use the common "trainval35k + minival" split for the 2014 dataset, also download the annotation files instances_minival2014.json and instances_valminusminival2014.json from here into annotations/.
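A mismatched directory layout is an easy mistake to make before a long training run. The following is a minimal sketch of a layout check, assuming only the structure described above. The helper name `check_coco_layout` and its parameters are hypothetical and are not part of this repository.

```python
from pathlib import Path


def check_coco_layout(basedir, year="2017", extra_annotations=()):
    """Return a list of problems found under basedir (empty if none).

    Hypothetical helper: it only checks that the files and folders named
    in the README exist; it does not validate the JSON contents.
    """
    base = Path(basedir)
    problems = []
    for split in ("train", "val"):
        name = f"{split}{year}"
        ann = base / "annotations" / f"instances_{name}.json"
        if not ann.is_file():
            problems.append(f"missing annotation file: {ann}")
        img_dir = base / name
        if not img_dir.is_dir():
            problems.append(f"missing image directory: {img_dir}")
        elif not any(img_dir.iterdir()):
            problems.append(f"empty image directory: {img_dir}")
    # Optional extra files, e.g. the minival annotations for the 2014 split.
    for filename in extra_annotations:
        if not (base / "annotations" / filename).is_file():
            problems.append(f"missing annotation file: {filename}")
    return problems
```

For the 2014 "trainval35k + minival" split, one might call `check_coco_layout("/path/to/COCO/DIR", year="2014", extra_annotations=("instances_minival2014.json", "instances_valminusminival2014.json"))` and inspect the returned list.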

Usage

It is recommended to get familiar with the relevant papers listed above before using this code. Otherwise you may end up doing something unreasonable.

Train:

To train on a single machine (with 1 or more GPUs):

./train.py --config \
    BACKBONE.WEIGHTS=/path/to/ImageNet-R50-AlignPadding.npz \
    DATA.BASEDIR=/path/to/COCO/DIR \
    [OTHER-ARCHITECTURE-SETTINGS]

Alternatively, use TRAINER=horovod, which also supports distributed training but is less straightforward to run. Refer to the HorovodTrainer docs for details.

All options can be changed by either the command line or the config.py file (recommended). Some reasonable configurations are listed in the table below. See config.py for details about how to correctly set BACKBONE.WEIGHTS and other configs.
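To illustrate how dotted `KEY=VALUE` arguments such as `BACKBONE.WEIGHTS=...` can map onto a nested configuration, here is a minimal sketch using a plain dict. It is not the code in config.py: the function `apply_overrides` is hypothetical, and the default values in the example are placeholders rather than the repository's real defaults.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'A.B=value' strings to a nested dict in place and return it."""
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node[name]  # KeyError on an unknown section name
        if leaf not in node:
            raise KeyError(f"unknown config key: {dotted_key}")
        try:
            # Parses numbers, booleans, lists, e.g. "False" or "[640,800]".
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything else stays a string, e.g. a file path or "2x".
            value = raw
        node[leaf] = value
    return config


# Placeholder defaults for illustration only; see config.py for the real ones.
cfg = {
    "MODE_FPN": True,
    "BACKBONE": {"WEIGHTS": ""},
    "PREPROC": {"TRAIN_SHORT_EDGE_SIZE": [0, 0]},
    "TRAIN": {"LR_SCHEDULE": ""},
}
apply_overrides(cfg, [
    "MODE_FPN=False",
    "BACKBONE.WEIGHTS=/path/to/ImageNet-R50-AlignPadding.npz",
    "PREPROC.TRAIN_SHORT_EDGE_SIZE=[640,800]",
    "TRAIN.LR_SCHEDULE=2x",
])
```

Rejecting unknown keys, as this sketch does, catches misspelled option names early instead of silently training with the defaults.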

Inference:

To predict on given images (needs DISPLAY to show the outputs):

./predict.py --predict input1.jpg input2.jpg --load /path/to/Trained-Model-Checkpoint --config SAME-AS-TRAINING

To evaluate the performance of a model on COCO:

./predict.py --evaluate output.json --load /path/to/Trained-Model-Checkpoint \
    --config SAME-AS-TRAINING

Several trained models can be downloaded from the table below. Evaluation and prediction have to be run with the same configs that were used in training.
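As a quick sanity check on an evaluation output file, the sketch below counts detections per category. It assumes that output.json follows the standard COCO results format (a JSON list of objects with `image_id`, `category_id`, `bbox` and `score` fields); that assumption is not stated in this README, and the function name `summarize_detections` is hypothetical.

```python
import json
from collections import Counter


def summarize_detections(path, score_thresh=0.5):
    """Summarize a COCO-style results file (format assumed, see above)."""
    with open(path) as f:
        results = json.load(f)
    # Keep only detections at or above the chosen confidence threshold.
    kept = [r for r in results if r["score"] >= score_thresh]
    return {
        "total": len(results),
        "kept": len(kept),
        "images": len({r["image_id"] for r in results}),
        "per_category": dict(Counter(r["category_id"] for r in kept)),
    }
```

A file with very few kept detections, or with images missing entirely, often points to a config that does not match the one used in training.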

Results

These models are trained on train2017 and evaluated on val2017 using mAP@IoU=0.50:0.95. Unless otherwise noted, all models are fine-tuned from ImageNet pre-trained R50/R101 models in tensorpack model zoo, using 8 NVIDIA V100s.
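The notation mAP@IoU=0.50:0.95 means that average precision is computed at ten IoU thresholds (0.50, 0.55, ..., 0.95) and then averaged. The sketch below shows only the IoU part of that metric for axis-aligned boxes; the full COCO evaluation additionally builds precision-recall curves per category. The function names are illustrative and do not come from this repository.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x2 > x1 and y2 > y1."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so that disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# The ten thresholds 0.50, 0.55, ..., 0.95 used by the averaged metric.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred, gt):
    """Thresholds at which pred would count as a match for gt."""
    iou = box_iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if iou >= t]
```

A prediction that overlaps its ground truth loosely only counts at the lower thresholds, which is why this averaged metric rewards precise localization more than AP at IoU=0.50 alone.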

The performance reported by Detectron can be reproduced.

| Backbone | mAP (box;mask) | Detectron mAP ¹ (box;mask) | Time (on 8 V100s) | Configurations |
| - | - | - | - | - |
| R50-FPN | 34.8 | | 6.5h | super quick: `MODE_MASK=False FRCNN.BATCH_PER_IM=64` `PREPROC.TRAIN_SHORT_EDGE_SIZE=[500,800] PREPROC.MAX_SIZE=1024` |
| R50-C4 | 35.6 | 34.8 | 22.5h | standard: `MODE_MASK=False MODE_FPN=False` |
| R50-FPN | 37.5 | 36.7 | 10.5h | standard: `MODE_MASK=False` |
| R50-C4 | 36.2;31.8 ⬇️ | 35.8;31.4 | 23h | standard: `MODE_FPN=False` |
| R50-FPN | 38.2;34.8 | 37.7;33.9 | 12.5h | standard (this is the default) |
| R50-FPN | 38.9;35.4 ⬇️ | 38.6;34.5 | 24h | 2x: `TRAIN.LR_SCHEDULE=2x` |
| R50-FPN-GN | 40.4;36.3 ⬇️ | 40.3;35.7 | 29h | 2x+GN: `FPN.NORM=GN BACKBONE.NORM=GN` `FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head` `FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head` `TRAIN.LR_SCHEDULE=2x` |
| R50-FPN | 41.7;36.2 ⬇️ | | 16h | +Cascade: `FPN.CASCADE=True` |
| R50-FPN-GN | 46.1;40.1 ⬇️ | | 36h (on 16 V100s) | 4x+GN+Cascade+TrainAug: `FPN.CASCADE=True` `FPN.NORM=GN BACKBONE.NORM=GN` `FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head` `FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head` `PREPROC.TRAIN_SHORT_EDGE_SIZE=[640,800]` `TRAIN.LR_SCHEDULE=4x` |
| R101-C4 | 40.1;34.6 ⬇️ | | 27h | standard: `MODE_FPN=False` `BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]` |
| R101-FPN | 40.7;36.8 ⬇️ ² | 40.0;35.9 | 17h | standard: `BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]` |
| R101-FPN | 46.6;40.3 ⬇️ | | 64h | 3x+Cascade+TrainAug: `FPN.CASCADE=True` `BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]` `TEST.RESULT_SCORE_THRESH=1e-4` `PREPROC.TRAIN_SHORT_EDGE_SIZE=[640,800]` `TRAIN.LR_SCHEDULE=3x` |
| R101-FPN-GN (From Scratch) | 47.7;41.7 ⬇️ ³ | 47.4;40.5 | 28h (on 64 V100s) | 9x+GN+Cascade+TrainAug: `FPN.CASCADE=True` `BACKBONE.RESNET_NUM_BLOCKS=[3,4,23,3]` `FPN.NORM=GN BACKBONE.NORM=GN` `FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head` `FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head` `PREPROC.TRAIN_SHORT_EDGE_SIZE=[640,800]` `TRAIN.LR_SCHEDULE=9x` `BACKBONE.FREEZE_AT=0` |

1: Numbers taken from Detectron Model Zoo. We compare models that have identical training & inference cost between the two implementations. Their numbers can be different due to small implementation details.

2: Our mAP is 7 points better than the official model in matterport/Mask_RCNN, which has the same architecture. Our implementation is also 5x faster.

3: This entry does not use ImageNet pre-training. Detectron numbers are taken from Fig. 5 in Rethinking ImageNet Pre-training. Note that our training strategy is slightly different: we enable cascade throughout the entire training. As far as I know, this model is the best open source TF model on the COCO dataset.

Other Datasets / Implementation Details / Speed:

See BALLOON.md and NOTES.md for more details.
