Learning Object Detection from Scratch with Gated Recurrent Feature Pyramids
README.md add voc lmdb links Mar 19, 2018


GRP-DSOD

We have released the GRP-DSOD code at https://github.com/szq0214/DSOD. Check out the pycaffe code there if you would like to reproduce the exact results reported in the paper.

In this repository, we plan to release a PyTorch version of DSOD and GRP-DSOD. Stay tuned!

We have also seen some very promising results on the PASCAL VOC Comp3 leaderboard, such as https://github.com/kuangliu/torchcv. Unfortunately, that entry initialized its parameters from ImageNet pre-trained models (https://github.com/kuangliu/torchcv/issues/11). Please note that the Comp3 challenge only allows training on the VOC12 dataset, without pre-trained models. Please check your training process carefully.

If you find this work helpful for your research, please cite:

@article{shen2017learning,
     title={Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids},
     author={Shen, Zhiqiang and Shi, Honghui and Feris, Rogerio and Cao, Liangliang and Yan, Shuicheng and Liu, Ding and Wang, Xinchao and Xue, Xiangyang and Huang, Thomas S},
     journal={arXiv preprint arXiv:1712.00886},
     year={2017}
}

Introduction

In GRP-DSOD, we propose a recurrent feature-pyramid structure that squeezes rich spatial and semantic features into a single prediction layer, which further reduces the number of parameters to learn (DSOD needs to learn 1/2, whereas GRP-DSOD needs to learn only 1/3). Our new model is therefore better suited to learning from scratch and converges faster than DSOD. We also introduce a novel gate-controlled prediction strategy in GRP-DSOD that adaptively enhances or attenuates feature activations at different scales based on the input object size.
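To make the idea of gating feature activations concrete, here is a minimal sketch of one common form of channel-wise gating. It is not the GRP-DSOD implementation: the function name `channel_gate`, the per-channel scalars `weights` and `biases`, and the `2 * sigmoid` scaling are all assumptions chosen for illustration, and the actual gate structure is defined in the paper and the pycaffe code.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feature_map, weights, biases):
    """Scale each channel of a feature map by a gate computed from the map itself.

    feature_map: list of channels, each a 2D list of floats (C x H x W).
    weights, biases: hypothetical learned scalars, one pair per channel.
    Returns (gated_feature_map, gates).
    """
    # Summarize each channel with its global average activation.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Assumed gate form: 2 * sigmoid(...) lies in (0, 2), so a gate above 1
    # enhances a channel and a gate below 1 attenuates it.
    gates = [2.0 * sigmoid(w * p + b) for w, p, b in zip(weights, pooled, biases)]
    gated = [
        [[g * value for value in row] for row in channel]
        for channel, g in zip(feature_map, gates)
    ]
    return gated, gates


# Two 2x2 channels; the biases push the first gate above 1 and the second below 1.
features = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
gated, gates = channel_gate(features, weights=[0.0, 0.0], biases=[3.0, -3.0])
```

In a detector, gates like these would be computed separately at each scale of the feature pyramid, so that the scales most relevant to the objects in the input are emphasized.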

Figure 1: Illustration of the motivation of GRP-DSOD.
Figure 2: An overview of GRP-DSOD together with three one-stage detector methods.

Visualization

  1. Visualizations of network structures (tools from ethereon, please ignore the warning messages):

Results & Models

Our PASCAL VOC LMDB files:

| Method | LMDBs |
|---|---|
| Train on VOC07+12 and test on VOC07 | Download |
| Train on VOC07++12 and test on VOC12 (Comp4) | Download |
| Train on VOC12 and test on VOC12 (Comp3) | Download |

The tables below show the results on PASCAL VOC 2007, 2012 and 2012 Comp3 (training on VOC 2012 only).

PASCAL VOC test results:

| Method | VOC 2007 test mAP | # params | Models |
|---|---|---|---|
| GRP-DSOD300 (07+12) | 78.5 | 14.1M | Download (56.5M) |
| GRP-DSOD320 (07+12) | 78.7 | 14.2M | Download (56.8M) |
| GRP-DSOD320* (07+12) | 79.0 | 16.0M | Download (63.9M) |

| Method | VOC 2012 test mAP | # params | Models |
|---|---|---|---|
| GRP-DSOD320* (12) | 72.5 (VOC Comp3) | 16.0M | Download (63.9M) |
| GRP-DSOD320 (07++12) | 77.0 | 14.2M | Download (56.8M) |
| GRP-DSOD320* (07++12) | -- | -- | Running |
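
The parameter counts and download sizes listed above can be cross-checked with simple arithmetic. The sketch below assumes that weights are stored as 32-bit floats (4 bytes each) and that the "M" in the download sizes means megabytes; neither is stated explicitly here, and the helper name `approx_model_size_mb` is only illustrative.

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights are stored as 32-bit floats


def approx_model_size_mb(num_params_millions: float) -> float:
    """Approximate weight-file size in megabytes for a given parameter count (in millions)."""
    return num_params_millions * BYTES_PER_FLOAT32


# (parameters in millions, listed download size) taken from the tables above
listed = {
    "GRP-DSOD300 (07+12)": (14.1, 56.5),
    "GRP-DSOD320 (07+12)": (14.2, 56.8),
    "GRP-DSOD320* (07+12)": (16.0, 63.9),
}

for name, (params_m, size_mb) in listed.items():
    estimate = approx_model_size_mb(params_m)
    print(f"{name}: estimated {estimate:.1f} MB, listed {size_mb} MB")
```

Under these assumptions the estimates agree with the listed sizes to within about 0.1 MB, which is consistent with the rounding of the parameter counts.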

Contact

Zhiqiang Shen (zhiqiangshen0214 at gmail.com)

Any comments or suggestions are welcome!