This repository contains an implementation of Light-Head R-CNN: In Defense of Two-Stage Object Detector that uses the xception* backbone network. It is based on the code from zengarden. I tested it on the Pascal VOC and COCO datasets. The test mAP and FPS reported in the original paper have not yet been reproduced, so any advice would be appreciated. The network architecture, taken from the original paper, is shown below.
Same as in here.
- Clone the Light_head_R_CNN_xception repository.
git clone https://github.com/geonseoks/Light_head_R_CNN_xception
- To use the xception* network, download the ckpt file for xception* and move it into ~your_light_head_rcnn_original_directory/data/imagenet_weights.
- Move resnet_v1.py and resnet_utils.py into ~your_light_head_rcnn_original_directory/lib/utils/tf_utils/basemodel.
- Move network_desp.py to ~your_light_head_rcnn_original_directory/experiments/user/network_desp.py.
File structure as follows:
~your_light_head_rcnn_original_directory/
|->data
| |->imagenet_weights
| | |->model.ckpt-3081378.ckpt
|->experiments
| |->user
| | |->network_desp.py
|->lib
| |->utils
| | |->tf_utils
| | | |->basemodel
| | | | |->resnet_v1.py
| | | | |->resnet_utils.py
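The file placement above could also be scripted. The following is only a minimal sketch: the helper name `install_xception_files` is hypothetical, and it assumes that the downloaded checkpoint is a single file named `model.ckpt-3081378.ckpt` and that all four files sit together in one source directory. It copies rather than moves the files, so the originals stay in place.

```python
import shutil
from pathlib import Path


def install_xception_files(src_dir, rcnn_root, ckpt_name="model.ckpt-3081378.ckpt"):
    """Copy the files from src_dir into the light_head_rcnn directory tree.

    Hypothetical helper: the destination paths follow the file structure
    listed above; the single-file checkpoint name is an assumption.
    """
    src_dir, rcnn_root = Path(src_dir), Path(rcnn_root)

    # Map each file name to its destination directory, relative to rcnn_root.
    basemodel_dir = Path("lib") / "utils" / "tf_utils" / "basemodel"
    targets = {
        ckpt_name: Path("data") / "imagenet_weights",
        "resnet_v1.py": basemodel_dir,
        "resnet_utils.py": basemodel_dir,
        "network_desp.py": Path("experiments") / "user",
    }

    copied = []
    for name, rel_dir in targets.items():
        source = src_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"expected file is missing: {source}")
        dest_dir = rcnn_root / rel_dir
        # Create the destination directory if it does not exist yet.
        dest_dir.mkdir(parents=True, exist_ok=True)
        # copy2 returns the path of the newly created file.
        copied.append(Path(shutil.copy2(source, dest_dir)))
    return copied
```

Calling `install_xception_files("Light_head_R_CNN_xception", "/path/to/light_head_rcnn")` would return the list of copied paths. Both arguments are placeholders for your own directories. Note that the existing resnet_v1.py and resnet_utils.py in the target tree would be overwritten.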
Same as in here, except for the learning rate (basic_lr = 5e-4 * train_batch_per_gpu * 0.7 works better).
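As a quick illustration of how the suggested learning rate scales with the per-GPU batch size, here is a small sketch. The function name `scaled_basic_lr` and its keyword arguments are hypothetical; only the formula and the names `basic_lr` and `train_batch_per_gpu` come from the text above.

```python
def scaled_basic_lr(train_batch_per_gpu, base=5e-4, factor=0.7):
    """Return base * train_batch_per_gpu * factor (the formula quoted above)."""
    if train_batch_per_gpu <= 0:
        raise ValueError("train_batch_per_gpu must be positive")
    return base * train_batch_per_gpu * factor


if __name__ == "__main__":
    # Show the learning rate for a few illustrative batch sizes.
    for batch in (1, 2, 4):
        print(batch, scaled_basic_lr(batch))
```

For example, with `train_batch_per_gpu = 2` this gives a `basic_lr` of about 7e-4, compared with 1e-3 without the 0.7 factor.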
Same as in here.
The top row of the table is the result reported in the original paper.
Train data | Test data | ImageNet backbone accuracy at 224x224 (%) | Base model | Input resolution | GPU | FPS | Epochs | mAP (%) |
---|---|---|---|---|---|---|---|---|
MSCOCO | MSCOCO | 65.9 | xception* | 700x1100 | TITAN XP | 102 | - | 30.7 |
MSCOCO | MSCOCO | 65.0 | xception* | 700x1100 | GTX 1080Ti | 51.89 | 30 | 26.1 |
MSCOCO | MSCOCO | 65.0 | xception* | 700x1100 | TITAN X PASCAL | 31.0 | 30 | 26.1 |
VOC07 | VOC07 | 65.0 | xception* | 700x1100 | GTX 1080Ti | 54.07 | - | 62.0 |
VOC07 | VOC07 | 65.0 | xception* | 700x1100 | TITAN X PASCAL | 33.4 | - | 62.0 |
VOC07 | VOC07 | 65.0 | xception* | 144x144 | TITAN X PASCAL | 164.0 | - | 56.9 |
VOC07+VOC12 | VOC07 | 65.0 | xception* | 144x144 | TITAN X PASCAL | 174.0 | - | 61.0 |
VOC07 | VOC07 | 65.0 | xception* | 128x128 | TITAN X PASCAL | 180.4 | - | 55.0 |
VOC07+VOC12 | VOC07 | 65.0 | xception* | 128x128 | TITAN X PASCAL | - | - | 59.9 |