Example code for Faster R-CNN
| Training Setting | Evaluation | Reference | ChainerCV |
|:-:|:-:|:-:|:-:|
| VOC2007 trainval | VOC2007 test | 69.9 mAP | 70.6 mAP |
| VOC2007&2012 trainval | VOC2007 test | 73.2 mAP | 74.7 mAP |
ChainerCV and Caffe (py-faster-rcnn) implementations run at almost the same speed. We compared the time it takes to forward an image.
|15.9 FPS||16.2 FPS|
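
A forward-time comparison of this kind could be measured roughly as follows. This is a minimal sketch, not the benchmark script behind the numbers above: `measure_fps`, `forward`, and the warm-up count are hypothetical names and choices introduced for illustration.

```python
import time


def measure_fps(forward, images, warmup=2):
    """Return forward passes per second of `forward` over `images`.

    `forward` is any callable that takes one image (hypothetical stand-in
    for a detector's forward pass).
    """
    # Warm-up runs are excluded so one-time setup cost does not skew timing.
    for img in images[:warmup]:
        forward(img)

    start = time.perf_counter()
    for img in images:
        forward(img)
    # Note: with a GPU backend, the device would need to be synchronized
    # here before reading the clock; that step is omitted in this sketch.
    elapsed = time.perf_counter() - start
    return len(images) / elapsed


if __name__ == "__main__":
    # Dummy workload standing in for a real model.
    fps = measure_fps(lambda img: sum(range(10000)), list(range(50)))
    print("{:.1f} FPS".format(fps))
```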
If a path to a pretrained model is not given, weights distributed on the internet will be used.
$ python demo.py [--gpu <gpu>] [--pretrained-model <model_path>] <image>.jpg
This example automatically downloads pretrained weights from the internet when executed. A sample image to try the implementation can be found in the link below.
Difference in the runtime behaviour from the original code
Bounding boxes follow the integer bbox convention in the original implementation, whereas the ChainerCV implementation follows the float bbox convention used in COCO. The integer convention encodes the bottom-right vertex of a bounding box by subtracting one from the ground-truth coordinates, whereas the float convention does not.
On top of that, the anchors are not discretized in ChainerCV.
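
The two differences above can be illustrated with a small sketch. All function names here are hypothetical, and the rounding in `anchor_size` only shows what discretizing an anchor can look like; it is not claimed to reproduce the exact rounding of the original code.

```python
import math


def to_integer_convention(box):
    """Convert (x_min, y_min, x_max, y_max) from float to integer convention.

    In the integer convention the bottom-right vertex is the last pixel
    inside the box, so one is subtracted from it.
    """
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - 1, y_max - 1)


def size_float(box):
    """Width and height under the float convention."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min, y_max - y_min)


def size_integer(box):
    """Width and height under the integer convention (inclusive corner)."""
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min + 1, y_max - y_min + 1)


def anchor_size(base_size, ratio, discretize):
    """Width and height of an anchor with area base_size**2 and h / w = ratio."""
    w = math.sqrt(base_size * base_size / ratio)
    if discretize:
        w = round(w)
        return (w, round(w * ratio))
    return (w, w * ratio)


if __name__ == "__main__":
    gt = (10.0, 20.0, 50.0, 80.0)          # float convention
    gt_int = to_integer_convention(gt)     # (10.0, 20.0, 49.0, 79.0)
    # Both conventions describe a box of the same size.
    print(size_float(gt), size_integer(gt_int))
    # Discretized and non-discretized anchors differ slightly.
    print(anchor_size(16, 2, True), anchor_size(16, 2, False))
```

Mixing the two conventions silently shifts box sizes by one pixel, which is why results can differ slightly between implementations.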
For training with VOC2007 (this setting is used by default)
$ python train.py --dataset voc07 --step_size 50000 --iteration 70000 [--gpu <gpu>]
For training with VOC2007+2012
$ python train.py --dataset voc0712 --step_size 80000 --iteration 110000 [--gpu <gpu>]
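
As a rough illustration of how the two settings differ, the sketch below assumes that `--step_size` is the iteration at which the learning rate is decayed once and that `--iteration` is the total number of iterations. The base learning rate `1e-3` and decay factor `0.1` are assumptions for illustration, not values taken from this document.

```python
def step_lr(iteration, step_size, base_lr=1e-3, gamma=0.1):
    """Learning rate at `iteration` under a single step decay.

    base_lr and gamma are assumed values; check train.py for the real ones.
    """
    if iteration >= step_size:
        return base_lr * gamma
    return base_lr


if __name__ == "__main__":
    # VOC2007 setting: decay at 50000, stop at 70000.
    for it in (0, 49999, 50000, 69999):
        print(it, step_lr(it, step_size=50000))
```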
The PlotReport extension uses matplotlib. If you get a "RuntimeError: Invalid DISPLAY variable" error in a Linux environment, setting the following environment variable is recommended:
$ MPLBACKEND=Agg python train.py OPTIONS
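
If changing the shell command is inconvenient, the same variable can be set from Python. This is a sketch of an alternative, not something train.py is stated to do; it has to run before matplotlib is first imported in the process.

```python
import os

# Select the non-interactive Agg backend unless the user already chose one.
# This must happen before the first import of matplotlib.
os.environ.setdefault("MPLBACKEND", "Agg")
```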
The evaluation score is reported by DetectionVOCEvaluator during training.
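
For intuition about the reported score, the sketch below computes average precision for one class with the 11-point interpolation conventionally used for PASCAL VOC2007. Whether the evaluator in train.py is configured with this metric is an assumption, and `voc07_average_precision` is a hypothetical helper, not a ChainerCV function.

```python
def voc07_average_precision(recall, precision):
    """11-point interpolated AP from paired recall/precision values.

    For each recall threshold t in 0.0, 0.1, ..., 1.0, take the maximum
    precision among points with recall >= t (0 if there are none), then
    average the 11 values.
    """
    total = 0.0
    for i in range(11):
        t = i / 10
        candidates = [p for r, p in zip(recall, precision) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11


if __name__ == "__main__":
    # Two detections: the first is correct, the second reaches full recall
    # at lower precision.
    print(voc07_average_precision([0.5, 1.0], [1.0, 0.5]))
```

mAP is then the mean of this value over all object classes.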
Also, the evaluation can be conducted outside of training loop by using
- Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN: Towards real-time object detection with region proposal networks." In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.