Special Features for This Caffe Repository

  • approximate joint training, testing and evaluation of Faster R-CNN / R-FCN models, etc.
  • support multi-GPU training
  • support R-FCN with OHEM
  • support Light-head R-CNN / R-FCN++
  • support Cascade R-CNN
  • support FPN (Feature Pyramid Network)
  • support Deformable Conv and Deformable PSROIPooling
  • support SSD layers
  • support YOLOv3 inference
  • Action recognition (Two Stream CNN)
  • CTPN layers for scene text detection, ported from tianzhi0549/CTPN
  • script for merging Conv + BatchNorm + Scale layers into a single layer when those layers are frozen, to reduce memory: 'examples/FRCNN/res50/'. Script for merging ResNet: 'examples/FRCNN/'.
  • support taking a snapshot after receiving SIGTERM (the kill command's default signal)
  • logging tools based on VisualDL, which can visualize loss scalars, feature images, etc.
  • support NMS and IoU calculation on GPU, Soft-NMS on CPU
  • support box-voting & multi-scale testing
  • support learning rate warm-up, cosine decay and cyclical learning rate in the solver (see sgd_solver.cpp)
  • support model file encryption/decryption, see 'encrypt_model.cpp' & 'frcnn_api.cpp'
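
As a rough illustration of the warm-up and cosine decay idea mentioned in the list above, here is a minimal sketch. It is a generic formulation, not a copy of what sgd_solver.cpp does; the function name lr_at and its parameters are hypothetical, and it assumes a linear warm-up and max_iter > warmup_iters.

```python
import math


def lr_at(iteration, base_lr, warmup_iters, max_iter):
    """Hypothetical schedule: linear warm-up followed by cosine decay.

    Assumes max_iter > warmup_iters. The solver in this repository may
    use different formulas or parameter names.
    """
    if iteration < warmup_iters:
        # Ramp linearly from base_lr / warmup_iters up to base_lr.
        return base_lr * (iteration + 1) / float(warmup_iters)
    # Fraction of the post-warm-up schedule that has elapsed, in [0, 1].
    progress = (iteration - warmup_iters) / float(max_iter - warmup_iters)
    # Cosine curve from base_lr (progress = 0) down to 0 (progress = 1).
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

With base_lr=0.1, warmup_iters=10 and max_iter=110, the rate climbs to 0.1 over the first ten iterations and then falls smoothly towards zero at iteration 110.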

Special layers

Data Preprocess

data enhancement:

  • histogram equalization of color images
  • haze removal (dehazing) algorithm

data augmentation:

  • random horizontal flip
  • random jitter
  • hue, saturation, exposure
  • rotation (multiples of 90 degrees)
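
Flipping or rotating an image also requires transforming its bounding boxes. The sketch below shows one way to do that, assuming 0-based, inclusive pixel coordinates (x1, y1, x2, y2). The function names are hypothetical and the repository's own data layer may use a different convention.

```python
def hflip_box(box, image_width):
    """Mirror a box horizontally inside an image of the given width."""
    x1, y1, x2, y2 = box
    # The left edge becomes the right edge and vice versa.
    return (image_width - 1 - x2, y1, image_width - 1 - x1, y2)


def rot90_box(box, image_height):
    """Rotate a box by 90 degrees clockwise.

    A pixel (x, y) in a W x H image maps to (H - 1 - y, x) in the
    rotated H x W image, so only the original height is needed.
    """
    x1, y1, x2, y2 = box
    return (image_height - 1 - y2, x1, image_height - 1 - y1, x2)
```

Applying hflip_box twice returns the original box, and two successive rot90_box calls (passing the new image height the second time) are equivalent to a 180 degree rotation.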

TODO list

  • support batch size greater than 1 image (on branch batch)
  • support Rotated R-CNN for rotated bounding box (on branch r-frcnn)
  • support OHEM (see r-fcn)


This repository uses C++11 features, so make sure to use a compiler that supports C++11.

Tested on CUDA 8.0/9.2, CuDNN 7.0, NCCLv1#286916a.

Tested with GCC v5.4.0/7.3.1; versions lower than v5 are not supported. Python 2.7 is used for the Python scripts.

cp Makefile.config.example Makefile.config
# edit Makefile.config to suit your system
# if you would like to use VisualDL to log losses, set USE_VISUALDL to 1,
# and run: cd src/logger && make
make -j7
# extra: 'py' builds the Python interface of Caffe.
# extra: 'pyfrcnn' builds the Python wrapper of the C++ API. You can use this for the demo.
make pyfrcnn py

Run all of the following steps from the $CAFFE_ROOT directory.

Faster R-CNN


The official Faster R-CNN code for the NIPS 2015 paper (written in MATLAB) is available here. It is worth noting that:

  • This repository contains a C++ reimplementation of the Python code (py-faster-rcnn), which is built on Caffe.
  • This repository uses code from caffe-faster-rcnn commit 8ba1d26 as its base framework.


Running sh examples/FRCNN/ will process the five pictures in examples/FRCNN/images and put the results into examples/FRCNN/results.

Note: You should place the trained caffemodel in models/FRCNN, for example ZF_faster_rcnn_final.caffemodel for the ZF model.

Prepare for training and evaluation

  • The list of training data is examples/FRCNN/dataset/voc2007.trainval.
  • The list of testing data is examples/FRCNN/dataset/voc2007.trainval.
  • Create a symlink for the PASCAL VOC dataset: ln -s $YOUR_VOCdevkit_Path $CAFFE_ROOT/VOCdevkit

As shown in the VGG example models/FRCNN/vgg16/train_val.proto, the original pictures should be located at $CAFFE_ROOT/VOCdevkit/VOC2007/JPEGImages/. (Check window_data_param in FrcnnRoiData.)

If you want to train Faster R-CNN on your own dataset, you can prepare a custom dataset list. The format is as follows:

# image-id
number of boxes
label x1 y1 x2 y2 difficulty
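
To make the list format above concrete, here is a minimal sketch that writes and parses such records. It assumes the layout exactly as shown (a "# image-id" header line, a box count, then one box per line) and that all box fields are integers. The names Box, Record, format_record and parse_records are hypothetical helpers, not part of this repository.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    label: int
    x1: int
    y1: int
    x2: int
    y2: int
    difficulty: int


class Record(NamedTuple):
    image_id: str
    boxes: List[Box]


def format_record(rec):
    """Serialize one record in the '# image-id / count / boxes' layout."""
    lines = ["# " + rec.image_id, str(len(rec.boxes))]
    for box in rec.boxes:
        lines.append(" ".join(str(v) for v in box))
    return "\n".join(lines) + "\n"


def parse_records(text):
    """Parse a dataset list back into Record objects (blank lines ignored)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    records = []
    i = 0
    while i < len(lines):
        if not lines[i].startswith("#"):
            raise ValueError("expected '# image-id' header, got %r" % lines[i])
        image_id = lines[i][1:].strip()
        count = int(lines[i + 1])
        # Each box line must hold exactly six integer fields.
        boxes = [Box(*map(int, lines[i + 2 + k].split())) for k in range(count)]
        records.append(Record(image_id, boxes))
        i += 2 + count
    return records
```

A round trip such as parse_records(format_record(rec)) should return the original record, which is a quick sanity check before starting a long training run.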


sh examples/FRCNN/zf/ will start training on the voc2007 data using the ZF model.

The ImageNet pre-trained models can be found at this link.

If you use the provided training script, please make sure:

  • VOCdevkit is within $CAFFE_ROOT and VOC2007 is within VOCdevkit
  • the ZF pre-trained model is placed in models/FRCNN/ as ZF.v2.caffemodel

examples/FRCNN/ transforms the parameters of the bbox_pred layer using the mean and std values. The regression targets are normalized during training, so the learned parameters must be rescaled to obtain the final model.
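
The idea behind this conversion can be sketched as follows. If the layer was trained to predict t_norm = (t - mean) / std, then folding the statistics into the parameters (W' = W * std, b' = b * std + mean, per output) makes the layer emit t directly. This is a plain-Python illustration of that arithmetic, not the repository's script; unnormalize_bbox_pred is a hypothetical name.

```python
def unnormalize_bbox_pred(weights, biases, means, stds):
    """Fold target normalization into a fully connected layer.

    weights: list of rows, one row per output unit
    biases, means, stds: one value per output unit
    Returns (new_weights, new_biases) that predict un-normalized targets.
    """
    # Scale every weight in a row by that output's std.
    new_weights = [[w * s for w in row] for row, s in zip(weights, stds)]
    # Scale the bias by std, then shift by the mean.
    new_biases = [b * s + m for b, s, m in zip(biases, stds, means)]
    return new_weights, new_biases
```

After the conversion, the raw layer output already equals output_norm * std + mean, so no extra step is needed at test time.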


sh examples/FRCNN/zf/ will evaluate the performance on the voc2007 test data using the trained ZF model.

  • First step of this script: test all VOC 2007 test images and write the results to a text file.
  • Second step of this script: compare the results with the ground truth file and calculate the mAP.
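
For readers unfamiliar with the metric, the following sketch shows how an average precision (AP) value can be computed for one class with the 11-point interpolation used by the PASCAL VOC 2007 protocol. It assumes each detection has already been marked as a true or false positive by matching it against the ground truth. The function names are hypothetical, and the evaluation code in this repository may differ in detail.

```python
def precision_recall(detections, num_gt):
    """detections: list of (score, is_true_positive); num_gt: number of GT boxes."""
    ranked = sorted(detections, key=lambda d: -d[0])  # highest score first
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / float(num_gt))
        precisions.append(tp / float(tp + fp))
    return recalls, precisions


def voc07_ap(recalls, precisions):
    """11-point interpolated AP: mean of the max precision at recall >= t."""
    ap = 0.0
    for i in range(11):
        t = i / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11.0
    return ap
```

The mAP is then the mean of the per-class AP values.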


The program uses a config file, named something like config.json, to set parameters. Parameters that deserve special attention:

  • data_jitter: data augmentation; if set to a value < 0, jitter, hue, saturation and exposure augmentation are all disabled
  • im_size_align: set to the stride of the last conv layer of FPN (such as 64) to avoid the Deconv shape problem; set to 0 to disable
  • bbox_normalize_targets: normalize bbox targets during training and un-normalize at test time (the model weights then do not need to be converted before testing)
  • test_rpn_score_thresh: set to a value > 0 to speed up NMS at test time
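
To illustrate why im_size_align matters, the sketch below rounds an image side up to a multiple of the alignment and checks whether repeated 2x downsampling followed by 2x upsampling gives matching shapes. Both the round-up rule and the ceil-rounded downsampling are simplifying assumptions for illustration, and the helper names are hypothetical. The repository may align sizes differently.

```python
def align_size(size, align):
    """Round size up to the next multiple of align (align <= 0 disables)."""
    if align <= 0:
        return size
    return ((size + align - 1) // align) * align


def pyramid_sizes(size, levels):
    """Side length at each level, halving with ceil rounding (assumed)."""
    sizes = [size]
    for _ in range(levels):
        sizes.append((sizes[-1] + 1) // 2)
    return sizes


def upsample_matches(size, levels):
    """True if 2x upsampling each level reproduces the level below it."""
    sizes = pyramid_sizes(size, levels)
    return all(sizes[i + 1] * 2 == sizes[i] for i in range(levels))
```

With a total stride of 64 (six halvings), a side of 600 pixels breaks at the 75 to 38 step, because upsampling 38 gives 76. The aligned side of 640 halves cleanly all the way down.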


Scripts and prototxts for different models are located in examples/FRCNN.

More details about the code in the include and src directories:

  • api/FRCNN for demo and test api
  • caffe/FRCNN contains code related to Faster R-CNN
  • caffe/RFCN for R-FCN
  • caffe/DeformConv for Deformable Conv
  • caffe/SSD for SSD
  • examples/YOLO for YOLOv3 inference, including a converter script and demo. Pay attention to how the Upsample layer is used.
  • logger dir relates to the logging tools
  • modules and yaml-cpp relate to Caffe module layers, which include FPN layers, etc.
  • python/frcnn relates to pybind11 interface for demo
  • caffe/ACTION_REC Two-Stream Convolutional Networks for Action Recognition in Video
  • caffe/CTPN relates to CTPN special layers for scene text detection
  • caffe/PR for some layers taken from Caffe pull requests

Commands, Rebase From Caffe Master

To synchronize with official Caffe:

Rebase the dev branch

  • git checkout dev
  • git rebase master
  • git push -f origin dev


  • CUB not found: compiling the GPU version requires the header file <cub/cub.cuh>. CUB is a library included in the official CUDA Toolkit and can usually be found in /usr/local/cuda/include/thrust/system/cuda/detail/. Add this path to your Makefile.config (try locate cub.cuh to find CUB on your system).
  • If you get error: RPC failed; result=22, HTTP code = 0, run git config http.postBuffer 524288000, which increases the Git buffer to 500 MB.
  • Cannot load module layer dynamic library: the program searches for modules first in the environment variable CAFFE_LAYER_PATH, then in the predefined DEFAULT_LAYER_PATH from the Makefile. Try setting CAFFE_LAYER_PATH in your shell script. This can also happen when using pycaffe.
  • About R-FCN: class-agnostic mode is currently not supported (although it is easy to modify), and OHEM brings very little improvement in joint training. Also remember to set bg_thresh_lo to 0 when using OHEM.

License and Citation

Caffe is released under the BSD 2-Clause license. The BAIR/BVLC reference models are released for unrestricted use.

Please cite the following papers in your publications if it helps your research:

