A Caffe implementation of the MobileNet-SSD detection network, with weights pretrained on VOC0712 (mAP = 0.727).
- Download SSD source code and compile (follow the SSD README).
- Download the pretrained deploy weights from the link above.
- Put all the files in SSD_HOME/examples/
- Run demo.py to show the detection result.
- Optionally, run merge_bn.py to generate a model without batch-normalization (BN) layers; it runs much faster.
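As a rough illustration of why a model without BN layers can be faster: a batch-normalization layer that follows a convolution can be folded into that convolution's weights and bias, so one layer does the work of two at inference time. The sketch below shows the arithmetic for a single output channel in plain Python. It illustrates the general technique only and is not the contents of merge_bn.py; the function names and the eps value are assumptions.

```python
import math


def fold_batchnorm(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold BN parameters into one output channel of a convolution.

    weight: flattened kernel of a single output channel (list of floats)
    bias, mean, var, gamma, beta: per-channel scalars
    eps: numerical-stability constant (assumed value)
    """
    scale = gamma / math.sqrt(var + eps)
    folded_weight = [w * scale for w in weight]
    folded_bias = (bias - mean) * scale + beta
    return folded_weight, folded_bias


def conv_single(weight, bias, patch):
    """Convolution response of one channel at one spatial position."""
    return sum(w * x for w, x in zip(weight, patch)) + bias


def batchnorm(y, mean, var, gamma, beta, eps=1e-5):
    """Inference-time batch normalization followed by scale and shift."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta
```

Applying `conv_single` with the folded parameters should give the same value as `conv_single` followed by `batchnorm` with the original ones, up to floating-point error.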
Create LMDB for your own dataset
- Place the Images directory and the Labels directory in the same parent directory. (Each image in the Images folder must have exactly one label file in the Labels folder with the same base name.)
- Modify the labelmap.prototxt file according to your classes.
- Modify the paths and directories in create_list.sh and create_data.sh as described in the comments in those files.
- Run bash create_list.sh, which creates trainval.txt, test.txt and test_name_size.txt.
- Run bash create_data.sh, which generates the LMDB in the Dataset directory.
- Delete trainval.txt, test.txt and test_name_size.txt before creating the next LMDB.
- The LMDB creation scripts are taken from https://github.com/jinfagang/kitti-ssd
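Before running the scripts above, it can help to check that every image really has a label file with the same name. The following is a minimal sketch that compares the two directories by file name without extension. The helper name `find_unpaired` is hypothetical, and matching by stem is an assumption, since the label file extension depends on your dataset.

```python
from pathlib import Path


def find_unpaired(images_dir, labels_dir):
    """Return (images without a label, labels without an image).

    Files are matched by stem, i.e. the file name without its extension.
    """
    image_stems = {p.stem for p in Path(images_dir).iterdir() if p.is_file()}
    label_stems = {p.stem for p in Path(labels_dir).iterdir() if p.is_file()}
    missing_labels = sorted(image_stems - label_stems)
    missing_images = sorted(label_stems - image_stems)
    return missing_labels, missing_images
```

For example, `find_unpaired("Images", "Labels")` returns two empty lists when the directories are consistent.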
Train your own dataset
- Convert your own dataset to LMDB (follow the SSD README), and create symlinks to it in the current directory.
ln -s PATH_TO_YOUR_TRAIN_LMDB trainval_lmdb
ln -s PATH_TO_YOUR_TEST_LMDB test_lmdb
- Create the labelmap.prototxt file and put it into current directory.
- Use gen_model.sh to generate your own training prototxt.
- Download the training weights from the link above and run train.sh. After about 30000 iterations, the loss should be between 1.5 and 2.5.
- Run test.sh to evaluate the result.
- Run merge_bn.py to generate your own no-bn caffemodel if necessary.
python merge_bn.py --model example/MobileNetSSD_deploy.prototxt --weights snapshot/mobilenet_iter_xxxxxx.caffemodel
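For the labelmap.prototxt step in the list above, the file can be generated from a list of class names instead of being written by hand. This is a sketch only: the `item { name, label, display_name }` layout and the background entry at label 0 follow the VOC labelmap shipped with the SSD Caffe code as far as I know, so compare the output with that file before training. The function name `make_labelmap` is hypothetical.

```python
def make_labelmap(class_names):
    """Build labelmap text; label 0 is reserved for the background class."""
    entries = [("none_of_the_above", 0, "background")]
    entries += [(name, i, name) for i, name in enumerate(class_names, start=1)]
    blocks = []
    for name, label, display_name in entries:
        blocks.append(
            "item {\n"
            f'  name: "{name}"\n'
            f"  label: {label}\n"
            f'  display_name: "{display_name}"\n'
            "}\n"
        )
    return "".join(blocks)
```

Writing the result of `make_labelmap(["car", "person"])` to labelmap.prototxt gives three entries: background, car and person.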
About some details
There are two primary differences between this model and MobileNet-SSD on TensorFlow:
- The ReLU6 layer is replaced by ReLU.
- For the conv11_mbox_prior layer, the anchors are [(0.2, 1.0), (0.2, 2.0), (0.2, 0.5)], whereas TensorFlow uses [(0.1, 1.0), (0.2, 2.0), (0.2, 0.5)].
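To make the anchor difference concrete, the sketch below converts each pair into a normalized box width and height. It assumes each tuple is (scale, aspect ratio) and uses the common SSD convention width = scale * sqrt(ratio), height = scale / sqrt(ratio); the text above does not spell out either, so treat both as assumptions.

```python
import math


def anchor_size(scale, aspect_ratio):
    """Normalized (width, height) of an anchor, assuming (scale, aspect ratio)."""
    root = math.sqrt(aspect_ratio)
    return scale * root, scale / root


caffe_anchors = [(0.2, 1.0), (0.2, 2.0), (0.2, 0.5)]
tensorflow_anchors = [(0.1, 1.0), (0.2, 2.0), (0.2, 0.5)]

for name, anchors in [("caffe", caffe_anchors), ("tensorflow", tensorflow_anchors)]:
    for scale, ratio in anchors:
        w, h = anchor_size(scale, ratio)
        print(f"{name}: scale={scale} ratio={ratio} -> w={w:.3f} h={h:.3f}")
```

Under these assumptions, only the first (square) anchor differs: its side is 0.2 of the image here and 0.1 in the TensorFlow version.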
Reproduce the result
I trained this model from a MobileNet classifier (caffemodel and prototxt) converted from TensorFlow. I first trained the model on MS-COCO and then fine-tuned it on VOC0712. Without MS-COCO pretraining, it only reaches mAP = 0.68.
You can run it on Android with my other project, rscnn.