This repo contains some simple code that I wrote to learn the basics of object detection. SSD (Single Shot MultiBox Detector) is a fairly simple but powerful model to get started with, so I tried to implement it myself, hoping to gain more insight into object detection. It is amazing that, with deep learning and very little code, machines can detect objects in the real world. I have tried to make this reimplementation more readable, with clear code. I hope this repo helps people who want to learn object detection but find it hard to get started.
- train.py
- voc_dataset.py
- eval.py
- lib
  - augmentations.py
  - model.py
  - ssd_loss.py
  - multibox_endoder.py
  - utils.py
  - voc_eval.py
- config.py
- demo.ipynb
- Install PyTorch. I recommend Anaconda as your package manager; with it you can install PyTorch with, for example:
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
- Download VOC2007 trainval, VOC2012 trainval, and the VOC2007 test set, extract them, and put them in one folder.
If you are on a Linux machine, simply run
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtrainval_11-May-2012.tar
tar xvf VOCtest_06-Nov-2007.tar
The directory structure should look like
~/VOCdevkit/
-- VOC2007
-- VOC2012
~/VOCdevkit is then your VOC root.
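Before training, it can help to confirm that the VOC root has the layout shown above. The following is a minimal sketch, not part of this repo: the helper name `missing_voc_dirs` is hypothetical, and it only checks that the two year folders exist under the root.

```python
from pathlib import Path


def missing_voc_dirs(voc_root, years=("VOC2007", "VOC2012")):
    """Return the expected year folders that are missing under voc_root.

    Hypothetical helper for illustration; it does not inspect the
    contents of the folders, only that they exist.
    """
    root = Path(voc_root).expanduser()  # expand a leading "~"
    return [year for year in years if not (root / year).is_dir()]


# Example: check the VOC root used in this README.
missing = missing_voc_dirs("~/VOCdevkit")
print("missing folders:", missing if missing else "none")
```

An empty list means both VOC2007 and VOC2012 were found under the given root.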
- To train SSD you need pretrained VGG weights as the starting point for the base network. Download them from https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth and put them in the weights folder.
mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
- Edit config.py (for example with vim) to change the learning rate, batch size, and so on. This is optional; the one setting you do need to check is VOC_ROOT: set it to the VOC root where you put your VOC data.
- A simple command is all you need
python train.py
or
nohup python -u train.py &
watch -n 1 tail nohup.out
#ctrl+c to quit!
- Questions:
- I have a GPU. How do I use it? The code detects it automatically and uses cuda:0 by default; otherwise it falls back to the CPU.
- I get an OOM (out of memory) error. Edit config.py and reduce the batch size.
- I get a NaN loss value. Your learning rate may be too large; try a lower learning rate.
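The device selection described in the first question can be sketched as follows. This is an illustration of the usual PyTorch pattern, not a copy of this repo's code; the helper name `pick_device_name` is hypothetical, and the `ImportError` fallback exists only so the snippet runs without PyTorch installed.

```python
def pick_device_name(cuda_available):
    """Return "cuda:0" when CUDA is available, otherwise "cpu"."""
    return "cuda:0" if cuda_available else "cpu"


try:
    import torch  # PyTorch, installed in the first step

    device = torch.device(pick_device_name(torch.cuda.is_available()))
except ImportError:
    # Fallback so the sketch still runs where PyTorch is not installed.
    device = pick_device_name(False)

print("using device:", device)
```

A model and its input tensors would then be moved to this device with `.to(device)` before training.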
I have not tested it on the VOC dataset, because I reimplemented it only for learning purposes, but a Jupyter notebook is provided so you can see the results. Download the pretrained weights from https://drive.google.com/drive/folders/1XN-CXifL-2xilx9y8sb3Qmog_sbzW0k-?usp=sharing or use your own weights, then run
jupyter notebook
then go to localhost:8888 by default to see the demo.
I now provide code to evaluate on the VOC2007 test set. It uses Detectron's voc_eval.py to calculate mAP. To evaluate your model, just run
python eval.py --model=weights/loss-1220.37.pth --save_folder=result
The mAP result will be printed to your screen.
- Things to note
--model is the model checkpoint to evaluate. After running the script, an annotations_cache folder and a result folder (set by --save_folder) will appear in the workspace. The result folder contains the predictions for each class.
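For readers new to the metric, here is a minimal sketch of how a per-class average precision can be computed with the 11-point interpolation used by the original VOC2007 protocol. It is an assumption that this repo's evaluation uses the 11-point variant; the function name `voc07_ap` and the toy inputs are hypothetical and are not taken from voc_eval.py. mAP is then the mean of the per-class AP values.

```python
def voc07_ap(recall, precision):
    """11-point interpolated average precision (VOC2007 style).

    recall and precision are equal-length sequences, one entry per
    ranked detection. Hypothetical helper for illustration only.
    """
    ap = 0.0
    for i in range(11):
        threshold = i / 10  # 0.0, 0.1, ..., 1.0
        # Best precision among points whose recall reaches the threshold.
        best = max(
            (p for r, p in zip(recall, precision) if r >= threshold),
            default=0.0,
        )
        ap += best / 11
    return ap


# Toy example: two detections, the first correct, the second not.
print(round(voc07_ap([0.5, 0.5], [1.0, 0.5]), 4))
```

In the toy example, recall never exceeds 0.5, so only the six thresholds from 0.0 to 0.5 contribute, each with precision 1.0.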
Implementation | mAP |
---|---|
original paper | 0.772 |
this repo (evaluated with unofficial voc_eval code) | 0.73-0.75 |
- Wei Liu, et al. "SSD: Single Shot MultiBox Detector." ECCV2016.
- The code was mainly inspired by those two repos; thanks to their authors for sharing such elegant work.