yolo v2 vgg16 pytorch

A re-implementation of YOLO v2 detection using the torchvision VGG16-BN model.

Introduction

A PyTorch implementation of the VGG16 version of YOLO v2, as described in the paper YOLO9000: Better, Faster, Stronger by Joseph Redmon and Ali Farhadi. The goal of this repo is to re-implement a well-known one-stage object detector, YOLO v2, using torchvision models.

Requirements

Python 3.7
PyTorch 1.5.0
visdom
numpy
cv2 (opencv-python)
matplotlib

Quick start detection

1. Download the model weights from here and move them into ./saves

2. Prepare the PATH of the directory that contains the images you want to detect -- PATH (e.g. 'D:\image_path')

3. Check the TYPE (file extension) of the images -- TYPE (e.g. 'jpg')

4. Enter the command: python demo.py --demo_img_path PATH --demo_img_type TYPE

Results

Method	Training dataset	Testing dataset	Resolution	mAP	FPS
Original paper	VOC2007 train + VOC2012 train	VOC2007 test	416 x 416	76.8	67
Ours	VOC2007 train + VOC2012 train	VOC2007 test	416 x 416	77.03	58

Detection results on VOC 2007.

Implementation

Dataset

First, create a dataset directory structure like the one below for VOC train and test.

The TRAIN directory holds the VOCtrainval data and the TEST directory holds the VOCtest data.

root
    |-- TEST
        |-- VOC2007
            |-- Annotations
            |-- ImageSets
            |-- JPEGImages
            |-- SegmentationClass
            |-- SegmentationObject
    |-- TRAIN
        |-- VOC2007
            |-- Annotations
            |-- ImageSets
            |-- JPEGImages
            |-- SegmentationClass
            |-- SegmentationObject
        |-- VOC2012
            |-- Annotations
            |-- ImageSets
            |-- JPEGImages
            |-- SegmentationClass
            |-- SegmentationObject

For training, we used the VOC2007 trainval + VOC2012 trainval datasets.

For testing, we used the VOC2007 test dataset.
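Before training, it can help to confirm that the directory layout above is in place. The following is a minimal sketch using only the standard library; the function name `missing_dirs` and the two constants are hypothetical and are not part of this repo.

```python
from pathlib import Path

# Layout taken from the tree above: split -> VOC years.
EXPECTED_LAYOUT = {"TEST": ["VOC2007"], "TRAIN": ["VOC2007", "VOC2012"]}
EXPECTED_SUBDIRS = [
    "Annotations",
    "ImageSets",
    "JPEGImages",
    "SegmentationClass",
    "SegmentationObject",
]


def missing_dirs(root):
    """Return the expected sub-directories (relative, POSIX style) that do not exist."""
    root = Path(root)
    missing = []
    for split, years in EXPECTED_LAYOUT.items():
        for year in years:
            for sub in EXPECTED_SUBDIRS:
                candidate = root / split / year / sub
                if not candidate.is_dir():
                    missing.append(candidate.relative_to(root).as_posix())
    return missing


if __name__ == "__main__":
    # The path below is the documented default of --data_path.
    for path in missing_dirs(r"D:\Data\VOC_ROOT"):
        print("missing:", path)
```

An empty list means every directory shown in the tree was found under the given root.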

Model

Unlike the original YOLO v2, the backbone is VGG16 instead of Darknet-19, and the modules after the backbone have been slightly modified.

Loss

01- What is the cell concept?

YOLO treats the final-layer feature map as a grid of cells, so the feature map size is the size of the cell grid.

For example, a 416 x 416 input image becomes a 13 x 13 grid of cells.
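The arithmetic behind that example can be sketched as follows. The stride of 32 is inferred from 416 / 13 and is an assumption about the network's total downsampling; the function names are hypothetical.

```python
def grid_size(image_size, stride=32):
    """Number of cells along one side of the final feature map."""
    if image_size % stride != 0:
        raise ValueError("image_size must be a multiple of the stride")
    return image_size // stride


def cell_of_point(x, y, stride=32):
    """(column, row) index of the cell that contains pixel (x, y)."""
    return int(x // stride), int(y // stride)


print(grid_size(416))               # 13
print(cell_of_point(200.0, 100.0))  # (6, 3)
```

Each cell therefore covers a 32 x 32 pixel patch of the input, and an object is handled by the cell that contains its center.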

02- make_target

make_target assigns gt bboxes to anchors.

An anchor is positive if iou(bbox, anchor) > 0.5.

For positive anchors, xy_gt is the position of the gt_bbox center inside its cell, scaled to the range 0 to 1.

Also for positive anchors, wh_gt is the ratio of the gt_bbox size to the anchor box size.

gt_conf is the max iou(pred_bbox, gt_bbox) for each anchor in each cell.

no conf is 1 - gt_conf
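The assignment rules above can be illustrated with a small pure-Python sketch. It is not the repo's make_target: boxes are given as (cx, cy, w, h) in cell units, each anchor is assumed to sit at the center of the gt box's cell, and the anchor sizes in the example are made-up values chosen for illustration. The confidence targets (gt_conf, no conf) depend on predictions and are left out.

```python
def iou_cxcywh(a, b):
    """IoU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def encode_target(gt_box, anchors, iou_thres=0.5):
    """Return one target per positive anchor for a single gt box."""
    cx, cy, w, h = gt_box
    col, row = int(cx), int(cy)          # cell that contains the box center
    xy_gt = (cx - col, cy - row)         # center offset inside the cell, in [0, 1)
    targets = []
    for k, (aw, ah) in enumerate(anchors):
        anchor_box = (col + 0.5, row + 0.5, aw, ah)  # assumption: anchor at cell center
        if iou_cxcywh(gt_box, anchor_box) > iou_thres:
            targets.append({
                "cell": (col, row),
                "anchor": k,
                "xy_gt": xy_gt,
                "wh_gt": (w / aw, h / ah),  # ratio of gt size to anchor size
            })
    return targets


# Hypothetical anchors (w, h) in cell units, for illustration only.
example_anchors = [(1.0, 2.0), (3.0, 4.0), (8.0, 8.0)]
example_targets = encode_target((6.3, 3.6, 3.0, 4.2), example_anchors)
print(example_targets)
```

In this example only the second anchor (index 1) exceeds the 0.5 IoU threshold, so it is the only positive anchor for the box.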

03- whole loss

The whole loss consists of the xy center loss, wh ratio loss, confidence loss, no conf loss, and classification loss. In the original paper, each component is a sum of squared errors (SSE), except the wh ratio loss, which is an SSE computed on square roots (root SSE).
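As a rough illustration of how these five terms combine, here is a pure-Python sketch for a single positive anchor plus a few negative anchors. It is not the code in loss.py: the terms are summed without weights, and pushing the confidence of negative anchors toward 0 is an assumption of this sketch.

```python
import math


def sse(pred, target):
    """Sum of squared errors between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def root_sse(pred, target):
    """SSE computed on square roots, used here for the wh term."""
    return sum((math.sqrt(p) - math.sqrt(t)) ** 2 for p, t in zip(pred, target))


def yolo_v2_loss(pred, target, noobj_conf):
    """Return each loss component and their unweighted sum.

    noobj_conf holds the predicted confidences of negative anchors.
    """
    parts = {
        "xy": sse(pred["xy"], target["xy"]),
        "wh": root_sse(pred["wh"], target["wh"]),
        "conf": (pred["conf"] - target["conf"]) ** 2,
        "no_conf": sum(c ** 2 for c in noobj_conf),
        "cls": sse(pred["cls"], target["cls"]),
    }
    parts["total"] = sum(parts.values())
    return parts


example_pred = {"xy": (0.4, 0.5), "wh": (1.21, 1.0), "conf": 0.8, "cls": (0.9, 0.1)}
example_target = {"xy": (0.3, 0.6), "wh": (1.0, 1.0), "conf": 1.0, "cls": (1.0, 0.0)}
example_loss = yolo_v2_loss(example_pred, example_target, noobj_conf=[0.1, 0.2])
print({name: round(value, 4) for name, value in example_loss.items()})
```

Taking square roots of width and height makes the same absolute error count more for small boxes than for large ones, which is the motivation usually given for that term.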

Train

The optimizer is SGD (weight_decay: 5e-4, momentum: 0.9).

Train until convergence (about 150 epochs).

Learning rate decay:

Epochs	Learning rate
000-099	1e-4
100-149	1e-5
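The schedule in the table is a single step decay, which can be written as a small function. This is a sketch with a hypothetical function name; in PyTorch the same behaviour is typically obtained with torch.optim.lr_scheduler.MultiStepLR using milestones=[100] and gamma=0.1, although whether train.py does it that way is an assumption.

```python
def learning_rate(epoch, base_lr=1e-4, decay_epoch=100, gamma=0.1):
    """Learning rate for a 0-indexed epoch: base_lr before decay_epoch, base_lr * gamma after."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return base_lr * gamma if epoch >= decay_epoch else base_lr


for epoch in (0, 99, 100, 149):
    print(epoch, learning_rate(epoch))
```

When resuming with --start_epoch, the same function gives the rate to restore for the epoch being resumed.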

Evaluation

Evaluation uses the VOC metric, mAP (IoU > 0.5), and is exactly the same as the official Python mAP code: https://github.com/Cartucho/mAP
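For readers unfamiliar with the metric, the sketch below computes average precision for one class with all-point interpolation, a common formulation of VOC AP. That voc_eval.py uses exactly this variant is an assumption, and the function names are hypothetical. Detections are assumed to be sorted by descending confidence and already matched to ground truth at IoU > 0.5; mAP is the mean of this value over classes.

```python
def precision_recall(tp_flags, num_gt):
    """Cumulative precision and recall for detections sorted by descending confidence."""
    precisions, recalls = [], []
    true_positives = 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        true_positives += 1 if is_tp else 0
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)
    return precisions, recalls


def average_precision(recalls, precisions):
    """Area under the precision-recall curve after making precision non-increasing."""
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Replace each precision with the maximum precision at any higher recall.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum the rectangles where recall changes.
    ap = 0.0
    for i in range(1, len(mrec)):
        ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap


# Four detections (True = matched a ground-truth box), four ground-truth boxes.
example_p, example_r = precision_recall([True, True, False, True], num_gt=4)
example_ap = average_precision(example_r, example_p)
print(example_ap)  # 0.6875
```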

Start Guide for Train / Test / Demo

for training

# python main.py 
usage: main.py [-h] [--batch_size] [--conf_thres] 
               [--epochs] [--lr] [--num_workers]
               [--save_file_name] [--save_path] [--data_path]
               [--start_epoch]

  -h, --help            show this help message and exit
  --batch_size          training batch size (the test batch size is always 1)
  --conf_thres          for testing, confidence threshold (default 0.01)
  --epochs              total training epochs (default 200)
  --lr                  initial learning rate (default 1e-4)
  --num_workers         dataset num_workers (default 4)
  --save_file_name      when you run experiments, change save_file_name to distinguish the run from other .pth files.
  --save_path           the path to save the .pth file
  --data_path           data path for training and testing, see Implementation / Dataset (default="D:\Data\VOC_ROOT")
  --num_classes         number of dataset classes (voc: 20, coco: 80) (default=20)
  --dataset_type        which dataset you want to use, VOC or COCO (default='voc')
  --start_epoch         when you resume, set the start epoch.

Before testing or running the demo, you need a trained .pth file (weight params). You can download the 149-epoch weights at https://livecauac-my.sharepoint.com/:u:/g/personal/csm8167_cau_ac_kr/EWshHPoe9-tOgLUtreWDUeEBmMwMXaAA1VT1rniLf_x7Sg?e=0MwAUa

Then create the ./saves directory and place the weights in it.

for testing

# python test.py 
usage: test.py [-h] [--data_path] [--test_epoch] 
               [--save_path] [--save_file_name] [--conf_thres]

  -h, --help            show this help message and exit
  --data_path           for testing, VOC 2007 path (because split == 'TEST') (default="D:\Data\VOC_ROOT")
  --test_epoch          for testing, which epoch's params to load
  --save_path           for testing, params path (default './saves')
  --save_file_name      save_file_name to distinguish other params. (default 'yolo_v2_vgg_16')
  --conf_thres          for testing, confidence threshold; only detections above it are kept (default 0.01)

for demo

# python demo.py 
usage: demo.py [-h] [--demo_img_path] [--demo_img_type] 
               [--vis] [--epoch] [--save_path]
               [--save_file_name] [--conf_thres]

  -h, --help            show this help message and exit
  --demo_img_path       the path that contains the images you want to detect
  --demo_img_type       the type (file extension) of the images you want to detect
  --vis                 whether to visualize (default False)
  --epoch               for demo, which epoch's params to load
  --save_path           for demo, params path (default './saves')
  --save_file_name      save_file_name to distinguish other params. (default 'yolo_v2_vgg_16')
  --conf_thres          for demo, confidence threshold; only detections above it are kept (default 0.35)
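The two default thresholds differ for a reason that a small sketch makes concrete: a low threshold such as 0.01 keeps nearly every detection, which a precision-recall based metric like mAP needs, while 0.35 keeps only confident boxes for a cleaner picture. The detection tuples and the function name below are hypothetical and are not the repo's data format.

```python
def filter_detections(detections, conf_thres):
    """Keep detections whose confidence score is above conf_thres."""
    return [det for det in detections if det[1] > conf_thres]


# Hypothetical detections: (class name, confidence, (x1, y1, x2, y2)).
example_detections = [
    ("dog", 0.90, (48, 60, 210, 300)),
    ("person", 0.20, (220, 40, 330, 310)),
    ("chair", 0.005, (10, 10, 40, 50)),
]

print(len(filter_detections(example_detections, 0.01)))  # 2
print(len(filter_detections(example_detections, 0.35)))  # 1
```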

License

csm-kr/yolo_v2_vgg16_pytorch