YOLOv2 : Object Detection

YOLOv2 is a state-of-the-art object detection neural network. This repository implements YOLOv2 from scratch in TensorFlow (eager execution).

Two versions of the network are available:

Trained on Pascal VOC 2012
Trained on Multiview 3D Hand Pose Dataset

The weights for these can be downloaded from Model checkpoint for Object Detection VOC and Model checkpoint for Hand Detection.

Result

Requirements

Python 3.5+
TensorFlow 1.7.0 and its requirements
NumPy 1.9.0
OpenCV 3.4.0 (the opencv-python pip package should work)
CUDA version 9 (Strongly Recommended)
cuDNN version 7.0 (Recommended)
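
The versions above can be compared against the local environment before training. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `check_requirements` are hypothetical, and the check assumes each package exposes a `__version__` attribute under its usual import name (`tensorflow`, `numpy`, `cv2`).

```python
import importlib

# Versions copied from the requirements list; keys are the usual import names.
LISTED = {"tensorflow": "1.7.0", "numpy": "1.9.0", "cv2": "3.4.0"}


def parse_version(text):
    """Turn '1.7.0' or '3.4.0.12' into a tuple of ints, stopping at any non-numeric suffix."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(listed=LISTED):
    """Return {module name: status string}; a missing module is reported, not raised."""
    report = {}
    for name, wanted in listed.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        found = getattr(module, "__version__", "unknown")
        # Compare only major.minor; this tolerance is an assumption of the sketch.
        same = parse_version(found)[:2] == parse_version(wanted)[:2]
        note = "matches" if same else "differs from"
        report[name] = "{} ({} listed {})".format(found, note, wanted)
    return report
```

Calling `check_requirements()` and printing the returned dictionary gives one status line per package. Comparing only the major and minor version is a choice made here for illustration; the README itself lists exact versions.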

Datasets

The datasets used to train the network are listed below:

Name of Dataset	No. of Images	No. of Classes
Pascal VOC 2012	17125	20
Multiview 3D Hand Pose Dataset	Around 700K	1

Network Architecture

	Layer	Filters	Size	Input Layer	Input Dimension	Output Dimension
1	Conv 2D	32	3 x 3 / 1	Input	416 x 416 x 3	416 x 416 x 32
1	Max Pool		2 x 2 / 2	1 Conv 2D	416 x 416 x 32	208 x 208 x 32
2	Conv 2D	64	3 x 3 / 1	1 Max Pool	208 x 208 x 32	208 x 208 x 64
2	Max Pool		2 x 2 / 2	2 Conv 2D	208 x 208 x 64	104 x 104 x 64
3	Conv 2D	128	3 x 3 / 1	2 Max Pool	104 x 104 x 64	104 x 104 x 128
4	Conv 2D	64	3 x 3 / 1	3 Conv 2D	104 x 104 x 128	104 x 104 x 64
5	Conv 2D	128	3 x 3 / 1	4 Conv 2D	104 x 104 x 64	104 x 104 x 128
5	Max Pool		2 x 2 / 2	5 Conv 2D	104 x 104 x 128	52 x 52 x 128
6	Conv 2D	256	3 x 3 / 1	5 Max Pool	52 x 52 x 128	52 x 52 x 256
7	Conv 2D	128	3 x 3 / 1	6 Conv 2D	52 x 52 x 256	52 x 52 x 128
8	Conv 2D	256	3 x 3 / 1	7 Conv 2D	52 x 52 x 128	52 x 52 x 256
8	Max Pool		2 x 2 / 2	8 Conv 2D	52 x 52 x 256	26 x 26 x 256
9	Conv 2D	512	3 x 3 / 1	8 Max Pool	26 x 26 x 256	26 x 26 x 512
10	Conv 2D	256	3 x 3 / 1	9 Conv 2D	26 x 26 x 512	26 x 26 x 256
11	Conv 2D	512	3 x 3 / 1	10 Conv 2D	26 x 26 x 256	26 x 26 x 512
12	Conv 2D	256	3 x 3 / 1	11 Conv 2D	26 x 26 x 512	26 x 26 x 256
13	Conv 2D	512	3 x 3 / 1	12 Conv 2D	26 x 26 x 256	26 x 26 x 512
13	Max Pool		2 x 2 / 2	13 Conv 2D	26 x 26 x 512	13 x 13 x 512
14	Conv 2D	1024	3 x 3 / 1	13 Max Pool	13 x 13 x 512	13 x 13 x 1024
15	Conv 2D	512	3 x 3 / 1	14 Conv 2D	13 x 13 x 1024	13 x 13 x 512
16	Conv 2D	1024	3 x 3 / 1	15 Conv 2D	13 x 13 x 512	13 x 13 x 1024
17	Conv 2D	512	3 x 3 / 1	16 Conv 2D	13 x 13 x 1024	13 x 13 x 512
18	Conv 2D	1024	3 x 3 / 1	17 Conv 2D	13 x 13 x 512	13 x 13 x 1024
19	Conv 2D	1024	3 x 3 / 1	18 Conv 2D	13 x 13 x 1024	13 x 13 x 1024
20	Conv 2D	1024	3 x 3 / 1	19 Conv 2D	13 x 13 x 1024	13 x 13 x 1024
21	Conv 2D	64	3 x 3 / 1	13 Conv 2D	26 x 26 x 512	26 x 26 x 64
22	Conv 2D	1024	3 x 3 / 1	20 + 21	13 x 13 x (1024 + 4*64)	13 x 13 x 1024
23	Conv 2D	125 = 5 x (1+4+20)	3 x 3 / 1	22 Conv 2D	13 x 13 x 1024	13 x 13 x 125
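
The shapes in the last three rows can be checked with a few lines of NumPy. This is a sketch under assumptions, not the repository's code: it assumes the 26 x 26 x 64 output of layer 21 is folded onto the 13 x 13 grid with a 2 x 2 space-to-depth rearrangement (which gives the 4*64 channels in the table), and that the 125 output channels are grouped per anchor as 1 objectness score, 4 box coordinates, and 20 class scores. The function name `space_to_depth` and the channel ordering are illustrative.

```python
import numpy as np


def space_to_depth(x, block=2):
    """Rearrange an (H, W, C) array into (H/block, W/block, C*block*block)."""
    h, w, c = x.shape
    assert h % block == 0 and w % block == 0
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (H/b, W/b, b, b, C)
    return x.reshape(h // block, w // block, block * block * c)


# Layer 21 output (26 x 26 x 64) folded down to the 13 x 13 grid.
passthrough = space_to_depth(np.zeros((26, 26, 64), dtype=np.float32))
# Layer 20 output.
deep = np.zeros((13, 13, 1024), dtype=np.float32)
# Input to layer 22: channel-wise concatenation of the two feature maps.
merged = np.concatenate([deep, passthrough], axis=-1)
print(merged.shape)  # (13, 13, 1280), i.e. 1024 + 4*64

# Layer 23 output: 5 anchors per cell, each with 1 + 4 + 20 values (assumed order).
NUM_ANCHORS, NUM_CLASSES = 5, 20
head = np.zeros((13, 13, NUM_ANCHORS * (1 + 4 + NUM_CLASSES)), dtype=np.float32)
per_anchor = head.reshape(13, 13, NUM_ANCHORS, 1 + 4 + NUM_CLASSES)
print(per_anchor.shape)  # (13, 13, 5, 25)
```

For the single-class hand detection model, the same arithmetic would give 5 x (1+4+1) = 30 output channels, assuming the number of anchors is unchanged.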

First time setup

    git clone https://github.com/pmkalshetti/object_detection.git
    cd object_detection
    pip install -r requirements.txt

Model - Generating data, Training and Prediction

Edit constants.py to reflect your dataset (for both training and prediction), then run the following commands to write the TFRecords that train.py loads.

    cd src
    python3 write_data_to_TFRecords.py

Train the model using the following command. This might take a long time.

    python3 train.py

To run prediction, edit constants.py to point to your input data (if you have not done so already) and to the output directory for the network's predictions. Then run inference:

    python3 predict.py

With constants.py left unchanged, this currently works on the dummy data provided.

Results

The mAP achieved with this implementation of YOLO is 0.82.
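
mAP is built on matching predicted boxes to ground-truth boxes by intersection over union (IoU). The README does not state the threshold used, so the sketch below assumes the Pascal VOC convention of 0.5; the function names and the (x_min, y_min, x_max, y_max) box format are illustrative and not taken from the repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes contribute no intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A prediction counts as a match when its IoU reaches the threshold (assumed 0.5)."""
    return iou(pred_box, truth_box) >= threshold
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of 1 and a union of 7, so the prediction would not count as a match at a 0.5 threshold.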

Here are some visual results obtained by the network.

References

Original Paper: YOLO9000: Better, Faster, Stronger
Original authors' webpage for weights and other info: Darknet
A YOLOv2 implementation in Keras: https://github.com/allanzelener/YAD2K
Dataset: Pascal VOC 2012
Dataset: Multiview 3D Hand Pose Dataset
