YOLOv1 to v4 implemented in TensorFlow 2 (tf.keras).

samson6460/tf2_YOLO

tf2_YOLO

Train your own YOLO model in your own code in 6 steps:

1⃣ Create a YOLO object (v1 to v4).

2⃣ Read files (labelme style or labelimg style).

3⃣ Get anchor boxes.

4⃣ Create a model (with a DarkNet or ResNet backbone, among others).

5⃣ Compile the model.

6⃣ Train the model!

tf2_YOLO is my implementation of YOLOv1 to YOLOv4 in TensorFlow 2.x (tf.keras), written after studying the four YOLO papers:

YOLOv1: You Only Look Once: Unified, Real-Time Object Detection by Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi (https://arxiv.org/abs/1506.02640).

YOLOv2 (YOLO9000): Better, Faster, Stronger by Joseph Redmon, Ali Farhadi (https://arxiv.org/abs/1612.08242).

YOLOv3: An Incremental Improvement by Joseph Redmon, Ali Farhadi (https://arxiv.org/abs/1804.02767).

YOLOv4: Optimal Speed and Accuracy of Object Detection by Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao (https://arxiv.org/abs/2004.10934).

This repo draws on many resources, including the source code of darknet.

Most importantly, the repo is written in Python and TensorFlow, so you can easily modify any part of it to suit your own project.

Installation

  1. Clone or download

    • Use the command below in a terminal to clone the repository:
      git clone https://github.com/samson6460/tf2_YOLO.git

    • Or download all the files using the [Code > Download ZIP] button in the upper right corner.

  2. Install dependent packages: pip install -r requirements.txt

Sample applications

Raccoon detection

result

Open sample code in Colab

dataset from: https://github.com/datitran/raccoon_dataset

Blood cell detection

result

Open sample code in Colab

dataset from: https://github.com/Shenggan/BCCD_Dataset.git

Usage

Each version of YOLO is independent, so you can copy just the parts you need into your own project directory.

Then follow the documentation below to train and evaluate your YOLO model.

1. Create YOLO class

YOLOv1

from tf2_YOLO import yolov1_5
yolo = yolov1_5.Yolo(input_shape, class_names)

YOLOv2

from tf2_YOLO import yolov2
yolo = yolov2.Yolo(input_shape, class_names)

YOLOv3

from tf2_YOLO import yolov3
yolo = yolov3.Yolo(input_shape, class_names)

YOLOv4

from tf2_YOLO import yolov4
yolo = yolov4.Yolo(input_shape, class_names)
  • input_shape: A tuple of 3 integers, shape of input image.
  • class_names: A list, containing all label names.

2. Read file

Read as arrays (loaded into RAM)

img, label = yolo.read_file_to_dataset(
    img_path, 
    label_path)

or

Read as a tf.keras.utils.Sequence

seq = yolo.read_file_to_sequence(
    img_path, 
    label_path,
    batch_size)
  • img_path: A string, file path of images.

  • label_path: A string, file path of annotations.

  • batch_size: An integer, size of the batches of data (default: 20).

Returns from YOLOv1, YOLOv2

A tuple of 2 ndarrays, (img, label),

  • shape of img: (batch_size, img_heights, img_widths, channels)
  • shape of label: (batch_size, grid_heights, grid_widths, info)

Returns from YOLOv3, YOLOv4

A tuple: (img: ndarray, label_list: list), where label_list contains one label array per FPN layer.

  • shape of img: (batch_size, img_heights, img_widths, channels)
  • shape of each label in label_list: (batch_size, grid_heights, grid_widths, info)
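
As a rough illustration of the grid label layout, the sketch below builds the label of a single image as nested lists and collects the cells that contain an object. Using channel 4 as the objectness flag and channels 2:4 as width and height follows the slicing used in the anchor-box step of this README. The rest of the layout (x, y in channels 0:2, one-hot classes after channel 4) and the helper names empty_label and object_cells are assumptions made for illustration; they are not part of tf2_YOLO.

```python
def empty_label(grid_h, grid_w, n_classes):
    """Build a (grid_h, grid_w, info) label filled with zeros.

    Assumed layout per cell: x, y, w, h, objectness, one-hot classes.
    """
    info = 5 + n_classes
    return [[[0.0] * info for _ in range(grid_w)] for _ in range(grid_h)]


def object_cells(label):
    """Return (row, col, w, h) for every cell whose objectness flag is 1."""
    found = []
    for row, cells in enumerate(label):
        for col, cell in enumerate(cells):
            if cell[4] == 1:
                found.append((row, col, cell[2], cell[3]))
    return found


# One image on a 2x2 grid with two classes and a single object in cell (1, 0).
label = empty_label(2, 2, n_classes=2)
label[1][0] = [0.5, 0.5, 0.3, 0.4, 1.0, 0.0, 1.0]
cells = object_cells(label)
```

With real data the same selection is done on the whole batch at once with boolean indexing on the ndarray.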

3. Visualize one of images with its annotation

YOLOv1, YOLOv2

yolo.vis_img(img[0], label[0])

YOLOv3, YOLOv4

yolo.vis_img(img[0], label[2][0])

4. Get anchor boxes for YOLOv2 to YOLOv4

YOLOv2

from utils.kmeans import kmeans, iou_dist
import numpy as np

all_boxes = label[label[..., 4] == 1][..., 2:4]
anchors = kmeans(
    all_boxes,
    n_cluster=5,
    dist_func=iou_dist,
    stop_dist=0.00001)
anchors = np.sort(anchors, axis=0)[::-1]

YOLOv3, YOLOv4

from utils.kmeans import kmeans, iou_dist
import numpy as np

all_boxes = label[-1][label[-1][..., 4] == 1][..., 2:4]
anchors = kmeans(
    all_boxes,
    n_cluster=9,
    dist_func=iou_dist,
    stop_dist=0.00001)
anchors = np.sort(anchors, axis=0)[::-1]
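
To make the idea behind this step more concrete, here is a minimal pure-Python sketch of k-means clustering on (width, height) pairs with 1 - IoU as the distance, the approach introduced in the YOLOv2 paper. It is not the code in utils.kmeans: the function names iou_wh and kmeans_anchors, the deterministic initialisation, and the stopping rule (assignments no longer change) are assumptions chosen to keep the example short and reproducible.

```python
def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, n_cluster, max_iter=100):
    """Cluster (width, height) pairs using 1 - IoU as the distance."""
    # Deterministic start: centroids evenly spaced over the boxes sorted by area.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / n_cluster
    centroids = [ordered[int(step * (i + 0.5))] for i in range(n_cluster)]
    assignment = None
    for _ in range(max_iter):
        # Assign every box to the centroid with the smallest 1 - IoU distance.
        new_assignment = [
            min(range(n_cluster), key=lambda c: 1.0 - iou_wh(box, centroids[c]))
            for box in boxes
        ]
        if new_assignment == assignment:
            break  # converged: no box changed cluster
        assignment = new_assignment
        # Move each centroid to the mean width and height of its members.
        for c in range(n_cluster):
            members = [b for b, a in zip(boxes, assignment) if a == c]
            if members:
                centroids[c] = (
                    sum(b[0] for b in members) / len(members),
                    sum(b[1] for b in members) / len(members),
                )
    # Largest anchors first, keeping each (width, height) pair together.
    return sorted(centroids, key=lambda a: a[0] * a[1], reverse=True)


# Toy data: three small boxes and three large boxes, normalized to 0-1.
boxes = [(0.1, 0.1), (0.12, 0.08), (0.09, 0.11),
         (0.5, 0.6), (0.55, 0.58), (0.48, 0.62)]
anchors = kmeans_anchors(boxes, n_cluster=2)
```

IoU is used instead of Euclidean distance so that large boxes do not dominate the clustering simply because their absolute errors are larger.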

5. Create model

YOLOv1

model = yolo.create_model(bbox_num)
  • bbox_num: An integer, the number of bounding boxes.

YOLOv2, YOLOv3, YOLOv4

model = yolo.create_model(anchors)
  • anchors: 2D array-like, prior anchor boxes (widths, heights); all values should be normalized to 0-1.
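
If your anchors are expressed in pixels, they need to be scaled before being passed in. The sketch below shows one way to do that, assuming input_shape is ordered (height, width, channels); that ordering and the helper name normalize_anchors are assumptions for illustration, not part of tf2_YOLO.

```python
def normalize_anchors(anchors_px, input_shape):
    """Scale (width, height) anchors in pixels to the 0-1 range.

    input_shape is assumed to be (height, width, channels).
    """
    height, width = input_shape[0], input_shape[1]
    return [(w / width, h / height) for w, h in anchors_px]


# Two pixel-space anchors for a hypothetical 416x416 RGB input.
anchors = normalize_anchors([(208, 104), (52, 416)], (416, 416, 3))
```

Anchors produced by the k-means step above are already normalized when the labels themselves store normalized widths and heights.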

6. Compile model

YOLOv1, YOLOv2

from utils.tools import get_class_weight
from tensorflow.keras.optimizers import Adam

binary_weight = get_class_weight(
    label[..., 4:5],
    method='binary'
    )

loss = yolo.loss(binary_weight)

metrics = yolo.metrics("obj+iou+recall0.5")

yolo.model.compile(optimizer=Adam(learning_rate=1e-4),
                   loss=loss,
                   metrics=metrics)

YOLOv3, YOLOv4

from utils.tools import get_class_weight
from tensorflow.keras.optimizers import Adam

# one binary weight per FPN layer label
binary_weight_list = [
    get_class_weight(label[i][..., 4:5], method='binary')
    for i in range(len(label))
]

loss = yolo.loss(binary_weight_list)

metrics = yolo.metrics("obj+iou+recall0.5")

yolo.model.compile(optimizer=Adam(learning_rate=1e-4),
                   loss=loss,
                   metrics=metrics
                   )
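
The binary weight is computed from the objectness channel, presumably to offset the imbalance between the many empty grid cells and the few cells that contain an object. The formula used by get_class_weight is not shown in this README, so the sketch below uses a common "balanced" heuristic, n_samples / (n_classes * count), purely as an illustration. The helper balanced_binary_weight is hypothetical and may differ from what the repo computes.

```python
def balanced_binary_weight(objectness):
    """Weight empty (0) and object (1) cells inversely to their frequency.

    Uses the common heuristic n_samples / (n_classes * count) with two
    classes. This is an assumption, not the repo's get_class_weight.
    """
    n_samples = len(objectness)
    n_pos = sum(1 for v in objectness if v == 1)
    n_neg = n_samples - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both empty and object cells are required")
    return {0: n_samples / (2 * n_neg), 1: n_samples / (2 * n_pos)}


# 100 grid cells of which only 5 contain an object.
cells = [0] * 95 + [1] * 5
weights = balanced_binary_weight(cells)
```

Here the rare object cells receive a much larger weight than the empty cells, which keeps the loss from being dominated by background.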

7. Train model

Train with dataset

# data is the image array read in step 2
yolo.model.fit(
    data,
    label,
    epochs=epochs)

Train with tf.keras.utils.Sequence

yolo.model.fit(
    seq,
    epochs=epochs)

8. Predict and Evaluate

YOLOv1, YOLOv2

from utils.measurement import create_score_mat

prediction = yolo.model.predict(data)

# visualize one image with its annotation
yolo.vis_img(
    data[0], prediction[0],
    nms_mode=2)

# keep the returned score matrix so it can be printed
score_mat = create_score_mat(
    label,
    prediction,
    class_names=yolo.class_names,
    nms_mode=2,
    version=1 # or version=2
    )
print(score_mat)

YOLOv3, YOLOv4

from utils.measurement import create_score_mat

prediction = yolo.model.predict(data)

# visualize one image with its annotation
yolo.vis_img(
    data[0],
    prediction[2][0],
    prediction[1][0],
    prediction[0][0],
    nms_mode=2)

# keep the returned score matrix so it can be printed
score_mat = create_score_mat(
    label[-1],
    prediction[2],
    prediction[1],
    prediction[0],
    class_names=yolo.class_names,
    nms_mode=2,
    version=3)
print(score_mat)
  • nms_mode: An integer,
    • 0: Do not use NMS.
    • 1: Use NMS.
    • 2: Use Soft-NMS.
    • 3: Use DIoU-NMS.
  • version: An integer, specifying the decode method, yolov1, 2, 3 or 4.
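
To show how two of the nms_mode options differ, the sketch below implements classic greedy NMS and Gaussian Soft-NMS in plain Python on boxes given as (x1, y1, x2, y2). It is a simplified illustration, not the repo's implementation: the function names, the default thresholds, and the choice of the Gaussian variant of Soft-NMS are all assumptions.

```python
import math


def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: drop any box overlapping a higher-scoring kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou_xyxy(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of dropping boxes."""
    scores = list(scores)  # work on a copy
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_threshold:
            break  # every remaining score is below the threshold too
        keep.append((best, scores[best]))
        for i in remaining:
            overlap = iou_xyxy(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return keep


# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept_hard = nms(boxes, scores)
kept_soft = soft_nms(boxes, scores)
```

Greedy NMS discards the second box outright, while Soft-NMS keeps it with a reduced score, which can help when objects genuinely overlap. DIoU-NMS additionally penalises the distance between box centres when deciding what to suppress.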

9. Get PR curve and mAP

YOLOv1, YOLOv2

from utils.measurement import PR_func

pr = PR_func(
    label,
    prediction,
    class_names=yolo.class_names,
    max_per_img=100,
    version=1 # or version=2
    )

pr.plot_pr_curve(smooth=False)

pr.get_map(mode="voc2012")

YOLOv3, YOLOv4

from utils.measurement import PR_func

pr = PR_func(
    label[-1],
    prediction[2],
    prediction[1],
    prediction[0],
    class_names=yolo.class_names,
    max_per_img=100,
    version=3
    )

pr.plot_pr_curve(smooth=False)

pr.get_map(mode="voc2012")
  • max_per_img: An integer, the maximum number of objects that can be detected in one image.
  • version: An integer, specifying the decode method, yolov1, 2, 3 or 4.
  • smooth: A boolean, if True, use interpolated precision.
  • mode: A string, one of "voc2007", "voc2012"(default), "area", "smootharea".
    • "voc2007": calculate the average precision of recalls at [0, 0.1, ..., 1](11 points).
    • "voc2012": calculate the average precision of recalls at [0, 0.14, 0.29, 0.43, 0.57, 0.71, 1].
    • "area": calculate the area under precision-recall curve.
    • "smootharea": calculate the area under interpolated precision-recall curve.
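
As an illustration of two of these modes, the sketch below computes an 11-point average precision (as described for "voc2007") and the area under the interpolated precision-recall curve (as described for "smootharea") from plain lists of recall and precision values. The function names ap_voc2007 and ap_smooth_area are hypothetical, and the details may differ from what PR_func does internally.

```python
def ap_voc2007(recalls, precisions):
    """11-point AP: mean of the best precision at recall >= 0.0, 0.1, ..., 1.0."""
    total = 0.0
    for i in range(11):
        threshold = i / 10
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / 11


def ap_smooth_area(recalls, precisions):
    """Area under the interpolated (non-increasing) precision-recall curve."""
    points = sorted(zip(recalls, precisions))
    # Interpolate: each precision becomes the best precision at any higher recall.
    interpolated = []
    best = 0.0
    for _, p in reversed(points):
        best = max(best, p)
        interpolated.append(best)
    interpolated.reverse()
    # Sum the rectangles between consecutive recall values.
    area, prev_recall = 0.0, 0.0
    for (r, _), p in zip(points, interpolated):
        area += (r - prev_recall) * p
        prev_recall = r
    return area


# A tiny precision-recall curve with two operating points.
recalls = [0.5, 1.0]
precisions = [1.0, 0.5]
ap_11_point = ap_voc2007(recalls, precisions)
ap_area = ap_smooth_area(recalls, precisions)
```

mAP is then the mean of the per-class AP values.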