# YOLOv7 training and inference notebook

Reference: [Official YOLOv7 GitHub repository](https://github.com/WongKinYiu/yolov7)

## Imports

In [None]:
import os
import glob
import matplotlib.pyplot as plt
import cv2
import random
import numpy as np

In [None]:
!pwd

## Prepare the Dataset

Dataset for the Summer 2022 competition. Source: [HBRS Bib cloud](https://bib-cloud.bib.hochschule-bonn-rhein-sieg.de/apps/files/?dir=/Shared/b-it-bots-ds/atwork/images/object_detection/YOLO/internal_robocup_2022/FULL_DATASET_SS22_COMPETITION&fileid=14231157) (requires HBRS library login credentials).

The dataset is structured in the following manner:

```
├── dataset_ss22_v4_yolov7.yaml
├── README.md
└── dataset_ss22_v4_yolov7
    ├── images
    │   ├── train
    │   └── valid
    └── labels
        ├── train
        └── valid
```
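
Before training, it can help to confirm that every image has a label file with the same stem. The following is a minimal sketch that assumes the layout shown above; `find_unlabeled_images` is a hypothetical helper (not part of YOLOv7), and the set of image extensions is an assumption.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unlabeled_images(dataset_root, split):
    """Return sorted image file names in `split` that have no matching .txt label."""
    image_dir = Path(dataset_root) / "images" / split
    label_dir = Path(dataset_root) / "labels" / split
    # Label files are matched to images by file stem (name without extension).
    label_stems = {path.stem for path in label_dir.glob("*.txt")}
    return sorted(
        path.name
        for path in image_dir.iterdir()
        if path.suffix.lower() in IMAGE_EXTENSIONS and path.stem not in label_stems
    )
```

Calling `find_unlabeled_images('dataset_ss22_v4_yolov7', 'train')` (and again with `'valid'`) should return an empty list when every image has a label file.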

## The Dataset YAML File

The dataset YAML file (`dataset_ss22_v4_yolov7.yaml`) contains the paths to the training and validation images, along with the number of classes and the class names of the dataset.

The dataset contains 20 classes.

The following block shows the contents of the `dataset_ss22_v4_yolov7.yaml` file.

```yaml
train: ../dataset_ss22_v4_yolov7/images/train 
val: ../dataset_ss22_v4_yolov7/images/valid

nc: 20

names: ['F20_20_B', 'R20', 'S40_40_B', 'S40_40_G', 'axis', 'bearing_box', 'bracket', 'brown_box', 'cup', 'dishwasher_soap', 'eye_glasses', 'insulation_tape', 'motor', 'pringles', 'screw_driver', 'sponge', 'spoon', 'tennis_ball', 'toothbrush', 'towel']
```
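
A mismatch between `nc` and the length of `names` is an easy mistake to make when classes are added or removed. The sketch below checks a config dictionary for that; `check_dataset_config` is a hypothetical helper, and the dictionary is written out by hand here to keep the example dependency-free (in practice it would presumably come from a YAML parser such as PyYAML's `yaml.safe_load`).

```python
def check_dataset_config(config):
    """Return a list of human-readable problems found in a dataset config dict."""
    problems = [f"missing key: {key}"
                for key in ("train", "val", "nc", "names")
                if key not in config]
    if problems:
        return problems
    names = config["names"]
    if config["nc"] != len(names):
        problems.append(f"nc is {config['nc']} but names has {len(names)} entries")
    if len(set(names)) != len(names):
        problems.append("names contains duplicates")
    return problems


# Same values as in dataset_ss22_v4_yolov7.yaml above.
config = {
    "train": "../dataset_ss22_v4_yolov7/images/train",
    "val": "../dataset_ss22_v4_yolov7/images/valid",
    "nc": 20,
    "names": ['F20_20_B', 'R20', 'S40_40_B', 'S40_40_G', 'axis', 'bearing_box',
              'bracket', 'brown_box', 'cup', 'dishwasher_soap', 'eye_glasses',
              'insulation_tape', 'motor', 'pringles', 'screw_driver', 'sponge',
              'spoon', 'tennis_ball', 'toothbrush', 'towel'],
}
print(check_dataset_config(config))  # []
```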

## Visualize a Few Ground Truth Images

In YOLO format, each box is stored as [x_center, y_center, width, height], with all four values normalized by the image width and height.


```
A------------------------
-------------------------
-------------------------
-------------------------
-------------------------
------------------------B
```

In bounding-box (corner) format, a box is given by its top-left corner A [x_min, y_min] and its bottom-right corner B [x_max, y_max].
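
As a worked example of the two formats, the sketch below converts a normalized YOLO box to pixel corners and back. The function names and the 640x480 image size are illustrative assumptions, not part of the dataset or of YOLOv7.

```python
def yolo_to_pixel_corners(box, image_width, image_height):
    """Convert normalized [x_center, y_center, width, height] to pixel corners."""
    x_c, y_c, w, h = box
    x_min = (x_c - w / 2) * image_width
    y_min = (y_c - h / 2) * image_height
    x_max = (x_c + w / 2) * image_width
    y_max = (y_c + h / 2) * image_height
    return x_min, y_min, x_max, y_max


def pixel_corners_to_yolo(corners, image_width, image_height):
    """Convert pixel (x_min, y_min, x_max, y_max) back to normalized YOLO format."""
    x_min, y_min, x_max, y_max = corners
    return [
        (x_min + x_max) / 2 / image_width,
        (y_min + y_max) / 2 / image_height,
        (x_max - x_min) / image_width,
        (y_max - y_min) / image_height,
    ]


# A box centered in a 640x480 image, a quarter as wide and half as tall as the image.
corners = yolo_to_pixel_corners([0.5, 0.5, 0.25, 0.5], 640, 480)
print(corners)                                   # (240.0, 120.0, 400.0, 360.0)
print(pixel_corners_to_yolo(corners, 640, 480))  # [0.5, 0.5, 0.25, 0.5]
```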


Visualize 4 random samples from the dataset ([reference](https://www.youtube.com/watch?v=Ciy1J97dbY0&ab_channel=LearnOpenCV)).

In [None]:
class_names = ['F20_20_B', 'R20', 'S40_40_B', 'S40_40_G', 'axis', 'bearing_box', 'bracket', 'brown_box', 'cup', 
               'dishwasher_soap', 'eye_glasses', 'insulation_tape', 'motor', 'pringles', 'screw_driver', 'sponge', 
               'spoon', 'tennis_ball', 'toothbrush', 'towel']
colors = np.random.uniform(0, 255, size=(len(class_names), 3))

In [None]:
# Function to convert bounding boxes in YOLO format to xmin, ymin, xmax, ymax.
def yolo2bbox(bboxes):
    xmin, ymin = bboxes[0]-bboxes[2]/2, bboxes[1]-bboxes[3]/2
    xmax, ymax = bboxes[0]+bboxes[2]/2, bboxes[1]+bboxes[3]/2
    return xmin, ymin, xmax, ymax

In [None]:
def plot_box(image, bboxes, labels):
    # Need the image height and width to denormalize
    # the bounding box coordinates
    h, w, _ = image.shape
    for box_num, box in enumerate(bboxes):
        x1, y1, x2, y2 = yolo2bbox(box)
        # denormalize the coordinates
        xmin = int(x1*w)
        ymin = int(y1*h)
        xmax = int(x2*w)
        ymax = int(y2*h)

        class_name = class_names[int(labels[box_num])]

        cv2.rectangle(
            image,
            (xmin, ymin), (xmax, ymax),
            color=colors[class_names.index(class_name)],
            thickness=2
        )

        # Fixed font scale and thickness for the class label
        font_scale = 1
        font_thickness = 2

        p1 = (xmin, ymin)
        # Text width and height
        tw, th = cv2.getTextSize(
            class_name,
            cv2.FONT_HERSHEY_SIMPLEX, fontScale=font_scale, thickness=font_thickness
        )[0]
        # Opposite corner of the filled label background, drawn above the box
        p2 = p1[0] + tw, p1[1] - th - 10
        cv2.rectangle(
            image,
            p1, p2,
            color=colors[class_names.index(class_name)],
            thickness=-1,
        )
        cv2.putText(
            image,
            class_name,
            (xmin+1, ymin-10),
            cv2.FONT_HERSHEY_SIMPLEX,
            font_scale,
            (255, 255, 255),
            font_thickness
        )
    return image

In [None]:
# Function to plot images with the bounding boxes.
def plot(image_paths, label_paths, num_samples):
    all_training_images = glob.glob(image_paths)
    all_training_labels = glob.glob(label_paths)
    all_training_images.sort()
    all_training_labels.sort()

    num_images = len(all_training_images)

    plt.figure(figsize=(15, 12))
    for i in range(num_samples):
        j = random.randint(0, num_images-1)
        # j = 0
        image = cv2.imread(all_training_images[j])
        with open(all_training_labels[j], 'r') as f:
            bboxes = []
            labels = []
            label_lines = f.readlines()
            for label_line in label_lines:
                # Each line is "<class_id> <x_center> <y_center> <width> <height>".
                # split() with no arguments also drops the trailing newline.
                label, x_c, y_c, w, h = label_line.split()
                bboxes.append([float(x_c), float(y_c), float(w), float(h)])
                labels.append(label)
        result_image = plot_box(image, bboxes, labels)
        plt.subplot(2, 2, i+1)
        plt.imshow(result_image[:, :, ::-1])
        plt.axis('off')
    plt.subplots_adjust(wspace=0)
    plt.tight_layout()
    plt.show()
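
The label parsing inside `plot` assumes that every line has exactly five space-separated fields. A slightly more defensive variant, sketched below, skips blank lines and reports malformed ones; `parse_yolo_labels` is a hypothetical helper and returns class ids as integers rather than strings.

```python
def parse_yolo_labels(lines):
    """Parse YOLO label lines into (labels, bboxes), skipping blank lines."""
    labels, bboxes = [], []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # tolerate empty lines, e.g. at the end of the file
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
        labels.append(int(fields[0]))
        bboxes.append([float(value) for value in fields[1:]])
    return labels, bboxes


print(parse_yolo_labels(["3 0.5 0.5 0.25 0.5\n", "\n"]))
# ([3], [[0.5, 0.5, 0.25, 0.5]])
```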


In [None]:
# Visualize a few training images.
plot(
    image_paths='dataset_ss22_v4_yolov7/images/train/*', 
    label_paths='dataset_ss22_v4_yolov7/labels/train/*',
    num_samples=4,
)

# plot(
#     image_paths='dataset_ss22_inference/train/images/*', 
#     label_paths='dataset_ss22_inference/train/labels/*',
#     num_samples=4,
# )

## Clone YOLOV7 Repository

In [None]:
if not os.path.exists('yolov7'):
    !git clone https://github.com/WongKinYiu/yolov7

In [None]:
# Change to yoloV7 directory
%cd yolov7

##### Download pretrained weights (if not available)

!wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt

##### **Monitor TensorBoard logs**

**NOTE**: TensorBoard logs can be viewed at this [local port link](http://10.20.118.78:31025/#scalars&runSelectionState=eyJ5b2xvNS90cmFpbi9yZXN1bHRzXzEiOmZhbHNlLCJ5b2xvNS90cmFpbi9yZXN1bHRzXzIiOmZhbHNlLCJ5b2xvNS90cmFpbi9yZXN1bHRzXzMiOmZhbHNlLCJ5b2xvNS90cmFpbi9yZXN1bHRzXzQiOmZhbHNlLCJ5b2xvNS90cmFpbi9yZXN1bHRzXzUiOmZhbHNlLCJ5b2xvNS90cmFpbi9yZXN1bHRzXzgiOmZhbHNlLCJ5b2xvNS90cmFpbi9yZXN1bHRzXzgyIjpmYWxzZSwieW9sbzUvdHJhaW4vcmVzdWx0c18xNCI6ZmFsc2UsInlvbG81L3RyYWluL3Jlc3VsdHNfMTMiOmZhbHNlLCJ5b2xvNS90cmFpbi9yZXN1bHRzXzEyIjpmYWxzZSwieW9sbzUvdHJhaW4vcmVzdWx0c18xMSI6ZmFsc2V9), which is reachable only from the HBRS University network.


## Training using YOLOV7

In [None]:
# set arguments for training Yolov7

TRAIN = True
FREEZE = True # freezing first 15 layers
MULT_GPU = True
GPU_IDs = [0,1] # GPU device ids (default is 0)
EPOCHS = 5
BATCH_SIZE = 256
RESULT_DIR = os.path.expanduser('~') + '/public/logs/yolo7' # set training result directory path
WEIGHTS = 'yolov7_training.pt' # pretrained model weights
HYP_PARAM = '../config_yolov7/hyp.ss22_local_competition.yaml' # hyperparameter yaml file
DATASET_YAML = '../dataset_ss22_v4_yolov7.yaml' # dataset yaml file
CFG_YAML = '../config_yolov7/yolov7_ss22.yaml' # set num. of classes and network architecture
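
The training cells further down turn `GPU_IDs` into the comma-separated string expected by `--device`. The sketch below shows an equivalent, more direct way to build that string, plus a sanity check on the batch size. Both helpers are hypothetical, and the check assumes that `--batch-size` is the total across all GPUs in distributed mode, which is worth confirming in `train.py`.

```python
def device_argument(gpu_ids):
    """Format GPU ids as a comma-separated string, e.g. [0, 1] -> '0,1'."""
    return ",".join(str(gpu_id) for gpu_id in gpu_ids)


def per_gpu_batch_size(total_batch_size, gpu_ids):
    """Split the total batch size evenly across GPUs; raise if it does not divide."""
    if total_batch_size % len(gpu_ids) != 0:
        raise ValueError(
            f"batch size {total_batch_size} is not a multiple of {len(gpu_ids)} GPUs"
        )
    return total_batch_size // len(gpu_ids)


print(device_argument([0, 1]))          # 0,1
print(per_gpu_batch_size(256, [0, 1]))  # 128
```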

## Helper Functions for Logging

The helper function below picks the directory in which each training run stores its results, so that successive runs are kept apart.


In [None]:
def set_res_dir():
    # Directory to store results
    res_dir_count = len(glob.glob(RESULT_DIR + '/train/*'))
    print(f"Current number of result directories: {res_dir_count}")
    if TRAIN:
        RES_DIR = f"{RESULT_DIR}/train/results_{res_dir_count+1}"
        print(RES_DIR)
    else:
        RES_DIR = f"{RESULT_DIR}/train/results_{res_dir_count}"
    return RES_DIR
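
Because `set_res_dir` derives the next index from the number of existing directories, it can return a name that already exists once an earlier result directory has been deleted. A possible alternative, sketched below, takes the largest existing index instead; `next_result_dir` is a hypothetical helper and assumes directories are named `results_<n>`.

```python
import re
from pathlib import Path


def next_result_dir(result_root, prefix="results_"):
    """Return <result_root>/train/<prefix><n+1>, where n is the largest existing index."""
    train_dir = Path(result_root) / "train"
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    indices = []
    if train_dir.is_dir():
        for entry in train_dir.iterdir():
            match = pattern.match(entry.name)
            if match and entry.is_dir():
                indices.append(int(match.group(1)))
    # With no existing result directories, numbering starts at 1.
    return str(train_dir / f"{prefix}{max(indices, default=0) + 1}")
```

With `results_1` and `results_3` present, this returns `.../train/results_4`, whereas counting the two directories would yield `results_3` again.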

In [None]:
if TRAIN:   
    if FREEZE:
        RES_DIR = set_res_dir()
        
        if not MULT_GPU:
            # training on a single GPU
            !python train.py \
                    --batch-size {BATCH_SIZE} \
                    --nosave \
                    --data {DATASET_YAML} \
                    --cfg {CFG_YAML} \
                    --weights {WEIGHTS} \
                    --hyp {HYP_PARAM} \
                    --epochs {EPOCHS} \
                    --name {RES_DIR} \
                    --device {GPU_IDs[0]} \
                    --img 640 640 \
                    --freeze 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

        else:

            # training on multiple GPUs
            !python -m torch.distributed.launch --nproc_per_node {len(GPU_IDs)} train.py \
                    --batch-size {BATCH_SIZE} \
                    --sync-bn \
                    --nosave \
                    --data {DATASET_YAML} \
                    --cfg {CFG_YAML} \
                    --weights {WEIGHTS} \
                    --hyp {HYP_PARAM} \
                    --epochs {EPOCHS} \
                    --name {RES_DIR} \
                    --device {str(GPU_IDs).replace('[','').replace(']','').replace(' ', '')} \
                    --img 640 640 \
                    --freeze 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
    
    else:
        RES_DIR = set_res_dir()
        # training all layers of model
        if not MULT_GPU:
            # training on a single GPU
            !python train.py \
                    --batch-size {BATCH_SIZE} \
                    --nosave \
                    --data {DATASET_YAML} \
                    --cfg {CFG_YAML} \
                    --weights {WEIGHTS} \
                    --hyp {HYP_PARAM} \
                    --epochs {EPOCHS} \
                    --name {RES_DIR} \
                    --device {GPU_IDs[0]} \
                    --img 640 640

        else:
            # training on multiple GPUs
            !python -m torch.distributed.launch --nproc_per_node {len(GPU_IDs)} train.py \
                    --batch-size {BATCH_SIZE} \
                    --sync-bn \
                    --nosave \
                    --data {DATASET_YAML} \
                    --cfg {CFG_YAML} \
                    --weights {WEIGHTS} \
                    --hyp {HYP_PARAM} \
                    --epochs {EPOCHS} \
                    --name {RES_DIR} \
                    --device {str(GPU_IDs).replace('[','').replace(']','').replace(' ', '')} \
                    --img 640 640


else:
    # set the RES_DIR number
    res_dir_count = '1' 
    RES_DIR = f"{RESULT_DIR}/train/results_{res_dir_count}"
    print("Set RES_DIR to: ", RES_DIR)
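
The four training commands above differ only in the launcher, the `--sync-bn` flag, the `--device` value, and the `--freeze` list. One way to avoid the duplication is to assemble the argument list in a single function, as sketched below. `build_train_command` is a hypothetical helper that uses only the flags already shown in this notebook; unlike the cell above, it picks the multi-GPU launcher from the number of GPU ids rather than from `MULT_GPU`.

```python
def build_train_command(batch_size, data, cfg, weights, hyp, epochs, name,
                        gpu_ids, freeze_layers=0):
    """Build the train.py command line as a list of arguments."""
    if len(gpu_ids) > 1:
        # Multi-GPU: distributed launcher plus synchronized batch norm.
        command = ["python", "-m", "torch.distributed.launch",
                   "--nproc_per_node", str(len(gpu_ids)), "train.py", "--sync-bn"]
    else:
        command = ["python", "train.py"]
    command += [
        "--batch-size", str(batch_size),
        "--nosave",
        "--data", data,
        "--cfg", cfg,
        "--weights", weights,
        "--hyp", hyp,
        "--epochs", str(epochs),
        "--name", name,
        "--device", ",".join(str(gpu_id) for gpu_id in gpu_ids),
        "--img", "640", "640",
    ]
    if freeze_layers > 0:
        # Freeze layers 0 .. freeze_layers-1, e.g. 15 -> "0 1 ... 14".
        command += ["--freeze"] + [str(index) for index in range(freeze_layers)]
    return command
```

The resulting list could then be passed to `subprocess.run(command, check=True)` from inside the `yolov7` directory, for example with `freeze_layers=15 if FREEZE else 0`.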


## Check Out the Validation Predictions and Inference

### Visualization and Inference Utilities

In [None]:
# Function to show validation predictions saved during training.
def show_valid_results(RES_DIR):
    !ls {RES_DIR}
    EXP_PATH = f"{RES_DIR}"
    validation_pred_images = glob.glob(f"{EXP_PATH}/*_pred.jpg") # TODO: detect all image ext files
    print(validation_pred_images)
    for pred_image in validation_pred_images:
        image = cv2.imread(pred_image)
        plt.figure(figsize=(19, 16))
        plt.imshow(image[:, :, ::-1])
        plt.axis('off')
        plt.show()
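
For the TODO in `show_valid_results` (matching prediction images with any image extension rather than only `.jpg`), one option is to filter directory entries by suffix. The sketch below is a hypothetical helper; the extension list and the `_pred` stem suffix are assumptions taken from the glob pattern above.

```python
from pathlib import Path

# Assumed list of image extensions to accept.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def find_prediction_images(directory, stem_suffix="_pred", extensions=IMAGE_EXTENSIONS):
    """Return sorted image paths in `directory` whose file stem ends with `stem_suffix`."""
    return sorted(
        str(path)
        for path in Path(directory).iterdir()
        if path.is_file()
        and path.suffix.lower() in extensions
        and path.stem.endswith(stem_suffix)
    )
```

Inside `show_valid_results`, `find_prediction_images(RES_DIR)` could stand in for the `glob.glob` call.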

The following functions are for carrying out inference on images and videos.

In [None]:
# Helper function for inference on images.
def inference(RES_DIR, data_path):
    # Directory to store inference results.
    infer_dir_count = len(glob.glob(RESULT_DIR + '/detect/*'))
    print(f"Current number of inference detection directories: {infer_dir_count}")
    INFER_DIR = f"{RESULT_DIR}/detect/inference_{infer_dir_count+1}"
    print(INFER_DIR)
    # Inference on images.
    !python detect.py \
    --weights {RES_DIR}/weights/best.pt \
    --source {data_path} \
    --name {INFER_DIR} \
    --device 0 \
    --conf 0.60 \
    --img-size 640
    
    return INFER_DIR

In [None]:
def visualize(INFER_DIR):
    # Visualize inference images.
    INFER_PATH = f"{INFER_DIR}"
    infer_images = glob.glob(f"{INFER_PATH}/*")
    print(infer_images)
    for pred_image in infer_images:
        image = cv2.imread(pred_image)
        plt.figure(figsize=(19, 16))
        plt.imshow(image[:, :, ::-1])
        plt.axis('off')
        plt.show()

**Visualize validation prediction images.**

In [None]:
show_valid_results(RES_DIR)

## Inference
In this section, we will carry out inference on unseen images and videos from the internet. 

The images for inference are in the `inference_images` directory.

**To carry out inference on images, we just need to provide the directory path where all the images are stored, and inference will happen on all images automatically.**

In [None]:
on_single_image = True

if on_single_image:
    # Inference on single image
    IMAGE_INFER_DIR = inference(RES_DIR, '../inference_images/inference_img01/1562121558.622500193_raw_rgb.jpg')
else:
    # Inference on images.
    IMAGE_INFER_DIR = inference(RES_DIR, '../day3_test_images')


IMAGE_INFER_DIR

In [None]:
# IMAGE_INFER_DIR
visualize(IMAGE_INFER_DIR)

## Export model (.pt) to ONNX model (.onnx)

Note: The exported ONNX model is not yet tested in C++ (TODO)

In [None]:
!pip install onnx

In [None]:
!python export.py \
    --weights {RESULT_DIR}/train/results_1/weights/best.pt \
    --grid \
    --end2end \
    --simplify \
    --topk-all 100 \
    --iou-thres 0.65 \
    --conf-thres 0.35 \
    --img-size 640 640 \
    --max-wh 640