<a href="https://colab.research.google.com/github/Futuremine97/CodingTest/blob/main/yolov1_torch_v2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# YOLOv1 Implementation with PyTorch

- To-Do List
  - mAP
  - Handle Cell Boxes
  - Darknet architecture

In [None]:
from tqdm import tqdm
from torch.utils.data import DataLoader
from collections import Counter

import torch
import torchvision.transforms as transforms
import torch.optim as optim
import torchvision.transforms.functional as FT
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches

## Dataset

### Download Dataset

In [None]:
!gdown https://drive.google.com/uc?id=1MhlW9dHlBR7YYuWAMKs8C0VyKkSXBzOC
!sed -i 's/\r$//' get_data.sh
!wget https://raw.githubusercontent.com/aladdinpersson/Machine-Learning-Collection/master/ML/Pytorch/object_detection/YOLO/data/generate_csv.py

Downloading...
From: https://drive.google.com/uc?id=1MhlW9dHlBR7YYuWAMKs8C0VyKkSXBzOC
To: /content/get_data.sh
100% 1.10k/1.10k [00:00<00:00, 2.03MB/s]
--2022-04-26 08:59:37--  https://raw.githubusercontent.com/aladdinpersson/Machine-Learning-Collection/master/ML/Pytorch/object_detection/YOLO/data/generate_csv.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 763 [text/plain]
Saving to: ‘generate_csv.py’


2022-04-26 08:59:37 (40.6 MB/s) - ‘generate_csv.py’ saved [763/763]



In [None]:
!bash get_data.sh

--2022-04-26 08:59:41--  http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
Resolving host.robots.ox.ac.uk (host.robots.ox.ac.uk)... 129.67.94.152
Connecting to host.robots.ox.ac.uk (host.robots.ox.ac.uk)|129.67.94.152|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 460032000 (439M) [application/x-tar]
Saving to: ‘VOCtrainval_06-Nov-2007.tar’


2022-04-26 08:59:45 (138 MB/s) - ‘VOCtrainval_06-Nov-2007.tar’ saved [460032000/460032000]

--2022-04-26 08:59:45--  http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
Resolving host.robots.ox.ac.uk (host.robots.ox.ac.uk)... 129.67.94.152
Connecting to host.robots.ox.ac.uk (host.robots.ox.ac.uk)|129.67.94.152|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 451020800 (430M) [application/x-tar]
Saving to: ‘VOCtest_06-Nov-2007.tar’


2022-04-26 08:59:48 (138 MB/s) - ‘VOCtest_06-Nov-2007.tar’ saved [451020800/451020800]

--2022-04-26 08:59:48--  http://host

### Define VOCDataset

In [None]:
import torch
import os
import pandas as pd
from PIL import Image


class VOCDataset(torch.utils.data.Dataset):
    def __init__(
        self, csv_file, img_dir, label_dir, S=7, B=2, C=20, transform=None,
    ):
        self.annotations = pd.read_csv(csv_file)
        self.img_dir = img_dir
        self.label_dir = label_dir
        self.transform = transform
        self.S = S
        self.B = B
        self.C = C

    def __len__(self):
        return len(self.annotations)

    def __getitem__(self, index):
        label_path = os.path.join(self.label_dir, self.annotations.iloc[index, 1])
        boxes = []
        with open(label_path) as f:
            for label in f.readlines():
                class_label, x, y, width, height = [
                    float(x) if float(x) != int(float(x)) else int(float(x))
                    for x in label.replace("\n", "").split()
                ]

                boxes.append([class_label, x, y, width, height])

        img_path = os.path.join(self.img_dir, self.annotations.iloc[index, 0])
        image = Image.open(img_path)
        boxes = torch.tensor(boxes)

        if self.transform:
            # image = self.transform(image)
            image, boxes = self.transform(image, boxes)

        # Convert To Cells
        label_matrix = torch.zeros((self.S, self.S, self.C + 5 * self.B))
        for box in boxes:
            class_label, x, y, width, height = box.tolist()
            class_label = int(class_label)

            # i,j represents the cell row and cell column
            i, j = int(self.S * y), int(self.S * x)
            x_cell, y_cell = self.S * x - j, self.S * y - i

            """
            The width and height of the bounding box relative to a cell
            are calculated as follows, with width as the example:

            width_pixels = width * self.image_width
            cell_pixels = self.image_width / self.S

            The width relative to the cell is then
            width_pixels / cell_pixels, which simplifies to the
            formulas below.
            """
            width_cell, height_cell = (
                width * self.S,
                height * self.S,
            )

            # If no object already found for specific cell i,j
            # Note: This means we restrict to ONE object
            # per cell!
            if label_matrix[i, j, self.C] == 0:
                # Set that there exists an object
                label_matrix[i, j, self.C] = 1

                # Box coordinates
                box_coordinates = torch.tensor(
                    [x_cell, y_cell, width_cell, height_cell]
                )

                label_matrix[i, j, self.C + 1 : self.C + 5] = box_coordinates

                # Set one hot encoding for class_label
                label_matrix[i, j, class_label] = 1

        return image, label_matrix
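
To make the cell conversion in `__getitem__` concrete, the sketch below encodes a single normalized box by hand. The helper name `encode_box` and the sample values are illustrative assumptions and are not part of the notebook.

```python
def encode_box(x, y, width, height, S=7):
    """Map an image-relative box to its grid cell and cell-relative values."""
    # Row index comes from y and column index from x, as in VOCDataset
    i, j = int(S * y), int(S * x)
    # Offset of the box centre inside its cell, each in [0, 1)
    x_cell, y_cell = S * x - j, S * y - i
    # Width and height measured in units of one cell (these may exceed 1)
    width_cell, height_cell = width * S, height * S
    return i, j, x_cell, y_cell, width_cell, height_cell


# A box centred in the image that is 20% wide and 40% tall
i, j, x_cell, y_cell, w_cell, h_cell = encode_box(0.5, 0.5, 0.2, 0.4)
print(i, j)  # the centre of a 7x7 grid is cell (3, 3)
```

The width and height are not clipped to 1 because a box may span several cells; only the centre offset is guaranteed to stay inside the cell.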

### Define Transformation Function

In [None]:
class Compose(object):
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img, bboxes):
        for t in self.transforms:
            img, bboxes = t(img), bboxes

        return img, bboxes

## Define Utilities

### IoU

In [None]:
def intersection_over_union(boxes_preds, boxes_labels, box_format="midpoint"):
    """
    Calculates intersection over union
    Parameters:
        boxes_preds (tensor): Predictions of Bounding Boxes (BATCH_SIZE, 4)
        boxes_labels (tensor): Correct labels of Bounding Boxes (BATCH_SIZE, 4)
        box_format (str): midpoint/corners, if boxes (x,y,w,h) or (x1,y1,x2,y2)
    Returns:
        tensor: Intersection over union for all examples
    """

    if box_format == "midpoint":
        box1_x1 = boxes_preds[..., 0:1] - boxes_preds[..., 2:3] / 2
        box1_y1 = boxes_preds[..., 1:2] - boxes_preds[..., 3:4] / 2
        box1_x2 = boxes_preds[..., 0:1] + boxes_preds[..., 2:3] / 2
        box1_y2 = boxes_preds[..., 1:2] + boxes_preds[..., 3:4] / 2
        box2_x1 = boxes_labels[..., 0:1] - boxes_labels[..., 2:3] / 2
        box2_y1 = boxes_labels[..., 1:2] - boxes_labels[..., 3:4] / 2
        box2_x2 = boxes_labels[..., 0:1] + boxes_labels[..., 2:3] / 2
        box2_y2 = boxes_labels[..., 1:2] + boxes_labels[..., 3:4] / 2

    elif box_format == "corners":
        box1_x1 = boxes_preds[..., 0:1]
        box1_y1 = boxes_preds[..., 1:2]
        box1_x2 = boxes_preds[..., 2:3]
        box1_y2 = boxes_preds[..., 3:4]  # (N, 1)
        box2_x1 = boxes_labels[..., 0:1]
        box2_y1 = boxes_labels[..., 1:2]
        box2_x2 = boxes_labels[..., 2:3]
        box2_y2 = boxes_labels[..., 3:4]

    x1 = torch.max(box1_x1, box2_x1)
    y1 = torch.max(box1_y1, box2_y1)
    x2 = torch.min(box1_x2, box2_x2)
    y2 = torch.min(box1_y2, box2_y2)

    # .clamp(0) is for the case when they do not intersect
    intersection = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)

    box1_area = abs((box1_x2 - box1_x1) * (box1_y2 - box1_y1))
    box2_area = abs((box2_x2 - box2_x1) * (box2_y2 - box2_y1))

    return intersection / (box1_area + box2_area - intersection + 1e-6)
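
As a sanity check on the formula, here is a minimal plain-Python sketch of the same computation for two boxes in corners format. The function name `iou_corners` and the sample boxes are assumptions made for illustration; the notebook itself uses the tensor version above.

```python
def iou_corners(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) sequences."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp at zero so that disjoint boxes give an empty intersection
    intersection = max(x2 - x1, 0.0) * max(y2 - y1, 0.0)
    area_a = abs((box_a[2] - box_a[0]) * (box_a[3] - box_a[1]))
    area_b = abs((box_b[2] - box_b[0]) * (box_b[3] - box_b[1]))
    # Same epsilon as the tensor version to avoid division by zero
    return intersection / (area_a + area_b - intersection + 1e-6)


# Two 2x2 squares sharing a 1x1 corner: intersection 1, union 4 + 4 - 1 = 7
overlap = iou_corners((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
# Two squares that do not touch at all
disjoint = iou_corners((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0))
print(overlap, disjoint)
```

The first value is approximately 1/7 and the second is exactly zero, which is the behaviour the `.clamp(0)` call provides in the tensor version.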

### NMS

In [None]:
def non_max_suppression(bboxes, iou_threshold, threshold, box_format="corners"):
    """
    Does Non Max Suppression given bboxes
    Parameters:
        bboxes (list): list of lists containing all bboxes, with each bbox
        specified as [class_pred, prob_score, x1, y1, x2, y2]
        iou_threshold (float): IoU at or above which a lower-scoring bbox of the
        same class is suppressed
        threshold (float): minimum prob_score for a bbox to be kept (independent of IoU)
        box_format (str): "midpoint" or "corners" used to specify bboxes
    Returns:
        list: bboxes after performing NMS given a specific IoU threshold
    """

    assert isinstance(bboxes, list)

    bboxes = [box for box in bboxes if box[1] > threshold]
    bboxes = sorted(bboxes, key=lambda x: x[1], reverse=True)
    bboxes_after_nms = []

    while bboxes:
        chosen_box = bboxes.pop(0)

        bboxes = [
            box
            for box in bboxes
            if box[0] != chosen_box[0]
            or intersection_over_union(
                torch.tensor(chosen_box[2:]),
                torch.tensor(box[2:]),
                box_format=box_format,
            )
            < iou_threshold
        ]

        bboxes_after_nms.append(chosen_box)

    return bboxes_after_nms
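
The greedy procedure can be traced on a handful of boxes. The sketch below is a plain-Python re-statement of the same logic, assuming corners format; `greedy_nms`, `iou_corners`, and the candidate boxes are hypothetical names and values chosen for illustration.

```python
def iou_corners(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    w = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    h = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = w * h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-6)


def greedy_nms(bboxes, iou_threshold, threshold):
    """Each bbox is [class_pred, prob_score, x1, y1, x2, y2]."""
    # Drop low-confidence boxes, then process the rest from best to worst
    bboxes = sorted(
        (b for b in bboxes if b[1] > threshold), key=lambda b: b[1], reverse=True
    )
    kept = []
    while bboxes:
        chosen = bboxes.pop(0)
        # Keep a box if it is another class or does not overlap chosen too much
        bboxes = [
            b
            for b in bboxes
            if b[0] != chosen[0] or iou_corners(chosen[2:], b[2:]) < iou_threshold
        ]
        kept.append(chosen)
    return kept


candidates = [
    [0, 0.9, 0.0, 0.0, 1.0, 1.0],    # strongest box for class 0
    [0, 0.8, 0.05, 0.05, 1.0, 1.0],  # same class, heavy overlap with the first
    [1, 0.7, 0.0, 0.0, 1.0, 1.0],    # same location but a different class
    [0, 0.3, 0.5, 0.5, 0.9, 0.9],    # below the confidence threshold
]
kept = greedy_nms(candidates, iou_threshold=0.5, threshold=0.4)
print([b[:2] for b in kept])
```

Only the 0.9 box of class 0 and the 0.7 box of class 1 survive: suppression is applied per class, so identical locations with different class predictions do not compete.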

### mAP

In [None]:
def mean_average_precision(
    pred_boxes, true_boxes, iou_threshold=0.5, box_format="midpoint", num_classes=20
):
    """
    Calculates mean average precision 
    Parameters:
        pred_boxes (list): list of lists containing all bboxes with each bboxes
        specified as [train_idx, class_prediction, prob_score, x1, y1, x2, y2]
        true_boxes (list): Similar as pred_boxes except all the correct ones 
        iou_threshold (float): threshold where predicted bboxes is correct
        box_format (str): "midpoint" or "corners" used to specify bboxes
        num_classes (int): number of classes
    Returns:
        float: mAP value across all classes given a specific IoU threshold 
    """

    # list storing all AP for respective classes
    average_precisions = []

    # used for numerical stability later on
    epsilon = 1e-6

    for c in range(num_classes):
        detections = []
        ground_truths = []

        # Go through all predictions and targets,
        # and only add the ones that belong to the
        # current class c
        for detection in pred_boxes:
            if detection[1] == c:
                detections.append(detection)

        for true_box in true_boxes:
            if true_box[1] == c:
                ground_truths.append(true_box)

        # find the amount of bboxes for each training example
        # Counter here finds how many ground truth bboxes we get
        # for each training example, so let's say img 0 has 3,
        # img 1 has 5 then we will obtain a dictionary with:
        # amount_bboxes = {0:3, 1:5}
        amount_bboxes = Counter([gt[0] for gt in ground_truths])

        # We then go through each key, val in this dictionary
        # and convert it to the following (w.r.t. the same example):
        # amount_bboxes = {0: torch.tensor([0,0,0]), 1: torch.tensor([0,0,0,0,0])}
        for key, val in amount_bboxes.items():
            amount_bboxes[key] = torch.zeros(val)

        # sort by box probabilities which is index 2
        detections.sort(key=lambda x: x[2], reverse=True)
        TP = torch.zeros((len(detections)))
        FP = torch.zeros((len(detections)))
        total_true_bboxes = len(ground_truths)
        
        # If none exists for this class then we can safely skip
        if total_true_bboxes == 0:
            continue

        for detection_idx, detection in enumerate(detections):
            # Only take out the ground_truths that have the same
            # training idx as detection
            ground_truth_img = [
                bbox for bbox in ground_truths if bbox[0] == detection[0]
            ]

            num_gts = len(ground_truth_img)
            best_iou = 0

            for idx, gt in enumerate(ground_truth_img):
                iou = intersection_over_union(
                    torch.tensor(detection[3:]),
                    torch.tensor(gt[3:]),
                    box_format=box_format,
                )

                if iou > best_iou:
                    best_iou = iou
                    best_gt_idx = idx

            if best_iou > iou_threshold:
                # only detect ground truth detection once
                if amount_bboxes[detection[0]][best_gt_idx] == 0:
                    # true positive and add this bounding box to seen
                    TP[detection_idx] = 1
                    amount_bboxes[detection[0]][best_gt_idx] = 1
                else:
                    FP[detection_idx] = 1

            # if IOU is lower then the detection is a false positive
            else:
                FP[detection_idx] = 1

        TP_cumsum = torch.cumsum(TP, dim=0)
        FP_cumsum = torch.cumsum(FP, dim=0)
        recalls = TP_cumsum / (total_true_bboxes + epsilon)
        precisions = torch.divide(TP_cumsum, (TP_cumsum + FP_cumsum + epsilon))
        precisions = torch.cat((torch.tensor([1]), precisions))
        recalls = torch.cat((torch.tensor([0]), recalls))
        # torch.trapz for numerical integration
        average_precisions.append(torch.trapz(precisions, recalls))

    return sum(average_precisions) / len(average_precisions)
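
The per-class average precision is the area under the precision-recall curve, integrated with the trapezoidal rule. A minimal plain-Python sketch of that last step follows, assuming the detections are already sorted by score and flagged as true or false positives; `average_precision` and the flag sequence are illustrative, not part of the notebook.

```python
from itertools import accumulate


def average_precision(tp_flags, total_true_bboxes, epsilon=1e-6):
    """tp_flags[k] is 1 if the k-th best detection is a true positive, else 0."""
    tp_cumsum = list(accumulate(tp_flags))
    fp_cumsum = list(accumulate(1 - t for t in tp_flags))
    # Prepend the point (recall=0, precision=1), as the notebook does
    recalls = [0.0] + [tp / (total_true_bboxes + epsilon) for tp in tp_cumsum]
    precisions = [1.0] + [
        tp / (tp + fp + epsilon) for tp, fp in zip(tp_cumsum, fp_cumsum)
    ]
    # Trapezoidal rule: interval width times the mean of the two heights
    return sum(
        (recalls[k + 1] - recalls[k]) * (precisions[k] + precisions[k + 1]) / 2
        for k in range(len(recalls) - 1)
    )


# Three detections for a class with two ground-truth boxes: hit, miss, hit
ap = average_precision([1, 0, 1], total_true_bboxes=2)
print(round(ap, 4))
```

For this sequence the curve passes through (0, 1), (0.5, 1), (0.5, 0.5), and (1, 2/3), so the area is 0.5 + 0.5 * (0.5 + 2/3) / 2 = 19/24, up to the small epsilon.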

### Handle Cell Boxes

In [None]:
def convert_cellboxes(predictions, S=7):
    """
    Converts bounding boxes output from Yolo with an image split
    size of S from cell-relative ratios into ratios of the entire
    image. The vectorized version below is hard to read, so either
    treat it as a black box or implement a more intuitive version
    that uses two for loops over range(S) and converts the boxes one
    by one, which is slower but more readable.
    """

    predictions = predictions.to("cpu")
    batch_size = predictions.shape[0]
    predictions = predictions.reshape(batch_size, S, S, 30)
    bboxes1 = predictions[..., 21:25]
    bboxes2 = predictions[..., 26:30]
    scores = torch.cat(
        (predictions[..., 20].unsqueeze(0), predictions[..., 25].unsqueeze(0)), dim=0
    )
    best_box = scores.argmax(0).unsqueeze(-1)
    best_boxes = bboxes1 * (1 - best_box) + best_box * bboxes2
    # cell_indices holds the column index; its permuted copy holds the row index
    cell_indices = torch.arange(S).repeat(batch_size, S, 1).unsqueeze(-1)
    x = 1 / S * (best_boxes[..., :1] + cell_indices)
    y = 1 / S * (best_boxes[..., 1:2] + cell_indices.permute(0, 2, 1, 3))
    w_h = 1 / S * best_boxes[..., 2:4]
    converted_bboxes = torch.cat((x, y, w_h), dim=-1)
    predicted_class = predictions[..., :20].argmax(-1).unsqueeze(-1)
    best_confidence = torch.max(predictions[..., 20], predictions[..., 25]).unsqueeze(
        -1
    )
    converted_preds = torch.cat(
        (predicted_class, best_confidence, converted_bboxes), dim=-1
    )

    return converted_preds


def cellboxes_to_boxes(out, S=7):
    converted_pred = convert_cellboxes(out, S).reshape(out.shape[0], S * S, -1)
    converted_pred[..., 0] = converted_pred[..., 0].long()
    all_bboxes = []

    for ex_idx in range(out.shape[0]):
        bboxes = []

        for bbox_idx in range(S * S):
            bboxes.append([x.item() for x in converted_pred[ex_idx, bbox_idx, :]])
        all_bboxes.append(bboxes)

    return all_bboxes
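
For a single cell, the conversion back to image coordinates reduces to adding the cell index and dividing by S. The sketch below shows that scalar version; `decode_cell_box` and the sample values are assumptions made for illustration, corresponding to the more readable loop-based variant the docstring mentions.

```python
def decode_cell_box(i, j, x_cell, y_cell, width_cell, height_cell, S=7):
    """Convert one cell-relative box in row i, column j to image ratios."""
    # The column index shifts x and the row index shifts y
    x = (x_cell + j) / S
    y = (y_cell + i) / S
    # Width and height are simply rescaled from cell units to image units
    return x, y, width_cell / S, height_cell / S


# A box centred in cell (3, 3) that is 1.4 cells wide and 2.8 cells tall
x, y, w, h = decode_cell_box(3, 3, 0.5, 0.5, 1.4, 2.8)
print(x, y)
```

Applying this to every one of the S * S cells yields the same values as the vectorized `convert_cellboxes`, at the cost of a Python loop.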

### Others

In [None]:
def plot_image(image, boxes):
    """Plots predicted bounding boxes on the image"""
    im = np.array(image)
    height, width, _ = im.shape

    # Create figure and axes
    fig, ax = plt.subplots(1)
    # Display the image
    ax.imshow(im)

    # box[0] is x midpoint, box[2] is width
    # box[1] is y midpoint, box[3] is height

    # Create a Rectangle patch for each box
    for box in boxes:
        box = box[2:]
        assert len(box) == 4, "Expected exactly 4 values (x, y, w, h) per box!"
        upper_left_x = box[0] - box[2] / 2
        upper_left_y = box[1] - box[3] / 2
        rect = patches.Rectangle(
            (upper_left_x * width, upper_left_y * height),
            box[2] * width,
            box[3] * height,
            linewidth=1,
            edgecolor="r",
            facecolor="none",
        )
        # Add the patch to the Axes
        ax.add_patch(rect)

    plt.show()

def save_checkpoint(state, filename="my_checkpoint.pth"):
    print("=> Saving checkpoint")
    torch.save(state, filename)

def load_checkpoint(checkpoint, model, optimizer):
    print("=> Loading checkpoint")
    model.load_state_dict(checkpoint["state_dict"])
    optimizer.load_state_dict(checkpoint["optimizer"])

## Construct Model

### YOLOv1 Model

In [None]:
import torch
import torch.nn as nn

""" 
Information about architecture config:
A tuple is structured as (kernel_size, filters, stride, padding)
"M" is simply maxpooling with stride 2x2 and kernel 2x2
A list holds two conv tuples followed by an int giving the number of repeats
"""

architecture_config = [
    (7, 64, 2, 3),
    "M",
    (3, 192, 1, 1),
    "M",
    (1, 128, 1, 0),
    (3, 256, 1, 1),
    (1, 256, 1, 0),
    (3, 512, 1, 1),
    "M",
    [(1, 256, 1, 0), (3, 512, 1, 1), 4],
    (1, 512, 1, 0),
    (3, 1024, 1, 1),
    "M",
    [(1, 512, 1, 0), (3, 1024, 1, 1), 2],
    (3, 1024, 1, 1),
    (3, 1024, 2, 1),
    (3, 1024, 1, 1),
    (3, 1024, 1, 1),
]


class CNNBlock(nn.Module):
    def __init__(self, in_channels, out_channels, **kwargs):
        super(CNNBlock, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
        self.batchnorm = nn.BatchNorm2d(out_channels)
        self.leakyrelu = nn.LeakyReLU(0.1)

    def forward(self, x):
        return self.leakyrelu(self.batchnorm(self.conv(x)))


class Yolov1(nn.Module):
    def __init__(self, in_channels=3, **kwargs):
        super(Yolov1, self).__init__()
        self.architecture = architecture_config
        self.in_channels = in_channels
        self.darknet = self._create_conv_layers(self.architecture)
        self.fcs = self._create_fcs(**kwargs)

    def forward(self, x):
        x = self.darknet(x)
        return self.fcs(torch.flatten(x, start_dim=1))

    def _create_conv_layers(self, architecture):
        layers = []
        in_channels = self.in_channels

        for x in architecture:
            if type(x) == tuple:
                layers += [
                    CNNBlock(
                        in_channels, x[1], kernel_size=x[0], stride=x[2], padding=x[3],
                    )
                ]
                in_channels = x[1]

            elif type(x) == str:
                layers += [nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))]

            elif type(x) == list:
                conv1 = x[0]
                conv2 = x[1]
                num_repeats = x[2]

                for _ in range(num_repeats):
                    layers += [
                        CNNBlock(
                            in_channels,
                            conv1[1],
                            kernel_size=conv1[0],
                            stride=conv1[2],
                            padding=conv1[3],
                        )
                    ]
                    layers += [
                        CNNBlock(
                            conv1[1],
                            conv2[1],
                            kernel_size=conv2[0],
                            stride=conv2[2],
                            padding=conv2[3],
                        )
                    ]
                    in_channels = conv2[1]

        return nn.Sequential(*layers)

    def _create_fcs(self, split_size, num_boxes, num_classes):
        S, B, C = split_size, num_boxes, num_classes

        # In original paper this should be
        # nn.Linear(1024*S*S, 4096),
        # nn.LeakyReLU(0.1),
        # nn.Linear(4096, S*S*(B*5+C))

        return nn.Sequential(
            nn.Flatten(),
            nn.Linear(1024 * S * S, 4096),
            nn.Dropout(0.0),
            nn.LeakyReLU(0.1),
            nn.Linear(4096, S * S * (C + B * 5)),
        )
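
The first linear layer expects `1024 * S * S` inputs, which only holds if the convolutional stack reduces a 448x448 image to a 7x7 feature map. The sketch below checks this with plain arithmetic, using the standard convolution output-size formula `(size + 2 * padding - kernel) // stride + 1`. The helper `trace_shape` is hypothetical, and the config is repeated here only so that the block runs on its own.

```python
architecture_config = [
    (7, 64, 2, 3), "M",
    (3, 192, 1, 1), "M",
    (1, 128, 1, 0), (3, 256, 1, 1), (1, 256, 1, 0), (3, 512, 1, 1), "M",
    [(1, 256, 1, 0), (3, 512, 1, 1), 4],
    (1, 512, 1, 0), (3, 1024, 1, 1), "M",
    [(1, 512, 1, 0), (3, 1024, 1, 1), 2],
    (3, 1024, 1, 1), (3, 1024, 2, 1), (3, 1024, 1, 1), (3, 1024, 1, 1),
]


def trace_shape(config, size=448, channels=3):
    """Return (spatial size, channels) after applying every layer in config."""

    def conv_out(size, kernel, stride, padding):
        return (size + 2 * padding - kernel) // stride + 1

    for layer in config:
        if isinstance(layer, tuple):
            kernel, filters, stride, padding = layer
            size, channels = conv_out(size, kernel, stride, padding), filters
        elif isinstance(layer, str):
            # 2x2 max pooling with stride 2 halves the spatial size
            size //= 2
        elif isinstance(layer, list):
            conv1, conv2, num_repeats = layer
            for _ in range(num_repeats):
                for kernel, filters, stride, padding in (conv1, conv2):
                    size, channels = conv_out(size, kernel, stride, padding), filters
    return size, channels


size, channels = trace_shape(architecture_config)
print(size, channels, size * size * channels)
```

With the default 448x448 input this gives a 7x7 map with 1024 channels, so the flattened feature vector has 50176 entries. A different input resolution would change `size` and break the `nn.Linear(1024 * S * S, 4096)` layer.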

### Loss Function

In [None]:
import torch
import torch.nn as nn


class YoloLoss(nn.Module):
    """
    Calculate the loss for yolo (v1) model
    """

    def __init__(self, S=7, B=2, C=20):
        super(YoloLoss, self).__init__()
        self.mse = nn.MSELoss(reduction="sum")

        """
        S is split size of image (in paper 7),
        B is number of boxes (in paper 2),
        C is number of classes (in paper and VOC dataset is 20),
        """
        self.S = S
        self.B = B
        self.C = C

        # These are from the Yolo paper and set how much weight the loss
        # gives to cells with no object (noobj) and to box coordinates (coord)
        self.lambda_noobj = 0.5
        self.lambda_coord = 5

    def forward(self, predictions, target):
        # predictions are shaped (BATCH_SIZE, S*S*(C+B*5)) when passed in
        predictions = predictions.reshape(-1, self.S, self.S, self.C + self.B * 5)

        # Calculate IoU for the two predicted bounding boxes with target bbox
        iou_b1 = intersection_over_union(predictions[..., 21:25], target[..., 21:25])
        iou_b2 = intersection_over_union(predictions[..., 26:30], target[..., 21:25])
        ious = torch.cat([iou_b1.unsqueeze(0), iou_b2.unsqueeze(0)], dim=0)

        # Take the box with the highest IoU out of the two predictions
        # Note that bestbox holds indices 0 or 1 for whichever bbox was best
        iou_maxes, bestbox = torch.max(ious, dim=0)
        exists_box = target[..., 20].unsqueeze(3)  # in paper this is Iobj_i

        # ======================== #
        #   FOR BOX COORDINATES    #
        # ======================== #

        # Set boxes with no object in them to 0. We only take out one of the two 
        # predictions, which is the one with highest Iou calculated previously.
        box_predictions = exists_box * (
            (
                bestbox * predictions[..., 26:30]
                + (1 - bestbox) * predictions[..., 21:25]
            )
        )

        box_targets = exists_box * target[..., 21:25]

        # Take sqrt of width and height of boxes so that an error in a
        # small box is penalized more than the same error in a large box
        box_predictions[..., 2:4] = torch.sign(box_predictions[..., 2:4]) * torch.sqrt(
            torch.abs(box_predictions[..., 2:4] + 1e-6)
        )
        box_targets[..., 2:4] = torch.sqrt(box_targets[..., 2:4])

        box_loss = self.mse(
            torch.flatten(box_predictions, end_dim=-2),
            torch.flatten(box_targets, end_dim=-2),
        )

        # ==================== #
        #   FOR OBJECT LOSS    #
        # ==================== #

        # pred_box is the confidence score for the bbox with highest IoU
        pred_box = (
            bestbox * predictions[..., 25:26] + (1 - bestbox) * predictions[..., 20:21]
        )

        object_loss = self.mse(
            torch.flatten(exists_box * pred_box),
            torch.flatten(exists_box * target[..., 20:21]),
        )

        # ======================= #
        #   FOR NO OBJECT LOSS    #
        # ======================= #

        #max_no_obj = torch.max(predictions[..., 20:21], predictions[..., 25:26])
        #no_object_loss = self.mse(
        #    torch.flatten((1 - exists_box) * max_no_obj, start_dim=1),
        #    torch.flatten((1 - exists_box) * target[..., 20:21], start_dim=1),
        #)

        no_object_loss = self.mse(
            torch.flatten((1 - exists_box) * predictions[..., 20:21], start_dim=1),
            torch.flatten((1 - exists_box) * target[..., 20:21], start_dim=1),
        )

        no_object_loss += self.mse(
            torch.flatten((1 - exists_box) * predictions[..., 25:26], start_dim=1),
            torch.flatten((1 - exists_box) * target[..., 20:21], start_dim=1)
        )

        # ================== #
        #   FOR CLASS LOSS   #
        # ================== #

        class_loss = self.mse(
            torch.flatten(exists_box * predictions[..., :20], end_dim=-2,),
            torch.flatten(exists_box * target[..., :20], end_dim=-2,),
        )

        loss = (
            self.lambda_coord * box_loss  # first two rows in paper
            + object_loss  # third row in paper
            + self.lambda_noobj * no_object_loss  # fourth row
            + class_loss  # fifth row
        )

        return loss
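
The effect of the square root on the width and height terms can be seen with two scalar examples. This is a minimal sketch with made-up numbers; `size_error` is a hypothetical helper and not part of `YoloLoss`.

```python
import math


def size_error(pred, target, use_sqrt):
    """Squared error on one box dimension, optionally in sqrt space."""
    if use_sqrt:
        return (math.sqrt(pred) - math.sqrt(target)) ** 2
    return (pred - target) ** 2


# The same absolute error of 0.05 on a small box and on a large box
small_plain = size_error(0.15, 0.10, use_sqrt=False)
large_plain = size_error(0.95, 0.90, use_sqrt=False)
small_sqrt = size_error(0.15, 0.10, use_sqrt=True)
large_sqrt = size_error(0.95, 0.90, use_sqrt=True)

print(small_plain, large_plain)
print(small_sqrt, large_sqrt)
```

Without the square root both errors are essentially equal, while in sqrt space the small box contributes a noticeably larger penalty than the large one.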

## Training and Test

### Set Hyperparameters

In [None]:
seed = 123
torch.manual_seed(seed)

# Hyperparameters etc. 
LEARNING_RATE = 2e-5
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
BATCH_SIZE = 32 # 64
WEIGHT_DECAY = 0
EPOCHS = 130
NUM_WORKERS = 2 # https://jybaek.tistory.com/799, https://stackoverflow.com/a/67503780
PIN_MEMORY = True
LOAD_MODEL = False
LOAD_MODEL_FILE = "yolov1_best.pth"
IMG_DIR = "data/images"
LABEL_DIR = "data/labels"

### Define Training Function

In [None]:
def train_fn(train_loader, model, optimizer, loss_fn, epoch):
    loop = tqdm(train_loader, leave=True, desc=f"Epoch {epoch+1}")
    mean_loss = []

    for batch_idx, (x, y) in enumerate(loop):
        x, y = x.to(DEVICE), y.to(DEVICE)
        out = model(x)
        loss = loss_fn(out, y)
        mean_loss.append(loss.item())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # update progress bar
        loop.set_postfix(loss=loss.item())

    print(f"Mean loss was {sum(mean_loss)/len(mean_loss)}")

### Define Test Function

In [None]:
def test_fn(
    loader,
    model,
    iou_threshold,
    threshold,
    pred_format="cells",
    box_format="midpoint",
    device=DEVICE,
):
    all_pred_boxes = []
    all_true_boxes = []
    total_idx = 0
    for batch_idx, (x, labels) in enumerate(tqdm(loader, desc="Evaluation")):
        x = x.to(device)
        labels = labels.to(device)

        with torch.no_grad():
            predictions = model(x)

        batch_size = x.shape[0]
        true_bboxes = cellboxes_to_boxes(labels)
        bboxes = cellboxes_to_boxes(predictions)

        for idx in range(batch_size):
            nms_boxes = non_max_suppression(
                bboxes[idx],
                iou_threshold=iou_threshold,
                threshold=threshold,
                box_format=box_format,
            )

            #if batch_idx == 0 and idx == 0:
            #    plot_image(x[idx].permute(1,2,0).to("cpu"), nms_boxes)
            #    print(nms_boxes)

            for nms_box in nms_boxes:
                all_pred_boxes.append([total_idx] + nms_box)

            for box in true_bboxes[idx]:
                # many will get converted to 0 pred
                if box[1] > threshold:
                    all_true_boxes.append([total_idx] + box)

            total_idx += 1

    return all_pred_boxes, all_true_boxes

### Load Dataset

In [None]:
transform = Compose([transforms.Resize((448, 448)), transforms.ToTensor(),])

train_dataset = VOCDataset(
    "train.csv",
    transform=transform,
    img_dir=IMG_DIR,
    label_dir=LABEL_DIR,
)

test_dataset = VOCDataset(
    "test.csv", transform=transform, img_dir=IMG_DIR, label_dir=LABEL_DIR,
)

train_loader = DataLoader(
    dataset=train_dataset,
    batch_size=BATCH_SIZE,
    num_workers=NUM_WORKERS,
    pin_memory=PIN_MEMORY,
    shuffle=True,
    drop_last=True,
)

test_loader = DataLoader(
    dataset=test_dataset,
    batch_size=BATCH_SIZE,
    num_workers=NUM_WORKERS,
    pin_memory=PIN_MEMORY,
    shuffle=True,
    drop_last=True,
)

### Training

In [None]:
model = Yolov1(split_size=7, num_boxes=2, num_classes=20).to(DEVICE)
optimizer = optim.Adam(
    model.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY
)
loss_fn = YoloLoss()

# The optimizer must exist before its state can be restored
if LOAD_MODEL:
    load_checkpoint(torch.load(LOAD_MODEL_FILE), model, optimizer)

In [None]:
best_model = {"checkpoint": {}, "map": 0.0}
for epoch in range(EPOCHS):
    # Training
    model.train()
    train_fn(train_loader, model, optimizer, loss_fn, epoch)

    # Calculate mAP
    model.eval()

    # Calculate train mAP
    pred_boxes, target_boxes = test_fn(
        train_loader, model, iou_threshold=0.5, threshold=0.4
    )
    mean_avg_prec = mean_average_precision(
        pred_boxes, target_boxes, iou_threshold=0.5, box_format="midpoint"
    )
    print(f"Train mAP: {mean_avg_prec}")

    # Calculate test mAP
    pred_boxes, target_boxes = test_fn(
        test_loader, model, iou_threshold=0.5, threshold=0.4
    )
    mean_avg_prec = mean_average_precision(
        pred_boxes, target_boxes, iou_threshold=0.5, box_format="midpoint"
    )
    print(f"Test mAP: {mean_avg_prec}")

    # Save best model
    if mean_avg_prec > best_model["map"]:
        best_model["checkpoint"]["state_dict"] = model.state_dict()
        best_model["checkpoint"]["optimizer"] = optimizer.state_dict()
        best_model["map"] = mean_avg_prec
        save_checkpoint(best_model["checkpoint"], filename=LOAD_MODEL_FILE)
    
    print("")

Epoch 1: 100%|██████████| 517/517 [04:33<00:00,  1.89it/s, loss=323]


Mean loss was 351.1677644534083


Evaluation: 100%|██████████| 517/517 [03:55<00:00,  2.20it/s]


Train mAP: 0.010408589616417885


Evaluation: 100%|██████████| 154/154 [01:08<00:00,  2.26it/s]


Test mAP: 0.006853909231722355
=> Saving checkpoint



Epoch 2: 100%|██████████| 517/517 [04:03<00:00,  2.12it/s, loss=286]


Mean loss was 289.98932126781233


Evaluation: 100%|██████████| 517/517 [03:26<00:00,  2.50it/s]


Train mAP: 0.007949299179017544


Evaluation: 100%|██████████| 154/154 [00:59<00:00,  2.57it/s]


Test mAP: 0.005363969597965479



Epoch 3: 100%|██████████| 517/517 [03:56<00:00,  2.19it/s, loss=283]


Mean loss was 265.9794922465282


Evaluation: 100%|██████████| 517/517 [03:27<00:00,  2.50it/s]


Train mAP: 0.023979905992746353


Evaluation: 100%|██████████| 154/154 [00:59<00:00,  2.57it/s]


Test mAP: 0.014052552171051502
=> Saving checkpoint



Epoch 4: 100%|██████████| 517/517 [04:11<00:00,  2.06it/s, loss=228]


Mean loss was 248.37761533283404


Evaluation: 100%|██████████| 517/517 [03:45<00:00,  2.29it/s]


Train mAP: 0.040028948336839676


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.30it/s]


Test mAP: 0.02096657082438469
=> Saving checkpoint



Epoch 5: 100%|██████████| 517/517 [04:17<00:00,  2.01it/s, loss=234]


Mean loss was 229.19768133108113


Evaluation: 100%|██████████| 517/517 [03:54<00:00,  2.21it/s]


Train mAP: 0.02837992087006569


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.30it/s]


Test mAP: 0.014282318763434887



Epoch 6: 100%|██████████| 517/517 [04:07<00:00,  2.09it/s, loss=199]


Mean loss was 208.23285561419547


Evaluation: 100%|██████████| 517/517 [03:51<00:00,  2.24it/s]


Train mAP: 0.06992903351783752


Evaluation: 100%|██████████| 154/154 [01:08<00:00,  2.26it/s]


Test mAP: 0.03205370157957077
=> Saving checkpoint



Epoch 7: 100%|██████████| 517/517 [04:28<00:00,  1.92it/s, loss=176]


Mean loss was 187.72742909296798


Evaluation: 100%|██████████| 517/517 [03:55<00:00,  2.19it/s]


Train mAP: 0.07929737865924835


Evaluation: 100%|██████████| 154/154 [01:08<00:00,  2.24it/s]


Test mAP: 0.031234607100486755



Epoch 8: 100%|██████████| 517/517 [04:20<00:00,  1.98it/s, loss=140]


Mean loss was 165.48188982434152


Evaluation: 100%|██████████| 517/517 [03:59<00:00,  2.16it/s]


Train mAP: 0.153864324092865


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.30it/s]


Test mAP: 0.0439288504421711
=> Saving checkpoint



Epoch 9: 100%|██████████| 517/517 [04:30<00:00,  1.91it/s, loss=173]


Mean loss was 147.45611643099463


Evaluation: 100%|██████████| 517/517 [03:54<00:00,  2.21it/s]


Train mAP: 0.1751270592212677


Evaluation: 100%|██████████| 154/154 [01:10<00:00,  2.18it/s]


Test mAP: 0.04172801971435547



Epoch 10: 100%|██████████| 517/517 [04:12<00:00,  2.05it/s, loss=147]


Mean loss was 132.86552176853675


Evaluation: 100%|██████████| 517/517 [03:43<00:00,  2.31it/s]


Train mAP: 0.32235273718833923


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.32it/s]


Test mAP: 0.03671913221478462



Epoch 11: 100%|██████████| 517/517 [04:07<00:00,  2.09it/s, loss=111]


Mean loss was 118.21034518155184


Evaluation: 100%|██████████| 517/517 [03:45<00:00,  2.29it/s]


Train mAP: 0.41789236664772034


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.32it/s]


Test mAP: 0.04982496052980423
=> Saving checkpoint



Epoch 12: 100%|██████████| 517/517 [04:17<00:00,  2.01it/s, loss=118]


Mean loss was 106.53643149517953


Evaluation: 100%|██████████| 517/517 [03:44<00:00,  2.31it/s]


Train mAP: 0.5915546417236328


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.32it/s]


Test mAP: 0.04651784151792526



Epoch 13: 100%|██████████| 517/517 [04:07<00:00,  2.09it/s, loss=112]


Mean loss was 95.82303842954063


Evaluation: 100%|██████████| 517/517 [03:45<00:00,  2.29it/s]


Train mAP: 0.6267798542976379


Evaluation: 100%|██████████| 154/154 [01:05<00:00,  2.34it/s]


Test mAP: 0.040849506855010986



Epoch 14: 100%|██████████| 517/517 [04:09<00:00,  2.07it/s, loss=93.8]


Mean loss was 86.145146980507


Evaluation: 100%|██████████| 517/517 [03:45<00:00,  2.29it/s]


Train mAP: 0.7069048881530762


Evaluation: 100%|██████████| 154/154 [01:05<00:00,  2.36it/s]


Test mAP: 0.0384894534945488



Epoch 15: 100%|██████████| 517/517 [04:11<00:00,  2.06it/s, loss=69.6]


Mean loss was 83.64636288759095


Evaluation: 100%|██████████| 517/517 [03:48<00:00,  2.26it/s]


Train mAP: 0.7182698249816895


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.30it/s]


Test mAP: 0.04887733235955238



Epoch 16: 100%|██████████| 517/517 [04:10<00:00,  2.07it/s, loss=69.3]


Mean loss was 71.71411206597747


Evaluation: 100%|██████████| 517/517 [03:44<00:00,  2.31it/s]


Train mAP: 0.7148839831352234


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.33it/s]


Test mAP: 0.04208211973309517



Epoch 17: 100%|██████████| 517/517 [04:10<00:00,  2.06it/s, loss=52]


Mean loss was 65.498084380041


Evaluation: 100%|██████████| 517/517 [03:44<00:00,  2.30it/s]


Train mAP: 0.7270867824554443


Evaluation: 100%|██████████| 154/154 [01:06<00:00,  2.33it/s]


Test mAP: 0.0344734713435173



Epoch 18: 100%|██████████| 517/517 [04:05<00:00,  2.11it/s, loss=74]


Mean loss was 60.5586835466455


Evaluation: 100%|██████████| 517/517 [03:49<00:00,  2.25it/s]


Train mAP: 0.7580962777137756


Evaluation: 100%|██████████| 154/154 [01:10<00:00,  2.20it/s]


Test mAP: 0.04242920130491257



Epoch 19: 100%|██████████| 517/517 [04:15<00:00,  2.02it/s, loss=50.1]


Mean loss was 58.329981901659494


Evaluation: 100%|██████████| 517/517 [03:54<00:00,  2.20it/s]


Train mAP: 0.786124587059021


Evaluation: 100%|██████████| 154/154 [01:10<00:00,  2.17it/s]


Test mAP: 0.04457477480173111



Epoch 20: 100%|██████████| 517/517 [04:17<00:00,  2.01it/s, loss=53.2]


Mean loss was 58.23729494784741


Evaluation: 100%|██████████| 517/517 [03:57<00:00,  2.17it/s]


Train mAP: 0.7767952680587769


Evaluation: 100%|██████████| 154/154 [01:08<00:00,  2.26it/s]


Test mAP: 0.04802592471241951



Epoch 21: 100%|██████████| 517/517 [04:28<00:00,  1.93it/s, loss=48.1]


Mean loss was 49.90034713966473


Evaluation: 100%|██████████| 517/517 [03:59<00:00,  2.16it/s]


Train mAP: 0.7815757989883423


Evaluation: 100%|██████████| 154/154 [01:08<00:00,  2.24it/s]


Test mAP: 0.053491681814193726
=> Saving checkpoint



Epoch 22: 100%|██████████| 517/517 [04:29<00:00,  1.92it/s, loss=48.8]


Mean loss was 44.55980736236277


Evaluation: 100%|██████████| 517/517 [03:57<00:00,  2.18it/s]


Train mAP: 0.784201443195343


Evaluation: 100%|██████████| 154/154 [01:09<00:00,  2.21it/s]


Test mAP: 0.04348018020391464



Epoch 23: 100%|██████████| 517/517 [04:21<00:00,  1.98it/s, loss=42.3]


Mean loss was 42.77493007593505


Evaluation: 100%|██████████| 517/517 [03:55<00:00,  2.19it/s]


Train mAP: 0.7953792810440063


Evaluation: 100%|██████████| 154/154 [01:10<00:00,  2.20it/s]


Test mAP: 0.046512383967638016



Epoch 24: 100%|██████████| 517/517 [04:14<00:00,  2.03it/s, loss=35.7]


Mean loss was 42.1304290373044


Evaluation: 100%|██████████| 517/517 [03:58<00:00,  2.17it/s]


Train mAP: 0.7777137756347656


Evaluation: 100%|██████████| 154/154 [01:08<00:00,  2.25it/s]


Test mAP: 0.047974519431591034



Epoch 25: 100%|██████████| 517/517 [04:21<00:00,  1.98it/s, loss=52.4]


Mean loss was 39.582795417977486


Evaluation: 100%|██████████| 517/517 [03:54<00:00,  2.20it/s]


Train mAP: 0.8215215802192688


Evaluation: 100%|██████████| 154/154 [01:08<00:00,  2.25it/s]


Test mAP: 0.0501110777258873



Epoch 26: 100%|██████████| 517/517 [04:27<00:00,  1.93it/s, loss=31.2]


Mean loss was 38.43974141871676


Evaluation: 100%|██████████| 517/517 [04:02<00:00,  2.13it/s]


Train mAP: 0.7537894248962402


Evaluation: 100%|██████████| 154/154 [01:11<00:00,  2.14it/s]


Test mAP: 0.03733519837260246



Epoch 27: 100%|██████████| 517/517 [04:18<00:00,  2.00it/s, loss=28]


Mean loss was 35.17687087363385


Evaluation: 100%|██████████| 517/517 [04:03<00:00,  2.12it/s]


Train mAP: 0.8195171356201172


Evaluation: 100%|██████████| 154/154 [01:08<00:00,  2.24it/s]


Test mAP: 0.050933413207530975



Epoch 28: 100%|██████████| 517/517 [04:22<00:00,  1.97it/s, loss=29.2]


Mean loss was 33.2659522154345


Evaluation: 100%|██████████| 517/517 [03:57<00:00,  2.17it/s]


Train mAP: 0.8106662631034851


Evaluation: 100%|██████████| 154/154 [01:09<00:00,  2.20it/s]


Test mAP: 0.0478702150285244



Epoch 29: 100%|██████████| 517/517 [04:18<00:00,  2.00it/s, loss=25.8]


Mean loss was 33.27282050038675


Evaluation: 100%|██████████| 517/517 [04:00<00:00,  2.15it/s]


Train mAP: 0.8099872469902039


Evaluation: 100%|██████████| 154/154 [01:10<00:00,  2.18it/s]


Test mAP: 0.044080950319767



Epoch 30: 100%|██████████| 517/517 [04:20<00:00,  1.98it/s, loss=30.4]


Mean loss was 32.32044959667803


Evaluation: 100%|██████████| 517/517 [03:50<00:00,  2.24it/s]


Train mAP: 0.8327220678329468


Evaluation: 100%|██████████| 154/154 [01:07<00:00,  2.27it/s]


Test mAP: 0.052772313356399536



Epoch 31: 100%|██████████| 517/517 [04:18<00:00,  2.00it/s, loss=30]


Mean loss was 30.97167339361844


Evaluation: 100%|██████████| 517/517 [03:51<00:00,  2.23it/s]


Train mAP: 0.8124364614486694


Evaluation: 100%|██████████| 154/154 [01:09<00:00,  2.20it/s]


Test mAP: 0.04753866046667099



Epoch 32: 100%|██████████| 517/517 [04:18<00:00,  2.00it/s, loss=30.7]


Mean loss was 29.963456244256555


Evaluation: 100%|██████████| 517/517 [04:02<00:00,  2.13it/s]


Train mAP: 0.8003531694412231


Evaluation: 100%|██████████| 154/154 [01:20<00:00,  1.91it/s]


Test mAP: 0.046986646950244904



Epoch 33: 100%|██████████| 517/517 [04:51<00:00,  1.77it/s, loss=41.5]


Mean loss was 34.15886160667906


Evaluation: 100%|██████████| 517/517 [04:09<00:00,  2.07it/s]


Train mAP: 0.7909315824508667


Evaluation: 100%|██████████| 154/154 [01:20<00:00,  1.92it/s]


Test mAP: 0.04365869238972664



Epoch 34: 100%|██████████| 517/517 [04:56<00:00,  1.74it/s, loss=27.4]


Mean loss was 29.72486032415866


Evaluation: 100%|██████████| 517/517 [04:35<00:00,  1.88it/s]


Train mAP: 0.8385109901428223


Evaluation: 100%|██████████| 154/154 [01:23<00:00,  1.84it/s]


Test mAP: 0.05253824591636658



Epoch 35:   2%|▏         | 10/517 [00:06<05:41,  1.49it/s, loss=24.1]


KeyboardInterrupt: ignored
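
The `=> Saving checkpoint` lines in the log above appear only after epochs whose test mAP beats every earlier epoch. The following is a minimal sketch of that "save only on improvement" rule, replayed on the test mAP values logged for epochs 6 to 12. The function name `epochs_that_checkpoint` and the starting value `best_so_far=0.0` are assumptions made for illustration; the running best from before epoch 6 is not shown in this part of the log.

```python
# Test mAP per epoch, copied from the training log above (epochs 6 to 12).
logged_test_map = {
    6: 0.03205370157957077,
    7: 0.031234607100486755,
    8: 0.0439288504421711,
    9: 0.04172801971435547,
    10: 0.03671913221478462,
    11: 0.04982496052980423,
    12: 0.04651784151792526,
}


def epochs_that_checkpoint(map_by_epoch, best_so_far=0.0):
    """Return the epochs at which a 'save on improvement' rule would save.

    best_so_far is a hypothetical starting value, not taken from the log.
    """
    saved_epochs = []
    for epoch in sorted(map_by_epoch):
        current_map = map_by_epoch[epoch]
        if current_map > best_so_far:
            # Update the running best; without this update every epoch
            # with mAP above the initial value would trigger a save.
            best_so_far = current_map
            saved_epochs.append(epoch)
    return saved_epochs


saved = epochs_that_checkpoint(logged_test_map)
print(saved)
```

Under these assumptions the rule selects epochs 6, 8, and 11, which matches where `=> Saving checkpoint` appears in the log for that range.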

#### Old

In [None]:
model = Yolov1(split_size=7, num_boxes=2, num_classes=20).to(DEVICE)
optimizer = optim.Adam(
    model.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY
)
loss_fn = YoloLoss()

if LOAD_MODEL:
    load_checkpoint(torch.load(LOAD_MODEL_FILE), model, optimizer)

best_model = {"checkpoint": {}, "map": 0.0}
for epoch in range(EPOCHS):
    # for x, y in train_loader:
    #    x = x.to(DEVICE)
    #    for idx in range(8):
    #        bboxes = cellboxes_to_boxes(model(x))
    #        bboxes = non_max_suppression(bboxes[idx], iou_threshold=0.5, threshold=0.4, box_format="midpoint")
    #        plot_image(x[idx].permute(1,2,0).to("cpu"), bboxes)

    #    import sys
    #    sys.exit()

    pred_boxes, target_boxes = get_bboxes(
        train_loader, model, iou_threshold=0.5, threshold=0.4
    )
    mean_avg_prec = mean_average_precision(
        pred_boxes, target_boxes, iou_threshold=0.5, box_format="midpoint"
    )
    print(f"Train mAP: {mean_avg_prec}")

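    if mean_avg_prec > best_model["map"]:
        # Track the running best so that only genuine improvements are saved.
        best_model["map"] = mean_avg_prec
        # Use the "state_dict" key, matching the commented-out block below.
        best_model["checkpoint"]["state_dict"] = model.state_dict()
        best_model["checkpoint"]["optimizer"] = optimizer.state_dict()
        save_checkpoint(best_model["checkpoint"], filename=LOAD_MODEL_FILE)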

    #if mean_avg_prec > 0.9:
    #    checkpoint = {
    #        "state_dict": model.state_dict(),
    #        "optimizer": optimizer.state_dict(),
    #    }
    #    save_checkpoint(checkpoint, filename=LOAD_MODEL_FILE)
    #    import time
    #    time.sleep(10)

    train_fn(train_loader, model, optimizer, loss_fn)

Train mAP: 0.0


100%|██████████| 258/258 [03:26<00:00,  1.25it/s, loss=562]

Mean loss was 1065.927351367566





Train mAP: 0.0027144744526594877


100%|██████████| 258/258 [03:24<00:00,  1.26it/s, loss=516]

Mean loss was 637.7909172117248





Train mAP: 0.009062686003744602


100%|██████████| 258/258 [03:25<00:00,  1.26it/s, loss=595]

Mean loss was 597.9802279213602





Train mAP: 0.012181061320006847


100%|██████████| 258/258 [03:25<00:00,  1.26it/s, loss=631]

Mean loss was 562.310099165569





Train mAP: 0.013640801422297955


100%|██████████| 258/258 [03:23<00:00,  1.27it/s, loss=585]

Mean loss was 524.3621479596278





Train mAP: 0.019761819392442703


100%|██████████| 258/258 [03:24<00:00,  1.26it/s, loss=449]

Mean loss was 482.1884325604106





Train mAP: 0.051066119223833084


100%|██████████| 258/258 [03:27<00:00,  1.25it/s, loss=418]

Mean loss was 443.23020828601926





Train mAP: 0.07196847349405289


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=418]

Mean loss was 410.8502611263778





Train mAP: 0.0839550718665123


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=471]

Mean loss was 381.73788120949916





Train mAP: 0.12256888300180435


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=354]

Mean loss was 350.84823726683624





Train mAP: 0.17787232995033264


100%|██████████| 258/258 [03:23<00:00,  1.27it/s, loss=333]

Mean loss was 322.07823358580123





Train mAP: 0.25617191195487976


100%|██████████| 258/258 [03:23<00:00,  1.27it/s, loss=308]

Mean loss was 295.0611009228137





Train mAP: 0.3056599497795105


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=321]

Mean loss was 315.48384265751804





Train mAP: 0.3481695353984833


100%|██████████| 258/258 [03:23<00:00,  1.27it/s, loss=176]

Mean loss was 266.9546206570411





Train mAP: 0.4382185935974121


100%|██████████| 258/258 [03:23<00:00,  1.27it/s, loss=227]

Mean loss was 246.11144948560138





Train mAP: 0.4711339473724365


100%|██████████| 258/258 [03:24<00:00,  1.26it/s, loss=246]

Mean loss was 233.94090767793878





Train mAP: 0.45735979080200195


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=230]

Mean loss was 221.16185707269713





Train mAP: 0.565459132194519


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=177]

Mean loss was 211.2441053760144





Train mAP: 0.6434961557388306


100%|██████████| 258/258 [03:23<00:00,  1.27it/s, loss=185]

Mean loss was 198.9545486805051





Train mAP: 0.6706975698471069


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=176]

Mean loss was 192.69116506650465





Train mAP: 0.694451630115509


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=212]

Mean loss was 187.16800003643183





Train mAP: 0.6916548609733582


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=159]

Mean loss was 181.93225996623667





Train mAP: 0.6834357380867004


100%|██████████| 258/258 [03:20<00:00,  1.28it/s, loss=253]

Mean loss was 175.4217803127082





Train mAP: 0.7123120427131653


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=146]

Mean loss was 182.3024347290512





Train mAP: 0.6884087920188904


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=146]

Mean loss was 164.98568580686583





Train mAP: 0.7585276961326599


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=149]

Mean loss was 155.08857928135598





Train mAP: 0.7714208364486694


100%|██████████| 258/258 [03:20<00:00,  1.28it/s, loss=144]

Mean loss was 151.54111868096876





Train mAP: 0.7743287086486816


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=155]

Mean loss was 141.16967622623886





Train mAP: 0.7847734689712524


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=130]

Mean loss was 137.55833606572114





Train mAP: 0.7833776473999023


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=226]

Mean loss was 138.30301388289578





Train mAP: 0.6366711258888245


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=104]

Mean loss was 143.16509211340616





Train mAP: 0.7955905795097351


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=159]

Mean loss was 130.93816689188165





Train mAP: 0.7904840707778931


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=150]

Mean loss was 129.6188883522684





Train mAP: 0.7811689376831055


100%|██████████| 258/258 [03:23<00:00,  1.27it/s, loss=120]

Mean loss was 127.52911906279334





Train mAP: 0.8119915127754211


100%|██████████| 258/258 [03:24<00:00,  1.26it/s, loss=147]

Mean loss was 121.0662165205608





Train mAP: 0.7940620183944702


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=120]

Mean loss was 130.43523114226585





Train mAP: 0.7939394116401672


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=100]

Mean loss was 116.37294828429702





Train mAP: 0.8127595782279968


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=115]

Mean loss was 108.76714280594227





Train mAP: 0.810300350189209


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=94.3]

Mean loss was 106.08274102026178





Train mAP: 0.8194848895072937


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=173]

Mean loss was 132.685576682867





Train mAP: 0.623074471950531


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=125]

Mean loss was 126.10601173814877





Train mAP: 0.808985710144043


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=162]

Mean loss was 110.44650221240613





Train mAP: 0.7866679430007935


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=147]

Mean loss was 113.64666656375856





Train mAP: 0.7327930331230164


100%|██████████| 258/258 [03:23<00:00,  1.27it/s, loss=84.5]

Mean loss was 108.64461475934169





Train mAP: 0.8535916209220886


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=99.2]

Mean loss was 87.16850023491438





Train mAP: 0.8696447610855103


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=76.3]

Mean loss was 83.498639631641





Train mAP: 0.8328312635421753


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=83.6]

Mean loss was 100.37142518509266





Train mAP: 0.8211326599121094


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=79.4]

Mean loss was 94.16345580049263





Train mAP: 0.8457844853401184


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=81.7]

Mean loss was 85.08248642618342





Train mAP: 0.8483061790466309


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=62.9]

Mean loss was 79.00513745093531





Train mAP: 0.8544493913650513


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=78]

Mean loss was 85.6605931806934





Train mAP: 0.8365235328674316


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=64]

Mean loss was 83.04115796643634





Train mAP: 0.8444846868515015


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=87.4]

Mean loss was 75.18933690980424





Train mAP: 0.8527801632881165


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=51.6]

Mean loss was 87.23573301744091





Train mAP: 0.8309139013290405


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=81.9]

Mean loss was 77.42361964735873





Train mAP: 0.8443084955215454


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=64.9]

Mean loss was 75.39490915638532





Train mAP: 0.8457199931144714


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=79.8]

Mean loss was 75.02201588948567





Train mAP: 0.8460389375686646


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=93.9]

Mean loss was 74.54094076526258





Train mAP: 0.8245055079460144


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=87.1]

Mean loss was 81.70092459981755





Train mAP: 0.8468174934387207


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=125]

Mean loss was 78.8452933703282





Train mAP: 0.7307022213935852


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=80.1]

Mean loss was 91.04271634419759





Train mAP: 0.8534029126167297


100%|██████████| 258/258 [03:20<00:00,  1.28it/s, loss=101]

Mean loss was 70.28979243788608





Train mAP: 0.8649894595146179


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=51.4]

Mean loss was 61.90324332362922





Train mAP: 0.8791455030441284


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=85.6]

Mean loss was 108.15279453484587





Train mAP: 0.8415839076042175


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=75.1]

Mean loss was 72.26866618607396





Train mAP: 0.8808100819587708


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=90]

Mean loss was 58.933527406796





Train mAP: 0.8770791888237


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=45.7]

Mean loss was 56.0615374838659





Train mAP: 0.8754359483718872


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=39.1]

Mean loss was 54.960026881491494





Train mAP: 0.8807972073554993


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=48.5]

Mean loss was 52.460666863493216





Train mAP: 0.8788832426071167


100%|██████████| 258/258 [03:20<00:00,  1.28it/s, loss=41.5]

Mean loss was 54.39854280338731





Train mAP: 0.8602166175842285


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=55.6]

Mean loss was 56.93583523949911





Train mAP: 0.8721157312393188


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=49.8]

Mean loss was 55.98094437473504





Train mAP: 0.8557059168815613


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=51.7]

Mean loss was 55.11486105216566





Train mAP: 0.8673685789108276


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=68.5]

Mean loss was 73.25068435373232





Train mAP: 0.8300771713256836


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=59.9]

Mean loss was 69.22860905551171





Train mAP: 0.8425653576850891


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=56.4]

Mean loss was 59.5582723691482





Train mAP: 0.8670177459716797


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=69.1]

Mean loss was 57.95439598911492





Train mAP: 0.8328779339790344


100%|██████████| 258/258 [03:20<00:00,  1.28it/s, loss=77.6]

Mean loss was 72.29038501709931





Train mAP: 0.8492277264595032


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=54.2]

Mean loss was 57.143262730088345





Train mAP: 0.8816630244255066


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=44.3]

Mean loss was 48.427887044211694





Train mAP: 0.8954212069511414


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=47.7]

Mean loss was 53.64827262523563





Train mAP: 0.8765907287597656


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=46.6]

Mean loss was 49.01315590762353





Train mAP: 0.8734633326530457


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=39]

Mean loss was 45.83776096964991





Train mAP: 0.8683563470840454


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=68.4]

Mean loss was 46.8550775069599





Train mAP: 0.8788242340087891


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=60.1]

Mean loss was 48.5170334438945





Train mAP: 0.8684656023979187


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=43.7]

Mean loss was 50.15824105018793





Train mAP: 0.8704273104667664


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=42.2]

Mean loss was 52.66248716310013





Train mAP: 0.8806408047676086


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=49.5]

Mean loss was 54.34053274642589





Train mAP: 0.8584709167480469


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=47.9]

Mean loss was 52.538115952366084





Train mAP: 0.8693629503250122


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=43.9]

Mean loss was 49.91794274204461





Train mAP: 0.8741744756698608


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=46.7]

Mean loss was 44.75701610062473





Train mAP: 0.8772004246711731


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=41.8]

Mean loss was 43.58420452591061





Train mAP: 0.8703832626342773


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=35.7]

Mean loss was 43.29103023322054





Train mAP: 0.878085732460022


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=32.7]

Mean loss was 41.93004010074822





Train mAP: 0.8857662081718445


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=34.3]

Mean loss was 42.40361460604409





Train mAP: 0.8878616094589233


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=54.9]

Mean loss was 41.30107056817343





Train mAP: 0.8784721493721008


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=27.8]

Mean loss was 41.01972932593767





Train mAP: 0.8766628503799438


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=39.3]

Mean loss was 39.99226735728656





Train mAP: 0.8730610013008118


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=44.8]

Mean loss was 41.41901808006819





Train mAP: 0.8674230575561523


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=138]

Mean loss was 131.50100141717482





Train mAP: 0.7335787415504456


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=64.5]

Mean loss was 96.76475209968035





Train mAP: 0.8880171775817871


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=47.5]

Mean loss was 52.65444411048593





Train mAP: 0.916427731513977


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=41.5]

Mean loss was 39.261492182118026





Train mAP: 0.9208803176879883


100%|██████████| 258/258 [03:22<00:00,  1.28it/s, loss=30.9]

Mean loss was 31.44981571315795





Train mAP: 0.9289237260818481


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=50.4]

Mean loss was 27.60619109545567





Train mAP: 0.9315945506095886


100%|██████████| 258/258 [03:18<00:00,  1.30it/s, loss=22.9]

Mean loss was 27.067465811736824





Train mAP: 0.9253014326095581


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=25]

Mean loss was 27.03470206445502





Train mAP: 0.927665114402771


100%|██████████| 258/258 [03:19<00:00,  1.30it/s, loss=40.3]

Mean loss was 24.162347235420878





Train mAP: 0.9265280961990356


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=17.8]

Mean loss was 22.94704653865607





Train mAP: 0.9245235323905945


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=32.2]

Mean loss was 23.24566013868465





Train mAP: 0.9138517379760742


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=29.9]

Mean loss was 31.91850605306699





Train mAP: 0.9088541269302368


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=30.4]

Mean loss was 36.44529137870138





Train mAP: 0.8962459564208984


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=37.8]

Mean loss was 36.40094597395076





Train mAP: 0.9023351669311523


100%|██████████| 258/258 [03:21<00:00,  1.28it/s, loss=21]

Mean loss was 32.804261969041455





Train mAP: 0.9057021141052246


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=24.5]

Mean loss was 27.75761378088663





Train mAP: 0.9128120541572571


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=52.5]

Mean loss was 27.762319830961005





Train mAP: 0.9004176259040833


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=37.2]

Mean loss was 28.878691067067226





Train mAP: 0.9075801968574524


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=23.3]

Mean loss was 29.442080364670865





Train mAP: 0.9012645483016968


100%|██████████| 258/258 [03:22<00:00,  1.27it/s, loss=26.9]

Mean loss was 29.24708594093027





Train mAP: 0.9014919400215149


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=32.1]

Mean loss was 29.36375849376353





Train mAP: 0.8962143063545227


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=28]

Mean loss was 30.530318304549816





Train mAP: 0.8902947306632996


100%|██████████| 258/258 [03:17<00:00,  1.30it/s, loss=36]

Mean loss was 30.874265079350433





Train mAP: 0.8926767110824585


100%|██████████| 258/258 [03:19<00:00,  1.29it/s, loss=58.8]

Mean loss was 57.00832696663317





Train mAP: 0.8662829399108887


100%|██████████| 258/258 [03:18<00:00,  1.30it/s, loss=33.7]

Mean loss was 46.39411122669545





Train mAP: 0.8736979365348816


100%|██████████| 258/258 [03:20<00:00,  1.28it/s, loss=58.8]

Mean loss was 32.47341304601625





Train mAP: 0.8992430567741394


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=68.5]

Mean loss was 80.114326846692





Train mAP: 0.8589593768119812


100%|██████████| 258/258 [03:18<00:00,  1.30it/s, loss=52.1]

Mean loss was 50.47156456644221





Train mAP: 0.8808730244636536


100%|██████████| 258/258 [03:26<00:00,  1.25it/s, loss=42.1]

Mean loss was 58.36850048774897





Train mAP: 0.8896282315254211


100%|██████████| 258/258 [03:20<00:00,  1.29it/s, loss=39.3]

Mean loss was 37.81452127944591





Train mAP: 0.9147759675979614


100%|██████████| 258/258 [03:41<00:00,  1.16it/s, loss=33.3]

Mean loss was 26.082598564236662





KeyboardInterrupt: ignored

In [None]:
checkpoint = {
    "state_dict": model.state_dict(),
    "optimizer": optimizer.state_dict(),
}
save_checkpoint(checkpoint, filename=LOAD_MODEL_FILE)

=> Saving checkpoint


### Test

In [None]:
pred_boxes, target_boxes = test_fn(
    test_loader, model, iou_threshold=0.5, threshold=0.4
)

mean_avg_prec = mean_average_precision(
    pred_boxes, target_boxes, iou_threshold=0.5, box_format="midpoint"
)
print(f"Test mAP: {mean_avg_prec}")

Test: 100%|██████████| 77/77 [00:54<00:00,  1.41it/s]


Test mAP: 0.041447874158620834
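
To help interpret the mAP figure above, here is a minimal pure-Python sketch of how average precision for a single class is commonly computed: detections are sorted by confidence, precision and recall are accumulated, and the area under the precision-recall curve is integrated with the trapezoidal rule. This assumes the notebook's `mean_average_precision` follows that common approach; the helper names `average_precision` and `mean_ap` are hypothetical, and the matching of detections to ground-truth boxes at `iou_threshold=0.5` is taken as already done.

```python
def average_precision(true_positive_flags, num_ground_truths):
    """Area under the precision-recall curve for one class.

    true_positive_flags: one bool per detection, already sorted by
    descending confidence. True means the detection matched a not yet
    used ground-truth box at the chosen IoU threshold (assumed done
    elsewhere).
    """
    if num_ground_truths == 0 or not true_positive_flags:
        return 0.0

    # Start the curve at recall 0 with precision 1.
    recalls = [0.0]
    precisions = [1.0]
    true_positives = 0
    for rank, is_true_positive in enumerate(true_positive_flags, start=1):
        if is_true_positive:
            true_positives += 1
        recalls.append(true_positives / num_ground_truths)
        precisions.append(true_positives / rank)

    # Trapezoidal integration of precision over recall.
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        area += width * (precisions[i] + precisions[i - 1]) / 2.0
    return area


def mean_ap(per_class):
    """Average the per-class AP over (flags, num_ground_truths) pairs."""
    aps = [average_precision(flags, count) for flags, count in per_class]
    return sum(aps) / len(aps) if aps else 0.0


example_ap = average_precision([True, False, True], num_ground_truths=2)
print(example_ap)
```

In this sketch a false positive ranked above a true positive lowers the precision at every later rank, so confident but wrong boxes can pull the score down sharply.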


## Reference

- https://github.com/aladdinpersson/Machine-Learning-Collection/tree/master/ML/Pytorch/object_detection/YOLO