# Understanding the YOLO Algorithm and Fine-Tuning It
____


# Overview of the YOLO Algorithm

The YOLO algorithm is designed to perform object detection and image classification. Detection and classification are usually handled by two separate models that take two passes over the image. YOLO combines the two in a single pass, which is why it is named You Only Look Once. This makes detection fast enough to be used in real-time applications. The output looks like this:

<img src="cover.png" width="400" height="300">

YOLO does object detection and classification in one pass by dividing the image into an S×S grid, like the image below:

<img src="SXS.png" width="300" height="300">
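
To make the grid idea concrete, here is a minimal sketch of how an object's center could be mapped to a grid cell. In YOLO, the cell that contains the center of an object is the one responsible for detecting it. The function name `cell_for_center` and the use of coordinates normalized to [0, 1] are illustrative assumptions, not part of this notebook.

```python
def cell_for_center(x_center, y_center, grid_size=7):
    """Return (row, col) of the grid cell containing a normalized box center.

    Assumes x_center and y_center are fractions of the image width and
    height in [0, 1] (an assumption made for this sketch).
    """
    # Scale to grid units; min() keeps a center at exactly 1.0 in the last cell.
    col = min(int(x_center * grid_size), grid_size - 1)
    row = min(int(y_center * grid_size), grid_size - 1)
    return row, col


# A box centered at 50% of the image width and 30% of the image height
row, col = cell_for_center(0.5, 0.3)
print(row, col)  # 2 3
```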

# Data Engineering

Gage

In [5]:
import torch
import torch.nn as nn
import torch.nn.functional as F

In [None]:
x, t = None, None  # TODO

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Model Overview / Example Usage

Ben

# Loss Function

### Parameters

grid_size: the original image is divided into a grid_size × grid_size grid  
num_boxes: number of bounding boxes to be predicted in each grid cell  
num_classes: number of classes an object can be identified as  
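
As a rough illustration of how these three parameters fit together, the sketch below computes the size of the prediction for one image. It assumes the per-cell layout implied by the indexing in the loss function in this notebook: the class scores first, then one confidence score and four coordinates per box. The helper name `prediction_shape` is hypothetical.

```python
def prediction_shape(grid_size=7, num_boxes=2, num_classes=3):
    """Shape of the prediction for one image under the assumed layout."""
    # Each box contributes 5 values: one confidence score plus (x, y, w, h).
    per_cell = num_classes + 5 * num_boxes
    return (grid_size, grid_size, per_cell)


shape = prediction_shape()
print(shape)  # (7, 7, 13)
```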

### Intersection over Union Utility Function

The Intersection over Union (IoU) metric is used to determine how well the predicted box matches the annotated box label. It is a ratio: the numerator is the area where the predicted box and the labeled box overlap, and the denominator is the area of their union (the sum of the two box areas minus the overlap). As a result, a perfect match has an IoU score of 1, worse predictions score less than 1, and boxes that do not overlap at all score 0.

In [6]:
def intersection_over_union(boxes_preds, boxes_labels):
    """
    Parameters:
        boxes_preds (tensor): Predictions of Bounding Boxes (num_boxes, 4)
        boxes_labels (tensor): Correct labels of Bounding Boxes (num_boxes, 4)
    """

    ## Determine the boundary corners of the box given a box is defined in the params by (x,y,w,h) ##
    box1_x1 = boxes_preds[..., 0:1] - boxes_preds[..., 2:3] / 2
    box1_y1 = boxes_preds[..., 1:2] - boxes_preds[..., 3:4] / 2
    box1_x2 = boxes_preds[..., 0:1] + boxes_preds[..., 2:3] / 2
    box1_y2 = boxes_preds[..., 1:2] + boxes_preds[..., 3:4] / 2
    box2_x1 = boxes_labels[..., 0:1] - boxes_labels[..., 2:3] / 2
    box2_y1 = boxes_labels[..., 1:2] - boxes_labels[..., 3:4] / 2
    box2_x2 = boxes_labels[..., 0:1] + boxes_labels[..., 2:3] / 2
    box2_y2 = boxes_labels[..., 1:2] + boxes_labels[..., 3:4] / 2

    combined_x1 = torch.max(box1_x1, box2_x1)
    combined_y1 = torch.max(box1_y1, box2_y1)
    combined_x2 = torch.min(box1_x2, box2_x2)
    combined_y2 = torch.min(box1_y2, box2_y2)

    intersection = (combined_x2 - combined_x1).clamp(0) * (combined_y2 - combined_y1).clamp(0) # Clamp where there is no intersection
    
    box1_area = abs((box1_x2 - box1_x1) * (box1_y2 - box1_y1))
    box2_area = abs((box2_x2 - box2_x1) * (box2_y2 - box2_y1))

    return intersection / (box1_area + box2_area - intersection + 1e-6) # include 1e-6 for no division by 0 error
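
To check the arithmetic by hand, here is a scalar, pure-Python sketch of the same computation for a single pair of boxes in (x, y, w, h) midpoint format. The name `iou_midpoint` is made up for this example, and it is a simplified stand-in for the tensor function above rather than a replacement for it.

```python
def iou_midpoint(box_a, box_b):
    """Scalar IoU for two boxes given as (x_center, y_center, width, height)."""
    # Convert midpoint format to corner coordinates
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2

    # Overlap width and height, floored at 0 when the boxes do not intersect
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - intersection
    return intersection / union if union > 0 else 0.0


pred = (0.5, 0.5, 1.0, 1.0)   # covers [0, 1] x [0, 1]
label = (1.0, 1.0, 1.0, 1.0)  # covers [0.5, 1.5] x [0.5, 1.5]
print(round(iou_midpoint(pred, label), 4))  # 0.1429
```

The overlap is a 0.5 by 0.5 square (area 0.25) and the union is 1 + 1 - 0.25 = 1.75, so the score is 0.25 / 1.75.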


### Loss Function for Box Coordinates

In [None]:
def loss_fn_box_coordinates(predictions, target, grid_size=7, num_boxes=2, num_classes=3):
    
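    # Per-cell layout assumed by the indexing below:
    # [class scores (num_classes), conf_1, x, y, w, h, conf_2, x, y, w, h].
    # The target stores a single box in the first box slot.

    ## First calculate IoUs for the two bounding box predictions
    iou_b1 = intersection_over_union(predictions[..., num_classes + 1:num_classes + 5], target[..., num_classes + 1:num_classes + 5])
    iou_b2 = intersection_over_union(predictions[..., num_classes + 6:num_classes + 10], target[..., num_classes + 1:num_classes + 5])
    ious = torch.cat([iou_b1.unsqueeze(0), iou_b2.unsqueeze(0)], dim=0)

    # Highest IoU per cell and the index (0 or 1) of the box that achieved it
    iou_maxes, bestbox = torch.max(ious, dim=0)
    # Objectness flag from the label (expected to be 1 where a cell contains an object, 0 otherwise)
    exists_box = target[..., num_classes].unsqueeze(3)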
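
The function above stops after selecting the best box, so the sketch below shows one way the coordinate term could be finished for a single grid cell. It assumes the notebook intends to follow the coordinate loss from the original YOLO paper: squared error on the box center, squared error on the square roots of width and height, and a weight of 5. The names `coord_loss_single_cell` and `LAMBDA_COORD` are hypothetical, and the code is scalar pure Python rather than the batched tensor version the notebook would need.

```python
import math

LAMBDA_COORD = 5.0  # weight from the YOLO paper; an assumption for this notebook


def coord_loss_single_cell(pred_boxes, pred_ious, target_box, object_present):
    """Coordinate loss for one grid cell.

    pred_boxes: list of (x, y, w, h) candidates predicted for the cell
    pred_ious: IoU of each candidate with target_box
    target_box: labeled (x, y, w, h)
    object_present: True if the label places an object in this cell
    """
    if not object_present:
        return 0.0  # cells without an object contribute no coordinate loss

    # Only the candidate with the highest IoU is held responsible
    best = max(range(len(pred_boxes)), key=lambda i: pred_ious[i])
    px, py, pw, ph = pred_boxes[best]
    tx, ty, tw, th = target_box

    center_term = (px - tx) ** 2 + (py - ty) ** 2
    # Square roots reduce the penalty for size errors on large boxes;
    # abs() guards against negative raw width/height predictions
    size_term = ((math.sqrt(abs(pw)) - math.sqrt(tw)) ** 2
                 + (math.sqrt(abs(ph)) - math.sqrt(th)) ** 2)
    return LAMBDA_COORD * (center_term + size_term)


boxes = [(0.5, 0.5, 0.25, 0.25), (0.4, 0.6, 0.16, 0.36)]
ious = [0.3, 0.7]
target = (0.5, 0.5, 0.16, 0.36)
print(round(coord_loss_single_cell(boxes, ious, target, True), 4))  # 0.1
```

In the example the second box wins on IoU. Its size matches the label exactly, so only the two center offsets of 0.1 contribute: 5 * (0.01 + 0.01) = 0.1.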

In [None]:
import torch
import torch.nn as nn

# CNN Implementation

In [None]:
class YOLO(nn.Module):
    def __init__(self):
        super().__init__()

        self.conv3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(64 * 4 * 4, 512)

    def forward(self, x):
        # TODO: forward pass not yet implemented
        raise NotImplementedError
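
The `64 * 4 * 4` input size of `fc1` implies that the feature map reaching it is 4 by 4 with 64 channels. The sketch below uses the standard output-size formula for convolution and pooling layers to check which input size is consistent with that. The 8 by 8 input to `conv3` and the single pooling step after it are assumptions made for illustration, and `conv2d_out` is a hypothetical helper.

```python
def conv2d_out(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 8  # hypothetical spatial size of the feature map entering conv3
size = conv2d_out(size, kernel_size=3, padding=1)  # conv3 keeps the size at 8
size = conv2d_out(size, kernel_size=2, stride=2)   # pool halves it to 4
print(64 * size * size)  # 1024
```

A 3x3 kernel with padding 1 preserves the spatial size, so only the pooling layers shrink the feature map on the way to `fc1`.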

Ben

# Training

Gage