<a href="https://www.kaggle.com/code/binfeng2021/computer-vision-yolov8-on-traffic-detection?scriptVersionId=167680207" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

## Project Overview

In today's digitally driven world, computer vision stands as a cornerstone of technological innovation, offering machines the ability to perceive and interpret visual information much like the human eye. From autonomous vehicles navigating bustling city streets to facial recognition systems securing our smartphones, the applications of computer vision are ubiquitous and far-reaching.

Among the many algorithms behind computer vision systems, YOLO (You Only Look Once) stands out. YOLO changed how object detection is done: it predicts bounding boxes and class probabilities for every object in an image in a single pass through the neural network, which makes it fast enough for real-time use while remaining accurate. In simple terms, YOLO lets machines quickly identify and categorize objects in images or video streams.

Now, imagine leveraging the capabilities of YOLO to tackle a pressing real-world challenge: traffic sign detection. In a world where road safety is paramount, accurately identifying and interpreting traffic signs is crucial for ensuring smooth traffic flow and preventing accidents. This is where our project comes into play.

### Main Objectives

* Check the datasets
* Understand the YOLOv8 model
* Train the YOLOv8 model and analyze the results
* Test the final model

### Import all necessary libraries

In [None]:
!pip install ultralytics

In [None]:
# import all necessary libraries
import numpy as np 
import pandas as pd 
import os
import random
from PIL import Image
import cv2
from IPython.display import Video
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style = 'darkgrid')
import pathlib
import glob
from tqdm.notebook import trange, tqdm
import warnings
warnings.filterwarnings('ignore')

from ultralytics import YOLO

In [None]:
# turn off wandb reporting for this notebook
os.environ['WANDB_DISABLED'] = 'true'

## Explore the datasets

### Check the training images

In [None]:
train_img_dir = '/kaggle/input/cardetection/train/images'

In [None]:
num_samples = 9
image_files = os.listdir(train_img_dir)
rand_imgs = random.sample(image_files, num_samples)

fig, axes = plt.subplots(3,3, figsize = (11, 11))
for i in range(num_samples):
    image = rand_imgs[i]
    ax = axes[i // 3, i%3]
    ax.imshow(plt.imread(os.path.join(train_img_dir, image)))
    ax.set_title(f'Image {i+1}')
    ax.axis('off')
plt.tight_layout()
plt.show()

According to the **data.yaml** file, the dataset has 15 labeled classes, listed below:

['Green Light', 'Red Light', 'Speed Limit 10', 'Speed Limit 100', 'Speed Limit 110', 'Speed Limit 120', 'Speed Limit 20', 'Speed Limit 30', 'Speed Limit 40', 'Speed Limit 50', 'Speed Limit 60', 'Speed Limit 70', 'Speed Limit 80', 'Speed Limit 90', 'Stop']
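Beyond listing the class names, it can help to check how many annotated boxes each class has. The sketch below assumes the labels follow the usual YOLO text format (one `class_id x_center y_center width height` line per box), that the class indices follow the order of the list above, and that the label files live in `/kaggle/input/cardetection/train/labels`. That path and the helper name `count_class_instances` are assumptions for illustration, not part of the original notebook.

```python
from collections import Counter
from pathlib import Path

# Class names copied from the list above; index order is assumed to match data.yaml.
CLASS_NAMES = ['Green Light', 'Red Light', 'Speed Limit 10', 'Speed Limit 100',
               'Speed Limit 110', 'Speed Limit 120', 'Speed Limit 20', 'Speed Limit 30',
               'Speed Limit 40', 'Speed Limit 50', 'Speed Limit 60', 'Speed Limit 70',
               'Speed Limit 80', 'Speed Limit 90', 'Stop']


def count_class_instances(label_dir, class_names):
    """Count annotated boxes per class in a folder of YOLO-format .txt label files."""
    counts = Counter()
    for label_file in Path(label_dir).glob('*.txt'):
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            class_id = int(parts[0])  # first field is the class index
            counts[class_names[class_id]] += 1
    return counts


# Assumed label location; adjust if the dataset is laid out differently.
train_label_dir = '/kaggle/input/cardetection/train/labels'
if Path(train_label_dir).is_dir():
    for name, n in count_class_instances(train_label_dir, CLASS_NAMES).most_common():
        print(f'{name}: {n}')
```

A strongly imbalanced count would be worth keeping in mind when reading the per-class curves later in the notebook.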

## Understand the YOLOv8 Model

YOLO, which stands for "You Only Look Once," is a groundbreaking object detection algorithm in computer vision. Unlike traditional object detection algorithms that involve multiple stages of processing, YOLO processes the entire image in a single pass through a convolutional neural network (CNN). This approach allows YOLO to achieve real-time object detection with impressive speed and accuracy.

### Technical Details

1. The algorithm takes the input image and splits it into a grid of cells. Each cell predicts bounding boxes and a confidence score for each box. A bounding box indicates the location of a detected object in the image, and the confidence score indicates the model's certainty in the prediction.

2. In addition to the confidence score for each bounding box, earlier YOLO versions also predict an "objectness" score for each box, which indicates the likelihood that the box contains a meaningful object rather than background clutter. YOLOv8 uses an anchor-free detection head without a separate objectness branch and relies on the class scores instead.

3. Last but not least, YOLO predicts the class of each detected bounding box. The predicted probabilities measure the likelihood that the detected object belongs to each of the predefined classes.

4. To generate the final output, YOLO combines the bounding boxes from the previous steps, each associated with a class label and a confidence score. Together, these boxes represent the detected objects, their locations in the image, and their class labels. To remove redundant detections and improve localization accuracy, YOLO applies a technique called Non-Maximum Suppression (NMS) to the predicted boxes. NMS keeps the most confident bounding boxes and suppresses overlapping detections with lower confidence scores.
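The NMS step described above can be illustrated with a minimal pure-Python sketch. It is a simplified, class-agnostic greedy version with toy boxes, not the implementation Ultralytics uses internally. The function names `iou` and `nms` and the 0.5 overlap threshold are choices made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, most confident first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better box already kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes around one sign, plus one separate box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.75, 0.6]
print(nms(boxes, scores))  # → [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap anything and is kept. Practical implementations usually run this per class so that overlapping objects of different classes do not suppress each other.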



### Check out the pre-trained YOLOv8 Model

In [None]:
model = YOLO("yolov8n.pt")

In [None]:
fig, axes = plt.subplots(3,3, figsize = (11, 11))
for i in range(num_samples):
    image = rand_imgs[i]
    ax = axes[i // 3, i%3]
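    result_predict = model.predict(source = os.path.join(train_img_dir, image), imgsz = 416)
    # plot() returns a BGR array, so convert it to RGB before showing it with matplotlib
    ax.imshow(cv2.cvtColor(result_predict[0].plot(), cv2.COLOR_BGR2RGB))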
    ax.set_title(f'Image {i+1}')
    ax.axis('off')
plt.tight_layout()
plt.show()

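The pre-trained YOLO model does not perform well here: most of the traffic signs were either not detected or incorrectly labeled. This is expected, since the pre-trained weights come from the COCO dataset, whose classes do not include speed-limit signs. We will fine-tune the model on our dataset and see how the performance changes.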

## Fine-tune YOLOv8 with given training dataset

In [None]:
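# Build the nano architecture and load the matching pre-trained nano weights
# (the architecture and checkpoint sizes need to agree for the weights to transfer)
final_model = YOLO('yolov8n.yaml').load('yolov8n.pt')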

In [None]:
Result_Final_model = final_model.train(data="/kaggle/input/cardetection/data.yaml",epochs=100, imgsz = 416, device = 0)

## Analyze the final model's performance

In [None]:
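list_of_metrics = ["P_curve.png", "R_curve.png", "PR_curve.png", "F1_curve.png", "confusion_matrix.png", "results.png"]
for plot_name in list_of_metrics:
    # cv2 loads images as BGR; convert to RGB so matplotlib shows the original colors
    image = cv2.imread(f'/kaggle/working/runs/detect/train/{plot_name}')
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    plt.figure(figsize=(16, 12))
    plt.imshow(image)
    plt.show()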

#### Interpretation of the plots

1. **Precision-Confidence Curve**: shows how the model's precision changes across confidence thresholds. In the first plot, precision increases for all classes as the confidence threshold increases. The model reaches a precision of 1.0 across all classes at a confidence threshold of 0.958.

2. **Recall-Confidence Curve**: shows how the model's recall changes across confidence thresholds. In the second plot, recall decreases for all classes as the confidence threshold increases. The recall across all classes is 0.94 at a confidence threshold of 0.

3. **Precision-Recall Curve**: shows the trade-off between precision and recall across thresholds. In the third plot, the model's precision decreases as recall increases. At an IoU (Intersection over Union) threshold of 0.5, the model achieves a mAP (mean Average Precision) of 0.908.

4. **F1-Confidence Curve**: shows how the model's F1 score changes across confidence thresholds. Since the F1 score is the harmonic mean of precision and recall, it gives a good picture of how the model performs overall. In the fourth plot, the F1 score first increases and then decreases as the confidence threshold increases. At a confidence threshold of 0.319, the model achieves an F1 score of 0.88 across all classes.

5. **Confusion Matrix**: a table that summarizes the correct and incorrect predictions of a classification model. The diagonal of the table shows the true positives. In the fifth plot, most of the high values lie on the diagonal, so we can conclude that the model predicts the correct class in most cases.

6. **Results plot during training**: a combination of metrics measured during training. The sixth plot shows how several metrics change over the epochs. To read it, we first need to understand the different loss values being measured.

box_loss: also known as localization loss or regression loss. It measures the discrepancy between the predicted and ground-truth bounding box coordinates for each object in the image.

cls_loss: the classification loss. It measures how well the predicted class labels match the true class of each bounding box.

dfl_loss: the Distribution Focal Loss. Rather than regressing each box edge as a single number, the model predicts a discrete probability distribution over possible offsets for each edge, and the DFL term encourages that distribution to concentrate around the true offset. Minimizing it sharpens the predicted box boundaries, which helps most when object edges are blurry or ambiguous.

From the plots in the sixth figure, we can see that all three loss values decrease while precision and recall increase as training runs for more epochs.
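To make the precision, recall, and F1 curves described above more concrete, here is a small sketch that computes the three metrics at several confidence thresholds. The detections are made-up toy values, not results from this model. It also assumes each detection has already been marked as a true or false positive (for example by matching against ground truth at IoU 0.5), and it treats precision as 1.0 when no detection passes the threshold, which is a convention chosen for this example.

```python
def precision_recall_f1(confidences, is_true_positive, num_ground_truth, threshold):
    """Precision, recall and F1 when keeping detections with confidence >= threshold."""
    kept = [tp for conf, tp in zip(confidences, is_true_positive) if conf >= threshold]
    tp = sum(kept)          # kept detections that match a ground-truth object
    fp = len(kept) - tp     # kept detections that match nothing
    precision = tp / (tp + fp) if kept else 1.0  # convention when nothing is kept
    recall = tp / num_ground_truth if num_ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy detections: confidence scores and whether each one is a true positive.
confidences = [0.95, 0.9, 0.8, 0.6, 0.4, 0.3, 0.2]
is_true_positive = [True, True, True, False, True, False, False]
num_ground_truth = 5  # total number of labeled objects

for threshold in [0.1, 0.35, 0.7, 0.92]:
    p, r, f1 = precision_recall_f1(confidences, is_true_positive, num_ground_truth, threshold)
    print(f'threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}  F1={f1:.2f}')
```

On this toy data, precision rises and recall falls as the threshold increases, while F1 peaks at an intermediate threshold. That is the same qualitative pattern as in the curves above.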

## Test the model on test data

In [None]:
test_img_dir = '/kaggle/input/cardetection/test/images'
num_samples = 9
image_files = os.listdir(test_img_dir)
test_imgs = random.sample(image_files, num_samples)

fig, axes = plt.subplots(3,3, figsize = (11, 11))
for i in range(num_samples):
    image = test_imgs[i]
    ax = axes[i // 3, i%3]
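    result_predict = final_model.predict(source = os.path.join(test_img_dir, image), imgsz = 416)
    # plot() returns a BGR array, so convert it to RGB before showing it with matplotlib
    ax.imshow(cv2.cvtColor(result_predict[0].plot(), cv2.COLOR_BGR2RGB))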
    ax.set_title(f'Image {i+1}')
    ax.axis('off')
plt.tight_layout()
plt.show()

Based on the results obtained from the test images, we can see a significant performance increase from the pre-trained model. YOLOv8 has performed very well after training and can detect most objects that we specified in the dataset. 

## Conclusion

In this project, we used the YOLO model for traffic sign detection, starting by getting familiar with a pre-trained model and testing it. We then fine-tuned the model on our dataset, and the retrained model performed significantly better. We also analyzed the metrics and plots produced during training, all of which confirm the improved accuracy.

### Current Limitations & Future Work

- To improve the performance further, we could tune the parameters used by the YOLO model. For this project, I mostly used the default parameters, but we could try different parameter combinations to find the optimal model.

- During the analysis of the training results, we noticed that the model's metrics continued to improve as the number of epochs increased. Training for more epochs may therefore improve performance further.

- Last but not least, the provided dataset contains only ~3.5K training images. Expanding the training set with more samples could lead to even better results.
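As a starting point for the parameter tuning mentioned above, the sketch below enumerates a small grid of training settings. `lr0`, `batch`, and `imgsz` are Ultralytics training arguments, but the candidate values and the helper `iter_configs` are purely illustrative assumptions. The training call is left as a comment because each run takes considerable time.

```python
from itertools import product

# Hypothetical search space; the values are illustrative, not recommendations.
search_space = {
    'lr0': [0.001, 0.01],   # initial learning rate
    'batch': [16, 32],      # batch size
    'imgsz': [416, 640],    # training image size
}


def iter_configs(space):
    """Yield one dict per combination of the values in the search space."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(search_space))
print(f'{len(configs)} configurations to try')
for cfg in configs[:3]:
    print(cfg)

# Each configuration could then be passed to training, for example:
# YOLO('yolov8n.pt').train(data='/kaggle/input/cardetection/data.yaml',
#                          epochs=100, device=0, **cfg)
```

Comparing the validation mAP of each run would then show which combination works best on this dataset. A full grid grows quickly, so in practice it may make sense to search over only a few parameters at a time.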


### Thank you!

If you found this notebook interesting, please give it an upvote! If you have any thoughts, I would love to hear them in the comments section. Thank you for reading!