<a href="https://colab.research.google.com/github/Romanvia93/traffic_sign_detection/blob/main/yolo/notebooks/Yolov5_w_augmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Traffic Sign Detection Project Report
* Thien An Trinh
* Roman Burekhin
* Athira Devan
* Lester Azinge

## Abstract

This project aims to train, benchmark, and deploy an object detection model that detects four types of traffic signs: `traffic light`, `stop`, `speed limit`, and `crosswalk`. The models chosen for the project were `YOLOv5`, `EfficientDet D1`, `SSD MobileNet FPNLite`, `SSD ResNet50 FPN`, and `Faster R-CNN ResNet50`. Two frameworks were used for model training, evaluation, and inference: `YOLOv5` belongs to [Ultralytics](https://github.com/ultralytics/yolov5) and the rest belong to [TensorFlow](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md). Our experiments showed that `YOLOv5` was the best of the tested models: it won on every criterion, including precision, speed, size, and training time. It was therefore chosen for video inference (available in the `yolo/videos` directory) and a [Streamlit deployment](https://trafficsigns.streamlit.app/). Among the TensorFlow models, `SSD MobileNet FPNLite` performed best, so it was chosen for a real-time webcam test on a local machine. The screen recording of this demo is available in the `tensorflow/videos` directory.

In [2]:
# Standard library
import os
import xml.etree.ElementTree as ET
from zipfile import ZipFile

# Third-party
import gdown
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from PIL import Image
from tqdm import tqdm

# Render inline figures at retina resolution (IPython magic)
%config InlineBackend.figure_format='retina'

# Optional imports, currently disabled
# from collections import defaultdict
# import shutil
# import cv2
# import random
# import yaml

## 1. Introduction

### 1.1. Problem Statement

We are living in a world that is moving towards automation. From robot arms assembling individual components into complete cars to smart household appliances that have been transforming our homes, the benefits of autonomous applications are undeniable. The automobile industry is following the same trend. Not only do autopilot systems give drivers a better driving experience, but they can also help reduce the number of accidents. For example, a vehicle with a smart traffic sign detection system can “see” all the signs ahead, including those the driver might miss, and thereby act promptly in time-sensitive situations.

In that context, our project focused on two objectives: 
1. to **benchmark** a range of model architectures to identify the best-performing and most efficient solution for this task  
<br>
2. to **deploy** and demonstrate precise and reliable detection of critical traffic signs.

Before proceeding into further detail, it is crucial to address certain concepts related to how the task was framed:
1. The project is a `computer vision` task – a domain in which images are processed and analyzed in order to extract useful information that can drive decision-making (Arabnia et al., 2018; Yoshida, 2011).  
<br>
2. This project is specifically an `object detection` task, where an image is analyzed not to obtain the semantic meaning of the whole image (i.e., `image classification`) or to segment the image into meaningful regions (i.e., `image segmentation`), but to identify targeted objects present in the image and determine where in the image they are located. In this project, the objects of interest were `traffic lights`, `stop` signs, `speed limit` signs, and `crosswalk` signs, which are fundamental elements that guide drivers and traffic flow. The algorithms required for this task were defined to be *deep learning (DL) convolutional neural networks (CNNs)*, which have been the state of the art in the domain for the past decade.  
<br>
3. The project also leveraged a DL technique called transfer learning, in which neural networks pretrained on a large dataset are fine-tuned on the dataset of interest instead of being trained from scratch, and are therefore capable of attaining high evaluation scores in the new domain. The details of the data, pretrained models, and training and evaluation frameworks are discussed in the following section.
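
To make the object detection framing above concrete, here is a minimal sketch of what a detector's output might look like for one image: a class label, a confidence score, and a bounding box for each object. It is not the project's code. The `Detection` class, the `to_normalized_center_box` helper, the 0.5 score threshold, and the sample values are all hypothetical, and the normalized center-box layout is assumed here as the convention commonly used for YOLO-style labels.

```python
from dataclasses import dataclass

# The four classes named in this report.
CLASS_NAMES = ("traffic light", "stop", "speed limit", "crosswalk")


@dataclass(frozen=True)
class Detection:
    """One detected object: its class, confidence, and pixel-space box corners."""
    label: str
    score: float
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def __post_init__(self):
        # Reject labels outside the project's classes and degenerate boxes.
        if self.label not in CLASS_NAMES:
            raise ValueError(f"unknown class: {self.label!r}")
        if not (self.xmin < self.xmax and self.ymin < self.ymax):
            raise ValueError("box must have positive width and height")


def to_normalized_center_box(det, image_width, image_height):
    """Convert pixel corners to (x_center, y_center, width, height) in [0, 1].

    Assumption: this normalized center-box layout is the target format.
    """
    x_center = (det.xmin + det.xmax) / 2 / image_width
    y_center = (det.ymin + det.ymax) / 2 / image_height
    width = (det.xmax - det.xmin) / image_width
    height = (det.ymax - det.ymin) / image_height
    return (x_center, y_center, width, height)


# Hypothetical detections for a single 400x300 image (illustrative values only).
detections = [
    Detection("stop", 0.91, 100, 50, 200, 150),
    Detection("traffic light", 0.42, 300, 20, 330, 90),
]

# Keep only detections above an assumed confidence threshold.
confident = [d for d in detections if d.score >= 0.5]

for d in confident:
    print(d.label, to_normalized_center_box(d, 400, 300))
```

Unlike image classification, which would return a single label for the whole image, each record here ties a label to a location, which is what allows a downstream system to react to a specific sign.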


### 1.2. Related Work

The authors took [this YOLOv5 tutorial](https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb) and [this TensorFlow tutorial](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html) as the starting points for this project.