# Traffic sign detection and classification with Detecto and TensorFlow

### Part 1 - *Introduction and Datasets*

All the functions and visualizations I used here can be found on my GitHub page: [https://github.com/alexisvannaire/GTSRB_detect-and-predict](https://github.com/alexisvannaire/GTSRB_detect-and-predict)

**Libraries you'll need to have installed**

* `os`, `pathlib`, `xml`, `shutil`, `functools`, `json`
* `numpy`, `pandas`, `matplotlib`, `plotly`, `random`
* `PIL`, `cv2`, `ipython`
* `sklearn`, `tensorflow`, `detecto`, `labelImg`

<br>

You can find all details in the requirements.txt file if needed.

I wanted to export static images from plotly figures [(https://plotly.com/python/static-image-export/)](https://plotly.com/python/static-image-export/). Because of some issues on Windows (possibly dependency problems), I had to install `kaleido` from a wheel file:
    
`%pip install kaleido-0.1.0.post1-py2.py3-none-win_amd64.whl`

You can find the files here: [https://github.com/plotly/Kaleido/releases/download/v0.1.0.post1/](https://github.com/plotly/Kaleido/releases/download/v0.1.0.post1/)

**For this notebook:**

In [None]:
# libraries
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from IPython import display

import plots # plots.py

In [None]:
# False for default visualizations
gtsrb_exists = False # True if you've placed the gtsrb dataset in the "./data/gtsrb/" folder
video_exists = False # True if you've placed a video in the "./data/google_maps_test/video/" folder

# I. Introduction

There are a lot of things to do and learn in the computer vision field. 
Here are some of the main tasks you can take on: Image Classification, Object Detection, Image Segmentation, Image Enhancement, Face Recognition, Visual Tracking, Image Captioning, Image Generation, 3D Reconstruction, etc. 

A good way to get started is to train a CNN (Convolutional Neural Network) classifier on images.
Many datasets are easy to find, such as MNIST, ImageNet, CIFAR-10, Intel Image Classification, etc.

In these notebooks you're going to see:

* How to fine-tune a pre-trained object detection model on a custom dataset with the **Detecto** Python library [https://github.com/alankbi/detecto](https://github.com/alankbi/detecto)
* How to train a CNN and a MobileNet model with Data Augmentation on the GTSRB dataset with **TensorFlow**
* How to visualize your predictions and create a video with boxes, predicted labels and the associated probabilities

The overall goal is to find and identify every traffic sign that appears in an image.

We're going to proceed in two steps:
* identifying traffic signs inside an image (Object Detection model)
* determining the specific type of traffic sign of each one (Classification model)
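
The two steps above can be sketched as a small function. This is only an illustration of the control flow, not the code used in these notebooks: `detect`, `crop` and `classify` are hypothetical placeholders for the object detection model, the cropping step and the classification model, and the toy stand-ins below exist only so the sketch runs without any trained model.

```python
from typing import Any, Callable, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax), in pixels


def detect_then_classify(
    image: Any,
    detect: Callable[[Any], Iterable[Box]],
    crop: Callable[[Any, Box], Any],
    classify: Callable[[Any], Tuple[str, float]],
) -> List[Tuple[Box, str, float]]:
    """Two-step pipeline: locate the signs, then label each cropped sign."""
    results = []
    for box in detect(image):          # step 1: object detection
        patch = crop(image, box)       # cut the sign out of the full image
        label, prob = classify(patch)  # step 2: classification
        results.append((box, label, prob))
    return results


# Toy stand-ins (assumptions for illustration only).
def fake_detect(image):
    return [(1, 1, 3, 3)]  # pretend one sign was found


def fake_crop(image, box):
    xmin, ymin, xmax, ymax = box
    return [row[xmin:xmax] for row in image[ymin:ymax]]


def fake_classify(patch):
    return ("stop", 0.9)  # pretend label and probability


toy_image = [[0] * 4 for _ in range(4)]  # 4x4 "image" as nested lists
predictions = detect_then_classify(toy_image, fake_detect, fake_crop, fake_classify)
```

Keeping the two models behind separate callables means either one can be retrained or swapped without touching the other.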

In [None]:
Image.open("imgs/process_image.png").convert("RGB")

# II. Datasets

## 1. German Traffic Sign Recognition Benchmark (GTSRB)

***The German Traffic Sign Benchmark** is a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011* (more details here: [https://benchmark.ini.rub.de/gtsrb_news.html](https://benchmark.ini.rub.de/gtsrb_news.html)).

You can download it (about 300 MB) from Kaggle here: [https://www.kaggle.com/datasets/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign](https://www.kaggle.com/datasets/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign)

<br>

We're going to use this dataset to train the classification model, which will predict the type of any new traffic sign.

It contains 51,839 images and 43 classes.

Here are the 43 classes illustrated with the corresponding Wikipedia images:

In [None]:
plots.plot_wiki_traffic_signs_classes(ncols=8, title_size=18)

And here are random examples from the training set:

In [None]:
plots.plot_dataset_traffic_signs_classes(ncols=8, title_size=15, gtsrb_exists=gtsrb_exists)

As you can see, these traffic signs aren't always easy to recognize, even for a human (here mainly because some images are very dark).

Also, their dimensions vary. Let's see how:

In [None]:
fig, img_dimensions = plots.plot_images_dimensions_distribution(gtsrb_exists=gtsrb_exists)
fig

There are many images with dimensions close to 33x33 pixels, most of them square.

Most images are small: the 95th percentile of both width and height is below 100 pixels.

In [None]:
if gtsrb_exists:
    # an expression inside an `if` block isn't displayed automatically, so print it
    print(img_dimensions[["width", "height"]].quantile(.95))

Now let's look at the number of images per class:

In [None]:
fig, img_per_class = plots.plot_images_number_per_class(gtsrb_exists=gtsrb_exists)
fig

The distribution isn't uniform, which isn't ideal.
Still, the smallest class has 210 images, which isn't that bad, and methods such as **Data Augmentation** or **resampling** can mitigate the imbalance.
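
To make the resampling idea concrete, here is a minimal sketch of random oversampling: images from the smaller classes are repeated at random until every class is as large as the biggest one. It is an assumption about how resampling could be done, not the method used later in these notebooks, and `oversample` is a hypothetical helper that only works on a list of labels.

```python
import random
from collections import Counter, defaultdict


def oversample(labels, seed=None):
    """Return sample indices in which every class is as frequent as the largest one."""
    rng = random.Random(seed)

    # Group the sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    target = max(len(idxs) for idxs in by_class.values())

    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)  # keep every original sample once
        # Fill the gap with random repeats drawn from the same class.
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))

    rng.shuffle(balanced)
    return balanced


# Toy labels: class 0 has 6 samples, class 1 has 2, class 2 has 3.
labels = [0] * 6 + [1] * 2 + [2] * 3
balanced_idx = oversample(labels, seed=42)
counts = Counter(labels[i] for i in balanced_idx)
```

Repeated images are exact duplicates, so oversampling is usually combined with Data Augmentation to keep the model from memorizing them.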

## 2. Google Maps - Street View images

### Training Set

For the object detection model we're going to create our own dataset.

As is often the case in Machine Learning and Deep Learning, an accurate model requires enough data.
And enough is a lot! Fortunately, many models have already been trained on similar tasks, and we can reuse them for our own.

We call them pre-trained models, and the process of adapting them to a new task is called **Transfer Learning**.
It lets us start from a model that is already well optimized for computer vision and train it on our custom dataset, which doesn't need to be as large as the one required to train a model from scratch.

So the idea here is to create a dataset from Street View images taken from Google Maps.

As the classification model will be trained on German traffic signs, we're going to use images from Germany.

I took 92 Street View screenshots in several places, each containing at least one traffic sign.
Their dimensions are close to 1920x630 pixels.

Here's an example from Rostock, a city in the north of Germany:

In [None]:
Image.open("imgs/76_54.1765866,12.1010878.png").convert("RGB")

To get more images, and less standard points of view, I randomly cropped each screenshot to 720x480 pixels.

You can find the corresponding function: `random_crop` in the `process_data.py` file.
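
As a rough idea of what such a crop involves, here is a minimal sketch that only draws the crop boxes. It is not the `random_crop` function from `process_data.py`: `random_crop_boxes` and its parameters are hypothetical, and the defaults (720x480, three crops per image) simply mirror the numbers used in this notebook.

```python
import random
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower), in pixels


def random_crop_boxes(
    img_w: int,
    img_h: int,
    crop_w: int = 720,
    crop_h: int = 480,
    n: int = 3,
    seed: Optional[int] = None,
) -> List[Box]:
    """Draw n random crop boxes of size crop_w x crop_h that fit inside the image."""
    if crop_w > img_w or crop_h > img_h:
        raise ValueError("the crop must fit inside the image")
    rng = random.Random(seed)
    boxes = []
    for _ in range(n):
        # randint is inclusive, so the crop can touch the right/bottom edge.
        left = rng.randint(0, img_w - crop_w)
        upper = rng.randint(0, img_h - crop_h)
        boxes.append((left, upper, left + crop_w, upper + crop_h))
    return boxes


# Boxes for one 1920x630 screenshot.
boxes = random_crop_boxes(1920, 630, seed=0)
```

Each box uses the `(left, upper, right, lower)` order expected by PIL's `Image.crop`, so `img.crop(box)` would return the corresponding 720x480 image.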

Here are the three crops from the image above:

In [None]:
imgs = [np.array(Image.open(f"data/google_maps_test/labeled_imgs/76-{i}.jpg").convert("RGB")) for i in range(3)]
fig, axs = plt.subplots(nrows=1, ncols=3, figsize=(12,3))
for i in range(3):
    axs[i].imshow(imgs[i])
    axs[i].axis("off")
plt.show()

### Test Set

In order to test the model, I recorded a video of my screen while I was moving in street view. 

In [None]:
display.Image("imgs/start_video1.gif")

Once recorded, you can split the video into frames with the Detecto package:

In [None]:
if video_exists:
    from detecto.utils import split_video
    split_video("data/google_maps_test/video/video.mp4", "data/google_maps_test/frames_decomposed_video/", step_size=3)

The `step_size` argument sets the interval between saved frames.

For example, a `step_size` of 3 will save every third frame.
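
The effect of `step_size` can be illustrated without any video. In this small sketch, `kept_frame_indices` is a hypothetical helper, not part of Detecto, and it assumes frames are counted from zero and that the first frame is always kept.

```python
from typing import List


def kept_frame_indices(n_frames: int, step_size: int = 1) -> List[int]:
    """Indices of the frames that are saved when keeping one frame every step_size."""
    if step_size < 1:
        raise ValueError("step_size must be at least 1")
    return list(range(0, n_frames, step_size))


# A 10-frame video with step_size=3 keeps 4 frames.
kept = kept_frame_indices(10, step_size=3)
print(kept)  # [0, 3, 6, 9]
```

A larger `step_size` gives fewer, less redundant frames, which is useful when consecutive frames of the recording are nearly identical.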

**Note:**
    
1. I added black rectangles to the images in order to hide the Google Maps widgets.
To do so, I opened an image in an editor like Paint and found the four coordinates needed to create each rectangle.
You can find the function I used in the `process_data.py` file: `add_black_rectangles`.

2. I also defined a function to create GIF files from a set of images; you'll find it in the `plots.py` file: `make_gif`.