<a href="https://colab.research.google.com/github/palit-ishan/Traffic-and-Road-Signs-Object-Detection-Project/blob/main/Road_Signs_Detection_Model_Training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Train a custom object detection model for Road Sign Detection with TensorFlow Lite Model Maker

We will use the [TensorFlow Lite Model Maker](https://www.tensorflow.org/lite/guide/model_maker) to train a custom object detection model to detect Road Signs and put the TFLite model on a Raspberry Pi.

The Model Maker library uses *transfer learning* to simplify the process of training a TensorFlow Lite model using a custom dataset. Retraining a TensorFlow Lite model with your own custom dataset reduces the amount of training data required and will shorten the training time.


## Preparation

### Install the required packages
We will install the required packages: the Model Maker package, whose source is in the [GitHub repo](https://github.com/tensorflow/examples/tree/master/tensorflow_examples/lite/model_maker), and the `tflite-support` package. The COCO-style evaluation later in this notebook relies on the pycocotools library.

In [None]:
!pip install -q tflite-model-maker
!pip install -q tflite-support


In [None]:
!pip install numpy==1.23.4

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


Import the required packages.

In [None]:
import os

from tflite_model_maker.config import ExportFormat, QuantizationConfig
from tflite_model_maker import model_spec
from tflite_model_maker import object_detector

from tflite_support import metadata

import tensorflow as tf
assert tf.__version__.startswith('2')

tf.get_logger().setLevel('ERROR')
from absl import logging
logging.set_verbosity(logging.ERROR)
import cv2

In [None]:
import numpy as np

### Prepare the dataset - Mounting the Dataset in Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## Training the object detection model

### Loading the dataset

* Images in `train_data` are used to train the custom object detection model. Here we use 745 images to train the model.
* Images in `val_data` are used to check whether the model generalizes well to new images that it hasn't seen before. Validation is done on 132 images.

In [None]:
train_data = object_detector.DataLoader.from_pascal_voc(
    'drive/MyDrive/Project_Test_Data/train_jpg',
    'drive/MyDrive/Project_Test_Data/train_jpg',
    ['trafficlight','stop','speedlimit','crosswalk']
)
val_data = object_detector.DataLoader.from_pascal_voc(
    'drive/MyDrive/Project_Test_Data/validate_jpg',
    'drive/MyDrive/Project_Test_Data/validate_jpg',
    ['trafficlight','stop','speedlimit','crosswalk']
)

In [None]:
val_data.size

132
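Before training, it can help to check how many objects of each class the annotations contain. The sketch below assumes the usual Pascal VOC layout, in which each XML file lists its objects as `<object><name>...</name></object>`. The helper name `count_voc_objects` is hypothetical and is not part of Model Maker.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_voc_objects(annotations_dir):
    """Count annotated objects per class name across all VOC XML files."""
    counts = Counter()
    for xml_path in sorted(Path(annotations_dir).glob('*.xml')):
        root = ET.parse(xml_path).getroot()
        # Each <object> element describes one labelled bounding box
        for obj in root.iter('object'):
            counts[obj.findtext('name')] += 1
    return counts


# Example call (path taken from the loading cell above):
# counts = count_voc_objects('drive/MyDrive/Project_Test_Data/train_jpg')
```

A class with very few objects in the result is a likely candidate for a lower per-class AP later on.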

### Selecting a model architecture

EfficientDet-Lite[0-4] are a family of mobile/IoT-friendly object detection models derived from the [EfficientDet](https://arxiv.org/abs/1911.09070) architecture.

Here is how the EfficientDet-Lite models compare with one another.

| Model architecture | Size(MB)* | Latency(ms)** | Average Precision*** |
|--------------------|-----------|---------------|----------------------|
| EfficientDet-Lite0 | 4.4       | 146           | 25.69%               |
| EfficientDet-Lite1 | 5.8       | 259           | 30.55%               |
| EfficientDet-Lite2 | 7.2       | 396           | 33.97%               |
| EfficientDet-Lite3 | 11.4      | 716           | 37.70%               |
| EfficientDet-Lite4 | 19.9      | 1886          | 41.96%               |

<i> * Size of the integer quantized models. <br/>
** Latency measured on Raspberry Pi 4 using 4 threads on CPU. <br/>
*** Average Precision is the mAP (mean Average Precision) on the COCO 2017 validation dataset.
</i>

In this notebook, we use EfficientDet-Lite2 to train our model, as it offers a good trade-off between latency and average precision.

In [None]:
spec = model_spec.get('efficientdet_lite2')
spec.config.autoaugment_policy = 'v0'
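The trade-off in the table can be made explicit with a small helper that picks the most accurate architecture fitting a latency budget. This is only a sketch: the numbers are copied from the table above (Raspberry Pi 4, 4 CPU threads), and the function name `most_accurate_within` is hypothetical.

```python
# (name, size in MB, latency in ms, COCO mAP in %), copied from the table above
MODELS = [
    ("EfficientDet-Lite0", 4.4, 146, 25.69),
    ("EfficientDet-Lite1", 5.8, 259, 30.55),
    ("EfficientDet-Lite2", 7.2, 396, 33.97),
    ("EfficientDet-Lite3", 11.4, 716, 37.70),
    ("EfficientDet-Lite4", 19.9, 1886, 41.96),
]


def most_accurate_within(latency_budget_ms):
    """Return the name of the highest-mAP model within the budget, or None."""
    candidates = [m for m in MODELS if m[2] <= latency_budget_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[3])[0]


choice = most_accurate_within(400)  # a budget of roughly 2.5 frames per second
```

With a 400 ms budget the helper selects EfficientDet-Lite2, which matches the choice made in this notebook.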

### Training the TensorFlow model with the training data.

* Set `epochs=50`, which means the model will go through the training dataset 50 times. Watch the validation loss (`val_loss`) during training and stop when it no longer decreases, to avoid overfitting.
* Set `batch_size=16`, so it takes 46 steps to go through the 745 images in the training dataset.
* Set `train_whole_model=True` to fine-tune the whole model instead of just training the head layer, which improves accuracy. The trade-off is that training may take longer.
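The step count quoted above follows from integer division. The sketch below assumes the final partial batch is dropped in each epoch, which is consistent with the 46 steps mentioned in the list; the variable names are illustrative.

```python
num_train_images = 745
batch_size = 16
epochs = 50

# Integer division: the 9 images that do not fill a batch are left over
steps_per_epoch = num_train_images // batch_size
leftover_images = num_train_images % batch_size
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, leftover_images, total_steps)
```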

In [None]:
model = object_detector.create(
    train_data,
    model_spec=spec,
    batch_size=16,
    train_whole_model=True,
    epochs=50,
    validation_data=val_data,
)

Epoch 1/50
...
Epoch 50/50
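To decide where validation loss has stopped decreasing, a patience rule is a common choice. The sketch below assumes you have collected the per-epoch `val_loss` values into a list yourself. The function name `epoch_to_stop` is hypothetical and the loss values are made up for illustration; they are not results from this training run.

```python
def epoch_to_stop(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once `patience` consecutive epochs have passed
    without a new best (lowest) validation loss.
    """
    best = float("inf")
    epochs_since_best = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                return epoch
    return None


# Illustrative values only, not taken from the run above
example_losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.69]
stop_at = epoch_to_stop(example_losses, patience=3)
```

A larger `patience` tolerates longer plateaus; with `patience=5` the same list never triggers a stop because epoch 7 sets a new best.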


### Evaluating the model with the validation data.

After training the object detection model using the images in the training dataset, use the 132 images in the validation dataset to evaluate how the model performs against new data it has never seen before.

As the default batch size is 64, the 132 images in the validation dataset span two full batches plus a final partial batch of 4 images.

The evaluation metrics are the same as those used by [COCO](https://cocodataset.org/#detection-eval).

In [None]:
val_data.size

132

In [None]:
model.evaluate(val_data)




{'AP': 0.64514923,
 'AP50': 0.78636855,
 'AP75': 0.7598101,
 'APs': 0.48135948,
 'APm': 0.8445239,
 'APl': 0.9590602,
 'ARmax1': 0.58477634,
 'ARmax10': 0.7166675,
 'ARmax100': 0.7193265,
 'ARs': 0.57237744,
 'ARm': 0.8945395,
 'ARl': 0.96666664,
 'AP_/trafficlight': 0.45070502,
 'AP_/stop': 0.66664034,
 'AP_/speedlimit': 0.86242515,
 'AP_/crosswalk': 0.6008265}
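The COCO metrics above are built on intersection over union (IoU). `AP50` counts a detection as correct when its IoU with a ground-truth box is at least 0.5, `AP75` requires at least 0.75, and `AP` averages over IoU thresholds from 0.50 to 0.95. A minimal sketch of the IoU computation, assuming boxes in `[ymin, xmin, ymax, xmax]` order (the function name `iou` is illustrative):

```python
def iou(box_a, box_b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    # Corners of the overlapping rectangle
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 1.0, 1.0)
prediction = (0.0, 0.0, 1.0, 0.625)  # covers 62.5% of the ground-truth box
overlap = iou(ground_truth, prediction)
```

In this example the prediction would count as a match for `AP50` but not for `AP75`, which is why `AP75` is never higher than `AP50`.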

### Export as a TensorFlow Lite model.

Export the trained object detection model to the TensorFlow Lite format by specifying the folder to export the quantized model to. The default post-training quantization technique is [full integer quantization](https://www.tensorflow.org/lite/performance/post_training_integer_quant). This makes the TensorFlow Lite model smaller, lets it run faster on the Raspberry Pi CPU, and makes it compatible with the Google Coral EdgeTPU.

In [None]:
model.export(export_dir='.', tflite_filename='road_signs_efficientdet_lite2_aug_v0_b16_ep50.tflite')

### Evaluate the TensorFlow Lite model.

Several factors can affect the model accuracy when exporting to TFLite:
* [Quantization](https://www.tensorflow.org/lite/performance/model_optimization) shrinks the model size by about 4 times at the expense of some accuracy.
* The original TensorFlow model uses per-class [non-max suppression (NMS)](https://www.coursera.org/lecture/convolutional-neural-networks/non-max-suppression-dvrjH) for post-processing, while the TFLite model uses global NMS, which is much faster but less accurate.
* Keras outputs a maximum of 100 detections, while TFLite outputs a maximum of 25.
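The difference between per-class and global NMS can be seen in a small NumPy sketch. This is a simplified greedy NMS written for illustration, not the implementation used inside TensorFlow or TFLite; the function names and the IoU threshold of 0.5 are assumptions.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one [ymin, xmin, ymax, xmax] box and an (N, 4) array."""
    ymin = np.maximum(box[0], boxes[:, 0])
    xmin = np.maximum(box[1], boxes[:, 1])
    ymax = np.minimum(box[2], boxes[:, 2])
    xmax = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ymax - ymin, 0, None) * np.clip(xmax - xmin, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def global_nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS that ignores class labels; returns the kept indices."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Drop every remaining box that overlaps the kept box too much
        order = rest[iou_one_to_many(boxes[best], boxes[rest]) <= iou_threshold]
    return keep


def per_class_nms(boxes, scores, classes, iou_threshold=0.5):
    """Run greedy NMS separately for each class; returns the kept indices."""
    keep = []
    for cls in np.unique(classes):
        idx = np.where(classes == cls)[0]
        kept = global_nms(boxes[idx], scores[idx], iou_threshold)
        keep.extend(int(idx[k]) for k in kept)
    return sorted(keep)


# Two heavily overlapping boxes that belong to different classes
boxes = np.array([[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.75]])
scores = np.array([0.9, 0.8])
classes = np.array([0, 1])

kept_global = global_nms(boxes, scores)
kept_per_class = per_class_nms(boxes, scores, classes)
```

Global NMS keeps only the higher-scoring box, while per-class NMS keeps both because they carry different labels. This is one way overlapping signs of different classes can be lost after export.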

Therefore you'll have to evaluate the exported TFLite model and compare its accuracy with the original TensorFlow model.

In [None]:
model.evaluate_tflite('road_signs_efficientdet_lite2_aug_v0_b16_ep50.tflite', val_data)




{'AP': 0.62906927,
 'AP50': 0.78477293,
 'AP75': 0.7391676,
 'APs': 0.46777886,
 'APm': 0.81886834,
 'APl': 0.9587459,
 'ARmax1': 0.56349117,
 'ARmax10': 0.67622435,
 'ARmax100': 0.67622435,
 'ARs': 0.52254903,
 'ARm': 0.8516886,
 'ARl': 0.96,
 'AP_/trafficlight': 0.44190758,
 'AP_/stop': 0.6226553,
 'AP_/speedlimit': 0.8511243,
 'AP_/crosswalk': 0.60059}
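To see how much accuracy the export cost, the two result dictionaries can be compared key by key. The sketch below copies a few of the values printed above; the variable names are illustrative.

```python
# Values copied from the model.evaluate and model.evaluate_tflite outputs above
keras_metrics = {
    'AP': 0.64514923,
    'AP50': 0.78636855,
    'AP75': 0.7598101,
    'ARmax100': 0.7193265,
}
tflite_metrics = {
    'AP': 0.62906927,
    'AP50': 0.78477293,
    'AP75': 0.7391676,
    'ARmax100': 0.67622435,
}

# Negative numbers mean the TFLite model scored lower than the Keras model
delta = {name: tflite_metrics[name] - keras_metrics[name] for name in keras_metrics}

for name, change in delta.items():
    print(f"{name}: {change:+.4f}")
```

Here the overall `AP` drops by about 0.016 and `ARmax100` by about 0.043, while `AP50` is nearly unchanged.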

## Testing the model on a still image to see the output with bounding boxes

In [None]:
labels = ['trafficlight','stop','speedlimit','crosswalk']

In [None]:
# Load the exported TFLite model and allocate its tensors
interpreter = tf.lite.Interpreter(model_path="road_signs_efficientdet_lite2_aug_v0_b16_ep50.tflite")
interpreter.allocate_tensors()
# The signature runner lets us call the model with named inputs and outputs
signature_fn = interpreter.get_signature_runner()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()


In [None]:
input_details

[{'name': 'serving_default_images:0',
  'index': 0,
  'shape': array([  1, 448, 448,   3], dtype=int32),
  'shape_signature': array([  1, 448, 448,   3], dtype=int32),
  'dtype': numpy.uint8,
  'quantization': (0.0078125, 127),
  'quantization_parameters': {'scales': array([0.0078125], dtype=float32),
   'zero_points': array([127], dtype=int32),
   'quantized_dimension': 0},
  'sparsity_parameters': {}}]
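The `quantization` entry above holds the input tensor's scale (0.0078125) and zero point (127). Under TensorFlow Lite's affine quantization scheme, a stored integer `q` represents the real value `scale * (q - zero_point)`. The sketch below shows the mapping for this input; the helper names are hypothetical.

```python
SCALE = 0.0078125  # from input_details[0]['quantization']
ZERO_POINT = 127   # from input_details[0]['quantization']


def dequantize(q):
    """Real value represented by a stored uint8 value."""
    return SCALE * (q - ZERO_POINT)


def quantize(x):
    """Nearest uint8 value for a real number, clamped to [0, 255]."""
    q = round(x / SCALE) + ZERO_POINT
    return max(0, min(255, q))


lowest = dequantize(0)     # darkest pixel value
highest = dequantize(255)  # brightest pixel value
```

So raw pixel values from 0 to 255 are interpreted by the model as real values from roughly -0.99 to 1.0, which is why the image can be fed in as `uint8` without manual normalization.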

In [None]:
# Target size for cv2.resize, which expects (width, height).
# The input tensor shape is [batch, height, width, channels].
rdim = (input_details[0]['shape'][2], input_details[0]['shape'][1])

In [None]:
#Providing Path for Image
image_path = 'stop_cross_walk.jpg'

In [None]:
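input_data_type = input_details[0]["dtype"]
or_image = cv2.imread(image_path)  # OpenCV loads images in BGR channel order
# Convert to RGB for the model; or_image stays in BGR for drawing and saving
image = cv2.cvtColor(or_image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, rdim)
image = np.array(image, dtype=input_data_type)
image = np.expand_dims(image, axis=0)  # add the batch dimension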

In [None]:
or_image.shape

(607, 940, 3)

In [None]:
output = signature_fn(images=image)
count = int(np.squeeze(output['output_0']))
scores = np.squeeze(output['output_1'])
classes = np.squeeze(output['output_2'])
boxes = np.squeeze(output['output_3'])

In [None]:
results = []
for i in range(count):
  if scores[i] >= 0.3:        # Keep only detections with a confidence score of at least 0.3
    result = {
      'bounding_box': boxes[i],
      'class_id': classes[i],
      'score': scores[i]
    }
    results.append(result)
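The same filtering can be written with a NumPy boolean mask instead of an explicit index loop. This is an equivalent sketch rather than a required change; the function name `filter_detections` is hypothetical, and unlike the loop above it converts `class_id` to `int` and `score` to `float` up front.

```python
import numpy as np


def filter_detections(scores, classes, boxes, count, threshold=0.3):
    """Keep the first `count` detections whose score is at least `threshold`."""
    scores = np.asarray(scores)[:count]
    classes = np.asarray(classes)[:count]
    boxes = np.asarray(boxes)[:count]
    keep = scores >= threshold  # boolean mask over the valid detections
    return [
        {'bounding_box': box, 'class_id': int(cls), 'score': float(score)}
        for box, cls, score in zip(boxes[keep], classes[keep], scores[keep])
    ]


# Usage with the arrays from the inference cell:
# results = filter_detections(scores, classes, boxes, count, threshold=0.3)
```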

In [None]:
# One random color per label; the upper bound of randint is exclusive
COLORS = np.random.randint(0, 256, size=(len(labels), 3), dtype=np.uint8)
for obj in results:
    # Convert the object bounding box from relative coordinates to absolute
    # coordinates based on the original image resolution
    ymin, xmin, ymax, xmax = obj['bounding_box']
    xmin = int(xmin * or_image.shape[1])
    xmax = int(xmax * or_image.shape[1])
    ymin = int(ymin * or_image.shape[0])
    ymax = int(ymax * or_image.shape[0])

    # Find the class index of the current object
    class_id = int(obj['class_id'])

    # Draw the bounding box and label on the image
    color = [int(c) for c in COLORS[class_id]]
    cv2.rectangle(or_image, (xmin, ymin), (xmax, ymax), color, 2)
    # Make adjustments to make the label visible for all objects
    y = ymin - 15 if ymin - 15 > 15 else ymin + 15
    label = "{}: {:.0f}%".format(labels[class_id], obj['score'] * 100)
    cv2.putText(or_image, label, (xmin, y),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
    
original_uint8 = or_image.astype(np.uint8)

In [None]:
cv2.imwrite('output.png', original_uint8)

True
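The coordinate conversion used in the drawing loop can be isolated into a small helper. The model returns boxes as `[ymin, xmin, ymax, xmax]` in relative coordinates, which are scaled by the original image height and width. The sketch below also clamps the result to the image bounds, which is an addition not present in the notebook; the function name `to_pixel_box` is hypothetical.

```python
def to_pixel_box(box, image_height, image_width):
    """Convert a relative [ymin, xmin, ymax, xmax] box to pixel corners.

    Returns (xmin, ymin, xmax, ymax), the order used for cv2.rectangle corners.
    """
    ymin, xmin, ymax, xmax = box

    def clamp(value, upper):
        # Truncate to an integer and keep it inside [0, upper]
        return max(0, min(int(value), upper))

    return (
        clamp(xmin * image_width, image_width - 1),
        clamp(ymin * image_height, image_height - 1),
        clamp(xmax * image_width, image_width - 1),
        clamp(ymax * image_height, image_height - 1),
    )


# Usage with the test image above, whose shape is (607, 940, 3):
# x0, y0, x1, y1 = to_pixel_box(obj['bounding_box'], 607, 940)
```

Clamping matters because predicted boxes can extend slightly past the image edge, which would otherwise place the label text outside the visible area.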