# Retrain EfficientDet-Lite0 Model for the "Modelrailway-Cam"

In this Jupyter notebook, we'll retrain an EfficientDet-Lite object detection model (derived from [EfficientDet](https://ai.googleblog.com/2020/04/efficientdet-towards-scalable-and.html)) using the [TensorFlow Lite Model Maker library](https://www.tensorflow.org/lite/guide/model_maker), and then compile it to run on the [Coral Edge TPU](https://www.coral.ai/products/). The whole process takes about 10 minutes on a GPU, so first change the runtime type ("Laufzeittyp" in the German Colab UI) to "GPU" in the menu.

This notebook retrains the model using images of a model railway showing locomotives and wagons. It is an adapted version of the original notebook: [Train a salad detector with TFLite Model Maker - Colaboratory (google.com)](https://colab.research.google.com/github/googlecodelabs/odml-pathways/blob/main/object-detection/codelab2/python/Train_a_salad_detector_with_TFLite_Model_Maker.ipynb)

Author: Detlef Heinze   Version: 1.0  

In [1]:
#@title
# -*- coding: utf-8 -*-

In [2]:
# Mount your Google Drive for this notebook and follow the on-screen prompts.
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


The training data must be available in the folder `/content/drive/MyDrive/TrainData`.
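
A quick pre-flight check can catch a missing folder before the slower install and training cells run. The following is a minimal sketch using only the standard library; the helper name `check_train_dir` is hypothetical, and the file name `training.csv` is the one loaded later in this notebook.

```python
from pathlib import Path

def check_train_dir(train_dir, csv_name="training.csv"):
    """Return a list of human-readable problems; an empty list means the folder looks usable."""
    train_dir = Path(train_dir)
    problems = []
    if not train_dir.is_dir():
        problems.append(f"missing directory: {train_dir}")
    elif not (train_dir / csv_name).is_file():
        problems.append(f"missing file: {train_dir / csv_name}")
    return problems

# Path used by this notebook after Google Drive has been mounted.
for problem in check_train_dir("/content/drive/MyDrive/TrainData"):
    print(problem)
```

If nothing is printed, the folder and the CSV file were both found.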


## Import the required packages

In [3]:
!pip install -q tflite-model-maker
!pip install -q pycocotools
!pip install -q tflite-support


In [4]:
import numpy as np
import os

from tflite_model_maker.config import ExportFormat
from tflite_model_maker import model_spec
from tflite_model_maker import object_detector

import tensorflow as tf
assert tf.__version__.startswith('2')

tf.get_logger().setLevel('ERROR')
from absl import logging
logging.set_verbosity(logging.ERROR)

## Load the training data


### Load the training data set by using training.csv



Model Maker requires that we load our dataset using the [`DataLoader`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/DataLoader) API. In this case, we'll load it from a CSV file that defines which images are used for training, validation, and testing. First, change into the training data directory.




In [5]:
# Change into the training data folder on Google Drive, then load the CSV file.
%cd /content/drive/MyDrive/TrainData
train_data, validation_data, test_data = object_detector.DataLoader.from_csv('training.csv')

/content/drive/MyDrive/TrainData
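
For orientation, here is a sketch of how one row of such a CSV file could be produced. It assumes the layout used by the salad dataset of the original notebook (split name, image path, label, then normalized `xmin, ymin`, two empty fields, `xmax, ymax`, two empty fields); check your own `training.csv` against it. The helper `make_row`, the image name, and the box values are hypothetical.

```python
import csv
import io

def make_row(split, image_path, label, box_px, image_size):
    """Build one CSV row from a pixel bounding box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box_px
    width, height = image_size
    # Coordinates are stored relative to the image size, in the range 0..1.
    return [
        split, image_path, label,
        round(xmin / width, 4), round(ymin / height, 4), "", "",
        round(xmax / width, 4), round(ymax / height, 4), "", "",
    ]

# Hypothetical image name and box; "Dampflok" is one of the labels in this dataset.
row = make_row("TRAINING", "lok_001.jpg", "Dampflok", (64, 120, 448, 360), (640, 480))

buffer = io.StringIO()
csv.writer(buffer).writerow(row)
print(buffer.getvalue().strip())
# prints: TRAINING,lok_001.jpg,Dampflok,0.1,0.25,,,0.7,0.75,,
```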


## Select the model spec

Model Maker supports the EfficientDet-Lite family of object detection models that are compatible with the Edge TPU. (EfficientDet-Lite is derived from [EfficientDet](https://ai.googleblog.com/2020/04/efficientdet-towards-scalable-and.html), which offers state-of-the-art accuracy in a small model size). There are several model sizes you can choose from:

| Model architecture | Size (MB)\* | Latency (ms)\*\* | Average Precision\*\*\* |
|--------------------|-------------|------------------|-------------------------|
| EfficientDet-Lite0 | 5.7         | 37.4             | 30.4%                   |
| EfficientDet-Lite1 | 7.6         | 56.3             | 34.3%                   |
| EfficientDet-Lite2 | 10.2        | 104.6            | 36.0%                   |
| EfficientDet-Lite3 | 14.4        | 107.6            | 39.4%                   |

<i>\* File size of the compiled Edge TPU models. <br/>\*\* Latency measured on a desktop CPU with a Coral USB Accelerator. <br/>\*\*\* Average Precision is the mAP (mean Average Precision) on the COCO 2017 validation dataset.</i>

Beware that the Lite2 and Lite3 models do not fit onto the Edge TPU's onboard memory, so you'll see even greater latency when using those, due to the cost of fetching data from the host system memory. Maybe this extra latency is okay for your application, but if it's not and you require the precision of the larger models, then you can [pipeline the model across multiple Edge TPUs](https://coral.ai/docs/edgetpu/pipeline/) (more about this when we compile the model below).
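
The trade-off in the table can be expressed as a small selection helper. This is only a sketch: the numbers are copied from the table above, while `pick_model` and the two limits passed to it are hypothetical values chosen for illustration (the compiled file size is also only a rough proxy for what fits into the on-chip cache).

```python
# Values copied from the table above: (name, size in MB, latency in ms, COCO mAP in %).
MODELS = [
    ("EfficientDet-Lite0", 5.7, 37.4, 30.4),
    ("EfficientDet-Lite1", 7.6, 56.3, 34.3),
    ("EfficientDet-Lite2", 10.2, 104.6, 36.0),
    ("EfficientDet-Lite3", 14.4, 107.6, 39.4),
]

def pick_model(max_size_mb, max_latency_ms):
    """Return the most accurate model within both limits, or None if none qualifies."""
    candidates = [m for m in MODELS if m[1] <= max_size_mb and m[2] <= max_latency_ms]
    return max(candidates, key=lambda m: m[3], default=None)

# Hypothetical limits: stay below 8 MB and below 50 ms per frame.
print(pick_model(max_size_mb=8.0, max_latency_ms=50.0))
```

With these example limits only Lite0 qualifies; relaxing the latency limit to 100 ms would select Lite1 instead.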

For the modelrailway-cam, we'll use Lite0:

In [6]:
spec = object_detector.EfficientDetLite0Spec()

## Create and train the model

Now we need to create our model according to the model spec, load our dataset into the model, specify training parameters, and begin training. 

Using Model Maker, we accomplish all of that with [`create()`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/create):

In [7]:
model = object_detector.create(train_data=train_data, 
                               model_spec=spec, 
                               validation_data=validation_data, 
                               epochs=40, 
                               batch_size=8, 
                               train_whole_model=True)

Epoch 1/40
Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40
Epoch 22/40
Epoch 23/40
Epoch 24/40
Epoch 25/40
Epoch 26/40
Epoch 27/40
Epoch 28/40
Epoch 29/40
Epoch 30/40
Epoch 31/40
Epoch 32/40
Epoch 33/40
Epoch 34/40
Epoch 35/40
Epoch 36/40
Epoch 37/40
Epoch 38/40
Epoch 39/40
Epoch 40/40


## Evaluate the model

Now we'll use the test dataset to evaluate how well the model performs with data it has never seen before.

The [`evaluate()`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/ObjectDetector#evaluate) method provides output in the style of [COCO evaluation metrics](https://cocodataset.org/#detection-eval):

In [8]:
model.evaluate(test_data)




{'AP': 0.8596617,
 'AP50': 1.0,
 'AP75': 1.0,
 'AP_/:Dampflok': 0.82432514,
 'AP_/:Diesellok': 0.89499825,
 'APl': -1.0,
 'APm': 0.86020523,
 'APs': -1.0,
 'ARl': -1.0,
 'ARm': 0.90384614,
 'ARmax1': 0.8769231,
 'ARmax10': 0.90384614,
 'ARmax100': 0.90384614,
 'ARs': -1.0}
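
The value `-1.0` for `APs`, `APl`, `ARs`, and `ARl` is the COCO convention for "not computed": the test set contains no ground-truth boxes in the small or large size category, so only the medium metrics carry information. The sketch below shows the COCO area thresholds (32² and 96² pixels). The helper `coco_size_bucket` and the example sizes are hypothetical, and it is an assumption that areas are measured at the resolution the evaluator works with.

```python
def coco_size_bucket(width_px, height_px):
    """Classify a box by pixel area using the COCO thresholds 32^2 and 96^2."""
    area = width_px * height_px
    if area < 32 ** 2:
        return "small"
    if area <= 96 ** 2:
        return "medium"
    return "large"

# Hypothetical box sizes in pixels.
for size in [(20, 20), (80, 60), (200, 150)]:
    print(size, coco_size_bucket(*size))
```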

Because the default batch size for [EfficientDetLite models](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/EfficientDetSpec) is 64, this needs only 1 step to go through all images in the test set. You can also specify the `batch_size` argument when you call [`evaluate()`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/ObjectDetector#evaluate).
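
The step count follows from a simple ceiling division. The sketch below uses hypothetical test set sizes, since the actual number of test images is not stated here; `evaluation_steps` is an illustrative helper, not part of Model Maker.

```python
import math

def evaluation_steps(num_images, batch_size=64):
    """Number of batches needed to cover every image once."""
    return math.ceil(num_images / batch_size)

# Hypothetical test set sizes; 64 is the default batch size mentioned above.
for n in (25, 64, 65, 200):
    print(n, "images ->", evaluation_steps(n), "step(s)")
```

Any test set with at most 64 images is therefore evaluated in a single step.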

## Export to TensorFlow Lite

Next, we'll export the model to the TensorFlow Lite format. By default, the [`export()`](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/ObjectDetector#export) method performs [full integer post-training quantization](https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization), which is exactly what we need for compatibility with the Edge TPU. (Model Maker uses the same dataset we gave to our model spec as a representative dataset, which is required for full-int quantization.)

We just need to specify the export directory and format. By default, it exports to TF Lite, but we also want a labels file, so we declare both:

In [9]:
TFLITE_FILENAME = 'smrc_model.tflite'
LABELS_FILENAME = 'railwayLabels.txt'

In [10]:
model.export(export_dir='.', tflite_filename=TFLITE_FILENAME, label_filename=LABELS_FILENAME,
             export_format=[ExportFormat.TFLITE, ExportFormat.LABEL])
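
To give an intuition for what full integer quantization does to a single value, here is a minimal sketch of the affine mapping between floats and 8-bit integers (`real = scale * (q - zero_point)`). The signed int8 range, the `scale`, and the `zero_point` below are assumptions for illustration; the exported model stores its own parameters per tensor.

```python
def quantize(x, scale, zero_point):
    """Map a float to int8 using q = round(x / scale) + zero_point, clamped to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """Map an int8 value back to an approximate float."""
    return scale * (q - zero_point)

# Hypothetical parameters for a tensor whose values span about -1.0 .. 1.0.
scale, zero_point = 1.0 / 127, 0
for x in (0.25, -0.3, 1.7):
    q = quantize(x, scale, zero_point)
    print(x, "->", q, "->", round(dequantize(q, scale, zero_point), 4))
```

Values inside the range come back with a small rounding error, while values outside it (such as 1.7 here) are clipped. Both effects contribute to the accuracy change discussed next.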

### Evaluate the TF Lite model

Exporting the model to TensorFlow Lite can affect the model accuracy, both because quantization reduces numerical precision and because the original TensorFlow model uses per-class [non-max suppression (NMS)](https://www.coursera.org/lecture/convolutional-neural-networks/non-max-suppression-dvrjH) for post-processing, while the TF Lite model uses global NMS, which is faster but less accurate.
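
The difference between per-class and global NMS can be seen in a small, self-contained sketch. This is a plain greedy NMS written for illustration, not the implementation used by either model; the detections and the IoU threshold of 0.5 are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.5, per_class=True):
    """Greedy NMS over (box, score, label) tuples, highest score first."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        overlaps = any(
            iou(det[0], k[0]) > iou_threshold and (not per_class or det[2] == k[2])
            for k in kept
        )
        if not overlaps:
            kept.append(det)
    return kept

# Hypothetical detections: two strongly overlapping boxes with different labels.
dets = [
    ((10, 10, 110, 60), 0.9, "Dampflok"),
    ((15, 12, 115, 62), 0.8, "Diesellok"),
]
print(len(nms(dets, per_class=True)))   # 2: boxes of different classes do not suppress each other
print(len(nms(dets, per_class=False)))  # 1: the lower-scoring box is suppressed
```

Global NMS can thus discard a correct detection that overlaps a higher-scoring box of another class, so results from the two variants can differ.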

Therefore you should always evaluate the exported TF Lite model and be sure it still meets your requirements:

In [11]:
model.evaluate_tflite(TFLITE_FILENAME, test_data)




{'AP': 0.8432532,
 'AP50': 1.0,
 'AP75': 1.0,
 'AP_/:Dampflok': 0.8090587,
 'AP_/:Diesellok': 0.87744766,
 'APl': -1.0,
 'APm': 0.8432532,
 'APs': -1.0,
 'ARl': -1.0,
 'ARm': 0.86346155,
 'ARmax1': 0.86346155,
 'ARmax10': 0.86346155,
 'ARmax100': 0.86346155,
 'ARs': -1.0}
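
To judge whether the exported model still meets your requirements, it helps to look at the difference between the two evaluation results. The sketch below uses a subset of the metrics copied from the two outputs in this notebook; `metric_drop` is a hypothetical helper that skips entries reported as `-1.0` (not computed).

```python
# Metrics copied from the two evaluation outputs in this notebook.
keras_metrics = {"AP": 0.8596617, "AP_/:Dampflok": 0.82432514,
                 "AP_/:Diesellok": 0.89499825, "APs": -1.0, "ARmax100": 0.90384614}
tflite_metrics = {"AP": 0.8432532, "AP_/:Dampflok": 0.8090587,
                  "AP_/:Diesellok": 0.87744766, "APs": -1.0, "ARmax100": 0.86346155}

def metric_drop(before, after):
    """Return before - after for each metric, skipping -1.0 (not computed)."""
    return {
        name: round(before[name] - after[name], 4)
        for name in before
        if before[name] >= 0 and after.get(name, -1.0) >= 0
    }

drops = metric_drop(keras_metrics, tflite_metrics)
for name, drop in drops.items():
    print(f"{name}: dropped by {drop}")
```

For this run the overall AP falls by roughly 0.016 after export, and the recall at 100 detections by about 0.04.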

## Compile for the Edge TPU


First we need to download the Edge TPU Compiler:

In [12]:
! curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

! echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list

! sudo apt-get update

! sudo apt-get install edgetpu-compiler


Before compiling the `.tflite` file for the Edge TPU, it's important to consider whether your model will fit into the Edge TPU memory. 

The Edge TPU has approximately 8 MB of SRAM for [caching model parameters](https://coral.ai/docs/edgetpu/compiler/#parameter-data-caching), so any model close to or over 8 MB will not fit into the Edge TPU memory. In that case inference takes longer, because some model parameters must be fetched from the host system memory.

One way to eliminate the extra latency is to use [model pipelining](https://coral.ai/docs/edgetpu/pipeline/), which splits the model into segments that can run on separate Edge TPUs in series. This can significantly reduce the latency for big models.

The following table provides recommendations for the number of Edge TPUs to use with each EfficientDet-Lite model.

| Model architecture | Minimum TPUs | Recommended TPUs |
|--------------------|-------|-------|
| EfficientDet-Lite0 | 1     | 1     |
| EfficientDet-Lite1 | 1     | 1     |
| EfficientDet-Lite2 | 1     | 2     |
| EfficientDet-Lite3 | 2     | 2     |
| EfficientDet-Lite4 | 2     | 3     |
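
As a sketch, the table can be turned into a lookup that produces the matching compiler call. The TPU counts are copied from the table above and the flags mirror the compile cell of this notebook; the helper `compiler_command` itself is hypothetical.

```python
import shlex

# Recommendations copied from the table above: model -> (minimum, recommended) TPUs.
TPU_COUNTS = {
    "EfficientDet-Lite0": (1, 1),
    "EfficientDet-Lite1": (1, 1),
    "EfficientDet-Lite2": (1, 2),
    "EfficientDet-Lite3": (2, 2),
    "EfficientDet-Lite4": (2, 3),
}

def compiler_command(tflite_filename, model_name):
    """Build the edgetpu_compiler call with the recommended segment count."""
    _, recommended = TPU_COUNTS[model_name]
    return [
        "edgetpu_compiler", "--min_runtime_version", "13",
        tflite_filename, "-d", f"--num_segments={recommended}",
    ]

print(shlex.join(compiler_command("smrc_model.tflite", "EfficientDet-Lite0")))
# prints: edgetpu_compiler --min_runtime_version 13 smrc_model.tflite -d --num_segments=1
```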

If you need extra Edge TPUs for your model, then update `NUMBER_OF_TPUS` here:

In [13]:
NUMBER_OF_TPUS =  1

!edgetpu_compiler --min_runtime_version 13 $TFLITE_FILENAME -d --num_segments=$NUMBER_OF_TPUS 

Edge TPU Compiler version 16.0.384591198
Searching for valid delegate with step 1
Try to compile segment with 267 ops
Started a compilation timeout timer of 180 seconds.

Model compiled successfully in 2731 ms.

Input model: smrc_model.tflite
Input size: 4.24MiB
Output model: smrc_model_edgetpu.tflite
Output size: 5.57MiB
On-chip memory used for caching model parameters: 4.21MiB
On-chip memory remaining for caching model parameters: 3.29MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 267
Operation log: smrc_model_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 264
Number of operations that will run on CPU: 3
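
The two figures worth checking in this output are the off-chip memory (it should be zero if the model fits into the on-chip cache) and the share of operations mapped to the Edge TPU. The following sketch extracts both from an excerpt of the log shown above; the helper `field` is hypothetical and relies on the `label: value` layout of these lines.

```python
import re

# Excerpt of the compiler output shown above.
LOG = """\
On-chip memory used for caching model parameters: 4.21MiB
On-chip memory remaining for caching model parameters: 3.29MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Total number of operations: 267
Number of operations that will run on Edge TPU: 264
Number of operations that will run on CPU: 3
"""

def field(log, label):
    """Return the text after 'label:' on the matching line, or None if absent."""
    match = re.search(rf"^{re.escape(label)}:\s*(.+)$", log, flags=re.MULTILINE)
    return match.group(1).strip() if match else None

off_chip = field(LOG, "Off-chip memory used for streaming uncached model parameters")
tpu_ops = int(field(LOG, "Number of operations that will run on Edge TPU"))
total_ops = int(field(LOG, "Total number of operations"))

print("fits in on-chip cache:", off_chip.startswith("0.00"))
print(f"ops on Edge TPU: {tpu_ops / total_ops:.1%}")
```

For this model the parameters fit entirely on the chip, and about 98.9% of the operations run on the Edge TPU.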

**Beware when using multiple segments:** The Edge TPU Compiler divides the model such that all segments have roughly equal amounts of parameter data, but that does not mean all segments have the same latency. Especially when dividing an SSD model such as EfficientDet, this results in a latency imbalance between segments, because SSD models have a large post-processing op that actually executes on the CPU, not on the Edge TPU. So although segmenting your model this way is better than running the whole model on just one Edge TPU, Coral recommends segmenting the EfficientDet-Lite model with the [profiling-based partitioner tool](https://github.com/google-coral/libcoral/tree/master/coral/tools/partitioner#profiling-based-partitioner-for-the-edge-tpu-compiler), which measures each segment's latency on the Edge TPU and then iteratively adjusts the segment sizes to balance latency across all segments.

## Download the files

In [14]:
from google.colab import files
# Download the compiled model and the label file for the Edge TPU (Coral USB Accelerator).
files.download(TFLITE_FILENAME.replace('.tflite', '_edgetpu.tflite'))
files.download(LABELS_FILENAME)


## More resources

* For more information about the Model Maker library used in this tutorial, see the [TensorFlow Lite Model Maker guide](https://www.tensorflow.org/lite/guide/model_maker) and [API reference](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker).

* For other transfer learning tutorials that are compatible with the Edge TPU, see the [Colab tutorials for Coral](https://github.com/google-coral/tutorials#colab-tutorials-for-coral).

* You can also find more examples that show how to run inference on the Edge TPU at [coral.ai/examples](https://coral.ai/examples/#code-examples/).