# Train a custom object detection model with TensorFlow Lite Model Maker

In this Colab notebook, you'll learn how to use [TensorFlow Lite Model Maker](https://www.tensorflow.org/lite/guide/model_maker) to train a custom object detection model to detect Android figurines and how to deploy the model on a Raspberry Pi.

The Model Maker library uses *transfer learning* to simplify the process of training a TensorFlow Lite model on a custom dataset. Retraining an existing model on your own dataset reduces the amount of training data required and shortens the training time.


Import the required packages.

In [22]:
import numpy as np
import os
from PIL import Image
import shutil
import zipfile
            
from pandas import read_csv
from tflite_model_maker.config import ExportFormat, QuantizationConfig
from tflite_model_maker import model_spec
from tflite_model_maker import object_detector

from tflite_support import metadata

import tensorflow as tf
assert tf.__version__.startswith('2')

tf.get_logger().setLevel('ERROR')
from absl import logging
logging.set_verbosity(logging.ERROR)

import xml.etree.ElementTree as ET

def rename_picture_fileformat_in_annotaion(voc_dataset_dir, annotation, file_format):
    tree = ET.parse('{}/Annotations/{}.xml'.format(voc_dataset_dir,annotation))
    root = tree.getroot()
    filename_node = root.find("filename")
    filename_node.text = "{}.{}".format(annotation, file_format)
    tree.write('{}/Annotations/{}.xml'.format(voc_dataset_dir,annotation))
    
    
def add_pose_to_object_annotation(voc_dataset_dir,annotation):
    tree = ET.parse('{}/Annotations/{}.xml'.format(voc_dataset_dir,annotation))
    root = tree.getroot()
    object_nodes = root.findall("object")
    for object_node in object_nodes:
        ET.SubElement(object_node, 'pose').text = "Unspecified"
    tree.write('{}/Annotations/{}.xml'.format(voc_dataset_dir,annotation))
    
def convert_all_floats_to_ints(voc_dataset_dir, annotation):
    tree = ET.parse('{}/Annotations/{}.xml'.format(voc_dataset_dir,annotation))
    root = tree.getroot()
    bndbox_nodes = root.findall("./object/bndbox")
    for bndbox_node in bndbox_nodes:
        # Iterate over the element directly; Element.getchildren() was removed in Python 3.9.
        for child in bndbox_node:
            child.text = str(int(float(child.text)))
    tree.write('{}/Annotations/{}.xml'.format(voc_dataset_dir,annotation))
    

def order_dataset(destination_dir, voc_dataset_dir):
    # Start from a clean state: remove the results of any previous run.
    if os.path.isdir(destination_dir):
        shutil.rmtree(destination_dir)
    if os.path.isdir(voc_dataset_dir):
        shutil.rmtree(voc_dataset_dir)

    assert os.path.isfile("{}.zip".format(voc_dataset_dir)), "There is no dataset named \"{}.zip\"".format(voc_dataset_dir)
    with zipfile.ZipFile("{}.zip".format(voc_dataset_dir), "r") as zip_dataset:
        zip_dataset.extractall(voc_dataset_dir)

    assert os.path.isdir(voc_dataset_dir), "The directory of the Pascal VOC dataset \"{}\" doesn't exist".format(voc_dataset_dir)

    os.makedirs(destination_dir, exist_ok=True)
    with open("{}/ImageSets/Main/default.txt".format(voc_dataset_dir)) as file:
        lines = [line.rstrip() for line in file]
    try:
        for line in lines:
            # CVAT may export the images as PNG, but JPEG is needed here.
            if os.path.isfile('{}/JPEGImages/{}.PNG'.format(voc_dataset_dir, line)):
                # Convert to RGB first, because JPEG cannot store an alpha channel.
                with Image.open('{}/JPEGImages/{}.PNG'.format(voc_dataset_dir, line)) as im1:
                    im1.convert("RGB").save('{}/{}.jpeg'.format(destination_dir, line))
                rename_picture_fileformat_in_annotaion(voc_dataset_dir, line, "jpeg")
            # If the image is already a JPEG, just move it.
            elif os.path.isfile('{}/JPEGImages/{}.jpeg'.format(voc_dataset_dir, line)):
                os.replace('{}/JPEGImages/{}.jpeg'.format(voc_dataset_dir, line), '{}/{}.jpeg'.format(destination_dir, line))
            add_pose_to_object_annotation(voc_dataset_dir, line)
            convert_all_floats_to_ints(voc_dataset_dir, line)
            # Move the annotation file to destination_dir.
            os.replace('{}/Annotations/{}.xml'.format(voc_dataset_dir, line), '{}/{}.xml'.format(destination_dir, line))

        # Move the labelmap file to destination_dir.
        os.replace('{}/labelmap.txt'.format(voc_dataset_dir), '{}/labelmap.txt'.format(destination_dir))
        shutil.rmtree(voc_dataset_dir)
    except OSError as err:
        # Report missing or unreadable files instead of silently hiding every error.
        print("Could not process files in {}: {}".format(voc_dataset_dir, err))

def get_labels() -> list:
    data = read_csv("res/train/labelmap.txt", sep=":")
    labels = data['# label'].tolist()
    try:
        labels.remove('background')
    except ValueError:
        # list.remove raises ValueError when the entry is absent.
        print("no background label in labelmap.txt")
    return labels

def get_label_color(label, file="res/train/labelmap.txt") -> tuple:
    data = read_csv(file, sep=":")
    index = data.index[data['# label'] == label].to_list()[0]
    color_list = data['color_rgb'][index].replace("[", "").replace("]", "").split(",")
    # Return the color as an (r, g, b) tuple of ints.
    return tuple(int(rgbstring) for rgbstring in color_list)





## Train the object detection model

### Step 1: Load the dataset

* Images in `res/train` are used to train the custom object detection model.
* Images in `res/val` are used to check if the model can generalize well to new images that it hasn't seen before.

If the `res/train` and `res/val` directories do not exist, export your datasets from the CVAT server as described in the `README.md` file.
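
A quick sanity check before loading can catch a missing or incomplete export. The sketch below counts image and annotation files in a directory and lists images without a matching annotation. The helper name `summarize_voc_dir` is hypothetical, and the sketch assumes the flat layout used in this notebook, where `.jpeg` images and `.xml` annotations sit side by side in the same folder.

```python
import os
import tempfile


def summarize_voc_dir(directory):
    """Count images and annotations and list images lacking an annotation."""
    names = os.listdir(directory) if os.path.isdir(directory) else []
    images = {os.path.splitext(n)[0] for n in names
              if n.lower().endswith((".jpeg", ".jpg"))}
    annotations = {os.path.splitext(n)[0] for n in names
                   if n.lower().endswith(".xml")}
    return {
        "images": len(images),
        "annotations": len(annotations),
        "missing_annotation": sorted(images - annotations),
    }


# Demonstration on a throwaway directory that stands in for res/train.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.jpeg", "a.xml", "b.jpeg"):
        open(os.path.join(demo_dir, name), "w").close()
    summary = summarize_voc_dir(demo_dir)

print(summary)
```

In the notebook itself, the same helper could be called as `summarize_voc_dir("res/train")` once the dataset has been sorted into place.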

In [23]:
train_dir = "res/train"
val_dir = "res/val"

### Order the dataset into train and validation

Make sure you have exported your dataset from CVAT in the 'PASCAL VOC' format.

**Important:** The training and validation datasets must be exported separately.
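
For orientation, here is a minimal sketch of what a Pascal VOC annotation looks like and how its bounding boxes can be read with `xml.etree.ElementTree`. The sample XML is hand-written with illustrative values, not taken from the dataset, and `read_boxes` is a hypothetical helper. The `int(float(...))` conversion mirrors what `convert_all_floats_to_ints` does to the exported coordinates.

```python
import xml.etree.ElementTree as ET

# A hand-written annotation in Pascal VOC layout (illustrative values only).
SAMPLE_ANNOTATION = """
<annotation>
  <filename>sample.jpeg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>tomato</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>10.4</xmin><ymin>20.0</ymin><xmax>110.9</xmax><ymax>220.0</ymax>
    </bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return (label, xmin, ymin, xmax, ymax) for every object in an annotation."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        # Truncate float coordinates to ints, as the notebook's helper does.
        coords = [int(float(bndbox.find(tag).text))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.find("name").text, *coords))
    return boxes


boxes = read_boxes(SAMPLE_ANNOTATION)
print(boxes)
```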

In [24]:
order_dataset("res/train","res/train_food_pvoc")
order_dataset("res/val","res/val_food_pvoc")



In [25]:
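train_data = object_detector.DataLoader.from_pascal_voc(
    train_dir,  # images directory
    train_dir,  # annotations directory (same folder after order_dataset)
    get_labels()
)

val_data = object_detector.DataLoader.from_pascal_voc(
    val_dir,  # images directory
    val_dir,  # annotations directory (same folder after order_dataset)
    get_labels()
)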

### Step 2: Select a model architecture

EfficientDet-Lite[0-4] are a family of mobile/IoT-friendly object detection models derived from the [EfficientDet](https://arxiv.org/abs/1911.09070) architecture.

The table below compares the performance of the EfficientDet-Lite models with one another.

| Model architecture | Size(MB)* | Latency(ms)** | Average Precision*** |
|--------------------|-----------|---------------|----------------------|
| EfficientDet-Lite0 | 4.4       | 146           | 25.69%               |
| EfficientDet-Lite1 | 5.8       | 259           | 30.55%               |
| EfficientDet-Lite2 | 7.2       | 396           | 33.97%               |
| EfficientDet-Lite3 | 11.4      | 716           | 37.70%               |
| EfficientDet-Lite4 | 19.9      | 1886          | 41.96%               |

<i> * Size of the integer quantized models. <br/>
** Latency measured on Raspberry Pi 4 using 4 threads on CPU. <br/>
*** Average Precision is the mAP (mean Average Precision) on the COCO 2017 validation dataset.
</i>

In this notebook, the dropdown below defaults to EfficientDet-Lite4, the most accurate but slowest variant. You can choose another model architecture depending on whether speed or accuracy is more important to you.
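
The speed/accuracy trade-off can be made concrete with a small lookup. The sketch below copies the figures from the table above and returns the most accurate model whose latency fits a given budget. `pick_model` is a hypothetical helper, and the latencies only apply to the Raspberry Pi 4 setup described in the table footnotes.

```python
# (name, size in MB, latency in ms, COCO mAP in %) copied from the table above.
MODELS = [
    ("efficientdet_lite0", 4.4, 146, 25.69),
    ("efficientdet_lite1", 5.8, 259, 30.55),
    ("efficientdet_lite2", 7.2, 396, 33.97),
    ("efficientdet_lite3", 11.4, 716, 37.70),
    ("efficientdet_lite4", 19.9, 1886, 41.96),
]


def pick_model(max_latency_ms):
    """Return the most accurate model within the latency budget, or None."""
    candidates = [m for m in MODELS if m[2] <= max_latency_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[3])[0]


print(pick_model(400))
print(pick_model(100))
```

The returned name uses the same spelling as the options accepted by `model_spec.get` in this notebook.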

In [26]:
import ipywidgets as widgets
from IPython.display import display

model_selector = widgets.Dropdown(
    options=['efficientdet_lite0', 'efficientdet_lite1', 'efficientdet_lite2', 'efficientdet_lite3', 'efficientdet_lite4'],
    value='efficientdet_lite4',
    description='Model',
    disabled=False,
)
display(model_selector)



In [27]:
spec = model_spec.get(model_selector.value)


### Step 3: Train the TensorFlow model with the training data.

* Set `epochs = 20`, which means training goes through the training dataset 20 times. Watch the validation loss (`val_loss`) during training and stop once it no longer decreases, to avoid overfitting.
* Set `batch_size = 4` here, so that it takes 15 steps to go through the 62 images in the training dataset.
* Set `train_whole_model=True` to fine-tune the whole model instead of training only the head layer, which improves accuracy. The trade-off is that training may take longer.
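
The step count quoted above follows from simple arithmetic. A minimal sketch, assuming that an incomplete final batch is not counted as a step (floor division), which is what the quoted figures of 62 images and 15 steps imply; `steps_per_epoch` is a hypothetical helper, not part of Model Maker:

```python
def steps_per_epoch(num_images, batch_size):
    # Floor division: an incomplete final batch is assumed not to count as a step.
    return num_images // batch_size


steps = steps_per_epoch(62, 4)
total_steps = steps * 20  # 20 epochs, as configured above

print(steps, total_steps)
```

If your own dataset has a different number of images, substitute that count to estimate how long one epoch will take.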

In [28]:
model = object_detector.create(train_data, model_spec=spec, batch_size=4, train_whole_model=True, epochs=20, validation_data=val_data)


### Step 4: Evaluate the model with the validation data.

After training the object detection model using the images in the training dataset, use the 10 images in the validation dataset to evaluate how the model performs against new data it has never seen before.

As the default batch size is 64, it will take 1 step to go through the 10 images in the validation dataset.

The evaluation metrics are the same as those used by [COCO](https://cocodataset.org/#detection-eval).

In [29]:
model.evaluate(val_data)


{'AP': 0.2599284,
 'AP50': 0.43577805,
 'AP75': 0.29934555,
 'APs': -1.0,
 'APm': 0.3226509,
 'APl': 0.37094292,
 'ARmax1': 0.13378906,
 'ARmax10': 0.42578125,
 'ARmax100': 0.46816406,
 'ARs': -1.0,
 'ARm': 0.51744527,
 'ARl': 0.5229358,
 'AP_/tomato': 0.48699248,
 'AP_/cheese': -1.0,
 'AP_/mozzarella': -1.0,
 'AP_/grape': 0.03286434}
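
Several entries in the result are `-1.0`. In COCO-style evaluation this value conventionally marks a metric that could not be computed, for example because the validation set contains no ground-truth boxes for that class or object size. The sketch below separates computed metrics from undefined ones, using a subset of the values printed above; `split_metrics` is a hypothetical helper.

```python
# A subset of the evaluation result shown above.
metrics = {
    "AP": 0.2599284,
    "AP50": 0.43577805,
    "AP75": 0.29934555,
    "APs": -1.0,
    "ARs": -1.0,
    "AP_/tomato": 0.48699248,
    "AP_/cheese": -1.0,
    "AP_/mozzarella": -1.0,
    "AP_/grape": 0.03286434,
}


def split_metrics(metrics):
    """Separate metrics that were computed from those reported as -1.0."""
    computed = {k: v for k, v in metrics.items() if v >= 0}
    undefined = sorted(k for k, v in metrics.items() if v < 0)
    return computed, undefined


computed, undefined = split_metrics(metrics)
print("computed:", computed)
print("undefined:", undefined)
```

An undefined per-class score, such as the ones for `cheese` and `mozzarella` here, is a hint to check whether the validation export actually contains examples of that class.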

### Step 5: Export as a TensorFlow Lite model.

Export the trained object detection model to the TensorFlow Lite format by specifying the folder to which you want to export the quantized model. The default post-training quantization technique is [full integer quantization](https://www.tensorflow.org/lite/performance/post_training_integer_quant). This makes the TensorFlow Lite model smaller, lets it run faster on the Raspberry Pi CPU, and makes it compatible with the Google Coral EdgeTPU.

In [30]:
model.export(export_dir='.', tflite_filename='models/foodrecognition_{}.tflite'.format(model_selector.value))


### Step 6: Evaluate the TensorFlow Lite model.

Several factors can affect the model accuracy when exporting to TFLite:
* [Quantization](https://www.tensorflow.org/lite/performance/model_optimization) shrinks the model to roughly a quarter of its size at the cost of some accuracy.
* The original TensorFlow model uses per-class [non-max suppression (NMS)](https://www.coursera.org/lecture/convolutional-neural-networks/non-max-suppression-dvrjH) for post-processing, while the TFLite model uses global NMS, which is much faster but less accurate.
* Keras outputs a maximum of 100 detections, while TFLite outputs a maximum of 25 detections.

Therefore, you'll have to evaluate the exported TFLite model and compare its accuracy with the original TensorFlow model.
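
To illustrate the difference between per-class and global NMS, here is a minimal NumPy sketch of greedy NMS. It is an illustration under simplifying assumptions, not the implementation used by Model Maker or TFLite; the function names, the IoU threshold of 0.5 and the sample boxes are all made up for the example.

```python
import numpy as np


def iou(box, boxes):
    """IoU between one box and an array of boxes, all [xmin, ymin, xmax, ymax]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy global NMS: class labels are ignored. Returns kept indices."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept box too much.
        order = rest[iou(boxes[best], boxes[rest]) <= iou_threshold]
    return keep


def per_class_nms(boxes, scores, classes, iou_threshold=0.5):
    """Run NMS separately within each class and merge the kept indices."""
    keep = []
    for cls in np.unique(classes):
        idx = np.where(classes == cls)[0]
        keep.extend(int(idx[k]) for k in nms(boxes[idx], scores[idx], iou_threshold))
    return sorted(keep)


# Boxes 0 and 1 overlap heavily but belong to different classes.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
classes = np.array([0, 1, 0])

global_keep = nms(boxes, scores)
class_keep = per_class_nms(boxes, scores, classes)
print(global_keep, class_keep)
```

Global NMS discards box 1 because it overlaps the higher-scoring box 0, even though the two carry different labels. Per-class NMS keeps both, which is one way the two post-processing schemes can produce different accuracy figures.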

In [11]:
model.evaluate_tflite('models/foodrecognition_{}.tflite'.format(model_selector.value), val_data)




{'AP': 0.117795005,
 'AP50': 0.3465374,
 'AP75': 0.045881458,
 'APs': -1.0,
 'APm': 0.16351096,
 'APl': 0.099332824,
 'ARmax1': 0.08945312,
 'ARmax10': 0.23671874,
 'ARmax100': 0.24941406,
 'ARs': -1.0,
 'ARm': 0.33772323,
 'ARl': 0.11376147,
 'AP_/tomato': 0.22117493,
 'AP_/cheese': -1.0,
 'AP_/mozzarella': -1.0,
 'AP_/grape': 0.0144150825}
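
The comparison between the two evaluations can be quantified with a few lines. The sketch below takes the headline values printed in Step 4 and Step 6 and computes the relative drop after export; `relative_drop` is a hypothetical helper, and metrics reported as `-1.0` are skipped.

```python
# Headline metrics copied from the Step 4 (Keras) and Step 6 (TFLite) results.
keras_metrics = {"AP": 0.2599284, "AP50": 0.43577805, "AP75": 0.29934555, "APs": -1.0}
tflite_metrics = {"AP": 0.117795005, "AP50": 0.3465374, "AP75": 0.045881458, "APs": -1.0}


def relative_drop(before, after):
    """Fractional decrease per metric, skipping undefined (negative) entries."""
    return {
        name: (before[name] - after[name]) / before[name]
        for name in before
        if before[name] > 0 and after.get(name, -1.0) >= 0
    }


drops = relative_drop(keras_metrics, tflite_metrics)
for name, drop in drops.items():
    print(f"{name}: {drop:.1%} lower after export")
```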

## Compile the model for EdgeTPU

Next, we'll compile the model using `edgetpu_compiler` so that the model can run on [Google Coral EdgeTPU](https://coral.ai/).

We start by installing the EdgeTPU compiler on Colab.

In [None]:
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
!echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
!sudo apt-get update
!sudo apt-get install edgetpu-compiler

**Note:** When training the model on a custom dataset, be aware that a dataset with more than 20 classes will probably lead to slower inference than one with fewer classes. This is due to an aspect of the EfficientDet architecture in which a certain layer cannot compile for the Edge TPU when it carries more than 20 classes.

Before compiling the `.tflite` file for the Edge TPU, it's important to consider whether your model will fit into the Edge TPU memory. 

The Edge TPU has approximately 8 MB of SRAM for [caching model parameters](https://coral.ai/docs/edgetpu/compiler/#parameter-data-caching), so any model close to or over 8 MB will not fit into the Edge TPU memory. In that case inference takes longer, because some model parameters must be fetched from the host system memory.
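
A rough first check is to compare the size of the `.tflite` file with the 8 MB figure. The sketch below does this for a dummy file; `fits_edge_tpu_cache` is a hypothetical helper, and file size is only a proxy for parameter data, so treat the result as a hint rather than a guarantee.

```python
import os
import tempfile

EDGE_TPU_CACHE_MB = 8  # approximate figure quoted above


def fits_edge_tpu_cache(model_path, limit_mb=EDGE_TPU_CACHE_MB):
    """Return (fits, size_mb) based on the file size alone."""
    size_mb = os.path.getsize(model_path) / (1024 * 1024)
    return size_mb < limit_mb, size_mb


# Demonstration with a 1 MiB dummy file standing in for an exported model.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.tflite")
    with open(demo_path, "wb") as f:
        f.write(b"\0" * (1024 * 1024))
    fits, size_mb = fits_edge_tpu_cache(demo_path)

print(fits, size_mb)
```

In the notebook, the path passed in would be the file exported in Step 5.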

One way to eliminate the extra latency is to use [model pipelining](https://coral.ai/docs/edgetpu/pipeline/), which splits the model into segments that can run on separate Edge TPUs in series. This can significantly reduce the latency for big models.

The following table provides recommendations for the number of Edge TPUs to use with each EfficientDet-Lite model.

| Model architecture | Minimum TPUs | Recommended TPUs |
|--------------------|-------|-------|
| EfficientDet-Lite0 | 1     | 1     |
| EfficientDet-Lite1 | 1     | 1     |
| EfficientDet-Lite2 | 1     | 2     |
| EfficientDet-Lite3 | 2     | 2     |
| EfficientDet-Lite4 | 2     | 3     |
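
The table can also be expressed as a lookup keyed by the model names used in the dropdown earlier in this notebook. This is a small sketch with a hypothetical helper, `recommended_tpus`; the counts are copied from the table above.

```python
# (minimum, recommended) Edge TPU counts copied from the table above.
TPU_COUNTS = {
    "efficientdet_lite0": (1, 1),
    "efficientdet_lite1": (1, 1),
    "efficientdet_lite2": (1, 2),
    "efficientdet_lite3": (2, 2),
    "efficientdet_lite4": (2, 3),
}


def recommended_tpus(model_name):
    """Return the recommended number of Edge TPUs for a model name."""
    return TPU_COUNTS[model_name][1]


print(recommended_tpus("efficientdet_lite4"))
```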

If you need extra Edge TPUs for your model, then update `NUMBER_OF_TPUS` here:

In [None]:
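NUMBER_OF_TPUS = 1

# Compile the model exported in Step 5. By default the compiler writes its
# output to the current directory, with an `_edgetpu.tflite` suffix.
TFLITE_PATH = 'models/foodrecognition_{}.tflite'.format(model_selector.value)

!edgetpu_compiler $TFLITE_PATH --num_segments=$NUMBER_OF_TPUS

Finally, we'll copy the metadata, including the label file, from the original TensorFlow Lite model to the EdgeTPU model.

In [None]:
# Assumes NUMBER_OF_TPUS = 1; segmented builds produce differently named files.
EDGETPU_PATH = os.path.basename(TFLITE_PATH).replace('.tflite', '_edgetpu.tflite')
populator_dst = metadata.MetadataPopulator.with_model_file(EDGETPU_PATH)

with open(TFLITE_PATH, 'rb') as f:
  populator_dst.load_metadata_and_associated_files(f.read())

populator_dst.populate()
updated_model_buf = populator_dst.get_model_buffer()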