# Training Object Detection (TensorFlow Model Maker)
*This notebook is based on the [TensorFlow example](https://www.tensorflow.org/lite/tutorials/model_maker_object_detection).*

The [Model Maker library](https://www.tensorflow.org/lite/guide/model_maker) uses transfer learning to simplify the process of training a TensorFlow Lite model using a custom dataset. Retraining a TensorFlow Lite model with your own custom dataset reduces the amount of training data required and will shorten the training time.

**Transfer Learning**
Transfer learning is a machine learning technique that lets data scientists reuse the knowledge gained by a model previously trained on a similar task. It is modeled on humans’ ability to transfer what they know from one task to another.
Imagine that you have learned how to drive a car: you can then learn how to drive a truck more easily.
Similarly, a model trained to detect fruits in an image can be used as the starting point for detecting hand gestures.

## Set Up Colab Working Directory (Google Colab only)

After uploading the *ml-usecase-tensorflow-object-detection* file to your Google Drive and opening this notebook, you need to mount the drive and set up a working directory.


In [None]:
from google.colab import drive
drive.mount('/content/drive/')

%cd drive/MyDrive/<'path to the ml-usecase-tensorflow-object-detection file'>

## Install Prerequisites

In [None]:
!pip install -q --use-deprecated=legacy-resolver tflite-model-maker
!pip install -q pycocotools

## Set Up Workspace
### Import Packages

In [None]:
import numpy as np
import os
import cv2
from PIL import Image

from tflite_model_maker.config import QuantizationConfig
from tflite_model_maker.config import ExportFormat
from tflite_model_maker import model_spec
from tflite_model_maker import object_detector

import tensorflow as tf
assert tf.__version__.startswith('2')

tf.get_logger().setLevel('ERROR')
from absl import logging
logging.set_verbosity(logging.ERROR)

### Paths and Files

In [None]:
CUSTOM_MODEL_NAME = 'my_ssd_mobnet'  # Name of the Network we are going to use 
PRETRAINED_MODEL_NAME = 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8'
TF_RECORD_SCRIPT_NAME = 'generate_tfrecord.py'

paths = {
    'WORKSPACE_PATH': os.path.join('MyFirstTFOD','Tensorflow', 'workspace'),
    'SCRIPTS_PATH': os.path.join('MyFirstTFOD','Tensorflow','scripts'),
    'APIMODEL_PATH': os.path.join('MyFirstTFOD','Tensorflow','models'),
    'ANNOTATION_PATH': os.path.join('MyFirstTFOD','Tensorflow', 'workspace','annotations'),
    'IMAGE_PATH': os.path.join('MyFirstTFOD','Tensorflow', 'workspace','images'),
    'MODEL_PATH': os.path.join('MyFirstTFOD','Tensorflow', 'workspace','models'),
    'PRETRAINED_MODEL_PATH': os.path.join('MyFirstTFOD','Tensorflow', 'workspace','pre-trained-models'),
    'CHECKPOINT_PATH': os.path.join('MyFirstTFOD','Tensorflow', 'workspace','models',CUSTOM_MODEL_NAME), 
    'OUTPUT_PATH': os.path.join('MyFirstTFOD','Tensorflow', 'workspace','models',CUSTOM_MODEL_NAME, 'export'), 
    'TFJS_PATH':os.path.join('MyFirstTFOD','Tensorflow', 'workspace','models',CUSTOM_MODEL_NAME, 'tfjsexport'), 
    'TFLITE_PATH':os.path.join('MyFirstTFOD','Tensorflow', 'workspace','models',CUSTOM_MODEL_NAME, 'tfliteexport'), 
    'PROTOC_PATH':os.path.join('MyFirstTFOD','Tensorflow','protoc')
 }

files = {
    'PIPELINE_CONFIG':os.path.join('MyFirstTFOD','Tensorflow', 'workspace','models', CUSTOM_MODEL_NAME, 'pipeline.config'),
    'TF_RECORD_SCRIPT': os.path.join(paths['SCRIPTS_PATH'], TF_RECORD_SCRIPT_NAME), 
}

## Load Data

Here we are going to use the data that you collected in the *Image_Collection* notebook.

The Model Maker library lets you load data in two different ways:

- **From CSV:** learn more about this format [here](https://cloud.google.com/vision/automl/object-detection/docs/csv-format). This is the faster way to consume data using Model Maker.
- **From Pascal VOC:** learn more about this format [here](https://towardsdatascience.com/coco-data-format-for-object-detection-a4c5eaf518c5). This is the output format of the LabelImg tool that we used.

**To learn more, see the [TensorFlow DataLoader documentation](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/object_detector/DataLoader).**

```python
@classmethod

from_pascal_voc(
    images_dir: str,
    annotations_dir: str,
    label_map: Union[List[str], Dict[int, str], str],
    annotation_filenames: Optional[Collection[str]] = None,
    ignore_difficult_instances: bool = False,
    num_shards: int = 100,
    max_num_images: Optional[int] = None,
    cache_dir: Optional[str] = None,
    cache_prefix_filename: Optional[str] = None
) -> DetectorDataLoader

```
**For the rest of this notebook we will use the Pascal VOC format:**

![pascal_voc](../../docs/pascal_voc.png)
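
To make the Pascal VOC format more concrete, here is a minimal sketch that reads one annotation with the Python standard library and converts its pixel box to relative coordinates. The XML content and the `parse_voc` helper are hypothetical, written only to resemble what LabelImg produces; Model Maker does this parsing for you inside `from_pascal_voc`.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, shaped like the XML files that LabelImg writes.
SAMPLE_XML = """
<annotation>
  <filename>peace_01.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>Peace</name>
    <difficult>0</difficult>
    <bndbox><xmin>160</xmin><ymin>120</ymin><xmax>480</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return (filename, boxes); each box is (label, xmin, ymin, xmax, ymax) in [0, 1]."""
    root = ET.fromstring(xml_text.strip())
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        boxes.append((
            obj.find("name").text,
            int(box.find("xmin").text) / width,
            int(box.find("ymin").text) / height,
            int(box.find("xmax").text) / width,
            int(box.find("ymax").text) / height,
        ))
    return root.find("filename").text, boxes


filename, boxes = parse_voc(SAMPLE_XML)
print(filename, boxes)
```

Each `<object>` element carries one label and one box, so an image with several gestures simply has several `<object>` entries.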



### Create Label Map

The label map is what tells the model the names of the objects that you want to detect. Please make sure that you provide the same labels that you used in LabelImg.


In [None]:
lable_map = ['ThumbsUp','ThumbsDown','Peace','ThankYou']
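
According to the DataLoader documentation, a list of names is treated the same as a dictionary whose ids start at 1, with id 0 reserved for the background class. The sketch below shows that mapping; `to_label_dict` is a hypothetical helper for illustration, not part of Model Maker.

```python
def to_label_dict(labels):
    """Map label names to integer ids starting at 1 (0 is left for background)."""
    if len(set(labels)) != len(labels):
        raise ValueError("label names must be unique")
    return {index: name for index, name in enumerate(labels, start=1)}


labels = ['ThumbsUp', 'ThumbsDown', 'Peace', 'ThankYou']
label_dict = to_label_dict(labels)
print(label_dict)
```

The order of the list therefore matters: changing it changes which id each gesture gets.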

### Train and Test Data

In [None]:
TRAIN_PATH = os.path.join(paths['IMAGE_PATH'],'train')
TEST_PATH = os.path.join(paths['IMAGE_PATH'],'test')

train_data = object_detector.DataLoader.from_pascal_voc(TRAIN_PATH,TRAIN_PATH,lable_map)
validation_data = object_detector.DataLoader.from_pascal_voc(TEST_PATH,TEST_PATH,lable_map)

## Select Tensorflow Object Detection Model 

The Model Maker library offers models for many tasks, such as *audio*, *image* and *text classification*, object detection, recommendation and Q&A.
We encourage you to look at the list below and test different models.

![models](../../docs/models.png)

**For this use case we are going to use an object detection model.**

We will use the EfficientDet-Lite2 model.
EfficientDet-Lite[0-4] are a family of mobile/IoT-friendly object detection models derived from the [EfficientDet](https://arxiv.org/abs/1911.09070) architecture.

Here is how the EfficientDet-Lite models compare to each other.

| Model architecture | Size(MB)* | Latency(ms)** | Average Precision*** |
|--------------------|-----------|---------------|----------------------|
| EfficientDet-Lite0 | 4.4       | 37            | 25.69%               |
| EfficientDet-Lite1 | 5.8       | 49            | 30.55%               |
| EfficientDet-Lite2 | 7.2       | 69            | 33.97%               |
| EfficientDet-Lite3 | 11.4      | 116           | 37.70%               |
| EfficientDet-Lite4 | 19.9      | 260           | 41.96%               |

<i> * Size of the integer quantized models. <br/>
** Latency measured on Pixel 4 using 4 threads on CPU. <br/>
*** Average Precision is the mAP (mean Average Precision) on the COCO 2017 validation dataset.
</i>
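
One way to read the table is as a trade-off between latency and precision. The sketch below picks the most accurate model that fits a latency budget; the numbers are copied from the table, while `pick_spec` and the budget values are hypothetical and only illustrate the reasoning.

```python
# Figures copied from the table above: (size in MB, latency in ms, mAP in %).
EFFICIENTDET_LITE = {
    'efficientdet_lite0': (4.4, 37, 25.69),
    'efficientdet_lite1': (5.8, 49, 30.55),
    'efficientdet_lite2': (7.2, 69, 33.97),
    'efficientdet_lite3': (11.4, 116, 37.70),
    'efficientdet_lite4': (19.9, 260, 41.96),
}


def pick_spec(max_latency_ms):
    """Return the most accurate model whose latency fits the budget, or None."""
    candidates = [
        (stats[2], name)
        for name, stats in EFFICIENTDET_LITE.items()
        if stats[1] <= max_latency_ms
    ]
    return max(candidates)[1] if candidates else None


print(pick_spec(70))   # budget of 70 ms
print(pick_spec(30))   # no model is this fast on the reference device
```

Keep in mind that the latencies were measured on a Pixel 4, so the budget only transfers roughly to other devices.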

In [None]:
spec = model_spec.get('efficientdet_lite2')

## Train Model

* Set `epochs=1000`, which means training goes through the training dataset 1000 times. You can look at the validation accuracy during training and stop early to avoid overfitting.
* Set `batch_size=7`, so it takes 2 steps to go through the 14 images in the training dataset.
* Set `train_whole_model=True` to fine-tune the whole model instead of just training the head layer to improve accuracy. The trade-off is that it may take longer to train the model.
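
The step counts quoted in this notebook follow from simple arithmetic. The sketch below assumes ceiling division, so a final partial batch counts as one step; the library may treat a partial batch differently, so take it as an approximation.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (partial batch counts)."""
    return math.ceil(num_images / batch_size)


train_steps = steps_per_epoch(14, 7)   # 14 training images, batch size 7
eval_steps = steps_per_epoch(25, 64)   # 25 test images, default batch size 64
total_train_steps = train_steps * 1000  # 1000 epochs

print(train_steps, eval_steps, total_train_steps)
```

With your own dataset, replace 14 and 25 with the number of images in your `train` and `test` folders.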

In [None]:
model = object_detector.create(train_data, epochs=1000, model_spec=spec, batch_size=7, train_whole_model=True, validation_data=validation_data)

## Evaluate the Model

After training the object detection model using the images in the training dataset, use the remaining 25 images in the test dataset to evaluate how the model performs against new data it has never seen before.

As the default batch size is 64, it will take 1 step to go through the 25 images in the test dataset.

The evaluation metrics are the same as [COCO](https://cocodataset.org/#detection-eval).
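
The COCO metrics are built on Intersection over Union (IoU): a detection counts as correct when its IoU with a ground-truth box exceeds a threshold, and the main AP figure averages over thresholds from 0.50 to 0.95. The sketch below computes IoU for two boxes in `(ymin, xmin, ymax, xmax)` order; the `iou` helper and the sample boxes are illustrative and are not the code Model Maker runs.

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as (ymin, xmin, ymax, xmax)."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


overlap = iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
print(round(overlap, 4))
```

Two boxes of area 4 that share an area of 1 have a union of 7, so their IoU is 1/7 and the detection would not pass the 0.50 threshold.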

In [None]:
# The test images were loaded into `validation_data` above; `test_data` is never defined
model.evaluate(validation_data)

## Export Model as a TensorFlow Lite model

Export the trained object detection model to the TensorFlow Lite format by specifying which folder you want to export the quantized model to. The default post-training quantization technique is full integer quantization.

The export formats can be one or a list of the following:

*   `ExportFormat.TFLITE`
*   `ExportFormat.LABEL`
*   `ExportFormat.SAVED_MODEL`

By default, it exports only the TensorFlow Lite model file containing the model [metadata](https://www.tensorflow.org/lite/convert/metadata) so that you can later use it in an on-device ML application. The label file is embedded in the metadata.

In many on-device ML applications, the model size is an important factor. Therefore, it is recommended that you quantize the model to make it smaller and potentially faster to run. For EfficientDet-Lite models, full integer quantization is used by default. Please refer to [Post-training quantization](https://www.tensorflow.org/lite/performance/post_training_quantization) for more detail.

In [None]:
model.export(export_dir=paths['TFLITE_PATH'])

You can also choose to export other files related to the model for better examination. For instance, exporting both the saved model and the label file as follows:

```python
model.export(export_dir=paths['TFLITE_PATH'], export_format=[ExportFormat.SAVED_MODEL, ExportFormat.LABEL])
```

### Customize Post-training quantization on the TensorFlow Lite model

[Post-training quantization](https://www.tensorflow.org/lite/performance/post_training_quantization) is a conversion technique that can reduce model size while also improving CPU and hardware accelerator latency, with little degradation in model accuracy. For this reason it is widely used to optimize models.
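
To give an intuition for why integer quantization shrinks a model with only a small loss of accuracy, here is a minimal sketch of affine quantization to 8-bit integers in plain Python. It uses the general relation `real = scale * (q - zero_point)`; the helper names and sample weights are hypothetical, and the real converter works per tensor or per channel with calibration data.

```python
def quantize(values, num_bits=8):
    """Affine-quantize floats to signed integers; returns (ints, scale, zero_point)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # The representable range must include 0 so that 0.0 maps to an exact integer.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    ints = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return ints, scale, zero_point


def dequantize(ints, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(q - zero_point) * scale for q in ints]


weights = [-0.5, 0.0, 0.25, 1.5]
ints, scale, zero_point = quantize(weights)
restored = dequantize(ints, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(ints, zero_point)
print(max_error <= scale)
```

Each weight now needs one byte instead of the four bytes of a 32-bit float, and the rounding error stays below one quantization step.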

The Model Maker library applies a default post-training quantization technique when exporting the model. If you want to customize post-training quantization, Model Maker also supports multiple post-training quantization options using [QuantizationConfig](https://www.tensorflow.org/lite/api_docs/python/tflite_model_maker/config/QuantizationConfig). Let's take float16 quantization as an example: define the quantization config, then pass it to `export`.

In [None]:
config = QuantizationConfig.for_float16()
model.export(export_dir=paths['TFLITE_PATH'], tflite_filename='model_fp16.tflite', quantization_config=config)

## Try It Yourself

### Customize the EfficientDet model hyperparameters

The model and training pipeline parameters you can adjust are:

* `model_dir`: The location to save the model checkpoint files. If not set, a temporary directory will be used.
* `steps_per_execution`: Number of steps per training execution.
* `moving_average_decay`: Float. The decay to use for maintaining moving averages of the trained parameters.
* `var_freeze_expr`: The regular expression that matches the name prefix of the variables to be frozen, meaning they remain the same during training. More specifically, the codebase uses `re.match(var_freeze_expr, variable_name)` to select the variables to be frozen.
* `tflite_max_detections`: integer, 25 by default. The max number of output detections in the TFLite model.
* `strategy`:  A string specifying which distribution strategy to use. Accepted values are 'tpu', 'gpus', None. 'tpu' means to use TPUStrategy. 'gpus' means to use MirroredStrategy for multiple GPUs. If None, use the TF default with OneDeviceStrategy.
* `tpu`:  The Cloud TPU to use for training. This should be either the name used when creating the Cloud TPU, or a grpc://ip.address.of.tpu:8470 url.
* `use_xla`: Use XLA even if strategy is not tpu. If strategy is tpu, always use XLA, and this flag has no effect.
* `profile`: Enable profile mode.
* `debug`: Enable debug mode.

Other parameters that can be adjusted are shown in [hparams_config.py](https://github.com/google/automl/blob/df451765d467c5ed78bbdfd632810bc1014b123e/efficientdet/hparams_config.py#L170).

For instance, you can set the `var_freeze_expr='efficientnet'` which freezes the variables with name prefix `efficientnet` (default is `'(efficientnet|fpn_cells|resample_p6)'`). This allows the model to freeze untrainable variables and keep their value the same through training.
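
Because `re.match` anchors at the start of the string, the expression acts as a prefix filter on variable names. The sketch below compares the default expression with `'efficientnet'`; the variable names are hypothetical and chosen only to illustrate the matching, they are not taken from a real checkpoint.

```python
import re

DEFAULT_EXPR = '(efficientnet|fpn_cells|resample_p6)'
CUSTOM_EXPR = 'efficientnet'

# Hypothetical variable names, used only to illustrate prefix matching.
variable_names = [
    'efficientnet-lite0/stem/conv2d/kernel',
    'fpn_cells/cell_0/fnode0/weight',
    'resample_p6/conv2d/kernel',
    'box_net/box-predict/bias',
]


def frozen(expr, names):
    """Return the names that re.match selects, i.e. the variables that stay fixed."""
    return [name for name in names if re.match(expr, name)]


print(frozen(DEFAULT_EXPR, variable_names))
print(frozen(CUSTOM_EXPR, variable_names))
```

With the shorter expression only the backbone stays frozen, so the feature pyramid variables are trained along with the prediction heads.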





In [None]:
spec = model_spec.get('efficientdet_lite0')
spec.config.var_freeze_expr = 'efficientnet'

### Change the Model Architecture

You can change the model architecture by changing the `model_spec`. For instance, change the `model_spec` to the EfficientDet-Lite4 model.

In [None]:
spec = model_spec.get('efficientdet_lite4')

### Tune the training hyperparameters

The `create` function is the driver function that the Model Maker library uses to create models. The `model_spec` parameter defines the model specification. The `object_detector.EfficientDetSpec` class is currently supported. The `create` function comprises the following steps:

1. Creates the model for the object detection according to `model_spec`.
2. Trains the model. The default epochs and the default batch size are set by the `epochs` and `batch_size` variables in the `model_spec` object.

You can also tune the training hyperparameters like `epochs` and `batch_size` that affect the model accuracy. For instance,

*   `epochs`: Integer, 50 by default. More epochs could achieve better accuracy, but may lead to overfitting.
*   `batch_size`: Integer, 64 by default. The number of samples to use in one training step.
*   `train_whole_model`: Boolean, False by default. If true, train the whole model. Otherwise, only train the layers that do not match `var_freeze_expr`.

For example, you can train with fewer epochs and only the head layer. You can increase the number of epochs for better results.

In [None]:
my_model = None  # TODO: Replace None with a call that creates a model with new hyperparameters

## Test your model on one of the test images

**Load Model and make sure Model Path exists**

In [None]:
# If model.tflite sits in the working directory (Colab only), move it into TFLITE_PATH
MODEL_FILE = 'model.tflite'
if os.path.exists(MODEL_FILE):
    if not os.path.exists(paths['TFLITE_PATH']):
        os.makedirs(paths['TFLITE_PATH'])
    !mv {MODEL_FILE} {paths['TFLITE_PATH']} 

# Load the TFLite model
interpreter = tf.lite.Interpreter(model_path=os.path.join(paths['TFLITE_PATH'],'model.tflite'))
interpreter.allocate_tensors()

In [None]:
classes = ['??'] * 4
label_map = {1:'ThumbsUp', 2:'ThumbsDown', 3:'Peace', 4:'ThankYou'}
for label_id, label_name in label_map.items():
  classes[label_id-1] = label_name

# Define a list of colors for visualization
COLORS = np.random.randint(0, 255, size=(len(classes), 3), dtype=np.uint8)


def preprocess_image(image_path, input_size):
  """Preprocess the input image to feed to the TFLite model"""
  img = tf.io.read_file(image_path)
  img = tf.io.decode_image(img, channels=3)
  img = tf.image.convert_image_dtype(img, tf.uint8)
  original_image = img
  resized_img = tf.image.resize(img, input_size)
  resized_img = resized_img[tf.newaxis, :]
  resized_img = tf.cast(resized_img, dtype=tf.uint8)
  return resized_img, original_image


def detect_objects(interpreter, image, threshold, k):
  """Returns a list of detection results, each a dictionary of object info."""

  signature_fn = interpreter.get_signature_runner()

  # Feed the input image to the model
  output = signature_fn(images=image)

  # Get all outputs from the model
  count = int(np.squeeze(output['output_0']))
  scores = np.squeeze(output['output_1'])
  classes = np.squeeze(output['output_2'])
  boxes = np.squeeze(output['output_3'])
  
  results = []
  for i in range(count):
    if scores[i] >= threshold:
      result = {
        'bounding_box': boxes[i],
        'class_id': classes[i],
        'score': scores[i]
      }
      results.append(result)
  
  return results[:k]


def run_odt_and_draw_results(image_path, interpreter, threshold=0.5, k=3):
  """Run object detection on the input image and draw the detection results"""
  # Load the input shape required by the model
  _, input_height, input_width, _ = interpreter.get_input_details()[0]['shape']

  # Load the input image and preprocess it
  preprocessed_image, original_image = preprocess_image(
      image_path,
      (input_height, input_width)
    )

  # Run object detection on the input image
  results = detect_objects(interpreter, preprocessed_image, threshold=threshold, k=k)

  # Plot the detection results on the input image
  original_image_np = original_image.numpy().astype(np.uint8)
  for obj in results:
    # Convert the object bounding box from relative coordinates to absolute
    # coordinates based on the original image resolution
    ymin, xmin, ymax, xmax = obj['bounding_box']
    xmin = int(xmin * original_image_np.shape[1])
    xmax = int(xmax * original_image_np.shape[1])
    ymin = int(ymin * original_image_np.shape[0])
    ymax = int(ymax * original_image_np.shape[0])

    # Find the class index of the current object
    class_id = int(obj['class_id'])

    # Draw the bounding box and label on the image
    color = [int(c) for c in COLORS[class_id]]
    cv2.rectangle(original_image_np, (xmin, ymin), (xmax, ymax), color, 2)
    # Make adjustments to make the label visible for all objects
    y = ymin - 15 if ymin - 15 > 15 else ymin + 15
    label = "{}: {:.0f}%".format(classes[class_id], obj['score'] * 100)
    cv2.putText(original_image_np, label, (xmin, y),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

  # Return the final image
  original_uint8 = original_image_np.astype(np.uint8)
  return original_uint8
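
The two core ideas in the functions above, keeping the best detections and converting relative boxes to pixels, can be tried without a model. The sketch below uses made-up detections; unlike `detect_objects`, which keeps the first `k` results in model output order, the hypothetical `top_detections` helper sorts by score explicitly before slicing.

```python
def top_detections(detections, threshold, k):
    """Keep detections scoring at least `threshold`, best first, at most `k` of them."""
    kept = [d for d in detections if d['score'] >= threshold]
    kept.sort(key=lambda d: d['score'], reverse=True)
    return kept[:k]


def to_pixels(box, image_height, image_width):
    """Convert a relative (ymin, xmin, ymax, xmax) box to pixel (xmin, ymin, xmax, ymax)."""
    ymin, xmin, ymax, xmax = box
    return (int(xmin * image_width), int(ymin * image_height),
            int(xmax * image_width), int(ymax * image_height))


# Made-up detections in the same dictionary layout that detect_objects returns.
detections = [
    {'bounding_box': (0.25, 0.25, 0.75, 0.5), 'class_id': 2, 'score': 0.5},
    {'bounding_box': (0.0, 0.0, 0.5, 0.5), 'class_id': 0, 'score': 0.875},
    {'bounding_box': (0.5, 0.5, 1.0, 1.0), 'class_id': 1, 'score': 0.0625},
]

best = top_detections(detections, threshold=0.1, k=1)
pixels = to_pixels(best[0]['bounding_box'], image_height=480, image_width=640)
print(best[0]['class_id'], pixels)
```

Note that the output order is `(xmin, ymin, xmax, ymax)`, which is the corner order `cv2.rectangle` expects.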

### OpenCV

In the *TEST_IMG* variable paste the name of one of your test images.

In [None]:
# Path to images 
TEST_PATH = os.path.join(paths['IMAGE_PATH'],'test')
RESULTS_PATH = os.path.join(paths['IMAGE_PATH'],'results')

TEST_IMG = 'peace.401cd57a-a934-11ec-8dc7-dca904818221.jpg' 
TEST_IMG_PATH = os.path.join(TEST_PATH, TEST_IMG)
RESULT_IMG_PATH = os.path.join(RESULTS_PATH,'test_img.jpg')

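if not os.path.exists(RESULT_IMG_PATH):
    # exist_ok avoids an error when the results folder exists but the copy does not
    os.makedirs(RESULTS_PATH, exist_ok=True)
    !cp {TEST_IMG_PATH} {RESULT_IMG_PATH}

im = Image.open(RESULT_IMG_PATH)
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
im.thumbnail((512, 512), Image.LANCZOS)
# The file keeps its .jpg name but is written as PNG; tf.io.decode_image detects the format from the content
im.save(RESULT_IMG_PATH, 'PNG')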


# Load the TFLite model
interpreter = tf.lite.Interpreter(model_path=os.path.join(paths['TFLITE_PATH'],'model.tflite'))
interpreter.allocate_tensors()

DETECTION_THRESHOLD = 0.1

# Run inference and draw detection result on the local copy of the original file
detection_result_image = run_odt_and_draw_results(
    RESULT_IMG_PATH,
    interpreter,
    threshold=DETECTION_THRESHOLD,
    k=1
)
# Show the detection result
Image.fromarray(detection_result_image)