##### *Copyright 2019 Google LLC*
*Licensed under the Apache License, Version 2.0 (the "License")*

In [0]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Retrain a classification model for Edge TPU

<a href="https://colab.research.google.com/github/scottamain/tutorials/blob/master/retrain_classification_ptq_tf1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"></a>
&nbsp;&nbsp;&nbsp;&nbsp;
<a href="https://github.com/scottamain/tutorials/blob/master/retrain_classification_ptq_tf1.ipynb" target="_parent"><img src="https://img.shields.io/static/v1?logo=GitHub&label=&color=333333&style=flat&message=View%20on%20GitHub" alt="View in GitHub"></a>


In this tutorial, we'll use TensorFlow to create an image classification model, train it with a flowers dataset, and convert it into the TensorFlow Lite format compatible with the Edge TPU.

We won't perform full-model training because we're starting with a pre-trained version of the MobileNet V2 graph. We'll replace the final classification layers with our own that we can train with our own dataset, which makes for much faster training—a retraining strategy called transfer learning.

**Note:** This workflow is compatible only with TensorFlow 1.x because the ```TFLiteConverter``` in TensorFlow 2.0 does not currently allow us to specify uint8 as the model input and output type during post-training quantization, which is required for execution on the Edge TPU.

## Import the required libraries

In [0]:
try:
  # The %tensorflow_version magic only works in colab.
  %tensorflow_version 1.x
except Exception:
  pass
# For your offline code, be sure you have tensorflow==1.15
import tensorflow as tf

tf.enable_eager_execution()

import os
import numpy as np
import matplotlib.pyplot as plt

## Prepare the training data

Let's start by downloading and organizing the flowers dataset we'll use to retrain the model.

Pay attention to this part so you can reproduce it with your own image dataset.

In [0]:
_URL = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"

zip_file = tf.keras.utils.get_file(origin=_URL, 
                                   fname="flower_photos.tgz", 
                                   extract=True)

flowers_dir = os.path.join(os.path.dirname(zip_file), 'flower_photos')
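
To reproduce this step with your own images, it helps to know that `flow_from_directory()` (used in the next cell) treats each subdirectory of the dataset root as one class. The following is a minimal sketch, not part of the original notebook, that counts the image files in each class folder so you can check your layout before training. The helper name `count_images_per_class` and the list of file extensions are assumptions made for illustration.

```python
import os
import tempfile

# Assumed set of image extensions; adjust to match your own files.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.bmp')

def count_images_per_class(root_dir):
  """Returns {class_name: image_count} for each subdirectory of root_dir."""
  counts = {}
  for entry in sorted(os.listdir(root_dir)):
    class_dir = os.path.join(root_dir, entry)
    if not os.path.isdir(class_dir):
      continue  # Skip loose files that sit next to the class folders
    counts[entry] = sum(
        1 for name in os.listdir(class_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS))
  return counts

# Demo on a throwaway directory tree with two hypothetical classes
with tempfile.TemporaryDirectory() as demo_root:
  for class_name, num_files in [('daisy', 3), ('roses', 2)]:
    os.makedirs(os.path.join(demo_root, class_name))
    for i in range(num_files):
      open(os.path.join(demo_root, class_name, 'img%d.jpg' % i), 'w').close()
  demo_counts = count_images_per_class(demo_root)

print(demo_counts)
```

In the notebook itself you could call `count_images_per_class(flowers_dir)` to see how balanced the classes are.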


Use `ImageDataGenerator` to rescale the pixel values from [0, 255] to [0, 1] and to hold out 20% of the images for validation.

Create the training generator with the `flow_from_directory()` method, specifying the dataset directory, image size, and batch size.

Create the validation generator the same way, changing only the `subset` argument.

In [0]:
IMAGE_SIZE = 224
BATCH_SIZE = 64

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255, 
    validation_split=0.2)

train_generator = datagen.flow_from_directory(
    flowers_dir,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE, 
    subset='training')

val_generator = datagen.flow_from_directory(
    flowers_dir,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE, 
    subset='validation')
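
As a rough sketch of the arithmetic behind these settings (not part of the original notebook), the snippet below shows how a batch size translates into batches per epoch and how `rescale=1./255` maps 8-bit pixel values into [0, 1]. The sample count of 1000 is a hypothetical number, not the size of the flowers dataset, and the helper names are made up for illustration.

```python
import math

def batches_per_epoch(num_samples, batch_size):
  """Number of batches needed to see every sample once."""
  return math.ceil(num_samples / batch_size)

def last_batch_size(num_samples, batch_size):
  """Size of the final (possibly partial) batch in an epoch."""
  remainder = num_samples % batch_size
  return remainder if remainder else batch_size

num_samples = 1000  # Hypothetical sample count, for illustration only
batch_size = 64

steps = batches_per_epoch(num_samples, batch_size)
tail = last_batch_size(num_samples, batch_size)
print('batches per epoch:', steps, '| size of last batch:', tail)

# rescale=1./255 multiplies every pixel value by 1/255
rescaled = [pixel * (1. / 255) for pixel in (0, 128, 255)]
print(rescaled)
```

The partial final batch is why the first dimension of a batch is not always equal to the batch size.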

In [0]:
for image_batch, label_batch in val_generator:
  break
image_batch.shape, label_batch.shape

Save the labels in a file which will be downloaded later.

In [0]:
print(train_generator.class_indices)

labels = '\n'.join(sorted(train_generator.class_indices.keys()))

with open('labels.txt', 'w') as f:
  f.write(labels)

In [0]:
!cat labels.txt

!mv labels.txt flower_labels.txt
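
When you later run the model, you'll need to turn a predicted class index back into a label. The sketch below is an illustration rather than part of the original workflow: it reads a one-label-per-line file into an index-to-name mapping. It relies on the label file being written in sorted order, which matches the alphabetical class indices that `flow_from_directory()` assigns. The helper name `load_labels` is an assumption.

```python
import os
import tempfile

def load_labels(path):
  """Maps line index -> label for a file with one label per line."""
  with open(path, 'r') as f:
    lines = [line.strip() for line in f]
  return {i: label for i, label in enumerate(lines) if label}

# Demo with a temporary file that mimics the format written above
with tempfile.TemporaryDirectory() as demo_dir:
  demo_path = os.path.join(demo_dir, 'demo_labels.txt')
  with open(demo_path, 'w') as f:
    f.write('\n'.join(['daisy', 'dandelion', 'roses']))
  demo_labels = load_labels(demo_path)

print(demo_labels)
```

In the notebook, `load_labels('flower_labels.txt')` should give one entry per flower class.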

## Build the model

Now we'll create a model capable of transfer learning on just the last, fully-connected layer. 

We start with the pre-trained MobileNet V2 model as the base of the graph, then replace the classification layer with our own trainable layers.

**Note:** Not all models from [```tf.keras.applications```](https://www.tensorflow.org/api_docs/python/tf/keras/applications) are compatible with the Edge TPU. For details, read about [quantizing Keras models](https://coral.ai/docs/edgetpu/models-intro/#quantizing-keras-models).



### Create the base model 

We start by selecting MobileNet V2 as the base model that will be used as the feature extractor. 

The MobileNet V2 model architecture is readily available from Keras, pre-trained on the ImageNet dataset (a large dataset of 1.4M images and 1000 classes of web images).

By specifying the `include_top=False` argument, we load the network *without* the classification layers at the top. This effectively converts the model into a feature extractor because all the pre-trained weights and biases are preserved in the lower layers.

In [0]:
IMG_SHAPE = (IMAGE_SIZE, IMAGE_SIZE, 3)

# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                              include_top=False, 
                                              weights='imagenet')

Next, freeze the convolutional base created in the previous step so that it acts as a fixed feature extractor. We'll then add a classifier on top of it and train only that classifier.

In [0]:
base_model.trainable = False

### Add a classification head

By instantiating a new `Sequential` model, we can pass in the frozen base from above as the first stage of the graph, then append the new layers that we'll train as the classification head.

In [0]:
model = tf.keras.Sequential([
  base_model,
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.GlobalAveragePooling2D(),
  tf.keras.layers.Dense(5, activation='softmax')
])
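
To make the size of this head concrete, here is a small back-of-the-envelope sketch in plain Python (no TensorFlow needed). It assumes that MobileNet V2 without its top produces a 7 x 7 x 1280 feature map for a 224 x 224 input (an overall stride of 32), and that `Conv2D` uses its default `'valid'` padding and stride of 1. Treat the numbers as an estimate to compare against the output of `model.summary()`, not as a replacement for it.

```python
def conv2d_valid_output(size, kernel, stride=1):
  """Spatial output size of a convolution with 'valid' padding."""
  return (size - kernel) // stride + 1

# Assumed output of the frozen base for a 224x224 input: 7 x 7 x 1280
base_hw = 224 // 32
base_channels = 1280

# Conv2D(32, 3, activation='relu') with default 'valid' padding
conv_filters = 32
conv_hw = conv2d_valid_output(base_hw, kernel=3)
conv_params = 3 * 3 * base_channels * conv_filters + conv_filters  # kernel + bias

# GlobalAveragePooling2D collapses conv_hw x conv_hw x 32 down to 32 values;
# Dropout and pooling have no weights.
num_classes = 5
dense_params = conv_filters * num_classes + num_classes  # kernel + bias

head_params = conv_params + dense_params
# Trainable variables in the head: conv kernel, conv bias, dense kernel, dense bias
head_variables = 4

print('conv output:', (conv_hw, conv_hw, conv_filters))
print('trainable parameters in the head:', head_params)
```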

## Compile the Keras model

This isn't the final compilation step; it's more of a configuration step that's needed before we can start training.

Since there are multiple classes, we use a categorical cross-entropy loss.

In [0]:
model.compile(optimizer=tf.keras.optimizers.Adam(), 
              loss='categorical_crossentropy', 
              metrics=['accuracy'])
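
For intuition about the loss, the following sketch computes categorical cross-entropy by hand for a single one-hot label. It is a simplified illustration in plain Python, not the Keras implementation; the function name and the clipping constant `eps` are assumptions.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
  """Cross-entropy between a one-hot label and predicted probabilities."""
  return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0, 0, 1, 0, 0]                       # The true class is index 2
confident = [0.02, 0.03, 0.90, 0.03, 0.02]     # Mostly correct prediction
uniform = [0.2, 0.2, 0.2, 0.2, 0.2]            # No preference among 5 classes

loss_confident = categorical_crossentropy(y_true, confident)
loss_uniform = categorical_crossentropy(y_true, uniform)
print(loss_confident, loss_uniform)
```

Only the probability assigned to the true class contributes to the loss, so a uniform guess over five classes gives a loss of ln(5), which is a useful baseline when reading the training curves.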

You can see a string summary of the final network with the `summary()` method:

In [0]:
model.summary()

As you can see, the majority of the model graph is frozen in mobilenetv2.

In [0]:
print('Number of trainable variables = {}'.format(len(model.trainable_variables)))

## Train the model


Now you can train the model using the `train_generator` and `val_generator` we created at the beginning. This takes 5-10 minutes to finish.

In [0]:
epochs = 10

history = model.fit_generator(train_generator, 
                    epochs=epochs, 
                    validation_data=val_generator)

### Review the learning curves

Let's take a look at the learning curves of the training and validation accuracy/loss when using the MobileNet V2 base model as a fixed feature extractor. 

In [0]:
acc = history.history['acc']
val_acc = history.history['val_acc']

loss = history.history['loss']
val_loss = history.history['val_loss']

plt.figure(figsize=(8, 8))
plt.subplot(2, 1, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.ylabel('Accuracy')
plt.ylim([min(plt.ylim()),1])
plt.title('Training and Validation Accuracy')

plt.subplot(2, 1, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.ylabel('Cross Entropy')
plt.ylim([0,1.0])
plt.title('Training and Validation Loss')
plt.xlabel('epoch')
plt.show()

## Fine tune the base model
In the feature extraction experiment, you trained only a few layers on top of a MobileNet V2 base model. The weights of the pre-trained network were **not** updated during training.

One way to increase performance even further is to train (or "fine-tune") the weights of the top layers of the pre-trained model alongside the classifier you added. The training process forces those weights to shift from generic feature maps to features associated specifically with our dataset.

### Un-freeze the top layers of the model


All you need to do is unfreeze the `base_model` and set the bottom layers to be untrainable. Then recompile the model (necessary for these changes to take effect) and resume training.

In [0]:
base_model.trainable = True

In [0]:
# Let's take a look to see how many layers are in the base model
print("Number of layers in the base model: ", len(base_model.layers))

# Fine tune from this layer onwards
fine_tune_at = 100

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
  layer.trainable =  False
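
The slicing pattern above can be tried out without TensorFlow. The sketch below uses a hypothetical `StandInLayer` class that only tracks a `trainable` flag, and a made-up layer count of 150, to show which layers end up frozen. It illustrates the indexing only and says nothing about the real layer count of MobileNet V2.

```python
class StandInLayer:
  """Hypothetical stand-in for a Keras layer; only tracks `trainable`."""

  def __init__(self, name):
    self.name = name
    self.trainable = True

def freeze_before(layers, fine_tune_at):
  """Freezes layers[0:fine_tune_at] and returns how many stay trainable."""
  for layer in layers[:fine_tune_at]:
    layer.trainable = False
  return sum(1 for layer in layers if layer.trainable)

# 150 is a made-up layer count for this demo
demo_layers = [StandInLayer('layer_%d' % i) for i in range(150)]
num_trainable_layers = freeze_before(demo_layers, fine_tune_at=100)
print('trainable layers:', num_trainable_layers)
```

Because Python slices stop at the end of the list, a `fine_tune_at` value larger than the number of layers freezes everything rather than raising an error.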

### Compile the model

Compile the model using a much lower learning rate.

In [0]:
model.compile(loss='categorical_crossentropy',
              optimizer = tf.keras.optimizers.Adam(1e-5),
              metrics=['accuracy'])

In [0]:
model.summary()

In [0]:
print('Number of trainable variables = {}'.format(len(model.trainable_variables)))

### Continue training the model

In [0]:
history_fine = model.fit_generator(train_generator, 
                         epochs=5,
                         validation_data=val_generator)

### Review the learning curves again

Let's take a look at the learning curves of the training and validation accuracy/loss when fine-tuning the last few layers of the MobileNet V2 base model and training the classifier on top of it. If the validation loss is much higher than the training loss, the model is probably overfitting.

Some overfitting is expected here because the new training set is relatively small and similar to the original MobileNet V2 dataset.


In [0]:
acc = history_fine.history['acc']
val_acc = history_fine.history['val_acc']

loss = history_fine.history['loss']
val_loss = history_fine.history['val_loss']

plt.figure(figsize=(8, 8))
plt.subplot(2, 1, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.ylabel('Accuracy')
plt.ylim([min(plt.ylim()),1])
plt.title('Training and Validation Accuracy')

plt.subplot(2, 1, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.ylabel('Cross Entropy')
plt.ylim([0,1.0])
plt.title('Training and Validation Loss')
plt.xlabel('epoch')
plt.show()
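
If you'd rather see both training phases on one continuous curve, one option is to concatenate the two history dictionaries before plotting. The sketch below is a suggestion rather than part of the original notebook; the helper name `combine_histories` is an assumption, and the demo numbers are placeholders for illustration, not real training results.

```python
def combine_histories(first, second,
                      keys=('acc', 'val_acc', 'loss', 'val_loss')):
  """Concatenates per-epoch metrics from two training runs.

  Returns the combined dict and the epoch index where the second run starts.
  """
  combined = {key: list(first[key]) + list(second[key]) for key in keys}
  fine_tune_start = len(first[keys[0]])
  return combined, fine_tune_start

# Placeholder values for illustration only (not real results)
demo_first = {'acc': [0.1, 0.2], 'val_acc': [0.1, 0.2],
              'loss': [2.0, 1.5], 'val_loss': [2.0, 1.6]}
demo_second = {'acc': [0.3], 'val_acc': [0.3],
               'loss': [1.0], 'val_loss': [1.2]}

combined, fine_tune_start = combine_histories(demo_first, demo_second)
print(combined['acc'], fine_tune_start)
```

In the notebook you could pass `history.history` and `history_fine.history`, plot the combined lists as before, and mark `fine_tune_start` with a vertical line to show where fine-tuning began.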

## Convert to TFLite

Now save the trained model and convert it to TensorFlow Lite format.

Just as an example, the following performs a direct conversion to TensorFlow Lite format:

In [0]:
#saved_model_dir = 'save/fine_tuning'
#tf.saved_model.save(model, saved_model_dir)
#converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)

saved_keras_model = 'model.h5'
model.save(saved_keras_model)
converter = tf.lite.TFLiteConverter.from_keras_model_file(saved_keras_model)

converter.optimizations = [tf.lite.Optimize.DEFAULT]
#converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
tflite_model = converter.convert()

with open('mobilenet_v2_1.0_224.tflite', 'wb') as f:
  f.write(tflite_model)

But that isn't compatible with the Edge TPU because it's not fully quantized.

To fully quantize the model, we need to perform post-training quantization with a representative dataset, and we need to specify a few more arguments for the TFLiteConverter:

In [0]:
def representative_data_gen():
  # list_files() shuffles by default, so take(100) yields 100 different sample images.
  dataset_list = tf.data.Dataset.list_files(flowers_dir + '/*/*')
  for image_path in dataset_list.take(100):
    image = tf.io.read_file(image_path)
    image = tf.io.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [IMAGE_SIZE, IMAGE_SIZE])
    # Apply the same preprocessing that was used for the training data.
    # Here that means scaling by 1/255; adjust this if your training pipeline differs.
    image = tf.cast(image / 255., tf.float32)
    image = tf.expand_dims(image, 0)
    yield [image]

converter = tf.lite.TFLiteConverter.from_keras_model_file(saved_keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converter.representative_dataset = representative_data_gen
tflite_model = converter.convert()

with open('mobilenet_v2_1.0_224_quant.tflite', 'wb') as f:
  f.write(tflite_model)
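
To see what full quantization means numerically, here is a minimal sketch of the affine mapping between real values and uint8 values, `real = scale * (q - zero_point)`, which is the same relationship the evaluation code later in this notebook uses to dequantize the output. The scale of 1/255 and zero point of 0 are hypothetical values chosen for illustration; the real values are chosen by the converter and stored with each tensor.

```python
def quantize(real_value, scale, zero_point):
  """Maps a real value to uint8 range with rounding and clipping."""
  q = int(round(real_value / scale + zero_point))
  return max(0, min(255, q))

def dequantize(q, scale, zero_point):
  """Maps a quantized value back to an approximate real value."""
  return scale * (q - zero_point)

# Hypothetical quantization parameters for an input in [0, 1]
scale = 1. / 255
zero_point = 0

q = quantize(0.42, scale, zero_point)
round_trip = dequantize(q, scale, zero_point)
print(q, round_trip)

# Rounding and truncation can differ by one quantization step
rounded = quantize(0.999, scale, zero_point)
truncated = int(0.999 / scale + zero_point)
print(rounded, truncated)
```

With rounding, the round-trip error stays within half a quantization step, which is why a well-chosen scale matters for accuracy.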

### Compare the accuracy


First check the final accuracy of the Keras model:

In [0]:
batch_images, batch_labels = next(val_generator)

logits = model(batch_images)
prediction = np.argmax(logits, axis=1)
truth = np.argmax(batch_labels, axis=1)

print('PREDS: ', prediction)
print('TRUTH: ', truth)

keras_accuracy = tf.keras.metrics.Accuracy()
keras_accuracy(prediction, truth)

print("Raw model accuracy: {:.3%}".format(keras_accuracy.result()))

Then check the accuracy of the .tflite file, using the same dataset batch:

In [0]:

def set_input_tensor(interpreter, input):
  input_details = interpreter.get_input_details()[0]
  tensor_index = input_details['index']
  scale, zero_point = input_details['quantization']
  input_tensor = interpreter.tensor(tensor_index)()[0]
  # The model input type is uint8, so quantize the float input into the [0, 255] range.
  input_tensor[:, :] = np.uint8(input / scale + zero_point)

def classify_image(interpreter, input):
  set_input_tensor(interpreter, input)
  interpreter.invoke()
  output_details = interpreter.get_output_details()[0]
  output = interpreter.get_tensor(output_details['index'])
  #print('OUTPUT: ', output)
  # If the model is quantized (uint8 data), then dequantize the results
  if output_details['dtype'] == np.uint8:
    scale, zero_point = output_details['quantization']
    output = scale * (output - zero_point)
  top_1 = np.argmax(output)
  return top_1

interpreter = tf.lite.Interpreter('mobilenet_v2_1.0_224_quant.tflite')
interpreter.allocate_tensors()

batch_prediction = []
batch_truth = np.argmax(batch_labels, axis=1)

for i in range(len(batch_images)):
  prediction = classify_image(interpreter, batch_images[i])
  batch_prediction.append(prediction)

print('PREDS: ', batch_prediction)
print('TRUTH: ', batch_truth)

tflite_accuracy = tf.keras.metrics.Accuracy()
tflite_accuracy(batch_prediction, batch_truth)
print("Quant TF Lite accuracy: {:.3%}".format(tflite_accuracy.result()))
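
Beyond comparing the two accuracy numbers, it can be informative to check how often the two models agree with each other. The following is a small sketch in plain Python of a top-1 accuracy calculation that can be reused for both comparisons. The helper name is an assumption and the demo lists are made-up class indices, not results from this notebook.

```python
def top1_accuracy(predictions, truth):
  """Fraction of positions where the predicted class equals the true class."""
  if len(predictions) != len(truth):
    raise ValueError('predictions and truth must have the same length')
  matches = sum(1 for p, t in zip(predictions, truth) if p == t)
  return matches / len(truth)

# Made-up class indices for illustration only
demo_truth = [0, 1, 2, 3, 4, 0, 1, 2]
demo_keras = [0, 1, 2, 3, 4, 0, 1, 1]
demo_tflite = [0, 1, 2, 3, 3, 0, 1, 1]

keras_acc = top1_accuracy(demo_keras, demo_truth)
tflite_acc = top1_accuracy(demo_tflite, demo_truth)
agreement = top1_accuracy(demo_tflite, demo_keras)
print(keras_acc, tflite_acc, agreement)
```

To use this with the cells above, keep the Keras predictions under their own name, because the loop in the previous cell reuses the `prediction` variable for each TF Lite result.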


## Compile for the Edge TPU


Download the Edge TPU Compiler:

In [0]:
! curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

! echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list

! sudo apt-get update

! sudo apt-get install edgetpu-compiler	

Compile the TF Lite model:

In [0]:
! edgetpu_compiler mobilenet_v2_1.0_224_quant.tflite
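
Judging by the file names used in the download step of this notebook, the compiled model is written next to the input with an `_edgetpu` suffix before the extension. The sketch below derives that expected name so you can check for the file before downloading. The helper name is hypothetical, and the naming pattern is inferred from this notebook rather than taken from the compiler documentation.

```python
from pathlib import Path

def expected_edgetpu_name(tflite_path):
  """Returns the file name with '_edgetpu' inserted before '.tflite'."""
  path = Path(tflite_path)
  return str(path.with_name(path.stem + '_edgetpu.tflite'))

compiled_name = expected_edgetpu_name('mobilenet_v2_1.0_224_quant.tflite')
print(compiled_name)
```

A quick `Path(compiled_name).exists()` check after compiling can catch a failed compilation before the download cell runs.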

## Download the model

Now download the converted model and labels. (Look out for a browser popup that might need approval to download the files.)

In [0]:
from google.colab import files

files.download('mobilenet_v2_1.0_224_quant.tflite')
files.download('mobilenet_v2_1.0_224_quant_edgetpu.tflite')
files.download('flower_labels.txt')

## Summary

* **Using a pre-trained model for feature extraction**:  When working with a small dataset, it is common to take advantage of features learned by a model trained on a larger dataset in the same domain. This is done by instantiating the pre-trained model and adding a fully-connected classifier on top. The pre-trained model is "frozen" and only the weights of the classifier get updated during training.
In this case, the convolutional base extracted all the features associated with each image and you just trained a classifier that determines the image class given that set of extracted features.

* **Fine-tuning a pre-trained model**: To further improve performance, one might want to repurpose the top-level layers of the pre-trained models to the new dataset via fine-tuning.
In this case, you tuned the weights so that the model learned high-level features specific to the dataset. This technique is usually recommended when the training dataset is large and very similar to the original dataset that the pre-trained model was trained on.
