<a href="https://colab.research.google.com/github/Danny-Dasilva/Train_Custom_Model/blob/master/EdgeTPU_with_Keras.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Build a model with Keras and convert it to a TensorFlow Lite file compiled for the Edge TPU.

### Install EdgeTPU Compiler

In [1]:
%%bash

echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 6A030B21BA07F4FB

sudo apt update > /dev/null
sudo apt install edgetpu > /dev/null

deb https://packages.cloud.google.com/apt coral-edgetpu-stable main
Executing: /tmp/apt-key-gpghome.S5fmFJRlUr/gpg.1.sh --keyserver keyserver.ubuntu.com --recv-keys 6A030B21BA07F4FB


gpg: key 6A030B21BA07F4FB: public key "Google Cloud Packages Automatic Signing Key <gc-team@google.com>" imported
gpg: Total number processed: 1
gpg:               imported: 1




debconf: unable to initialize frontend: Dialog
debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 76, <> line 5.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin: 


## Edge TPU with Keras

This notebook builds a small convolutional model.

- data: Fashion MNIST
- input shape: 28 x 28 x 1
- output shape: 10
- hidden layers: four Conv2D layers followed by one 512-unit dense layer

In [2]:
import tensorflow as tf
from tensorflow import keras

import numpy as np
import matplotlib.pyplot as plt
from keras.utils import np_utils

print(tf.__version__)

1.14.0


Using TensorFlow backend.


In [3]:
fashion_mnist = keras.datasets.fashion_mnist
(trainX, trainY), (testX, testY) = fashion_mnist.load_data()

testLabels = testY
trainLabels = trainY

trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
testX = testX.reshape((testX.shape[0], 28, 28, 1))
#print(trainX[1])

trainX = trainX / 255.0
testX = testX / 255.0
# one-hot encode the training and testing labels
#print(trainX[1])

trainY = np_utils.to_categorical(trainY, 10)
testY = np_utils.to_categorical(testY, 10)

# initialize the label names
labelNames = ["top", "trouser", "pullover", "dress", "coat",
              "sandal", "shirt", "sneaker", "bag", "ankle boot"]


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


### Build the model

- Define a `build_keras_model` function, because the model has to be built twice: once for training and once for evaluation.

In [0]:
# Every layer below comes from tf.keras (imported above as `keras`),
# so no imports from the standalone Keras package are needed here.


imgindex = 11

NUM_EPOCHS = 12
INIT_LR = 1e-2
BS = 256
chanDim = -1
classes = 10

def build_keras_model():
  
    return keras.Sequential([
            keras.layers.Conv2D(32, (3, 3), padding="same", input_shape=(28,28,1)),
            keras.layers.Activation("relu"),
            keras.layers.BatchNormalization(axis=chanDim, fused=False),
            keras.layers.Conv2D(32, (3, 3), padding="same"),
            keras.layers.Activation("relu"),
            keras.layers.BatchNormalization(axis=chanDim, fused=False),
            keras.layers.MaxPooling2D(pool_size=(2, 2)),
            keras.layers.Dropout(0.25),
            keras.layers.Conv2D(64, (3, 3), padding="same"),
            keras.layers.Activation("relu"),
            keras.layers.BatchNormalization(axis=chanDim, fused=False),
            keras.layers.Conv2D(64, (3, 3), padding="same"),
            keras.layers.Activation("relu"),
            keras.layers.BatchNormalization(axis=chanDim, fused=False),
            keras.layers.MaxPooling2D(pool_size=(2, 2)),
            keras.layers.Dropout(0.25),
            keras.layers.Flatten(),
            keras.layers.Dense(512),
            keras.layers.Activation("relu"),
            keras.layers.BatchNormalization(fused=False),
            keras.layers.Dropout(0.5),
            keras.layers.Dense(classes),
            keras.layers.Activation("softmax")
    ])
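To see how the tensor shape changes through `build_keras_model`, the arithmetic can be traced by hand. The sketch below is plain Python, not Keras; the helper names `conv_same` and `max_pool_2x2` are made up for illustration. It assumes what the layer arguments above state: 3x3 convolutions with `padding="same"` and 2x2 max pooling. Activation, BatchNormalization and Dropout layers leave the shape unchanged, so they are skipped.

```python
def conv_same(shape, filters):
    """3x3 convolution with padding="same": height and width are unchanged."""
    h, w, _ = shape
    return (h, w, filters)


def max_pool_2x2(shape):
    """2x2 max pooling halves height and width."""
    h, w, c = shape
    return (h // 2, w // 2, c)


shape = (28, 28, 1)                   # input image
for filters in (32, 32):
    shape = conv_same(shape, filters)
shape = max_pool_2x2(shape)           # first pooling stage
for filters in (64, 64):
    shape = conv_same(shape, filters)
shape = max_pool_2x2(shape)           # second pooling stage

flattened = shape[0] * shape[1] * shape[2]
dense_params = flattened * 512 + 512  # weights + biases of Dense(512)
print(shape, flattened, dense_params)
```

Most of the parameters sit in the first dense layer, which is worth keeping in mind when looking at the size of the converted file later on.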



### Train the model and save its checkpoints

- Use a new Session and Graph so that the training and evaluation graphs end up with exactly the same variable names.
- Call `tf.contrib.quantize.create_training_graph` after building the model, since we want quantization-aware training.
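To make the idea behind quantization-aware training concrete, here is a minimal NumPy sketch of the "fake quantization" that such a training graph simulates in the forward pass: values are clipped to a range, snapped to one of 256 levels, and mapped back to float. The function name `fake_quantize` and the example range are assumptions for illustration. This is not the `tf.contrib.quantize` implementation, which handles further details.

```python
import numpy as np


def fake_quantize(x, x_min, x_max, num_bits=8):
    """Simulate 8-bit quantization in float: clip, round to a level, map back."""
    levels = 2 ** num_bits - 1                  # 255 steps for 8 bits
    scale = (x_max - x_min) / levels            # width of one quantization step
    x = np.clip(x, x_min, x_max)                # values outside the range saturate
    q = np.round((x - x_min) / scale)           # integer level in [0, levels]
    return q * scale + x_min                    # float value on the quantized grid


x = np.array([-0.2, 0.0, 0.1234, 0.5, 1.3])
print(fake_quantize(x, 0.0, 1.0))
```

Because the network sees these rounding and saturation effects during training, its weights can adapt to them before the model is converted to uint8.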

In [5]:
# train
train_graph = tf.Graph()
train_sess = tf.Session(graph=train_graph)


keras.backend.set_session(train_sess)
with train_graph.as_default():
    train_model = build_keras_model()

    tf.contrib.quantize.create_training_graph(input_graph=train_graph, quant_delay=100)
    train_sess.run(tf.global_variables_initializer())    
    
    #opt = SGD(lr=INIT_LR, momentum=0.9, decay=INIT_LR / NUM_EPOCHS)
    train_model.compile(loss="categorical_crossentropy", optimizer="adam",
                        metrics=["accuracy"])

    train_model.fit(trainX, trainY,
                    validation_data=(testX, testY),
                    batch_size=BS, epochs=NUM_EPOCHS)
    # save graph and checkpoints
    saver = tf.train.Saver()
    saver.save(train_sess, 'checkpoints')
    predictions = train_model.predict(testX)

W0903 05:18:06.280748 139652712466304 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0903 05:18:09.226524 139652712466304 lazy_loader.py:50] 
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.



Train on 60000 samples, validate on 10000 samples
Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12


In [18]:
print(predictions[imgindex]*100)
def plot_image(i, predictions_array, true_label, img):
  predictions_array, true_label, img = predictions_array[i], true_label[i], img[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])
  
  plt.imshow(img, cmap=plt.cm.binary)
  
  predicted_label = np.argmax(predictions_array)
  if predicted_label == true_label:
    color = 'blue'
  else:
    color = 'red'
  
  plt.xlabel("{} {:2.0f}% ({})".format(labelNames[predicted_label],
                                100*np.max(predictions_array),
                                labelNames[true_label]),
                                color=color)

def plot_value_array(i, predictions_array, true_label):
  predictions_array, true_label = predictions_array[i], true_label[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])
  thisplot = plt.bar(range(10), predictions_array, color="#777777")
  plt.ylim([0, 1])
  predicted_label = np.argmax(predictions_array)
  
  thisplot[predicted_label].set_color('red')
  thisplot[true_label].set_color('blue')
  

tr = np.squeeze(testX)
print(np.shape(tr))
#ts = np.squeeze(test_images)

plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(imgindex, predictions, testLabels, tr)
plt.subplot(1,2,2)
plot_value_array(imgindex, predictions,  testLabels)
plt.show()

[9.4616204e-05 2.1930659e-04 9.4616204e-05 1.2521635e-04 4.4186553e-04
 9.9974403e+01 8.9028460e-04 1.9416835e-02 3.1416544e-03 1.1782135e-03]
(10000, 28, 28)


NameError: ignored

### Freeze model and save it

- Create a new Session and Graph.
- Build the model, call `tf.contrib.quantize.create_eval_graph`, and take the graph_def before calling `saver.restore`.
- Call `saver.restore` to load the trained weights.
   - `saver.restore` may add unneeded variables to the graph, so we have to get the graph_def before `saver.restore` is called.
- Use `tf.graph_util.convert_variables_to_constants` to freeze the graph_def.

In [8]:
# eval
eval_graph = tf.Graph()
eval_sess = tf.Session(graph=eval_graph)

keras.backend.set_session(eval_sess)

with eval_graph.as_default():
    keras.backend.set_learning_phase(0)
    eval_model = build_keras_model()
    tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
    eval_graph_def = eval_graph.as_graph_def()
    saver = tf.train.Saver()
    saver.restore(eval_sess, 'checkpoints')

    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        eval_sess,
        eval_graph_def,
        [eval_model.output.op.name]
    )

    with open('frozen_model.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())

W0903 05:25:05.761829 139652712466304 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
W0903 05:25:05.875834 139652712466304 deprecation.py:323] From <ipython-input-8-995fccbe9e12>:17: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
W0903 05:25:05.877070 139652712466304 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/graph_util_impl.py:270: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.comp

### Generate tflite file

- Use the `QUANTIZED_UINT8` inference type.
- Quantization-aware training records min/max information, so we don't need `default_ranges_min` and `default_ranges_max`.
- We don't need to call freeze_graph.py since the graph is already frozen.

In [9]:
def load_graph(frozen_graph_filename):
    # Load the protobuf file from disk and parse it into a GraphDef
    with tf.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    # Import the graph_def into a new Graph and return it
    with tf.Graph().as_default() as graph:
        # `name` is prepended to every op name in the imported graph,
        # which is why the names printed below start with "prefix/"
        tf.import_graph_def(graph_def, name="prefix")
    return graph

g = load_graph("frozen_model.pb")

for op in g.get_operations():
    print(op.name)

prefix/conv2d_input
prefix/conv2d/kernel
prefix/conv2d/bias
prefix/conv2d/Conv2D/ReadVariableOp
prefix/conv2d/BiasAdd/ReadVariableOp
prefix/batch_normalization/gamma
prefix/batch_normalization/beta
prefix/batch_normalization/moving_mean
prefix/batch_normalization/moving_variance
prefix/batch_normalization/batchnorm/ReadVariableOp
prefix/batch_normalization/batchnorm/add/y
prefix/batch_normalization/batchnorm/add
prefix/batch_normalization/batchnorm/Rsqrt
prefix/batch_normalization/batchnorm/mul/ReadVariableOp
prefix/batch_normalization/batchnorm/mul
prefix/batch_normalization/batchnorm/ReadVariableOp_1
prefix/batch_normalization/batchnorm/mul_2
prefix/batch_normalization/batchnorm/ReadVariableOp_2
prefix/batch_normalization/batchnorm/sub
prefix/conv2d_1/kernel
prefix/conv2d_1/bias
prefix/conv2d_1/Conv2D/ReadVariableOp
prefix/conv2d_1/BiasAdd/ReadVariableOp
prefix/batch_normalization_1/gamma
prefix/batch_normalization_1/beta
prefix/batch_normalization_1/moving_mean
prefix/batch_normaliz

In [10]:
%%bash

tflite_convert \
    --output_file=model.tflite \
    --graph_def_file=frozen_model.pb \
    --inference_type=QUANTIZED_UINT8 \
    --input_arrays=conv2d_input \
    --output_arrays=activation_5/Softmax \
    --mean_values=0 \
    --std_dev_values=255

2019-09-03 05:25:25.489733: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-09-03 05:25:25.513716: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-09-03 05:25:25.514451: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
2019-09-03 05:25:25.514754: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-09-03 05:25:25.516048: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-09-03 05:25:25.517315: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10
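The `--mean_values` and `--std_dev_values` flags tell the converter how a uint8 input maps to the float values the model was trained on, using real_value = (quantized_value - mean_value) / std_dev_value. The sketch below only checks that arithmetic for the values passed above; the helper name `uint8_to_real` is hypothetical.

```python
import numpy as np


def uint8_to_real(q, mean_value=0.0, std_dev_value=255.0):
    """Float value that a uint8 input represents for the given mean/std_dev."""
    return (np.asarray(q, dtype=np.float32) - mean_value) / std_dev_value


print(uint8_to_real([0, 128, 255]))
```

With a mean of 0 and a standard deviation of 255, the uint8 range 0..255 maps to 0..1, which matches the `trainX / 255.0` scaling used for training.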

### Check the generated tflite file

- Use `tf.lite.Interpreter` to check that the generated file is valid.

In [11]:
# Load the TFLite file
interpreter = tf.lite.Interpreter(model_path='model.tflite')
# Allocate memory for the tensors
interpreter.allocate_tensors()

# Get the input and output tensor details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

print(input_details)
print(output_details)

[{'name': 'conv2d_input', 'index': 34, 'shape': array([ 1, 28, 28,  1], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.003921568859368563, 0)}]
[{'name': 'activation_5/Softmax', 'index': 5, 'shape': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.00390625, 0)}]
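With the `QUANTIZED_UINT8` inference type used above, both the input and the output tensor should report `numpy.uint8`, as they do in the printed details. A small check like the following can catch a conversion that silently stayed in float. It is a sketch that assumes the list-of-dicts structure shown above; `all_uint8` is a hypothetical helper, and `sample_details` is a stand-in copied from the printed output rather than a live interpreter result.

```python
import numpy as np


def all_uint8(details):
    """Return True if every tensor description reports dtype uint8."""
    return all(d["dtype"] == np.uint8 for d in details)


# Stand-in for interpreter.get_input_details(); values copied from the output above
sample_details = [{"name": "conv2d_input", "dtype": np.uint8,
                   "quantization": (0.003921568859368563, 0)}]
print(all_uint8(sample_details))
```

In the notebook itself the call would be `all_uint8(input_details) and all_uint8(output_details)`.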


In [12]:
!ls

checkpoint			 checkpoints.meta  sample_data
checkpoints.data-00000-of-00001  frozen_model.pb
checkpoints.index		 model.tflite


- The `quantization` attribute in input/output_details is a `(scale, zero_point)` pair, where real_value = scale * (quantized_value - zero_point).
  - So if the attribute is (a, b), input data f should be transformed to (f/a + b), rounded, and cast to uint8.

In [0]:
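def quantize(detail, data):
    """Map float data to the tensor's quantized dtype: q = data / scale + zero_point."""
    shape = detail['shape']
    dtype = detail['dtype']
    a, b = detail['quantization']  # (scale, zero_point)

    # Round before casting (astype alone truncates) and clamp to the dtype's range
    info = np.iinfo(dtype)
    q = np.clip(np.round(data / a + b), info.min, info.max)
    return q.astype(dtype).reshape(shape)


def dequantize(detail, data):
    """Map quantized values back to floats: real = (q - zero_point) * scale."""
    a, b = detail['quantization']  # (scale, zero_point)

    # Cast first so that subtracting the zero point cannot wrap around in uint8
    return (data.astype(np.float32) - b) * a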
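A worked example of this mapping, using the input scale and zero point reported by the interpreter above, shows why rounding before the cast matters. The reported scale is slightly larger than 1/255, so 1.0 divided by it lands just below 255, and a plain `astype` would truncate it to 254. The variable names are for illustration only.

```python
import numpy as np

scale, zero_point = 0.003921568859368563, 0   # input quantization reported above

real = np.array([0.0, 0.5, 1.0])
q_float = real / scale + zero_point

truncated = q_float.astype(np.uint8)                           # plain cast drops the fraction
rounded = np.clip(np.round(q_float), 0, 255).astype(np.uint8)  # round, then clamp to uint8

print(truncated, rounded)
print((rounded.astype(np.float32) - zero_point) * scale)       # dequantized values
```

The two results differ only for the brightest pixel here, but an off-by-one at full scale is easy to avoid by rounding.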

In [17]:
# Use the first test image; testX is already scaled to [0, 1]
quantized_input = quantize(input_details[0], testX[:1])
interpreter.set_tensor(input_details[0]['index'], quantized_input)

interpreter.invoke()

# The results are stored on 'index' of output_details
quantized_output = interpreter.get_tensor(output_details[0]['index'])

print('sample result of quantized model')
print(dequantize(output_details[0], quantized_output))

### Compile the tflite file using EdgeTPU Compiler 

In [19]:
%%bash

edgetpu_compiler 'model.tflite'

Edge TPU Compiler version 2.0.258810407

Model compiled successfully in 90 ms.

Input model: model.tflite
Input size: 1.61MiB
Output model: model_edgetpu.tflite
Output size: 1.72MiB
On-chip memory available for caching model parameters: 7.82MiB
On-chip memory used for caching model parameters: 1.70MiB
Off-chip memory used for streaming uncached model parameters: 6.00KiB
Number of Edge TPU subgraphs: 1
Total number of operations: 19
Operation log: model_edgetpu.log
See the operation log file for individual operation details.


INFO: Initialized TensorFlow Lite runtime.


We can download the generated file.

In [0]:
from google.colab import files

files.download('model_edgetpu.tflite')
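
Once the file is on a machine with a Coral device, it can be loaded through the Edge TPU delegate. The following is a sketch rather than something run in this notebook: it assumes the `tflite_runtime` package and the Edge TPU runtime are installed on a Linux host (where the delegate library is named `libedgetpu.so.1`), and `run_on_edgetpu` is a hypothetical helper name.

```python
import numpy as np


def run_on_edgetpu(model_path, image_uint8):
    """Run one uint8 image through a compiled model on an attached Edge TPU."""
    # Imported lazily: tflite_runtime and libedgetpu exist only on the target device
    from tflite_runtime.interpreter import Interpreter, load_delegate

    interpreter = Interpreter(
        model_path=model_path,
        experimental_delegates=[load_delegate("libedgetpu.so.1")],  # assumed Linux name
    )
    interpreter.allocate_tensors()
    input_detail = interpreter.get_input_details()[0]
    output_detail = interpreter.get_output_details()[0]

    # The compiled model expects a uint8 batch with the shape reported by the interpreter
    batch = np.asarray(image_uint8, dtype=np.uint8).reshape(input_detail["shape"])
    interpreter.set_tensor(input_detail["index"], batch)
    interpreter.invoke()
    return interpreter.get_tensor(output_detail["index"])
```

A call would look like `run_on_edgetpu("model_edgetpu.tflite", image)`, where `image` is a 28 x 28 array of uint8 pixel values; the returned uint8 scores can be converted back to probabilities with the output tensor's scale and zero point.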