# Optimization Challenge

In this challenge, you will quantize a trained model to both INT8 and FLOAT16, then compare each quantized model against the original in terms of accuracy drop and reduction in model size.

The notebook includes cells that train a model on the MNIST dataset, along with a function you can use to calculate a model's accuracy.

Your task is to complete the cells marked TODO. If you get stuck, or want to verify your answer, you can expand the corresponding cells in the solution notebook.

## Importing Packages

The first step is to import the packages we need. We will be using `tensorflow` and `keras` to train our models. We will also be using `numpy` to evaluate the performance of our model. Later on, we will be using `os` to find the size of our original and quantized models.

In [None]:
import os
import numpy as np
import tensorflow as tf
from tensorflow import keras

## Training a model on MNIST
TensorFlow provides a simple API for downloading the MNIST dataset and splitting it into training and test sets, so we will use that to load our data first.

We will preprocess our data by dividing all the pixel values by 255 to normalize them.

Next we will create a simple CNN model with two convolutional layers and two dense layers.

Finally, we can compile and train our model.


In [None]:
# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input images so that each pixel value is between 0 and 1.
train_images = train_images / 255.0
test_images = test_images / 255.0

# Creating the model
model = keras.Sequential([
  keras.layers.InputLayer(input_shape=(28, 28)),
  keras.layers.Reshape(target_shape=(28, 28, 1)),
  keras.layers.Conv2D(filters=24, kernel_size=(3, 3), activation='relu'),
  keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation='relu'),
  keras.layers.Flatten(),
  keras.layers.Dense(32, activation='relu'),
  keras.layers.Dense(10)
])

# Training the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
  train_images,
  train_labels,
  epochs=2,
  validation_split=0.1,
)


## Evaluating Model Performance
Below is a function that takes a `tflite` interpreter and runs inference on the MNIST test set. We can use it to evaluate the performance of both our original and quantized models.

In [None]:
def evaluate_model(interpreter):
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  # Run predictions on every image in the "test" dataset.
  prediction_digits = []
  for i, test_image in enumerate(test_images):
    if i % 1000 == 0:
      print('Evaluated on {n} results so far.'.format(n=i))
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the digit with the
    # highest score (the model outputs logits, not probabilities).
    output = interpreter.tensor(output_index)
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

  print('\n')
  # Compare prediction results with ground truth labels to calculate accuracy.
  prediction_digits = np.array(prediction_digits)
  accuracy = (prediction_digits == test_labels).mean()
  return accuracy
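
The post-processing and accuracy steps of the function above can be illustrated in isolation. This is a minimal sketch using made-up values: the `logits` and `labels` arrays are hypothetical and are not outputs of the trained model.

```python
import numpy as np

# Hypothetical logits for four samples and three classes (illustration only).
logits = np.array([[0.1, 2.0, -1.0],
                   [1.5, 0.2, 0.3],
                   [0.0, 0.1, 3.0],
                   [2.2, 0.4, 0.1]])
# Hypothetical ground-truth labels for the same four samples.
labels = np.array([1, 0, 2, 1])

# The predicted class is the index of the largest logit in each row.
predictions = np.argmax(logits, axis=1)

# Accuracy is the fraction of predictions that match the labels.
accuracy = (predictions == labels).mean()
print(predictions, accuracy)
```

Because `argmax` only depends on which score is largest, there is no need to apply a softmax to the logits before picking the predicted digit.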

## TODO: Quantize the model weights to INT8

In the cell below, write code that takes the keras `model` we trained earlier and performs weight quantization to INT8.

Then create a `tflite` interpreter and use the `evaluate_model()` function to check the accuracy of the quantized model.
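
To build intuition for what INT8 weight quantization does to a tensor, here is a minimal NumPy sketch of symmetric per-tensor quantization. It is a conceptual illustration only, not the converter's actual implementation and not the solution to this cell; the helper names `quantize_int8` and `dequantize` and the random weights are assumptions made for the example.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization of a float array to int8 (sketch)."""
    max_abs = float(np.max(np.abs(weights)))
    # Map the largest magnitude to 127; guard against an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from int8 values and a scale."""
    return q.astype(np.float32) * scale

# Hypothetical weights shaped like a small convolution kernel.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=(3, 3, 1, 24)).astype(np.float32)

q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
max_error = float(np.max(np.abs(weights - restored)))

print("float32 bytes:", weights.nbytes, "int8 bytes:", q_weights.nbytes)
print("max rounding error:", max_error, "scale:", scale)
```

Each weight is stored in one byte instead of four, and the rounding error per weight is bounded by roughly half of `scale`. That rounding error is the source of any accuracy drop you measure.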

In [None]:
#TODO Perform Weight Quantization

In [None]:
#TODO Evaluate Model Performance

## TODO: Quantize the model weights to FLOAT16

In the cell below, write code that takes the keras `model` we trained earlier and performs weight quantization to FLOAT16.

Then create a `tflite` interpreter and use the `evaluate_model()` function to check the accuracy of the quantized model.
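
As a conceptual aid, the effect of storing weights as FLOAT16 can be sketched in NumPy by casting an array to half precision and back. This is only an illustration under assumed values (the evenly spaced `weights` array is hypothetical), not the solution to this cell.

```python
import numpy as np

# Hypothetical weights in the range [-0.5, 0.5], shaped like a dense layer.
weights = np.linspace(-0.5, 0.5, 320).astype(np.float32).reshape(32, 10)

# Store the weights in half precision, then cast back for comparison.
half = weights.astype(np.float16)
restored = half.astype(np.float32)

max_error = float(np.max(np.abs(weights - restored)))
print("float32 bytes:", weights.nbytes, "float16 bytes:", half.nbytes)
print("max rounding error:", max_error)
```

Half precision halves the storage per weight while keeping a floating-point representation, so the rounding error is typically much smaller than with INT8, at the cost of a smaller size reduction.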

In [None]:
#TODO Perform Weight Quantization

In [None]:
#TODO Evaluate Model Performance

## TODO: Save the model files and calculate their size
In the cell below, write code to save both the original and quantized model files and then calculate the model size. You can use the `os.path.getsize()` function to get the size of a file.
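
The size comparison itself can be sketched with placeholder files. In this minimal example the file names, the `file_size_kb` helper, and the dummy byte contents are all assumptions for illustration; in the challenge you would write your actual model files instead.

```python
import os
import tempfile

def file_size_kb(path):
    """Return the size of a file in kilobytes (hypothetical helper)."""
    return os.path.getsize(path) / 1024

with tempfile.TemporaryDirectory() as tmp_dir:
    original_path = os.path.join(tmp_dir, "original.bin")
    smaller_path = os.path.join(tmp_dir, "smaller.bin")

    # Write dummy bytes standing in for saved model files.
    with open(original_path, "wb") as f:
        f.write(bytes(4096))
    with open(smaller_path, "wb") as f:
        f.write(bytes(1024))

    original_kb = file_size_kb(original_path)
    smaller_kb = file_size_kb(smaller_path)

# Fraction by which the smaller file shrinks relative to the original.
reduction = 1 - smaller_kb / original_kb
print(f"{original_kb:.1f} KB -> {smaller_kb:.1f} KB ({reduction:.0%} smaller)")
```

Reporting the reduction as a fraction of the original size makes it easy to compare the INT8 and FLOAT16 results side by side.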

In [None]:
#TODO Calculate the model size