# TensorFlow Lite: Model Optimization for On-Device Machine Learning

## Downloading the dataset
We install the Kaggle library in our Colab notebook and use the Kaggle API to download the dataset.

**Install the Kaggle library**

In [1]:
!pip install kaggle

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


**Make a directory named “.kaggle”**

In [2]:
!mkdir -p ~/.kaggle

**Copy the “kaggle.json” into this new directory**

In [3]:
!cp kaggle.json ~/.kaggle/

**Set the required permissions for this file**

In [4]:
!chmod 600 ~/.kaggle/kaggle.json
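The three shell steps above (create the directory, copy the token, restrict its permissions) can also be sketched in plain Python. This is only an illustration: the helper name `install_kaggle_token` is hypothetical, and the demonstration writes a dummy token into a temporary directory rather than touching the real home directory.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def install_kaggle_token(token_path, home_dir):
    """Copy kaggle.json into <home_dir>/.kaggle and make it owner-only."""
    kaggle_dir = Path(home_dir) / ".kaggle"
    kaggle_dir.mkdir(parents=True, exist_ok=True)  # like `mkdir -p`, safe to re-run
    target = kaggle_dir / "kaggle.json"
    shutil.copyfile(token_path, target)
    os.chmod(target, 0o600)  # read/write for the owner only
    return target


# Demonstration in a throwaway directory with a dummy token (not a real key).
with tempfile.TemporaryDirectory() as tmp:
    dummy = Path(tmp) / "kaggle.json"
    dummy.write_text('{"username": "example", "key": "not-a-real-key"}')
    installed = install_kaggle_token(dummy, Path(tmp) / "home")
    installed_mode = stat.S_IMODE(installed.stat().st_mode)
    print(oct(installed_mode))
```

In the notebook itself one would call the helper with the uploaded `kaggle.json` and `Path.home()`; the owner-only mode matters because the file holds an API key.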

**Download the Dataset into Colab**

In [5]:
!kaggle datasets download -d masoudnickparvar/brain-tumor-mri-dataset

Downloading brain-tumor-mri-dataset.zip to /content
 93% 138M/149M [00:02<00:00, 93.5MB/s]
100% 149M/149M [00:02<00:00, 73.2MB/s]


In [6]:
!unzip \*.zip

Streaming output truncated to the last 5000 lines.
  inflating: Training/glioma/Tr-gl_0712.jpg  
  inflating: Training/glioma/Tr-gl_0713.jpg  
  inflating: Training/glioma/Tr-gl_0714.jpg  
  inflating: Training/glioma/Tr-gl_0715.jpg  
  inflating: Training/glioma/Tr-gl_0716.jpg  
  inflating: Training/glioma/Tr-gl_0717.jpg  
  inflating: Training/glioma/Tr-gl_0718.jpg  
  inflating: Training/glioma/Tr-gl_0719.jpg  
  inflating: Training/glioma/Tr-gl_0720.jpg  
  inflating: Training/glioma/Tr-gl_0721.jpg  
  inflating: Training/glioma/Tr-gl_0722.jpg  
  inflating: Training/glioma/Tr-gl_0723.jpg  
  inflating: Training/glioma/Tr-gl_0724.jpg  
  inflating: Training/glioma/Tr-gl_0725.jpg  
  inflating: Training/glioma/Tr-gl_0726.jpg  
  inflating: Training/glioma/Tr-gl_0727.jpg  
  inflating: Training/glioma/Tr-gl_0728.jpg  
  inflating: Training/glioma/Tr-gl_0729.jpg  
  inflating: Training/glioma/Tr-gl_0730.jpg  
  inflating: Training/glioma/Tr-gl_0731.jpg  
  inflating: Tr

### **About Dataset**
This dataset is a combination of the following three datasets:
*   **figshare**
*   **SARTAJ**
*   **Br35H**

This dataset contains **7023** human brain MRI images, which are classified into **4 classes**:

1.   **glioma**
2.   **meningioma**
3.   **no tumor**
4.   **pituitary**


The images in the **no tumor** class were taken from the Br35H dataset.

## Data preprocessing
Import the necessary libraries and packages.



In [7]:
!pip install mediapipe-model-maker

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting mediapipe-model-maker
  Downloading mediapipe_model_maker-0.2.0-py3-none-any.whl (117 kB)
Collecting mediapipe>=0.10.0 (from mediapipe-model-maker)
  Downloading mediapipe-0.10.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (33.8 MB)
Collecting tf-models-official==2.11.6 (from mediapipe-model-maker)
  Downloading tf_models_official-2.11.6-py2.py3-none-any.whl (2.4 MB)
Collecting immutabledict (from tf-models-official==2.11.6->mediapipe-model-maker)
  Downloading immutabledict-2.2.4-py3-none-any.whl (4.1 kB)
Co

In [1]:
import os
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import seaborn as sns
assert tf.__version__.startswith('2')
from mediapipe_model_maker import image_classifier,quantization
%load_ext tensorboard


TensorFlow Addons (TFA) has ended development and introduction of new features.
TFA has entered a minimal maintenance and release mode until a planned end of life in May 2024.
Please modify downstream libraries to take dependencies from other repositories in our TensorFlow community (e.g. Keras, Keras-CV, and Keras-NLP). 

For more information see: https://github.com/tensorflow/addons/issues/2807 



Let us first define ***batch_size***, ***img_height***, and ***img_width***.

In [9]:
batch_size = 32
img_height = 224
img_width = 224

In [3]:
train_dir = '/content/Training/'
test_dir = '/content/Testing/'

### **Read Training Data**

In [4]:
# Create Training Dataset
data = image_classifier.Dataset.from_folder(train_dir)
train_ds, remaining_data = data.split(0.8)

Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089


In [5]:
# Create Validation set
val_ds = remaining_data

In [6]:
class_names = train_ds.label_names
print(class_names)

['glioma', 'meningioma', 'notumor', 'pituitary']


In [7]:
# Create a function to customize autopct parameter of plt.pie()
def make_autopct(values):
    def my_autopct(pct):
        # The pct is percentage value that matplotlib supplies for every wedge
        total = sum(values)
        val = int(round(pct*total/100.0))
        return f'{pct:.2f}%  ({val})'
    return my_autopct

In [8]:
cmap = sns.color_palette("Blues")

**Inference**: Both the training set and the validation set have a roughly equal distribution of instances across the classes. Thus, we don't have to assign additional weights to any particular class when we train our model.
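One way to back up a balance claim like this with numbers is to compute each class's share of the samples and the ratio between the largest and smallest class. The sketch below is a minimal illustration; the helper names are hypothetical and the label counts are made up, not the actual counts of this dataset.

```python
from collections import Counter


def class_shares(labels):
    """Return {class_name: fraction of samples} for an iterable of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


def imbalance_ratio(labels):
    """Largest class count divided by smallest class count (1.0 = perfectly balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical label list, made up for illustration only.
example_labels = (["glioma"] * 26 + ["meningioma"] * 24
                  + ["notumor"] * 27 + ["pituitary"] * 23)
shares = class_shares(example_labels)
ratio = imbalance_ratio(example_labels)
print(shares)
print(round(ratio, 2))
```

A ratio close to 1 suggests that class weighting is unnecessary; a much larger ratio would be a reason to consider it.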

### Train Model
We will be retraining the EfficientNet Lite 0 model. It is trained on ImageNet (ILSVRC-2012-CLS), optimized for TFLite, and designed for performance on mobile CPU, GPU, and EdgeTPU. To meet the requirements of edge devices, the following changes were made to the original EfficientNets:

* Removed the squeeze-and-excite (SE) blocks, as SE is not well supported on some mobile accelerators.
* Replaced all swish activations with ReLU6 for easier post-training quantization.
* Fixed the stem and head while scaling models up in order to keep the models small and fast.
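To see why ReLU6 is friendlier to quantization than swish, it helps to compare their output ranges. The following is a small sketch using the standard definitions of the two activations (swish as `x * sigmoid(x)`, ReLU6 as a clamp to `[0, 6]`); the input values are arbitrary.

```python
import math


def swish(x):
    """Swish / SiLU activation: x * sigmoid(x). Unbounded above."""
    return x / (1.0 + math.exp(-x))


def relu6(x):
    """ReLU6 activation: clamp x to the fixed range [0, 6]."""
    return min(max(x, 0.0), 6.0)


inputs = [-4.0, -1.0, 0.0, 1.0, 6.0, 20.0]
swish_out = [swish(x) for x in inputs]
relu6_out = [relu6(x) for x in inputs]
print(relu6_out)
print([round(v, 3) for v in swish_out])
```

ReLU6 outputs always lie in a range known ahead of time, so an 8-bit code can cover it with a fixed step size, whereas swish grows without bound, dips slightly below zero, and needs a sigmoid evaluation.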

In [10]:
# Create model
spec = image_classifier.SupportedModels.EFFICIENTNET_LITE0
hparams = image_classifier.HParams(export_dir="exported_model",epochs=20,batch_size=batch_size)
options = image_classifier.ImageClassifierOptions(supported_model=spec, hparams=hparams)

In [11]:
model = image_classifier.ImageClassifier.create(
    train_data = train_ds,
    validation_data = val_ds,
    options=options,
)



Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 keras_layer (KerasLayer)    (None, 1280)              3413024   
                                                                 
 dropout (Dropout)           (None, 1280)              0         
                                                                 
 dense (Dense)               (None, 4)                 5124      
                                                                 
Total params: 3,418,148
Trainable params: 5,124
Non-trainable params: 3,413,024
_________________________________________________________________
None
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [12]:
# Read test set
test_ds = image_classifier.Dataset.from_folder(test_dir)

In [14]:
# Evaluating the Model on test dataset.
_, baseline_model_accuracy = model.evaluate(test_ds)
print('Baseline test accuracy:', baseline_model_accuracy)

Baseline test accuracy: 0.8802440762519836


## Quantization

Quantization works by reducing the precision of the numbers used to represent a model's parameters, which by default are 32-bit floating-point numbers. This results in a smaller model size and faster computation.
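As a rough illustration of what "reducing the precision" means, the sketch below quantizes a handful of made-up float weights to 8-bit integer codes with a single scale factor and then maps them back. This is a simplified symmetric scheme written for this explanation, not the exact procedure the TF Lite converter applies; the function names are hypothetical.

```python
def quantize_symmetric_int8(values):
    """Map floats to int8 codes with a single scale, so that 0.0 maps to 0."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the int8 codes."""
    return [q * scale for q in quantized]


weights = [-0.50, -0.12, 0.0, 0.07, 0.20, 0.50]  # made-up example weights
codes, scale = quantize_symmetric_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(max_error, scale / 2)
```

Each weight is now stored in one byte instead of four, and the round-trip error stays within about half a quantization step (`scale / 2`).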

In [15]:
def evaluate_tflite(interpreter, quantization_type='fp16'):
    """Return the accuracy of a TF Lite model on the images in the Testing folder."""
    # Get input and output tensor details
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # Load the test set, one image per batch
    data_dir = "Testing"
    img_height, img_width = 224, 224
    batch_size = 1
    dataset = tf.keras.preprocessing.image_dataset_from_directory(
        data_dir,
        image_size=(img_height, img_width),
        batch_size=batch_size)

    # Evaluate the model on the dataset
    correct = 0
    total = 0
    for images, labels in dataset:
        # Scale pixel values to [0, 1] for float-input models
        images = tf.cast(images, tf.float32) / 255.0
        images = tf.image.resize(images, (img_height, img_width))
        images = np.array(images)

        # The int8 model takes uint8 input, so map back to [0, 255]
        if quantization_type == 'int8':
            images = np.around(images * 255.0).astype(np.uint8)

        # Run inference on the TFLite model
        interpreter.set_tensor(input_details[0]['index'], images)
        interpreter.invoke()
        output = interpreter.get_tensor(output_details[0]['index'])

        # Get the predicted labels
        predicted_labels = np.argmax(output, axis=1)

        # Update the accuracy count
        correct += np.sum(predicted_labels == labels.numpy())
        total += len(labels)

    # Compute and return the accuracy
    accuracy = correct / total
    return accuracy

### Float 16 Quantization


In float16 quantization, weights are converted from 32-bit to 16-bit floating-point values. This roughly halves the model size, with minimal impact on latency and accuracy.
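The 2x figure follows directly from the storage width of each value. As a quick sketch, Python's standard `struct` module can pack the same made-up numbers as 32-bit (`f`) and 16-bit (`e`) floats, which shows both the halved byte count and the precision that is given up.

```python
import struct

weights = [0.1, -1.2345678, 3.14159265, 1000.5]  # made-up example values

as_float32 = struct.pack(f"<{len(weights)}f", *weights)  # 4 bytes per value
as_float16 = struct.pack(f"<{len(weights)}e", *weights)  # 2 bytes per value

# Decode the half-precision bytes to see what precision survived.
restored = struct.unpack(f"<{len(weights)}e", as_float16)
relative_errors = [abs(w - r) / abs(w) for w, r in zip(weights, restored)]

print(len(as_float32), len(as_float16))
print(max(relative_errors))
```

The relative error of each value stays small, which is consistent with the small accuracy cost usually associated with float16 weights.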

In [16]:
#Defining Config
config = quantization.QuantizationConfig.for_float16()
#Exporting Model
model.export_model(model_name='model_fp16.tflite',quantization_config=config)

***QuantizationConfig.for_float16()*** selects float16 quantization; in a plain TF Lite conversion this corresponds to setting ***converter.target_spec.supported_types*** to float16. The rest of the export is the same as a general TF Lite model conversion.


Let’s check the performance of this float16-quantized TF Lite model on the test set.


In [17]:
interpreter = tf.lite.Interpreter(model_path="exported_model/model_fp16.tflite")
interpreter.allocate_tensors()
test_accuracy = evaluate_tflite(interpreter)
print('Float 16 Quantized TFLite Model Test Accuracy:', test_accuracy*100)
print('Baseline Keras Model Test Accuracy:', baseline_model_accuracy*100)

Found 1311 files belonging to 4 classes.
Float 16 Quantized TFLite Model Test Accuracy: 84.82074752097635
Baseline Keras Model Test Accuracy: 88.02440762519836


### Integer Quantization

Integer quantization is an optimization strategy that converts 32-bit floating-point numbers (such as weights and activation outputs) to the nearest 8-bit fixed-point numbers. This results in a smaller model and faster inference.

Integer quantization requires a representative dataset, i.e. a small sample of typical input images (usually drawn from the training dataset), which the converter uses to estimate the value ranges of the activations.
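The role of the representative dataset can be sketched in a few lines: run some sample inputs through the model, record the smallest and largest activation values seen, and derive the scale and zero point that map that range onto 8 bits. The code below is a simplified illustration with made-up activation values and hypothetical function names; the real converter performs this calibration per tensor internally.

```python
def calibrate_uint8(representative_batches):
    """Derive (scale, zero_point) for uint8 from observed activation values."""
    observed_min = min(min(batch) for batch in representative_batches)
    observed_max = max(max(batch) for batch in representative_batches)
    # The range must contain 0.0 so that zero is exactly representable.
    observed_min = min(observed_min, 0.0)
    observed_max = max(observed_max, 0.0)
    scale = (observed_max - observed_min) / 255.0
    if scale == 0:
        scale = 1.0  # all observed values were zero
    zero_point = round(-observed_min / scale)
    return scale, zero_point


def quantize_uint8(value, scale, zero_point):
    """Quantize one float with the calibrated parameters, clamping to [0, 255]."""
    return max(0, min(255, round(value / scale) + zero_point))


# Made-up activation values standing in for a layer's outputs on a few images.
batches = [[-1.0, 0.3, 2.0], [-0.5, 1.5, 3.0], [0.0, 0.8, 2.5]]
scale, zero_point = calibrate_uint8(batches)
low = quantize_uint8(-1.0, scale, zero_point)
high = quantize_uint8(3.0, scale, zero_point)
print(scale, zero_point)
print(low, high)
```

If the sample is not representative, the observed range is wrong and values outside it get clamped, which is why the choice of calibration images affects the accuracy of the quantized model.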


In [18]:
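# Defining Config (the test set serves as the representative dataset here;
# a subset of the training data is the more common choice)
config = quantization.QuantizationConfig.for_int8(test_ds)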
#Exporting Model
model.export_model(model_name='model_int8.tflite', quantization_config=config)

Let’s evaluate the resulting integer-quantized TF Lite model on the test dataset.

In [19]:
interpreter = tf.lite.Interpreter(model_path="exported_model/model_int8.tflite")
interpreter.allocate_tensors()
test_accuracy = evaluate_tflite(interpreter,quantization_type='int8')
print('Int 8 Quantized TFLite Model Test Accuracy:', test_accuracy*100)
print('Baseline Keras Model Test Accuracy:', baseline_model_accuracy*100)

Found 1311 files belonging to 4 classes.
Int 8 Quantized TFLite Model Test Accuracy: 86.04118993135012
Baseline Keras Model Test Accuracy: 88.02440762519836


### Dynamic Range Quantization

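In dynamic range quantization, only the weights are converted from floating point to 8-bit integers at conversion time; activations stay in floating point and are quantized on the fly during inference where the kernels support it. Because each weight shrinks from 32 bits to 8, the model becomes roughly 4x smaller.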
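The size reductions quoted in this section can be checked with simple arithmetic on the parameter count reported in the model summary earlier (3,418,148 total parameters). The estimate below is only a back-of-the-envelope sketch: it counts weight storage alone and ignores graph structure, metadata, and quantization parameters, so actual `.tflite` file sizes will differ somewhat.

```python
TOTAL_PARAMS = 3_418_148  # "Total params" from the model summary

BYTES_PER_WEIGHT = {"float32": 4, "float16": 2, "int8": 1}


def weight_megabytes(num_params, dtype):
    """Approximate storage for the weights alone, in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_WEIGHT[dtype] / 1e6


sizes = {dtype: weight_megabytes(TOTAL_PARAMS, dtype) for dtype in BYTES_PER_WEIGHT}
for dtype, megabytes in sizes.items():
    ratio = sizes["float32"] / megabytes
    print(f"{dtype}: {megabytes:.2f} MB ({ratio:.0f}x smaller than float32)")
```

Comparing these estimates with the sizes of the files in `exported_model` is a quick sanity check that each export used the intended quantization.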

In [20]:
#Defining Config
config = quantization.QuantizationConfig.for_dynamic()
#Exporting Model
model.export_model(model_name='model_dynamic.tflite',quantization_config=config)

Let’s evaluate this TF Lite model on the test dataset.

In [21]:
interpreter = tf.lite.Interpreter(model_path="exported_model/model_dynamic.tflite")
interpreter.allocate_tensors()
test_accuracy = evaluate_tflite(interpreter)
print('Dynamic Quantized TFLite Model Test Accuracy:', test_accuracy*100)
print('Baseline Keras Model Test Accuracy:', baseline_model_accuracy*100)

Found 1311 files belonging to 4 classes.
Dynamic Quantized TFLite Model Test Accuracy: 85.73607932875667
Baseline Keras Model Test Accuracy: 88.02440762519836
