<a href="https://colab.research.google.com/github/AhmedFarrukh/DeepLearning-EdgeComputing/blob/main/Reproducing_Paper_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

To reproduce the findings of the paper "To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference", seven popular convolutional neural network models are loaded with pre-trained ImageNet weights, quantized, and then tested for accuracy and inference time.

In [None]:
import os
import pathlib
import sys
import time

import numpy as np
import tensorflow as tf
from PIL import Image

In [None]:
modelNames = ["MobileNet", "ResNet50", "ResNet101", "InceptionV3", "VGG16", "VGG19", "ResNet152"]

In [None]:
# Create the output directory once, before converting any model.
tflite_models_dir = pathlib.Path("/tmp/tflite_models/")
tflite_models_dir.mkdir(exist_ok=True, parents=True)

for modelName in modelNames:
  # Load the Keras model with pre-trained ImageNet weights.
  model_class = getattr(tf.keras.applications, modelName)
  model = model_class(weights='imagenet')

  # Convert to a float TFLite model (no optimizations).
  converter = tf.lite.TFLiteConverter.from_keras_model(model)
  tflite_model = converter.convert()

  # Convert again with the default optimizations, which quantize the weights.
  converter = tf.lite.TFLiteConverter.from_keras_model(model)
  converter.optimizations = [tf.lite.Optimize.DEFAULT]
  tflite_model_quant = converter.convert()

  # Save the unquantized/float model:
  tflite_model_file = tflite_models_dir/(modelName+".tflite")
  tflite_model_file.write_bytes(tflite_model)
  # Save the quantized model:
  tflite_model_quant_file = tflite_models_dir/(modelName+"_quant.tflite")
  tflite_model_quant_file.write_bytes(tflite_model_quant)


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet/mobilenet_1_0_224_tf.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet101_weights_tf_dim_ordering_tf_kernels.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/inception_v3/inception_v3_weights_tf_dim_ordering_tf_kernels.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg19/vgg19_weights_tf_dim_ordering_tf_kernels.h5
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet152_weights_tf_dim_ordering_tf_kernels.h5
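
With `tf.lite.Optimize.DEFAULT` and no representative dataset, the converter applies dynamic range quantization, which stores weights as 8-bit integers, so each quantized file should be roughly a quarter the size of its float counterpart. A minimal sketch for checking this on disk, assuming the file naming used in the conversion cell; `size_report` is a hypothetical helper, not part of TensorFlow.

```python
import pathlib

def size_report(models_dir, model_names):
    """Return {name: (float_mb, quant_mb, ratio)} for saved .tflite pairs.

    Sizes are in megabytes (10^6 bytes); ratio is float size / quantized size.
    """
    models_dir = pathlib.Path(models_dir)
    report = {}
    for name in model_names:
        float_bytes = (models_dir / f"{name}.tflite").stat().st_size
        quant_bytes = (models_dir / f"{name}_quant.tflite").stat().st_size
        report[name] = (float_bytes / 1e6, quant_bytes / 1e6, float_bytes / quant_bytes)
    return report
```

In this notebook it could be called as `size_report("/tmp/tflite_models", modelNames)` once the conversion cell has finished.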


In [None]:
!mkdir -p /tmp/benchmark
!wget https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model -P /tmp/benchmark
!chmod +x /tmp/benchmark/linux_x86-64_benchmark_model
!touch /tmp/benchmark/results

--2024-07-08 11:02:40--  https://storage.googleapis.com/tensorflow-nightly-public/prod/tensorflow/release/lite/tools/nightly/latest/linux_x86-64_benchmark_model
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.23.207, 74.125.203.207, 74.125.204.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.23.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6237672 (5.9M) [application/octet-stream]
Saving to: ‘/tmp/benchmark/linux_x86-64_benchmark_model’


2024-07-08 11:02:41 (5.81 MB/s) - ‘/tmp/benchmark/linux_x86-64_benchmark_model’ saved [6237672/6237672]



In [None]:
for modelName in modelNames:
  os.system("echo \"" + modelName + "; Original\n\" >> /tmp/benchmark/results" )

  os.system("/tmp/benchmark/linux_x86-64_benchmark_model \
    --graph=/tmp/tflite_models/" + modelName +".tflite"+" \
    --num_threads=1 >> /tmp/benchmark/results")

  os.system("echo \"\n" + modelName + "; Quantized\n\" >> /tmp/benchmark/results" )

  os.system("/tmp/benchmark/linux_x86-64_benchmark_model \
    --graph=/tmp/tflite_models/" + modelName +"_quant.tflite"+" \
    --num_threads=1 >> /tmp/benchmark/results")

  os.system("echo \"" + "\n\n\" >> /tmp/benchmark/results" )

# Print the collected benchmark log; the context manager closes the file afterwards.
with open("/tmp/benchmark/results", "r") as f:
  print(f.read())


MobileNet; Original

INFO: STARTING!
INFO: Log parameter values verbosely: [0]
INFO: Num threads: [1]
INFO: Graph: [/tmp/tflite_models/MobileNet.tflite]
INFO: Signature to run: []
INFO: #threads used for CPU inference: [1]
INFO: Loaded model /tmp/tflite_models/MobileNet.tflite
INFO: The input model file size (MB): 16.9034
INFO: Initialized session in 137.482ms.
INFO: Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
INFO: count=24 first=21809 curr=24807 min=19942 max=24807 avg=21104.3 std=1030

INFO: Running benchmark for at least 50 iterations and at least 1 seconds but terminate if exceeding 150 seconds.
INFO: count=50 first=27186 curr=20509 min=20320 max=30551 avg=21604.6 std=2033

INFO: Inference timings in us: Init: 137482, First inference: 21809, Warmup (avg): 21104.3, Inference (avg): 21604.6
INFO: Note: as the benchmark tool itself affects memory footprint, the following is only APPROXIMATE to the actual memory footprin
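
The benchmark log is long, but the numbers of interest sit on the single `Inference timings in us:` summary line of each run. A minimal sketch for extracting them, assuming the line format shown in the output above (it may differ between versions of the benchmark tool); `parse_timings` is a hypothetical helper.

```python
import re

# Matches the summary line printed at the end of each benchmark run.
TIMING_RE = re.compile(
    r"Init: (?P<init>[\d.]+), First inference: (?P<first>[\d.]+), "
    r"Warmup \(avg\): (?P<warmup>[\d.]+), Inference \(avg\): (?P<inference>[\d.]+)"
)

def parse_timings(log_text):
    """Return one dict per summary line found, with values in microseconds."""
    return [
        {key: float(value) for key, value in match.groupdict().items()}
        for match in TIMING_RE.finditer(log_text)
    ]

# The summary line from the MobileNet run above.
sample = ("INFO: Inference timings in us: Init: 137482, First inference: 21809, "
          "Warmup (avg): 21104.3, Inference (avg): 21604.6")
timings = parse_timings(sample)
print(timings[0]["inference"] / 1000, "ms")
```

Applied to the whole results file, the list would contain one entry per benchmarked model file, in the order the runs were appended.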

Next, we load and process Imagenette, a 10-class subset of the ImageNet dataset, to measure accuracy and estimate inference times.

In [None]:
!pip install tensorflow-datasets
import tensorflow_datasets as tfds




In [None]:
ds_name = 'imagenette'
ds, info = tfds.load(ds_name, split='validation', as_supervised=True, with_info=True)
get_label_name = info.features['label'].int2str
text_labels = [get_label_name(i) for i in range(10)]

Downloading and preparing dataset 1.45 GiB (download: 1.45 GiB, generated: 1.46 GiB, total: 2.91 GiB) to /root/tensorflow_datasets/imagenette/full-size-v2/1.0.0...



Let's also load the ImageNet synset mapping.

In [None]:
import json
!wget -q -O imagenet_class_index.json https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json

# Load the mapping file
with open('imagenet_class_index.json') as f:
    class_index = json.load(f)
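
Each entry of `imagenet_class_index.json` maps a class index (as a string) to a `[WordNet ID, human-readable name]` pair. The sketch below uses a two-entry sample in that format rather than the downloaded file, and `predicted_wnid` is a hypothetical helper that turns a score vector into a WordNet ID.

```python
# Two-entry sample in the same format as imagenet_class_index.json.
sample_class_index = {
    "0": ["n01440764", "tench"],
    "1": ["n01443537", "goldfish"],
}

def predicted_wnid(scores, class_index):
    """Return the WordNet ID of the highest-scoring class."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_index[str(best)][0]

print(predicted_wnid([0.1, 0.9], sample_class_index))  # n01443537
```

The accuracy check later in the notebook appears to rely on Imagenette's label names being WordNet IDs as well, so the two can be compared directly.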

Next, let's define a function that measures the accuracy and average inference time of each model.

In [None]:
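def inference(model_path, modelName):
  """Return (average seconds per image, top-1 accuracy) for a TFLite model."""
  # Load the TFLite model and allocate its input/output tensors.
  interpreter = tf.lite.Interpreter(model_path=model_path)
  interpreter.allocate_tensors()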

  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  totalTime = 0
  correct_predictions = 0
  total_predictions = 0

  preProcessDetails = {"MobileNet": ([224, 224], "mobilenet"),
                     "ResNet50": ([224, 224], "resnet"),
                     "ResNet101": ([224, 224], "resnet"),
                     "InceptionV3": ([299, 299], "inception_v3"),
                     "VGG16": ([224, 224], "vgg16"),
                     "VGG19": ([224, 224], "vgg19"),
                     "ResNet152": ([224, 224], "resnet")}
  def preprocess(image, label):
      image = tf.image.resize(image, preProcessDetails[modelName][0])
      model = getattr(tf.keras.applications, preProcessDetails[modelName][1])
      image = model.preprocess_input(image)
      return image, label

  ds2 = ds.map(preprocess).batch(1)

  for image, label in ds2:
    start_time = time.time()
    interpreter.set_tensor(input_index, image)
    interpreter.invoke()
    predictions = interpreter.get_tensor(output_index)
    totalTime += time.time() - start_time
    #print(class_index[str(np.argmax(predictions[0]))][0], text_labels[label.numpy()[0]])
    if class_index[str(np.argmax(predictions[0]))][0] == text_labels[label.numpy()[0]]:
      correct_predictions += 1
    total_predictions += 1

  # Average over the number of images actually processed.
  return (totalTime/total_predictions, correct_predictions/total_predictions)
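
The function above times each call with `time.time()`. As a hedged alternative, `time.perf_counter()` is a monotonic, high-resolution clock intended for measuring short durations. The sketch below is a generic timing helper, not the notebook's method; `mean_latency` and its arguments are hypothetical names.

```python
import time

def mean_latency(fn, inputs):
    """Call fn once per input and return the mean wall-clock seconds per call."""
    total = 0.0
    count = 0
    for item in inputs:
        start = time.perf_counter()
        fn(item)
        total += time.perf_counter() - start
        count += 1
    if count == 0:
        raise ValueError("inputs must not be empty")
    return total / count

# Time a small stand-in workload instead of a real interpreter call.
print(mean_latency(lambda n: sum(range(n)), [10_000, 20_000, 30_000]))
```

For the interpreter, `fn` would wrap the `set_tensor`, `invoke`, and `get_tensor` calls for a single image.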

Finally, let's evaluate each model on the dataset.

In [None]:
for modelName in modelNames:
  print(modelName + ": ", inference("/tmp/tflite_models/" + modelName +".tflite", modelName))
  print(modelName + " Quantized: ", inference("/tmp/tflite_models/" + modelName +"_quant.tflite", modelName))
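
Each call to `inference` returns an `(average seconds per image, accuracy)` tuple. A small helper can turn a pair of such tuples into the two numbers the comparison is after: speedup and accuracy change. `compare` is a hypothetical helper, and the values in the usage line are placeholders, not measured results.

```python
def compare(original, quantized):
    """Summarize two (avg_seconds, accuracy) tuples as speedup and accuracy change."""
    orig_time, orig_acc = original
    quant_time, quant_acc = quantized
    return {
        "speedup": orig_time / quant_time,        # > 1 means the quantized model is faster
        "accuracy_change": quant_acc - orig_acc,  # negative means accuracy was lost
    }

# Placeholder values for illustration only, not measured results.
print(compare((0.04, 0.5), (0.02, 0.25)))
```

In the loop above, the two `inference(...)` results for each model could be stored in variables and passed to `compare` instead of being printed directly.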