<a href="https://colab.research.google.com/github/Ekhao/CPUvsGPUvsTPUInferenceTiming/blob/main/CPU_vs_GPU_vs_TPU_Inference_Timing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CPU vs GPU vs TPU Inference Time
This Colab notebook compares the inference performance of CPUs, GPUs and TPUs.

Unfortunately, Google Colab cannot switch between computing instances dynamically, so this notebook must be run once for each type of computing platform. Some instructions are specific to one platform, e.g. the TPU. These instructions are marked as such and placed in standalone code blocks.

## Imports

In [None]:
import tensorflow as tf
import tensorflow_datasets as tfds

## CPU Specifics
We don't need any special code to run inference on the CPU. However, we can check which CPU we are using with the "lscpu" command.

In [None]:
!lscpu
strategy = tf.distribute.get_strategy()

## GPU Specifics
We do not need any special code to run inference on a GPU either. Again, however, we can check which GPU we are using with the "nvidia-smi" command.

In [None]:
!nvidia-smi
strategy = tf.distribute.get_strategy()

## TPU Specifics
Unlike the CPU and GPU, the TPU requires some setup code. This is likely because it is a newer platform whose interface is not yet as streamlined as those of the CPU and GPU.

We also print a list of TPU devices that we are using.

In [None]:
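try:
  # A ValueError here means no TPU was found for this runtime
  tpu = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
  print("TPU Available")
except ValueError:
  tpu = None
if tpu:
  tf.config.experimental_connect_to_cluster(tpu)
  tf.tpu.experimental.initialize_tpu_system(tpu)
  strategy = tf.distribute.TPUStrategy(tpu)
  # List the TPU devices that are now visible to TensorFlow
  print(tf.config.list_logical_devices("TPU"))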
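
Because the notebook is run once per platform, it can help to label each run with the accelerator that was actually detected. The following is a minimal sketch of that idea, not part of the original notebook: `pick_platform` and `example_types` are hypothetical names, and the commented line assumes the `device_type` attribute of the devices returned by `tf.config.list_logical_devices()`.

```python
def pick_platform(device_types):
    """Return "TPU" or "GPU" if present in device_types, otherwise "CPU"."""
    for candidate in ("TPU", "GPU"):
        if candidate in device_types:
            return candidate
    return "CPU"


# In the notebook, the list could be built (assumption) with:
#   device_types = [d.device_type for d in tf.config.list_logical_devices()]
# A hard-coded stand-in keeps this sketch runnable without TensorFlow.
example_types = ["CPU", "GPU"]
print(pick_platform(example_types))  # GPU
```

Printing this label next to each timing makes it harder to mix up results from different runtimes.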

## Set up a Model
For this experiment we use the ResNet152V2 neural network. This model can be loaded easily through Keras.

In [None]:
with strategy.scope():
  model = tf.keras.applications.ResNet152V2(
      include_top=True,
      weights="imagenet",
  )

## Set up a Dataset
ResNet152V2 is a model for processing image data. It is trained on ImageNet, but as we don't really care about accurate predictions, we can use any available image dataset.

We choose the "tf_flowers" dataset, as it is available in Google Cloud Storage for free. Note that TPUs require the data to be stored in Google Cloud Storage.

In [None]:
data = tfds.load("tf_flowers", split="train", as_supervised=True, try_gcs=True)

We need to preprocess the dataset for two reasons:

*   The images in tf_flowers have different sizes. We need to resize them all to 224x224 pixels so that they can be used with the ResNet152V2 model.
*   A TensorFlow dataset can be batched, cached and prefetched to reduce the time the processor spends waiting for input data. To provide a fair comparison between CPUs, GPUs and TPUs, we apply these optimizations so that each processing unit can work as fast as possible.

In [None]:
# A thin wrapper that takes one (image, label) entry of a dataset loaded with as_supervised=True and resizes the image
def resize_image(image, label):
  return tf.image.resize(image, [224, 224]), label

data = (
  data.map(resize_image, num_parallel_calls=tf.data.AUTOTUNE)
  .repeat(3)                   # three passes over the images per predict call
  .batch(128)
  .cache()                     # keep the resized batches in memory once produced
  .prefetch(tf.data.AUTOTUNE)  # prepare the next batch while the current one runs
)
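
To get a feel for how much work one `model.predict(data)` call involves, the batch count of the pipeline above can be worked out by hand. This is a small arithmetic sketch; it assumes the tf_flowers "train" split holds 3,670 images (the figure listed in the TFDS catalog), and the constant names are illustrative only.

```python
import math

NUM_IMAGES = 3670  # assumed size of the tf_flowers "train" split
REPEATS = 3        # matches .repeat(3)
BATCH_SIZE = 128   # matches .batch(128)

total_images = NUM_IMAGES * REPEATS
num_batches = math.ceil(total_images / BATCH_SIZE)
last_batch_size = total_images - (num_batches - 1) * BATCH_SIZE

print(total_images, num_batches, last_batch_size)  # 11010 87 2
```

Under that assumption the final batch contains only 2 images. If the odd-sized batch turns out to matter for timing, `batch(128, drop_remainder=True)` would discard it.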

## Run Inference
In this block we make the ResNet152V2 model run inference on the tf_flowers dataset.

For a fair comparison, it makes sense to run this cell twice and use the second timing, since the first run also includes one-off overhead such as loading the data for the first time.

In [None]:
model.predict(data)
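
The notebook itself relies on Colab's cell timing. As a hedged alternative, the warm-up-then-measure idea described above could be made explicit with a small helper. `time_inference`, `run_once`, `warmup_runs` and `timed_runs` are hypothetical names introduced for this sketch, and a stand-in workload replaces `model.predict(data)` so that the example runs anywhere.

```python
import statistics
import time


def time_inference(run_once, warmup_runs=1, timed_runs=3):
    """Call run_once() warmup_runs times untimed, then time timed_runs calls.

    Returns a list of wall-clock durations in seconds, one per timed call.
    """
    for _ in range(warmup_runs):
        run_once()  # absorbs one-off costs such as first loading

    durations = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        run_once()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload; in the notebook this would be something like
#   time_inference(lambda: model.predict(data))
durations = time_inference(lambda: sum(range(100_000)))
print(f"median: {statistics.median(durations):.4f} s over {len(durations)} runs")
```

Reporting the median of several timed runs, rather than a single measurement, makes the comparison between platforms less sensitive to one slow run.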