# <p style="text-align: center;">MIS 284N - Big Data and Distributed Programming</p>
## <p style="text-align: center;">Project 3 - Machine Learning using Tensorflow and Google Colab</p>
## <p style="text-align: center;">Total points: 100</p>
## <p style="text-align: center;">Due: Sunday, October 17th submitted via Canvas by 11:59 pm</p>

This will be an in-class project done in teams of 2. 

In this project, we will work with the CIFAR-10 image dataset. 
The starter code to download the dataset using Keras is given below. 
Test the project on Google Colab running on a CPU, a GPU, and a TPU.
 

# In every line of code, please write a comment to briefly explain what that line is doing.
Your grades will be based on your understanding of the code you write! 


# Task 1
Convert the features into a form that can be given as input to TensorFlow library functions.

In this task you will perform data augmentation. That is, pre-process the data to make the model more robust. Experiment with data augmentation techniques such as rotation, translation, horizontal flip, scaling, ZCA whitening, and histogram equalization. 
Choose any two or more augmentation techniques. 
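
To make the idea of augmentation concrete, here is a minimal NumPy sketch of two of the techniques listed above, horizontal flip and translation. The function names `horizontal_flip` and `translate` and the tiny 4x4 test batch are illustrative assumptions, not part of the starter code; in the project itself you would more likely use a Keras utility such as `ImageDataGenerator`.

```python
import numpy as np

def horizontal_flip(images):
    """Mirror each image left-to-right. `images` has shape (N, H, W, C)."""
    return images[:, :, ::-1, :]

def translate(images, dx, dy):
    """Shift images dx pixels right and dy pixels down, zero-filling the gap."""
    _, h, w, _ = images.shape
    out = np.zeros_like(images)
    # Source region that stays visible after the shift.
    src_x = slice(max(0, -dx), min(w, w - dx))
    src_y = slice(max(0, -dy), min(h, h - dy))
    # Destination region the visible pixels move to.
    dst_x = slice(max(0, dx), min(w, w + dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    out[:, dst_y, dst_x, :] = images[:, src_y, src_x, :]
    return out

# Hypothetical stand-in for a batch of images: 2 images, 4x4 pixels, 3 channels.
batch = np.arange(2 * 4 * 4 * 3, dtype=np.uint8).reshape(2, 4, 4, 3)
flipped = horizontal_flip(batch)        # mirrored copy of the batch
shifted = translate(batch, dx=1, dy=0)  # batch moved one pixel to the right
```

Both functions return arrays with the same shape as the input, so the augmented images can be concatenated with the originals before training.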

# Task 2
Build a neural network model, train it on the features, and report the accuracy.
Report your observations on the time taken on CPU and GPU (with and without the CuDNN kernel). 



1.   Create a CNN-based model with 4 hidden layers with 64, 128, 256 and 512 units in each successive layer. Use a 5x5 convolution kernel and change as necessary. (Use at least 2 augmentations on your input.) 
2.   Create an LSTM-based model with 1 LSTM layer with 256 units. 
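
As a rough sanity check on the two architectures above, the sketch below counts trainable parameters using the standard formulas for a 2D convolution and an LSTM layer. It assumes the four hidden layers are convolutional layers with 64, 128, 256 and 512 filters, a 5x5 kernel throughout, a 3-channel input, biases enabled, and an LSTM input of 3 features per timestep; your own model may differ. The helper names are hypothetical.

```python
def conv2d_params(kernel, in_channels, out_channels):
    """Weights (kernel * kernel * in * out) plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

def lstm_params(units, input_dim):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (units + input_dim) + units)

# Assumed layer widths: RGB input followed by the four convolutional layers.
channels = [3, 64, 128, 256, 512]
conv_counts = [conv2d_params(5, c_in, c_out)
               for c_in, c_out in zip(channels[:-1], channels[1:])]
conv_total = sum(conv_counts)

# Assumed LSTM setup: 256 units, 3 features (R, G, B) per timestep.
lstm_total = lstm_params(256, 3)

print("Conv layer parameters:", conv_counts)  # [4864, 204928, 819456, 3277312]
print("Conv total:", conv_total)              # 4306560
print("LSTM parameters:", lstm_total)         # 266240
```

These counts cover only the convolutional and LSTM layers; pooling, batch normalization, and dense layers add their own parameters.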



In [None]:
from __future__ import absolute_import, division, print_function, unicode_literals

import collections
import matplotlib.pyplot as plt
import numpy as np

try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass
import tensorflow as tf
import time
# Import from tensorflow.keras so it matches the tf.keras layers used below.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers

In [None]:
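batch_size = 64    # number of examples per training batch
input_dim = 3      # features per timestep: the R, G, B values of one pixel
units = 256        # size of the LSTM hidden state
output_size = 10   # CIFAR-10 has 10 classes

def build_model(allow_cudnn_kernel=True):
  if allow_cudnn_kernel:
    # An LSTM layer with default arguments can use the CuDNN kernel on a GPU.
    lstm_layer = tf.keras.layers.LSTM(units, input_shape=(None, input_dim))
  else:
    # Wrapping an LSTMCell in a generic RNN layer does not use the CuDNN kernel.
    lstm_layer = tf.keras.layers.RNN(
        tf.keras.layers.LSTMCell(units),
        input_shape=(None, input_dim))
  model = tf.keras.models.Sequential([
      lstm_layer,                            # returns the final hidden state
      tf.keras.layers.BatchNormalization(),  # normalizes the LSTM output
      tf.keras.layers.Dense(output_size, activation='softmax')]  # class probabilities
  )
  return model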

In [None]:
# Import from tensorflow.keras so it matches the tf.keras model code.
from tensorflow.keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [None]:
x_train.shape

(50000, 32, 32, 3)

In [None]:
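# Flatten each 32x32 image into a sequence of 1024 pixels with 3 channels each.
x_train = x_train.reshape(50000, 1024, 3)
# Scale pixel values from [0, 255] to [0, 1].
x_train = x_train / 255.0

# Apply the same reshaping and scaling to the test set.
x_test = x_test.reshape(10000, 1024, 3)
x_test = x_test / 255.0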
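
The reshape above turns every image into a sequence the LSTM can read: 1024 timesteps (one per pixel, in row-major order) with 3 features each (the R, G, B values). The small NumPy sketch below checks that mapping on a synthetic array; the array `image` and the chosen `row` and `col` are illustrative assumptions, not CIFAR-10 data.

```python
import numpy as np

# Synthetic stand-in for one 32x32 RGB image, filled with distinct values.
image = np.arange(32 * 32 * 3).reshape(1, 32, 32, 3)

# Same reshape as the cell above, applied to a single image.
sequence = image.reshape(1, 1024, 3)

# In row-major order, the pixel at (row, col) becomes timestep row * 32 + col.
row, col = 5, 7
timestep = row * 32 + col

print(sequence.shape)                                            # (1, 1024, 3)
print(np.array_equal(sequence[0, timestep], image[0, row, col]))  # True
```

One consequence is that vertically adjacent pixels end up 32 timesteps apart in the sequence, which is worth keeping in mind when you compare the LSTM's accuracy with the CNN's.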

In [None]:
x_train.shape

(50000, 1024, 3)

# With CuDNN, on CPU

In [None]:
model = build_model(allow_cudnn_kernel=True)

model.compile(loss='sparse_categorical_crossentropy', 
              optimizer='sgd',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          validation_data=(x_test, y_test),
          verbose=1, steps_per_epoch = 100,
          batch_size=batch_size,
          epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5

In [None]:
print("CPU with CuDNN LSTM test accuracy")
# Evaluate the model trained in the cell above (slow_model is not defined until Task 3).
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
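
Task 2 asks for the time taken on each device, but the training cell above does not record it. One way to collect comparable timings is a small context manager like the sketch below. The name `timed`, the `timings` dictionary, and the dummy workload are assumptions for illustration; in the notebook you would wrap the `model.fit(...)` call instead.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()  # high-resolution timer
    try:
        yield
    finally:
        # Record the elapsed time even if the block raises an exception.
        results[label] = time.perf_counter() - start

timings = {}  # maps a run label to its duration in seconds

# Dummy workload standing in for model.fit(...).
with timed("dummy workload", timings):
    total = sum(i * i for i in range(100_000))

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f}s")
```

Using one dictionary for every run (CPU, GPU with CuDNN, GPU without CuDNN, TPU) keeps the numbers together for the final report.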

# Task 3
Run the LSTM solution from Task 2 on a TPU and report the performance. 

Without CuDNN 

In [None]:
tf.debugging.set_log_device_placement(True)  # log the device each op runs on
try:
  # Locate the TPU attached to this Colab runtime.
  resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
  print('Running on TPU ', resolver.cluster_spec().as_dict()['worker'])
except ValueError:
  # No TPU was found, so stop with an ordinary, catchable error.
  raise RuntimeError('ERROR: Not connected to a TPU runtime')

tf.config.experimental_connect_to_cluster(resolver)  # connect to the TPU worker
tf.tpu.experimental.initialize_tpu_system(resolver)  # initialize the TPU devices
strategy = tf.distribute.experimental.TPUStrategy(resolver)  # replicate across TPU cores

start = time.time()  # start the wall-clock timer
with strategy.scope():
  # Create the model inside the scope so its variables are placed on the TPU.
  slow_model = build_model(allow_cudnn_kernel=False)
  slow_model.compile(loss='sparse_categorical_crossentropy',
                     optimizer='sgd',
                     metrics=['accuracy'])
  slow_model.fit(x_train, y_train,
                 validation_data=(x_test, y_test),
                 verbose=1, steps_per_epoch=100,
                 batch_size=batch_size,
                 epochs=5)
stop = time.time()  # stop the timer
print(f"Training time: {stop - start}s")

Running on TPU  ['10.114.115.154:8470']
INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Initializing the TPU system: grpc://10.114.115.154:8470
Executing op __inference__tpu_init_fn_12059 on task /job:worker/replica:0/task:0/device:TPU_SYSTEM:0
INFO:tensorflow:Finished initializing TPU system.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:0
Executing op VarHandleOp on task /job:worker/replica:0/task:0/device:TPU:0
Executing op AssignVariableOp on task /job:worker/replica:0/task:0/device:TPU:0
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:1
Executing op VarHandleOp on task /job:worker/replica:0/task:0/device:TPU:1
Executing op AssignVariableOp on task /job:worker/replica:0/task:0/device:TPU:1
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:2
Executing op VarHandleOp on task /job:worker/replica:0/task:0/device:TPU:2
Executing op AssignVariableOp on task /job:worker/replica:0/task:0/device:TPU:2
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:3
Executing op VarHandleOp on task /job:worker/replica:0/task:0/device:TPU:3
Executing op AssignVariableOp on task /job:worker/replica:0/task:0/device:TPU:3
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:4
Execu

In [None]:
print("TPU without CuDNN LSTM test accuracy")
scores = slow_model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
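
When interpreting the TPU timings, it helps to work out how much data each run actually processes. The sketch below does the arithmetic for the settings used in the cells above (`batch_size = 64`, `steps_per_epoch = 100`, 8 TPU cores, 50,000 training images). It assumes that Keras treats `batch_size` as the global batch and splits it evenly across replicas, which is the documented behavior of `model.fit` under a distribution strategy; the variable names are illustrative.

```python
batch_size = 64          # global batch size passed to model.fit
steps_per_epoch = 100    # batches drawn per epoch
num_replicas = 8         # TPU cores reported in the log
train_examples = 50_000  # size of the CIFAR-10 training set

# Each TPU core receives an equal slice of every global batch.
per_replica_batch = batch_size // num_replicas

# Number of training images seen in one epoch with the step limit above.
examples_per_epoch = batch_size * steps_per_epoch

# Fraction of the training set covered in one epoch.
coverage = examples_per_epoch / train_examples

print("Per-replica batch size:", per_replica_batch)  # 8
print("Examples per epoch:", examples_per_epoch)     # 6400
print(f"Training set covered per epoch: {coverage:.1%}")  # 12.8%
```

With only 8 examples per core and 6,400 images per epoch, the TPU is lightly loaded, so a larger `batch_size` may be needed before its speed advantage shows up in the timings.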

With CuDNN, on TPU

In [None]:
tf.debugging.set_log_device_placement(True)  # log the device each op runs on

try:
  # Locate the TPU attached to this Colab runtime.
  resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
  print('Running on TPU ', resolver.cluster_spec().as_dict()['worker'])
except ValueError:
  # No TPU was found, so stop with an ordinary, catchable error.
  raise RuntimeError('ERROR: Not connected to a TPU runtime')

tf.config.experimental_connect_to_cluster(resolver)  # connect to the TPU worker
tf.tpu.experimental.initialize_tpu_system(resolver)  # initialize the TPU devices
strategy = tf.distribute.experimental.TPUStrategy(resolver)  # replicate across TPU cores

start = time.time()  # start the wall-clock timer
with strategy.scope():
  # Create the model inside the scope so its variables are placed on the TPU.
  slow_model = build_model(allow_cudnn_kernel=True)
  slow_model.compile(loss='sparse_categorical_crossentropy',
                     optimizer='sgd',
                     metrics=['accuracy'])
  slow_model.fit(x_train, y_train,
                 validation_data=(x_test, y_test),
                 verbose=1, steps_per_epoch=100,
                 batch_size=batch_size,
                 epochs=5)
stop = time.time()  # stop the timer
print(f"Training time: {stop - start}s")

Running on TPU  ['10.114.115.154:8470']
INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Initializing the TPU system: grpc://10.114.115.154:8470
Executing op __inference__tpu_init_fn_24404 on task /job:worker/replica:0/task:0/device:TPU_SYSTEM:0
INFO:tensorflow:Finished initializing TPU system.
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:0
Executing op VarHandleOp on task /job:worker/replica:0/task:0/device:TPU:0
Executing op AssignVariableOp on task /job:worker/replica:0/task:0/device:TPU:0
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:1
Executing op VarHandleOp on task /job:worker/replica:0/task:0/device:TPU:1
Executing op AssignVariableOp on task /job:worker/replica:0/task:0/device:TPU:1
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:2
Executing op VarHandleOp on task /job:worker/replica:0/task:0/device:TPU:2
Executing op AssignVariableOp on task /job:worker/replica:0/task:0/device:TPU:2
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:3
Executing op VarHandleOp on task /job:worker/replica:0/task:0/device:TPU:3
Executing op AssignVariableOp on task /job:worker/replica:0/task:0/device:TPU:3
Executing op _EagerConst on task /job:worker/replica:0/task:0/device:TPU:4
Execu

In [None]:
print("TPU with CuDNN LSTM test accuracy")
scores = slow_model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])