## <p style="text-align: center;">MIS 284N - Big Data and Distributed Programming</p>
## <p style="text-align: center;">Project 3 - Machine Learning using Tensorflow and Google Colab</p>
## <p style="text-align: center;">Total points: 100</p>
## <p style="text-align: center;">Due: Sunday, October 20th submitted via Canvas by 11:59 pm</p>

Your homework should be written in a **Jupyter notebook**. You may work in groups of two if you wish. Only one student per team needs to submit the assignment on Canvas, but be sure to include the name and UTID of both students.

Also, please make sure your code runs and that the graphics (and any other output) are displayed in your notebook before submitting (use `%matplotlib inline`).

This project gives you exposure to TensorFlow, its usage, and cloud services, and helps you understand how long a computation takes on a CPU versus a GPU. 

In this project, we will work with the CIFAR-10 image dataset. 
The starter code to download the dataset using Keras is given below. 
You should run this project on Google Colab, using both CPU and GPU runtimes.
Use TensorFlow version 2.0. 

# In every line of code, please write a comment to briefly explain what that line is doing.
Your grades will be based on your understanding of the code you write! 

Note: The code you write should be your own!

# Task 1
Convert the features into a form that can be given as input to TensorFlow library functions.

In this task you will perform data augmentation, that is, pre-process the data to make the model more robust. The most common data augmentation techniques are rotation, flips, and histogram equalization. 
You may use any augmentation technique you like. 
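As a rough illustration of two of the techniques named above, the sketch below applies a horizontal flip and a per-channel histogram equalization using only NumPy. The helper names `flip_horizontal` and `equalize_histogram` are hypothetical, and the random batch merely stands in for CIFAR-10 images of shape (N, 32, 32, 3) with dtype `uint8`; this is one possible approach, not the required one.

```python
import numpy as np


def flip_horizontal(images):
    """Mirror every image left to right; images has shape (N, H, W, C)."""
    return images[:, :, ::-1, :]


def equalize_histogram(image):
    """Equalize each channel of a single uint8 image of shape (H, W, C)."""
    out = np.empty_like(image)
    for c in range(image.shape[-1]):
        channel = image[..., c]
        # Count how often each of the 256 intensity values occurs
        hist = np.bincount(channel.ravel(), minlength=256)
        # Cumulative distribution of intensities
        cdf = hist.cumsum()
        cdf_min = cdf[cdf > 0][0]
        denom = cdf[-1] - cdf_min
        if denom == 0:
            # A constant channel cannot be equalized; keep it unchanged
            out[..., c] = channel
            continue
        # Lookup table that spreads the occupied intensities over 0..255
        lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
        out[..., c] = lut[channel]
    return out


# Random stand-in for a small batch of CIFAR-10 images (assumption)
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

flipped = flip_horizontal(batch)
equalized = np.stack([equalize_histogram(img) for img in batch])
print(flipped.shape, equalized.shape)
```

Flips and shifts can also be delegated to `ImageDataGenerator`, as the starter code does; histogram equalization is not one of its built-in options, so it would have to be applied separately.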

In [10]:
pip install keras

Note: you may need to restart the kernel to use updated packages.


In [11]:
from __future__ import print_function
import time
import tensorflow as tf
import keras
from keras.preprocessing.image import ImageDataGenerator

import matplotlib.pyplot as plt
import numpy as np


from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
import os
from keras import optimizers
from keras.optimizers import RMSprop
import pprint

In [12]:
from keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# datagen = ImageDataGenerator(width_shift_range=0.1, horizontal_flip=True)
# datagen.fit(x_train)

# Task 2
Build a neural network model, train it on the features, and report the accuracy.
Report your observations on the time taken on GPU and TPU. 
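One way to keep the timing measurements comparable across runtimes is to wrap each step in a small timing helper. The sketch below is a minimal example using only the standard library; the name `timed`, the `timings` dictionary, and the placeholder workload are all hypothetical, and in the notebook the body of the `with` block would be the actual training call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}

# Placeholder workload (assumption); replace with model.fit(...) in the notebook
with timed("placeholder workload", timings):
    total = sum(i * i for i in range(100000))

# Print every recorded duration in seconds
for label, seconds in timings.items():
    print("{}: {:.3f} s".format(label, seconds))
```

Collecting the durations in one dictionary makes it easy to report the runs for each runtime side by side at the end of the notebook.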

### 1. Create a CNN based model with 5 hidden layers and 100 hidden units each layer.

In [13]:
batch_size = 32
num_classes = 10
epochs = 10

#this is an empty sequential model
model = tf.keras.models.Sequential()

# This will do preprocessing and realtime data augmentation:
datagen = ImageDataGenerator(
    rotation_range=0,  # degree range for random rotations; 0 disables rotation
    # randomly shift images horizontally by up to this fraction of total width
    width_shift_range=0.1,
    # randomly shift images vertically by up to this fraction of total height
    height_shift_range=0.1,
    horizontal_flip=True,  # randomly flip images horizontally
    vertical_flip=False)  # do not flip images vertically

In [14]:
# Print the shape of the training data and the number of samples
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Converting class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
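The effect of `to_categorical` can be pictured with plain NumPy. CIFAR-10 labels arrive as an integer column of shape (N, 1), and the conversion turns each label into a row with a single 1 at the class index. The helper `one_hot` below is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) or (N, 1) into an (N, num_classes) matrix."""
    labels = np.asarray(labels).reshape(-1)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set exactly one entry per row: the column given by the label
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


# Three example labels in the same (N, 1) layout as y_train (values are made up)
sample_labels = np.array([[3], [0], [9]])
encoded = one_hot(sample_labels, num_classes=10)
print(encoded.shape)
```

This layout is what a 10-way softmax output trained with `categorical_crossentropy` expects.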


In [15]:
#### CNN ####
#5 hidden layers with 100 hidden units in each layer
def my_model():
    # Build a fresh model so that calling my_model() again does not stack layers onto an existing one
    model = tf.keras.models.Sequential()

    #Adding Layer 1 (only the first layer needs input_shape)
    model.add(tf.keras.layers.BatchNormalization(input_shape=x_train.shape[1:]))
    model.add(tf.keras.layers.Conv2D(100, (5, 5), padding='same', activation='elu'))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(tf.keras.layers.Dropout(0.25))

    #Adding Layer 2
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Conv2D(100, (5, 5), padding='same', activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(tf.keras.layers.Dropout(0.25))

    #Adding Layer 3
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Conv2D(100, (5, 5), padding='same', activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(tf.keras.layers.Dropout(0.25))

    #Adding Layer 4
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Conv2D(100, (5, 5), padding='same', activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(tf.keras.layers.Dropout(0.25))

    #Adding Layer 5
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Conv2D(100, (5, 5), padding='same', activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
    model.add(tf.keras.layers.Dropout(0.25))

    #Flattening the feature maps and feeding them to the dense classifier
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(256))
    model.add(tf.keras.layers.Activation('elu'))
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Dense(10))
    model.add(tf.keras.layers.Activation('softmax'))
    return model

In [16]:
start = time.time() #start measuring time
model = my_model()
model.summary() #show summary of model
end = time.time() #end measuring time
print(end - start) #printing the time taken to build the model and show its summary

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
batch_normalization (BatchNo (None, 32, 32, 3)         12        
_________________________________________________________________
conv2d (Conv2D)              (None, 32, 32, 100)       7600      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 16, 16, 100)       0         
_________________________________________________________________
dropout (Dropout)            (None, 16, 16, 100)       0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 16, 16, 100)       400       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 100)       250100    
__
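The parameter counts in the summary above can be checked by hand. A `Conv2D` layer has one weight per kernel position, input channel, and filter, plus one bias per filter; a `BatchNormalization` layer keeps four values per channel (gamma, beta, moving mean, moving variance). The sketch below reproduces the figures shown and tracks how the spatial size shrinks through the five blocks; the helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights (kernel_h * kernel_w * in_channels * filters) plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def batchnorm_params(channels):
    """Gamma, beta, moving mean, and moving variance for every channel."""
    return 4 * channels


size, channels = 32, 3  # CIFAR-10 images are 32x32 with 3 color channels
for block in range(1, 6):
    bn = batchnorm_params(channels)
    conv = conv2d_params(5, 5, channels, 100)
    size //= 2       # 'same' padding keeps the size; 2x2 max pooling halves it
    channels = 100   # every convolution outputs 100 feature maps
    print("block {}: batchnorm={} conv={} output={}x{}x{}".format(
        block, bn, conv, size, size, channels))
```

After the fifth pooling step the feature map is 1x1x100, so the `Flatten` layer hands 100 values to the dense classifier.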

In [17]:
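#optimizer: RMSprop with a small learning rate and learning-rate decay
opt = tf.keras.optimizers.RMSprop(learning_rate=0.0001, decay=1e-6)

#compile the model with the configured optimizer object (the string 'RMSprop' would use default settings and ignore opt)
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])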


In [None]:
start = time.time() #start measuring time
# Compute dataset statistics for the generator (only needed if feature-wise normalization or ZCA whitening is enabled)
datagen.fit(x_train)

#Fit the model on the batches generated by datagen.flow().
model.fit_generator(datagen.flow(x_train, y_train, 
                    batch_size=batch_size), 
                    epochs=epochs, 
                    validation_data=(x_test, y_test), 
                    workers=4,
                    use_multiprocessing=True)
end = time.time() #end measuring time
print(end - start) #printing the time taken on GPU 

Epoch 1/10


#### CNN - Time taken on GPU

It took 536.1103231906891 seconds to run on GPU.

#### CNN - Time taken on TPU

### 2. Create an LSTM based model with 2 hidden layers and 1024 hidden units in each layer.

In [None]:
import numpy as np
from keras.layers import Dense, Embedding
from keras.layers import LSTM

# Flatten each 32x32x3 image into a single time step of 3072 features
x_train_flattened = x_train.reshape(x_train.shape[0], 1, 3072)
x_test_flattened = x_test.reshape(x_test.shape[0], 1, 3072)

lstm_model = keras.models.Sequential()

# Add first LSTM layer with 1024 hidden units
lstm_model.add(keras.layers.LSTM(1024, input_shape=(1, 3072), dropout=0.2, recurrent_dropout=0.2, return_sequences=True))

# Add second LSTM layer with 1024 hidden units
lstm_model.add(keras.layers.LSTM(1024, dropout=0.2, recurrent_dropout=0.2))

# Output layer: softmax yields one probability per class for the 10 mutually exclusive classes
lstm_model.add(keras.layers.Dense(10, activation='softmax'))

# Categorical cross-entropy matches the one-hot labels
# Try different optimizers and optimizer configs
lstm_model.compile(loss='categorical_crossentropy',
             optimizer='adam',
             metrics=['accuracy'])
lstm_model.summary()

In [None]:
lstm_model.fit(x_train_flattened, y_train,
         batch_size=10,
         epochs=1,
         validation_data=(x_test_flattened, y_test))
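The reshape used above feeds each image to the LSTM as a sequence of length 1 with 3072 features, so the recurrent connections never see more than one time step. One alternative, offered here as an assumption rather than a requirement of the assignment, is to treat each of the 32 image rows as a time step with 32 × 3 = 96 features. The sketch compares the two layouts on a small random array standing in for `x_train`.

```python
import numpy as np

# Random stand-in for a few CIFAR-10 images (assumption)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

# Layout used in the notebook: one time step holding the whole image
single_step = images.reshape(len(images), 1, 32 * 32 * 3)

# Alternative layout: 32 time steps, one per image row, 96 features each
row_steps = images.reshape(len(images), 32, 32 * 3)

print(single_step.shape)  # sequence length 1, 3072 features
print(row_steps.shape)    # sequence length 32, 96 features
```

With the row-wise layout, the first LSTM layer would take `input_shape=(32, 96)` instead of `(1, 3072)`.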

#### LSTM - Time Taken on GPU

#### LSTM - Time Taken on TPU

# Task 3 (Extra credit, 25 points)
Run the above on a TPU and report the time taken to fit the models. 