In [None]:
# execute this cell before you start

import tensorflow as tf
from tensorflow.keras import layers

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras

print(tf.__version__)  # tf.VERSION was removed in TensorFlow 2.x; tf.__version__ works in 1.x and 2.x
print(tf.keras.__version__)


# CA3
## due on 22/03/2019

To submit the assignment, do the following:

- do `Cell -> All output -> Clear` to clear all your output
- save the notebook (CA3.ipynb)

# The CIFAR-10 dataset

Consider the data in `keras.datasets.cifar10` and train a network that reliably categorizes it. You can get some inspiration from the following worked example:

https://keras.io/examples/cifar10_cnn/

Try to understand tradeoffs:

- What increases computing time?
- What increases accuracy?

#### Load the data and create the class labels:

In [None]:
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

# Class labels from https://www.cs.toronto.edu/~kriz/cifar.html
class_labels = [    
    'airplane', 
    'automobile',
    'bird',
    'cat',
    'deer',
    'dog',
    'frog',
    'horse',
    'ship',
    'truck'
]

#### Get an overview of the data:

In [None]:
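# Shape of a single image: (height, width, channels)
print(x_train.shape[1:])

# Show the first 25 training images with their class labels
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5, i+1)
    plt.imshow(x_train[i])
    # y_train has shape (n, 1), so take the single label in each row
    plt.title(class_labels[y_train[i][0]])
    plt.axis('off')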


#### Normalize the images:

In [None]:
x_train = x_train/255.0
x_test = x_test/255.0


#### Build and fit the model:

In [None]:
%%time

model = keras.models.Sequential()

model.add(keras.layers.Conv2D(filters=32, kernel_size=(3,3), strides=(2,2),
                              padding='same', input_shape=x_train.shape[1:],
                             activation=tf.nn.relu))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Dropout(0.2))

# Using strides=2 instead of strides=1 in the convolutional layers
# significantly decreases computing time (from about 20 minutes to about 5),
# with an acceptable decrease in accuracy of about 3% (from 75% to 72%).
# input_shape is only needed on the first layer, so it is omitted here.
model.add(keras.layers.Conv2D(filters=64, kernel_size=(3,3), strides=(2,2),
                              padding='same', activation=tf.nn.relu))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Dropout(0.2))

model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(512, activation=tf.nn.relu))
model.add(keras.layers.Dense(len(class_labels), activation=tf.nn.softmax))

model.compile(optimizer='adam',loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Fit the model, with 30% of the data as validation data
fit_result = model.fit(x_train, y_train, epochs=25, validation_split=0.3)
history = fit_result.history


#### Plot Epochs vs Accuracy

In [None]:
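# The history key is 'acc' in TensorFlow 1.x and 'accuracy' in TensorFlow 2.x
acc_key = 'acc' if 'acc' in history else 'accuracy'
plt.plot(fit_result.epoch, history[acc_key], 'b', label='Training acc')
plt.plot(fit_result.epoch, history['val_' + acc_key], 'r', label='Validation acc')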
plt.title('Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

#### Plot Epochs vs Loss

In [None]:
plt.plot(fit_result.epoch, history['loss'], 'b', label='Training loss')
plt.plot(fit_result.epoch, history['val_loss'], 'r', label='Validation loss')
plt.title('Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()

#### Plot evaluation:
The training accuracy increases throughout, whereas the validation accuracy fluctuates and more or less stagnates after about 10 epochs. The loss behaves similarly: the training loss keeps decreasing, while the validation loss stops improving after about 10 epochs.
This widening gap between training and validation performance indicates that the model is overfitting the training data.
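
One common response to this kind of curve is to stop training once the validation loss has not improved for a few epochs. The sketch below shows that rule in plain Python. The function name `early_stopping_epoch`, the `patience` value, and the loss values are made up for illustration and are not results from this notebook. Keras offers a comparable rule through the `keras.callbacks.EarlyStopping` callback, which was not used here.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    Training stops at the first epoch that is `patience` epochs past the
    best (lowest) validation loss seen so far. Epochs are counted from 0.
    """
    best_loss = float('inf')
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember where it happened
            best_loss = loss
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here
            return epoch, best_epoch
    # Never triggered: training ran through all epochs
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses (illustrative only, not measured)
example_val_losses = [1.5, 1.2, 1.0, 0.95, 0.97, 0.99, 1.02, 1.1]
stop_epoch, best_epoch = early_stopping_epoch(example_val_losses, patience=3)
print('stop at epoch', stop_epoch, '- best epoch was', best_epoch)
```

With the real training run, the same function could be applied to `history['val_loss']` to see how many of the 25 epochs were actually useful.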

#### Evaluate the test data:

In [None]:
predictions = model.predict(x_test)

#### Convert the test labels and predictions to comparable shapes:

In [None]:
y_test2 = y_test.squeeze()
# Pick the class with the highest predicted probability for each image
predictions = np.argmax(predictions, axis=1)

In [None]:
# Fraction of test images whose predicted class matches the true label
test_accuracy = np.mean(y_test2 == predictions)
print('Test Accuracy:', test_accuracy)
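
A single overall accuracy can hide classes that the network handles poorly. As a possible follow-up, the sketch below breaks the accuracy down per class. The helper name `per_class_accuracy` and the small demo arrays are assumptions made for illustration and are not outputs of the model above.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred, labels):
    """Return {label_name: accuracy} for every class that occurs in y_true."""
    total = Counter()    # number of samples per true class
    correct = Counter()  # number of correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        t, p = int(t), int(p)
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {labels[c]: correct[c] / total[c] for c in sorted(total)}


# Tiny made-up example with three classes (illustrative only)
demo_labels = ['cat', 'dog', 'ship']
demo_true = [0, 0, 1, 1, 2, 2, 2, 2]
demo_pred = [0, 1, 1, 1, 2, 2, 0, 2]
demo_result = per_class_accuracy(demo_true, demo_pred, demo_labels)
print(demo_result)
```

In this notebook the call would be `per_class_accuracy(y_test2, predictions, class_labels)`, since the helper only iterates over its inputs and so accepts NumPy arrays as well as lists.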

##### What increases computing time?
In our model, we used a stride length of 2 for the convolutional layers. A stride length of 1 improves the accuracy, but it significantly increases the computing time, from the current ~5 minutes to ~20 minutes.
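
To see why the stride matters so much, we can roughly count the multiply-accumulate operations (MACs) in one forward pass of the network above for both stride settings. This is only a back-of-the-envelope sketch: the helper names are made up, and the count ignores biases, pooling, dropout and the backward pass, so it will not match the measured training times exactly. It assumes the Keras conventions that `padding='same'` gives an output size of ceil(input / stride) and that `MaxPooling2D` with its default settings halves the size, rounding down.

```python
import math


def conv_same(size, stride):
    # 'same' padding: output size is ceil(input / stride)
    return math.ceil(size / stride)


def pool_valid(size, pool=2):
    # 2x2 max pooling with default 'valid' padding: floor(input / 2)
    return size // pool


def estimate_macs(conv_stride, size=32, in_ch=3, n_classes=10):
    """Rough MAC count for one image through the two-conv-layer model."""
    macs = 0
    for filters in (32, 64):
        size = conv_same(size, conv_stride)
        # Each output value of a 3x3 convolution needs 3*3*in_ch MACs
        macs += size * size * filters * (3 * 3 * in_ch)
        size = pool_valid(size)
        in_ch = filters
    flat = size * size * in_ch   # length of the flattened feature vector
    macs += flat * 512           # Dense(512)
    macs += 512 * n_classes      # softmax output layer
    return macs


macs_stride2 = estimate_macs(conv_stride=2)
macs_stride1 = estimate_macs(conv_stride=1)
print('strides=2:', macs_stride2)
print('strides=1:', macs_stride1)
print('ratio:', round(macs_stride1 / macs_stride2, 1))
```

Under these assumptions the stride-1 variant needs roughly ten times more arithmetic per image, mostly because the second convolution and the first dense layer see much larger feature maps.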

##### What increases accuracy?
To improve accuracy, we could add more convolutional layers, though this would increase computing time. Setting strides=1 for the convolutional layers also increases accuracy by about 3% (up to 75% from the current 72%). We could also experiment with a different loss function.
Training for more epochs could give marginally better accuracy as well, but since the validation accuracy already stagnates after about 10 epochs, the gain is likely to be small and comes at the expense of time.