**Note: Please make your own copy of this notebook to run and execute, thank you!**

1.   Go to the menu tab on the top left corner
2.   Click on "File"
3.   Under the File tab menu click on "Save a copy in Drive..."

## Tip for running multiple code cells
A useful tool when running code in Colaboratory is the **Runtime tab**. Clicking on this tab opens a menu with options for running several code cells in one go. For example, "Run before" runs all the code cells before the currently selected cell, in order, starting with the first. This is particularly helpful if you run into an error while editing your code and want to ensure all the variables and data have been initialized properly prior to the cell you're working on.

# Load libraries and dataset

**Documentation:**

[Python 3 Documentation](https://docs.python.org/3/tutorial/index.html)

[Numpy Documentation](https://docs.scipy.org/doc/numpy/user/quickstart.html)

[Keras Documentation](https://keras.io/)



In [0]:
### DO NOT MODIFY ###
# Import Numpy, Matplotlib, and Keras Data Science libraries to perform most of the heavy lifting for us
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
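# Use the Keras bundled with TensorFlow so every layer and utility comes from the same package
from tensorflow import keras
from tensorflow.keras.models import load_model
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical  # print_summary is unused in this notebook, so it is not imported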
from google.colab import files

**[CIFAR-10 Dataset](https://www.cs.toronto.edu/~kriz/cifar.html):**

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. 

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class. 

Split our data into features, labels, training, and testing data.

In [0]:
### DO NOT MODIFY ###
# Load the CIFAR-10 dataset from Keras
# Split dataset into training, testing, features, and labels
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Exploratory Analysis

Now let's take a look at the size, shape, and dimensions of our initial data.

In [0]:
# Display the size and shape of features, labels, training, and testing datasets
print("Size and shape of the training features are: {}".format(x_train.shape))
print("Size and shape of the training labels are: {}".format(y_train.shape))
print("Size and shape of the testing features are: {}".format(x_test.shape))
print("Size and shape of the testing labels are: {}".format(y_test.shape))

In case you are interested in looking at individual images, run the snippet of code below. You can change the index from 10 to another number within the training data to see which image will be fed to our network.

In [0]:
# This selects the image at index 10 (the 11th image, since indexing starts at 0) from the training dataset.
index = 10 # change this value to see another image
image = x_train[index]

### DO NOT MODIFY ###
# Display the image.
plt.figure(figsize=(3,3)) # Initialize the size of the plot frame
plt.imshow(image); plt.grid(False); plt.axis('off') # Feed image values to plot, hide the grid and axes
plt.show() # Generate plot onto screen

# Data Preprocessing

We'll take care of the image preprocessing, but take a look if you are interested in how we manage the data before we feed it into our model.

In [0]:
### DO NOT MODIFY ###
# Specify the number of class labels in our data
num_classes = 10

# Keep original labels
labels_test = y_test

# Labels are stored as unique integers
# Convert labels into one-hot encodings of length num_classes
# Each label becomes a row of zeros with a single one in the column for its class
# example: 1000000000 or 0001000000, etc. (don't worry if you are not familiar with this step)
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
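
If you are curious what one-hot encoding does under the hood, here is a minimal NumPy sketch. The helper `one_hot` and the `example_*` variables are our own illustrative names, not part of Keras, and the three labels are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (n_samples, num_classes) array with a single 1 per row."""
    labels = np.asarray(labels).reshape(-1)          # accept shape (n,) or (n, 1)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0    # set the column matching each label
    return encoded

example_labels = np.array([[0], [3], [9]])           # same (n, 1) layout as the CIFAR-10 labels
example_encoded = one_hot(example_labels, 10)
print(example_encoded)
```

Each row contains exactly one 1, and its column position is the original integer label.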

In [0]:
### DO NOT MODIFY ###
# Our input data is pixels from images, where each pixel value lies in the range 0-255
# Such a wide range of values makes it difficult for our network to learn
# Convert raw pixel values into values between 0-1 by dividing by the maximum value, 255
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255.0
x_test /= 255.0
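
Dividing by 255, the largest value an 8-bit pixel can take, maps the range 0-255 onto 0-1. The small sketch below checks this on a made-up array; `raw_pixels` and `scaled_pixels` are illustrative names and do not touch the real dataset.

```python
import numpy as np

# Hypothetical stand-in for raw image data: 8-bit integers between 0 and 255
raw_pixels = np.array([[0, 128, 255]], dtype="uint8")

# Cast to float first, then scale by the maximum possible pixel value
scaled_pixels = raw_pixels.astype("float32") / 255.0
print(scaled_pixels.min(), scaled_pixels.max())
```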

# Model Building

Alright, now it's your turn! We went ahead and created the outline of what you will need to do to construct your model and get it up and running. If you have not heard of some of the terms used in this notebook, such as convolutions, maxpooling, or softmax, don't worry: you are not expected to know what these functions do yet!

For now, we have constructed the ConvMaxLayer and ForwardLayer wrappers so you can focus on constructing your model. Take a look at the comments provided, along with the MNIST mini-project demo, to see how such a model might be constructed. If the comments and demo are still not clear, please refer to the documentation listed above to learn more, or ask for some help!

In [0]:
### DO NOT MODIFY ###
# Feed-Forward Neural Network Layer (dense -> batch normalization -> ReLU -> dropout)
class ForwardLayer(tf.keras.Model):
  
    # Initialize our variables
    def __init__(self, size, dropout_rate):
        super(ForwardLayer, self).__init__()
        self.size = size
        self.dropout_rate = dropout_rate
        
    # Define our layers
    def build(self, input_shape):
        self.dense = tf.keras.layers.Dense(self.size, input_shape=input_shape)
        self.batchnorm = tf.keras.layers.BatchNormalization()
        self.dropout = tf.keras.layers.Dropout(self.dropout_rate)
    
    # Computation of network layers
    def call(self, inputs):
        x = self.dense(inputs)
        x = self.batchnorm(x)
        x = tf.nn.relu(x)
        x = self.dropout(x)
        return x
      
    # Define output shape
    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.size)
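
To get a feel for what the layer above computes, here is a rough NumPy sketch of the same four steps in training mode. It is an approximation for illustration only: the function `forward_layer_sketch` and the `demo_*` arrays are hypothetical, the batch normalization step uses a fixed scale of 1 and shift of 0, and the real Keras layers also keep running statistics that are not modelled here.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_layer_sketch(inputs, weights, bias, dropout_rate, eps=1e-3):
    """NumPy approximation of dense -> batch norm -> ReLU -> dropout (training mode)."""
    x = inputs @ weights + bias                               # dense: linear projection
    x = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)   # batch norm per feature
    x = np.maximum(x, 0.0)                                    # ReLU: clip negatives to zero
    keep = rng.random(x.shape) >= dropout_rate                # random mask of units to keep
    return x * keep / (1.0 - dropout_rate)                    # rescale the surviving units

demo_inputs = rng.normal(size=(4, 8))      # 4 samples with 8 features each
demo_weights = rng.normal(size=(8, 5))     # project onto a layer of size 5
demo_out = forward_layer_sketch(demo_inputs, demo_weights, np.zeros(5), dropout_rate=0.5)
print(demo_out.shape)
```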

In [0]:
# Initialize our network object
net = tf.keras.Sequential()
train = False # Freeze the Convolutional Blocks from being trained

# Convolutional Block One
net.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=1, padding='valid', input_shape=(32,32,3), trainable=train, name='conv_1'))
net.add(tf.keras.layers.BatchNormalization(trainable=train, name='bn_2'))
net.add(tf.keras.layers.MaxPooling2D(pool_size=2, strides=2, padding='valid', trainable=train, name='maxpl_3'))
net.add(tf.keras.layers.Dropout(0.3, trainable=train, name='drp_4'))
   
# Convolutional Block Two
net.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=1, padding='valid', trainable=train, name='conv_5'))
net.add(tf.keras.layers.BatchNormalization(trainable=train, name='bn_6'))
net.add(tf.keras.layers.MaxPooling2D(pool_size=2, strides=2, padding='valid', trainable=train, name='maxpl_7'))
net.add(tf.keras.layers.Dropout(0.3, trainable=train, name='drp_8'))

# Convolutional Block Three
net.add(tf.keras.layers.Conv2D(filters=64, kernel_size=3, strides=1, padding='valid', trainable=train, name='conv_9'))
net.add(tf.keras.layers.BatchNormalization(trainable=train, name='bn_10'))
net.add(tf.keras.layers.MaxPooling2D(pool_size=2, strides=2, padding='valid', trainable=train, name='maxpl_11'))
net.add(tf.keras.layers.Dropout(0.2, trainable=train, name='drp_12'))

# Flatten
net.add(tf.keras.layers.Flatten(name='flt_13'))
  
### TO DO: DEFINE THE DENSE PORTION OF THE NETWORK ###
# Create first feed-forward layer
# Add: a forward layer: ForwardLayer(size_of_layer, drop_out_prob)
  
# Add more feed-forward layers (Optional)
# Add: a forward layer: ForwardLayer(size_of_layer, drop_out_prob)

### DO NOT MODIFY ###
# Output Layer
net.add(tf.keras.layers.Dense(num_classes, activation='softmax'))
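
When choosing the size of your dense layers, it can help to know how many values the Flatten layer hands over. The sketch below traces the image size by hand through the three convolutional blocks, assuming the settings used above (3x3 'valid' convolutions with stride 1, 2x2 'valid' max-pooling with stride 2). The helper names are our own.

```python
def conv_out(size, kernel=3, stride=1):
    """Output width/height of a 'valid' convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    """Output width/height of a 'valid' max-pooling layer."""
    return (size - pool) // stride + 1

side = 32                      # CIFAR-10 images are 32x32
trace = [side]
for _ in range(3):             # three conv + max-pool blocks
    side = pool_out(conv_out(side))
    trace.append(side)

last_filters = 64              # number of filters in conv_9
flattened_size = side * side * last_filters
print(trace, flattened_size)
```

Under these assumptions the spatial size shrinks from 32 to 15 to 6 to 2, so Flatten produces 2 x 2 x 64 = 256 values per image.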

Now that you have your model defined, we'll need to actually build it and define how it will learn. To do so, we'll call the network's compile function and provide it with the loss, optimization, and metric parameters.

In [0]:
### DO NOT MODIFY ###
# The compile function will build our model by using the defined loss, optimization, and metric parameters
net.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print('Model is done compiling!')
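
For the curious, the softmax output and the categorical cross-entropy loss named above can be written in a few lines of NumPy. This is a sketch of the standard formulas, not the Keras implementation, and the `demo_*` values are made up.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 per row."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # subtract the row max for stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the correct class."""
    y_prob = np.clip(y_prob, eps, 1.0)                      # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

demo_logits = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
demo_truth = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # one-hot labels
demo_probs = softmax(demo_logits)
demo_loss = categorical_crossentropy(demo_truth, demo_probs)
print(demo_probs.sum(axis=1), demo_loss)
```

The loss is small when the model puts high probability on the correct class and grows as that probability shrinks.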

# Load Pre-Trained Weights

While we could train our model from the very start as we did with the MNIST project, this would take a long time for this dataset.

As promised in the MNIST mini-project, we'll show you how to load an h5 file containing pre-trained weights for the convolution layers into your model to speed things up. However, you will still need to train the dense layers you defined earlier before we can test and use the model.

To get access to the predefined weight file you can download it [here](https://drive.google.com/open?id=1ZPUGlYmdx57QfgcJMqpDxNfFWdpqTbe_).

In [0]:
# Load the pretrained_weights.h5 file
uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(name=fn, length=len(uploaded[fn])))

Now that we have the pre-trained weights file uploaded, we need to call the load_weights function to load the weights into our model. The weights are only for the convolution layers, so we will still need to train our dense layers.

In [0]:
# Load the weights for the pretrained convolutional layers (lets us focus on the feed-forward dense layers for faster training times)
net.load_weights('pretrained_weights.h5', by_name=True)
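
With `by_name=True`, Keras matches saved weights to layers by layer name rather than by position, which is why the convolutional layers above carry explicit names such as `conv_1`. The plain-Python sketch below only imitates that matching rule. The dictionary contents and the dense layer names are made up for illustration and are not read from the actual file.

```python
# Hypothetical layer names stored in a weights file (placeholder values, not real weights)
saved_weights = {"conv_1": "w1", "bn_2": "w2", "conv_5": "w5"}

# Hypothetical layer names in the model we are loading into
model_layer_names = ["conv_1", "bn_2", "conv_5", "forward_layer", "dense"]

# Layers whose name appears in the file get their weights restored
restored = [name for name in model_layer_names if name in saved_weights]
# Layers without a match keep their freshly initialized weights
untouched = [name for name in model_layer_names if name not in saved_weights]

print("restored:", restored)
print("left at initial values:", untouched)
```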

In [0]:
# View the model before we train it
net.summary()

# Train Model

Our next step is to define the batch size and number of epochs to train before we call the network's fit procedure with all our data and the training parameters listed below.

The batch size refers to how many samples (in this case, images) are processed by the model before the internal weights are updated. Naturally, this means the batch size will be anywhere between 1 (just one sample) and the size of the training set (all the samples in the training set). It is generally advantageous to set the batch size to a relatively small value such as 32 or 64, since that limits how much memory we use on each iteration. That is, we only load a handful of images at a time rather than all of them at once.

An "epoch" is one complete pass over the training data. The more epochs we run, the more the model is able to train and (hopefully) the lower the error rate will be. Of course, training takes time, so for this project we recommend starting with 2-3 epochs to see if you can get the model to an accuracy above 50% (random guessing is about 10%). Next, play around with the various model and training parameters (such as the number of epochs), or add more layers, to see if you can get the model above 70%.

In [0]:
### TO DO: TRAIN OUR NETWORK ###
# Use the fit function to give the network the training features, training labels, batch size, number of epochs to train, validation split size
net.fit(x_train, y_train, batch_size="Replace and define the batch size", epochs="Replace and define the number of epochs to train", validation_split=0.2, shuffle=True)
net.summary()
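
The batch size and validation split passed to fit determine how many weight updates happen in each epoch. The arithmetic below is a quick sketch using the CIFAR-10 training set size and the `validation_split=0.2` value from the cell above. The helper `updates_per_epoch` is our own name, not a Keras function.

```python
import math

num_training_images = 50000      # size of the CIFAR-10 training set
validation_split = 0.2           # fraction held out for validation

held_out = round(num_training_images * validation_split)
train_samples = num_training_images - held_out

def updates_per_epoch(batch_size):
    """Number of batches, and therefore weight updates, in one pass over the data."""
    return math.ceil(train_samples / batch_size)

for demo_batch_size in (32, 64, 256):
    print(demo_batch_size, updates_per_epoch(demo_batch_size))
```

Smaller batches mean more, noisier updates per epoch, while larger batches mean fewer updates but more memory per step.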

# Test Model Performance

Now that we have trained our model, here is the moment of truth. A model that only performs well on training data is not a very good model! Let's see how well the model does on testing data to get an idea of how it might perform out in the wild.

If you can get the model to a test accuracy above 70%, pat yourself on the back! Until just several years ago, results like this were extremely difficult for even top AI researchers to achieve!

In [0]:
### DO NOT MODIFY ###
# Use the network's evaluate function to see how well the model predicts the correct labels on the testing dataset
scores = net.evaluate(x_test, y_test, verbose=0)

# Report the final accuracy score
print('Test Loss:', scores[0])
print('Test accuracy:', scores[1])
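
The accuracy reported here is the fraction of images whose highest-scoring class matches the true label. The sketch below reproduces that calculation with NumPy on four made-up predictions; `toy_pred` and `toy_true` are illustrative and unrelated to the real test set.

```python
import numpy as np

# Hypothetical predicted probabilities for 4 images and 3 classes
toy_pred = np.array([[0.1, 0.7, 0.2],
                     [0.8, 0.1, 0.1],
                     [0.3, 0.3, 0.4],
                     [0.2, 0.5, 0.3]])

# Matching one-hot encoded true labels
toy_true = np.array([[0, 1, 0],
                     [1, 0, 0],
                     [0, 1, 0],
                     [0, 1, 0]])

predicted_classes = toy_pred.argmax(axis=1)   # most confident class per image
true_classes = toy_true.argmax(axis=1)        # position of the 1 in each one-hot row
toy_accuracy = float(np.mean(predicted_classes == true_classes))
print(toy_accuracy)
```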

# Applications

Alright, the moment has come. Let's see how well our neural network predicts each class of object!

In [0]:
### DO NOT MODIFY ###
# Create plot and parameters to layout our images and predictions
labels = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

num = 3
margin = 0.05
ind = np.arange(len(labels))
width = (1. - 2. * margin) / num

fig, ax = plt.subplots(nrows=num, ncols=2)
fig.tight_layout()
fig.suptitle('Image Predictions', fontsize=20, y=1.1)

# Loop through each image to plot and make a prediction
for i in range(num):
  image = x_test[i]
  label = labels[labels_test[i][0]]
  x = image.reshape(1,image.shape[0],image.shape[1],image.shape[2])
  pred = net.predict(x, batch_size=None, verbose=0, steps=None).flatten()

  # Display image and correct label
  ax[i][0].imshow(image.squeeze())
  ax[i][0].set_title("Correct Label: {}".format(label))
  ax[i][0].set_axis_off()
  
  # Display the predicted confidence in each label
  ax[i][1].barh(ind + margin, pred, width)
  ax[i][1].set_yticks(ind + margin)
  ax[i][1].set_yticklabels(labels)

# Save Model

In case you would like to continue working on your own model, make sure to save the weights and download the file to your local computer. For more information and tips on how to reload your model check out the documentation [here.](https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model)

In [0]:
# Saves your trained model for use later
net.save_weights('cifar_final_weights.h5') # Saves just the weights (like we did earlier)

In [0]:
# Download the files to local computer drive
files.download('cifar_final_weights.h5')

# Reflection and Analysis

1.  What was the accuracy of your final model? 

2.  Did the training, validation, and testing accuracy differ significantly at the end of training?

3. About how long did it take to train your final model?

4.  What factors did you consider when you built your model? 

5.  What parameters did you play around with and consider? Did you notice any significant changes when you changed these parameters? What was the overall effect on the model's performance?

6.  If you had more time, do you think you could have produced a better model? If so, what would you play around or experiment with in order to determine this? What factors might prevent your model from being able to accurately capture the data?

7. How do you think you can apply this model to everyday applications? How do you think it could help serve society?
