# Fall Problem Session 12
## CIFAR-10 II

In this notebook you will continue building neural networks to classify the images in the CIFAR-10 data set, <a href="https://www.cs.toronto.edu/~kriz/cifar.html">https://www.cs.toronto.edu/~kriz/cifar.html</a>. You will build convolutional neural networks (CNNs) and see whether they improve upon the feed forward networks from `Fall Problem Session 11`.

In particular, this material will touch on the following lecture notebooks:
- `Lectures/Neural Networks/2. The MNIST Data Set`,
- `Lectures/Neural Networks/3. Multilayer Neural Networks`,
- `Lectures/Neural Networks/4. keras`,
- `Lectures/Neural Networks/5. Introduction to Convolutional Neural Networks` and
- `Lectures/Neural Networks/7. Loading Pre-Trained Models`.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("whitegrid")

#### 1. Load and prepare the data

Load the `cifar10` data from `keras.datasets`, scale the pixel values and make a validation split.
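
A minimal sketch of the scaling step, assuming the pixel values arrive as unsigned 8-bit integers between $0$ and $255$, as they do for CIFAR-10. The array `X_fake` is synthetic stand-in data with the CIFAR-10 image shape, not the real data set.

```python
import numpy as np

# Synthetic stand-in for a batch of CIFAR-10 images (hypothetical data):
# 8 images, 32 x 32 pixels, 3 color channels, stored as uint8 in [0, 255].
rng = np.random.default_rng(0)
X_fake = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

# Convert to float before dividing so the scaled values land in [0, 1].
X_scaled = X_fake.astype("float32") / 255.0

print(X_scaled.shape, X_scaled.dtype)
print(X_scaled.min() >= 0.0, X_scaled.max() <= 1.0)
```

The same division can be applied to both the training and test arrays once they are loaded.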

In [None]:
## import cifar10 from keras.datasets



## import train_test_split


In [None]:
## Load the data


## scale the data



In [None]:
## make the validation set



#### 2. A first convolutional neural network

In this problem you will make your first CNN. 

##### a. 

First import everything you will need from `keras`.

In [None]:
## Import what you need from keras




##### b.

Try building a CNN with a single convolutional layer of depth $8$ using a $3\times 3$ filter, followed by a pooling layer using a $2\times 2$ filter with a stride of $2$.

Remember that the `X` data here has a different shape than the `X` for the MNIST data. This should impact what you place in the `input_shape` argument of the first convolutional layer.

<i>If training this network seems slow to the point of being unworkable, try changing the depth from $8$ to $4$.</i>
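
To see why the input shape matters, recall that each filter in a convolutional layer has one weight per filter cell per input channel, plus one bias. The sketch below counts the trainable parameters of a depth $8$, $3\times 3$ convolutional layer for a grayscale $28\times 28\times 1$ MNIST image and for a color $32\times 32\times 3$ CIFAR-10 image. The helper `conv2d_param_count` is a hypothetical name introduced for illustration; it is not a `keras` function.

```python
def conv2d_param_count(kernel_size, in_channels, depth):
    """Trainable parameters in a 2D convolutional layer with bias terms."""
    kernel_height, kernel_width = kernel_size
    # One weight per filter cell per input channel, plus one bias per filter.
    weights_per_filter = kernel_height * kernel_width * in_channels
    return (weights_per_filter + 1) * depth


mnist_shape = (28, 28, 1)  # height, width, channels
cifar_shape = (32, 32, 3)

mnist_params = conv2d_param_count((3, 3), mnist_shape[-1], depth=8)
cifar_params = conv2d_param_count((3, 3), cifar_shape[-1], depth=8)

print(mnist_params, cifar_params)  # 80 224
```

The three color channels nearly triple the number of weights in the first layer, which is one reason these networks train more slowly than the MNIST ones.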

In [None]:
## Make an empty sequential model
model1 = 

## Add the convolutional layer here
model1.add()

## Add the pooling layer here
model1.add()

## Add the flatten layer
model1.add()

## Add the feed forward layer, use 100 nodes
model1.add()

## Add the output layer
model1.add()

## Same compile step from notebook 11
model1.compile(optimizer = 'rmsprop',
                 loss = 'categorical_crossentropy',
                 metrics = ['accuracy'])

In [None]:
## fit the model for 25 epochs, this can take a little bit
## remember the validation data argument.
n_epochs=25
history1 = model1.fit()

history_dict1 = history1.history

In [None]:
## Plot the training set accuracy and the validation set accuracy
## against the number of epochs trained
plt.figure(figsize=(10,6))

plt.scatter(range(1,n_epochs+1), 
            history_dict1['accuracy'], 
            label="Training Data")
plt.scatter(range(1,n_epochs+1), 
            history_dict1['val_accuracy'], 
            marker='v',
            label="Validation Data")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

#### 3. A CNN with padding

##### a.

Add in the `padding='same'` argument to the convolutional layer from the network above. Fit this network.
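
As a quick check of what padding changes, the sketch below works out the spatial size of each layer's output for a $32\times 32$ input. It assumes the usual conventions: without padding (`'valid'`) a filter of width $k$ with stride $s$ maps $n$ pixels to $\lfloor (n-k)/s \rfloor + 1$ pixels, while `'same'` padding maps $n$ pixels to $\lceil n/s \rceil$ pixels. The function `output_size` is a hypothetical helper, not part of `keras`.

```python
import math


def output_size(n, kernel, stride=1, padding="valid"):
    """Spatial output size along one dimension of a conv or pooling layer."""
    if padding == "same":
        return math.ceil(n / stride)
    return (n - kernel) // stride + 1


for padding in ("valid", "same"):
    # 3 x 3 convolution with stride 1, with or without padding
    conv = output_size(32, kernel=3, stride=1, padding=padding)
    # 2 x 2 pooling with stride 2 and no padding
    pooled = output_size(conv, kernel=2, stride=2)
    # number of inputs to the feed forward layer when the depth is 8
    flattened = pooled * pooled * 8
    print(padding, conv, pooled, flattened)
# valid 30 15 1800
# same 32 16 2048
```

Under these assumptions, padding keeps the border pixels in play and passes slightly more features to the feed forward layer.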

##### Sample Solution

In [None]:
## Make an empty sequential model
model2 = models.Sequential()

## Add the convolutional layer here
## remember the padding='same' argument
model2.add()

## Add the pooling layer here
model2.add(layers.MaxPooling2D((2,2), strides=2))

## Add the flatten layer
model2.add(layers.Flatten())

## Add the feed forward layer, use 100 nodes
model2.add(layers.Dense(100, activation='relu'))

## Add the output layer
model2.add(layers.Dense(10, activation='softmax'))

## Same compile step from notebook 11
model2.compile(optimizer = 'rmsprop',
                 loss = 'categorical_crossentropy',
                 metrics = ['accuracy'])

In [None]:
## fit the model for 25 epochs, this can take a bit
n_epochs=25
history2 = model2.fit(X_tt, 
                      to_categorical(y_tt), 
                      epochs=n_epochs, 
                      batch_size=512,
                      validation_data=(X_val,to_categorical(y_val)))

history_dict2 = history2.history

##### b.

Plot the validation accuracy for both models and see if the addition of padding had a noticeable impact on the model performance.

##### Sample Solution

In [None]:
plt.figure(figsize=(10,6))


plt.scatter(label="No Padding")



plt.scatter(marker='v',
            label="With Padding")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

##### Write any thoughts you have here

#### 4. Adding a dropout layer

Sometimes while building convolutional or recurrent neural networks you will add what is known as a <i>dropout</i> layer before the final feed forward layer of the network.

A dropout layer will randomly turn off input nodes with a probability that you select when setting up the network. For example, a dropout layer with probability $0.25$ will set each of its input nodes to $0$ with probability $0.25$.

This may seem counterintuitive because we will be getting rid of some of the work the previous layers of our network have done. However, neural networks have a ton of parameters, meaning that they tend to overfit on the training data. By randomly turning some nodes to $0$ we lessen the network's ability to overfit, which may in turn improve performance on observations not included in the training set.
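
The idea can be sketched in a few lines of NumPy. This is an illustration of the mechanism rather than the `keras` implementation: the function `dropout` and the array `activations` are hypothetical. Like `keras`, the sketch rescales the surviving inputs by $1/(1 - \text{rate})$ so that their expected sum is unchanged during training; at prediction time dropout is switched off.

```python
import numpy as np


def dropout(x, rate, rng):
    """Zero each entry of x with probability `rate`, rescaling the rest."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)


rng = np.random.default_rng(440)

# Hypothetical layer inputs: 1000 observations with 100 nodes each.
activations = np.ones((1000, 100))
dropped = dropout(activations, rate=0.25, rng=rng)

print("fraction set to 0:", np.mean(dropped == 0))
print("mean activation:", dropped.mean())
```

Roughly a quarter of the entries come out as $0$, while the average activation stays close to its original value of $1$.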

##### a.

For `model3`, use your results from above to choose either `model1` or `model2` and then add a dropout layer between the `.Flatten()` layer and the feed forward `Dense()` layer. Dropout layers can be inserted in `keras` with `layers.Dropout(dropout_probability)`.

In [None]:
## Make an empty sequential model
model3 = models.Sequential()

## Add the convolutional layer here
model3.add()

## Add the pooling layer here
model3.add(layers.MaxPooling2D((2,2), strides=2))

## Add the flatten layer
model3.add(layers.Flatten())

## Add the dropout layer, set the dropout_probability as you'd like
model3.add()

## Add the feed forward layer, use 100 nodes
model3.add(layers.Dense(100, activation='relu'))

## Add the output layer
model3.add(layers.Dense(10, activation='softmax'))

## Same compile step from notebook 11
model3.compile(optimizer = 'rmsprop',
                 loss = 'categorical_crossentropy',
                 metrics = ['accuracy'])

In [None]:
## fit the model for 25 epochs, this can take a bit
n_epochs=25
history3 = model3.fit(X_tt, 
                      to_categorical(y_tt), 
                      epochs=n_epochs, 
                      batch_size=512,
                      validation_data=(X_val,to_categorical(y_val)))

history_dict3 = history3.history

##### b.

Plot the validation accuracies of the original model and the version with a dropout layer. Does one seem to outperform the other?

In [None]:
plt.figure(figsize=(10,6))


plt.scatter(label="No Dropout")


plt.scatter(marker='v',
            label="With Dropout")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

##### Write any thoughts you have here

#### 5. Choose a CNN

Choose a CNN from the ones you have tested out in this notebook.

##### Write your model choice down here

#### 6. Choosing a final model

We will now pretend that we are done trying models and want to select a final model for these data.

##### a.

At the end of `Fall Problem Session 11` you saved your best feed forward neural network model to file. Load this network now using `load_model` from `keras.models`.

<i>Note: If you were not able to save a model during `Fall Problem Session 11` you can use the model I saved called `matt_model_fall_pb_sess_11`.</i>

In [None]:
## import load_model here


In [None]:
## Load the model here using load_model
ff_model = 

##### b.

Compare the performance of the feed forward model and the CNN model on the validation set.


<i>Note: If you are using `matt_model_fall_pb_sess_11`, this model was trained using the PCA transformed data. You will need to refit PCA on the training portion of the validation split and then transform the validation data prior to finding the validation set performance for this model. The PCA captured 90% of the original data's variance.</i>
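
The refit-then-transform step in the note above can be sketched as follows. The arrays `X_tt_flat` and `X_val_flat` are hypothetical stand-ins for flattened image data (one row per image), and the sketch spells out the selection in NumPy; scikit-learn's `PCA` accepts a fraction such as `n_components=0.9` to make the same kind of selection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for flattened images: rows are observations.
column_scales = np.linspace(5.0, 0.1, 48)
X_tt_flat = rng.normal(size=(200, 48)) * column_scales
X_val_flat = rng.normal(size=(50, 48)) * column_scales

# Fit on the training portion only: center with the training mean.
train_mean = X_tt_flat.mean(axis=0)
_, singular_values, components = np.linalg.svd(
    X_tt_flat - train_mean, full_matrices=False
)

# Keep the fewest components whose cumulative variance share reaches 90%.
variance_share = singular_values**2 / np.sum(singular_values**2)
cumulative_share = np.cumsum(variance_share)
n_keep = int(np.searchsorted(cumulative_share, 0.90)) + 1

# Transform both sets with the mean and components learned from training data.
X_tt_pca = (X_tt_flat - train_mean) @ components[:n_keep].T
X_val_pca = (X_val_flat - train_mean) @ components[:n_keep].T

print(n_keep, X_tt_pca.shape, X_val_pca.shape)
```

The key point is that the validation data never influences the mean or the components; it is only projected onto what was learned from the training portion.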

In [None]:
## importing accuracy_score
from sklearn.metrics import accuracy_score

In [None]:
### Feed Forward Network below ###

In [None]:
### CNN Network below ###

#### 7. Test set performance

##### a.

Retrain your final model on the entire training set. Note that this means you will need to redefine your model and retrain it from scratch.

In [None]:
## Make an empty sequential model
final_model = 

In [None]:
## fit the model for 25 epochs, this can take a bit
n_epochs=25
history = final_model.fit(X_train, 
                      to_categorical(y_train), 
                      epochs=n_epochs, 
                      batch_size=512)

##### b.

Find the accuracy on the test set.
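
A minimal sketch of the accuracy calculation, using made-up numbers rather than real model output. In the notebook, `pred_probs` would be the predicted class probabilities from your final model on the test images, and `y_true` the test labels; both names are assumptions here. Note that `keras` returns the CIFAR-10 labels as a column with shape `(n, 1)`, so they need to be flattened before comparing.

```python
import numpy as np

# Hypothetical softmax outputs for 4 images and 3 classes (not real results).
pred_probs = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.3, 0.3, 0.4],
                       [0.6, 0.3, 0.1]])

# Hypothetical labels stored as a column, like the CIFAR-10 labels.
y_true = np.array([[0], [1], [2], [1]])

# The predicted class is the one with the largest probability.
pred_labels = pred_probs.argmax(axis=1)

# Flatten the labels so the comparison is elementwise, not broadcast.
accuracy = np.mean(pred_labels == y_true.ravel())
print(accuracy)  # 0.75
```

Passing `y_true.ravel()` and `pred_labels` to `accuracy_score` gives the same number.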

In this notebook we stuck with very small CNNs in order to keep the training time down as much as possible.

You are welcome to build bigger CNNs to see if you can improve upon the performance you have already achieved. You should be prepared, however, for the training time on such networks to be quite long in comparison to what you did here. Per the documentation, <a href="http://www.cs.toronto.edu/~kriz/cifar.html">http://www.cs.toronto.edu/~kriz/cifar.html</a>, the benchmark model takes over an hour to train, but does attain $82\%$ test set accuracy.

--------------------------

This notebook was written for the Erdős Institute Cőde Data Science Boot Camp by Matthew Osborne, Ph.D., 2022.

Any potential redistributors must seek and receive permission from Matthew Tyler Osborne, Ph.D. prior to redistribution. Redistribution of the material contained in this repository is conditional on acknowledgement of Matthew Tyler Osborne, Ph.D.'s original authorship and sponsorship of the Erdős Institute as subject to the license (see License.md).