# Problem Session 12
## MNIST of Fashion II

In this notebook you will work on problems that relate to our neural network content. In particular, this material will touch on the following lecture notebooks:
- `Lectures/Neural Networks/2. The MNIST Data Set`,
- `Lectures/Neural Networks/3. Multilayer Neural Networks`,
- `Lectures/Neural Networks/4. keras`,
- `Lectures/Neural Networks/5. Introduction to Convolutional Neural Networks` and
- `Lectures/Neural Networks/7. Loading Pre-Trained Models`.

In [None]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
from seaborn import set_style

set_style("whitegrid")

##### 1. Load the data

In this notebook you will continue building neural networks to classify images of common fashion items. First, run the code below to load the data set.

In [None]:
## docs: https://keras.io/api/datasets/fashion_mnist/
from keras.datasets import fashion_mnist

In [None]:
## This can take a little bit to run,
## especially if it is your first time running this code
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

label_dict = {0:"T-shirt/top",
                 1:"Trouser",
                 2:"Pullover",
                 3:"Dress",
                 4:"Coat",
                 5:"Sandal",
                 6:"Shirt",
                 7:"Sneaker",
                 8:"Bag",
                 9:"Ankle boot"}

##### 2. Validation set and scaling

Create a validation set with $20\%$ of the training set. Also scale the data by dividing by the maximum pixel value, $255$.
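
As a minimal sketch of what the scaling step does, the block below divides a small toy array by $255$. The array `toy_images` is a random placeholder, not the real Fashion MNIST data, and the names `toy_images` and `toy_scaled` are only for illustration. Because $255$ is a fixed constant rather than something learned from the data, the same division can be applied to the training, validation and test images alike.

```python
import numpy as np

# Toy stand-in for the image array: 4 "images" of 28x28 unsigned 8-bit pixels.
# These values are random placeholders, not the real Fashion MNIST data.
rng = np.random.default_rng(213)
toy_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing by 255 converts the integers to floats and maps them into [0, 1].
toy_scaled = toy_images / 255

print(toy_scaled.dtype, toy_scaled.min(), toy_scaled.max())
```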

In [None]:
## scale data here



In [None]:
from sklearn.model_selection import train_test_split

In [None]:
X_tt, X_val, y_tt, y_val = train_test_split(X_train, y_train,
                                               shuffle=True,
                                               random_state=213,
                                               test_size=.2)

##### 3. Loading a feed forward model

In `Problem Session 11` you built a number of feed forward models for this classification problem. If your group made it to the end of that notebook, you will have saved the neural network model that performed best.

Load the model here for comparison purposes at the end of this notebook.

<i>If you were not able to save a model while working on `Problem Session 11`, you can load `nb11_matts_final_model` here. This model was trained on the PCA transformed version of the data.</i>

In [None]:
## import load_model here


In [None]:
## load_model here


##### 4. A first convolutional neural network

As a first convolutional neural network, try building a CNN with a single convolutional layer of depth $64$ (that is, $64$ filters) using a $3\times 3$ filter, followed by a max pooling layer using a $2\times 2$ window with a stride of $2$.
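
To see what this architecture does to the image dimensions, here is a rough sketch of the shape arithmetic. The helper `conv_output_size` is a hypothetical function written for this illustration, not part of `keras`. It assumes $28 \times 28$ input images, no padding, and a stride of $1$ in the convolutional layer. The last two lines show the kind of reshape that adds a trailing channel axis, using a toy array of zeros in place of the real data.

```python
import numpy as np

def conv_output_size(n, kernel, stride=1, padding=0):
    """Side length after sliding a kernel-wide window over n pixels."""
    return (n + 2 * padding - kernel) // stride + 1

# 3x3 convolution, stride 1, no padding, on a 28x28 image
side_after_conv = conv_output_size(28, kernel=3)

# 2x2 pooling window with stride 2
side_after_pool = conv_output_size(side_after_conv, kernel=2, stride=2)

# Number of values handed to the feed forward layer after flattening,
# assuming the convolutional layer has 64 filters
n_flat = side_after_pool * side_after_pool * 64

print(side_after_conv, side_after_pool, n_flat)

# Convolutional layers expect (n_images, height, width, channels).
# A toy batch of 5 grayscale images gets a trailing channel axis of size 1.
toy_batch = np.zeros((5, 28, 28))
toy_batch_conv = toy_batch.reshape(-1, 28, 28, 1)
print(toy_batch_conv.shape)
```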

In [None]:
## Import what you need from keras
from keras import models
from keras import layers
from keras import optimizers
from keras import losses
from keras import metrics
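## to_categorical is importable from keras.utils;
## the keras.utils.np_utils path is not available in newer keras versions
from keras.utils import to_categorical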

In [None]:
## Make convolutional neural net reshaped versions
## of the training and validation data
X_tt_conv = 
X_val_conv = 

In [None]:
## Make an empty sequential model
model1 = 

## Add the convolutional layer here
model1.add( layers.Conv2D() )

## Add the pooling layer here
model1.add( layers.MaxPooling2D() )

## Add the flatten layer
model1.add(layers.Flatten())

## Add the feed forward layer, use 64 nodes
model1.add()

## Add the output layer
model1.add()

## Same compile step from notebook 11
model1.compile(optimizer = 'rmsprop',
                 loss = 'categorical_crossentropy',
                 metrics = ['accuracy'])

In [None]:
## fit the model for 23 epochs, this can take a bit
n_epochs=23
history1 = model1.fit(X_tt_conv, 
                      to_categorical(y_tt), 
                      epochs=n_epochs, 
                      batch_size=512,
                      validation_data=(X_val_conv,to_categorical(y_val)))

history_dict1 = history1.history

In [None]:
## Plot the training set accuracy and the validation set accuracy
## against the number of epochs trained
plt.figure(figsize=(14,8))

plt.scatter(range(1,n_epochs+1), 
            , 
            label="Training Data")
plt.scatter(range(1,n_epochs+1), 
            , 
            marker='v',
            label="Validation Data")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

##### 5. Adding in padding

Build a second network, `model2`, that is identical to `model1` except that the convolutional layer also takes the `padding='same'` argument. Fit this network and compare the validation accuracies for `model2` and `model1`.
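
As a rough illustration of what `padding='same'` changes, the sketch below pads a toy $28 \times 28$ image with a one pixel border of zeros using `numpy`, then compares the output sizes with and without padding. It assumes a $3 \times 3$ filter with stride $1$, $64$ filters, and a $2 \times 2$ pooling window with stride $2$. The variable names are illustrative only.

```python
import numpy as np

# Toy 28x28 "image"; the values are placeholders
toy_image = np.arange(28 * 28, dtype=float).reshape(28, 28)

# For a 3x3 filter with stride 1, keeping the output the same size as the
# input requires a one pixel border of zeros on every side
padded = np.pad(toy_image, pad_width=1, mode="constant", constant_values=0)

kernel = 3
side_valid = toy_image.shape[0] - kernel + 1  # no padding
side_same = padded.shape[0] - kernel + 1      # with the zero border

# After 2x2 pooling with stride 2 and flattening 64 feature maps
flat_valid = (side_valid // 2) ** 2 * 64
flat_same = (side_same // 2) ** 2 * 64

print(padded.shape, side_valid, side_same, flat_valid, flat_same)
```

With padding, the filter also gets centered on the edge pixels, so the flattened layer that feeds the dense layer is somewhat larger.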

In [None]:
## Make an empty sequential model
model2 = models.Sequential()

## Add the convolutional layer here, don't forget the padding


## Add the pooling layer here


## Add the flatten layer


## Add the feed forward layer, use 64 nodes


## Add the output layer


## Same compile step from notebook 11
model2.compile(optimizer = 'rmsprop',
                 loss = 'categorical_crossentropy',
                 metrics = ['accuracy'])

In [None]:
history2 = model2.fit(X_tt_conv, 
                      to_categorical(y_tt), 
                      epochs=n_epochs, 
                      batch_size=512,
                      validation_data=(X_val_conv,to_categorical(y_val)))

history_dict2 = history2.history

In [None]:
## Plot the validation set accuracy for model1 and model2
## against the number of epochs trained
plt.figure(figsize=(14,8))

plt.scatter(range(1,n_epochs+1), 
            , 
            label="No Padding")
plt.scatter(range(1,n_epochs+1), 
            , 
            marker='v',
            label="Padding")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

How does the performance compare?

##### Write here




##### 6. Adding a dropout layer

Sometimes while building convolutional or recurrent neural networks you will add what is known as a <i>dropout</i> layer before the final feed forward layer of the network.

A dropout layer randomly turns off input nodes during training with a probability that you select when setting up the network. For example, a dropout layer with probability $0.25$ sets each of the input nodes to $0$ with probability $0.25$.

This may seem counterintuitive because we will be getting rid of some of the work the previous layers of our network have done. However, neural networks have a ton of parameters, meaning that they tend to overfit on the training data. By randomly turning some nodes to $0$ we lessen the network's ability to overfit, which may in turn improve performance on observations not included in the training set.

For `model3` use your results from above to choose either `model1` or `model2` and then add a dropout layer between the `.Flatten()` layer and the `Dense(64)` layer. Dropout layers can be inserted in `keras` with `layers.Dropout(dropout_probability)`, <a href="https://keras.io/api/layers/regularization_layers/dropout/">https://keras.io/api/layers/regularization_layers/dropout/</a>.
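
The following is a minimal `numpy` sketch of the dropout idea described above, not the `keras` implementation. The function `toy_dropout` is hypothetical and written only for this illustration. It zeroes each input with probability `rate` and rescales the surviving inputs by $1/(1 - \text{rate})$ so that the expected total input is unchanged, which is also what the `keras` dropout layer does while training. At prediction time dropout is switched off.

```python
import numpy as np

def toy_dropout(inputs, rate, rng):
    """Zero each input with probability `rate`, rescale the rest by 1 / (1 - rate)."""
    keep_mask = rng.random(inputs.shape) >= rate
    return inputs * keep_mask / (1.0 - rate)

rng = np.random.default_rng(213)

# Toy layer input: 10,000 nodes that all output 1
toy_inputs = np.ones(10_000)
toy_outputs = toy_dropout(toy_inputs, rate=0.25, rng=rng)

# Roughly a quarter of the nodes should be zeroed,
# while the average output should stay close to 1
frac_zeroed = np.mean(toy_outputs == 0)
mean_output = toy_outputs.mean()

print(frac_zeroed, mean_output)
```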

In [None]:
## Make an empty sequential model
model3 = models.Sequential()

## Add the convolutional layer here



## Add the pooling layer here



## Add the flatten layer



## Add the dropout layer, set the dropout probability to .5


## Add the feed forward layer, use 64 nodes



## Add the output layer


## Same compile step from notebook 11
model3.compile(optimizer = 'rmsprop',
                 loss = 'categorical_crossentropy',
                 metrics = ['accuracy'])

In [None]:
history3 = model3.fit(X_tt_conv, 
                      to_categorical(y_tt), 
                      epochs=n_epochs, 
                      batch_size=512,
                      validation_data=(X_val_conv,to_categorical(y_val)))

history_dict3 = history3.history

In [None]:
## Plot the validation set accuracy for your chosen model and model3
## against the number of epochs trained
plt.figure(figsize=(14,8))

plt.scatter(range(1,n_epochs+1), 
            , 
            label="No Dropout")
plt.scatter(range(1,n_epochs+1), 
            , 
            marker='v',
            label="With Dropout")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

How does the model with dropout compare to the equivalent model without dropout?

##### Write here




##### 7. Choosing a final convolutional neural network model

Select one convolutional neural network model from the three models we have considered in this notebook.

Remake the model and train it to the appropriate number of epochs.
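
One common way to pick the number of epochs is to look for the epoch where validation accuracy peaked. The sketch below does this on a made-up list, `toy_val_accuracy`, whose numbers are hypothetical and not results from any model in this notebook. With a real model you would use the validation accuracies stored in the corresponding history dictionary; the exact key name depends on your `keras` version, so check `history_dict.keys()`.

```python
import numpy as np

# Hypothetical validation accuracies, one per epoch.
# These numbers are made up purely to illustrate the calculation.
toy_val_accuracy = [0.81, 0.85, 0.88, 0.90, 0.91, 0.905, 0.90, 0.895]

# np.argmax returns a 0-based index, epochs are counted from 1
best_epoch = int(np.argmax(toy_val_accuracy)) + 1

print(best_epoch)
```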

In [None]:
## Make an empty sequential model
cnn_final_model = models.Sequential()

## Add the convolutional layer here


## Add the pooling layer here


## Add the flatten layer


## Add the dropout layer, set the dropout probability to .5


## Add the feed forward layer, use 64 nodes


## Add the output layer


## Same compile step from notebook 11
cnn_final_model.compile(optimizer = 'rmsprop',
                 loss = 'categorical_crossentropy',
                 metrics = ['accuracy'])

In [None]:
history_cnn_final = cnn_final_model.fit(X_tt_conv, 
                      to_categorical(y_tt), 
                      epochs=20, 
                      batch_size=512,
                      validation_data=(X_val_conv,to_categorical(y_val)))

history_dict_cnn_final = history_cnn_final.history

##### 8. Compare to feed forward

Compare the validation set accuracy of this network to the accuracy from the pre-trained model you loaded.

<i>If you are using `nb11_matts_final_model` you will need to run the data through PCA trained on the training set with `n_components = .99`</i>.
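
Both networks output one probability per class, so computing an accuracy means first converting those probabilities to predicted labels. Here is a small sketch of that conversion on toy data. The arrays `toy_probs` and `toy_labels` are hypothetical, with three classes instead of ten. `accuracy_score(y_true, y_pred)` from `sklearn.metrics` computes the same fraction of matching labels.

```python
import numpy as np

# Toy predicted probabilities for 4 observations and 3 classes (made-up values)
toy_probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.3, 0.3, 0.4],
                      [0.6, 0.3, 0.1]])

# Toy true labels, stored as integers like y_val
toy_labels = np.array([0, 1, 1, 0])

# The predicted class is the column with the largest probability
toy_preds = toy_probs.argmax(axis=1)

# Accuracy is the fraction of predictions that match the true labels
toy_accuracy = np.mean(toy_preds == toy_labels)

print(toy_preds, toy_accuracy)
```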

In [None]:
## code here




In [None]:
## code here




In [None]:
## code here




In [None]:
## Write the feed forward validation accuracy here
from sklearn.metrics import accuracy_score

accuracy_score()

In [None]:
## Write the CNN validation accuracy here
accuracy_score()

##### 9. Performance on the test set

While you are free to fiddle around and try additional neural networks, this problem assumes that you have landed on a final network.

Gauge the performance of your final network on the test set.

Compare it to its performance on the training set.

In [None]:
## Note you first have to retrain the model on the entire training set
del cnn_final_model


## Make an empty sequential model
cnn_final_model = models.Sequential()

## copy and paste your final model here



In [None]:
history_cnn_final = cnn_final_model.fit(X_train.reshape(-1, 28, 28, 1), 
                      to_categorical(y_train), 
                      epochs=20, 
                      batch_size=512)

history_dict_cnn_final = history_cnn_final.history

In [None]:
print("Training set accuracy")
print("++++++++++++++++++++++++++++")


In [None]:
print("Test set accuracy")
print("++++++++++++++++++++++++++++")


--------------------------

This notebook was written for the Erdős Institute Cőde Data Science Boot Camp by Matthew Osborne, Ph.D., 2022.

Any potential redistributors must seek and receive permission from Matthew Tyler Osborne, Ph.D. prior to redistribution. Redistribution of the material contained in this repository is conditional on acknowledgement of Matthew Tyler Osborne, Ph.D.'s original authorship and sponsorship of the Erdős Institute as subject to the license (see License.md).