# Fall Problem Session 11
## CIFAR-10 I

In this notebook you will work on problems that relate to our neural network content. In particular, this material will touch on the following lecture notebooks:
- `Lectures/Neural Networks/1. Perceptrons`,
- `Lectures/Neural Networks/2. The MNIST Data Set`,
- `Lectures/Neural Networks/3. Multilayer Neural Networks` and
- `Lectures/Neural Networks/4. keras`.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

#### 1. Load and inspect the data

In this notebook you will work to build neural networks to classify the images found in the CIFAR-10 collection. Before building any models you will need to load and get to know the data.

##### a. 

Import `cifar10` from `keras.datasets`.

##### b.

Look through the `keras` documentation on the `cifar10` data to see how to load the training and test data.

Documentation: <a href="https://keras.io/api/datasets/cifar10/">https://keras.io/api/datasets/cifar10/</a>

<i>Note: The step of loading the data may take a while if this is your first time loading the data.</i>

In [None]:
## Load the data here



In [None]:
## run this code chunk, you'll use it in a bit.
label_dict = {0:'airplane',
              1:'automobile',
              2:'bird',
              3:'cat',
              4:'deer',
              5:'dog',
              6:'frog',
              7:'horse',
              8:'ship',
              9:'truck'}

##### c.

The CIFAR (Canadian Institute for Advanced Research) 10 data are a collection of $60{,}000$ color images of size $32\times 32$ pixels, each an instance of one of the $10$ classes listed above. Each of the ten classes has $6{,}000$ instances ($5{,}000$ in the training set, $1{,}000$ in the test set). Here is a link to the documentation for this data set: <a href="https://www.cs.toronto.edu/~kriz/cifar.html">https://www.cs.toronto.edu/~kriz/cifar.html</a>.

- Look at the shape of `X_train`. 
- Print out the first observation of the training set.
- Then run the given code chunk to see some example images.

In [None]:
## Look at the shape here


In [None]:
## print out the first observation



In [None]:
## extra code chunk if you need it



In [None]:
np.random.seed(38401)

fig,ax = plt.subplots(5,2,figsize=(14, 30))

j = 0
for i in np.random.choice(range(len(y_train)), 10):
    ax[j//2, j%2].imshow(X_train[i,:,:])
    
    ax[j//2, j%2].set_title("Image of " + label_dict[y_train[i][0]])
    
    j = j + 1
    
plt.show()

#### 3. Prepare the data

We will need to prepare the data before we can build a model.

##### a.

Just like we did for the grayscale MNIST images, you will need to scale the pixel values so they range from $0$ to $1$. Each of the RGB pixel values has a minimum value of $0$ and a maximum value of $255$.

Scale the data in the code cells provided below.
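
As a minimal sketch of the scaling step, the block below divides a small synthetic array by $255$. The array `X_fake` is a made-up stand-in for the CIFAR-10 pixels (an assumption for illustration only); the same division would apply to the real training and test arrays.

```python
import numpy as np

# Hypothetical stand-in for image data: 4 RGB images, 32x32 pixels,
# stored as unsigned 8-bit integers in [0, 255]
rng = np.random.default_rng(0)
X_fake = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

# Dividing by 255 converts the integers to floats in [0, 1]
X_fake_scaled = X_fake / 255

print(X_fake_scaled.dtype, X_fake_scaled.min(), X_fake_scaled.max())
```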

##### b.

In this notebook you will use a feed forward neural network. That means you will need to reshape the data into a 2D `numpy` array where each row is an observation and each column is one of the pixel RGB values.

Reshape the data in the cells provided below.
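
The reshaping described above can be sketched on synthetic data. Here `X_fake` is a hypothetical stand-in for the scaled image array, not the real data; the point is only to show how the three image axes collapse into one column axis.

```python
import numpy as np

# Hypothetical stand-in: 4 scaled RGB images of size 32x32
X_fake = np.random.default_rng(0).random((4, 32, 32, 3))

# Keep the first axis (observations) and flatten the rest into columns;
# -1 lets numpy infer 32*32*3 = 3072
X_fake_r = X_fake.reshape(X_fake.shape[0], -1)

print(X_fake_r.shape)  # (4, 3072)
```

Each row of `X_fake_r` holds the pixel values of one image in the order `numpy` stores them, so no information is lost, only the spatial layout.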

In [None]:
X_train_r = 
X_test_r = 

##### c.

This is an instance where cross-validation would take too long for the purposes of the problem session. We will instead use the validation set approach for model comparisons. Make a validation split of the training set. Use $15\%$ of the data for the validation set.
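
One way to make such a split is `train_test_split` from `sklearn`. The sketch below runs on synthetic arrays (`X_fake` and `y_fake` are made up), and the `random_state` value is an arbitrary choice for illustration, not one prescribed by this notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins: 100 flattened images, 10 per class
X_fake = np.random.default_rng(0).random((100, 3072))
y_fake = np.repeat(np.arange(10), 10).reshape(-1, 1)

# test_size=0.15 holds out 15% of the rows for validation;
# the rows are shuffled by default before splitting
X_tt, X_val, y_tt, y_val = train_test_split(
    X_fake, y_fake,
    test_size=0.15,
    random_state=440,  # arbitrary seed, assumed for reproducibility
)

print(X_tt.shape, X_val.shape)  # (85, 3072) (15, 3072)
```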

#### 4. Your first neural network

In this problem you will build your first neural network.

##### a.

Import all of the `keras` stuff you need to build a feed forward neural network. 
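
The import cell also brings in `to_categorical`, which turns integer class labels into one-hot rows. To show what that encoding looks like without relying on any particular `keras` version, here is a small `numpy`-only sketch; `y_fake` is a hypothetical label array and the identity-matrix trick is a stand-in for illustration, not the `keras` implementation.

```python
import numpy as np

# Hypothetical labels shaped like y_train: one integer class per row
y_fake = np.array([[3], [0], [9], [3]])
n_classes = 10

# Row k of the identity matrix is the one-hot vector for class k
y_onehot = np.eye(n_classes)[y_fake.ravel()]

print(y_onehot.shape)  # (4, 10)
print(y_onehot[0])
```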

In [None]:
## Import here
from keras import 




## For y when training the network
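## (which of these two imports works depends on your keras version)
from keras.utils import to_categorical


### If the import above fails, your version of keras may keep it here instead ###
# from keras.utils.np_utils import to_categorical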

##### b.

Fill in the missing code chunks below to build a feed forward neural network with a single hidden layer with $50$ nodes.

<i>Note: if this network takes a long time to train on your computer, feel free to make the hidden layer smaller. That will speed up the training steps a little bit.</i>
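
To get a feel for why a smaller hidden layer trains faster, it can help to count the trainable parameters. The helper below is a hypothetical function written for this illustration (it is not part of `keras`), and it assumes flattened $32\times 32\times 3$ inputs and one output node per class.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of nodes per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weights plus one bias per node
    return total

# 32*32*3 = 3072 inputs, one hidden layer, 10 output classes
print(dense_param_count([3072, 50, 10]))  # 154160
print(dense_param_count([3072, 25, 10]))  # 77085
```

Almost all of the parameters sit between the input and the hidden layer, so halving the hidden layer roughly halves the work per training step.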

In [None]:
## Create an empty model object here
model1 =  

## Add the Dense layer with 50 nodes,
## remember to specify the activation function and the input_shape
model1.add()

## Add the Dense output layer, how many nodes should this have?
## what should the activation function be?
model1.add()


## Compile the network here with
## the 'rmsprop' optimizer, the 'categorical_crossentropy' loss and
## 'accuracy' as the only metric
model1.compile(optimizer = ,
                 loss = ,
                 metrics = )

## You'll train the model for 40 epochs
n_epochs = 40

## fit the model here, don't forget to place the
## ys in to_categorical
## use a batch_size of 512
## don't forget to include the validation_data
history1 = model1.fit()

In [None]:
## Plot the training and validation accuracies here
history_dict1 = history1.history

sns.set_style("whitegrid")

plt.figure(figsize=(10,6))

plt.scatter(range(1,n_epochs+1), 
            history_dict1['accuracy'], 
            label="Training Data")
plt.scatter(range(1,n_epochs+1), 
            history_dict1['val_accuracy'], 
            marker='v',
            label="Validation Data")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

#### 5. Additional neural networks

In this problem you will make a couple more feed forward networks to try to improve upon the performance of your first network.

##### a.

Try making another feed forward network with a single hidden layer. Increase the size of the hidden layer, as compared to model 1.

Does it outperform model 1?

In [None]:
## Build your model here
## Call your history history2 in order for the plot below to work
model2 =  





In [None]:
## Plot the validation accuracies of models 1 and 2 here
## Make sure your history variable was called history2, if not you'll need to
## change the name here
history_dict2 = history2.history

plt.figure(figsize=(10,6))

plt.scatter(range(1,n_epochs+1), 
            history_dict1['val_accuracy'], 
            label="Model 1")
plt.scatter(range(1,n_epochs+1), 
            history_dict2['val_accuracy'], 
            marker='v',
            label="Model 2")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

##### b. 

Now try making a feed forward network with two hidden layers. Choose whatever size you would like for those layers. How does this compare to the other two models you have made?

In [None]:
## Create an empty model object here
## Call your history history3 in order for the plot below to work
model3 =  

In [None]:
## Plot the validation accuracies of all three models here
## Make sure your history variable was called history3, if not you'll need to
## change the name here
history_dict3 = history3.history

plt.figure(figsize=(10,6))

plt.scatter(range(1,n_epochs+1), 
            history_dict1['val_accuracy'], 
            label="Model 1")
plt.scatter(range(1,n_epochs+1), 
            history_dict2['val_accuracy'], 
            marker='v',
            label="Model 2")
plt.scatter(range(1,n_epochs+1), 
            history_dict3['val_accuracy'], 
            marker='x',
            label="Model 3")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

#### 6. Trying more processed data

Now you will explore whether pre-processing the data further will improve the model performance.


##### a.

Use the function below to convert the original RGB image into grayscale. Then create a neural network for this new version of the data that has the same architecture as one of your networks from above.
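
To see how the conversion changes the array shape (and therefore the `input_shape` of the network), here is a sketch on synthetic data. It repeats `to_grayscale` so the block runs on its own; `X_fake` is a hypothetical stand-in for the scaled images.

```python
import numpy as np

def to_grayscale(X):
    # Same weighted sum of the R, G and B channels as the notebook's function
    return 0.2989*X[:,:,:,0] + 0.5870*X[:,:,:,1] + 0.1140*X[:,:,:,2]

# Hypothetical stand-in: 4 scaled RGB images of size 32x32
X_fake = np.random.default_rng(0).random((4, 32, 32, 3))

# The channel axis is collapsed, leaving one value per pixel
X_fake_g = to_grayscale(X_fake)
print(X_fake_g.shape)  # (4, 32, 32)

# Flattened for a feed forward network: 32*32 = 1024 columns
X_fake_g_r = X_fake_g.reshape(len(X_fake_g), -1)
print(X_fake_g_r.shape)  # (4, 1024)
```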

In [None]:
def to_grayscale(X):
    return 0.2989*X[:,:,:,0] + 0.5870*X[:,:,:,1] + 0.1140*X[:,:,:,2]

In [None]:
## use to_grayscale on X_train here
X_train_g = 

## make the grayscale validation split using the same
## random_state as above
X_tt_g, X_val_g, y_tt_g, y_val_g = train_test_split()

In [None]:
## Create an empty model object here
## Make sure your history variable is called history4

## Also, be sure to change the input_shape so it matches the
## dimensions of the grayscale data
model4 =  models.Sequential()



In [None]:
## Plot the validation accuracy of the RGB version of the model
## and the grayscale version of the model here

## Make sure your history variable was called history4, if not you'll need to
## change the name here
history_dict4 = history4.history

plt.figure(figsize=(10,6))

plt.scatter(range(1,n_epochs+1), 
            , 
            label="RGB Model")
plt.scatter(range(1,n_epochs+1), 
            , 
            marker='v',
            label="Grayscale Model")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

##### b. 

Now try running the original RGB image data through PCA that captures $90\%$ of the original data's variance, then build a neural network (with the same architecture as the network in part <i>a.</i>) on that data.
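
As a rough sketch of the fit-then-transform pattern, the block below applies `PCA(.9)` to synthetic data. All of the arrays here are made up for illustration (a few latent directions plus noise), so the number of components it keeps says nothing about what you will see on CIFAR-10.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical data with correlated columns: 5 latent directions plus noise
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 40))
X_fake = latent @ mixing + 0.1 * rng.normal(size=(200, 40))
X_tt_fake, X_val_fake = X_fake[:170], X_fake[170:]

# A float between 0 and 1 asks PCA to keep enough components
# to explain at least that fraction of the variance
pca_fake = PCA(.9)

# Fit on the training portion only, then reuse the fit on validation
X_tt_fake_pca = pca_fake.fit_transform(X_tt_fake)
X_val_fake_pca = pca_fake.transform(X_val_fake)

print(pca_fake.n_components_, X_tt_fake_pca.shape, X_val_fake_pca.shape)
```

The number of columns after the transform, `pca_fake.n_components_`, is what the network's `input_shape` would need to match.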

In [None]:
## Import PCA from sklearn

In [None]:
## Make the PCA object
pca = PCA(.9)

## Get the PCA data
## remember to only fit on tt
X_tt_pca = 

## remember to use only .transform here
X_val_pca = 

In [None]:
## Create an empty model object here
## Make sure your history variable is called history5

## Also, be sure to change the input_shape so it matches the
## dimensions of the PCA transformed data
model5 =  



In [None]:
## Plot the validation accuracy of the Non-PCA version of the model
## and the PCA version of the model here

## Make sure your history variable was called history5, if not you'll need to
## change the name here
history_dict5 = history5.history


plt.figure(figsize=(10,6))

plt.scatter(range(1,n_epochs+1), 
            , 
            label="Non-PCA Model")
plt.scatter(range(1,n_epochs+1), 
            , 
            marker='v',
            label="PCA Model")

plt.xlabel("Epoch", fontsize=18)
plt.ylabel("Accuracy", fontsize=18)

plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

plt.legend(fontsize=14)


plt.show()

#### 7. Saving a model

##### a.

Choose one of the models you built in this session and retrain it for the optimal number of epochs (for example, the epoch at which its validation accuracy peaked). Store the model in a variable just called `model`.
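
One simple way to pick the number of epochs is to find where the validation accuracy was highest. The sketch below uses a made-up list of accuracies, `val_acc_fake`, in place of a real `history.history['val_accuracy']`; other criteria, such as the epoch where validation accuracy levels off, are equally reasonable.

```python
import numpy as np

# Hypothetical validation accuracies, one per epoch, in the same form
# as history.history['val_accuracy']
val_acc_fake = [0.31, 0.36, 0.40, 0.42, 0.41, 0.43, 0.42, 0.40]

# argmax is 0-based while epochs are counted from 1
best_epoch = int(np.argmax(val_acc_fake)) + 1

print(best_epoch)  # 6
```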

In [None]:
## Call your model, model
model =  models.Sequential()

## fill in the rest here






##### b.

Running the code below will save your trained model to a file that can be reloaded at a later time.

Save your model, you will return to it in `Fall Problem Session 12`.

In [None]:
#### fill in a model name here
model_filename = ""

### This will save the model to file
model.save(model_filename)

That's all for this notebook. You may have noticed that none of your models are particularly good. This is a difficult classification problem and you will continue working on it in `Fall Problem Session 12`. Perhaps we can improve model performance.

--------------------------

This notebook was written for the Erdős Institute Cőde Data Science Boot Camp by Matthew Osborne, Ph.D., 2022.

Any potential redistributors must seek and receive permission from Matthew Tyler Osborne, Ph.D. prior to redistribution. Redistribution of the material contained in this repository is conditional on acknowledgement of Matthew Tyler Osborne, Ph.D.'s original authorship and sponsorship of the Erdős Institute as subject to the license (see License.md).