# Practice 10: Kaggle's CIFAR10 with CNNs

Use this notebook as the starting point for the Practice activities.

Student Name:    **[  Put your Name Here ]**


 [Video Walkthrough of Practice9.]



# Section 0

=== *You must run this section to set up things for any of the sections below* ===
### Setting up Python tools



We'll use the following libraries for this tutorial:
- [pandas](http://pandas.pydata.org/) : dataframes for spreadsheet-like data analysis, reading CSV files, time series
- [numpy](http://www.numpy.org/) : for multidimensional data and linear algebra tools
- [matplotlib](http://matplotlib.org/) : simple plotting and graphing
- [seaborn](http://stanford.edu/~mwaskom/software/seaborn/) : more advanced graphing
- [scikit-learn](https://scikit-learn.org/stable/) : provides many machine learning algorithms and tools for training and testing
- [Keras](https://keras.io/) : support for neural networks and deep learning




In [5]:
# First, we'll import pandas and numpy, two data processing libraries
import pandas as pd
import numpy as np

# We'll also import seaborn and matplotlib, two Python graphing libraries
import seaborn as sns
import matplotlib.pyplot as plt
# Import the needed sklearn libraries
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.preprocessing import LabelEncoder

# The Keras library provides support for neural networks and deep learning
print ("====== This should generate a FutureWarning on Conversion ===== ignore this warning")
import keras
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Lambda, Flatten, LSTM
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import Adam, RMSprop
from keras.utils import np_utils

# We will turn off some warns in this notebook to make it easier to read for new students
import warnings
warnings.filterwarnings('ignore')



# Section 2: CNN for Kaggle's CIFAR-10 Challenge

We will apply Convolutional Neural Networks (CNNs) to Kaggle's CIFAR-10 image classification challenge.

The CIFAR-10 images are 32 x 32 color images in 10 classes; we read them in after defining the network.

## Task 2: CNN Layers

For an overview of CNNs, see [MIT 6.S191: Convolutional Neural Networks](https://youtu.be/H-HVZJ7kGI0?t=1132). While the entire video is good, the key description of CNN layers starts at about 19:00.

We will use the following Keras pre-built layers to build our CNN.

- **Conv2D**(16, (3, 3), activation='relu')
 - 16 filters, each one 3x3 pixels with default stride of 1
- **MaxPooling2D**(pool_size=(2, 2))
 - 2x2 max pooling filter with default stride of 2
- **Dropout**(0.25)
 - Randomly zero out 25% of the values passed to the next layer during training (not the weights themselves)
- **Flatten**()
 - Convert the stack of 2D feature maps into a single 1D vector
- **Dense**(32, activation='relu')
 - Standard fully connected layer we have used before
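To make the shape arithmetic behind these layers concrete, here is a minimal NumPy sketch of a single 3x3 filter with stride 1 and no padding, followed by 2x2 max pooling with stride 2. The helper names `conv2d_single` and `max_pool_2x2` are made up for this illustration; Keras performs the same kind of computation far more efficiently, over many filters and channels at once.

```python
import numpy as np

def conv2d_single(image, kernel):
    """Slide one kernel over a 2D image with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch by the kernel element-wise and sum the result
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block (stride 2)."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2            # drop an odd trailing row or column
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

image = np.arange(32 * 32, dtype=float).reshape(32, 32)   # toy single-channel image
kernel = np.ones((3, 3)) / 9.0                            # simple averaging filter
relu = np.maximum(conv2d_single(image, kernel), 0.0)      # ReLU activation
pooled = max_pool_2x2(relu)
print(relu.shape, pooled.shape)
```

A 32x32 input shrinks to 30x30 after the 3x3 filter (32 - 3 + 1), and the pooling step halves that to 15x15.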

One possible configuration would be:

```
NN = Sequential()
NN.add(Conv2D(8, kernel_size=(3, 3), activation='relu', input_shape=(32,32,3)))
NN.add(Conv2D(8, (3, 3), activation='relu'))
NN.add(MaxPooling2D(pool_size=(2, 2)))
NN.add(Conv2D(16, (3, 3), activation='relu'))
NN.add(Flatten())
NN.add(Dense(32, activation='relu'))
NN.add(Dense(output_Size, activation='softmax'))
```

Describe what changes to the CNN layers you will make. Options include:

1.   Use more or fewer filters in a Conv2D layer. The first parameter is the number of filters in that layer.
2.   Add more Conv2D layers. It is common to stack two to four layers together.
3.   Consider trying larger or smaller filters. While 3x3 pixel filters are common, filters range from 1x1 to 7x7.
4.   Try more MaxPooling2D layers.
5.   Add some Dropout layers to combat overfitting.


*Note: You should not change the input or output layers, they are fixed by our problem definition*
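For option 5, it can help to see what a Dropout layer does during training. The NumPy sketch below is an illustration rather than Keras's actual implementation: it zeroes a random 25% of the values passing through and scales the survivors by 1 / (1 - rate) so the expected total stays the same. The function name `dropout_forward` is hypothetical.

```python
import numpy as np

def dropout_forward(activations, rate, rng):
    """Zero a random fraction `rate` of the values and rescale the rest."""
    keep_mask = rng.random(activations.shape) >= rate   # True where a value survives
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(seed=0)
activations = np.ones((100, 100))                        # toy layer output
dropped = dropout_forward(activations, rate=0.25, rng=rng)

zero_fraction = np.mean(dropped == 0.0)
print(f"fraction zeroed: {zero_fraction:.3f}")
print(f"mean activation after dropout: {dropped.mean():.3f}")
```

Because a different random subset is dropped on every batch, the network cannot rely on any single feature, which is why dropout tends to reduce overfitting. At prediction time the layer passes values through unchanged.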


In [12]:
# Set up the Neural Network
input_Size = 32 * 32 * 3    # images are 32 x 32 pixels with 3 color channels, or 3,072 values
output_Size = 10

NN = Sequential()
NN.add(Conv2D(8, kernel_size=(3, 3), activation='relu', input_shape=(32,32,3)))
NN.add(Conv2D(8, (3, 3), activation='relu'))
NN.add(MaxPooling2D(pool_size=(2, 2)))
#NN.add(Dropout(0.25))
NN.add(Conv2D(16, (3, 3), activation='relu'))
NN.add(Flatten())
NN.add(Dense(32, activation='relu'))
NN.add(Dense(output_Size, activation='softmax'))
print ("Neural Network Model created")
NN.summary()

# Compile neural network model
NN.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

Neural Network Model created
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_4 (Conv2D)            (None, 30, 30, 8)         224       
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 28, 28, 8)         584       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 14, 14, 8)         0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 12, 12, 16)        1168      
_________________________________________________________________
flatten_2 (Flatten)          (None, 2304)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 32)                73760     
_________________________________________________________________
dense_4 (Dense)              (None, 10)        
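The output shapes and parameter counts in this summary can be checked by hand. The sketch below assumes stride 1 and no padding for Conv2D (the Keras defaults) and uses hypothetical helper names. Each Conv2D layer has (kernel height × kernel width × input channels + 1) × filters parameters, and each Dense layer has (inputs + 1) × units; the "+ 1" is the bias.

```python
def conv2d_layer(shape, filters, kernel=3):
    """Return (output shape, parameter count) for a Conv2D layer."""
    h, w, channels = shape
    params = (kernel * kernel * channels + 1) * filters
    return (h - kernel + 1, w - kernel + 1, filters), params

def max_pool_layer(shape, pool=2):
    """Return (output shape, parameter count) for a MaxPooling2D layer."""
    h, w, channels = shape
    return (h // pool, w // pool, channels), 0

def dense_layer(n_inputs, units):
    """Return (output size, parameter count) for a Dense layer."""
    return units, (n_inputs + 1) * units

param_counts = []
shape = (32, 32, 3)                                   # input image
for filters in (8, 8):                                # first two Conv2D layers
    shape, params = conv2d_layer(shape, filters)
    param_counts.append(params)
shape, params = max_pool_layer(shape)
param_counts.append(params)
shape, params = conv2d_layer(shape, 16)               # third Conv2D layer
param_counts.append(params)

flat_size = shape[0] * shape[1] * shape[2]            # what Flatten produces
hidden, params = dense_layer(flat_size, 32)
param_counts.append(params)
outputs, params = dense_layer(hidden, 10)
param_counts.append(params)

print("flattened size:", flat_size)
print("parameters per layer:", param_counts)
print("total parameters:", sum(param_counts))
```

Running the same functions on a planned architecture before training is a quick way to see how a change affects model size; most of the parameters here sit in the first Dense layer.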

In [None]:
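from keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint

# Halve the learning rate when the training loss has not improved for 5 epochs
learning_rate_reduction = ReduceLROnPlateau(monitor='loss', 
                                            patience=5, 
                                            verbose=2, 
                                            factor=0.5,
                                            min_lr=0.000001)

# Stop training when the training loss has not improved for 20 epochs
early_stops = EarlyStopping(monitor='loss', 
                            min_delta=0, 
                            patience=20, 
                            verbose=2, 
                            mode='auto')

# Save the weights whenever the monitored value improves.
# Note: this callback only takes effect if it is added to the callbacks list when training.
checkpointer = ModelCheckpoint(filepath = 'cis3115_MNIST.{epoch:02d}-{accuracy:.6f}.hdf5',
                               verbose=2,
                               save_best_only=True, 
                               save_weights_only = True)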
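The reduce-on-plateau rule can be sketched in plain Python. This is a simplified model of the behavior, not the Keras source: it ignores the cooldown option, assumes a starting learning rate of 0.001 (the Keras default for Adam), and the function name `simulate_reduce_on_plateau` is invented for this example.

```python
def simulate_reduce_on_plateau(losses, initial_lr=0.001, factor=0.5,
                               patience=5, min_lr=1e-6, min_delta=1e-4):
    """Return the learning rate in effect after each epoch's loss is seen."""
    lr, best, wait = initial_lr, float("inf"), 0
    lr_per_epoch = []
    for loss in losses:
        if loss < best - min_delta:
            best, wait = loss, 0          # the loss improved: reset the counter
        else:
            wait += 1                     # one more epoch without improvement
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        lr_per_epoch.append(lr)
    return lr_per_epoch

# Three improving epochs, five stalled epochs, then another improvement
losses = [1.0, 0.9, 0.8] + [0.8] * 5 + [0.7]
schedule = simulate_reduce_on_plateau(losses)
print(schedule)
```

In this toy run the rate is halved on the fifth stalled epoch, the same kind of step reported by a "reducing learning rate" message in the training log.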


In [14]:
# Read the images from a local folder (github_folder is defined but not used below)
github_folder = 'https://raw.githubusercontent.com/CIS3115-Machine-Learning-Scholastica/CIS3115ML-Units7and8/master/petfinder-adoption/'
local_folder = './CIFAR10/'

data_folder = local_folder
# To read from a Kaggle folder instead, define kaggle_folder and uncomment the next line
#data_folder = kaggle_folder

train_folder =  data_folder + 'train_images_by_cat'
validate_folder = data_folder + 'validate_images_by_cat'
sample_submission = pd.read_csv(data_folder + 'sampleSubmission.csv')

print ("Reading training images from: " ,train_folder)
print ("Reading validation images from: " ,validate_folder)


Reading training images from:  ./CIFAR10/train_images_by_cat
Reading validation images from:  ./CIFAR10/validate_images_by_cat


In [15]:
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
        rotation_range=10,
        width_shift_range=0.1,
        height_shift_range=0.1,
        rescale=1./255,
        shear_range=0.1,
        zoom_range=0.1,
        horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
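Two of these settings are easy to reproduce by hand, which helps show what the generator does to each image. The NumPy sketch below uses a randomly generated stand-in image (an assumption, not a real CIFAR-10 picture) and mimics `rescale=1./255` and `horizontal_flip=True`. Keras applies the flip at random; here it is always applied.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
# Fake 32x32 RGB image with pixel values 0..255, standing in for a CIFAR-10 picture
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

rescaled = image.astype(np.float32) / 255.0   # same idea as rescale=1./255
flipped = rescaled[:, ::-1, :]                # mirror left to right: reverse the width axis

print("value range:", rescaled.min(), "to", rescaled.max())
print("shape after flip:", flipped.shape)
```

Rescaling puts every pixel in the range 0 to 1, which suits the network's small initial weights. The validation generator keeps only this step so that validation scores reflect unmodified images.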

In [19]:
#batch_size = 8
batch_size = 32

# this is a generator that will read pictures found in
# subfolders of train_folder, and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
        train_folder,  # this is the target directory
        target_size=(32, 32),  # all images will be resized to 32x32
        batch_size=batch_size,
        class_mode='categorical')  # one-hot labels, to match the categorical_crossentropy loss

# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
        validate_folder,
        target_size=(32, 32),
        batch_size=batch_size,
        class_mode='categorical')

Found 45481 images belonging to 10 classes.
Found 5612 images belonging to 10 classes.
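`flow_from_directory` expects one subfolder per class and assigns class indices in alphanumeric order of the subfolder names. The sketch below rebuilds that mapping with the standard library only. The folder names are an assumption (the usual CIFAR-10 class names); the real mapping for this notebook is available as `train_generator.class_indices`.

```python
import os
import tempfile

# Assumed subfolder names: the ten standard CIFAR-10 classes
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

with tempfile.TemporaryDirectory() as root:
    for name in reversed(class_names):            # creation order does not matter
        os.makedirs(os.path.join(root, name))
    subfolders = sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )
    # Index classes by their position in the sorted folder list
    class_indices = {name: index for index, name in enumerate(subfolders)}

print(class_indices)
```

Knowing this mapping matters when turning predicted class numbers back into label names.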


## Train the Neural Network

There are 45,481 training images and 5,612 validation images. 
- train_generator: The image generator above to use. Can rotate or shift images if desired
- steps_per_epoch=1000: Number of batches to process per epoch. With a batch size of 32, that is 32,000 of the 45,481 training images per epoch
- epochs=100: Maximum number of epochs. Should be large enough to train on every image multiple times
- learning_rate_reduction: Reduce the learning rate if the loss stops dropping
- early_stops: Stop training if the loss stops dropping
- validation_generator: The image generator to use for validation, with no rotations or shifting
- validation_steps=200: Number of batches to use during validation. With a batch size of 32, that draws slightly more than one full pass over the 5,612 validation images
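The relationship between steps, batch size, and image counts is simple arithmetic. The short sketch below uses the image counts reported by the generators and the batch size of 32; the variable names are chosen for this example.

```python
import math

n_train, n_validate = 45481, 5612      # image counts reported by flow_from_directory
batch_size = 32
steps_per_epoch, validation_steps = 1000, 200

# Images drawn per epoch with the settings used in this notebook
images_per_epoch = steps_per_epoch * batch_size
max_validation_images = validation_steps * batch_size   # upper bound per validation run

# Steps needed to see every image exactly once per epoch
full_pass_steps = math.ceil(n_train / batch_size)
full_validation_steps = math.ceil(n_validate / batch_size)

print("training images per epoch:", images_per_epoch)
print("steps for one full training pass:", full_pass_steps)
print("steps for one full validation pass:", full_validation_steps)
```

Setting `steps_per_epoch` and `validation_steps` to the full-pass values makes each epoch cover every image exactly once, at the cost of longer epochs.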
 
### Note: This is a large data set and training may take an hour or so... 

This is why it is worth starting with a small number of epochs, such as 10, before attempting the full 100.

In [22]:


# Train the model with the images in the folders
# (newer Keras versions deprecate fit_generator; NN.fit accepts generators directly there)
history = NN.fit_generator(
        train_generator,
        steps_per_epoch=1000,                    # Number of batches to process per epoch 
        epochs=100,                              # Maximum number of epochs
        callbacks=[learning_rate_reduction, early_stops],
        validation_data=validation_generator,
        validation_steps=200 )                  # Number of validation batches per epoch
#NN.save_weights('cifar_weights.h5')  # always save your weights after training or during training


Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100

Epoch 00019: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100

KeyboardInterrupt: 

## Plot the Training History

We store the performance during training in a variable named 'history'. The x-axis is the training time, measured in epochs.

- Accuracy: Accuracy of the predictions, hopefully this is increasing to near 1.0
- Loss: How close the output is to the desired output, this should decrease to near 0.0
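Both quantities can be computed by hand on a tiny example. The NumPy sketch below is an illustration with made-up predictions and hypothetical function names, not the Keras implementation: it computes categorical cross-entropy (the loss used when compiling the model) and accuracy for three samples, plus the loss of a uniform guess over 10 classes.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -log(probability assigned to the correct class)."""
    y_prob = np.clip(y_prob, eps, 1.0)            # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    """Fraction of samples whose most likely class is the correct one."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_prob, axis=1)))

# Three samples, three classes: one-hot labels and made-up softmax outputs
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3]])

loss = categorical_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)

# A network that guesses uniformly over 10 classes
uniform_loss = categorical_crossentropy(np.eye(10), np.full((10, 10), 0.1))

print(f"loss = {loss:.4f}, accuracy = {acc:.4f}")
print(f"loss of a uniform 10-class guess = {uniform_loss:.4f}")
```

The uniform-guess loss equals ln(10), about 2.3, so an untrained 10-class network typically starts near that value rather than near 1.0.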

In [None]:
# Evaluate the model on the validation images
# (evaluate_generator matches the fit_generator call above; newer Keras versions use NN.evaluate)
print ("Running final scoring on validation data")
score = NN.evaluate_generator(validation_generator, steps=200)
print ("The accuracy for this model is ", format(score[1], ",.2f"))

# Older Keras versions record accuracy under 'acc', newer ones under 'accuracy'
acc_key = 'acc' if 'acc' in history.history else 'accuracy'

# Plot the loss and accuracy curves for training and validation 
fig, ax = plt.subplots(2,1)

ax[0].plot(history.history[acc_key], color='b', label="Training accuracy")
ax[0].plot(history.history['val_' + acc_key], color='r', label="Validation accuracy")
ax[0].set_title("Accuracy")
legend = ax[0].legend(loc='best', shadow=True)

ax[1].plot(history.history['loss'], color='b', label="Training loss")
ax[1].plot(history.history['val_loss'], color='r', label="Validation loss")
ax[1].set_title("Loss")
legend = ax[1].legend(loc='best', shadow=True)
plt.ylim(0,1)  # applies to the loss plot, the most recently created axes

## Create the Submission for Kaggle

The following code generates a file named CIS3115_Submission.csv which you need to download to your local PC and then upload to [Kaggle's Digit Recognition competition](https://www.kaggle.com/c/digit-recognizer/submit).



In [None]:
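# X_submit_kaggle must hold the submission images, preprocessed the same way as the training images
# predict returns one probability per class; argmax picks the most likely class for each image
predictions = np.argmax(NN.predict(X_submit_kaggle, verbose=0), axis=1)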

submissions=pd.DataFrame({"ImageId": list(range(1,len(predictions)+1)), "Label": predictions})

submissions.to_csv("CIS3115_Submission.csv", index=False, header=True)
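To see what ends up in the submission file, here is a small self-contained sketch that turns made-up softmax outputs into class labels and writes them as CSV text. It uses the standard `csv` module instead of pandas so it runs without the trained network; the probabilities are invented, and the column names follow the code above.

```python
import csv
import io
import numpy as np

# Made-up softmax outputs for three images and three classes
probabilities = np.array([[0.1, 0.7, 0.2],
                          [0.6, 0.3, 0.1],
                          [0.2, 0.2, 0.6]])
labels = np.argmax(probabilities, axis=1)      # most likely class per image

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["ImageId", "Label"])          # same columns as the submission file
for image_id, label in enumerate(labels, start=1):
    writer.writerow([image_id, int(label)])    # image ids start at 1

csv_text = buffer.getvalue()
print(csv_text)
```

Each row pairs an image id with a single predicted class number, which is the shape the submission code above produces.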

## Kaggle Submission

Run the code above after training the network. It will go through the 28,000 submission images and generate a prediction for each. These are saved in a file named "CIS3115_Submission.csv".

**Colab Users:** The submission file is stored in the Colab files tied to this Colab notebook in the Google cloud. 
1. Open the left-side menu by clicking on the > icon near the top-left
2. Select the file tab
3. Hit the Refresh button and the file should be displayed in the list
4. Right-click on the file and choose "Download" and save it to a folder on your PC.

**Jupyter Notebook Users:** The submission file will be stored in the same folder as your Jupyter notebook file.

Once you have the file, return to the [Kaggle Digit Recognition challenge](https://www.kaggle.com/c/digit-recognizer) and select the Submit button. Follow the steps to upload your submission and see how it scores.

Record your initial submission score here: _ _ _ _ _ _ _ _ _ _ _ _


## Task 3: Report Best Score

Try finding a good mix of the following:

1. Number and size of convolution layers

1. Number and rate of dropout layers

1. Learning Rate reduction

Submit your best network to the [Kaggle Digit Recognition challenge](https://www.kaggle.com/c/digit-recognizer) and compare it to your original score

Base Kaggle scores here:  98.8%

Best Kaggle scores here:  _ _ _ _ _ _ _ _ _ _



# Wrapping Up

Remember to **share this sheet with your instructor** and submit a link to it in Blackboard.