# Convolutional Neural Networks

In this notebook, we will try to get a general overview of CNNs and what can be done with them.
We will use the MNIST dataset.
At the end of the notebook as an extra side, you can also try to implement something similar by loading the CIFAR-10 dataset.

Please note that this notebook is not an advanced implementation of CNNs. It is just for you to learn ho to implement from scratch a simple CNN, without using any pre-trained network.

## MNIST Dataset

The MNIST dataset is a large database of handwritten digits. It contains 60,000 training images and 10,000 testing images

### Data Preparation

** Import the packages that you may need.**

In [1]:
from __future__ import absolute_import, division, print_function
import numpy as np
import keras
from keras.datasets import cifar10, mnist
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten, Reshape
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras.utils import to_categorical
#from seansUtils.research import StatsCallback, ModelSummary
import pickle
from matplotlib import pyplot as plt
import seaborn as sns
plt.rcParams['figure.figsize'] = (15, 8)

%matplotlib inline

** Load the MNIST dataset.**

In [32]:
(train_images, train_labels),(test_images, test_labels) = mnist.load_data()
train_images.shape

(60000, 28, 28)

** Perform some data pre-processing on both input and labels. Hint: reshape the input with dimension (28,28,1)**

In [33]:
train_images = train_images.reshape(train_images.shape[0],28,28,1).astype('float32')
test_images = test_images.reshape(test_images.shape[0], 28, 28,1).astype('float32')
train_images /=255
test_images /=255
train_labels = to_categorical(train_labels,10)
test_labels = to_categorical(test_labels,10)

** Print the shape of the data and some sample to visualize them.**

In [34]:
# Print the Data
print('--- THE DATA ---')
print('train_images shape:', train_images.shape)
print(train_images.shape[0], 'train samples')
print(test_images.shape[0], 'test samples')
print(train_images.shape)
print(train_labels.shape)

--- THE DATA ---
train_images shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
(60000, 28, 28, 1)
(60000, 10)


## Vanilla CNN

This is the most basic CNN: you will have to build a convolutional neural network that is composed by 2 Convolutional layers and 2 Fully Connected layers. Use proper activation functions.

** Set the number of batches and epochs.**

In [35]:
batch_size = 64
epochs = 5


** Build the Vanilla CNN model. **

In [36]:
model_vanilla = Sequential()

# 1st Conv Layer
model_vanilla.add(Convolution2D(32, 3, 3, input_shape=(28, 28, 1)))
model_vanilla.add(Activation('relu'))

# 2nd Conv Layer
model_vanilla.add(Convolution2D(32, 3, 3))
model_vanilla.add(Activation('relu'))

# Fully Connected Layer
model_vanilla.add(Flatten())
model_vanilla.add(Dense(128))
model_vanilla.add(Activation('relu'))

# Prediction output Layer
model_vanilla.add(Dense(10))
model_vanilla.add(Activation('softmax'))

** Get a summary of the model. **

In [37]:
model_vanilla.summary()

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_8 (Conv2D)            (None, 9, 9, 32)          320       
_________________________________________________________________
activation_14 (Activation)   (None, 9, 9, 32)          0         
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 3, 3, 32)          9248      
_________________________________________________________________
activation_15 (Activation)   (None, 3, 3, 32)          0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 288)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 128)               36992     
_________________________________________________________________
activation_16 (Activation)   (None, 128)              

** Configure the model with an optimizer and a loss. **

In [38]:
model_vanilla.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

** Train the model. **

In [39]:

model_vanilla.fit(train_images, train_labels, epochs, batch_size)

Epoch 1/64
Epoch 2/64
Epoch 3/64
Epoch 4/64
Epoch 5/64
Epoch 6/64
Epoch 7/64
Epoch 8/64
Epoch 9/64
Epoch 10/64
Epoch 11/64
Epoch 12/64
Epoch 13/64
Epoch 14/64
Epoch 15/64
Epoch 16/64
Epoch 17/64
Epoch 18/64
Epoch 19/64
Epoch 20/64
Epoch 21/64
Epoch 22/64
Epoch 23/64
Epoch 24/64
Epoch 25/64
Epoch 26/64
Epoch 27/64
Epoch 28/64
Epoch 29/64
Epoch 30/64
Epoch 31/64
Epoch 32/64
Epoch 33/64
Epoch 34/64
Epoch 35/64
Epoch 36/64
Epoch 37/64
Epoch 38/64
Epoch 39/64
Epoch 40/64
Epoch 41/64
Epoch 42/64
Epoch 43/64
Epoch 44/64
Epoch 45/64
Epoch 46/64
Epoch 47/64
Epoch 48/64
Epoch 49/64
Epoch 50/64
Epoch 51/64
Epoch 52/64
Epoch 53/64
Epoch 54/64
Epoch 55/64
Epoch 56/64
Epoch 57/64
Epoch 58/64
Epoch 59/64
Epoch 60/64
Epoch 61/64
Epoch 62/64
Epoch 63/64
Epoch 64/64


<tensorflow.python.keras.callbacks.History at 0x7fc7d2e1ecd0>

### CNN with Max Pooling and Dropout

Let's implement the same CNN as above but plus Max Pooling and Dropout.

**Build the new network with max pooling and dropout. You should think a little bit where Max Pooling and Dropout should be inserted. **

In [41]:
model_vanilla_pooling = Sequential()

# 1st Convolutional Layer
model_vanilla_pooling.add(Convolution2D(32, 3, 3, input_shape=(28, 28, 1)))
model_vanilla_pooling.add(Activation('relu'))

# 2nd Convolutional Layer
model_vanilla_pooling.add(Convolution2D(32, 3, 3))
model_vanilla_pooling.add(Activation('relu'))

# Max Pooling
model_vanilla_pooling.add(MaxPooling2D(pool_size=(2,2)))
    
# Dropout
model_vanilla_pooling.add(Dropout(0.25))

# Fully Connected Layer
model_vanilla_pooling.add(Flatten())
model_vanilla_pooling.add(Dense(128))
model_vanilla_pooling.add(Activation('relu'))
    
# More Dropout
model_vanilla_pooling.add(Dropout(0.5))

# Fully Connected Layer for Prediction
model_vanilla_pooling.add(Dense(10))
model_vanilla_pooling.add(Activation('softmax'))

** Get a summary of the model. **

In [42]:
model_vanilla_pooling.summary()

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_12 (Conv2D)           (None, 9, 9, 32)          320       
_________________________________________________________________
activation_20 (Activation)   (None, 9, 9, 32)          0         
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 3, 3, 32)          9248      
_________________________________________________________________
activation_21 (Activation)   (None, 3, 3, 32)          0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 1, 1, 32)          0         
_________________________________________________________________
dropout (Dropout)            (None, 1, 1, 32)          0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 32)               

** Configure the network. **

In [43]:
model_vanilla_pooling.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

** Train the network. **

In [44]:

model_vanilla_pooling.fit(train_images, train_labels, epochs, batch_size)

Epoch 1/64
Epoch 2/64
Epoch 3/64
Epoch 4/64
Epoch 5/64
Epoch 6/64
Epoch 7/64
Epoch 8/64
Epoch 9/64
Epoch 10/64
Epoch 11/64
Epoch 12/64
Epoch 13/64
Epoch 14/64
Epoch 15/64
Epoch 16/64
Epoch 17/64
Epoch 18/64
Epoch 19/64
Epoch 20/64
Epoch 21/64
Epoch 22/64
Epoch 23/64
Epoch 24/64
Epoch 25/64
Epoch 26/64
Epoch 27/64
Epoch 28/64
Epoch 29/64
Epoch 30/64
Epoch 31/64
Epoch 32/64
Epoch 33/64
Epoch 34/64
Epoch 35/64
Epoch 36/64
Epoch 37/64
Epoch 38/64
Epoch 39/64
Epoch 40/64
Epoch 41/64
Epoch 42/64
Epoch 43/64
Epoch 44/64
Epoch 45/64
Epoch 46/64
Epoch 47/64
Epoch 48/64
Epoch 49/64
Epoch 50/64
Epoch 51/64
Epoch 52/64
Epoch 53/64
Epoch 54/64
Epoch 55/64
Epoch 56/64
Epoch 57/64
Epoch 58/64
Epoch 59/64
Epoch 60/64
Epoch 61/64
Epoch 62/64
Epoch 63/64
Epoch 64/64


<tensorflow.python.keras.callbacks.History at 0x7fc7ba2c3eb0>

** Evaluate the model on the test data. **

In [45]:
test_loss, test_acc = model_vanilla_pooling.evaluate(test_images, test_labels)



** Print the test accuracy. **

In [46]:
print(test_acc)

0.9391000270843506


** Evaluate the model on the test data. **

** Print the test accuracy. **

## Extra: More complex CNN with CIFAR-10

As an extra part, you can also load the CIFAR-10 dataset, perform a similar data pre-processing as the MNIST dataset and implement a proper CNN. In this case, the dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. Therefore you will need a network that is a little bit deeper, with 4 convolution layer. 

This part is not guided as the previous one, it's up to you to start from scratch and try out the implementation. However the procduere is pretty similar.

In [None]:
from keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
train_images = train_images.reshape(train_images.shape[0],28,28,1).astype('float32')
test_images = test_images.reshape(test_images.shape[0], 28, 28,1).astype('float32')
train_images /=255
test_images /=255
train_labels = to_categorical(train_labels,10)
test_labels = to_categorical(test_labels,10)

model_vanilla_pooling = Sequential()

# 1st Convolutional Layer
model_vanilla_pooling.add(Convolution2D(32, 3, 3, input_shape=(32, 32, 3)))
model_vanilla_pooling.add(Activation('relu'))

# 2nd Convolutional Layer
model_vanilla_pooling.add(Convolution2D(32, 3, 3))
model_vanilla_pooling.add(Activation('relu'))

# Max Pooling
model_vanilla_pooling.add(MaxPooling2D(pool_size=(2,2)))
    
# Dropout
model_vanilla_pooling.add(Dropout(0.25))

# Fully Connected Layer
model_vanilla_pooling.add(Flatten())
model_vanilla_pooling.add(Dense(128))
model_vanilla_pooling.add(Activation('relu'))
    
# More Dropout
model_vanilla_pooling.add(Dropout(0.5))

# Fully Connected Layer for Prediction
model_vanilla_pooling.add(Dense(10))
model_vanilla_pooling.add(Activation('softmax'))

model_vanilla_pooling.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model_vanilla_pooling.fit(train_images, train_labels, epochs, batch_size)



In [None]:
model_vanilla_pooling.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model_vanilla_pooling.fit(train_images, train_labels, epochs, batch_size)


In [None]:
test_loss, test_acc = model_vanilla_pooling.evaluate(test_images, test_labels)
print(test_acc)

In [56]:
y_train.shape

(50000, 1)

In [57]:
train_labels.shape

(60000, 10, 10, 10)