# Overview

In this notebook we are going to introduce convolutional neural networks (CNNs) and experiment with some of their key hyper-parameters.

In [None]:
from keras.models import Sequential
from keras.layers import Dense, Conv2D, GlobalMaxPooling2D, MaxPooling2D
from keras.datasets import mnist
from keras.optimizers import SGD
from keras.utils import to_categorical
from keras.applications import VGG16
import numpy as np

# Data

We're going to load MNIST, the data we will be working with throughout this module, and grab its training and testing sets. Additionally, we are going to down-sample the training set to 10,000 images to keep the computation time manageable.

In [None]:
# Get the training and testing data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Down-sample the training data
rng = np.random.RandomState(17)
idx = rng.choice(np.arange(X_train.shape[0]), size=10000, replace=False)
X_train = X_train[idx, :]
y_train = y_train[idx]

# Row-Major vs Column-Major

In [None]:
# Native NumPy, row-major format
x = np.array([1, 2, 3, 4])
x.reshape((2, 2), order="C")

In [None]:
# Column-major format
x.reshape((2, 2), order="F")
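To make the difference a bit more concrete, here is a small sketch (not part of the original exercises) that separates two ideas: the `order` argument of `reshape` controls how the values are laid out in the new shape, while the strides of an array show how it is actually stored in memory. The array values and the `int64` dtype are just assumptions chosen for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6], dtype=np.int64)

# Row-major fill: consecutive values run along a row
row_major = x.reshape((2, 3), order="C")
# Column-major fill: consecutive values run down a column
col_major = x.reshape((2, 3), order="F")

print(row_major.tolist())  # [[1, 2, 3], [4, 5, 6]]
print(col_major.tolist())  # [[1, 3, 5], [2, 4, 6]]

# The same logical matrix can be stored in either memory layout.
# Strides are the number of bytes to step to move one position along each axis.
c_layout = np.ascontiguousarray(row_major)
f_layout = np.asfortranarray(row_major)

print(c_layout.strides)  # (24, 8): moving along a row is the small step
print(f_layout.strides)  # (8, 16): moving down a column is the small step
```

The small stride marks the direction in which neighbouring elements sit next to each other in memory, which is why the loop order matters for performance.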

In [None]:
# Compare the performance of two functions, knowing that NumPy arrays are row-major by default
def copy_col(x):
    n = len(x)
    arr = np.empty(shape=(n, n))
    for i in range(n):
        arr[:, i] = x
    
    return None

def copy_row(x):
    n = len(x)
    arr = np.empty(shape=(n, n))
    for i in range(n):
        arr[i, :] = x
    
    return None

In [None]:
# Let's test these functions and compare which is quicker -- this can be easily done using
# the %timeit line magic
# NOTE: each call allocates a 10000 x 10000 float64 array (about 800 MB)
x = rng.randn(10000)

In [None]:
%timeit copy_col(x)

In [None]:
%timeit copy_row(x)
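If you are running this outside of a notebook, the `%timeit` magic is not available. A rough equivalent, sketched below, uses the standard-library `timeit` module. The function names `fill_columns` and `fill_rows`, the smaller vector length of 500, and the repeat count of 20 are my own choices for illustration so that it runs quickly; the actual timings will depend on your machine.

```python
import timeit

import numpy as np


def fill_columns(x):
    # Write the vector into each column (strided writes for a row-major array)
    n = len(x)
    arr = np.empty(shape=(n, n))
    for i in range(n):
        arr[:, i] = x
    return arr


def fill_rows(x):
    # Write the vector into each row (contiguous writes for a row-major array)
    n = len(x)
    arr = np.empty(shape=(n, n))
    for i in range(n):
        arr[i, :] = x
    return arr


x_small = np.random.RandomState(17).randn(500)

# Total time in seconds for 20 calls of each function
t_col = timeit.timeit(lambda: fill_columns(x_small), number=20)
t_row = timeit.timeit(lambda: fill_rows(x_small), number=20)

print(f"column-wise: {t_col:.4f}s, row-wise: {t_row:.4f}s")
```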

# Data Pre-Processing

Before we begin training the convolutional network, we need to normalize our data.

Since we're working in Python, the data is already stored in row-major format (NumPy's default), so the first exercise will not make sense. However, if you would like, you can practice converting to column-major format using the techniques shown above.

## Exercise 1

Normalize the data using standard techniques.

**HINT**: It is valid with images to simply use the global mean and standard deviation.

In [None]:
# Add a trailing channel dimension to the training and testing data
train_samples, height, width = X_train.shape
X_train = X_train.reshape((train_samples, height, width, 1))

test_samples = X_test.shape[0]
X_test = X_test.reshape((test_samples, height, width, 1))

# We can compute the global mean and standard deviation using
# functions from NumPy
mu = X_train.mean()
sigma = X_train.std()

# Apply the normalization for each element of the matrices
X_train = (X_train - mu) / sigma
X_test = (X_test - mu) / sigma
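A quick sanity check after normalizing is to confirm that the training data now has a mean of roughly 0 and a standard deviation of roughly 1. The sketch below uses randomly generated stand-in images (an assumption, so that it runs without downloading MNIST) and hypothetical names such as `fake_train`; the same check applies to `X_train`.

```python
import numpy as np

rng_check = np.random.RandomState(17)

# Stand-in for image data: 100 "images" of 28 x 28 pixels with one channel
fake_train = rng_check.randint(0, 256, size=(100, 28, 28, 1)).astype("float64")
fake_test = rng_check.randint(0, 256, size=(20, 28, 28, 1)).astype("float64")

# Global statistics are computed on the training data only
mu_check = fake_train.mean()
sigma_check = fake_train.std()

fake_train_norm = (fake_train - mu_check) / sigma_check
# The test data re-uses the training statistics, so it will be close to,
# but not exactly, zero mean and unit standard deviation
fake_test_norm = (fake_test - mu_check) / sigma_check

print(fake_train_norm.mean(), fake_train_norm.std())
```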

In [None]:
# Finally we need to one-hot encode our label vectors. This can be done with
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
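If it is not obvious what one-hot encoding does, the sketch below reproduces the idea in plain NumPy by indexing into an identity matrix. This is only an illustration with made-up labels, not how Keras implements `to_categorical` internally.

```python
import numpy as np

# Three example labels drawn from three classes (made up for illustration)
labels = np.array([0, 2, 1])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes)[labels]
print(one_hot.tolist())  # [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

# Taking the argmax of each row recovers the original labels
recovered = one_hot.argmax(axis=1)
print(recovered.tolist())  # [0, 2, 1]
```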

# Convolutional Networks

For this portion of the lecture, we will practice working with CNNs in Keras.

## Exercise 2

For this exercise, I want you to use your knowledge of the Keras API to train a convolutional network. You will need the Conv2D and GlobalMaxPooling2D layers, and I want the network to have the following specifications:

- One convolutional layer with 32 filters, (3, 3) kernel size and padding = "same"
- One global pooling layer with default settings
- One dense layer with 64 nodes
- Standard SGD optimizer
- ReLU activation function for each layer
- Run the model for 3 epochs with a 25% validation split

In [None]:
_, height, width, channels = X_train.shape

# Define the model architecture
model = Sequential([
    Conv2D(filters=32, kernel_size=(3, 3), activation="relu", 
           padding="same", input_shape=(height, width, channels)),
    GlobalMaxPooling2D(),
    Dense(64, activation="relu"),
    Dense(y_train.shape[1], activation="softmax")
])

# Compile the model
model.compile(optimizer=SGD(), loss="categorical_crossentropy")

# Fit the model
model.fit(X_train, y_train, epochs=3, batch_size=128, validation_split=0.25)
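It can help to work out by hand how many trainable parameters a network like the one above has, and then compare against what `model.summary()` reports. The sketch below does the arithmetic in plain Python. The helper names are hypothetical, and it assumes a single input channel and 10 output classes, which is what we expect for MNIST.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter has kernel_h * kernel_w * in_channels weights plus one bias
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    # Each unit has one weight per input plus one bias
    return (inputs + 1) * units


conv = conv2d_params(3, 3, in_channels=1, filters=32)
# Global max pooling has no parameters; it reduces each of the 32 feature maps
# to a single number, so the dense layer sees 32 inputs
hidden = dense_params(inputs=32, units=64)
output = dense_params(inputs=64, units=10)

total = conv + hidden + output
print(conv, hidden, output, total)  # 320 2112 650 3082
```

Notice that the kernel size only changes the convolutional layer's count, so swapping in a larger kernel grows the model by less than you might expect.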

## Exercise 3

Using the kernel size that was assigned to your group, implement a neural network with the same parameters as before, except with the specified kernel size. Make sure that you are typing out the full model and not just copying-and-pasting what you have done previously. Report the performance on the test set and compare this to the previous model we trained.

**The value of this exercise is in coding this out yourself. That said, if you would like, you can copy-paste the code I provided above and try different values for the hyper-parameter.**

## Exercise 4

For the final exercise of this section, I want you to add one more convolutional layer (w/ 32 filters) and a default max-pooling layer between the two convolutional layers. Use a kernel size of 7 and keep everything else the same. Remember to type out the full model so that you're getting used to the API, and report the model's final performance on the test set; compare this to how we did with the single-layer convolutional model.

In [None]:
# I will provide the model -- you can copy the compile and fit steps since they don't change
model = Sequential([
    Conv2D(filters=32, kernel_size=(7, 7), activation="relu", 
           padding="same", input_shape=(height, width, channels)),
    MaxPooling2D(), # Added the pooling layer between the convolutional ones
    Conv2D(filters=32, kernel_size=(7, 7), activation="relu"), # Added the second convolutional layer
    GlobalMaxPooling2D(),
    Dense(64, activation="relu"),
    Dense(y_train.shape[1], activation="softmax")
])
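To see what happens to the image size as it moves through a model like this, the sketch below traces the spatial dimension by hand. The helper functions are hypothetical and assume 28 x 28 inputs, a stride of 1 for the convolutions, and the Keras defaults of `padding="valid"` for Conv2D and a (2, 2) pool for MaxPooling2D.

```python
import math


def conv_output_size(size, kernel, padding, stride=1):
    # "same" pads the input so the output only shrinks by the stride;
    # "valid" uses no padding, so the kernel must fit inside the input
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")


def pool_output_size(size, pool=2):
    # Default max pooling: stride equal to the pool size, no padding
    return (size - pool) // pool + 1


after_conv1 = conv_output_size(28, kernel=7, padding="same")
after_pool = pool_output_size(after_conv1)
after_conv2 = conv_output_size(after_pool, kernel=7, padding="valid")

print(after_conv1, after_pool, after_conv2)  # 28 14 8
```

The global pooling layer then collapses each remaining 8 x 8 feature map to a single value, so the dense layer still receives 32 inputs.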

# Transfer Learning

This is not an exercise; I just want to make you aware of how you can employ this technique, because it's a common practice for real-world problems.

In [None]:
# Get the VGG16 weights
vgg = VGG16(include_top=False, input_shape=(128, 128, 3), pooling="max")

# Freeze the weights in the VGG model so they are not updated during training
for layer in vgg.layers:
    layer.trainable = False
    
# Create a new model using the VGG16 weights
model = Sequential([
    vgg,
    Dense(128, activation="relu"),
    Dense(1, activation="sigmoid")
])

In [None]:
# Get a summary of the new model
model.summary()
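When you read the summary, the number to look for is the trainable parameter count, which should only cover the two new dense layers since the VGG16 layers are frozen. As a rough check, the arithmetic below assumes the pooled VGG16 output has 512 features (the channel count of its last convolutional block); the variable names are my own.

```python
vgg_features = 512  # assumed size of the pooled VGG16 output

# Weights plus biases for each of the two new dense layers
dense_128 = vgg_features * 128 + 128
dense_1 = 128 * 1 + 1

trainable_total = dense_128 + dense_1
print(dense_128, dense_1, trainable_total)  # 65664 129 65793
```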