# Comparing Performance of Different Types of Neural Networks

**Time**
- Teaching: 15 minutes

**Questions**
- "How do vanilla neural networks compare to convolutional neural networks?"

* * * * *

### Convolutional Neural Networks
The neural networks we have created so far are known as *vanilla neural networks*, also called *fully-connected, feed-forward neural networks*.

These have many great use cases, but for problems in computer vision, we often use a different architecture called convolutional neural networks (CNNs).

We will review the details of how they work in the slides, but for now let's just compare their efficacy in image classification to vanilla neural nets by:
1. Loading a vanilla neural network we used in the last notebook
2. Building a convolutional neural network
3. Comparing the classification accuracy between the two models

### Imports

In [None]:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from tensorflow.keras.utils import to_categorical
from keras.models import load_model

import os
import matplotlib.pyplot as plt
import numpy as np

### Input data 

In [None]:
def get_mnist_data(subset=True):
    """
    Returns the MNIST dataset as a tuple:
    (x_train, y_train, x_val, y_val, x_test, y_test)
    
    When subset=True:
    Returns only a subset of the MNIST dataset.
    Especially important to use if you are on datahub and only have 1-2GB of memory.
    """
    
    if subset:
        N_TRAIN = 5000
        N_VALIDATION = 1000
        N_TEST = 1000
    else:
        N_TRAIN = 48000
        N_VALIDATION = 12000
        N_TEST = 10000
    
    (x_train_and_val, y_train_and_val), (x_test, y_test) = mnist.load_data()
    
    x_train = x_train_and_val[:N_TRAIN,:,:]
    y_train = y_train_and_val[:N_TRAIN]
    
    x_val = x_train_and_val[N_TRAIN: N_TRAIN + N_VALIDATION,:,:]
    y_val = y_train_and_val[N_TRAIN: N_TRAIN + N_VALIDATION]
    
    x_test = x_test[:N_TEST]
    y_test = y_test[:N_TEST]
    
    return x_train, y_train, x_val, y_val, x_test, y_test
    
    

In [None]:
x_train, y_train, x_val, y_val, x_test, y_test = get_mnist_data(subset=True)
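
The slicing inside `get_mnist_data` can be illustrated without downloading anything. The sketch below uses an array of image indices as a stand-in for the 60,000 MNIST training images (the names `indices`, `train_idx`, and `val_idx` are hypothetical) and checks that the training and validation slices do not overlap.

```python
import numpy as np

# Stand-in for the 60,000 MNIST training images: one index per image.
indices = np.arange(60000)

N_TRAIN = 5000
N_VALIDATION = 1000

# The first N_TRAIN images become the training set.
train_idx = indices[:N_TRAIN]
# The next N_VALIDATION images become the validation set.
val_idx = indices[N_TRAIN:N_TRAIN + N_VALIDATION]

print(train_idx[0], train_idx[-1])  # 0 4999
print(val_idx[0], val_idx[-1])      # 5000 5999

# No image appears in both partitions.
print(np.intersect1d(train_idx, val_idx).size)  # 0
```

With `subset=False`, the same slicing uses 48,000 + 12,000 images, which covers the full training set.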

### Transformation to one dimension

This is the same transformation we used in the last notebook to flatten our image pixels to one dimension for use in a vanilla neural network.

In [None]:
def one_dim_transform_data(xdata, ydata):
    """
    Transforms image data:
        1. Flattens pixel dimensions from 2 -> 1
        2. Scales pixel values between [0,1]
    Transforms target data (ydata):
        - Formats targets as one hot encoded columns
    """
    
    x = {}
    for name, partition in zip(["x_train", "x_val", "x_test"],xdata):
        flatten = partition.reshape((partition.shape[0], 28 * 28))
        scaled = flatten.astype('float32') / 255
        x[name] = scaled
    
    y = {}
    for name, partition in zip(["y_train", "y_val", "y_test"],ydata):
        y[name] = to_categorical(partition)
    
    return x['x_train'], y['y_train'], x['x_val'], y['y_val'], x['x_test'], y['y_test']

In [None]:
x_train_trans, y_train_trans, x_val_trans, y_val_trans, x_test_trans, y_test_trans = one_dim_transform_data([x_train, x_val, x_test],
                                                                                                    [y_train, y_val, y_test])
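
To see what the three steps in `one_dim_transform_data` do to the numbers, here is a minimal NumPy-only sketch on two made-up images. The arrays `images` and `labels` are hypothetical, and indexing an identity matrix is used as a stand-in for `to_categorical`, assuming exactly 10 classes.

```python
import numpy as np

# Two hypothetical 28x28 grayscale "images" with pixel values in [0, 255].
images = np.zeros((2, 28, 28), dtype=np.uint8)
images[0, 0, 0] = 255    # brightest possible pixel
images[1, 27, 27] = 51   # 51 / 255 = 0.2

labels = np.array([3, 7])  # hypothetical digit labels

# 1. Flatten each 28x28 image into a vector of 784 values.
flat = images.reshape((images.shape[0], 28 * 28))

# 2. Scale pixel values into [0, 1].
scaled = flat.astype('float32') / 255

# 3. One-hot encode the labels by picking rows of a 10x10 identity matrix
#    (an assumption-labeled stand-in for to_categorical with 10 classes).
one_hot = np.eye(10, dtype='float32')[labels]

print(scaled.shape)          # (2, 784)
print(scaled.max())          # 1.0
print(one_hot.shape)         # (2, 10)
print(np.argmax(one_hot[0])) # 3
```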

### Loading saved models

In [None]:
model_filename = os.path.join("..", "data", "third_nn")

vanilla_model = load_model(model_filename)

### Transformation back to two dimensions
While vanilla neural networks primarily handle one-dimensional input data, convolutional neural networks work well on multidimensional input data!

We will reshape our one-dimensional data back into two dimensions.

*Note* - We must also add a depth/channel dimension to our data. Color pictures have 3 channels (red, green, and blue), while our grayscale MNIST images have only 1.

In [None]:
def back_transform_2d(data):
    """
    Takes a list of flattened input pixel data.
    Reshapes pixel data from a single vector to two dimensions.
    """
    
    two_dimensional_data = []
    
    for d in data:
        # reshape to [index for image, pixel row, pixel column, channels]
        transformed = d.reshape(d.shape[0], 28, 28, 1)
        two_dimensional_data.append(transformed)
    
    return two_dimensional_data

In [None]:
x_train_2d, x_val_2d, x_test_2d = back_transform_2d([x_train_trans, x_val_trans, x_test_trans])

Let's confirm that we successfully reshaped our data back to 28x28 pixels.

In [None]:
x_train_trans.shape

In [None]:
x_train_2d.shape

Success!
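
Beyond checking shapes, it can help to confirm that reshaping only rearranges the data and never changes pixel values. The sketch below uses a small hypothetical batch (`flat`) in which every pixel has a distinct value, reshapes it the same way as `back_transform_2d`, and flattens it again.

```python
import numpy as np

# Hypothetical batch of 4 flattened images with a distinct value per pixel.
flat = np.arange(4 * 784, dtype='float32').reshape(4, 784)

# Reshape to [index for image, pixel row, pixel column, channels].
images_2d = flat.reshape(flat.shape[0], 28, 28, 1)

# Flattening again recovers the original array exactly.
round_trip = images_2d.reshape(images_2d.shape[0], 28 * 28)

print(images_2d.shape)                   # (4, 28, 28, 1)
print(np.array_equal(flat, round_trip))  # True

# Pixel 30 of the flat vector sits at row 1, column 2 (30 = 1 * 28 + 2).
print(bool(images_2d[0, 1, 2, 0] == flat[0, 30]))  # True
```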

### Convolutional neural network

Ignore the implementation details for now; just know that we are building a rather small convolutional neural network here to compare with our vanilla neural network.

In [None]:
convnet = Sequential()

convnet.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
convnet.add(MaxPooling2D((2, 2)))
convnet.add(Conv2D(64, (3, 3), activation='relu'))
convnet.add(MaxPooling2D((2, 2)))
convnet.add(Conv2D(64, (3, 3), activation='relu'))

convnet.add(Flatten())
convnet.add(Dense(64, activation='relu'))
convnet.add(Dense(10, activation='softmax'))

convnet.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

### Comparing architectures and number of parameters

In [None]:
vanilla_model.summary()

In [None]:
convnet.summary()

Notice any interesting differences between the two model architectures?
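
One way to read the `convnet.summary()` output is to reproduce its numbers by hand. The sketch below is plain arithmetic, assuming the Keras defaults used in the model above (no padding, stride 1 for `Conv2D`, and non-overlapping 2x2 windows for `MaxPooling2D`). The helper names are hypothetical. Because the architecture of the saved vanilla model is not shown here, the last line uses a made-up 512-unit dense layer on the flattened image purely for scale.

```python
def conv_output_size(size, kernel):
    """Side length after an unpadded convolution with stride 1."""
    return size - kernel + 1

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

side = 28
side = conv_output_size(side, 3)   # 26 after Conv2D(32, (3, 3))
p1 = conv_params(3, 1, 32)         # 320
side = side // 2                   # 13 after MaxPooling2D((2, 2))
side = conv_output_size(side, 3)   # 11 after Conv2D(64, (3, 3))
p2 = conv_params(3, 32, 64)        # 18496
side = side // 2                   # 5 after MaxPooling2D((2, 2))
side = conv_output_size(side, 3)   # 3 after Conv2D(64, (3, 3))
p3 = conv_params(3, 64, 64)        # 36928

flattened = side * side * 64       # 576 values after Flatten()
p4 = dense_params(flattened, 64)   # 36928
p5 = dense_params(64, 10)          # 650

total = p1 + p2 + p3 + p4 + p5
print(total)                       # 93322

# Hypothetical comparison: one Dense(512) layer on a flattened 28x28 image.
print(dense_params(28 * 28, 512))  # 401920
```

Convolutional layers reuse the same small kernels across the whole image, which is why their parameter counts stay modest compared with a dense layer connected to every pixel.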

### Convolutional Neural Network Training

In [None]:
convnet_history = convnet.fit(x_train_2d,
                      y_train_trans, 
                      epochs=10,
                      batch_size=64,
                      validation_data=(x_val_2d, y_val_trans))

### Accuracy over epochs

In [None]:
def plot_epoch_accuracy(history_dict):
    """
    Plots the training and validation accuracy of a neural network.
    """
    
    acc = history_dict['accuracy']
    val_acc = history_dict['val_accuracy']
    epochs = range(1, len(acc) + 1)
    plt.plot(epochs, acc, color = 'navy', alpha = 0.8, label='Training Accuracy')
    plt.plot(epochs, val_acc, color = 'green', label='Validation Accuracy')
    plt.title('Training and Validation Accuracy')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.legend()
    return plt.show()

In [None]:
plot_epoch_accuracy(convnet_history.history)
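
The same dictionary that feeds the plot can also be inspected directly. The sketch below uses a hypothetical `history_dict` with made-up numbers (not results from this notebook) to find the epoch with the best validation accuracy and the gap between training and validation accuracy at the end of training.

```python
# Hypothetical history dictionary; the numbers are made up for illustration.
history_dict = {
    'accuracy':     [0.80, 0.93, 0.96, 0.98, 0.99],
    'val_accuracy': [0.90, 0.94, 0.96, 0.95, 0.95],
}

val_acc = history_dict['val_accuracy']

# Epochs are numbered from 1, matching the x-axis of the plot.
best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i]) + 1

# A growing gap between training and validation accuracy hints at overfitting.
final_gap = round(history_dict['accuracy'][-1] - val_acc[-1], 2)

print(best_epoch)  # 3
print(final_gap)   # 0.04
```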

### Accuracy on the test data
So how well does our CNN perform on the test data?

In [None]:
def get_model_accuracy(model, x_test, y_test):
    """
    Takes a model and a test set of data.
    Returns the accuracy.
    """
    
    score = model.evaluate(x_test, y_test, verbose=0)
    
    accuracy = round(score[1]*100, 1)
    
    return accuracy
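
`get_model_accuracy` relies on `model.evaluate`, which returns the loss followed by the compiled metrics, so `score[1]` is the accuracy when the model was compiled with `metrics=['accuracy']`. As a rough sketch of what that number means, the same percentage can be computed by hand from predicted probabilities and one-hot targets. The arrays below are hypothetical.

```python
import numpy as np

# Hypothetical softmax outputs for 4 images and 3 classes.
probabilities = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])

# One-hot targets: the true classes are 0, 1, 2, 1.
targets = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
])

predicted = np.argmax(probabilities, axis=1)  # class with highest probability
actual = np.argmax(targets, axis=1)           # position of the 1 in each row

# Fraction of matching predictions, as a percentage with one decimal place.
accuracy = round(float(np.mean(predicted == actual)) * 100, 1)
print(accuracy)  # 75.0
```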



In [None]:
vanilla_accuracy = get_model_accuracy(vanilla_model, x_test_trans, y_test_trans)
convnet_accuracy = get_model_accuracy(convnet,x_test_2d, y_test_trans)

In [None]:
print(f"Classification accuracy results: \n\nVanilla Neural Network: {vanilla_accuracy}%\nConvolutional Neural Network: {convnet_accuracy}%")

In [None]:
def plot_wrong_predictions(model, x_test, y_test, title = ""):
    """
    Plots up to 16 incorrectly predicted images.
    """
    
    # Back transform images
    x_images = x_test.reshape(x_test.shape[0], 28, 28)
    
    # Format predictions and targets
    predictions = model.predict(x_test)
    predicted = np.argmax(predictions, axis=1)
    target = np.argmax(y_test, axis = 1)
    
    # Get wrong indices
    wrong_indices = np.where(predicted != target)[0]
    
    fig, axes = plt.subplots(4,4, figsize = (30,30))
    fig.suptitle(title, fontsize=30)
    
    axes = axes.ravel()
    
    for ax, index in zip(axes, wrong_indices[:16]):
        ax.imshow(x_images[index], cmap=plt.cm.binary, interpolation='nearest')
        ax.set_title(f"Predicted {predicted[index]}, Actual is {target[index]}", size = 25)
        ax.axis('off')
    
    return plt.show()
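
The key step in `plot_wrong_predictions` is locating the misclassified images. Here is a minimal sketch of that step on hypothetical label arrays, showing how `np.where` returns the positions where the prediction and the target disagree.

```python
import numpy as np

# Hypothetical predicted and true digit labels for 7 images.
predicted = np.array([7, 2, 1, 0, 4, 1, 9])
target = np.array([7, 2, 1, 0, 9, 1, 4])

# np.where on a boolean array returns a tuple with one index array per
# dimension; [0] takes the indices along the first (and only) dimension.
wrong_indices = np.where(predicted != target)[0]
print(wrong_indices)  # [4 6]

# Slicing past the end is safe, so this also works when there are fewer
# than 16 mistakes.
first_16 = wrong_indices[:16]
print(len(first_16))  # 2
```

Because `zip` stops at the shorter of its inputs, any unused subplots in the 4x4 grid are simply left untouched when a model makes fewer than 16 mistakes.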

### Comparing wrong predictions

Let's visualize the wrong predictions of these two models and see if we can get some insight into how reasonable these mistakes are.

In [None]:
plot_wrong_predictions(vanilla_model, x_test_trans, y_test_trans, title = "Wrong Predictions in Vanilla Neural Networks")

In [None]:
plot_wrong_predictions(convnet, x_test_2d, y_test_trans, title = "Wrong Predictions in Convolutional Neural Networks")