# Convolutional Neural Nets

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/olaiya/MLTutorialNotebooks/blob/master/cnn.ipynb)

# Contents

- [1. Constructing a CNN with Tensorflow](#1.)
    - [1.1 Import required libraries](#1.1)
    - [1.2 Download MNIST dataset](#1.2)
    - [1.3 Build, train and run a simple CNN](#1.3)
    - [1.4 Loss and Accuracy](#1.4)
    - [1.5 Image identification](#1.5)
    - [1.6 Filter and feature maps](#1.6)
- [2. CNN Classification using the CIFAR10 Dataset](#2.)
    - [2.1 Download CIFAR10 dataset](#2.1)
    - [2.2 Constructing a CNN for identifying CIFAR10 colour images](#2.2)
    - [2.3 Evaluating the accuracy](#2.3)
    - [2.4 More layers, less parameters](#2.4)
    - [2.5 Evaluating the test sample](#2.5)
    - [2.6 Generated filters](#2.6)
- [3. Exercise: classification of the fashion MNIST dataset](#3.)

## 1. Constructing a CNN with Tensorflow <a name="1."></a>

Let's look at constructing a convolutional neural net (CNN). We can use the same MNIST handwritten-digit dataset we used with our MLP and classify the images using a CNN.

### 1.1 Import required libraries <a name="1.1"></a>

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
from matplotlib import cm

#Want to use version of Tensorflow > 2.0
print('Using Tensorflow version %s' % tf.__version__)

### 1.2 Download MNIST dataset <a name="1.2"></a>

In [None]:
#Load the dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Intensity of pixels ranges from 0-255. We need to scale the values
x_train, x_test = x_train / 255.0, x_test / 255.0


In [None]:
# Reshape the arrays to 4 dimensions so that they work with the tf.keras.layers.Conv2D API
# The 4 dimensions are: number_of_images (events), image_height, image_width, num_of_channels
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)

#Split into training and validation samples
x_train, x_val = x_train[0:55000], x_train[55000:]
y_train, y_val = y_train[0:55000], y_train[55000:]

### 1.3 Build, train and run a simple CNN <a name="1.3"></a>

Build a simple CNN. The convolutional layer applies 14 different filters (kernels) of size 3x3 to the images with no padding. We then apply 3x3 max pooling to the output of the convolutional layer. The result is flattened and fed to an MLP with a hidden layer of 128 nodes, followed by an output layer of 10 nodes. The output layer produces raw scores (logits); the softmax activation is applied inside the loss function rather than in the model.

<img src="images/convLayers.png" alt="mlp" width="800"/>

<img src="images/filter.png" alt="mlp" width="800"/>
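
The layer sizes described above can be checked by hand. The following is a minimal sketch of that arithmetic, assuming no padding, a convolution stride of 1, and a pooling stride equal to the pool size (the Keras defaults). The helper functions are illustrative names and not part of Keras.

```python
# Sketch: output sizes and parameter counts for the simple MNIST CNN.
# The helper names below are hypothetical, introduced only for illustration.

def conv_output_size(size, kernel, stride=1):
    """Spatial size after a convolution with no ('valid') padding."""
    return (size - kernel) // stride + 1

def pool_output_size(size, pool):
    """Spatial size after max pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

def conv_params(kernel, in_channels, filters):
    """kernel*kernel*in_channels weights per filter, plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_inputs, n_outputs):
    """One weight per input-output pair, plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

conv_size = conv_output_size(28, 3)          # 28x28 image, 3x3 filter
pool_size = pool_output_size(conv_size, 3)   # 3x3 max pooling
flat_size = pool_size * pool_size * 14       # 14 feature maps, flattened

total_params = (conv_params(3, 1, 14)
                + dense_params(flat_size, 128)
                + dense_params(128, 10))

print("conv output: %ix%i, pooled: %ix%i, flattened: %i, parameters: %i"
      % (conv_size, conv_size, pool_size, pool_size, flat_size, total_params))
```

If the model is built as in the cell below, these numbers should agree with the output of model.summary().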

In [None]:
# Creating a Sequential Model and adding the layers

model = tf.keras.models.Sequential([
    #tf.keras.layers.Conv2D(number_of_filters, kernel_size=(filter size), input_shape=(height, width, num_of_channels))
    tf.keras.layers.Conv2D(14, kernel_size=(3,3), input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPool2D(pool_size=(3, 3)),
    tf.keras.layers.Flatten(), # Flattening the 2D arrays for fully connected layers
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    #Don't need softmax here, softmax activation included in  the loss function
    #tf.keras.layers.Dense(10,activation=tf.nn.softmax)
    tf.keras.layers.Dense(10)
])

model.summary()

In [None]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

num_epochs = 20
batchSize = 1000

history = model.fit(x_train, y_train, 
          validation_data=(x_val, y_val),
          batch_size=batchSize,
          epochs=num_epochs)
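
As a rough guide to what the batch size and number of epochs mean for training, the sketch below counts the weight updates performed. It assumes the 55,000 training images left after the validation split and the settings used above.

```python
import math

n_train = 55000      # training images left after the validation split
batch_size = 1000    # images per weight update
num_epochs = 20      # passes over the training sample

# One weight update per batch; a final partial batch still counts as an update
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * num_epochs

print("updates per epoch: %i, total updates: %i" % (steps_per_epoch, total_updates))
```

A smaller batch size gives more updates per epoch at the cost of a noisier gradient estimate, which is one of the settings you could vary.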


### 1.4 Loss and Accuracy <a name="1.4"></a>

Let's look at the loss curves

In [None]:
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, color='red', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

In [None]:
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')

#Evaluate on the validation sample
val_loss_final, val_acc_final = model.evaluate(x_val,  y_val, verbose=2)


Accuracy is high. Can you improve on it?

### 1.5 Image identification <a name="1.5"></a>

Let's look at the test sample and see if we can correctly identify the images

In [None]:
y_test_prediction = model.predict(x_test)
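
Because the output layer has no activation, the predictions are raw scores (logits) rather than probabilities. The sketch below shows one way to convert logits to probabilities with a softmax written in NumPy. The logits are made-up numbers standing in for the output of model.predict, and the softmax helper is an illustrative function, not a Keras call.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Made-up logits for two images and three classes (assumption for illustration)
toy_logits = np.array([[2.0, 1.0, 0.1],
                       [0.5, 0.5, 3.0]])

toy_probabilities = softmax(toy_logits)
toy_classes = toy_probabilities.argmax(axis=1)

print(toy_probabilities.sum(axis=1))  # each row sums to 1
print(toy_classes)                    # index of the most probable class per image
```

Softmax preserves the ordering of the scores, so taking the largest logit gives the same predicted class as taking the largest probability.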

Let's look at the images that were incorrectly identified, using the predictions from the .predict method above. Execute the cell below to show some of them. Each image is paired with a bar chart of the 10 network outputs, with the true label in green and the prediction in red. You can set only_bad=False to also show the images that were correctly identified.

In [None]:
nr,nc=10,5
only_bad=True
plt.figure(figsize=(4*nc,2*nr))

image_index,plot_index=-1,0
npass,ntot=0,0
while plot_index < nc*nr and image_index < len(x_test) - 1:
    ntot += 1
    image_index += 1
    label = y_test[image_index]
    prediction_outputs = list(y_test_prediction[image_index])
    prediction = prediction_outputs.index(max(prediction_outputs))
    if prediction == label:
        npass += 1
        if only_bad: continue
    else:
        print("image #%i: label = %i, prediction = %i" % (image_index, label, prediction))
    plot_index += 1
    plt.subplot(nr,2*nc,plot_index)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_test[image_index].reshape(28, 28),cmap='Greys')
    plt.xlabel("%i" % label)

    plot_index += 1
    plt.subplot(nr,2*nc,plot_index)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    col=['b']*10
    col[label]='g'
    col[prediction]='r'
    plt.bar(range(10),prediction_outputs,width=1.0,color=col)
    plt.xlabel("%i" % prediction)

print ("failed = %d/%d, accuracy = %.4f%%" % (ntot-npass,ntot,float(npass)/float(ntot)*100.0))

### 1.6 Filters and Feature maps <a name="1.6"></a>

Let's look at the filters the training produced

In [None]:
plt.figure(figsize=(10,10)) 
#Iterate through all layers
for layer in model.layers:
    if 'conv' in layer.name:
        weights, biases = layer.get_weights()
        print(layer.name, weights.shape)
        
        #scale filters        
        f_min, f_max = weights.min(), weights.max()
        filters = (weights - f_min) /(f_max - f_min)
        #iterate through all filters
        for i in range(filters.shape[3]):
            plt.subplot(2,7,i+1)
            filt = filters[:,:,:,i]
            #Select channel
            #print(filt[:,:,0].shape)
            filt_chan = filt[:,:,0]
            plt.imshow(filt_chan,cmap='gray_r',vmin=0)
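
To make the link between a filter and its feature map concrete, here is a minimal NumPy sketch of what a single filter and a max pooling step do to a single-channel image. It assumes no padding and a stride of 1, and it leaves out the bias term. The toy image, the hand-written edge filter and the function names are all illustrative and are not taken from the trained model.

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kh, col:col + kw]
            feature_map[row, col] = np.sum(patch * kernel)
    return feature_map

def max_pool(feature_map, pool):
    """Keep the maximum of each non-overlapping pool x pool block."""
    out_h = feature_map.shape[0] // pool
    out_w = feature_map.shape[1] // pool
    trimmed = feature_map[:out_h * pool, :out_w * pool]
    return trimmed.reshape(out_h, pool, out_w, pool).max(axis=(1, 3))

# Toy 6x6 image: left half dark (0), right half bright (1)
toy_image = np.zeros((6, 6))
toy_image[:, 3:] = 1.0

# Hand-written filter that responds to dark-to-bright vertical edges
edge_kernel = np.array([[-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0]])

toy_feature_map = apply_filter(toy_image, edge_kernel)
toy_pooled = max_pool(toy_feature_map, 2)

print(toy_feature_map)
print(toy_pooled)
```

The feature map is large only where the filter overlaps the edge. The trained filters plotted above play the same role, except that their values are learned rather than written by hand.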

Let's look at the feature maps produced by the filters. Set up a model to visualise the feature maps, then take an image from the test sample to see what its feature maps look like.

In [None]:
#Collect the output of every layer so that the outputs line up with the layer names used when plotting
successive_outputs = [layer.output for layer in model.layers]

visualization_model = tf.keras.models.Model(inputs = model.input, outputs = successive_outputs)

#Use one of the test images

img = x_test[0]

#Display image we are going to run through our Convolutional Neural Net
plt.imshow(img.reshape(28, 28),cmap='Greys')

Pass the image through the model and view the feature maps

In [None]:
#Reshape image so it fits the input format
img= img[np.newaxis, :,:,: ]

# Let's run our image through our network, thus obtaining all
# intermediate representations for this image.
successive_feature_maps = visualization_model.predict(img)

# These are the names of the layers, so can have them as part of our plot
layer_names = [layer.name for layer in model.layers]
# -----------------------------------------------------------------------
# Now let's display our representations
# -----------------------------------------------------------------------
for layer_name, feature_map in zip(layer_names, successive_feature_maps):
    print(feature_map.shape)
    if len(feature_map.shape) == 4:
    
        #-------------------------------------------
        # Just do this for the conv / maxpool layers, not the fully-connected layers
        #-------------------------------------------
        n_features = feature_map.shape[-1]  # number of features in the feature map
        size_y       = feature_map.shape[1]  # feature map shape (1, size_y, size_x, n_features)
        size_x       = feature_map.shape[2]  # feature map shape (1, size_y, size_x, n_features)
    
        # We will tile our images in this matrix
        display_grid = np.zeros((size_y, size_x * n_features))
    
        #-------------------------------------------------
        # Postprocess the feature to be visually palatable
        #-------------------------------------------------
        for i in range(n_features):
            x  = feature_map[0, :, :, i]
            x -= x.mean()
            std = x.std()
            if std > 0:  # avoid dividing by zero for a constant feature map
                x /= std
            x *=  64
            x += 128
            x  = np.clip(x, 0, 255).astype('uint8')
            display_grid[:, i * size_x : (i + 1) * size_x] = x # Tile each feature map into a horizontal grid

        #-----------------
        # Display the grid
        #-----------------

        scale = 20. / n_features
        plt.figure( figsize=(scale * n_features, scale) )
        plt.title ( layer_name )
        plt.grid  ( False )
        plt.imshow( display_grid, aspect='auto', cmap='viridis' )

## 2. CNN Classification using the CIFAR10 Dataset <a name="2."></a>

Let's classify more images by taking the CIFAR10 dataset. The CIFAR10 dataset contains 60,000 colour images in 10 classes, with 6,000 images in each class. The classes are mutually exclusive, so there is no overlap between them. The dataset is divided into 50,000 training images and 10,000 testing images. 

### 2.1 Downloading the CIFAR10 Dataset <a name="2.1"></a>

In [None]:
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()

# Scale pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0


In [None]:
#Split training sample into training and validation samples
train_images, val_images = train_images[0:45000], train_images[45000:]
train_labels, val_labels = train_labels[0:45000], train_labels[45000:]

Let's take a look at some of the data

In [None]:
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i])
    # The CIFAR labels happen to be arrays, 
    # which is why you need the extra index
    plt.xlabel(class_names[train_labels[i][0]])


### 2.2 Constructing a CNN for identifying CIFAR10 colour images <a name="2.2"></a>

Our previous images were greyscale, but now we are working with colour images. This is straightforward: we feed the images into our CNN together with their colour channels. The images are 32x32 pixels in size and have three colour channels (red, green, blue), so the shape of each image is (32, 32, 3).

Let's start with a simple CNN. A single convolutional layer will have 32 output channels using a 3x3 filter (kernel) with ReLU activation, and its output is reduced in size using 2x2 max pooling. We then flatten the output and feed it to an MLP with a hidden layer of 64 neurons and an output layer of 10 neurons.

In [None]:
model = tf.keras.models.Sequential([
    #tf.keras.layers.Conv2D(number_of_filters, kernel_size=(filter size), input_shape=(height, width, num_of_channels))
    tf.keras.layers.Conv2D(32, kernel_size=(3,3), activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(), # Flattening the 2D arrays for fully connected layers
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10)
])

model.summary()

Compile and train the model

In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(val_images, val_labels))


Look at the loss

In [None]:
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, color='red', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

### 2.3 Evaluate the accuracy <a name="2.3"></a>

In [None]:
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')

test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)


### 2.4 More layers, less parameters  <a name="2.4"></a>

Let's demonstrate the power of adding convolutional layers. With no padding and with max pooling, each convolutional stage produces smaller feature maps than the previous one. Remember that it is the final layer that you flatten and feed into the MLP, so the smaller your final convolutional layer, the fewer parameters you need in your MLP. Of course, the filters that generate the convolutional layers require parameters too, but these are typically more than compensated for by the reduced size of the MLP. So the beauty of CNNs is that you can reduce the number of parameters in your model without necessarily reducing the performance. You can even improve the performance!

Let's build a more sophisticated neural net with more convolutional layers. The first convolutional layer will have 32 output channels using a 3x3 filter (kernel), reduced in size with a 2x2 max pooling filter. The second layer will have 64 output channels using a 3x3 filter, also reduced in size with a 2x2 max pooling filter, and the final convolutional layer will have an output of 64 channels using a 3x3 filter. We can then feed the output to an MLP with a layer containing 64 neurons and an output of 10 neurons. Build and run this CNN in the cells below. Use the comments as instructions. Compare the number of parameters of this CNN with the previous CNN, and compare their accuracy profiles.

In [None]:
model = tf.keras.models.Sequential([
    #tf.keras.layers.Conv2D(number_of_filters, kernel_size=(filter size), input_shape=(height, width, num_of_channels))
    tf.keras.layers.Conv2D(32, kernel_size=(3,3), activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(), # Flattening the 2D arrays for fully connected layers
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10)
])

model.summary()
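
The parameter counts of the two CIFAR10 models can also be worked out by hand, which shows where the saving comes from. This is a sketch under the same assumptions as the code cells (3x3 filters, no padding, 2x2 max pooling); the helper functions are illustrative names and not part of Keras.

```python
# Hypothetical helpers for counting parameters and tracking feature map sizes

def conv_params(kernel, in_channels, filters):
    """kernel*kernel*in_channels weights per filter, plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_inputs, n_outputs):
    """One weight per input-output pair, plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

def conv_out(size, kernel=3):
    """Spatial size after a convolution with no padding and stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool

# Model with one convolutional layer (the code cell in section 2.2)
s = pool_out(conv_out(32))                    # 32 -> 30 -> 15
shallow_total = (conv_params(3, 3, 32)
                 + dense_params(s * s * 32, 64)
                 + dense_params(64, 10))

# Model with three convolutional layers (this section)
s1 = pool_out(conv_out(32))                   # 32 -> 30 -> 15
s2 = pool_out(conv_out(s1))                   # 15 -> 13 -> 6
s3 = conv_out(s2)                             # 6 -> 4
deep_total = (conv_params(3, 3, 32)
              + conv_params(3, 32, 64)
              + conv_params(3, 64, 64)
              + dense_params(s3 * s3 * 64, 64)
              + dense_params(64, 10))

print("one conv layer: %i parameters" % shallow_total)
print("three conv layers: %i parameters" % deep_total)
```

Almost all of the parameters in the one-layer model sit in the first dense layer, because it is fed a 15x15x32 feature map. The deeper model feeds the dense layer a 4x4x64 feature map instead.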

In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=20, 
                    validation_data=(val_images, val_labels))

In [None]:
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(loss) + 1)

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, color='red', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

In [None]:
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.3, 1])
plt.legend(loc='lower right')

test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)


### 2.5 Evaluating the test sample  <a name="2.5"></a>

Let's look at the test sample and see if we can correctly identify the images

In [None]:
test_images_prediction = model.predict(test_images)
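
One way to see which classes the network struggles with is to compute the accuracy per class from the predicted logits. The sketch below uses small made-up arrays in place of test_images_prediction and test_labels. The function name is illustrative, and the labels are flattened first because the CIFAR10 labels have shape (N, 1).

```python
import numpy as np

def per_class_accuracy(logits, labels, n_classes):
    """Fraction of correctly classified images for each class present in labels."""
    predicted = np.argmax(logits, axis=1)
    true = np.asarray(labels).reshape(-1)   # flatten (N, 1) labels to (N,)
    accuracy = {}
    for c in range(n_classes):
        mask = true == c
        if mask.any():
            accuracy[c] = float((predicted[mask] == c).mean())
    return accuracy

# Made-up logits for four images and three classes (assumption for illustration)
demo_logits = np.array([[3.0, 0.1, 0.2],
                        [0.2, 2.5, 0.1],
                        [1.5, 0.3, 0.2],
                        [0.1, 0.2, 4.0]])
demo_labels = np.array([[0], [1], [1], [2]])

demo_accuracy = per_class_accuracy(demo_logits, demo_labels, 3)
print(demo_accuracy)
```

With the real arrays, the call would take test_images_prediction, test_labels and 10 as arguments, and the class indices can be mapped to names through class_names.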

### 2.6 Generated filters  <a name="2.6"></a>

Let's look at some of the filters that were generated from the training

In [None]:
import math 

figureIndex = 1
#Iterate through all layers
for layer in model.layers:
    if 'conv' in layer.name:
        #Open one figure per convolutional layer, so other layers do not create empty figures
        plt.figure(figureIndex, figsize=(10,10))
        weights, biases = layer.get_weights()
        print(layer.name, weights.shape)
        
        #scale filters        
        f_min, f_max = weights.min(), weights.max()
        filters = (weights - f_min) /(f_max - f_min)
        
        #iterate through all filters
        num_filters = filters.shape[3]
        nc = 8 
        nr = math.ceil(num_filters/nc)
        #Only plotting the first input channel of each filter
        for i in range(num_filters):
            plt.subplot(nr,nc,i+1)
            filt = filters[:,:,:,i]
            #Select channel
            #print(filt[:,:,0].shape)
            filt_chan = filt[:,:,0]
            plt.imshow(filt_chan,cmap='gray_r',vmin=0)
    figureIndex += 1   


Let's look at the feature maps produced by the filters. Set up a model to visualise the feature maps, then take an image from the test sample to see what its feature maps look like.

In [None]:
#Collect the output of every layer so that the outputs line up with the layer names used when plotting
successive_outputs = [layer.output for layer in model.layers]

visualization_model = tf.keras.models.Model(inputs = model.input, outputs = successive_outputs)

#Use one of the test images

img = test_images[0]

#Display image we are going to run through our Convolutional Neural Net
plt.imshow(img)

In [None]:
#Reshape image so it fits the input format
img= img[np.newaxis, :,:,: ]

# Let's run our image through our network, thus obtaining all
# intermediate representations for this image.
successive_feature_maps = visualization_model.predict(img)

# These are the names of the layers, so can have them as part of our plot
layer_names = [layer.name for layer in model.layers]
# -----------------------------------------------------------------------
# Now let's display our representations
# -----------------------------------------------------------------------
for layer_name, feature_map in zip(layer_names, successive_feature_maps):
    print(feature_map.shape)
    if len(feature_map.shape) == 4:
    
        #-------------------------------------------
        # Just do this for the conv / maxpool layers, not the fully-connected layers
        #-------------------------------------------
        n_features = feature_map.shape[-1]  # number of features in the feature map
        size_y       = feature_map.shape[1]  # feature map shape (1, size_y, size_x, n_features)
        size_x       = feature_map.shape[2]  # feature map shape (1, size_y, size_x, n_features)
    
        # We will tile our images in this matrix
        display_grid = np.zeros((size_y, size_x * n_features))
    
        #-------------------------------------------------
        # Postprocess the feature to be visually palatable
        #-------------------------------------------------
        for i in range(n_features):
            x  = feature_map[0, :, :, i]
            x -= x.mean()
            std = x.std()
            if std > 0:  # avoid dividing by zero for a constant feature map
                x /= std
            x *=  64
            x += 128
            x  = np.clip(x, 0, 255).astype('uint8')
            display_grid[:, i * size_x : (i + 1) * size_x] = x # Tile each feature map into a horizontal grid

        #-----------------
        # Display the grid
        #-----------------

        scale = 20. / n_features
        plt.figure( figsize=(scale * n_features, scale) )
        plt.title ( layer_name )
        plt.grid  ( False )
        plt.imshow( display_grid, aspect='auto', cmap='viridis' )

## 3. Exercise: classification of the fashion MNIST dataset <a name="3."></a>

This is another image dataset, but this time of fashion items. Again these are images of 28x28 pixels, black and white with pixel intensities ranging from 0 to 255.

Load the dataset. 

In [None]:
(X_train_full, y_train_full), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

Split the full training set into a validation set and a training set. Also scale the pixel intensities down to the 0-1 range and convert them to floats, by dividing by 255

In [None]:
X_valid, X_train = X_train_full[:5000] / 255., X_train_full[5000:] / 255.
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
X_test = X_test / 255.

We need to associate names with the labels

In [None]:
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

Let's look at some of the images

In [None]:
n_rows = 4
n_cols = 10
plt.figure(figsize=(n_cols * 1.2, n_rows * 1.2))
for row in range(n_rows):
    for col in range(n_cols):
        index = n_cols * row + col
        plt.subplot(n_rows, n_cols, index + 1)
        plt.imshow(X_train[index], cmap="binary", interpolation="nearest")
        plt.axis('off')
        plt.title(class_names[y_train[index]], fontsize=12)
plt.subplots_adjust(wspace=0.2, hspace=0.5)


Format the data so that it is suitable for a 2D convolutional neural net

In [None]:
#before
X_train.shape

In [None]:
X_train = X_train[..., np.newaxis]
X_valid = X_valid[..., np.newaxis]
X_test = X_test[..., np.newaxis]

#after
X_train.shape

Construct a 2D convolutional neural net to classify the fashion data. Again, use whatever tools you think will help: more convolutional layers, max pooling, dropout in the MLP, etc.