# Classifying simulated events using a Convolutional Neural Network
We've seen how we can use logistic regression or feed forward neural networks (FFNN) to classify our data,
and now you've also been introduced to what a convolutional neural network is.
In this notebook, we're going to use a convolutional neural network (CNN) to classify our data,
and even combine CNN and FFNN to expand our model.

In [None]:
# Imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from sklearn.model_selection import train_test_split
from helper_functions import normalize_image_data
%load_ext autoreload
%autoreload 2

In [None]:
# Load images and labels.
DATA_PATH = "../data/"

images = np.load(DATA_PATH+"images_training.npy")
labels = np.load(DATA_PATH+"labels_training.npy")

# Introduction
The motivation for using a CNN to classify this data comes from their huge success in image classification
tasks in recent years. Since we can very easily (and even preferably) represent our data as images, it
is a natural choice to try a model of this type.
This notebook will be a bit more hands-on than the previous ones.

You may recall from the lectures that one of the properties of CNNs is that they use 'filters'/'kernels' to
extract 'features' from images. But how can we know what type of filters to apply? The answer is - we don't need to! Similar to how a FFNN adjusts its weights in order to increase performance, a CNN adjusts its filters and in that way 'learns' the best filters for a given task.
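
To make the idea of a filter concrete, here is a minimal sketch of a hand-written (not learned) filter applied to a toy image. The image, the kernel values, and the variable names are made up for illustration; a CNN would learn its kernel values during training instead.

```python
import numpy as np
from scipy.signal import correlate2d

# Toy 6x6 image (hypothetical data): left half dark, right half bright
toy_image = np.zeros((6, 6))
toy_image[:, 3:] = 1.0

# Hand-written 3x3 kernel that responds to dark-to-bright vertical edges
edge_kernel = np.array([[-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0]])

# Slide the kernel over the image without padding (cross-correlation)
response = correlate2d(toy_image, edge_kernel, mode="valid")
print(response)
```

The response is non-zero only where the kernel overlaps the edge, which is the sense in which a filter "extracts a feature".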

Some very useful documentation to keep at hand while working through this notebook:
* [tensorflow.keras.layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers)
* [Scikit-Learn model evaluation](https://scikit-learn.org/stable/modules/model_evaluation.html)


## Data preparation
Recall that when we saved our training samples in the import and exploration notebook,
we saved the images as an array with the following dimensions (samples, x_pixels, y_pixels).
This isn't necessarily a format the convolutional layer will accept, because by default image data
has one more dimension - "channels". A single image would have the dimensions (x_pixels, y_pixels, channels).
For a regular RGB picture this is (x_pixels, y_pixels, 3), since you have one channel for each color.
Our data only has one channel, which in a world of colours would be called "grayscale".

The convolutional layer we will use in our model is called Conv2D ([Doc](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D)). Take a look at the documentation to determine if you need to modify the image array in order to input it to the layer.

Recall that you can use [np.reshape](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html) to modify your array should you need to.
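
As a rough illustration of what adding a channel axis looks like, the sketch below uses a dummy array rather than the real data. The array name and the sample count of 5 are assumptions made for the example.

```python
import numpy as np

# Dummy stand-in for the image array: (samples, x_pixels, y_pixels)
dummy_images = np.zeros((5, 16, 16))

# Option 1: reshape, appending a channel axis of length 1
with_channel = dummy_images.reshape(dummy_images.shape + (1,))

# Option 2: np.expand_dims gives the same result
with_channel_alt = np.expand_dims(dummy_images, axis=-1)

print(dummy_images.shape, "->", with_channel.shape)
```

Neither option changes the pixel values; only the shape of the array is different.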

In [None]:
# Reshape the image array here, if you need to



One thing that can be useful to check is whether we actually saved a balanced dataset. The data is shuffled
when scikit-learn's train_test_split function splits it, so balanced classes are not guaranteed. Remember that if you
don't have a balanced dataset, you need to account for this when considering which metrics to use
in evaluating your model.

Numpy's [np.unique](https://numpy.org/doc/1.18/reference/generated/numpy.unique.html?) works great for this.
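
A minimal sketch of counting classes with `np.unique`, using a small made-up label array instead of the real labels:

```python
import numpy as np

# Hypothetical labels for illustration only
dummy_labels = np.array([0, 1, 1, 0, 1, 1, 1, 0])

# return_counts=True returns how often each unique value occurs
classes, counts = np.unique(dummy_labels, return_counts=True)
for cls, count in zip(classes, counts):
    print(f"class {cls}: {count} samples ({count / dummy_labels.size:.1%})")
```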

In [None]:
# Count how many there are of each class and print the results


Lastly, we split the training data into a training set and a validation set, using indices.
With only 10000 samples, you can also split the image array itself.
As in the previous notebooks, [Scikit-Learn's train_test_split](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) will do the job.
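
One possible way to split by indices is sketched below on dummy data. The 25% validation fraction, the `stratify` argument, the random seed, and all variable names are assumptions for the example, not requirements of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy stand-ins for the real arrays
n_samples = 100
dummy_images = np.zeros((n_samples, 16, 16, 1))
dummy_labels = np.arange(n_samples) % 2

# Split the indices rather than the arrays themselves
indices = np.arange(n_samples)
train_idx, val_idx = train_test_split(
    indices,
    test_size=0.25,         # assumed validation fraction
    stratify=dummy_labels,  # keep class proportions similar in both sets
    random_state=42,
)

# The indices can then be used to select images and labels
x_train, y_train = dummy_images[train_idx], dummy_labels[train_idx]
x_val, y_val = dummy_images[val_idx], dummy_labels[val_idx]
print(x_train.shape, x_val.shape)
```

Keeping the indices around makes it easy to trace a validation sample back to its position in the original array.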

In [None]:
# Split the data into training and validation sets



# Model
Having a common API for machine learning operations between multiple frameworks is great.
It means you don't need to re-learn how to build a model every time you want to try out something new!
Using the [Sequential](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential) model, you simply
add each layer in the order you want, and compile it. TensorFlow takes care of the rest.

## Build and compile
We're going to start with just one layer. Take another look at the [documentation](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D) to see all the configuration options for the Conv2D layer. As a first step we will stick to defining only the necessary arguments to the layer,
leaving the rest as default.
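
To get a feel for how shapes change through the layers, the sketch below pushes a dummy batch through a Conv2D layer, a Flatten layer, and a single-node Dense layer. The batch size of 2, the 16x16 single-channel input, and the use of Flatten are assumptions for illustration; this is one way to connect a convolutional layer to an output node, not the only one.

```python
import numpy as np
import tensorflow as tf

# Dummy batch: 2 images of 16x16 pixels with one channel (assumed size)
dummy_batch = np.zeros((2, 16, 16, 1), dtype="float32")

conv = tf.keras.layers.Conv2D(filters=32, kernel_size=3,
                              activation="relu", padding="same")
flatten = tf.keras.layers.Flatten()
output = tf.keras.layers.Dense(1, activation="sigmoid")

conv_out = conv(dummy_batch)  # one feature map per filter
flat_out = flatten(conv_out)  # one flat vector per image
pred = output(flat_out)       # one value per image

print(conv_out.shape, flat_out.shape, pred.shape)
```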

In [None]:
# Initialize the Sequential model, and add one Conv2D layer to it
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(
        filters=32,
        kernel_size=3,
        activation='relu',
        padding='same',
        input_shape=images.shape[1:] # Shape of a single image
    )
)


# Add an output layer to the model
# Recall the shape of the input to a dense neural network.
# What is the shape of the output from the Conv2D layer?


# Once the model is defined, we need to compile it. 
# This is where we specify the loss function, optimizer, and metrics if we want.
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy'],  # needed for the accuracy plots further down
)
model.summary()

## Training
We are ready to train the model. Remember to normalize your inputs if you haven't already done that!
You can specify the validation data using the keyword 'validation_data'. Take a look at the [documentation](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential#fit) for the fit() function to see how it
expects the validation data to be formatted.
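
As a rough sketch of the call pattern, the example below trains a tiny throwaway model on random data. Everything here (the `sketch_` names, the random data, the small layer sizes, two epochs) is made up for illustration and is not meant to replace your own model or data.

```python
import numpy as np
import tensorflow as tf

# Random stand-in data: 16x16 single-channel images with binary labels
rng = np.random.default_rng(0)
sketch_x_train = rng.random((64, 16, 16, 1)).astype("float32")
sketch_y_train = rng.integers(0, 2, size=(64, 1)).astype("float32")
sketch_x_val = rng.random((16, 16, 16, 1)).astype("float32")
sketch_y_val = rng.integers(0, 2, size=(16, 1)).astype("float32")

sketch_model = tf.keras.Sequential([
    tf.keras.Input(shape=(16, 16, 1)),
    tf.keras.layers.Conv2D(4, 3, activation="relu", padding="same"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
sketch_model.compile(optimizer="adam", loss="binary_crossentropy",
                     metrics=["accuracy"])

# validation_data is passed as a tuple (inputs, targets)
sketch_output = sketch_model.fit(
    sketch_x_train, sketch_y_train,
    batch_size=32,
    epochs=2,
    validation_data=(sketch_x_val, sketch_y_val),
    verbose=0,
)
print(sorted(sketch_output.history.keys()))
```

The returned object holds one value per epoch for each tracked quantity in its `history` dictionary.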

In [None]:
# Training parameters
batch_size = 32
epochs = 20

In [None]:
# Train model, save output



In [None]:
# Side by side plots of losses and accuracies
history = training_output.history
fig, ax = plt.subplots(1, 2, figsize=(12, 4))
ax[0].plot(history['loss'], label='training')
ax[0].plot(history['val_loss'], label='validation')
ax[0].set_title("Model losses")
ax[0].set_xlabel("Epoch")
ax[0].set_ylabel("Loss")
ax[0].legend()

ax[1].plot(history['accuracy'], label='training')
ax[1].plot(history['val_accuracy'], label='validation')
ax[1].set_title("Model accuracy")
ax[1].set_xlabel("Epoch")
ax[1].set_ylabel("Accuracy")
ax[1].legend()

## Expand and improve the model
Just like with a feed forward neural network, the architecture is just one of the properties
we can tune to improve performance. If you look closer at the model.summary() printout, you can
see that the output after the Conv2D layer has changed shape. It's no longer the same size as our
original image. Why is this?

In the introduction we mentioned that the filters/kernels in a CNN
extract features in the images. One thing we could do, then, is to input those features to an FFNN, just
like in the dense_neural_network notebook.

In a way we've already done this, except the dense part of our model has only one node, the output node.
Try to add another dense layer before the output. Perhaps you have a dense model from the previous notebook
you can weave in here?

Of course, you can also add another Conv2D layer instead, or increase the number of filters.
In between convolutional layers you can add MaxPool2D layers, and eventually you may start looking at
regularizations like Dropout if necessary.
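
The sketch below shows one of many possible ways to combine these ideas. The number of filters, the dense layer width, the dropout rate, and the assumed 16x16 single-channel input are arbitrary choices for illustration, not tuned or recommended values.

```python
import numpy as np
import tensorflow as tf

expanded_model = tf.keras.Sequential([
    tf.keras.Input(shape=(16, 16, 1)),  # assumed image shape
    tf.keras.layers.Conv2D(32, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPool2D(pool_size=2),  # 16x16 -> 8x8
    tf.keras.layers.Conv2D(64, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPool2D(pool_size=2),  # 8x8 -> 4x4
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),  # only active during training
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
expanded_model.compile(optimizer="adam", loss="binary_crossentropy",
                       metrics=["accuracy"])

# Quick shape check on a dummy batch of two images
dummy_pred = expanded_model(np.zeros((2, 16, 16, 1), dtype="float32"))
print(dummy_pred.shape)
```

Whether a deeper model like this actually helps is something to check against the validation curves rather than assume.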

# Visualize a filter
We can extract a filter's 'weights' and visualize what one (or all, if you wish) filter extracts
from an image.

In [None]:
# import convolve2d
from scipy.signal import convolve2d


# Get filters from the first Conv2D layer
filters, biases = model.layers[0].get_weights()
print(filters.shape)

So we have 32 3x3 filters, as expected since this is what we set the Conv2D layer to use.
Notice, however, that the filters array doesn't have the number of filters on the first axis; it is on the last. To extract
a single filter from this array, we need to index like `tmp = filters[:,:,:,0]` to get the first filter.
The next step is to pick a filter from the 32 filters we have available, and apply it to a sample image.
We can use SciPy's [convolve2d](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.convolve2d.html) to do this.

NB! `filter` is the name of a built-in function in Python, so if you extract a single filter, name the variable something else to avoid shadowing it.
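
To illustrate the axis layout without needing a trained model, the sketch below uses a dummy array with the same shape a Conv2D layer with 32 3x3 filters and one input channel would return: (kernel_height, kernel_width, input_channels, filters). The array contents and variable names are made up.

```python
import numpy as np

# Dummy weights shaped like the Conv2D kernel array: (3, 3, 1, 32)
dummy_filters = np.arange(3 * 3 * 1 * 32, dtype=float).reshape(3, 3, 1, 32)

# First filter as a 2D array: fix the channel and filter axes
first_filter = dummy_filters[:, :, 0, 0]

# Optional: move the filter axis to the front to loop over filters
filters_first = np.moveaxis(dummy_filters, -1, 0)

print(first_filter.shape, filters_first.shape)
```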

In [None]:
# Choose a filter and an image
filter_idx = 0
img_idx = 0

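# We need to reshape the filter and image to 2D arrays for convolve2d
filtr = filters[:,:,:, filter_idx]
image = images[img_idx]
# Note: convolve2d flips the kernel before applying it, whereas Conv2D
# computes a cross-correlation, so this matches the layer up to that flip.
conv_image = convolve2d(
    image.reshape(16, 16),
    filtr.reshape(3, 3),
    mode='same'  # keep the 16x16 size, like padding='same' in the layer
)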

# Plot the images and filter
fig, ax = plt.subplots(1, 3, figsize=(12, 4))
ax[0].imshow(image.reshape(16, 16), cmap='gray')
ax[0].set_title("Original image")

ax[1].imshow(conv_image, cmap='gray')
ax[1].set_title("Convolved image")

sns.heatmap(filtr.reshape(3, 3), cmap='gray', square=True, annot=True, ax=ax[2])
ax[2].set_title("Filter")