# Convolutional Neural Networks (CNNs)
Convolutional neural networks are the workhorses of image recognition.

The filters (a.k.a. image kernels) and the weights in the fully connected layers are learned using stochastic gradient descent (minimizing the cost function -- the measure of the difference between the expected and actual outputs). In this explanation, though, we will assume we are using a pretrained network where these values have already been learned.

## Features
First principle: an image can be made up of a number of features. For example, an X could be thought of as being made up of upper_left-to-lower_right diagonals, upper_right-to-lower_left diagonals, and a cross in the middle.

<img src="images/cnn1.jpg" height=400 width=500>

The small patterns we use to detect these features are called __filters__ or __kernels__.

Now, every pixel in our image has some value -- for example, a grayscale value (or, in the case of a coloured image, 3 values: R, G, and B).

<img src="images/cnn2.jpg" height=300 width=300>
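
To make the idea of pixel values concrete, here is a minimal sketch of how an image could be stored as a NumPy array. The 5x5 "X", the 1/-1 value convention and the variable names are assumptions made for illustration; they are not taken from the figures above.

```python
import numpy as np

# A hypothetical 5x5 grayscale "X": 1 = bright pixel, -1 = dark pixel.
x_image = -np.ones((5, 5), dtype=int)
for i in range(5):
    x_image[i, i] = 1        # upper-left to lower-right diagonal
    x_image[i, 4 - i] = 1    # upper-right to lower-left diagonal

print(x_image)
print(x_image.shape)    # one value per pixel: (rows, columns)

# A colour image carries three values (R, G, B) per pixel,
# so it needs an extra axis of length 3.
rgb_image = np.zeros((5, 5, 3), dtype=np.uint8)
rgb_image[..., 0] = 255  # fill the red channel, leave green and blue at 0
print(rgb_image.shape)  # (rows, columns, channels)
```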

## The layers in a CNN
In a CNN, we essentially have 4 types of layer:

1. Convolution layer (detect features)
2. ReLU layer (replace negative values with 0, which adds non-linearity and helps stop the math blowing up)
3. Max pooling layer (shrink the image resolution to save memory/increase speed)
4. Fully connected layer (multiply pixel values by weights to guess what the object is)

Exactly how you organize these layers and how many of them you have -- i.e., choosing the architecture and its hyperparameters -- is kind of an art.
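
As a rough sketch of what layers 2 to 4 do to the output of a convolution layer, the snippet below chains ReLU, 2x2 max pooling and a fully connected layer in plain NumPy. The function names, the 4x4 input and the weights are all made up for illustration; a real network would learn the weights and usually stack many such layers.

```python
import numpy as np

def relu(feature_map):
    """Replace every negative value with 0."""
    return np.maximum(feature_map, 0)

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size  # drop rows/cols that don't fill a window
    windows = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return windows.max(axis=(1, 3))

def fully_connected(feature_map, weights):
    """Flatten the values and take one weighted sum per class."""
    return weights @ feature_map.ravel()

# Hypothetical 4x4 output of a convolution layer.
filtered = np.array([[ 0.5, -0.2,  0.1, -0.7],
                     [-0.3,  0.9, -0.1,  0.2],
                     [ 0.4, -0.6,  0.8, -0.5],
                     [-0.9,  0.3, -0.4,  0.6]])

activated = relu(filtered)      # negatives become 0
pooled = max_pool(activated)    # 4x4 shrinks to 2x2

# Made-up weights: one row per class, one column per pooled value.
weights = np.array([[1.0, 0.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0, 0.0]])
scores = fully_connected(pooled, weights)

print(pooled)
print(scores)  # the class with the highest score is the network's guess
```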

### 1. Convolution layer
In the convolution layer, we take our filters and move them across every pixel in the image.
At each position, we:

1. Multiply each kernel value by its corresponding pixel value
2. Add up all of these multiplied values
3. Divide by the number of pixels covered by the kernel (at least some CNNs seem to do this, others don't)

This will leave us with a stack of filtered images (one per filter), where we have a value for each position the filter visited. We should be able to see the patterns in these filtered images: high activations map to where the feature appeared. So already we've started to detect parts of the image, such as edges.
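
The three steps above can be sketched in a few lines of NumPy. This is an illustrative implementation under some assumptions: `convolve2d` is a hypothetical helper (not a library function), it uses no padding and a stride of 1 (so the output is slightly smaller than the input), and, like most CNN libraries, it does not flip the kernel. The `normalize` flag covers the optional step 3.

```python
import numpy as np

def convolve2d(image, kernel, normalize=False):
    """Slide `kernel` over `image` (no padding, stride 1) and return the filtered image."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    filtered = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kh, col:col + kw]
            total = np.sum(patch * kernel)   # steps 1 and 2: multiply, then add up
            if normalize:
                total = total / kernel.size  # step 3 (optional): divide by the pixel count
            filtered[row, col] = total
    return filtered

# Hypothetical 5x5 "X" image: 1 on both diagonals, -1 everywhere else.
x_image = -np.ones((5, 5), dtype=int)
for i in range(5):
    x_image[i, i] = 1
    x_image[i, 4 - i] = 1

# Kernel for an upper-left to lower-right diagonal: 1 on the diagonal, -1 elsewhere.
diagonal_kernel = 2 * np.eye(3, dtype=int) - 1

filtered = convolve2d(x_image, diagonal_kernel, normalize=True)
print(filtered)
```

With normalization, a value of 1.0 means the patch matched the kernel exactly; here that happens in the top-left and bottom-right corners of the output, where the diagonal stroke of the X lines up with the kernel.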

In [5]:
# For example, to detect horizontal lines, we might use a filter (kernel) like this:
import numpy as np

horizontal_line_kernel = np.array([[1, 1, 1],
                                   [0, 0, 0],
                                   [-1, -1, -1]])

horizontal_line_kernel

array([[ 1,  1,  1],
       [ 0,  0,  0],
       [-1, -1, -1]])

If you think about it, applying this kernel will give high activations where we have a horizontal line with blank space underneath. If we have a vertical line, the 1s on the top row will be cancelled out by the -1s on the bottom row, so vertical lines will be ignored.

We can see an example of this filter being used in conv_example.ods.

<img src="images/cnn3.jpg" height=500 width=500>
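
The cancellation argument can be checked numerically. The sketch below applies the horizontal kernel to two made-up 5x5 images, one with a horizontal line and one with a vertical line; `apply_kernel` is a hypothetical helper (no padding, stride 1, no division), not part of any library.

```python
import numpy as np

horizontal_line_kernel = np.array([[ 1,  1,  1],
                                   [ 0,  0,  0],
                                   [-1, -1, -1]])

def apply_kernel(image, kernel):
    """Sum of element-wise products at every kernel position (no padding, stride 1)."""
    kh, kw = kernel.shape
    rows = image.shape[0] - kh + 1
    cols = image.shape[1] - kw + 1
    return np.array([[np.sum(image[r:r + kh, c:c + kw] * kernel)
                      for c in range(cols)]
                     for r in range(rows)])

# Made-up test images: 1 = line pixel, 0 = blank space.
horizontal_line = np.zeros((5, 5), dtype=int)
horizontal_line[2, :] = 1
vertical_line = np.zeros((5, 5), dtype=int)
vertical_line[:, 2] = 1

horizontal_response = apply_kernel(horizontal_line, horizontal_line_kernel)
vertical_response = apply_kernel(vertical_line, horizontal_line_kernel)

print(horizontal_response)
print(vertical_response)
```

For the horizontal line, the bottom row of the output is 3 (line on the top row of the window, blank space underneath) and the top row is -3 (blank space above, line at the bottom of the window). For the vertical line, every window has one line pixel on its top row and one on its bottom row, so the response is 0 everywhere.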


In [6]:
# Likewise, we could detect vertical lines by using a kernel like this:

vertical_line_kernel = np.array([[1, 0, -1],
                                 [1, 0, -1],
                                 [1, 0, -1]])

vertical_line_kernel

array([[ 1,  0, -1],
       [ 1,  0, -1],
       [ 1,  0, -1]])

So now we could detect vertical lines too:

<img src="images/cnn4.jpg" height=500 width=500>

So in our first layer, a convolution layer, we have picked out the horizontal and vertical lines from our images.
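
To illustrate how one convolution layer produces a stack of filtered images, the sketch below applies both kernels at once to a made-up 5x5 "plus" image. It assumes NumPy 1.20 or later for `sliding_window_view`, uses no padding and a stride of 1, and skips the optional division step; the image and variable names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Stack the two kernels: shape (number of filters, kernel height, kernel width).
kernels = np.array([
    [[ 1,  1,  1],
     [ 0,  0,  0],
     [-1, -1, -1]],   # horizontal line kernel
    [[ 1,  0, -1],
     [ 1,  0, -1],
     [ 1,  0, -1]],   # vertical line kernel
])

# Made-up image containing one horizontal and one vertical line (a plus sign).
plus_image = np.zeros((5, 5), dtype=int)
plus_image[2, :] = 1
plus_image[:, 2] = 1

# Every 3x3 patch of the image: shape (3, 3, 3, 3) = (out_rows, out_cols, kh, kw).
patches = sliding_window_view(plus_image, (3, 3))

# Multiply each patch by each kernel and add up: one filtered image per kernel.
feature_maps = np.einsum("ijkl,fkl->fij", patches, kernels)

print(feature_maps.shape)  # (filters, out_rows, out_cols)
print(feature_maps[0])     # response of the horizontal line kernel
print(feature_maps[1])     # response of the vertical line kernel
```

The first filtered image varies from top to bottom (it responds to the horizontal stroke), while the second varies from left to right (it responds to the vertical stroke).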