# Convolutions

### Introduction

What if we could extract information about where the edges occur in each image, and then use that information as our features, instead of raw pixels?

A convolution applies a kernel across an image. A kernel is a little matrix, such as the 3×3 matrix defined below:

In [6]:
from torch import tensor
top_edge = tensor([[-1,-1,-1],
                   [ 0, 0, 0],
                   [ 1, 1, 1]]).float()

Convolution: multiply each element of the kernel by the matching element of a 3×3 block of the image, then add the products up.

In [None]:
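import pandas as pd
# im3_t is the image as a tensor (see the commented-out tensor(im3) cell further down)
df = pd.DataFrame(im3_t[:10,:20])
df.style.set_properties(**{'font-size':'6pt'}).background_gradient('Greys')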

> Applying `top_edge` returns a high number where the 3×3-pixel square represents a top edge (i.e., where there are low values at the top of the square, and high values immediately underneath). That's because the -1 values in our kernel have little impact in that case, but the 1 values have a lot.
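
To make the multiply-and-add step concrete, here is a minimal sketch on two made-up 3×3 patches. The names `edge_patch` and `flat_patch` and their values are assumptions for illustration, not data from the image used in these notes.

```python
import torch

# Same top-edge kernel as in the notes.
top_edge = torch.tensor([[-1, -1, -1],
                         [ 0,  0,  0],
                         [ 1,  1,  1]]).float()

# Hypothetical patches (not taken from the notes' image).
edge_patch = torch.tensor([[0., 0., 0.],
                           [0., 0., 0.],
                           [1., 1., 1.]])  # dark on top, bright underneath
flat_patch = torch.ones(3, 3)              # uniform brightness, no edge

# Element-wise multiply, then add everything up.
edge_response = (edge_patch * top_edge).sum()
flat_response = (flat_patch * top_edge).sum()

print(edge_response.item())  # 3.0
print(flat_response.item())  # 0.0
```

On the flat patch the -1s and 1s cancel out, so only a real change in brightness from top to bottom produces a large response.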

* Putting the 1s and -1s in columns rather than rows would give us filters that detect vertical edges.
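
One way to see this, as a sketch: transposing `top_edge` moves the -1s and 1s from rows into columns. The name `vertical_edge` and the sample `patch` are assumptions made for this example.

```python
import torch

top_edge = torch.tensor([[-1, -1, -1],
                         [ 0,  0,  0],
                         [ 1,  1,  1]]).float()

# Transposing swaps rows and columns: -1s in the left column, 1s in the right.
vertical_edge = top_edge.T

# Hypothetical patch: dark on the left, bright right-hand column.
patch = torch.tensor([[0., 0., 1.],
                      [0., 0., 1.],
                      [0., 0., 1.]])

vertical_response = (patch * vertical_edge).sum()
top_response = (patch * top_edge).sum()

print(vertical_response.item())  # 3.0
print(top_response.item())       # 0.0
```

The same patch gives a strong response to the column-based kernel and none to the row-based one.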

In [12]:
# im3_t = tensor(im3)
# im3_t[0:3,0:3] * top_edge

In [7]:
def apply_kernel(row, col, kernel):
    # Multiply the 3x3 window centred on (row, col) by the kernel, then sum
    return (im3_t[row-1:row+2,col-1:col+2] * kernel).sum()

In [8]:
[[(i,j) for j in range(1,5)] for i in range(1,5)]

[[(1, 1), (1, 2), (1, 3), (1, 4)],
 [(2, 1), (2, 2), (2, 3), (2, 4)],
 [(3, 1), (3, 2), (3, 3), (3, 4)],
 [(4, 1), (4, 2), (4, 3), (4, 4)]]

In [14]:
rng = range(1,27)
left_edge = tensor([[-1,1,0],
                    [-1,1,0],
                    [-1,1,0]]).float()

# left_edge3 = tensor([[apply_kernel(i,j,left_edge) for j in rng] for i in rng])

# show_image(left_edge3);
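
Since `im3` is not defined in these notes, here is a self-contained sketch of the same loop on a synthetic image. The image `img` (a bright square on a dark background) and the helper `apply_kernel_to` are assumptions standing in for `im3_t` and `apply_kernel`.

```python
import torch

# Hypothetical stand-in for im3_t: a 28x28 image with a bright square.
img = torch.zeros(28, 28)
img[8:20, 8:20] = 1.0

left_edge = torch.tensor([[-1, 1, 0],
                          [-1, 1, 0],
                          [-1, 1, 0]]).float()

def apply_kernel_to(image, row, col, kernel):
    # 3x3 window centred on (row, col), multiplied element-wise and summed.
    return (image[row-1:row+2, col-1:col+2] * kernel).sum()

# Centres 1..26 keep the 3x3 window inside the 28x28 image.
rng = range(1, 27)
result = torch.stack([
    torch.stack([apply_kernel_to(img, i, j, left_edge) for j in rng])
    for i in rng
])

print(result.shape)          # torch.Size([26, 26])
print(result.max().item())   # 3.0, along the left side of the square
print(result.min().item())   # -3.0, just past the right side of the square
```

The output is 26×26 rather than 28×28 because the kernel cannot be centred on the outermost ring of pixels.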

###  Convolution Thoughts

* A convolution can tell us where the edges are.
* The output of a convolution is called a channel.
* One channel might find top edges and another left edges; a later layer can then combine them to find top-left corners.

<img src="./padding.png" width="40%">

Now, we'll want different kernels, each finding a different feature. When the input has three channels, each kernel needs one 3x3 slice per channel, so we create a 3x3x3 kernel.

Ultimately, each application of this 3x3x3 kernel still creates just one number, because we multiply elementwise and then add everything up.
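
A small sketch of that claim, using random values as a stand-in for a real three-channel patch. The tensors `patch` and `kernel` are assumptions for illustration; `F.conv2d` is PyTorch's `torch.nn.functional.conv2d`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical 3-channel input patch and a matching 3x3x3 kernel.
patch = torch.randn(3, 3, 3)   # [channels, rows, columns]
kernel = torch.randn(3, 3, 3)  # one 3x3 slice per input channel

# Multiply element-wise across all 27 values, then add them all up.
single_number = (patch * kernel).sum()
print(single_number.shape)  # torch.Size([]), i.e. a single scalar

# The same computation via F.conv2d: a batch of one patch, one output channel.
out = F.conv2d(patch[None], kernel[None])
print(out.shape)  # torch.Size([1, 1, 1, 1])
```

Both routes give the same value, up to floating-point rounding.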

* Strides

Often we don't apply the kernel at every single pixel, because memory use would get out of control. Instead we hop over pixels. If we move the kernel two pixels after each application, it is called a stride-2 convolution.

(2, 2), (2, 4), (2, 6), (2, 8)

* A stride of 2 halves both dimensions, so the output is h/2 by w/2. That leaves a quarter as many kernel applications, so we can afford to double the number of kernels.
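
The effect of the stride on the output size can be checked with `torch.nn.functional.conv2d`. This is a sketch under assumptions: the input is random noise, and the kernel counts (8 and 16) and `padding=1` are arbitrary choices for the example.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 28, 28)    # hypothetical batch: one 28x28 image
w8 = torch.randn(8, 1, 3, 3)     # 8 kernels (arbitrary choice)
w16 = torch.randn(16, 1, 3, 3)   # twice as many kernels

stride1 = F.conv2d(x, w8, stride=1, padding=1)
stride2 = F.conv2d(x, w8, stride=2, padding=1)
stride2_wide = F.conv2d(x, w16, stride=2, padding=1)

print(stride1.shape)       # torch.Size([1, 8, 28, 28])
print(stride2.shape)       # torch.Size([1, 8, 14, 14])
print(stride2_wide.shape)  # torch.Size([1, 16, 14, 14])

# Even with double the kernels, stride 2 stores half as many activations.
print(stride1.numel(), stride2_wide.numel())  # 6272 3136
```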

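Padding should be the kernel size divided by 2, rounded down (`ks//2` for an odd kernel size `ks`), which keeps the output the same size as the input at stride 1.

1. Without padding, a 3x3 kernel can't be centered on the border pixels, because its window would extend past the edge of the image.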
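
As a quick check of the padding rule, the sketch below compares output sizes with and without `ks // 2` padding for a few odd kernel sizes. The random input and the `shapes` dictionary are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 28, 28)  # hypothetical single 28x28 image

shapes = {}
for ks in (3, 5, 7):
    w = torch.randn(1, 1, ks, ks)
    no_pad = F.conv2d(x, w)                   # kernel cannot reach the border
    padded = F.conv2d(x, w, padding=ks // 2)  # zero padding on every side
    shapes[ks] = (tuple(no_pad.shape[-2:]), tuple(padded.shape[-2:]))
    print(ks, *shapes[ks])
# 3 (26, 26) (28, 28)
# 5 (24, 24) (28, 28)
# 7 (22, 22) (28, 28)
```

Without padding the output shrinks by `ks - 1` pixels in each dimension; with `ks // 2` padding it stays at 28×28.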

### PyTorch Convolution

The PyTorch docs for `torch.nn.functional.conv2d` tell us that it includes these parameters:

* input: input tensor of shape (minibatch, in_channels, iH, iW)
* weight: filters of shape (out_channels, in_channels, kH, kW)

* iH,iW are the height and width of the image (i.e., 28,28), and kH,kW are the height and width of our kernel (3,3).

> PyTorch Tricks

1. The first trick is that PyTorch can apply a convolution to multiple images at the same time. That means we can call it on every item in a batch at once!

2. The second trick is that PyTorch can apply multiple kernels at the same time. So let's create the diagonal-edge kernels too, and then stack all four of our edge kernels into a single tensor:

In [15]:
import torch
diag1_edge = tensor([[ 0,-1, 1],
                     [-1, 1, 0],
                     [ 1, 0, 0]]).float()
diag2_edge = tensor([[ 1,-1, 0],
                     [ 0, 1,-1],
                     [ 0, 0, 1]]).float()

edge_kernels = torch.stack([left_edge, top_edge, diag1_edge, diag2_edge])
edge_kernels.shape

torch.Size([4, 3, 3])

> PyTorch represents an image as a rank-3 tensor, with dimensions [channels, rows, columns].
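
Both tricks can be combined in one call. The sketch below assumes a random mini-batch `xb` of 64 single-channel 28×28 images in place of real data, and adds an `in_channels` axis to the stacked kernels so they match the `weight` shape that `torch.nn.functional.conv2d` expects.

```python
import torch
import torch.nn.functional as F

left_edge = torch.tensor([[-1, 1, 0], [-1, 1, 0], [-1, 1, 0]]).float()
top_edge = torch.tensor([[-1, -1, -1], [0, 0, 0], [1, 1, 1]]).float()
diag1_edge = torch.tensor([[0, -1, 1], [-1, 1, 0], [1, 0, 0]]).float()
diag2_edge = torch.tensor([[1, -1, 0], [0, 1, -1], [0, 0, 1]]).float()

edge_kernels = torch.stack([left_edge, top_edge, diag1_edge, diag2_edge])

# conv2d wants weights of shape (out_channels, in_channels, kH, kW),
# so insert an in_channels axis of size 1.
weights = edge_kernels.unsqueeze(1)
print(weights.shape)  # torch.Size([4, 1, 3, 3])

# Hypothetical mini-batch: 64 single-channel 28x28 images.
xb = torch.rand(64, 1, 28, 28)

batch_features = F.conv2d(xb, weights)
print(batch_features.shape)  # torch.Size([64, 4, 26, 26])
```

The result has one 26×26 channel per kernel for every image in the batch.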

### Average Pooling

Take the average of each channel (the output of each convolution) by calling `Average2d` (presumably PyTorch's `nn.AdaptiveAvgPool2d`) with an output size of 1.

Each of the convolutions represents a different feature.
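
A minimal sketch of this pooling step, assuming the layer meant above is `nn.AdaptiveAvgPool2d` and using random activations (`acts`) as a stand-in for real convolution outputs:

```python
import torch
from torch import nn

# Hypothetical activations: batch of 64, 4 channels, 26x26 each.
acts = torch.randn(64, 4, 26, 26)

# Output size 1 collapses each channel's grid to a single average.
pool = nn.AdaptiveAvgPool2d(1)
pooled = pool(acts)
print(pooled.shape)  # torch.Size([64, 4, 1, 1])

# Equivalent to averaging over the two spatial dimensions by hand.
manual = acts.mean(dim=(2, 3), keepdim=True)
matches = torch.allclose(pooled, manual, atol=1e-5)
print(matches)  # True
```

Each image ends up with one number per channel, that is, one number per feature.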

### CNN Arithmetic

[CNN arithmetic](http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html)

### Resources

[Guide to Convolution](https://arxiv.org/abs/1603.07285)

[Setosa image kernels](https://setosa.io/ev/image-kernels/)

[fast.ai CNN lesson on YouTube](https://youtu.be/hkBa9pU-H48?list=PLfYUBJiXbdtSIJb-Qd3pw0cqCbkGeS0xn&t=4265)