# PyTorch Convolutions

## <a name="overview"></a> Overview

In this notebook, we show how to create and use convolution objects in PyTorch. We mainly use the <a href="https://pytorch.org/docs/master/generated/torch.nn.Conv2d.html#torch.nn.Conv2d">Conv2d</a> class.

## <a name="ekf"></a> PyTorch convolutions

---
**Remark**

```Conv2d``` layers expect input with the following shape

```
(n_samples, channels, height, width)
```

where 

- ```n_samples``` is the number of samples; typically this is the batch size
- ```channels``` is the number of channels in the image; a full-color image has three channels: red, green and blue
- ```height``` is the height of the image in pixels
- ```width``` is the width of the image in pixels

A convolution object works by sliding a kernel over the input. PyTorch stores the kernels of a ```Conv2d``` layer as a single rank-4 tensor with shape

```
(channels_out, channels_in, rows, columns)
```

---
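
As a quick check of the shapes described in the remark, the following sketch builds a layer and inspects its tensors. The layer sizes and the name `batch` are arbitrary choices made for illustration only.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only to make each dimension easy to recognize
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=(3, 5))

# A batch of 4 three-channel images, each 32 pixels high and 32 pixels wide
batch = torch.zeros(4, 3, 32, 32)

print(batch.shape)        # (n_samples, channels, height, width)
print(conv.weight.shape)  # (channels_out, channels_in, rows, columns)
print(conv.bias.shape)    # one bias value per output channel
```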

### <a name="test_case_1"></a> Example: One channel image

In [17]:
import torch
import torch.nn as nn

Below, we construct a ```Conv2d``` instance that applies a $3\times 3$ kernel. We can also apply non-square kernels by passing a tuple, ```kernel_size=(rows, columns)```.

In [18]:
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)
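
For the non-square case, a minimal sketch is shown below. The names `conv_rect` and `x` are introduced here for illustration; the kernel has 2 rows and 3 columns, so a $5\times 5$ input shrinks by a different amount in each direction.

```python
import torch
import torch.nn as nn

# A kernel with 2 rows and 3 columns
conv_rect = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(2, 3))

x = torch.zeros(1, 1, 5, 5)
out = conv_rect(x)

print(conv_rect.weight.shape)  # torch.Size([1, 1, 2, 3])
print(out.shape)               # torch.Size([1, 1, 4, 3])
```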

Create a fake image of size $5\times 5$ with only one channel

In [19]:
image = torch.zeros(1, 1, 5, 5)
image[0, 0, 2, :] = 1  # set the middle row to ones
image

tensor([[[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])

Apply the convolution using the following kernel

In [20]:
conv.weight

Parameter containing:
tensor([[[[-0.2185,  0.2629, -0.3321],
          [-0.2577, -0.0972,  0.2869],
          [-0.0022,  0.0432,  0.2632]]]], requires_grad=True)

Note that PyTorch initializes the kernel parameters randomly, so the values differ from run to run unless a seed is set.

In [21]:
z = conv(image)

In [22]:
z

tensor([[[[ 0.4743,  0.4743,  0.4743],
          [ 0.1021,  0.1021,  0.1021],
          [-0.1175, -0.1175, -0.1175]]]], grad_fn=<MkldnnConvolutionBackward>)
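
To see where these numbers come from, the sketch below recomputes the output by hand: each entry is the sum of the element-wise product between the kernel and a $3\times 3$ window of the image, plus the bias. Note that PyTorch does not flip the kernel, so the operation is technically a cross-correlation. The seed and the name `manual` are assumptions made for this example, so the values will not match the ones printed above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the random kernel is reproducible
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)

image = torch.zeros(1, 1, 5, 5)
image[0, 0, 2, :] = 1

kernel = conv.weight[0, 0]  # the single 3x3 kernel
bias = conv.bias[0]

# Slide the kernel over every 3x3 window of the 5x5 image
manual = torch.zeros(3, 3)
with torch.no_grad():
    for i in range(3):
        for j in range(3):
            window = image[0, 0, i:i + 3, j:j + 3]
            manual[i, j] = (window * kernel).sum() + bias

z = conv(image)
print(torch.allclose(z[0, 0].detach(), manual, atol=1e-5))
```

Because the image is constant along each row, every window in a given output row sees the same values, which is why the columns of `z` are identical.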

One other parameter we can set is ```stride```. It determines how many pixels the kernel moves between consecutive applications; the default is 1.

In [23]:
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, stride=2)

In [24]:
z = conv(image)
z

tensor([[[[-0.4493, -0.4493],
          [-0.2696, -0.2696]]]], grad_fn=<MkldnnConvolutionBackward>)

We can also specify ```padding```. Padding specifies the number of rows and columns of zeros added on each side of the input.

In [25]:
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, stride=2, padding=1)

When using padding, the convolution sees a padded image whose size $N$ along each dimension is given by:

$$N = N_{old} + 2 \times \text{padding}$$
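
Combining kernel size $k$, stride $s$ and padding $p$, the output size along one dimension is $\lfloor (N_{old} + 2p - k)/s \rfloor + 1$ when dilation is 1. The sketch below compares this formula with what ```Conv2d``` actually returns; the helper `conv_output_size` is a hypothetical name introduced for this example.

```python
import torch
import torch.nn as nn

def conv_output_size(n, kernel_size, stride=1, padding=0):
    """Output length along one dimension (dilation assumed to be 1)."""
    return (n + 2 * padding - kernel_size) // stride + 1

image = torch.zeros(1, 1, 5, 5)
sizes = []
for stride, padding in [(1, 0), (2, 0), (2, 1)]:
    conv = nn.Conv2d(1, 1, kernel_size=3, stride=stride, padding=padding)
    expected = conv_output_size(5, 3, stride, padding)
    actual = conv(image).shape[-1]
    sizes.append((expected, actual))
    print(f"stride={stride}, padding={padding}: formula={expected}, Conv2d={actual}")
```

For the $5\times 5$ image used here this gives output sizes 3, 2 and 3 respectively.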

### <a name="test_case_2"></a> Example: Multiple channel image

With multiple output channels, we simply have one kernel for every output channel:

In [26]:
conv = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3)

In [27]:
image1 = torch.zeros(1, 1, 5, 5)
image1[0, 0, 2, :] = 1  # set the middle row to ones
image1

tensor([[[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [1., 1., 1., 1., 1.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]]])
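
Applying such a layer produces one feature map per kernel. The following self-contained sketch repeats the setup and prints the relevant shapes; the name `z1` is introduced here for illustration.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=3)

image1 = torch.zeros(1, 1, 5, 5)
image1[0, 0, 2, :] = 1

z1 = conv(image1)
print(conv.weight.shape)  # torch.Size([3, 1, 3, 3]): three 3x3 kernels
print(z1.shape)           # torch.Size([1, 3, 3, 3]): one 3x3 feature map per kernel
```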

Let's now see what happens when we have multiple input channels, e.g. an image with RGB data. The layer then needs ```in_channels``` to match the number of channels in the image.


In [28]:
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3)

In [29]:
image = torch.zeros(1, 3, 5, 5)  # three channels, matching in_channels=3
image[0, 0, 2,:]= -2
image[0, 1, 2,:]= 1
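
With several input channels, each output value sums the kernel responses over all input channels. The sketch below makes this easy to verify by hand by overwriting the random parameters with ones (weights) and zero (bias); this choice, and the name `rgb`, are assumptions made only for this example.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3)

# Overwrite the random parameters so the result can be checked by hand
with torch.no_grad():
    conv.weight.fill_(1.0)
    conv.bias.zero_()

rgb = torch.zeros(1, 3, 5, 5)
rgb[0, 0, 2, :] = -2  # first channel
rgb[0, 1, 2, :] = 1   # second channel; the third channel stays zero

out = conv(rgb)
print(out.shape)  # torch.Size([1, 1, 3, 3])
print(out)        # every entry is 3 * (-2 + 1 + 0) = -3
```

Every $3\times 3$ window contains three pixels of the middle row, so each channel contributes three times its row value and the contributions are added together.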

### Multiple input and multiple outputs

In [30]:
conv = nn.Conv2d(in_channels=2, out_channels=3, kernel_size=3)
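
In this general case the layer holds one kernel per (output channel, input channel) pair. A minimal sketch, using a random input `x` introduced here for illustration, checks the shapes, the parameter count, and that each output channel depends only on its own slice of the weight tensor.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv2d(in_channels=2, out_channels=3, kernel_size=3)
x = torch.randn(1, 2, 5, 5)
out = conv(x)

print(conv.weight.shape)  # torch.Size([3, 2, 3, 3])
print(out.shape)          # torch.Size([1, 3, 3, 3])

n_params = sum(p.numel() for p in conv.parameters())
print(n_params)           # 3 * 2 * 3 * 3 weights + 3 biases = 57

# Output channel k is computed from kernel slice k and bias k alone
k = 1
single = F.conv2d(x, conv.weight[k:k + 1], conv.bias[k:k + 1])
print(torch.allclose(out[:, k:k + 1], single, atol=1e-5))
```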

## <a name="refs"></a> References