# 1.2. PyTorch Conv1d and Conv2d
**Goal:** Explain and demonstrate how to use the `nn.Conv1d` and `nn.Conv2d` modules.

We will also explore the different hyperparameters, mainly
 - `stride`
 - `dilation`
 - `kernel_size`

And explain what the following are
 - channels

## Conv1d and channels

In [3]:
import torch
import torch.nn as nn
import torch.nn.functional as F

import matplotlib.pyplot as plt

Below I create a convolution kernel $k_{1}$ with a kernel size of 5 and zero-padding of 2, and apply
it to a batch of 128 inputs, each of length 100.

Note the following:
 - The input tensor is 3D. The first dimension is the batch size, the second is the 'number of channels'
 and the third dimension is the input length.
 - How the padding size is computed.
 - The weights are 3D. The shape of `weights` is `(1, 1, 5)` and not just `(5)`.
 - There is a bias.

In [51]:
kernel_size = 5

k1 = nn.Conv1d(
    in_channels=1,
    out_channels=1,
    kernel_size=kernel_size,
    padding=kernel_size//2,
)

x1 = torch.randn(128, 1, 100)

print(f'Batch size: {x1.shape[0]}, num. channels: {x1.shape[1]}, input_length: {x1.shape[2]}\n')

# named_parameters() yields (name, parameter) tuples, hence the [1] indexing below
weights, biases = k1.named_parameters()

print(weights)
print(f'weights.shape: {weights[1].shape}\n')

print(biases)
print(f'biases.shape: {biases[1].shape}\n')

with torch.no_grad():
    y1 = k1(x1)

print(f'y1.shape {y1.shape}. Identical shape to x1.')

Batch size: 128, num. channels: 1, input_length: 100

('weight', Parameter containing:
tensor([[[0.0807, 0.3159, 0.2228, 0.2667, 0.0645]]], requires_grad=True))
weights.shape: torch.Size([1, 1, 5])

('bias', Parameter containing:
tensor([0.2995], requires_grad=True))
biases.shape: torch.Size([1])

y1.shape torch.Size([128, 1, 100]). Identical shape to x1.


How is the padding computed?
 - Padding is set to `kernel_size//2`, which adds that many zeros to each side of the input tensor.
 With an odd kernel size and the default stride of 1, the output then has the same length as the input.
 - You'll generally find that kernels have odd dimensions to facilitate this.
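
The claim about padding can be checked against the output-length formula given in the PyTorch `nn.Conv1d` documentation. The following is a minimal sketch; the helper name `conv_out_length` is hypothetical and introduced here only for illustration.

```python
import torch
import torch.nn as nn


def conv_out_length(length, kernel_size, padding=0, stride=1, dilation=1):
    """Output length of a 1D convolution (formula from the nn.Conv1d docs)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


x = torch.randn(128, 1, 100)

for kernel_size in (3, 5, 7):
    padded = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2)
    unpadded = nn.Conv1d(1, 1, kernel_size)  # default padding=0
    with torch.no_grad():
        padded_len = padded(x).shape[-1]
        unpadded_len = unpadded(x).shape[-1]
    # The formula should agree with what the layers actually produce
    assert padded_len == conv_out_length(100, kernel_size, padding=kernel_size // 2)
    assert unpadded_len == conv_out_length(100, kernel_size)
    print(f'kernel_size={kernel_size}: padded -> {padded_len}, unpadded -> {unpadded_len}')
```

With an even kernel size such as 4, the same formula gives `conv_out_length(100, 4, padding=2) == 101`, so the output no longer matches the input length. This is one reason odd kernel sizes are the usual choice.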

What are channels?
 - Samples often have multiple channels.
 - An example is more easily given in 2D. A batch of images may have 3 channels (RGB), so a batch of
 128 images of size 256x256 has shape `(128, 3, 256, 256)`. A grayscale image only needs a single
 channel, so the batch would have shape `(128, 1, 256, 256)`.
 - In 1D, you may have financial time series datasets that record a time and a price. This would yield
 2 channels, giving a shape such as `(128, 2, X)`.
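
As a rough illustration of where the channel dimension sits, the sketch below builds random tensors with these layouts and passes a 2-channel series through a `Conv1d`. The data is random, and the sizes (8 images, a series length of 50, 4 output channels) are arbitrary choices for the example, not values from this notebook.

```python
import torch
import torch.nn as nn

# Image batches are 4D: (batch, channels, height, width)
images_rgb = torch.randn(8, 3, 256, 256)
images_gray = torch.randn(8, 1, 256, 256)

# 1D series batches are 3D: (batch, channels, length)
series = torch.randn(128, 2, 50)

# in_channels must match the size of the channel dimension of the input
conv = nn.Conv1d(in_channels=2, out_channels=4, kernel_size=5, padding=2)
with torch.no_grad():
    out = conv(series)

print(images_rgb.shape[1], images_gray.shape[1], series.shape[1])  # 3 1 2
print(out.shape)  # torch.Size([128, 4, 50])
```

The batch size and length pass through unchanged here, while the channel dimension goes from `in_channels` to `out_channels`.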

Why is the kernel shape `(1, 1, 5)`?
 - The shape of the kernel in PyTorch is `(out_channels, in_channels, kernel_size)`, as you can see
 in the example below.

In [66]:
k2 = nn.Conv1d(
    in_channels=3,
    out_channels=2,
    kernel_size=kernel_size,
    padding=kernel_size//2,
)

weights, biases = k2.named_parameters()

print(f'weights.shape: {weights[1].shape}\n')

print(biases)
print(f'biases.shape: {biases[1].shape}\n')

weights.shape: torch.Size([2, 3, 5])

('bias', Parameter containing:
tensor([ 0.1754, -0.2114], requires_grad=True))
biases.shape: torch.Size([2])



And the bias?
 - There are as many biases as there are `out_channels`, as you can see in the two examples above.
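
To see how the weight and bias shapes are used, one output value can be recomputed by hand. The sketch below assumes the layer behaves as described in the `nn.Conv1d` documentation (a cross-correlation, so the kernel is not flipped). The input sizes and the indices `b`, `o`, `t` are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

conv = nn.Conv1d(in_channels=3, out_channels=2, kernel_size=5, padding=2)
x = torch.randn(4, 3, 20)

with torch.no_grad():
    y = conv(x)

    # Pick one output value: batch item b, output channel o, position t
    b, o, t = 1, 0, 7

    # Zero-pad the length dimension by 2 on each side, as the layer does
    x_padded = F.pad(x, (2, 2))

    # The window covers all input channels: shape (in_channels, kernel_size)
    window = x_padded[b, :, t:t + 5]

    # Sum over input channels and kernel positions, then add that output channel's bias
    manual = (conv.weight[o] * window).sum() + conv.bias[o]

print(torch.allclose(manual, y[b, o, t], atol=1e-5))
```

Each output channel has its own `(in_channels, kernel_size)` slice of the weights and a single bias, which is why the weights have shape `(2, 3, 5)` and the bias has shape `(2,)` in this example.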

## Conv2d
You can create a 2D convolutional kernel in a very similar way to a 1D kernel.

In [73]:
kernel_size = 3

k3 = nn.Conv2d(
    in_channels=1,
    out_channels=2,
    kernel_size=kernel_size,
    padding=kernel_size//2,
)

weights, biases = k3.named_parameters()

print(f'weights.shape: {weights[1].shape}')
print(f'biases.shape: {biases[1].shape}')

weights.shape: torch.Size([2, 1, 3, 3])
biases.shape: torch.Size([2])


In this case, the size of the kernel becomes `(out_channels, in_channels, kernel_size, kernel_size)`
and the number of biases remains as 1 per `out_channel`.
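
A forward pass shows the corresponding input and output shapes. This is a small sketch with an arbitrary batch of 8 single-channel 28x28 inputs. The rectangular kernel at the end relies on `kernel_size` and `padding` also accepting tuples, as described in the `nn.Conv2d` documentation.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=3, padding=1)
x = torch.randn(8, 1, 28, 28)  # (batch, channels, height, width)

with torch.no_grad():
    y = conv(x)

print(y.shape)  # torch.Size([8, 2, 28, 28])

# Non-square kernel: one (height, width) pair each for kernel_size and padding
rect = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=(3, 5), padding=(1, 2))
with torch.no_grad():
    y_rect = rect(x)

print(rect.weight.shape)  # torch.Size([2, 1, 3, 5])
print(y_rect.shape)       # torch.Size([8, 2, 28, 28])
```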

## Hyperparameters
The main hyperparameters for a convolutional layer are the `stride`, `dilation` and `kernel_size` (which
has already been explored above).

The stride is simply the step size with which the kernel moves across the input.

In [76]:
k4 = nn.Conv2d(
    in_channels=1,
    out_channels=2,
    kernel_size=kernel_size,
    padding=kernel_size//2,
)

weights, biases = k4.named_parameters()

print(f'weights.shape: {weights[1].shape}')
print(f'biases.shape: {biases[1].shape}')

weights.shape: torch.Size([2, 1, 3, 3])
biases.shape: torch.Size([2])
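
Neither `stride` nor `dilation` changes the shape of the weights; both change how the kernel is applied and therefore the output size. The sketch below compares a few settings on an arbitrary 32x32 input. The configuration labels and the `shapes` dictionary are names made up for this example.

```python
import torch
import torch.nn as nn

x = torch.randn(8, 1, 32, 32)

configs = {
    'default': dict(padding=1),
    'stride=2': dict(padding=1, stride=2),
    'dilation=2': dict(padding=1, dilation=2),
    'dilation=2, padding=2': dict(padding=2, dilation=2),
}

shapes = {}
for name, kwargs in configs.items():
    conv = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=3, **kwargs)
    with torch.no_grad():
        shapes[name] = tuple(conv(x).shape)
    # The weight shape is (2, 1, 3, 3) in every configuration
    print(f'{name}: weights {tuple(conv.weight.shape)}, output {shapes[name]}')
```

A stride of 2 roughly halves each spatial dimension (32 to 16 here). A dilation of 2 spreads the 3x3 kernel over a 5x5 area, so it needs `padding=2` rather than `padding=1` to keep the output at 32x32.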
