https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md

In [2]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

In [3]:
import torch
import torch.nn as nn
import numpy as np
import pandas as pd

Create a toy input with 2 channels: a row of ones and a row of twos.

In [92]:
data = torch.ones((2,15))
data[1] += 1
data

tensor([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
        [2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]])

## Basic 1d Conv

Let's look at what the output of a basic convolution looks like.

1 channel in, 1 channel out, and a kernel length of 5.

No extra striding (stride = 1), no padding (padding = 0), and no dilation (dilation = 1).

Note that only `in_channels`, `out_channels`, and `kernel_size` are required. The other arguments are written out below with their default values (except `bias`, which we turn off for now).

In [70]:
model = torch.nn.Conv1d(in_channels=1, out_channels=1, kernel_size=5,
                        stride=1, padding=0, dilation=1, groups=1, bias=False, padding_mode='zeros')

print(model.weight)

Parameter containing:
tensor([[[-0.2142, -0.1719,  0.0711, -0.0663, -0.3345]]], requires_grad=True)
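To make the point about required arguments concrete, here is a minimal sketch that builds a layer from only the three required arguments and prints what the remaining settings default to. The name `default_conv` is just illustrative and is not used elsewhere in this notebook.

```python
import torch.nn as nn

# Only the three required arguments are given; everything else keeps its default.
default_conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=5)

print(default_conv.stride)                # (1,)
print(default_conv.padding)               # (0,)
print(default_conv.dilation)              # (1,)
print(default_conv.groups)                # 1
print(default_conv.padding_mode)          # zeros
print(default_conv.bias is not None)      # True: bias is on unless bias=False is passed
print(tuple(default_conv.weight.shape))   # (1, 1, 5) = (out_channels, in_channels, kernel_size)
```

The settings are stored as 1-tuples because `Conv1d` has a single spatial dimension.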


Let's grab the first channel and add 2 dummy dimensions (batch and channel), since `Conv1d` expects input of shape (batch, channels, length).

In [88]:
inp = data[0].view(1,1,-1)
# The output shown below comes from a run with these two in-place edits enabled:
# inp[0,0,-1] = 4
# inp[0,0,4] = 4
inp

tensor([[[1., 1., 1., 1., 4., 1., 1., 1., 1., 1., 1., 1., 1., 1., 4.]]])

Pass it into the model and see that the sequence length drops by 4, from 15 to 11. We'll get to this when we talk about padding, but the key point is that the kernel is only overlaid on the sequence and never slides past either end, so the first output value is computed from input indices 0 through 4 and the last from indices 10 through 14.

In [89]:
out = model(inp)
print(out.shape)
out

torch.Size([1, 1, 11])


tensor([[[-1.7193, -0.9145, -0.5025, -1.2315, -1.3582, -0.7157, -0.7157,
          -0.7157, -0.7157, -0.7157, -1.7193]]], grad_fn=<SqueezeBackward1>)

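Note that each output value is just $\sum_{i=0}^{k-1} x_{t+i} \, w_i$, where $k$ is the kernel length, $w_i$ is each weight, and $x_{t+i}$ is the input value under weight $i$ at output position $t$. Wherever the window covers only ones, this reduces to $\sum_{i=0}^{k-1} 1 \cdot w_i$, which is simply the sum of the weights.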

In [61]:
model.weight.sum()

tensor(0.6312, grad_fn=<SumBackward0>)
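The weighted-sum description can be checked by sliding the kernel over the sequence by hand. This is a minimal sketch on a fresh all-ones input with freshly initialized weights; the seed is arbitrary, and `check_conv`, `manual`, and `auto` are illustrative names rather than part of the notebook above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the random weights are repeatable

check_conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=5, bias=False)
x = torch.ones(1, 1, 15)

kernel = check_conv.weight[0, 0]  # shape (5,)
seq = x[0, 0]                     # shape (15,)

# Overlay the kernel at every position where it fits entirely inside the sequence.
manual = torch.stack([
    (seq[t:t + 5] * kernel).sum() for t in range(15 - 5 + 1)
])

auto = check_conv(x)[0, 0]
print(torch.allclose(manual, auto, atol=1e-6))             # manual windows match Conv1d
print((auto - kernel.sum()).abs().max().item() < 1e-5)     # every output equals the weight sum
```

Note that `Conv1d` applies the kernel without flipping it (a cross-correlation), which is why the plain elementwise product is enough here.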

### Kernel size that is not a factor of the sequence length



In [90]:
model = torch.nn.Conv1d(in_channels=1, out_channels=1, kernel_size=4,
                        stride=1, padding=0, dilation=1, groups=1, bias=False, padding_mode='zeros')

print(model.weight)

Parameter containing:
tensor([[[ 0.0741, -0.2244, -0.0752,  0.1084]]], requires_grad=True)


Again, since the kernel is only overlaid on top of the sequence, 3 positions in the sequence have no corresponding output from the model. A kernel of length 4 needs 4 input values per output, so a length-15 input gives 15 - 4 + 1 = 12 outputs.

In [91]:
out = model(inp)
print(out.shape)
out

torch.Size([1, 1, 12])


tensor([[[-0.1170,  0.2081, -0.3425, -0.7901,  0.1054, -0.1170, -0.1170,
          -0.1170, -0.1170, -0.1170, -0.1170,  0.2081]]],
       grad_fn=<SqueezeBackward1>)
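The two output lengths seen so far (11 for a kernel of 5, 12 for a kernel of 4) follow the output-length formula given in the PyTorch `Conv1d` documentation. A small sketch, where `conv1d_out_len` is a hypothetical helper written for this notebook and not a torch function:

```python
import torch
import torch.nn as nn

def conv1d_out_len(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1d convolution, per the formula in the Conv1d docs."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

x = torch.ones(1, 1, 15)
for k in (5, 4):
    layer = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=k, bias=False)
    # predicted length vs. the length Conv1d actually returns
    print(k, conv1d_out_len(15, k), layer(x).shape[-1])
# 5 11 11
# 4 12 12
```

With stride 1, no padding, and no dilation, the formula collapses to `length - kernel_size + 1`, so whether the kernel size divides the sequence length does not matter.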

## Multiple inputs

So what happens when we change the number of input channels to 2 but still expect 1 channel in the output? We see that the weight tensor now has 2 kernels of length 5, one per input channel, so its shape is (1, 2, 5).

In [68]:
model = torch.nn.Conv1d(in_channels=2, out_channels=1, kernel_size=5,
                        stride=1, padding=0, dilation=1, groups=1, bias=False, padding_mode='zeros')

print(model.weight)

Parameter containing:
tensor([[[ 0.2719,  0.1946, -0.3042,  0.1587, -0.1256],
         [-0.1946,  0.0615, -0.1985, -0.1382,  0.2089]]], requires_grad=True)


So let's grab both channels and add a dummy batch dimension to the tensor, since torch expects input of shape (batch, channels, length). The result has shape (1, 2, 15).

In [65]:
inp = data.view(1,2,-1)
inp

tensor([[[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
         [2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]]])

Pass it into the model

In [66]:
model(inp)

tensor([[[-0.6352, -0.6352, -0.6352, -0.6352, -0.6352, -0.6352, -0.6352,
          -0.6352, -0.6352, -0.6352, -0.6352]]], grad_fn=<SqueezeBackward1>)

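Note that each output value is now a sum over both channels: $\sum_{i=0}^{k-1} 1 \cdot w_{0,i} + \sum_{i=0}^{k-1} 2 \cdot w_{1,i}$, where $k$ is the kernel length and $w_{c,i}$ is weight $i$ of the kernel for input channel $c$. The first channel is multiplied by 1 because it is all ones, and the second by 2 because it is all twos, so the plain sum of all weights generally no longer equals the output.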

In [61]:
model.weight.sum()

tensor(0.6312, grad_fn=<SumBackward0>)
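The two-channel case can be verified the same way: sum each channel's kernel, scale it by that channel's constant value, and add the results. A minimal sketch with freshly initialized weights; the seed is arbitrary, and `two_ch_conv`, `expected`, and `y` are illustrative names.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the random weights are repeatable

two_ch_conv = nn.Conv1d(in_channels=2, out_channels=1, kernel_size=5, bias=False)

x = torch.ones(1, 2, 15)
x[0, 1] += 1  # channel 0 is all ones, channel 1 is all twos

w = two_ch_conv.weight[0]                    # shape (2, 5): one kernel per input channel
expected = 1 * w[0].sum() + 2 * w[1].sum()   # channel values times their kernel sums

y = two_ch_conv(x)
print(y.shape)                                         # torch.Size([1, 1, 11])
print((y - expected).abs().max().item() < 1e-5)        # every output equals `expected`
```

The per-channel results are added into a single output channel, which is why the output still has shape (1, 1, 11) even though the input has two channels.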

In [19]:
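# `a` is not defined in the cells shown. Assumption: it holds the same values as
# `data` (15 ones followed by 15 twos), which matches the output below.
a = data
a.view(1,1,-1)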

tensor([[[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 2., 2.,
          2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2., 2.]]])

Stride is how many positions the convolution kernel moves before making each new calculation. If you set it to 0, the kernel just stays in one spot, which is not a usable setting and can kill the notebook kernel you're working in.
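To see the effect of stride directly, the sketch below uses an input that counts 0, 1, ..., 14 and a hand-set kernel that copies only the first element of each window, so every output value is the index where that window starts. The kernel values and the name `starts_by_stride` are chosen purely for illustration.

```python
import torch
import torch.nn as nn

x = torch.arange(15, dtype=torch.float32).view(1, 1, -1)  # 0, 1, ..., 14

starts_by_stride = {}
for stride in (1, 2, 3):
    layer = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=5,
                      stride=stride, bias=False)
    with torch.no_grad():
        layer.weight.fill_(0.0)
        layer.weight[0, 0, 0] = 1.0  # keep only the first element of each window
    starts_by_stride[stride] = layer(x)[0, 0].detach().tolist()
    print(stride, starts_by_stride[stride])
# 1 [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
# 2 [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
# 3 [0.0, 3.0, 6.0, 9.0]
```

A larger stride skips window positions, so the output gets shorter even though the kernel itself is unchanged.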

In [46]:
model = torch.nn.Conv1d(in_channels=1, out_channels=1, kernel_size=5,
                stride=1, padding=0, dilation=2, groups=1,
                bias=False, padding_mode='zeros')

print(model.weight)
model(a.view(1,1,-1))

Parameter containing:
tensor([[[-0.4267,  0.4077, -0.1108,  0.2961,  0.1737]]], requires_grad=True)


tensor([[[0.8097, 0.8097, 0.8097, 0.8097, 0.8097, 0.6989, 0.6989, 0.6989,
          0.6989, 0.6989]]], grad_fn=<SqueezeBackward1>)
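The model cell above sets `dilation=2`, which spreads the 5 kernel weights over every second input position, so each window spans 9 inputs instead of 5. The sketch below reproduces a dilated convolution by hand on a fresh 30-value input (15 ones, then 15 twos) with freshly initialized weights; the seed is arbitrary and `dil_conv`, `span`, `manual_dil`, and `auto_dil` are illustrative names.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the random weights are repeatable

dil_conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=5,
                     dilation=2, bias=False)

x = torch.cat([torch.ones(15), 2 * torch.ones(15)]).view(1, 1, -1)  # 30 values
seq = x[0, 0]
w = dil_conv.weight[0, 0]

span = 2 * (5 - 1) + 1  # a dilated window covers 9 input positions
# Each window takes every second element: indices t, t+2, t+4, t+6, t+8.
manual_dil = torch.stack([
    (seq[t:t + span:2] * w).sum() for t in range(30 - span + 1)
])

auto_dil = dil_conv(x)[0, 0]
print(auto_dil.shape[0])                                  # 22
print(torch.allclose(manual_dil, auto_dil, atol=1e-6))    # manual windows match Conv1d
```

Windows that lie entirely in the ones region give the plain weight sum, windows entirely in the twos region give twice that, and windows straddling the boundary give values in between.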

In [39]:
model.weight[0,0,1:].sum()

tensor(0.4398, grad_fn=<SumBackward0>)

In [33]:
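# The index was left empty in the original cell (a syntax error). Assumption: all
# weights were meant, so `[:]` is used here; the output below is from the original run.
model.weight[:].sum()*2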


tensor(-0.0668, grad_fn=<MulBackward0>)

In [47]:
torch.tensor([ 0.2745, -0.1809,  0.0806, -0.3044, -0.2521]).sum()

tensor(-0.3823)

In [48]:
torch.tensor([ 0.2901,  0.1385, -0.2273, -0.1210,  0.2087]).sum()

tensor(0.2890)

In [50]:
-0.3823 + 0.2890*2

0.19569999999999999

In [54]:
torch.tensor([0.0807,  0.1279, -0.1774, -0.0982, -0.0345]).sum() + 2 * torch.tensor([-0.0052,  0.0016,  0.1569,  0.2386,  0.1325]).sum()

tensor(0.9473)