
**This notebook is a set of lecture notes. The lecture video is linked below:**

https://www.bilibili.com/video/BV1ce411K7XC?p=14&vd_source=7cca4a20f2401942703a8c8eff4d7492

## load library

In [2]:
import torch
import torch.nn as nn

## ```torch.nn.Conv2d()```

$$\text{input image}\xrightarrow{kernel}\text{feature map}$$
- $f$ kernel size
- $n_c$ number of channels
- $s$ stride
- $p$ padding

### descriptions

```torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)```

### formulas

$$W_{out}=\text{floor}\left(\frac{W_{in}+2p-f}{s}\right)+1$$
$$H_{out}=\text{floor}\left(\frac{H_{in}+2p-f}{s}\right)+1$$
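
The formulas above can be checked with a small helper. This is a minimal sketch: `conv_out_size` is a hypothetical name introduced here, and it assumes `dilation=1`, as the formulas do. The two calls use the same settings as the examples later in these notes.

```python
def conv_out_size(size_in: int, f: int, s: int = 1, p: int = 0) -> int:
    """Output size along one spatial dimension (dilation assumed to be 1)."""
    # Integer division implements the floor for non-negative values.
    return (size_in + 2 * p - f) // s + 1

# 5x5 input, 3x3 kernel, default stride and padding
print(conv_out_size(5, f=3))            # 3
# 5x5 input, 3x3 kernel, stride 2, padding 1
print(conv_out_size(5, f=3, s=2, p=1))  # 3
```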


### dilated convolution
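*Dilation inserts gaps between the kernel elements, which enlarges the receptive field of the kernel. Its advantage is that the number of learnable parameters stays the same instead of increasing. Without extra padding, a larger dilation also reduces the size of the output feature map.*
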
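
A minimal sketch of the effect of dilation, assuming a 7x7 single-channel input chosen only for illustration. It compares a plain 3x3 convolution with a dilated one: the parameter count is identical, while the dilated kernel covers a 5x5 window and therefore produces a smaller output when no padding is used.

```python
import torch
import torch.nn as nn

x = torch.ones(1, 1, 7, 7)

plain = nn.Conv2d(1, 1, 3)                # dilation=1 (default)
dilated = nn.Conv2d(1, 1, 3, dilation=2)  # kernel taps spread over a 5x5 window

def n_params(layer):
    # Total number of learnable values (weights and bias)
    return sum(p.numel() for p in layer.parameters())

print(n_params(plain), n_params(dilated))  # same count: 9 weights + 1 bias each
print(plain(x).shape)    # torch.Size([1, 1, 5, 5])
print(dilated(x).shape)  # torch.Size([1, 1, 3, 3])
```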
### group convolution
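
As a minimal sketch of what the `groups` argument in the signature above does: with `groups=g`, the input channels are split into `g` groups and each output channel only sees the input channels of its own group, so each filter has `in_channels / g` channels. The channel counts (4 and 8) are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

standard = nn.Conv2d(4, 8, 3, bias=False)           # groups=1 (default)
grouped = nn.Conv2d(4, 8, 3, groups=2, bias=False)  # 2 groups of 2 input channels

print(standard.weight.shape)  # torch.Size([8, 4, 3, 3])
print(grouped.weight.shape)   # torch.Size([8, 2, 3, 3])

# The output shape is the same; only the number of weights differs.
x = torch.ones(1, 4, 5, 5)
print(standard(x).shape, grouped(x).shape)
```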

### Example: a simple convolution layer

In [7]:
c1 = nn.Conv2d(3, 1, 3)
print('C1 weights', c1.weight.shape)
print(c1.weight) # note that each input channel has its own 3x3 kernel

C1 weights torch.Size([1, 3, 3, 3])
Parameter containing:
tensor([[[[ 0.1085,  0.0888, -0.1721],
          [ 0.1190,  0.0385, -0.1808],
          [ 0.1075, -0.0761,  0.1911]],

         [[ 0.1308, -0.0104, -0.0102],
          [-0.1293, -0.1312, -0.1629],
          [ 0.1627, -0.0639, -0.0252]],

         [[ 0.1694,  0.1515, -0.1533],
          [-0.1015, -0.1515, -0.1795],
          [ 0.1819,  0.0053,  0.0895]]]], requires_grad=True)


In [10]:
input = torch.ones(1, 3, 5, 5)
output = c1(input)
print(input.shape)
print(output.shape)

torch.Size([1, 3, 5, 5])
torch.Size([1, 1, 3, 3])


### Example: load pre-trained weights

In [12]:
c2 = nn.Conv2d(3, 1, 3, bias = False)
w2 = c2.weight
c2.weight = nn.Parameter(torch.ones_like(w2), requires_grad=True) # overwrite the weights; all ones stand in for pre-trained weights
print(c2.weight)
output = c2(input)
print(output)

Parameter containing:
tensor([[[[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]],

         [[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]],

         [[1., 1., 1.],
          [1., 1., 1.],
          [1., 1., 1.]]]], requires_grad=True)
tensor([[[[27., 27., 27.],
          [27., 27., 27.],
          [27., 27., 27.]]]], grad_fn=<ConvolutionBackward0>)


## ```torch.nn.ConvTranspose2d()```

*It is better to pass `output_size` when calling the layer, so that the output size matches the size of the original input.*

In [20]:
input = torch.ones(1, 1, 5, 5)
print('The size of input is\n', input.size())
c3 = nn.Conv2d(1, 2, 3, stride = 2, padding = 1)
downsample = c3(input)
print('The size of intermediate result is\n', downsample.shape)

u4 = nn.ConvTranspose2d(2, 1, 3, stride = 2, padding = 1)
upsample = u4(downsample, output_size = input.size())
print('The size of output is\n', upsample.shape)

The size of input is
 torch.Size([1, 1, 5, 5])
The size of intermediate result is
 torch.Size([1, 2, 3, 3])
The size of output is
 torch.Size([1, 1, 5, 5])
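
Why `output_size` is needed can be sketched as follows. With a stride of 2, a 5x5 input and a 6x6 input are both downsampled to 3x3, so the transposed convolution cannot tell which size to restore. The layer settings are the same as in the cell above; the 6x6 input and the variable names are additions for illustration.

```python
import torch
import torch.nn as nn

down = nn.Conv2d(1, 2, 3, stride=2, padding=1)
up = nn.ConvTranspose2d(2, 1, 3, stride=2, padding=1)

x5 = torch.ones(1, 1, 5, 5)
x6 = torch.ones(1, 1, 6, 6)

# Both inputs are downsampled to the same 3x3 spatial size.
d5, d6 = down(x5), down(x6)
print(d5.shape, d6.shape)

# Without output_size, the transposed convolution returns the smaller size (5x5).
print(up(d6).shape)
# With output_size, the 6x6 size is recovered.
print(up(d6, output_size=x6.size()).shape)
```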


## Fold and Unfold

$$\text{Conv2D}=\text{Unfold}(2)+\text{Matmul}+\text{Fold}$$

In PyTorch, a convolution layer can be implemented with the three operations above:
1. Unfold the kernel and the input feature map into two matrices
2. Multiply the two matrices
3. Fold the resulting matrix back into a feature map
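
The three steps can be sketched with `torch.nn.functional.unfold` and `torch.nn.functional.fold`, then compared against `nn.Conv2d`. This is a minimal sketch, assuming stride 1, no padding, and no bias; the tensor sizes are arbitrary. Here the fold step uses `kernel_size=1`, which amounts to reshaping the columns into a feature map.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 3, 5, 5)
conv = nn.Conv2d(3, 2, 3, bias=False)

# 1. Unfold: every 3x3x3 input patch becomes a column -> (1, 27, 9)
cols = F.unfold(x, kernel_size=3)
#    Flatten each filter into a row -> (2, 27)
w = conv.weight.view(conv.out_channels, -1)

# 2. Matmul: (2, 27) @ (1, 27, 9) -> (1, 2, 9)
out_cols = w @ cols

# 3. Fold: arrange the 9 columns back into a 3x3 map -> (1, 2, 3, 3)
out = F.fold(out_cols, output_size=(3, 3), kernel_size=1)

# Compare with the built-in convolution
print(torch.allclose(out, conv(x), atol=1e-5))
```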