### <span style="color:#3bbfa0; font-size:2em;">*If You're Always Using the 'Same' Padding, You're Probably Doing It Wrong*</span><br><span style="color:lightgray; font-size:2em;">*What Padding Really Is and How to Select It*</span>
<br>
<img src="./asset/img/padding.gif" width="800">

In [9]:
import torch
import torch.nn as nn
# Random input of shape (batch=1, channels=1, height=5, width=5)
x = torch.randn(1, 1, 5, 5)
x

tensor([[[[-0.7949,  1.0612, -0.7323, -0.2455, -0.1001],
          [-2.7019,  0.3012,  0.0639, -1.2704, -0.3334],
          [ 0.2886, -0.3720, -0.1025,  0.5043, -0.7222],
          [-0.5054,  0.8228, -1.0236, -1.2244, -1.9280],
          [ 0.3289, -0.3378,  0.7619,  0.6806, -0.5579]]]])

In [10]:
# 3x3 convolution with padding=1: each side is zero-padded by one pixel, so the 5x5 input stays 5x5
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
out = conv(x)
print(out.shape)  # Prints: torch.Size([1, 1, 5, 5])
out

torch.Size([1, 1, 5, 5])


tensor([[[[-0.5859, -0.0851, -1.1073, -0.2818, -0.4121],
          [-1.4088,  1.0385, -0.3710, -0.8102,  0.0723],
          [-0.4765, -0.7008, -0.1255, -1.0180, -1.0310],
          [-0.2687,  0.2851, -0.7900, -0.0779, -0.4261],
          [-0.4424, -0.0440,  0.2546, -0.0673, -0.9075]]]],
       grad_fn=<ConvolutionBackward0>)
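
The shape stays 5×5 because of the output-size formula given in PyTorch's `Conv2d` documentation: `floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)` per spatial dimension. The sketch below evaluates that formula in plain Python; `conv_out_size` is a hypothetical helper written for illustration, not a PyTorch function.

```python
def conv_out_size(size, kernel_size, padding=0, stride=1, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# No padding: a 3x3 kernel shrinks a 5x5 input to 3x3
print(conv_out_size(5, kernel_size=3, padding=0))  # 3
# padding=1 compensates for the 3x3 kernel, as in the cell above
print(conv_out_size(5, kernel_size=3, padding=1))  # 5
# A 5x5 kernel needs padding=2 to preserve the size
print(conv_out_size(5, kernel_size=5, padding=2))  # 5
```

For stride 1 and an odd kernel size, `padding = dilation * (kernel_size - 1) // 2` preserves the spatial size, which is the amount that `padding='same'` selects in that case.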

### <span style="color:#3bbfa0; font-size:1.2em;">*1.1 Constant Padding:*</span>

In [11]:
# Constant padding: add a 1-pixel border filled with 77 (an easy-to-spot value) on every side
constant_pad = nn.ConstantPad2d(padding=1, value=77)
out = constant_pad(x)
print(out.shape)  # Prints: torch.Size([1, 1, 7, 7])
out

torch.Size([1, 1, 7, 7])


tensor([[[[ 7.7000e+01,  7.7000e+01,  7.7000e+01,  7.7000e+01,  7.7000e+01,
            7.7000e+01,  7.7000e+01],
          [ 7.7000e+01, -7.9491e-01,  1.0612e+00, -7.3232e-01, -2.4549e-01,
           -1.0012e-01,  7.7000e+01],
          [ 7.7000e+01, -2.7019e+00,  3.0118e-01,  6.3919e-02, -1.2704e+00,
           -3.3337e-01,  7.7000e+01],
          [ 7.7000e+01,  2.8865e-01, -3.7205e-01, -1.0249e-01,  5.0432e-01,
           -7.2220e-01,  7.7000e+01],
          [ 7.7000e+01, -5.0544e-01,  8.2278e-01, -1.0236e+00, -1.2244e+00,
           -1.9280e+00,  7.7000e+01],
          [ 7.7000e+01,  3.2894e-01, -3.3782e-01,  7.6193e-01,  6.8059e-01,
           -5.5793e-01,  7.7000e+01],
          [ 7.7000e+01,  7.7000e+01,  7.7000e+01,  7.7000e+01,  7.7000e+01,
            7.7000e+01,  7.7000e+01]]]])

### <span style="color:#3bbfa0; font-size:1.2em;">*1.2 Reflection Padding:*</span>

For example, take the one-dimensional tensor `[1, 2, 3]` and apply reflection padding of size 2. The values are mirrored around the first and last elements, which are not themselves repeated, so the padded tensor is `[3, 2, 1, 2, 3, 2, 1]`.
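
The one-dimensional example can be reproduced with a few lines of plain Python. This is a minimal sketch of the indexing logic only, assuming a list input; `reflect_pad_1d` is a hypothetical helper, not how PyTorch implements `nn.ReflectionPad2d`.

```python
def reflect_pad_1d(values, pad):
    """Mirror `values` around its first and last element without repeating them."""
    if pad >= len(values):
        # Reflection needs enough interior elements to mirror
        raise ValueError("pad must be smaller than the input length")
    left = values[1:pad + 1][::-1]      # elements after the first, reversed
    right = values[-pad - 1:-1][::-1]   # elements before the last, reversed
    return left + values + right

print(reflect_pad_1d([1, 2, 3], 2))  # [3, 2, 1, 2, 3, 2, 1]
```

The size check mirrors a real constraint: reflection padding cannot be larger than what the input has available to mirror along that dimension.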

In [12]:
# Reflection padding: mirror the input across each edge (the edge values themselves are not repeated)
reflection_pad = nn.ReflectionPad2d(padding=1)
out = reflection_pad(x)
print(out.shape)  # Prints: torch.Size([1, 1, 7, 7])
out

torch.Size([1, 1, 7, 7])


tensor([[[[ 0.3012, -2.7019,  0.3012,  0.0639, -1.2704, -0.3334, -1.2704],
          [ 1.0612, -0.7949,  1.0612, -0.7323, -0.2455, -0.1001, -0.2455],
          [ 0.3012, -2.7019,  0.3012,  0.0639, -1.2704, -0.3334, -1.2704],
          [-0.3720,  0.2886, -0.3720, -0.1025,  0.5043, -0.7222,  0.5043],
          [ 0.8228, -0.5054,  0.8228, -1.0236, -1.2244, -1.9280, -1.2244],
          [-0.3378,  0.3289, -0.3378,  0.7619,  0.6806, -0.5579,  0.6806],
          [ 0.8228, -0.5054,  0.8228, -1.0236, -1.2244, -1.9280, -1.2244]]]])

### <span style="color:#3bbfa0; font-size:1.2em;">*1.3 Replication Padding:*</span>

For example, applying replication padding of size 2 to the one-dimensional tensor `[1, 2, 3]` gives `[1, 1, 1, 2, 3, 3, 3]`. The first and last values are repeated ("replicated") to fill the padding.
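
The same example in plain Python, as a minimal sketch of the idea; `replicate_pad_1d` is a hypothetical helper written for illustration, not PyTorch's implementation.

```python
def replicate_pad_1d(values, pad):
    """Repeat the first and last element `pad` times on each side."""
    if not values:
        raise ValueError("cannot pad an empty sequence")
    return [values[0]] * pad + values + [values[-1]] * pad

print(replicate_pad_1d([1, 2, 3], 2))  # [1, 1, 1, 2, 3, 3, 3]
```

Unlike reflection padding, the pad size here is not limited by the input length, because only the border value is reused.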

In [14]:
# Replication padding: repeat the border values outward on every side
replication_pad = nn.ReplicationPad2d(padding=1)
out = replication_pad(x)
print(out.shape)  # Prints: torch.Size([1, 1, 7, 7])
out

torch.Size([1, 1, 7, 7])


tensor([[[[-0.7949, -0.7949,  1.0612, -0.7323, -0.2455, -0.1001, -0.1001],
          [-0.7949, -0.7949,  1.0612, -0.7323, -0.2455, -0.1001, -0.1001],
          [-2.7019, -2.7019,  0.3012,  0.0639, -1.2704, -0.3334, -0.3334],
          [ 0.2886,  0.2886, -0.3720, -0.1025,  0.5043, -0.7222, -0.7222],
          [-0.5054, -0.5054,  0.8228, -1.0236, -1.2244, -1.9280, -1.9280],
          [ 0.3289,  0.3289, -0.3378,  0.7619,  0.6806, -0.5579, -0.5579],
          [ 0.3289,  0.3289, -0.3378,  0.7619,  0.6806, -0.5579, -0.5579]]]])

<br><br><br><br><br><br>

### <span style="color:#3bbfa0; font-size:1.2em;">*2.1 Receptive Fields:*</span>
____


1. Fawaz, H. I., Lucas, B., Forestier, G., Pelletier, C., Schmidt, D. F., Weber, J., Webb, G. I., Idoumghar, L., Muller, P.-A., & Petitjean, F. (2019). InceptionTime: Finding AlexNet for Time Series Classification. arXiv. https://doi.org/10.48550/ARXIV.1909.04939


<img src="./asset/img/paper.png" width="800">

<img src="./asset/img/rf.png" width="800">
<br><br><br>

<br><br><br>
<br><br><br>

### <span style="color:#3bbfa0; font-size:1.2em;">*2.2 Effective Receptive Fields:*</span>


2. Luo, W., Li, Y., Urtasun, R., & Zemel, R. (2017). Understanding the Effective Receptive Field in Deep Convolutional Neural Networks (Version 2). arXiv. https://doi.org/10.48550/ARXIV.1701.04128

<img src="./asset/img/paper2.png" width="970">

1. **Increasing the Receptive Field Size**: The authors suggest that subsampling and dilated convolutions are effective ways to grow the receptive field quickly, which helps the network capture more context from the input.

2. **Modifying Network Architecture**: Architectural changes also affect the size of the effective receptive field (ERF). The authors note that skip connections tend to make the ERF smaller, which can be beneficial in certain contexts.

3. **Adjusting Training Techniques**: The authors observed that the ERF changes while deep CNNs train on real datasets. It grows as the network learns and ends up significantly larger than the initial ERF, which suggests that training choices are another way to manage ERF size.

4. **Balancing the Receptive Field and Resolution**: The authors suggest that receptive field size trades off against feature-map resolution, and that this trade-off should be weighed when designing architectures and training techniques.

5. **Understanding the Impact of Nonlinear Activations**: The authors also discuss how nonlinear activations affect the ERF. Understanding these effects helps when designing network architectures and training techniques.
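
To make the first point concrete, the sketch below computes the theoretical receptive field of a stack of convolutions using the standard recurrence `rf += (kernel_size - 1) * dilation * jump`, where `jump` is the product of the strides so far. The helper `receptive_field` and the three layer stacks are hypothetical and chosen only for illustration. It computes the theoretical receptive field, not the ERF, which the paper reports to be only a fraction of the theoretical one.

```python
def receptive_field(layers):
    """Theoretical receptive field of stacked convolutions along one dimension.

    layers: iterable of (kernel_size, stride, dilation) tuples, input to output.
    """
    rf, jump = 1, 1  # jump = spacing, in input pixels, between adjacent outputs
    for kernel_size, stride, dilation in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf

plain = [(3, 1, 1)] * 4                       # four 3x3 convs, stride 1
strided = [(3, 2, 1)] * 4                     # same, but each subsamples by 2
dilated = [(3, 1, d) for d in (1, 2, 4, 8)]   # same, with doubling dilation

print(receptive_field(plain))    # 9
print(receptive_field(strided))  # 31
print(receptive_field(dilated))  # 31
```

With the same number of 3×3 layers, plain convolutions grow the receptive field linearly, while subsampling or doubling dilation grows it exponentially with depth.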

<img src="./asset/img/activations.png" width="800">

<img src="./asset/img/c10.png" width="800">