### Padding and stride
- applying a convolutional layer shrinks the output and under-uses the pixels at the edges of the image, so we pad the input
  - typically pad with zeros
  - add $p_h$ rows of zeros to the top and bottom and $p_w$ columns of zeros to the left and right; padding is counted per side, so the totals added are $2p_h$ and $2p_w$
  - output shape = $(n_h + 2p_h - k_h + 1) \times (n_w + 2p_w - k_w + 1)$
  - to keep the input and output spatial dimensions equal, set $2p_h = k_h - 1$ and $2p_w = k_w - 1$, i.e. $p_h = (k_h - 1)/2$ and $p_w = (k_w - 1)/2$ per side
  - CNNs commonly use kernels with odd height and width (1, 3, 5, 7, etc.), so this padding splits evenly between the two sides
- stride is used to reduce the spatial size of the output
  - sometimes we want to skip some input positions when sliding the kernel
  - the number of rows and columns traversed per slide is the stride, $s_h$ and $s_w$
  - output shape = $\lfloor (n_h + 2p_h - k_h + s_h)/s_h \rfloor \times \lfloor (n_w + 2p_w - k_w + s_w)/s_w \rfloor$
  - with $2p_h = k_h - 1$ and $2p_w = k_w - 1$ again, this simplifies to $\lfloor (n_h + s_h - 1)/s_h \rfloor \times \lfloor (n_w + s_w - 1)/s_w \rfloor$; if $n_h$ and $n_w$ are divisible by $s_h$ and $s_w$, it simplifies further to $(n_h/s_h) \times (n_w/s_w)$
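
The shape formulas above can be checked with a few lines of plain Python. This is a minimal sketch, assuming padding is given per side (as in PyTorch's `padding` argument); the helper name `conv_output_shape` is made up for illustration and is not part of any library.

```python
def conv_output_shape(n, k, p=(0, 0), s=(1, 1)):
    """Output (height, width) of a 2D convolution.

    n: input shape, k: kernel shape, p: padding per side, s: stride.
    All arguments are (height, width) tuples.
    """
    n_h, n_w = n
    k_h, k_w = k
    p_h, p_w = p
    s_h, s_w = s
    # Integer division implements the floor in the stride formula.
    out_h = (n_h + 2 * p_h - k_h + s_h) // s_h
    out_w = (n_w + 2 * p_w - k_w + s_w) // s_w
    return out_h, out_w


# 3x3 kernel, padding 1 per side, stride 1: the 8x8 input stays 8x8.
print(conv_output_shape((8, 8), (3, 3), p=(1, 1)))            # (8, 8)
# Same kernel and padding with stride 2: the output is halved.
print(conv_output_shape((8, 8), (3, 3), p=(1, 1), s=(2, 2)))  # (4, 4)
```

With stride 1 the expression reduces to $n + 2p - k + 1$, so one function covers both the padding and the stride case.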

In [1]:
import torch
from torch import nn

In [5]:
# Padding: with a 3x3 kernel and an 8x8 input, padding 1 on every side keeps
# the output at 8x8

# Helper that applies a convolutional layer to a 2D input. It adds the batch
# and channel dimensions the layer expects and strips them from the output.
# The lazy layer initializes its weights on the first call.
def comp_conv2d(conv2d, X):
    # (1, 1) indicates that batch size and the number of channels are both 1
    X = X.reshape((1, 1) + X.shape) # 1,1,8,8
    print(X.shape)
    Y = conv2d(X)
    # strip first 2 dimensions: examples and channels
    return Y.reshape(Y.shape[2:])

# 1 row and column is padded on either side, so a total of 2 rows or columns
# are added
conv2d = nn.LazyConv2d(1, kernel_size=3, padding=1)
X = torch.rand(size=(8, 8))

# output is same size as input here
comp_conv2d(conv2d, X).shape

torch.Size([1, 1, 8, 8])


torch.Size([8, 8])
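
To see why `padding=1` preserves the size, here is a rough sketch of what zero padding does to the input before the kernel slides over it. It uses plain nested lists rather than tensors, and `zero_pad` is a hypothetical helper written for this note, not a PyTorch function.

```python
def zero_pad(X, p_h, p_w):
    """Return a copy of the 2D list X with p_h zero rows added above and
    below and p_w zero columns added to the left and right."""
    width = len(X[0]) + 2 * p_w
    padded = [[0.0] * width for _ in range(p_h)]          # top rows
    for row in X:
        padded.append([0.0] * p_w + list(row) + [0.0] * p_w)
    padded.extend([0.0] * width for _ in range(p_h))      # bottom rows
    return padded


X_small = [[1.0] * 8 for _ in range(8)]   # stand-in for the 8x8 input
P = zero_pad(X_small, 1, 1)
print(len(P), len(P[0]))                  # 10 10
```

A 3x3 kernel fits into the padded 10x10 grid at 10 - 3 + 1 = 8 positions along each axis, which matches the 8x8 output above.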

In [10]:
# We use a convolution kernel with height 5 and width 3. The padding on each
# side of the height and width is 2 and 1, respectively: (5 - 1)/2 and (3 - 1)/2
conv2d = nn.LazyConv2d(1, kernel_size=(5, 3), padding=(2, 1))
comp_conv2d(conv2d, X).shape

torch.Size([1, 1, 8, 8])


torch.Size([8, 8])
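
The per-side padding used in this cell follows directly from the kernel shape. A small sketch, assuming stride 1 and odd kernel sizes (the name `same_padding` is illustrative only):

```python
def same_padding(kernel_size):
    """Per-side padding that keeps the spatial size unchanged at stride 1.

    Only odd kernel sizes are accepted, because an even size would need
    different amounts of padding on the two sides.
    """
    padding = []
    for k in kernel_size:
        if k % 2 == 0:
            raise ValueError(f"kernel size {k} is even")
        padding.append((k - 1) // 2)
    return tuple(padding)


print(same_padding((5, 3)))  # (2, 1)
```

The result can be passed as the `padding` argument of the layer, for example `padding=same_padding((5, 3))`.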

In [15]:
# Stride: a stride of 2 in both directions halves the 8x8 input to 4x4
conv2d = nn.LazyConv2d(1, kernel_size=3, padding=1, stride=2)
comp_conv2d(conv2d, X).shape 

torch.Size([1, 1, 8, 8])


torch.Size([4, 4])

In [22]:
# floor((8 - 3 + 2*0 + 3)/3) = 2, floor((8 - 5 + 2*1 + 4)/4) = 2
conv2d = nn.LazyConv2d(1, kernel_size=(3, 5), padding=(0, 1), stride=(3, 4))
comp_conv2d(conv2d, X).shape

torch.Size([1, 1, 8, 8])


torch.Size([2, 2])
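
The 2x2 result is easier to see by listing where the kernel actually lands. The sketch below enumerates the window start offsets along one axis of the padded input; `window_starts` is a hypothetical helper and offsets are measured in padded coordinates.

```python
def window_starts(n, k, p, s):
    """Start offsets of the kernel along one axis of the padded input.

    n: input size, k: kernel size, p: padding per side, s: stride.
    """
    # The last valid start is n + 2p - k; range() stops before its end value.
    return list(range(0, n + 2 * p - k + 1, s))


rows = window_starts(8, 3, 0, 3)  # height: kernel 3, no padding, stride 3
cols = window_starts(8, 5, 1, 4)  # width: kernel 5, padding 1, stride 4
print(rows, cols)                 # [0, 3] [0, 4]
```

Each axis has two start positions, hence the 2x2 output. Along the height, the windows cover rows 0 to 2 and 3 to 5; the last two rows do not fit a full window and are skipped, which is what the floor in the formula expresses.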

In [None]:

# By the formula: floor((8 - 5 + 2*1 + 1)/1) = 6 rows and
# floor((8 - 5 + 2*1 + 2)/2) = 3 columns
conv2d = nn.LazyConv2d(1, kernel_size=(5, 5), padding=(1, 1), stride=(1, 2))
comp_conv2d(conv2d, X).shape