Suppose the input has height and width $n_h, n_w$ and the kernel has height and width $k_h, k_w$. Then the output shape is

$$\left( n_h - k_h + 1 \right) \times \left( n_w - k_w + 1\right)$$
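The formula can be checked by counting the positions at which the kernel fits entirely inside the input. Below is a minimal sketch in plain Python, assuming no padding and stride 1; the names `corr2d`, `grid` and `kernel` are illustrative and do not come from the notebook.

```python
def corr2d(grid, kernel):
    """Naive 2-D cross-correlation on nested lists (no padding, stride 1)."""
    n_h, n_w = len(grid), len(grid[0])
    k_h, k_w = len(kernel), len(kernel[0])
    # One output element per position where the kernel fits inside the input.
    out_h, out_w = n_h - k_h + 1, n_w - k_w + 1
    Y = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            Y[i][j] = sum(grid[i + a][j + b] * kernel[a][b]
                          for a in range(k_h) for b in range(k_w))
    return Y

grid = [[float(r * 8 + c) for c in range(8)] for r in range(8)]  # 8 x 8 input
kernel = [[1.0] * 3 for _ in range(3)]                           # 3 x 3 kernel
Y = corr2d(grid, kernel)
print(len(Y), len(Y[0]))  # 6 6
```

An 8 x 8 input with a 3 x 3 kernel gives 8 - 3 + 1 = 6 rows and 6 columns.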

**Padding**

Suppose a total of $p_h$ rows are added along the height and a total of $p_w$ columns along the width. The output shape is then

$$\left( n_h + p_h - k_h + 1 \right) \times \left( n_w + p_w - k_w + 1\right)$$

A common choice is $p_h = k_h - 1, p_w = k_w - 1$, which makes the output shape equal to the input shape.
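One detail to keep in mind when moving from the formula to code: $p_h$ and $p_w$ count the total rows and columns added, while the `padding` argument of `nn.Conv2d` is the amount added on each side. The sketch below converts between the two. It assumes stride 1 and odd kernel sizes, and `same_padding` is a hypothetical helper name, not a PyTorch function.

```python
def same_padding(kernel_size):
    """Per-side padding that keeps the output size equal to the input size.

    Assumes stride 1 and odd kernel sizes, so the total padding k - 1
    splits evenly between the two sides.
    """
    if isinstance(kernel_size, int):
        kernel_size = (kernel_size, kernel_size)
    for k in kernel_size:
        if k % 2 == 0:
            raise ValueError("even kernel sizes need asymmetric padding")
    return tuple((k - 1) // 2 for k in kernel_size)

print(same_padding(3))       # (1, 1)
print(same_padding((5, 3)))  # (2, 1)
```

For a kernel size of 3 the total padding is $k - 1 = 2$, which corresponds to `padding=1` per side.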

In [1]:
import torch
from torch import nn

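def compute_conv2d(conv2d, X):
    '''
        Apply a convolution layer to a 2-D tensor. nn.Conv2d expects a 4-D
        input, so add batch and channel dimensions first and strip them afterwards.
    '''
    X = X.view((1, 1) + X.shape)  # prepend batch and channel dimensions of size 1
    Y = conv2d(X)
    return Y.view(Y.shape[2:])  # drop the leading batch and channel dimensions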

conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)

# Input and output shapes stay the same
X = torch.rand(8, 8)
compute_conv2d(conv2d, X).shape


torch.Size([8, 8])

In [3]:
# The kernel height and width differ; adjust the padding per dimension so the output size is unchanged
conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(5,3), padding=(2,1))
compute_conv2d(conv2d, X).shape

torch.Size([8, 8])

**Stride**

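Suppose the stride is $s_h$ along the height and $s_w$ along the width. The output size after the convolution is then

$$\left\lfloor \frac{n_h + p_h - k_h + s_h}{s_h} \right\rfloor \times \left\lfloor \frac{n_w + p_w - k_w + s_w}{s_w} \right\rfloor$$

**Note**: *the numerator here differs from the one above: the $+1$ becomes $+s_h$ (or $+s_w$), so setting the stride to 1 recovers the padding formula.*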
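The stride formula is easy to evaluate by hand, but a small helper makes it quick to try different settings. This is a minimal sketch for a single dimension. `conv_output_size` is a hypothetical name, and `p_total` is the total padding along that dimension (twice the per-side value passed to `nn.Conv2d`).

```python
def conv_output_size(n, k, p_total, s):
    """Output length along one dimension: floor((n + p_total - k + s) / s)."""
    return (n + p_total - k + s) // s

# With s = 1 the formula reduces to n + p_total - k + 1.
print(conv_output_size(8, 3, 2, 1))  # 8
# Stride 2 with the same kernel and padding roughly halves the size.
print(conv_output_size(8, 3, 2, 2))  # 4
```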

In [5]:
# Example: a stride of 2 halves the height and width
conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1, stride=2)
compute_conv2d(conv2d, X).shape

torch.Size([4, 4])

In [6]:
# A slightly more complicated example
conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(3,5), padding=(0,1), stride=(3,4))
compute_conv2d(conv2d, X).shape

torch.Size([2, 2])
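
The last result can be checked against the stride formula. Since `padding=(0,1)` is applied on each side, the totals are $p_h = 0$ and $p_w = 2$. Substituting $n_h = n_w = 8$, $k_h = 3$, $k_w = 5$, $s_h = 3$, $s_w = 4$ gives

$$\left\lfloor \frac{8 + 0 - 3 + 3}{3} \right\rfloor = \left\lfloor \frac{8}{3} \right\rfloor = 2, \qquad \left\lfloor \frac{8 + 2 - 5 + 4}{4} \right\rfloor = \left\lfloor \frac{9}{4} \right\rfloor = 2$$

which agrees with the printed shape.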