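# Transposed Convolution
* A convolution does not increase the input height and width; it usually keeps them unchanged or halves them
* A transposed convolution can be used to increase the height and width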

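Semantic segmentation needs pixel-level output, but convolutions shrink the spatial resolution, so a transposed convolution is used to enlarge the height and width again<br>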
<img src='./image/transposedConv.jpg' alt='transposedConv' width=400><br>

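Why is it called "transposed"?<br>
Rewriting a convolution as a matrix multiplication is a common optimization
* For the convolution $Y = X \star W$
  * From $W$ we can construct a matrix $V$ such that the convolution is equivalent to the matrix multiplication $Y' = VX'$, where $Y'$, $V$, and $X'$ have shapes $n$, $n \times m$, and $m$
  * Here $Y'$ and $X'$ are the flattened (vector) versions of $Y$ and $X$
* The transposed convolution is then equivalent to $Y' = V^TX'$, where the shapes become $m$, $m \times n$, and $n$
* If a convolution maps an input of shape (h, w) to (h', w')
  * a transposed convolution with the same hyperparameters maps (h', w') back to (h, w)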
  
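Convolution downsamples; transposed convolution upsamples<br>
Rearranging the input and the kernel turns a transposed convolution into an ordinary convolution<br>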
<img src='./image/transConv1.jpg' alt='transConv1' width=400><br>
<img src='./image/transConv2.jpg' alt='transConv2' width=400><br>
<img src='./image/transConv3.jpg' alt='transConv3' width=400><br>
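
One common way to read the rearrangement is sketched below, assuming stride 1 and no padding: zero-pad the input by k - 1 on every side, flip the kernel in both directions, then run an ordinary cross-correlation. The helper names `corr2d_lists` and `trans_conv_via_corr` are hypothetical, and plain nested lists are used so the sketch runs without PyTorch.

```python
def corr2d_lists(X, K):
    """Plain cross-correlation (stride 1, no padding) on nested lists."""
    k_h, k_w = len(K), len(K[0])
    out_h, out_w = len(X) - k_h + 1, len(X[0]) - k_w + 1
    return [[sum(X[i + a][j + b] * K[a][b]
                 for a in range(k_h) for b in range(k_w))
             for j in range(out_w)]
            for i in range(out_h)]


def trans_conv_via_corr(X, K):
    """Transposed convolution (stride 1, padding 0) via pad + flip + correlate."""
    k_h, k_w = len(K), len(K[0])
    padded_w = len(X[0]) + 2 * (k_w - 1)
    # 1. Zero-pad the input with k - 1 rows and columns on every side.
    X_pad = ([[0.0] * padded_w for _ in range(k_h - 1)]
             + [[0.0] * (k_w - 1) + list(row) + [0.0] * (k_w - 1) for row in X]
             + [[0.0] * padded_w for _ in range(k_h - 1)])
    # 2. Flip the kernel up-down and left-right.
    K_flip = [row[::-1] for row in K[::-1]]
    # 3. Run an ordinary cross-correlation on the rearranged operands.
    return corr2d_lists(X_pad, K_flip)


X_list = [[0.0, 1.0], [2.0, 3.0]]
K_list = [[0.0, 1.0], [2.0, 3.0]]
print(trans_conv_via_corr(X_list, K_list))
# [[0.0, 0.0, 1.0], [0.0, 4.0, 6.0], [4.0, 12.0, 9.0]]
```

The output has size n + k - 1 per dimension, which is the stride-1, zero-padding case of the transposed convolution shape.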

Implementing the basic transposed convolution operation

In [None]:
import torch
from torch import nn
from d2l import torch as d2l

def trans_conv(X, K):  # assumes padding = 0, stride = 1
    h, w = K.shape
    Y = torch.zeros((X.shape[0] + h - 1, X.shape[1] + w - 1))
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            # Each input element scales the whole kernel and adds it into an h x w window
            Y[i:i + h, j:j + w] += X[i, j] * K
    return Y



In [4]:
X = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
K = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
trans_conv(X, K)

tensor([[ 0.,  0.,  1.],
        [ 0.,  4.,  6.],
        [ 4., 12.,  9.]])

The same operation with the PyTorch API


In [5]:
X, K = X.reshape(1, 1, 2, 2), K.reshape(1, 1, 2, 2)
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, bias=False)
tconv.weight.data = K
tconv(X)

tensor([[[[ 0.,  0.,  1.],
          [ 0.,  4.,  6.],
          [ 4., 12.,  9.]]]], grad_fn=<ConvolutionBackward0>)

Padding, stride, and multiple channels

In [8]:
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, padding=1, bias=False)
tconv.weight.data = K  # padding makes the output smaller
tconv(X)  # padding applies to the output: padding=1 removes one row and column from each side

tensor([[[[4.]]]], grad_fn=<ConvolutionBackward0>)

In [None]:
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, bias=False)
tconv.weight.data = K
tconv(X)  # stride also applies to the output: the scaled kernel copies are placed 2 apart

tensor([[[[0., 0., 0., 1.],
          [0., 0., 2., 3.],
          [0., 2., 0., 3.],
          [4., 6., 6., 9.]]]], grad_fn=<ConvolutionBackward0>)
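
The two cells above can be reproduced by extending the earlier loop. This is a minimal sketch rather than PyTorch's actual implementation: it assumes a single channel, uses plain nested lists so it runs without PyTorch, and `trans_conv_general` is a hypothetical name. Stride spaces out where each scaled kernel copy lands, and padding crops the border of the result.

```python
def trans_conv_general(X, K, stride=1, padding=0):
    """Single-channel transposed convolution on nested lists (hypothetical helper)."""
    n_h, n_w = len(X), len(X[0])
    k_h, k_w = len(K), len(K[0])
    # Size of the full output before padding is applied: (n - 1) * stride + k
    out_h = (n_h - 1) * stride + k_h
    out_w = (n_w - 1) * stride + k_w
    Y = [[0.0] * out_w for _ in range(out_h)]
    for i in range(n_h):
        for j in range(n_w):
            # Each input element adds a scaled copy of the kernel;
            # neighbouring copies start `stride` positions apart.
            for a in range(k_h):
                for b in range(k_w):
                    Y[i * stride + a][j * stride + b] += X[i][j] * K[a][b]
    # Padding acts on the output: crop `padding` rows and columns from every side.
    return [row[padding:out_w - padding] for row in Y[padding:out_h - padding]]


X_list = [[0.0, 1.0], [2.0, 3.0]]
K_list = [[0.0, 1.0], [2.0, 3.0]]
print(trans_conv_general(X_list, K_list))             # same values as trans_conv(X, K)
print(trans_conv_general(X_list, K_list, padding=1))  # [[4.0]]
print(trans_conv_general(X_list, K_list, stride=2))   # same values as the 4 x 4 output above
```

With `stride=2` and a 2 x 2 kernel the copies do not overlap, which is why the 4 x 4 result is just four scaled kernels tiled side by side.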

In [11]:
X = torch.rand(size=(1, 10, 16, 16))
conv = nn.Conv2d(10, 20, kernel_size=5, padding=2, stride=3)
tconv = nn.ConvTranspose2d(20, 10, kernel_size=5, padding=2, stride=3)
tconv(conv(X)).shape == X.shape  # the shape is recovered; the values are not, so this is not a true inverse

True

Connection to matrix multiplication

In [13]:
X = torch.arange(9.0).reshape(3, 3)
K = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
Y = d2l.corr2d(X, K)
Y, X, K

(tensor([[27., 37.],
         [57., 67.]]),
 tensor([[0., 1., 2.],
         [3., 4., 5.],
         [6., 7., 8.]]),
 tensor([[1., 2.],
         [3., 4.]]))

In [None]:
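def kernel2matrix(K):  # build V from K (hard-coded for a 3x3 input and a 2x2 kernel)
    k, W = torch.zeros(5), torch.zeros((4, 9))
    # In the flattened 3-wide input, the two kernel rows are separated by one zero
    k[:2], k[3:5] = K[0, :], K[1, :]
    print(k)
    # Shift k to the flat position of each of the four output elements
    W[0, :5], W[1, 1:6], W[2, 3:8], W[3, 4:] = k, k, k, k
    return W

W = kernel2matrix(K)  # X: 3x3 flattens to 9; Y: 2x2 flattens to 4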
W

tensor([1., 2., 0., 3., 4.])


tensor([[1., 2., 0., 3., 4., 0., 0., 0., 0.],
        [0., 1., 2., 0., 3., 4., 0., 0., 0.],
        [0., 0., 0., 1., 2., 0., 3., 4., 0.],
        [0., 0., 0., 0., 1., 2., 0., 3., 4.]])

In [17]:
Y == torch.matmul(W, X.reshape(-1)).reshape(2, 2)
# check that the matrix product matches the convolution result

tensor([[True, True],
        [True, True]])

In [18]:
Z = trans_conv(Y, K)
Z == torch.matmul(W.T, Y.reshape(-1)).reshape(3, 3)

tensor([[True, True, True],
        [True, True, True],
        [True, True, True]])
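
`kernel2matrix` is hard-coded for a 3 x 3 input and a 2 x 2 kernel. The sketch below builds the same matrix for any input and kernel size, assuming stride 1 and no padding. The names `conv_matrix`, `matvec`, and `transpose` are hypothetical helpers, written with plain lists so the indexing is explicit and the code runs without PyTorch.

```python
def conv_matrix(K, in_h, in_w):
    """Build V so that V times the flattened input equals the flattened
    stride-1, unpadded cross-correlation of the input with K."""
    k_h, k_w = len(K), len(K[0])
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    V = [[0.0] * (in_h * in_w) for _ in range(out_h * out_w)]
    for i in range(out_h):
        for j in range(out_w):
            row = i * out_w + j                     # flat index of output (i, j)
            for a in range(k_h):
                for b in range(k_w):
                    col = (i + a) * in_w + (j + b)  # flat index of input (i+a, j+b)
                    V[row][col] = K[a][b]
    return V


def matvec(M, v):
    """Matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


K_list = [[1.0, 2.0], [3.0, 4.0]]
V = conv_matrix(K_list, 3, 3)                # 4 x 9, same entries as W above
x_flat = [float(i) for i in range(9)]        # flattened 3 x 3 input 0..8
y_flat = matvec(V, x_flat)                   # convolution: 9 -> 4
z_flat = matvec(transpose(V), y_flat)        # transposed convolution: 4 -> 9
print(y_flat)  # [27.0, 37.0, 57.0, 67.0]
print(z_flat)
```

Multiplying by the transpose maps the 4-vector back to a 9-vector, which restores the shape of the input but not its values.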

In [40]:
# Padding
X = torch.arange(4, dtype=torch.float32).reshape(1, 1, 2, 2)
W = torch.ones(4).reshape(1, 1, 2, 2)
X, W

(tensor([[[[0., 1.],
           [2., 3.]]]]),
 tensor([[[[1., 1.],
           [1., 1.]]]]))

In [None]:
a = nn.ConvTranspose2d(1, 1, kernel_size=2, padding=0, bias=False)
b = nn.ConvTranspose2d(1, 1, kernel_size=2, padding=1, bias=False)
a.weight.data = W
b.weight.data = W
Y1 = a(X)
Y2 = b(X)
Y1, Y2  # padding simply crops the outermost rows and columns of the output

(tensor([[[[0., 1., 1.],
           [2., 6., 4.],
           [2., 5., 3.]]]], grad_fn=<ConvolutionBackward0>),
 tensor([[[[6.]]]], grad_fn=<ConvolutionBackward0>))

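Shape arithmetic<br>
For input height/width $n$, kernel size $k$, padding $p$, and stride $s$:<br>
* Transposed convolution: $n' = sn + k - 2p - s$
* Convolution: $n' = \lfloor (n - k + 2p + s)/s \rfloor$
* To scale the height and width by exactly a factor of $s$, choose $k = 2p + s$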
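
The shape rules can be checked with two small helpers. This is a sketch with hypothetical function names, using the standard convolution output size floor((n - k + 2p)/s) + 1 and the transposed convolution size (n - 1)s + k - 2p, which is the same expression as sn + k - 2p - s.

```python
def conv_out(n, k, p=0, s=1):
    """Output size of a convolution: floor((n - k + 2p) / s) + 1."""
    return (n - k + 2 * p) // s + 1


def trans_conv_out(n, k, p=0, s=1):
    """Output size of a transposed convolution: (n - 1) * s + k - 2p."""
    return (n - 1) * s + k - 2 * p


# kernel_size=5, padding=2, stride=3 on a 16 x 16 input, as in the cell above
print(conv_out(16, k=5, p=2, s=3))                                # 6
print(trans_conv_out(conv_out(16, k=5, p=2, s=3), k=5, p=2, s=3))  # 16

# k = 2p + s multiplies the size by s
print(trans_conv_out(8, k=4, p=1, s=2))                           # 16

# When s does not divide n - k + 2p, the floor discards the remainder
print(trans_conv_out(conv_out(17, k=5, p=2, s=3), k=5, p=2, s=3))  # 16, not 17
```

The last line shows that the round trip only restores the shape when the stride divides n - k + 2p evenly. `nn.ConvTranspose2d` has an `output_padding` argument intended for the remaining cases.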