# 6 CNN

## 6.2 Convolutions for Images


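### 6.2.1 The Cross-Correlation Operation
In a convolutional layer, an input tensor and a kernel tensor are combined through the cross-correlation operation to produce an output tensor.
Because the kernel needs room to "slide" across the image, the output is smaller than the input: an input of size $n_h \times n_w$ with a $k_h \times k_w$ kernel yields an output of size $(n_h - k_h + 1) \times (n_w - k_w + 1)$. Padding the image border with zeros gives the kernel enough room to move, so the output can keep the same size as the input.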
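The effect of padding on the output size can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `corr2d_output_shape` and its `pad_h` / `pad_w` parameters (zeros added on each side) are assumptions made for this example and are not part of the original notes. It also assumes a stride of 1.

```python
def corr2d_output_shape(n_h, n_w, k_h, k_w, pad_h=0, pad_w=0):
    """Output shape of a stride-1 cross-correlation.

    pad_h / pad_w are the number of zero rows / columns added on EACH side
    (hypothetical parameter names used only for this illustration).
    """
    return (n_h + 2 * pad_h - k_h + 1, n_w + 2 * pad_w - k_w + 1)


# No padding: a 3x3 input with a 2x2 kernel shrinks to 2x2
print(corr2d_output_shape(3, 3, 2, 2))
# One ring of zeros with a 3x3 kernel keeps an 8x8 input at 8x8
print(corr2d_output_shape(8, 8, 3, 3, pad_h=1, pad_w=1))
```

With an odd kernel size `k`, padding `(k - 1) // 2` zeros on each side is what keeps the output the same size as the input.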

In [2]:
import torch
from torch import nn
from d2l import torch as d2l

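def corr2d(X, K):
    """Compute the 2D cross-correlation of input X with kernel K."""
    h, w = K.shape
    # Without padding, the output shrinks by (kernel size - 1) in each dimension
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            # Multiply the h x w window at (i, j) elementwise by K and sum
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y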

In [3]:
X = torch.tensor([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]])
K = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
corr2d(X, K)

tensor([[19., 25.],
        [37., 43.]])
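
As a check on the result above, each output element is the sum of the elementwise products between the kernel and the corresponding $2 \times 2$ window of the input:

$$
\begin{aligned}
0\times0 + 1\times1 + 3\times2 + 4\times3 &= 19,\\
1\times0 + 2\times1 + 4\times2 + 5\times3 &= 25,\\
3\times0 + 4\times1 + 6\times2 + 7\times3 &= 37,\\
4\times0 + 5\times1 + 7\times2 + 8\times3 &= 43.
\end{aligned}
$$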

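### 6.2.2 Convolutional Layers
The two-dimensional convolutional layer below builds on the `corr2d` function defined above. The `__init__` constructor declares `weight` and `bias` as the two model parameters, and the forward pass calls `corr2d` and adds the bias.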

In [6]:
class Conv2D(nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.weight = nn.Parameter(torch.rand(kernel_size))
        self.bias = nn.Parameter(torch.zeros(1))
        
    def forward(self, x):
        return corr2d(x, self.weight) + self.bias
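
One way to sanity-check a hand-written layer like this is to compare it with PyTorch's built-in `nn.Conv2d`, which also computes a cross-correlation. The sketch below repeats the definitions so that it runs on its own; the variable names `layer` and `builtin`, the fixed seed, and the 3x3 test input are choices made for this illustration only.

```python
import torch
from torch import nn


def corr2d(X, K):
    """2D cross-correlation, repeated here so the example is self-contained."""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y


class Conv2D(nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.weight = nn.Parameter(torch.rand(kernel_size))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias


torch.manual_seed(0)  # arbitrary seed, only for reproducibility
layer = Conv2D(kernel_size=(2, 2))
builtin = nn.Conv2d(1, 1, kernel_size=(2, 2))

# Copy the hand-written layer's parameters into the built-in layer
with torch.no_grad():
    builtin.weight.copy_(layer.weight.reshape(1, 1, 2, 2))
    builtin.bias.copy_(layer.bias)

X = torch.arange(9.0).reshape(3, 3)
Y_custom = layer(X).detach()
# nn.Conv2d expects (batch size, channels, height, width)
Y_builtin = builtin(X.reshape(1, 1, 3, 3)).detach().reshape(2, 2)
print(torch.allclose(Y_custom, Y_builtin, atol=1e-5))
```

The hand-written version takes a plain 2D tensor, whereas the built-in layer needs the extra batch and channel dimensions, hence the reshapes.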

### 6.2.3 Object Edge Detection in Images


In [8]:
X = torch.ones((6, 8))
X[:, 2: 6] = 0
X

tensor([[1., 1., 0., 0., 0., 0., 1., 1.],
        [1., 1., 0., 0., 0., 0., 1., 1.],
        [1., 1., 0., 0., 0., 0., 1., 1.],
        [1., 1., 0., 0., 0., 0., 1., 1.],
        [1., 1., 0., 0., 0., 0., 1., 1.],
        [1., 1., 0., 0., 0., 0., 1., 1.]])

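Construct a kernel of height 1 and width 2 that detects vertical edges: the output is 0 where horizontally adjacent elements are equal and nonzero where they differ.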

In [11]:
K = torch.tensor([[1.0, -1.0]])
Y = corr2d(X, K)
Y

tensor([[ 0.,  1.,  0.,  0.,  0., -1.,  0.],
        [ 0.,  1.,  0.,  0.,  0., -1.,  0.],
        [ 0.,  1.,  0.,  0.,  0., -1.,  0.],
        [ 0.,  1.,  0.,  0.,  0., -1.,  0.],
        [ 0.,  1.,  0.,  0.,  0., -1.,  0.],
        [ 0.,  1.,  0.,  0.,  0., -1.,  0.]])

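This kernel detects only vertical edges. Applied to the transposed image, whose edges are horizontal, it produces an all-zero output.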

In [12]:
corr2d(X.t(), K)

tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
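
A kernel that compares vertically adjacent elements should respond to the horizontal edges instead. The sketch below assumes the simplest such kernel, the transpose of `K`; the name `Y_h` is introduced only for this illustration, and `corr2d` is repeated so the block runs on its own.

```python
import torch


def corr2d(X, K):
    """2D cross-correlation, repeated here so the example is self-contained."""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y


X = torch.ones((6, 8))
X[:, 2:6] = 0
K = torch.tensor([[1.0, -1.0]])

# K.t() has shape (2, 1): each output is the element above minus the element
# below, so it is nonzero only where the value changes between rows
Y_h = corr2d(X.t(), K.t())
print(Y_h)
```

The result has shape (7, 6), with a row of 1s where the transposed image goes from 1 to 0 and a row of -1s where it goes back from 0 to 1.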

### 6.2.4 Learning a Kernel


In [28]:
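# A 2D convolutional layer with 1 input channel, 1 output channel, and a
# kernel of shape (1, 2), with the bias disabled
conv2d = nn.Conv2d(1, 1, kernel_size=(1, 2), bias=False)

# The layer uses four-dimensional input and output in the format
# (batch size, channels, height, width); batch size and channels are both 1 here
X = X.reshape((1, 1, 6, 8))
Y = Y.reshape((1, 1, 6, 7))

for i in range(100):
    Y_hat = conv2d(X)
    l = (Y_hat - Y) ** 2  # elementwise squared error
    conv2d.zero_grad()
    l.sum().backward()
    # Gradient descent step on the kernel with learning rate 0.003
    conv2d.weight.data[:] -= 0.003 * conv2d.weight.grad
    print(f'batch {i+1}, loss {l.sum():.3f}')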

batch 1, loss 18.863
batch 2, loss 17.485
batch 3, loss 16.219
batch 4, loss 15.052
batch 5, loss 13.974
batch 6, loss 12.977
batch 7, loss 12.053
batch 8, loss 11.197
batch 9, loss 10.402
batch 10, loss 9.665
batch 11, loss 8.980
batch 12, loss 8.345
batch 13, loss 7.754
batch 14, loss 7.205
batch 15, loss 6.696
batch 16, loss 6.222
batch 17, loss 5.782
batch 18, loss 5.373
batch 19, loss 4.993
batch 20, loss 4.640
batch 21, loss 4.312
batch 22, loss 4.007
batch 23, loss 3.724
batch 24, loss 3.461
batch 25, loss 3.216
batch 26, loss 2.989
batch 27, loss 2.777
batch 28, loss 2.581
batch 29, loss 2.398
batch 30, loss 2.229
batch 31, loss 2.071
batch 32, loss 1.925
batch 33, loss 1.789
batch 34, loss 1.662
batch 35, loss 1.545
batch 36, loss 1.435
batch 37, loss 1.334
batch 38, loss 1.240
batch 39, loss 1.152
batch 40, loss 1.071
batch 41, loss 0.995
batch 42, loss 0.925
batch 43, loss 0.859
batch 44, loss 0.798
batch 45, loss 0.742
batch 46, loss 0.689
batch 47, loss 0.641
batch 48, los

In [29]:
conv2d.weight.data.reshape((1, 2))

tensor([[ 0.9681, -0.9681]])

In [30]:
conv2d(X)

tensor([[[[ 0.0000,  0.9681,  0.0000,  0.0000,  0.0000, -0.9681,  0.0000],
          [ 0.0000,  0.9681,  0.0000,  0.0000,  0.0000, -0.9681,  0.0000],
          [ 0.0000,  0.9681,  0.0000,  0.0000,  0.0000, -0.9681,  0.0000],
          [ 0.0000,  0.9681,  0.0000,  0.0000,  0.0000, -0.9681,  0.0000],
          [ 0.0000,  0.9681,  0.0000,  0.0000,  0.0000, -0.9681,  0.0000],
          [ 0.0000,  0.9681,  0.0000,  0.0000,  0.0000, -0.9681,  0.0000]]]],
       grad_fn=<ThnnConv2DBackward>)
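
The manual update `conv2d.weight.data[:] -= 0.003 * conv2d.weight.grad` can also be written with a standard optimizer. The following is a minimal sketch using `torch.optim.SGD` with the same learning rate; the seed, the 300 iterations, and the way the target `Y` is built by slicing are choices made for this illustration rather than part of the original notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

X = torch.ones((6, 8))
X[:, 2:6] = 0
X = X.reshape((1, 1, 6, 8))
# Target: the response of the fixed kernel [1, -1], i.e. each element minus
# its right-hand neighbor, which has shape (1, 1, 6, 7)
Y = X[..., :-1] - X[..., 1:]

conv2d = nn.Conv2d(1, 1, kernel_size=(1, 2), bias=False)
optimizer = torch.optim.SGD(conv2d.parameters(), lr=0.003)

for epoch in range(300):  # 300 steps is an arbitrary choice for this sketch
    loss = ((conv2d(X) - Y) ** 2).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

learned = conv2d.weight.detach().reshape(1, 2)
print(learned)
```

Running more iterations than the 100 used above lets the learned weights move closer to the hand-designed kernel `[1, -1]`.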