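## 1. 2D Convolutional Layers

### 1.1 2D Cross-Correlation

Convolutional layers are named after the convolution operation, but in practice they usually compute the more intuitive ``cross-correlation`` operation.  
Some terminology:  
2D input array  
2D kernel (also called a convolution kernel or filter)  
Kernel window (also called the convolution window)  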


In [13]:
# Input array X, kernel array K, output array Y.
import torch
from torch import nn
def corr2d(X, K):
    h, w = K.shape
    Y = torch.zeros((X.shape[0]-h+1, X.shape[1]-w+1))

    # Slide the kernel window over X; each output element is the sum of the element-wise product
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i+h, j:j+w]*K).sum()
    return Y

In [14]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
X = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]], device = device)
K = torch.tensor([[0, 1], [2, 3]]).to(device)
Y = corr2d(X, K).to(device)
print(Y)
Y.device

tensor([[19., 25.],
        [37., 43.]], device='cuda:0')


device(type='cuda', index=0)
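
As a sanity check, the loop-based `corr2d` can be compared with PyTorch's built-in `torch.nn.functional.conv2d`, which, despite its name, also computes a cross-correlation. The sketch below runs on the CPU with float tensors and repeats `corr2d` so that it is self-contained; the names `Y_loop` and `Y_builtin` are only illustrative.

```python
import torch
import torch.nn.functional as F

def corr2d(X, K):
    """2D cross-correlation of X with kernel K (no padding, stride 1)."""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

# Same values as the example above, stored as floats.
X = torch.arange(9, dtype=torch.float32).reshape(3, 3)
K = torch.arange(4, dtype=torch.float32).reshape(2, 2)

Y_loop = corr2d(X, K)

# F.conv2d expects input of shape (batch, channels, height, width) and
# weights of shape (out_channels, in_channels, kH, kW), so add two leading dims.
Y_builtin = F.conv2d(X[None, None], K[None, None])[0, 0]

print(Y_loop)
print(torch.allclose(Y_loop, Y_builtin))
```

Both results should agree with the `[[19., 25.], [37., 43.]]` output shown above.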

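### 1.2 2D Convolutional Layer

A 2D convolutional layer cross-correlates its input with a kernel and adds a ``scalar bias`` to produce the output.  
The model parameters are the kernel and the scalar bias.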

In [15]:
class Conv2D(nn.Module):
    def __init__(self, kernel_size):
        super(Conv2D, self).__init__()
        self.weight = nn.Parameter(torch.randn(kernel_size))
        self.bias = nn.Parameter(torch.randn(1))
    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

In [21]:
X = torch.ones(6, 8)
X[:, 2:6] = 0
# print(X)
K = torch.tensor([[1, -1]])
Y = corr2d(X, K)
# Instantiate a 2D convolutional layer whose kernel has shape (1, 2)
conv2d = Conv2D(kernel_size=(1, 2))

step = 20
lr = 0.01
for i in range(step):
    Y_hat = conv2d(X)
    l = ((Y_hat - Y) ** 2).sum()
    l.backward()

    # Gradient descent step
    conv2d.weight.data -= lr * conv2d.weight.grad
    conv2d.bias.data -= lr * conv2d.bias.grad

    # Reset the gradients to zero
    conv2d.weight.grad.fill_(0)
    conv2d.bias.grad.fill_(0)

    if (i + 1) % 5 == 0:
        print('Step %d, loss %.3f' % (i+1, l.item()))


Step 5, loss 0.395
Step 10, loss 0.068
Step 15, loss 0.014
Step 20, loss 0.003


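Inspect the kernel parameters learned after 20 iterations. They should be close to the kernel `K = [[1, -1]]` that generated `Y`, with a bias near 0.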

In [22]:
print("weight:", conv2d.weight.data)
print("bias:", conv2d.bias.data)

weight: tensor([[ 0.9884, -0.9831]])
bias: tensor([-0.0029])


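### 1.3 Cross-Correlation and Convolution


Convolution is in fact very similar to cross-correlation.  
``To obtain the output of a convolution, flip the kernel array left-to-right and top-to-bottom, then cross-correlate it with the input array.``  

Why can a convolutional layer use cross-correlation in place of convolution?  
The kernel array is learned, so whether the layer uses cross-correlation or convolution does not change the model's output at prediction time. The only difference is the form of the learned parameters: one kernel is the flipped version of the other.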
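
The flip rule can be checked numerically. The sketch below compares a convolution written directly from its definition with a cross-correlation that uses the flipped kernel. `conv2d_true` is a hypothetical helper introduced only for this check, and `corr2d` is repeated from section 1.1 so the block is self-contained.

```python
import torch

def corr2d(X, K):
    """2D cross-correlation of X with kernel K (no padding, stride 1)."""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

def conv2d_true(X, K):
    """2D convolution from its definition (hypothetical helper, no padding)."""
    h, w = K.shape
    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            for a in range(h):
                for b in range(w):
                    # The kernel is indexed backwards relative to the input.
                    Y[i, j] += X[i + a, j + b] * K[h - 1 - a, w - 1 - b]
    return Y

X = torch.arange(9, dtype=torch.float32).reshape(3, 3)
K = torch.arange(4, dtype=torch.float32).reshape(2, 2)

# Flip top-to-bottom (dim 0) and left-to-right (dim 1).
K_flipped = torch.flip(K, dims=[0, 1])

Y_conv = conv2d_true(X, K)
Y_corr_flipped = corr2d(X, K_flipped)
print(Y_conv)
print(torch.allclose(Y_conv, Y_corr_flipped))
```

Since training is free to learn either `K` or its flipped version, both formulations can represent the same layer.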

## 2. Padding and Stride

In [34]:
import torch
from torch import nn

def comp_conv2d(conv2d, X):
    print(X.shape)
    X = X.view((1, 1) +  X.shape)
    print(X.shape)
    Y = conv2d(X)
    return Y.view(Y.shape[2:])

conv2d = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
X = torch.rand(8, 8)
comp_conv2d(conv2d, X).shape

torch.Size([8, 8])
torch.Size([1, 1, 8, 8])


torch.Size([8, 8])
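
The cell above only varies the padding. A minimal sketch of how stride enters the picture follows, assuming the standard output-size rule `floor((n + 2p - k) / s) + 1` along each axis. `conv_out_size` is a hypothetical helper written for this note, not a PyTorch function.

```python
import torch
from torch import nn

def conv_out_size(n, k, p=0, s=1):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

X = torch.rand(1, 1, 8, 8)  # (batch, channels, height, width)

# Same kernel size and padding as above, but with a stride of 2.
conv_strided = nn.Conv2d(1, 1, kernel_size=3, padding=1, stride=2)
shape_strided = tuple(conv_strided(X).shape[2:])
print(shape_strided, conv_out_size(8, 3, p=1, s=2))

# Kernel size, padding, and stride may differ along height and width.
conv_mixed = nn.Conv2d(1, 1, kernel_size=(3, 5), padding=(0, 1), stride=(3, 4))
shape_mixed = tuple(conv_mixed(X).shape[2:])
print(shape_mixed, (conv_out_size(8, 3, 0, 3), conv_out_size(8, 5, 1, 4)))
```

Here `padding=p` adds `p` rows or columns on each side, which is why `2p` appears in the formula.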

### Adding dimensions to a Tensor  
X.shape => \[8, 8\]  
X.view((1, 1) + X.shape) => \[1, 1, 8, 8\]  
X.view((1,) + X.shape + (1,)) => \[1, 8, 8, 1\]
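
For comparison, here is a short sketch of two other common ways to get the same shapes: `unsqueeze` and indexing with `None`. The variable names are illustrative only.

```python
import torch

X = torch.rand(8, 8)

# view with an explicit target shape, as in the notes above
A = X.view((1, 1) + X.shape)

# unsqueeze inserts one size-1 dimension at the given position
B = X.unsqueeze(0).unsqueeze(0)

# indexing with None inserts size-1 dimensions as well
C = X[None, None]

# leading and trailing size-1 dimensions: [1, 8, 8, 1]
D = X.unsqueeze(0).unsqueeze(-1)

print(A.shape, B.shape, C.shape, D.shape)
print(torch.equal(A, B) and torch.equal(B, C))
```

None of these calls copies the data; they all return views that share storage with `X`.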

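## 3. Multiple Input and Output Channels

### 1x1 Convolutional Layer

Adjusting the number of channels

The computation of a 1x1 convolution happens mainly along the channel dimension.  
Each output element is a weighted sum, across channels, of the input elements at the same height and width position. If we treat the channel dimension as the feature dimension and the elements along height and width as data samples, ``a 1x1 convolutional layer is equivalent to a fully connected layer``.

* A 1x1 convolutional layer can be used as a fully connected layer that leaves the height and width unchanged. This lets us control model complexity by ``adjusting the number of channels between network layers``.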
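
The equivalence can be verified with a small sketch: apply `nn.Conv2d` with `kernel_size=1`, then reproduce the result with a plain matrix multiplication over the channel dimension. The shapes (3 input channels, 2 output channels, a 4x4 image) are arbitrary choices for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
X = torch.rand(1, 3, 4, 4)                 # (batch, in_channels, height, width)
conv1x1 = nn.Conv2d(3, 2, kernel_size=1)   # 3 input channels -> 2 output channels

Y_conv = conv1x1(X)                        # shape (1, 2, 4, 4)

# Treat each of the 4*4 pixel positions as a sample with 3 features.
W = conv1x1.weight.view(2, 3)              # (out_channels, in_channels)
b = conv1x1.bias                           # (out_channels,)
X_flat = X.permute(0, 2, 3, 1).reshape(-1, 3)          # (16, 3)
Y_fc = X_flat @ W.t() + b                              # (16, 2)
Y_fc = Y_fc.reshape(1, 4, 4, 2).permute(0, 3, 1, 2)    # back to (1, 2, 4, 4)

print(Y_conv.shape)
print(torch.allclose(Y_conv, Y_fc, atol=1e-6))
```

The height and width stay at 4x4 while the channel count changes from 3 to 2, which is the "adjusting the number of channels" use described above.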

## 4. Pooling Layers

Pooling was introduced to ``reduce the convolutional layer's excessive sensitivity to position``.
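
A small sketch of this effect, using two made-up feature maps with a single activation each: the activations sit one pixel apart but inside the same 2x2 pooling window, so max pooling maps both to the same output. A shift that crosses a window boundary would still change the pooled result, so the invariance is only local.

```python
import torch
from torch import nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)

# Two 4x4 feature maps with one activation each, shifted by one pixel
# diagonally but still inside the top-left 2x2 pooling window.
A = torch.zeros(1, 1, 4, 4)
A[0, 0, 0, 0] = 1.0
B = torch.zeros(1, 1, 4, 4)
B[0, 0, 1, 1] = 1.0

same_before = torch.equal(A, B)             # raw feature maps differ
same_after = torch.equal(pool(A), pool(B))  # pooled maps are identical

print(same_before, same_after)
print(pool(A)[0, 0])
```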

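## 5. Convolutional Neural Networks

LeNet consists of two parts: a ``convolutional block`` and a ``fully connected block``.

* Convolutional block  
The basic unit is ``a convolutional layer followed by a max-pooling layer``: the convolutional layer recognizes spatial patterns in the image, such as lines and parts of objects, and the max-pooling layer that follows reduces the convolutional layer's sensitivity to position.  
LeNet's convolutional block stacks two of these basic units.

LeNet's first convolutional layer has 6 output channels, and the second increases this to 16. The input to the second convolutional layer has a smaller height and width than the input to the first, so ``increasing the number of output channels keeps the parameter sizes of the two convolutional layers similar``.

### LeNet Model
The LeNet model is implemented below using the Sequential class.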

In [1]:
import time
import torch
from torch import nn, optim
import torch.utils.data as Data
import torchvision
import torchvision.transforms as transforms

import sys
sys.path.append("..")
# import d2lzh_pytorch as d2l
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5),
            nn.Sigmoid(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.Sigmoid(),
            nn.MaxPool2d(2, 2)
        )
        self.fc = nn.Sequential(
            nn.Linear(16*4*4, 120),
            nn.Sigmoid(),
            nn.Linear(120, 84),
            nn.Sigmoid(),
            nn.Linear(84, 10)
        )

    def forward(self, img):
        feature = self.conv(img)
        # Flatten each sample's feature maps before the fully connected block
        output = self.fc(feature.view(img.shape[0], -1))
        return output

net = LeNet()
print(net)


LeNet(
  (conv): Sequential(
    (0): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
    (1): Sigmoid()
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
    (4): Sigmoid()
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc): Sequential(
    (0): Linear(in_features=256, out_features=120, bias=True)
    (1): Sigmoid()
    (2): Linear(in_features=120, out_features=84, bias=True)
    (3): Sigmoid()
    (4): Linear(in_features=84, out_features=10, bias=True)
  )
)
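
To see where the `16*4*4` input size of the first `Linear` layer comes from, one can push a dummy input through the convolutional block layer by layer and print the shapes. The sketch assumes 28x28 single-channel images, which matches the Fashion-MNIST data used in these notes. The block is rebuilt here with the same layers as `LeNet.conv` so the snippet runs on its own.

```python
import torch
from torch import nn

# Same layers as the conv block in the LeNet class above.
conv = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5), nn.Sigmoid(), nn.MaxPool2d(2, 2),
    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(), nn.MaxPool2d(2, 2),
)

X = torch.rand(1, 1, 28, 28)  # one dummy 28x28 grayscale image
shapes = []
for layer in conv:
    X = layer(X)
    shapes.append(tuple(X.shape))
    print(layer.__class__.__name__, tuple(X.shape))

# Flattening the final feature maps gives 16 * 4 * 4 = 256 features.
flat = X.view(X.shape[0], -1)
print(flat.shape)
```

Each 5x5 convolution without padding shrinks the height and width by 4 (28 to 24, then 12 to 8), and each 2x2 pooling with stride 2 halves them (24 to 12, then 8 to 4).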


#### Loading the data and training the model

In [7]:
mnist_train = torchvision.datasets.FashionMNIST(root='~/Datasets/FashionMNIST', train=True, download=True, transform=transforms.ToTensor())
mnist_test = torchvision.datasets.FashionMNIST(root='~/Datasets/FashionMNIST', train=False, download=True, transform=transforms.ToTensor())

In [6]:
batch_size = 256
num_workers = 4
train_iter = torch.utils.data.DataLoader(mnist_train, batch_size=batch_size, shuffle=True, num_workers=num_workers)

test_iter = torch.utils.data.DataLoader(mnist_test, batch_size=batch_size, shuffle=False, num_workers=num_workers)