# **Residual Networks (ResNet)**

<div align=center>
<img width="400" src="../image/5.11_residual-block.svg"/>
</div>
<div align=center>Figure 5.9 A regular network block (left) and a block with a residual connection (right)</div>

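We call $f(x) = x$ the identity mapping. When the mapping a block should learn is close to the identity, fitting the residual $f(x) - x$ is easier than fitting $f(x)$ directly, and the residual captures small fluctuations around the identity more readily. Building the identity mapping into the network through a residual block also lets the input propagate forward across layers more directly.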
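The relationship between the desired mapping and the residual can be written out explicitly. This is a short sketch, not a derivation from the original notes: the symbol $g$ for the residual branch is introduced here purely for illustration, and the final activation is ignored.

$$
f(x) = g(x) + x, \qquad g(x) = f(x) - x, \qquad f(x) = x \iff g(x) = 0
$$

Under the same assumptions, the Jacobian of the block is $\frac{\partial f}{\partial x} = \frac{\partial g}{\partial x} + I$, so the shortcut contributes an identity term even when the weights in $g$ are small.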

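ResNet uses $3 \times 3$ convolutions throughout. A residual block is structured as follows:
- Two $3 \times 3$ convolutional layers with the same number of output channels
- Each convolutional layer is followed by a batch normalization (BN) layer and a ReLU activation
- The input of the residual block is added to the output just before the final ReLU
- The two convolutional layers keep the output shape equal to the input shape so that the two can be added; if the number of channels differs, a $1 \times 1$ convolution is applied to the input to match it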
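The shape bookkeeping behind the list above can be checked with a few standalone layers. This is a minimal sketch rather than the block defined in this section: the tensor sizes, the variable names (`conv_a`, `shortcut`, and so on), and the choice of stride 2 in the second case are assumptions made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

x = torch.rand(4, 3, 6, 6)  # (batch, channels, height, width)

# Case 1: same channel count and stride 1, so the input can be added directly.
conv_a = nn.Conv2d(3, 3, kernel_size=3, padding=1)
bn_a = nn.BatchNorm2d(3)
conv_b = nn.Conv2d(3, 3, kernel_size=3, padding=1)
bn_b = nn.BatchNorm2d(3)

y = torch.relu(bn_a(conv_a(x)))   # conv -> BN -> ReLU
y = bn_b(conv_b(y))               # conv -> BN
out_same = torch.relu(y + x)      # add the input before the final ReLU

# Case 2: more channels and stride 2, so a 1x1 convolution reshapes the input.
conv_c = nn.Conv2d(3, 6, kernel_size=3, padding=1, stride=2)
bn_c = nn.BatchNorm2d(6)
conv_d = nn.Conv2d(6, 6, kernel_size=3, padding=1)
bn_d = nn.BatchNorm2d(6)
shortcut = nn.Conv2d(3, 6, kernel_size=1, stride=2)

y = torch.relu(bn_c(conv_c(x)))
y = bn_d(conv_d(y))
out_diff = torch.relu(y + shortcut(x))

print(out_same.shape)  # torch.Size([4, 3, 6, 6])
print(out_diff.shape)  # torch.Size([4, 6, 3, 3])
```

In the second case the addition `y + x` would fail without the `shortcut` layer, because `y` has 6 channels and a $3 \times 3$ spatial size while `x` has 3 channels and a $6 \times 6$ spatial size.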

In [2]:
import torch
from torch import nn, optim
import torch.nn.functional as F

In [3]:
import sys
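sys.path.append('../utils')  # forward slashes work on both Windows and POSIX
import d2lzh as d2l
# Fall back to the CPU when CUDA is unavailable
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')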

In [None]:
class Residual(nn.Module):
    def __init__(self, in_channels, out_channels, use_1x1conv=False, stride=1):
        super(Residual, self).__init__()
        # Two 3x3 convolutions; only the first one applies the stride
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, stride=stride)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
        if use_1x1conv:
            # 1x1 convolution that matches the input's channels and stride to the main path
            self.conv3 = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride)
        else:
            self.conv3 = None
        # One batch normalization layer per 3x3 convolution
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)