# 1. Overview

<br/> 
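Recall the machine learning workflow diagram. So far we have learned how to read data from disk and pass it through the preprocessing module, which applies a series of operations such as cropping and augmentation and produces tensors. The next step is to feed these tensors into a neural network model, which runs a series of computations to perform tasks such as **object detection and segmentation**.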
<br/> 
<br/> 
<img src="picture/机器学习训练步骤.png">

<br/> 
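This section introduces `nn.Module`, the base class of all network layers. It holds eight ordered dictionaries that manage a model's attributes, and in this lesson we learn how to build a Module.

We then look at model construction from two angles, the network structure and the computation graph, and see that building a model takes two steps: **step one, build the submodules**; **step two, connect the submodules**.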

----

# 2. Steps for creating a network model

<br/> 
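Creating a model consists of **building the layers** and **connecting the layers**. Building the layers means constructing the submodules, such as convolution, pooling, and activation layers. These submodules are then connected according to some topology to form a network, which is how the well-known architectures such as LeNet and ResNet are assembled.

Next comes weight initialization, for which established methods already exist: Xavier, Kaiming, uniform distribution, and others.

All of this is handled through PyTorch's `nn.Module`.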

<img src="picture/nn.Module框图.png" width=550>

---


## 2.1 LeNet walkthrough

<br/> 
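This classic convolutional network is made up of the following submodules: convolution layer 1, pooling layer 1, convolution layer 2, pooling layer 2, and fully connected layers 1, 2, and 3, for a total of 7 layers. It takes a $32 \times 32 \times 3$ input and outputs a vector of 10 class scores.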
<img src="picture/lenet01.png" width=750>
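
The shapes in this pipeline can be checked by hand with the standard output-size formula for a convolution or pooling window, $\lfloor (n + 2p - k)/s \rfloor + 1$. The following is a minimal sketch in plain Python. The helper name `out_size` is made up for illustration, and the sketch assumes the layer settings used in the `LeNet` code of this section (5x5 convolutions with stride 1 and no padding, 2x2 max pooling with stride 2).

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial size after a conv/pool window (hypothetical helper, not a PyTorch API)."""
    return (n + 2 * padding - kernel) // stride + 1


size = 32                                            # 32x32 input with 3 channels
after_conv1 = out_size(size, kernel=5)               # conv1: 3 -> 6 channels
after_pool1 = out_size(after_conv1, kernel=2, stride=2)
after_conv2 = out_size(after_pool1, kernel=5)        # conv2: 6 -> 16 channels
after_pool2 = out_size(after_conv2, kernel=2, stride=2)

# 16 channels of after_pool2 x after_pool2 feature maps are flattened for fc1
flattened = 16 * after_pool2 * after_pool2
print(after_conv1, after_pool1, after_conv2, after_pool2, flattened)
```

The final value is where the `16*5*5` input size of `fc1` comes from.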

```python
# Set a breakpoint here and step into
net = LeNet(classes=2)

# Execution jumps to here
class LeNet(nn.Module):
    def __init__(self, classes):
        super(LeNet, self).__init__()  # <-- the debugger stops here
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)
```


Typing `self` in the debugger at this point gives:
```python
self
*** AttributeError: 'LeNet' object has no attribute '_modules'
```
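
The error appears because `_modules` is only created inside `nn.Module.__init__()`, which has not run yet at this breakpoint, while printing the object needs that dictionary. The toy classes below imitate this in plain Python. `TinyModule` and `TinyNet` are made-up stand-ins, not PyTorch classes, and only the `_modules` dictionary is modeled.

```python
from collections import OrderedDict


class TinyModule:
    """Hypothetical stand-in for nn.Module; only `_modules` is modeled."""

    def __init__(self):
        self._modules = OrderedDict()

    def __repr__(self):
        # Printing the object needs the `_modules` dictionary
        return f"{type(self).__name__}({list(self._modules)})"


class TinyNet(TinyModule):
    def __init__(self):
        try:
            repr(self)                      # parent __init__ has not run yet
        except AttributeError as err:
            self.error_before_init = str(err)
        super().__init__()                  # creates `_modules`
        self.repr_after_init = repr(self)   # now printing works


net = TinyNet()
print(net.error_before_init)
print(net.repr_after_init)
```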
After stepping through the remaining lines one at a time, `self` shows:
```python
self
LeNet(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=2, bias=True)
)
--Return--
```
This completes the **construction** of the submodules.

---


```python
import torch
import torch.nn as nn
import torch.optim as optim

# LR, MAX_EPOCH, and train_loader are assumed to come from the earlier setup and data steps (not shown)
# ============================ step 2/5 model ============================

net = LeNet(classes=2)
net.initialize_weights()

# ============================ step 3/5 loss function ============================
criterion = nn.CrossEntropyLoss()                                                   # choose the loss function

# ============================ step 4/5 optimizer ============================
optimizer = optim.SGD(net.parameters(), lr=LR, momentum=0.9)                        # choose the optimizer
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)     # learning rate decay schedule

# ============================ step 5/5 training ============================
train_curve = list()
valid_curve = list()

for epoch in range(MAX_EPOCH):

    loss_mean = 0.
    correct = 0.
    total = 0.

    net.train()
    for i, data in enumerate(train_loader):

        # forward
        inputs, labels = data
        outputs = net(inputs)  # <-- set the breakpoint here and step into

        # backward
        optimizer.zero_grad()
        loss = criterion(outputs, labels)
        loss.backward()

        # update weights
        optimizer.step()

        # track classification results
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).squeeze().sum().numpy()
```


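At this point the `net` instance has been created and a batch of inputs has been loaded. Stepping into `net(inputs)` jumps to the `__call__()` function in `module.py`: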
```python
def __call__(self, *input, **kwargs):
    for hook in self._forward_pre_hooks.values():
        result = hook(self, input)
        if result is not None:
            if not isinstance(result, tuple):
                result = (result,)
            input = result
    if torch._C._get_tracing_state():
        result = self._slow_forward(*input, **kwargs)
    else:
        result = self.forward(*input, **kwargs)  # <-- step into here
    # (excerpt: the rest of the method is not shown)
```
Stepping into `result = self.forward(*input, **kwargs)` jumps to the `forward` function we defined ourselves:
```python
def forward(self, x):
    out = F.relu(self.conv1(x))
    out = F.max_pool2d(out, 2)
    out = F.relu(self.conv2(out))
    out = F.max_pool2d(out, 2)
    out = out.view(out.size(0), -1)
    out = F.relu(self.fc1(out))
    out = F.relu(self.fc2(out))
    out = self.fc3(out)
    return out
```
     
<br/> 
This completes the forward pass of the network, that is, the **connection** of the submodules.

<img src="picture/模块拼接.png" width=500>
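
The `__call__` to `forward` dispatch traced above can be imitated in a few lines of plain Python. This is a simplified sketch, not the PyTorch implementation: `TinyModule` and `Double` are made-up classes, and only the forward pre-hook loop and the call to `forward` are modeled.

```python
from collections import OrderedDict


class TinyModule:
    """Hypothetical stand-in that mimics the pre-hook loop of Module.__call__."""

    def __init__(self):
        self._forward_pre_hooks = OrderedDict()

    def __call__(self, *inputs):
        # Each pre-hook may replace the inputs before forward() sees them
        for hook in self._forward_pre_hooks.values():
            result = hook(self, inputs)
            if result is not None:
                if not isinstance(result, tuple):
                    result = (result,)
                inputs = result
        return self.forward(*inputs)

    def forward(self, *inputs):
        raise NotImplementedError("subclasses must implement forward()")


class Double(TinyModule):
    def forward(self, x):
        return 2 * x


layer = Double()
plain = layer(3)                       # no hooks: forward(3)

# Register a pre-hook directly in the dictionary (illustration only)
layer._forward_pre_hooks["add_one"] = lambda module, inputs: inputs[0] + 1
hooked = layer(3)                      # hook turns 3 into 4, then forward(4)
print(plain, hooked)
```

Calling the object rather than `forward` directly is what gives hooks the chance to run.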

---


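# 3. `nn.Module` in torch.nn
`torch.nn` is the framework's neural network package. It contains four main parts:

- nn.Parameter: a subclass of Tensor that represents a learnable parameter, such as weight and bias
- nn.Module: the base class of all networks; LeNet is an nn.Module, and the convolution and pooling layers inside it are nn.Module objects as well
- nn.functional: the concrete function implementations, such as the convolution and pooling functions
- nn.init: a rich set of parameter initialization methods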

<img src="picture/torch.nn分类.png" width=650>
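
The four parts can be seen side by side in a short script. This is only an illustrative sketch that assumes PyTorch is installed; the layer sizes and the choice of Xavier initialization are arbitrary picks for the example, not something prescribed by this section.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv2d(3, 6, 5)                       # nn.Module: a layer object

# nn.Parameter: the learnable tensors owned by the layer
weight_is_parameter = isinstance(conv.weight, nn.Parameter)
weight_is_tensor = isinstance(conv.weight, torch.Tensor)

# nn.init: in-place initializers (note the trailing underscore)
nn.init.xavier_uniform_(conv.weight)
nn.init.zeros_(conv.bias)

# nn.functional: a stateless function applied to the layer's output
x = torch.randn(1, 3, 32, 32)
out = F.relu(conv(x))

print(isinstance(conv, nn.Module), weight_is_parameter, weight_is_tensor)
print(tuple(out.shape))
```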

---
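## 3.1 The eight dictionaries of nn.Module
Next we look at **nn.Module** in detail. `nn.Module` holds eight important ordered dictionaries, which are attributes of every module.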
```python
self._parameters = OrderedDict()
self._buffers = OrderedDict() 
self._backward_hooks = OrderedDict() 
self._forward_hooks = OrderedDict() 
self._forward_pre_hooks = OrderedDict() 
self._state_dict_hooks = OrderedDict() 
self._load_state_dict_pre_hooks = OrderedDict() 
self._modules = OrderedDict()
```

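The most important ones are:
- <font color=blue>parameters: stores and manages nn.Parameter objects</font>
- <font color=blue>modules: stores and manages nn.Module objects; the convolution and pooling layers in LeNet, for example, belong here</font>
- buffers: stores and manages buffer attributes, such as running_mean in a BN layer
- xxx_hooks: stores and manages hook functions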
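
How the `_parameters` and `_modules` dictionaries work together can be sketched without PyTorch: a parent finds every learnable value by reading its own `_parameters` and then recursing into `_modules`. The class below is a simplified, made-up stand-in. Its `named_parameters` imitates the idea behind the real method of the same name, not its actual implementation, and the parameter values are placeholder lists.

```python
from collections import OrderedDict


class TinyModule:
    """Hypothetical stand-in holding just two of the eight dictionaries."""

    def __init__(self):
        self._parameters = OrderedDict()
        self._modules = OrderedDict()

    def named_parameters(self, prefix=""):
        # Own parameters first, then those of each child, depth first
        for name, value in self._parameters.items():
            yield prefix + name, value
        for child_name, child in self._modules.items():
            yield from child.named_parameters(prefix + child_name + ".")


def make_layer():
    layer = TinyModule()
    layer._parameters["weight"] = [0.1, 0.2]   # placeholder values
    layer._parameters["bias"] = [0.0]
    return layer


net = TinyModule()
net._modules["conv1"] = make_layer()
net._modules["fc1"] = make_layer()

names = [name for name, _ in net.named_parameters()]
print(names)
```

Because both dictionaries are ordered, the names always come back in registration order.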

Next we continue stepping through the code to understand how an nn.Module is created.

```python
# LeNet inherits from nn.Module, so LeNet is itself an nn.Module
class LeNet(nn.Module):
    def __init__(self, classes):
        # Step into the parent class's __init__() here
        super(LeNet, self).__init__()  # <-- the debugger stops here
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)
```
---
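Stepping into `__init__(self)`, we find that initialization is delegated to a helper method, `_construct()`, which creates the eight dictionaries mentioned above. Pay particular attention to `self._modules` and `self._parameters`.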
```python
def __init__(self):
    self._construct()
    # initialize self.training separately from the rest of the internal
    # state, as it is managed differently by nn.Module and ScriptModule
    self.training = True

def _construct(self):
    """
    Initializes internal Module state, shared by both nn.Module and ScriptModule.
    """
    torch._C._log_api_usage_once("python.nn_module")
    self._backend = thnn_backend
    self._parameters = OrderedDict()
    self._buffers = OrderedDict()
    self._backward_hooks = OrderedDict()
    self._forward_hooks = OrderedDict()
    self._forward_pre_hooks = OrderedDict()
    self._state_dict_hooks = OrderedDict()
    self._load_state_dict_pre_hooks = OrderedDict()
    self._modules = OrderedDict()
```

--- 

Once it runs to the end and returns, we leave `__init__()` and arrive at the next line:
```python
class LeNet(nn.Module):
    def __init__(self, classes):
        # Step into the parent class's __init__() here
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)  # <-- the debugger is now here
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)
```
Stepping into `nn.Conv2d`, we find that `Conv2d` inherits from `_ConvNd`, and `_ConvNd` in turn inherits from `Module`, so every one of these classes has the eight dictionary attributes.
```python
class Conv2d(_ConvNd):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1,
                 bias=True, padding_mode='zeros'):
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        padding = _pair(padding)
        dilation = _pair(dilation)
        super(Conv2d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _pair(0), groups, bias, padding_mode)
```
After stepping out and returning to the LeNet class, inspect the instance's dictionaries:
```python
# 1-> Command
self.__dict__
# Result
{'_backend': <torch.nn.backends.thnn.THNNFunctionBackend object at 0x125bb1438>, 
 '_parameters': OrderedDict(), 
 '_buffers': OrderedDict(), 
 '_backward_hooks': OrderedDict(), 
 '_forward_hooks': OrderedDict(), 
 '_forward_pre_hooks': OrderedDict(), 
 '_state_dict_hooks': OrderedDict(), 
 '_load_state_dict_pre_hooks': OrderedDict(), 
 '_modules': OrderedDict([('conv1', Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1)))]), 
 'training': True}

# 2-> Command
self._modules
# Result: an ordered dictionary whose key is 'conv1' and whose value is Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
OrderedDict([('conv1', Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1)))])

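# 3-> self is an nn.Module, and the conv layer inside it is also an nn.Module, so it has its own set of the 8 dictionaries
# Note that '_modules' of conv1 is empty: a conv layer has no submodules, it is a leaf module
self._modules['conv1'].__dict__
# Result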
{'_backend': <torch.nn.backends.thnn.THNNFunctionBackend object at 0x125bb1438>, 
 '_parameters': OrderedDict([
     ('weight', Parameter containing:
     tensor([[[[ 0.0595, -0.0510, -0.0224,  0.0542, -0.1087],
              [ 0.0692, -0.0238,  0.0587,  0.0161, -0.0141],
              [ 0.0320,  0.0057,  0.0422, -0.0450, -0.0084],
              [-0.0104,  0.0167, -0.0005,  0.1009,  0.0359],
              [-0.0430, -0.0697, -0.0194, -0.0498, -0.0370]], ...  # remaining values elided
    ], requires_grad=True)),
    ('bias', Parameter containing:
    tensor([ 0.0387,  0.0632,  0.1145, -0.0121,  0.1111,  0.0742],requires_grad=True))]), 
 '_buffers': OrderedDict(), 
 '_backward_hooks': OrderedDict(), 
 '_forward_hooks': OrderedDict(), 
 '_forward_pre_hooks': OrderedDict(), 
 '_state_dict_hooks': OrderedDict(), 
 '_load_state_dict_pre_hooks': OrderedDict(), 
 '_modules': OrderedDict(), 
 'training': True, 
 'in_channels': 3, 
 'out_channels': 6, 
 'kernel_size': (5, 5), 
 'stride': (1, 1), 
 'padding': (0, 0), 
 'dilation': (1, 1), 
 'transposed': False, 
 'output_padding': (0, 0), 
 'groups': 1, 
 'padding_mode': 'zeros'}

# 4-> The weight entry in the conv layer's _parameters dictionary is a Parameter (a Tensor subclass), so it has the usual tensor attributes: dtype, data, device
self._modules['conv1']._parameters['weight'].dtype
# Result
torch.float32
```

---



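A call such as `nn.Linear(16*5*5, 120)` creates a module instance. Before that instance is bound to an attribute such as `self.fc1`, the assignment always passes through a function that first <font color=red>checks its type</font>. The walkthrough here follows the `conv2` line: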
```python
class LeNet(nn.Module):
    def __init__(self, classes):
        # Step into the parent class's __init__() here
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)  # <-- the debugger is now here
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)
```
Stepping in, we land in `__setattr__(self, name, value)`. This function checks what kind of object `value` is (an `nn.Parameter`, an `nn.Module`, or a buffer) and stores it accordingly. For example, `nn.Conv2d(6, 16, 5)` is an `nn.Module` and not an `nn.Parameter`, so it is saved in the `_modules` dictionary of the `LeNet` instance.
```python
def __setattr__(self, name, value):
        def remove_from(*dicts):
            for d in dicts:
                if name in d:
                    del d[name]

        params = self.__dict__.get('_parameters')
        # Check 1: is the value an nn.Parameter instance?
        if isinstance(value, Parameter):
            if params is None:
                raise AttributeError(
                    "cannot assign parameters before Module.__init__() call")
            remove_from(self.__dict__, self._buffers, self._modules)
            self.register_parameter(name, value)
        # Check 2: is the name already registered as a parameter?
        elif params is not None and name in params:
            if value is not None:
                raise TypeError("cannot assign '{}' as parameter '{}' "
                                "(torch.nn.Parameter or None expected)"
                                .format(torch.typename(value), name))
            self.register_parameter(name, value)
        # Check 3: otherwise, is the value an nn.Module?
        else:
            modules = self.__dict__.get('_modules')
            if isinstance(value, Module):
                if modules is None:
                    raise AttributeError(
                        "cannot assign module before Module.__init__() call")
                remove_from(self.__dict__, self._parameters, self._buffers)
                # It is a Module, so value (the conv layer) is stored as modules['conv2']
                modules[name] = value  # <-- conv2 is registered here
            elif modules is not None and name in modules:
                if value is not None:
                    raise TypeError("cannot assign '{}' as child module '{}' "
                                    "(torch.nn.Module or None expected)"
                                    .format(torch.typename(value), name))
                modules[name] = value
            else:
                buffers = self.__dict__.get('_buffers')
                if buffers is not None and name in buffers:
                    if value is not None and not isinstance(value, torch.Tensor):
                        raise TypeError("cannot assign '{}' as buffer '{}' "
                                        "(torch.Tensor or None expected)"
                                        .format(torch.typename(value), name))
                    buffers[name] = value
                else:
                    object.__setattr__(self, name, value)
```
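
The routing logic above can be reduced to a small plain-Python sketch. This is a simplified illustration under stated assumptions, not the PyTorch source: `TinyParameter`, `TinyModule`, `TinyLinear`, `TinyNet`, and `ForgotSuper` are made-up names, and only the parameter and module branches are modeled (no buffers, no re-assignment checks).

```python
from collections import OrderedDict


class TinyParameter:
    """Hypothetical stand-in for nn.Parameter."""

    def __init__(self, data):
        self.data = data


class TinyModule:
    """Hypothetical stand-in that routes assignments by value type."""

    def __init__(self):
        # OrderedDicts are neither parameters nor modules, so these two
        # assignments fall through to object.__setattr__
        self._parameters = OrderedDict()
        self._modules = OrderedDict()

    def __setattr__(self, name, value):
        if isinstance(value, TinyParameter):
            params = self.__dict__.get("_parameters")
            if params is None:
                raise AttributeError("cannot assign parameters before __init__() call")
            params[name] = value
        elif isinstance(value, TinyModule):
            modules = self.__dict__.get("_modules")
            if modules is None:
                raise AttributeError("cannot assign module before __init__() call")
            modules[name] = value
        else:
            object.__setattr__(self, name, value)

    def __getattr__(self, name):
        # Only reached when normal lookup fails: search the two dictionaries
        for store in ("_parameters", "_modules"):
            table = self.__dict__.get(store)
            if table is not None and name in table:
                return table[name]
        raise AttributeError(f"{type(self).__name__!r} object has no attribute {name!r}")


class TinyLinear(TinyModule):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.weight = TinyParameter([[0.0] * n_in for _ in range(n_out)])
        self.n_out = n_out                  # plain attribute


class TinyNet(TinyModule):
    def __init__(self):
        super().__init__()
        self.fc1 = TinyLinear(4, 2)         # routed into _modules


class ForgotSuper(TinyModule):
    def __init__(self):
        self.fc1 = TinyLinear(4, 2)         # parent __init__ never ran


net = TinyNet()
print(list(net._modules), list(net.fc1._parameters), "fc1" in net.__dict__)

try:
    ForgotSuper()
    forgot_super_error = None
except AttributeError as err:
    forgot_super_error = str(err)
print(forgot_super_error)
```

The `ForgotSuper` case shows why `super().__init__()` has to be the first statement: until it runs, the dictionaries that `__setattr__` writes into do not exist.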

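# 4. Summary
Key points about nn.Module:
- A module can contain multiple submodules. For example, LeNet is a module, and the convolution layers that make it up are modules too.
- A module represents a computation and must implement the `forward()` function.
- Every module has 8 dictionaries that manage its attributes.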