# 1. Model Construction

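Define a custom class that inherits from `nn.Module`. Two things need to be customized:

1. Define the layers the network needs in the constructor.
2. Implement the `forward` function, which specifies the forward-propagation computation (an `nn.Sequential`, by contrast, always runs its layers strictly in the order they were added).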
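
The two steps above can be sketched as follows. The class name `TwoLayerNet` and the layer sizes are made up for illustration; the comparison with `nn.Sequential` reuses the same layer objects, so both models should compute the same result.

```python
import torch
from torch import nn

class TwoLayerNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        # Step 1: declare the layers in the constructor
        self.hidden = nn.Linear(4, 8)
        self.act = nn.ReLU()
        self.out = nn.Linear(8, 2)

    def forward(self, x):
        # Step 2: spell out the forward computation explicitly
        return self.out(self.act(self.hidden(x)))

net = TwoLayerNet()
# nn.Sequential applies its layers strictly in the order they are listed;
# here it reuses the very same layer objects as `net`
seq = nn.Sequential(net.hidden, net.act, net.out)

x = torch.randn(3, 4)
same = torch.equal(net(x), seq(x))
print(net(x).shape, same)
```

A custom `forward` is only needed when the computation is not a plain chain of layers (branches, skip connections, several inputs); for a simple chain, `nn.Sequential` is usually enough.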


# 2. Parameter Management

## 2.1 Parameter Access

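The parameters of a layer object can be accessed through `state_dict()`, which returns a dict containing all of them.

For example, call `state_dict()` on an `nn.Linear` instance.

Each parameter, such as `weight` and `bias`, in turn exposes two attributes: `data` and `grad`, which hold the parameter's values and its gradient respectively. `grad` is `None` until a backward pass has run.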

In [28]:
import torch
from torch import nn

class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.linear = nn.Linear(10,1)

    def forward(self, x):
        return self.linear(x)

    def param_traverse(self):
        print(self.linear.state_dict())
        print(self.linear.weight.data)
        print(self.linear.weight.grad)

myMod = MyModule()
myMod.param_traverse()

OrderedDict([('weight', tensor([[ 0.2137, -0.0377,  0.0331, -0.0497, -0.1235,  0.2806,  0.2334,  0.2007,
         -0.2862, -0.0361]])), ('bias', tensor([0.1923]))])
tensor([[ 0.2137, -0.0377,  0.0331, -0.0497, -0.1235,  0.2806,  0.2334,  0.2007,
         -0.2862, -0.0361]])
None
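
The `None` printed for the gradient is expected, since no backward pass has run. A minimal sketch of how `grad` gets filled in, assuming an arbitrary input of ones and a sum as a stand-in loss:

```python
import torch
from torch import nn

linear = nn.Linear(10, 1)
grad_before = linear.weight.grad      # None: no backward pass has run yet

# Stand-in "loss": the sum of the outputs for two all-ones samples
loss = linear(torch.ones(2, 10)).sum()
loss.backward()                       # populates .grad for every parameter
grad_after = linear.weight.grad

# named_parameters() yields (name, Parameter) pairs
for name, param in linear.named_parameters():
    print(name, tuple(param.shape), param.grad is not None)
```

With this input, every entry of the weight gradient equals the number of samples, because each output is linear in the weights.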


## 2.2 Parameter Initialization

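Call `apply` on a module and pass it an initialization function. `apply` calls that function once for each submodule and once for the module itself, so the function receives an `nn.Module` object as its argument.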



In [29]:
def xavier(layer):
    if type(layer) == nn.Linear:
        nn.init.xavier_normal_(layer.weight)

def myInit(layer):
    if type(layer) == nn.Linear:
        # nn.init.constant_(layer.weight, 0)
        layer.weight.data = torch.zeros_like(layer.weight.data)

linear = nn.Linear(10,1)
linear.apply(xavier)
print(linear.weight.data)
linear.apply(myInit)
print(linear.weight.data)

tensor([[-0.6841,  0.3849,  0.2867,  0.2755,  0.1669, -0.8730,  0.3985, -0.3791,
          0.5592,  0.3887]])
tensor([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
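
To make the traversal performed by `apply` visible, the sketch below records the class name of every module it visits. The helper `record` and the small `nn.Sequential` model are illustrative only.

```python
from torch import nn

visited = []

def record(module):
    # apply() passes each module in turn; here we only note its class name
    visited.append(type(module).__name__)

net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
net.apply(record)
print(visited)
```

The children are visited before the container itself, which is why initialization functions usually filter on the layer type, as `xavier` and `myInit` do.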


To modify individual entries of a module's parameters:

In [30]:
with torch.no_grad():
    linear.weight.data[0, 0] = 42   # element at row 0, column 0
print(linear.weight.data)

tensor([[42.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])


# 3. Saving Models

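In general, PyTorch saves only a model's weights, not its architecture, so the class that defines the network must be available in code when the weights are loaded.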

Saving and loading a tensor (the `./tmp` directory must already exist):

In [31]:
x = torch.tensor([1,2,3,4])

torch.save(x, './tmp/x-file')

x_ = torch.load('./tmp/x-file')
print(x_)

tensor([1, 2, 3, 4])



Saving and loading a dict of tensors:

In [32]:
dic_ = {
    'x': torch.arange(4),
    'y': torch.zeros(4)
}

torch.save(dic_, './tmp/dict')

dic_1 = torch.load('./tmp/dict')
dic_1


{'x': tensor([0, 1, 2, 3]), 'y': tensor([0., 0., 0., 0.])}

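Suppose we define an MLP and want to save the weights of each of its layers.

As noted earlier, `state_dict()` returns a dictionary; for a model, it holds the weight parameters of every layer, in the following format: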

```py
OrderedDict([
    ('layer1_name.weight', tensor(...)),
    ('layer1_name.bias', tensor(...)),
    ('layer2_name.weight', tensor(...))
])
```
So saving a model's weights comes down to saving the result of this function:

In [33]:
import torch
from torch import nn

class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.linear = nn.Linear(20, 256)
        self.relu = nn.ReLU()
        self.output = nn.Linear(256, 10)

    def forward(self, x):
        return self.output(self.relu(self.linear(x)))
    
net = MLP()

# Save the weights of every layer
torch.save(net.state_dict(), './tmp/mlp.params')

Loading the model weights:

In [39]:
clone = MLP()

clone.load_state_dict(torch.load('./tmp/mlp.params'))

# Print the model structure (eval() also switches the module to evaluation mode)
print(clone.eval())

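# Feed the same input to both networks and check that the outputs match
x = torch.randn(size=(2, 20))
out_net = net(x)      # calling the module invokes forward() and runs any registered hooks
out_clone = clone(x)
out_clone == out_net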

MLP(
  (linear): Linear(in_features=20, out_features=256, bias=True)
  (relu): ReLU()
  (output): Linear(in_features=256, out_features=10, bias=True)
)



tensor([[True, True, True, True, True, True, True, True, True, True],
        [True, True, True, True, True, True, True, True, True, True]])
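
The elementwise `==` above returns a tensor of booleans; `torch.equal` reduces the check to a single `bool`. The sketch below repeats the save/load round trip under a few assumptions: an in-memory `io.BytesIO` buffer stands in for the file on disk, an `nn.Sequential` with the same layer sizes stands in for the `MLP` class, and the installed PyTorch version accepts the `weights_only` argument of `torch.load`.

```python
import io
import torch
from torch import nn

net = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))

buffer = io.BytesIO()          # in-memory stand-in for a file on disk
torch.save(net.state_dict(), buffer)
buffer.seek(0)                 # rewind so that torch.load reads from the start

# A freshly initialized model with the same architecture
clone = nn.Sequential(nn.Linear(20, 256), nn.ReLU(), nn.Linear(256, 10))
# weights_only=True restricts unpickling to tensors and plain containers
# (assumes a PyTorch version that supports this argument)
clone.load_state_dict(torch.load(buffer, weights_only=True))
clone.eval()

x = torch.randn(2, 20)
with torch.no_grad():
    identical = torch.equal(net(x), clone(x))
print(identical)
```

Loading with `weights_only=True` is a reasonable default when a checkpoint contains nothing but a `state_dict`, since it avoids executing arbitrary pickled code.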