An introduction to modules(), children(), parameters(), and state_dict()

References:

1. [PyTorch: model.modules(), model.children(), model.named_children(), model.parameters(), model.nam...](https://www.jianshu.com/p/a4c745b6ea9b)

2. [A Deep Dive into Saving PyTorch Models](https://www.jianshu.com/p/6c558300130f)

3. [PyTorch 101, Part 3: Going Deep with PyTorch](https://blog.paperspace.com/pytorch-101-advanced/)

modules() & children()

modules() Returns an iterator over all modules in the network.

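Note the word "all": modules() traverses every layer of the model at every depth, including the model itself. A "layer" here means any instance of a torch.nn.Module subclass.

The key is understanding what counts as a module: the whole Net is a subclass of torch.nn.Module, and the layers inside Net are subclasses of torch.nn.Module as well.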


children() Returns an iterator over immediate children modules.

children() returns only the direct children of the module and does not descend any further.



As shown in the figure below:

![Figure 1](images/modules.png)

## model.modules()

Example 1

In [1]:
import torch

class Net(torch.nn.Module):

    def __init__(self, num_class=10):
        super().__init__()
        self.features = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3),
            torch.nn.BatchNorm2d(6),
            torch.nn.ReLU(inplace=True),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Conv2d(in_channels=6, out_channels=9, kernel_size=3),
            torch.nn.BatchNorm2d(9),
            torch.nn.ReLU(inplace=True),
            torch.nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.classifier = torch.nn.Sequential(
            torch.nn.Linear(9*8*8, 128),
            torch.nn.ReLU(inplace=True),
            torch.nn.Dropout(),
            torch.nn.Linear(128, num_class)
        )

    def forward(self, x):
        output = self.features(x)
        output = output.view(output.size()[0], -1)
        output = self.classifier(output)

        return output

model = Net()    


This network has three levels of hierarchy:

Net:

----features

------------Conv2d

------------BatchNorm2d

------------ReLU

------------MaxPool2d

------------Conv2d

------------BatchNorm2d

------------ReLU

------------MaxPool2d


----classifier:

------------Linear

------------ReLU

------------Dropout

------------Linear
 

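Net itself is a subclass of torch.nn.Module. It contains two sub-modules, features and classifier, each built from a Sequential container, which is also a torch.nn.Module subclass.

features and classifier in turn each contain a number of network layers, and these are torch.nn.Module subclasses too.

Let's look at model.modules() first.

model.modules() iterates over every layer of the model, descending recursively into nested containers.

"Every layer" means every instance of a torch.nn.Module subclass.

In Example 1 these are Net, features, classifier, and the Conv2d, BatchNorm2d, ... layers inside them.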

In [3]:
model_modules = [x for x in model.modules()]
model_modules

[Net(
   (features): Sequential(
     (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
     (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
     (2): ReLU(inplace=True)
     (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
     (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
     (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
     (6): ReLU(inplace=True)
     (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
   )
   (classifier): Sequential(
     (0): Linear(in_features=576, out_features=128, bias=True)
     (1): ReLU(inplace=True)
     (2): Dropout(p=0.5, inplace=False)
     (3): Linear(in_features=128, out_features=10, bias=True)
   )
 ),
 Sequential(
   (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
   (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (2): ReLU(inplace=True)
   (3): MaxPool2d(kernel_

In [4]:
len(model_modules)

15

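The model_modules list has 15 elements in total.

The first is the whole Net, followed by the features container and then every layer inside features.

Next comes the classifier container, followed by every layer inside classifier.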
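The traversal order described above is parent first, then its children. A minimal sketch of that order follows, using two small stand-in networks (`net` and `nested` are illustrative names, not part of Example 1). It also shows that a module registered twice is yielded only once, which matters when counting elements as done above.

```python
import torch

# A single Linear layer registered twice inside one Sequential container.
shared = torch.nn.Linear(2, 2)
net = torch.nn.Sequential(shared, shared)

# modules() yields the container, then the shared layer exactly once.
all_modules = list(net.modules())
print(len(all_modules))  # 2, not 3

# Parent-first order on a nested container.
nested = torch.nn.Sequential(
    torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU()),
    torch.nn.Linear(4, 1),
)
order = [type(m).__name__ for m in nested.modules()]
print(order)  # ['Sequential', 'Sequential', 'Linear', 'ReLU', 'Linear']
```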

model.named_modules()

In [5]:
model_named_modules = [x for x in model.named_modules()]
len(model_named_modules)

15

In [6]:
model_named_modules

[('',
  Net(
    (features): Sequential(
      (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
      (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU(inplace=True)
      (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
      (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (6): ReLU(inplace=True)
      (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (classifier): Sequential(
      (0): Linear(in_features=576, out_features=128, bias=True)
      (1): ReLU(inplace=True)
      (2): Dropout(p=0.5, inplace=False)
      (3): Linear(in_features=128, out_features=10, bias=True)
    )
  )),
 ('features',
  Sequential(
    (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
    (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): R

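Compared with model.modules(), model.named_modules() returns not only every layer of the model but also each layer's name.

Only features and classifier carry names given in the model definition; the other layers are named automatically by PyTorch. Inside a Sequential container each layer is named by its index, prefixed with the parent's name (for example features.0).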
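Since the output above is cut off, here is a small sketch that prints only the names. `TinyNet` is a hypothetical, reduced stand-in for Net, used just to keep the listing short.

```python
import torch

class TinyNet(torch.nn.Module):  # hypothetical, smaller stand-in for Net
    def __init__(self):
        super().__init__()
        self.features = torch.nn.Sequential(
            torch.nn.Conv2d(3, 6, kernel_size=3),
            torch.nn.ReLU(),
        )
        self.classifier = torch.nn.Linear(6, 10)

tiny = TinyNet()
names = [name for name, _ in tiny.named_modules()]
print(names)  # ['', 'features', 'features.0', 'features.1', 'classifier']

# The names can be used as keys to fetch one specific layer.
by_name = dict(tiny.named_modules())
first_conv = by_name["features.0"]
print(type(first_conv).__name__)  # Conv2d
```

The empty string is the name of the root module itself.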

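Getting the names along with the layers makes it possible to modify specific layers by name while iterating. If every layer was given a name in the model definition, for example conv1, conv2, ... for the convolution layers, we can do this: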

In [8]:
for name, layer in model.named_modules():
    if 'conv' in name:
        pass

When names are not available, isinstance() can do the same job:

In [9]:
for layer in model.modules():
    if isinstance(layer, torch.nn.Conv2d):
        pass
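The two loops above leave the body as `pass`. As one possible use, the sketch below re-initializes every convolution layer it finds. The choice of Kaiming initialization and the small Sequential model are assumptions made for illustration only. Note that in Net the convolution layers are named features.0 and features.4, so a filter on 'conv' in the name would match nothing there, while isinstance() does not depend on names.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 6, kernel_size=3),
    torch.nn.BatchNorm2d(6),
    torch.nn.ReLU(),
    torch.nn.Conv2d(6, 9, kernel_size=3),
)

conv_count = 0
for layer in model.modules():
    if isinstance(layer, torch.nn.Conv2d):
        # Re-initialize weights in place and reset the bias to zero.
        torch.nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")
        torch.nn.init.zeros_(layer.bias)
        conv_count += 1

print(conv_count)  # 2
```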

## model.children()

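If we divide the Net model into levels from the outside in, features and classifier are the direct children of Net.

Conv2d, ReLU, BatchNorm2d, and so on are in turn children of features,

and Linear, ReLU, Dropout, and so on are children of classifier.

model.modules() above traverses not only the model's children but also the children's children.

model.children() traverses only the model's direct children, that is, features and classifier.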

In [10]:
model_children = [x for x in model.children()]
len(model_children)

2

In [11]:
model_children

[Sequential(
   (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
   (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (2): ReLU(inplace=True)
   (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
   (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
   (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (6): ReLU(inplace=True)
   (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
 ),
 Sequential(
   (0): Linear(in_features=576, out_features=128, bias=True)
   (1): ReLU(inplace=True)
   (2): Dropout(p=0.5, inplace=False)
   (3): Linear(in_features=128, out_features=10, bias=True)
 )]
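A common use of children() is to cut a model at the top level, for example to keep everything except the last block as a feature extractor. The sketch below assumes a hypothetical `TinyNet` whose forward pass is a plain chain of its children; for models with more complex forward logic this shortcut would not reproduce the original computation.

```python
import torch

class TinyNet(torch.nn.Module):  # hypothetical stand-in for Net
    def __init__(self):
        super().__init__()
        self.features = torch.nn.Sequential(
            torch.nn.Conv2d(3, 6, kernel_size=3),
            torch.nn.ReLU(),
            torch.nn.AdaptiveAvgPool2d(1),
            torch.nn.Flatten(),
        )
        self.classifier = torch.nn.Linear(6, 10)

    def forward(self, x):
        return self.classifier(self.features(x))

model = TinyNet()
children = list(model.children())               # [features, classifier]
backbone = torch.nn.Sequential(*children[:-1])  # drop the last direct child

x = torch.randn(2, 3, 16, 16)
with torch.no_grad():
    feats = backbone(x)
print(feats.shape)  # torch.Size([2, 6])
```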

model.named_children()

model.named_children() is the named counterpart of model.children(): it yields (name, module) tuples, so it traverses the model's direct children and also returns each child's name.

In [12]:
model_named_children = [x for x in model.named_children()]
len(model_named_children)

2

In [13]:
model_named_children

[('features',
  Sequential(
    (0): Conv2d(3, 6, kernel_size=(3, 3), stride=(1, 1))
    (1): BatchNorm2d(6, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(6, 9, kernel_size=(3, 3), stride=(1, 1))
    (5): BatchNorm2d(9, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (6): ReLU(inplace=True)
    (7): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )),
 ('classifier',
  Sequential(
    (0): Linear(in_features=576, out_features=128, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=128, out_features=10, bias=True)
  ))]
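Because each direct child comes with its name, named_children() is a convenient place to treat top-level blocks differently. The sketch below freezes the block named features and leaves the rest trainable; `TinyNet` is again a hypothetical reduced model, and freezing is just one example of what could be done here.

```python
import torch

class TinyNet(torch.nn.Module):  # hypothetical stand-in for Net
    def __init__(self):
        super().__init__()
        self.features = torch.nn.Sequential(
            torch.nn.Conv2d(3, 6, kernel_size=3),
            torch.nn.ReLU(),
        )
        self.classifier = torch.nn.Linear(6, 10)

model = TinyNet()
for name, child in model.named_children():
    if name == "features":
        # Turn off gradient tracking for every parameter in this block.
        for p in child.parameters():
            p.requires_grad_(False)

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['classifier.weight', 'classifier.bias']
```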

## model.parameters()

In [14]:
model_parameters = [x for x in model.parameters()]
len(model_parameters)

12

Iterates over all parameters of the model.

In [15]:
model_parameters

[Parameter containing:
 tensor([[[[-1.6315e-01, -3.4358e-02, -1.3783e-01],
           [-1.3472e-01,  3.4391e-02,  6.1666e-02],
           [-7.7754e-02,  1.7315e-01, -2.9836e-02]],
 
          [[-1.8604e-01,  8.0605e-02,  1.6610e-01],
           [ 1.8859e-01, -7.3221e-02, -1.4332e-01],
           [ 1.2083e-01,  4.6468e-02, -1.4959e-02]],
 
          [[ 1.7223e-01,  5.8234e-02, -1.4663e-02],
           [-4.0561e-02,  1.6314e-01, -1.5358e-01],
           [-1.7115e-02, -1.7322e-01, -1.9184e-01]]],
 
 
         [[[-1.2143e-01,  7.2797e-02,  2.2985e-02],
           [ 1.7642e-01,  6.5539e-03, -7.2446e-04],
           [-6.6437e-02, -1.6033e-03,  1.5135e-01]],
 
          [[ 1.6059e-02, -1.5727e-01,  6.7963e-02],
           [ 9.4197e-02,  7.8924e-02, -9.0928e-02],
           [ 1.3545e-01,  7.9872e-02,  8.8389e-02]],
 
          [[ 1.6246e-01,  1.3734e-01,  1.5227e-01],
           [-1.2483e-01, -1.1685e-01, -1.6523e-01],
           [-1.4959e-03, -8.0803e-02, -8.5100e-02]]],
 
 
         [[[-7.61

model.named_parameters()

Returns each parameter together with its name.

In [16]:
model_named_parameters = [x for x in model.named_parameters()]
len(model_named_parameters)

12

In [17]:
model_named_parameters

[('features.0.weight',
  Parameter containing:
  tensor([[[[-1.6315e-01, -3.4358e-02, -1.3783e-01],
            [-1.3472e-01,  3.4391e-02,  6.1666e-02],
            [-7.7754e-02,  1.7315e-01, -2.9836e-02]],
  
           [[-1.8604e-01,  8.0605e-02,  1.6610e-01],
            [ 1.8859e-01, -7.3221e-02, -1.4332e-01],
            [ 1.2083e-01,  4.6468e-02, -1.4959e-02]],
  
           [[ 1.7223e-01,  5.8234e-02, -1.4663e-02],
            [-4.0561e-02,  1.6314e-01, -1.5358e-01],
            [-1.7115e-02, -1.7322e-01, -1.9184e-01]]],
  
  
          [[[-1.2143e-01,  7.2797e-02,  2.2985e-02],
            [ 1.7642e-01,  6.5539e-03, -7.2446e-04],
            [-6.6437e-02, -1.6033e-03,  1.5135e-01]],
  
           [[ 1.6059e-02, -1.5727e-01,  6.7963e-02],
            [ 9.4197e-02,  7.8924e-02, -9.0928e-02],
            [ 1.3545e-01,  7.9872e-02,  8.8389e-02]],
  
           [[ 1.6246e-01,  1.3734e-01,  1.5227e-01],
            [-1.2483e-01, -1.1685e-01, -1.6523e-01],
            [-1.4959e-03, -8
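Printing full tensors gets long quickly. A shorter way to inspect parameters is to look at their names and sizes only. The sketch below does this for a small hypothetical three-layer model rather than for Net.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(3, 2),   # weight 2*3 = 6, bias 2
    torch.nn.ReLU(),         # no parameters
    torch.nn.Linear(2, 1),   # weight 1*2 = 2, bias 1
)

# Map each parameter name to its number of elements.
sizes = {name: p.numel() for name, p in model.named_parameters()}
total = sum(sizes.values())

print(sizes)  # {'0.weight': 6, '0.bias': 2, '2.weight': 2, '2.bias': 1}
print(total)  # 11
```

Layers without parameters, such as ReLU, do not appear at all.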

## model.state_dict()

As the output below shows, model.state_dict() returns a dictionary (an OrderedDict), whereas all the methods above return generators.

In [18]:
model.state_dict()

OrderedDict([('features.0.weight',
              tensor([[[[-1.6315e-01, -3.4358e-02, -1.3783e-01],
                        [-1.3472e-01,  3.4391e-02,  6.1666e-02],
                        [-7.7754e-02,  1.7315e-01, -2.9836e-02]],
              
                       [[-1.8604e-01,  8.0605e-02,  1.6610e-01],
                        [ 1.8859e-01, -7.3221e-02, -1.4332e-01],
                        [ 1.2083e-01,  4.6468e-02, -1.4959e-02]],
              
                       [[ 1.7223e-01,  5.8234e-02, -1.4663e-02],
                        [-4.0561e-02,  1.6314e-01, -1.5358e-01],
                        [-1.7115e-02, -1.7322e-01, -1.9184e-01]]],
              
              
                      [[[-1.2143e-01,  7.2797e-02,  2.2985e-02],
                        [ 1.7642e-01,  6.5539e-03, -7.2446e-04],
                        [-6.6437e-02, -1.6033e-03,  1.5135e-01]],
              
                       [[ 1.6059e-02, -1.5727e-01,  6.7963e-02],
                        [ 9.4197e-02,  7
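One difference from named_parameters() that the truncated output does not show: state_dict() also contains persistent buffers, such as the running statistics of BatchNorm layers. A minimal sketch on a hypothetical two-layer model:

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 6, kernel_size=3),
    torch.nn.BatchNorm2d(6),
)

param_names = [name for name, _ in model.named_parameters()]
state_keys = list(model.state_dict().keys())

# Entries that are in the state dict but are not parameters.
buffer_only = [k for k in state_keys if k not in param_names]
print(len(param_names), len(state_keys))  # 4 7
print(buffer_only)
```

For this model the extra entries are the BatchNorm buffers running_mean, running_var, and num_batches_tracked.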

save(), load()

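Save: torch.save(model.state_dict(), path)

Load: model.load_state_dict(torch.load(path))

torch.save() is PyTorch's serialization function, and torch.load() is the matching function for reading the data back.

torch.save() can write many kinds of objects to disk, including tensors, lists, ndarrays, dictionaries, and models.

torch.load() reads objects saved on disk back into memory.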
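The save and load one-liners above can be sketched as a full round trip. The temporary directory and the file name `linear.pth` are placeholders chosen for this example. Recent PyTorch releases restrict what torch.load() unpickles by default (the weights_only argument), so a state dict of tensors, as used here, is the safest thing to store.

```python
import os
import tempfile

import torch

model = torch.nn.Linear(3, 2)
path = os.path.join(tempfile.mkdtemp(), "linear.pth")  # hypothetical file name

# Save only the state dict, not the whole model object.
torch.save(model.state_dict(), path)

# Rebuild the same architecture, then load the saved weights into it.
restored = torch.nn.Linear(3, 2)
restored.load_state_dict(torch.load(path))

same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)  # True
```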

In [19]:
a = torch.ones(3)
a

tensor([1., 1., 1.])

In [20]:
# Save a tensor
torch.save(a, './a.pth')
a_load = torch.load('./a.pth')
a_load

tensor([1., 1., 1.])

In [21]:
# Save a dictionary
b = {k:v for v, k in enumerate('abc', 1)}
b

{'a': 1, 'b': 2, 'c': 3}

In [22]:
torch.save(b, './b.rar')
torch.load('./b.rar')

{'a': 1, 'b': 2, 'c': 3}

model.state_dict()

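state_dict() returns a dictionary containing a whole state of the module.

model.state_dict().keys() returns the names of the entries, that is, the parameter names (plus the names of any persistent buffers).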



In [23]:
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = torch.nn.Linear(3, 2)
        self.act = torch.nn.ReLU()
        self.output = torch.nn.Linear(2, 1)

    def forward(self, x):
        a = self.act(self.hidden(x))
        return self.output(a)

net1 = M()

In [24]:
net1.state_dict()

OrderedDict([('hidden.weight',
              tensor([[ 0.1791, -0.2266,  0.0098],
                      [-0.1239, -0.0592,  0.4088]])),
             ('hidden.bias', tensor([0.3017, 0.1844])),
             ('output.weight', tensor([[0.5228, 0.2847]])),
             ('output.bias', tensor([-0.6877]))])

In [25]:
net1.state_dict().keys()

odict_keys(['hidden.weight', 'hidden.bias', 'output.weight', 'output.bias'])

In [26]:
[x for x in net1.named_parameters()]

[('hidden.weight',
  Parameter containing:
  tensor([[ 0.1791, -0.2266,  0.0098],
          [-0.1239, -0.0592,  0.4088]], requires_grad=True)),
 ('hidden.bias',
  Parameter containing:
  tensor([0.3017, 0.1844], requires_grad=True)),
 ('output.weight',
  Parameter containing:
  tensor([[0.5228, 0.2847]], requires_grad=True)),
 ('output.bias',
  Parameter containing:
  tensor([-0.6877], requires_grad=True))]
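The two outputs above differ in one detail: the named_parameters() entries show requires_grad=True, while the state_dict() entries do not. The sketch below, on a single hypothetical Linear layer, checks this and also shows that the state dict tensors still share memory with the parameters, so a copy is needed for a real snapshot.

```python
import torch

net = torch.nn.Linear(3, 2)
param = dict(net.named_parameters())["weight"]
entry = net.state_dict()["weight"]

print(param.requires_grad)  # True
print(entry.requires_grad)  # False: state_dict() returns detached tensors
shares_memory = entry.data_ptr() == param.data_ptr()
print(shares_memory)        # True: same underlying storage

# Clone the values to get a snapshot that later updates cannot change.
snapshot = {k: v.clone() for k, v in net.state_dict().items()}
with torch.no_grad():
    param.add_(1.0)

entry_follows = torch.equal(entry, param)
snapshot_follows = torch.equal(snapshot["weight"], param)
print(entry_follows, snapshot_follows)  # True False
```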

In [27]:
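# Load the pretrained state dict
# (log_dir was undefined in the original run; the path below is a hypothetical placeholder)
log_dir = './pretrained.pth'
pretrained_dict = torch.load(log_dir)

# Get the current model's state dict
model_state_dict = model.state_dict()

# Keep only the key/value pairs whose keys exist in the current model
pretrained_dict1 = {k: v for k, v in pretrained_dict.items() if k in model_state_dict}

# Update model_state_dict with the filtered entries, then load it into the model
model_state_dict.update(pretrained_dict1)
model.load_state_dict(model_state_dict)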
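An alternative to filtering the keys by hand is to pass strict=False to load_state_dict(), which skips keys that do not match and reports them. The sketch below uses two hypothetical models, `Small` standing in for the pretrained network and `Bigger` for the new one; no checkpoint file is involved. Note that strict=False only tolerates missing or extra keys, not tensors whose shapes differ.

```python
import torch

class Small(torch.nn.Module):  # hypothetical "pretrained" model
    def __init__(self):
        super().__init__()
        self.hidden = torch.nn.Linear(3, 2)

class Bigger(torch.nn.Module):  # hypothetical new model with an extra layer
    def __init__(self):
        super().__init__()
        self.hidden = torch.nn.Linear(3, 2)
        self.output = torch.nn.Linear(2, 1)

pretrained_dict = Small().state_dict()
model = Bigger()

# Load what matches; report what is missing or unexpected instead of failing.
result = model.load_state_dict(pretrained_dict, strict=False)
print(result.missing_keys)     # ['output.weight', 'output.bias']
print(result.unexpected_keys)  # []

copied = torch.equal(model.hidden.weight, pretrained_dict["hidden.weight"])
print(copied)  # True
```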