- 本节我们来讨论PyTorch**如何保存和读取训练好的模型**。
- 另外，在很多场景下我们都会使用多GPU训练。这种情况下，模型会分布于各个GPU上（参加2.3节分布数据式训练，这里暂不考虑分布模型式训练），**模型的保存和读取与单GPU训练情景下是否有所不同**？

----------
- PyTorch的模型结构
- PyTorch模型存储的内容
- 单卡与多卡训练下模型的保存与加载方法
----------
#### 5.4.1 模型存储格式
- PyTorch存储模型主要采用pkl，pt，pth三种格式。“
- 就使用层面来说没有区别，这里不做具体的讨论。
----------
#### 5.4.2 模型存储内容
- 包括两部分：**模型结构**和**模型权重**
    - 其中模型是继承nn.Module的类，权重的数据结构是一个字典（key是层名，value是权重向量）。
    - 存储也由此分为两种形式：**存储整个模型（包括结构和权重）**，或**只存储模型权重**。
    - 对于PyTorch而言，pt, pth和pkl**三种数据格式均支持模型权重和整个模型的存储**，因此使用上没有差别。

In [8]:
from torchvision import models 
import torch
model = models.resnet152(pretrained = True)

save_dir = './data/checkpoint.pkl'
# 保存整个模型
torch.save(model, save_dir)
# 保存模型权重
save_dir = './data/checkpoint1.pkl'
torch.save(model.state_dict, save_dir)

### 单卡和多卡模型存储的区别
- PyTorch中将模型和数据放到GPU上有两种方式——.cuda()和.to(device)，本节后续内容针对前一种方式进行讨论。
- 如果要使用多卡训练的话，需要对模型使用torch.nn.DataParallel。

In [10]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3'
# model = model.cuda()   单卡
model = torch.nn.DataParallel(model).cuda()   #多卡

#### 5.4.4 情况分类讨论
- 由于训练和测试所使用的硬件条件不同，在模型的保存和加载过程中可能因为单GPU和多GPU环境的不同带来模型不匹配等问题。
- 这里对PyTorch框架下单卡/多卡下模型的保存和加载问题进行排列组合（=4），样例模型是torchvision中预训练模型resnet152。
--------
- 单卡保存+单卡加载
    - 在使用os.envision命令指定使用的GPU后，即可进行模型保存读取操作。注意这里即便保存和读取时使用的GPU不同也无妨。

In [11]:
import os
import torch
from torchvision import models

os.environ['CUDA_VISIBLE_DIVICES'] = '0'     #这里替换成希望使用的GPU编号
model = models.resnet152(pretrained = True)
model.cuda()

# 保存+读取整个模型
torch.save(model, save_dir)
loaded_model = torch.load(save_dir)
loaded_model.cuda()

# 保存+读取模型权重
torch.save(model.state_dict(), save_dir)
loaded_dict = torch.load(save_dir)
loaded_model = models.resnet152()    # 注意这里需要对模型结构有定义
loaded_model.state_dict = loaded_dict
loaded_model.cuda()

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

##### 单卡保存+多卡加载
- 这种情况的处理比较简单，
    - 读取单卡保存的模型后，
    - 使用nn.DataParallel函数进行分布式训练设置即可（相当于3.1代码中.cuda()替换一下）：

In [14]:
import os
import torch
from torchvision import models
from torch import nn

os.environ['CUDA_VISIBLE_DEVICES'] = '0'    
model = models.resnet152(pretrained = True)
model.cuda()

# 保存+读取整个模型
torch.save(model, save_dir)

os.environ['CUDA_VISIBLE_DEVICES'] = '1, 2'
loaded_model = torch.load(save_dir)
loaded_model = nn.DataParallel(loaded_model).cuda()

# 保存+读取模型权重
torch.save(model.state_dict(), save_dir)

os.environ['CUDA_VISIBLE_DEVICES'] = '1,2'
loaded_dict = torch.load(save_dir)
loaded_model.state_dict = loaded_dict
loaded_model = nn.DataParallel(loaded_model).cuda()



- 多卡保存+单卡加载
    - 这种情况下的**核心问题**是：**如何去掉权重字典键名中的"module"**，以保证模型的统一性。

In [16]:
import os
import torch
from torchvision import models

os.environ['CUDA_VISIBLE_DEVICES'] = '1,2'

model = models.resnet152(pretrained=True)
model = nn.DataParallel(model).cuda()

# 保存 + 读取整个模型
torch.save(model, save_dir)

os.environ['CUDA_VISIBLE_DEVICES'] = '0' 
loaded_model = torch.load(save_dir)
loaded_model = loaded_model.module

- 对于加载模型权重，有以下几种思路：

    - **去除字典里的module麻烦，往model里添加module简单（推荐）**
        - 这样即便是单卡，也可以开始训练了（相当于分布到单卡上(还使用DataParallel)）
    - 方法2:去除module字段
        - 方法2.2:使用replace去除module字段

In [20]:
import os
import torch
from torchvision import models

os.environ['CUDA_VISIBLE_DEVICES'] = '0, 1,2,3'

model = models.resnet152(pretrained = True)
model = nn.DataParallel(model).cuda()

# 保存模型权重

torch.save(model.state_dict(), save_dir)

# 读取模型权重
os.environ['CUDA_VISIBLE_DEVICES'] = '0'   #单卡读取
loaded_dict = torch.load(save_dir)
loaded_model = models.resnet152()   #这里需要对模型结构有定义
loaded_model = nn.DataParallel(loaded_model).cuda()
loaded_model.state_dict = loaded_dict

In [21]:
# 方法2:
from collections import OrderedDict
os.environ['CUDA_VISIBLE_DEVICES'] = '0'   #这里替换成希望使用的GPU编号

loaded_dict = torch.load(save_dir)

new_state_dict = OrderedDict()
for k, v in loaded_dict.items():
    name = k[7:] # module字段在最前面，从第7个字符开始就可以去掉module
    new_state_dict[name] = v #新字典的key值对应的value一一对应

loaded_model = models.resnet152()   #注意这里需要对模型结构有定义
loaded_model.state_dict = new_state_dict
loaded_model = loaded_model.cuda()

In [22]:
# 方法2.2
loaded_model = models.resnet152()    
loaded_dict = torch.load(save_dir)
loaded_model.load_state_dict({k.replace('module.', ''): v for k, v in loaded_dict.items()})

<All keys matched successfully>

- 由于是模型保存和加载都使用的是多卡，因此不存在模型层名前缀不同的问题。但多卡状态下存在一个device（使用的GPU）匹配的问题，即**保存整个模型**时会同时保存所使用的GPU id等信息，读取时若这些信息和当前使用的GPU信息不符则可能会报错或者程序不按预定状态运行。具体表现为以下两点：
    - **读取整个模型再使用nn.DataParallel进行分布式训练设置**
        - 这种情况很可能会造成保存的整个模型中GPU id和读取环境下设置的GPU id不符，训练时数据所在device和模型所在device不一致而报错。
    - **读取整个模型而不使用nn.DataParallel进行分布式训练设置**
        - 这种情况可能不会报错，测试中发现程序会自动使用设备的前n个GPU进行训练（n是保存的模型使用的GPU个数）。此时如果指定的GPU个数少于n，则会报错。在这种情况下，只有保存模型时环境的device id和读取模型时环境的device id一致，程序才会按照预期在指定的GPU上进行分布式训练。
--------
- 相比之下，读取模型权重，之后再使用nn.DataParallel进行分布式训练设置则没有问题。因此**多卡模式下建议使用权重的方式存储和读取模型**：

In [25]:
import os
import torch
from torchvision import models

os.environ['CUDA_VISIBLE_DEVICES'] = '0, 1, 2'

model = models.resnet152(pretrained=True)
model = nn.DataParallel(model).cuda()

# 保存+读取模型权重，强烈推荐这第三种方式
torch.save(model.state_dict(), save_dir)
loaded_dict = torch.load(save_dir)
loaded_model = models.resnet152()     #这里需要对模型结构有定义
loaded_model = nn.DataParallel(loaded_model).cuda()
loaded_model.state_dict = loaded_dict


In [26]:
# 如果只有保存的整个模型，也可以采用提取权重的方式构建新的模型
# 读取整个模型
loaded_whole_model = torch.load(save_dir)
loaded_model = models.resnet152()    
loaded_model.state_dict = loaded_whole_model.state_dict
loaded_model = nn.DataParallel(loaded_model).cuda()

AttributeError: 'collections.OrderedDict' object has no attribute 'state_dict'

In [27]:
loaded_model.load_state_dict(loaded_dict)

RuntimeError: Error(s) in loading state_dict for ResNet:
	Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", "bn1.running_var", "layer1.0.conv1.weight", "layer1.0.bn1.weight", "layer1.0.bn1.bias", "layer1.0.bn1.running_mean", "layer1.0.bn1.running_var", "layer1.0.conv2.weight", "layer1.0.bn2.weight", "layer1.0.bn2.bias", "layer1.0.bn2.running_mean", "layer1.0.bn2.running_var", "layer1.0.conv3.weight", "layer1.0.bn3.weight", "layer1.0.bn3.bias", "layer1.0.bn3.running_mean", "layer1.0.bn3.running_var", "layer1.0.downsample.0.weight", "layer1.0.downsample.1.weight", "layer1.0.downsample.1.bias", "layer1.0.downsample.1.running_mean", "layer1.0.downsample.1.running_var", "layer1.1.conv1.weight", "layer1.1.bn1.weight", "layer1.1.bn1.bias", "layer1.1.bn1.running_mean", "layer1.1.bn1.running_var", "layer1.1.conv2.weight", "layer1.1.bn2.weight", "layer1.1.bn2.bias", "layer1.1.bn2.running_mean", "layer1.1.bn2.running_var", "layer1.1.conv3.weight", "layer1.1.bn3.weight", "layer1.1.bn3.bias", "layer1.1.bn3.running_mean", "layer1.1.bn3.running_var", "layer1.2.conv1.weight", "layer1.2.bn1.weight", "layer1.2.bn1.bias", "layer1.2.bn1.running_mean", "layer1.2.bn1.running_var", "layer1.2.conv2.weight", "layer1.2.bn2.weight", "layer1.2.bn2.bias", "layer1.2.bn2.running_mean", "layer1.2.bn2.running_var", "layer1.2.conv3.weight", "layer1.2.bn3.weight", "layer1.2.bn3.bias", "layer1.2.bn3.running_mean", "layer1.2.bn3.running_var", "layer2.0.conv1.weight", "layer2.0.bn1.weight", "layer2.0.bn1.bias", "layer2.0.bn1.running_mean", "layer2.0.bn1.running_var", "layer2.0.conv2.weight", "layer2.0.bn2.weight", "layer2.0.bn2.bias", "layer2.0.bn2.running_mean", "layer2.0.bn2.running_var", "layer2.0.conv3.weight", "layer2.0.bn3.weight", "layer2.0.bn3.bias", "layer2.0.bn3.running_mean", "layer2.0.bn3.running_var", "layer2.0.downsample.0.weight", "layer2.0.downsample.1.weight", "layer2.0.downsample.1.bias", "layer2.0.downsample.1.running_mean", "layer2.0.downsample.1.running_var", "layer2.1.conv1.weight", "layer2.1.bn1.weight", "layer2.1.bn1.bias", "layer2.1.bn1.running_mean", "layer2.1.bn1.running_var", "layer2.1.conv2.weight", "layer2.1.bn2.weight", "layer2.1.bn2.bias", "layer2.1.bn2.running_mean", "layer2.1.bn2.running_var", "layer2.1.conv3.weight", "layer2.1.bn3.weight", "layer2.1.bn3.bias", "layer2.1.bn3.running_mean", "layer2.1.bn3.running_var", "layer2.2.conv1.weight", "layer2.2.bn1.weight", "layer2.2.bn1.bias", "layer2.2.bn1.running_mean", "layer2.2.bn1.running_var", "layer2.2.conv2.weight", "layer2.2.bn2.weight", "layer2.2.bn2.bias", "layer2.2.bn2.running_mean", "layer2.2.bn2.running_var", "layer2.2.conv3.weight", "layer2.2.bn3.weight", "layer2.2.bn3.bias", "layer2.2.bn3.running_mean", "layer2.2.bn3.running_var", "layer2.3.conv1.weight", "layer2.3.bn1.weight", "layer2.3.bn1.bias", "layer2.3.bn1.running_mean", "layer2.3.bn1.running_var", "layer2.3.conv2.weight", "layer2.3.bn2.weight", "layer2.3.bn2.bias", "layer2.3.bn2.running_mean", "layer2.3.bn2.running_var", "layer2.3.conv3.weight", "layer2.3.bn3.weight", "layer2.3.bn3.bias", "layer2.3.bn3.running_mean", "layer2.3.bn3.running_var", "layer2.4.conv1.weight", "layer2.4.bn1.weight", "layer2.4.bn1.bias", "layer2.4.bn1.running_mean", "layer2.4.bn1.running_var", "layer2.4.conv2.weight", "layer2.4.bn2.weight", "layer2.4.bn2.bias", "layer2.4.bn2.running_mean", "layer2.4.bn2.running_var", "layer2.4.conv3.weight", "layer2.4.bn3.weight", "layer2.4.bn3.bias", "layer2.4.bn3.running_mean", "layer2.4.bn3.running_var", "layer2.5.conv1.weight", "layer2.5.bn1.weight", "layer2.5.bn1.bias", "layer2.5.bn1.running_mean", "layer2.5.bn1.running_var", "layer2.5.conv2.weight", "layer2.5.bn2.weight", "layer2.5.bn2.bias", "layer2.5.bn2.running_mean", "layer2.5.bn2.running_var", "layer2.5.conv3.weight", "layer2.5.bn3.weight", "layer2.5.bn3.bias", "layer2.5.bn3.running_mean", "layer2.5.bn3.running_var", "layer2.6.conv1.weight", "layer2.6.bn1.weight", "layer2.6.bn1.bias", "layer2.6.bn1.running_mean", "layer2.6.bn1.running_var", "layer2.6.conv2.weight", "layer2.6.bn2.weight", "layer2.6.bn2.bias", "layer2.6.bn2.running_mean", "layer2.6.bn2.running_var", "layer2.6.conv3.weight", "layer2.6.bn3.weight", "layer2.6.bn3.bias", "layer2.6.bn3.running_mean", "layer2.6.bn3.running_var", "layer2.7.conv1.weight", "layer2.7.bn1.weight", "layer2.7.bn1.bias", "layer2.7.bn1.running_mean", "layer2.7.bn1.running_var", "layer2.7.conv2.weight", "layer2.7.bn2.weight", "layer2.7.bn2.bias", "layer2.7.bn2.running_mean", "layer2.7.bn2.running_var", "layer2.7.conv3.weight", "layer2.7.bn3.weight", "layer2.7.bn3.bias", "layer2.7.bn3.running_mean", "layer2.7.bn3.running_var", "layer3.0.conv1.weight", "layer3.0.bn1.weight", "layer3.0.bn1.bias", "layer3.0.bn1.running_mean", "layer3.0.bn1.running_var", "layer3.0.conv2.weight", "layer3.0.bn2.weight", "layer3.0.bn2.bias", "layer3.0.bn2.running_mean", "layer3.0.bn2.running_var", "layer3.0.conv3.weight", "layer3.0.bn3.weight", "layer3.0.bn3.bias", "layer3.0.bn3.running_mean", "layer3.0.bn3.running_var", "layer3.0.downsample.0.weight", "layer3.0.downsample.1.weight", "layer3.0.downsample.1.bias", "layer3.0.downsample.1.running_mean", "layer3.0.downsample.1.running_var", "layer3.1.conv1.weight", "layer3.1.bn1.weight", "layer3.1.bn1.bias", "layer3.1.bn1.running_mean", "layer3.1.bn1.running_var", "layer3.1.conv2.weight", "layer3.1.bn2.weight", "layer3.1.bn2.bias", "layer3.1.bn2.running_mean", "layer3.1.bn2.running_var", "layer3.1.conv3.weight", "layer3.1.bn3.weight", "layer3.1.bn3.bias", "layer3.1.bn3.running_mean", "layer3.1.bn3.running_var", "layer3.2.conv1.weight", "layer3.2.bn1.weight", "layer3.2.bn1.bias", "layer3.2.bn1.running_mean", "layer3.2.bn1.running_var", "layer3.2.conv2.weight", "layer3.2.bn2.weight", "layer3.2.bn2.bias", "layer3.2.bn2.running_mean", "layer3.2.bn2.running_var", "layer3.2.conv3.weight", "layer3.2.bn3.weight", "layer3.2.bn3.bias", "layer3.2.bn3.running_mean", "layer3.2.bn3.running_var", "layer3.3.conv1.weight", "layer3.3.bn1.weight", "layer3.3.bn1.bias", "layer3.3.bn1.running_mean", "layer3.3.bn1.running_var", "layer3.3.conv2.weight", "layer3.3.bn2.weight", "layer3.3.bn2.bias", "layer3.3.bn2.running_mean", "layer3.3.bn2.running_var", "layer3.3.conv3.weight", "layer3.3.bn3.weight", "layer3.3.bn3.bias", "layer3.3.bn3.running_mean", "layer3.3.bn3.running_var", "layer3.4.conv1.weight", "layer3.4.bn1.weight", "layer3.4.bn1.bias", "layer3.4.bn1.running_mean", "layer3.4.bn1.running_var", "layer3.4.conv2.weight", "layer3.4.bn2.weight", "layer3.4.bn2.bias", "layer3.4.bn2.running_mean", "layer3.4.bn2.running_var", "layer3.4.conv3.weight", "layer3.4.bn3.weight", "layer3.4.bn3.bias", "layer3.4.bn3.running_mean", "layer3.4.bn3.running_var", "layer3.5.conv1.weight", "layer3.5.bn1.weight", "layer3.5.bn1.bias", "layer3.5.bn1.running_mean", "layer3.5.bn1.running_var", "layer3.5.conv2.weight", "layer3.5.bn2.weight", "layer3.5.bn2.bias", "layer3.5.bn2.running_mean", "layer3.5.bn2.running_var", "layer3.5.conv3.weight", "layer3.5.bn3.weight", "layer3.5.bn3.bias", "layer3.5.bn3.running_mean", "layer3.5.bn3.running_var", "layer3.6.conv1.weight", "layer3.6.bn1.weight", "layer3.6.bn1.bias", "layer3.6.bn1.running_mean", "layer3.6.bn1.running_var", "layer3.6.conv2.weight", "layer3.6.bn2.weight", "layer3.6.bn2.bias", "layer3.6.bn2.running_mean", "layer3.6.bn2.running_var", "layer3.6.conv3.weight", "layer3.6.bn3.weight", "layer3.6.bn3.bias", "layer3.6.bn3.running_mean", "layer3.6.bn3.running_var", "layer3.7.conv1.weight", "layer3.7.bn1.weight", "layer3.7.bn1.bias", "layer3.7.bn1.running_mean", "layer3.7.bn1.running_var", "layer3.7.conv2.weight", "layer3.7.bn2.weight", "layer3.7.bn2.bias", "layer3.7.bn2.running_mean", "layer3.7.bn2.running_var", "layer3.7.conv3.weight", "layer3.7.bn3.weight", "layer3.7.bn3.bias", "layer3.7.bn3.running_mean", "layer3.7.bn3.running_var", "layer3.8.conv1.weight", "layer3.8.bn1.weight", "layer3.8.bn1.bias", "layer3.8.bn1.running_mean", "layer3.8.bn1.running_var", "layer3.8.conv2.weight", "layer3.8.bn2.weight", "layer3.8.bn2.bias", "layer3.8.bn2.running_mean", "layer3.8.bn2.running_var", "layer3.8.conv3.weight", "layer3.8.bn3.weight", "layer3.8.bn3.bias", "layer3.8.bn3.running_mean", "layer3.8.bn3.running_var", "layer3.9.conv1.weight", "layer3.9.bn1.weight", "layer3.9.bn1.bias", "layer3.9.bn1.running_mean", "layer3.9.bn1.running_var", "layer3.9.conv2.weight", "layer3.9.bn2.weight", "layer3.9.bn2.bias", "layer3.9.bn2.running_mean", "layer3.9.bn2.running_var", "layer3.9.conv3.weight", "layer3.9.bn3.weight", "layer3.9.bn3.bias", "layer3.9.bn3.running_mean", "layer3.9.bn3.running_var", "layer3.10.conv1.weight", "layer3.10.bn1.weight", "layer3.10.bn1.bias", "layer3.10.bn1.running_mean", "layer3.10.bn1.running_var", "layer3.10.conv2.weight", "layer3.10.bn2.weight", "layer3.10.bn2.bias", "layer3.10.bn2.running_mean", "layer3.10.bn2.running_var", "layer3.10.conv3.weight", "layer3.10.bn3.weight", "layer3.10.bn3.bias", "layer3.10.bn3.running_mean", "layer3.10.bn3.running_var", "layer3.11.conv1.weight", "layer3.11.bn1.weight", "layer3.11.bn1.bias", "layer3.11.bn1.running_mean", "layer3.11.bn1.running_var", "layer3.11.conv2.weight", "layer3.11.bn2.weight", "layer3.11.bn2.bias", "layer3.11.bn2.running_mean", "layer3.11.bn2.running_var", "layer3.11.conv3.weight", "layer3.11.bn3.weight", "layer3.11.bn3.bias", "layer3.11.bn3.running_mean", "layer3.11.bn3.running_var", "layer3.12.conv1.weight", "layer3.12.bn1.weight", "layer3.12.bn1.bias", "layer3.12.bn1.running_mean", "layer3.12.bn1.running_var", "layer3.12.conv2.weight", "layer3.12.bn2.weight", "layer3.12.bn2.bias", "layer3.12.bn2.running_mean", "layer3.12.bn2.running_var", "layer3.12.conv3.weight", "layer3.12.bn3.weight", "layer3.12.bn3.bias", "layer3.12.bn3.running_mean", "layer3.12.bn3.running_var", "layer3.13.conv1.weight", "layer3.13.bn1.weight", "layer3.13.bn1.bias", "layer3.13.bn1.running_mean", "layer3.13.bn1.running_var", "layer3.13.conv2.weight", "layer3.13.bn2.weight", "layer3.13.bn2.bias", "layer3.13.bn2.running_mean", "layer3.13.bn2.running_var", "layer3.13.conv3.weight", "layer3.13.bn3.weight", "layer3.13.bn3.bias", "layer3.13.bn3.running_mean", "layer3.13.bn3.running_var", "layer3.14.conv1.weight", "layer3.14.bn1.weight", "layer3.14.bn1.bias", "layer3.14.bn1.running_mean", "layer3.14.bn1.running_var", "layer3.14.conv2.weight", "layer3.14.bn2.weight", "layer3.14.bn2.bias", "layer3.14.bn2.running_mean", "layer3.14.bn2.running_var", "layer3.14.conv3.weight", "layer3.14.bn3.weight", "layer3.14.bn3.bias", "layer3.14.bn3.running_mean", "layer3.14.bn3.running_var", "layer3.15.conv1.weight", "layer3.15.bn1.weight", "layer3.15.bn1.bias", "layer3.15.bn1.running_mean", "layer3.15.bn1.running_var", "layer3.15.conv2.weight", "layer3.15.bn2.weight", "layer3.15.bn2.bias", "layer3.15.bn2.running_mean", "layer3.15.bn2.running_var", "layer3.15.conv3.weight", "layer3.15.bn3.weight", "layer3.15.bn3.bias", "layer3.15.bn3.running_mean", "layer3.15.bn3.running_var", "layer3.16.conv1.weight", "layer3.16.bn1.weight", "layer3.16.bn1.bias", "layer3.16.bn1.running_mean", "layer3.16.bn1.running_var", "layer3.16.conv2.weight", "layer3.16.bn2.weight", "layer3.16.bn2.bias", "layer3.16.bn2.running_mean", "layer3.16.bn2.running_var", "layer3.16.conv3.weight", "layer3.16.bn3.weight", "layer3.16.bn3.bias", "layer3.16.bn3.running_mean", "layer3.16.bn3.running_var", "layer3.17.conv1.weight", "layer3.17.bn1.weight", "layer3.17.bn1.bias", "layer3.17.bn1.running_mean", "layer3.17.bn1.running_var", "layer3.17.conv2.weight", "layer3.17.bn2.weight", "layer3.17.bn2.bias", "layer3.17.bn2.running_mean", "layer3.17.bn2.running_var", "layer3.17.conv3.weight", "layer3.17.bn3.weight", "layer3.17.bn3.bias", "layer3.17.bn3.running_mean", "layer3.17.bn3.running_var", "layer3.18.conv1.weight", "layer3.18.bn1.weight", "layer3.18.bn1.bias", "layer3.18.bn1.running_mean", "layer3.18.bn1.running_var", "layer3.18.conv2.weight", "layer3.18.bn2.weight", "layer3.18.bn2.bias", "layer3.18.bn2.running_mean", "layer3.18.bn2.running_var", "layer3.18.conv3.weight", "layer3.18.bn3.weight", "layer3.18.bn3.bias", "layer3.18.bn3.running_mean", "layer3.18.bn3.running_var", "layer3.19.conv1.weight", "layer3.19.bn1.weight", "layer3.19.bn1.bias", "layer3.19.bn1.running_mean", "layer3.19.bn1.running_var", "layer3.19.conv2.weight", "layer3.19.bn2.weight", "layer3.19.bn2.bias", "layer3.19.bn2.running_mean", "layer3.19.bn2.running_var", "layer3.19.conv3.weight", "layer3.19.bn3.weight", "layer3.19.bn3.bias", "layer3.19.bn3.running_mean", "layer3.19.bn3.running_var", "layer3.20.conv1.weight", "layer3.20.bn1.weight", "layer3.20.bn1.bias", "layer3.20.bn1.running_mean", "layer3.20.bn1.running_var", "layer3.20.conv2.weight", "layer3.20.bn2.weight", "layer3.20.bn2.bias", "layer3.20.bn2.running_mean", "layer3.20.bn2.running_var", "layer3.20.conv3.weight", "layer3.20.bn3.weight", "layer3.20.bn3.bias", "layer3.20.bn3.running_mean", "layer3.20.bn3.running_var", "layer3.21.conv1.weight", "layer3.21.bn1.weight", "layer3.21.bn1.bias", "layer3.21.bn1.running_mean", "layer3.21.bn1.running_var", "layer3.21.conv2.weight", "layer3.21.bn2.weight", "layer3.21.bn2.bias", "layer3.21.bn2.running_mean", "layer3.21.bn2.running_var", "layer3.21.conv3.weight", "layer3.21.bn3.weight", "layer3.21.bn3.bias", "layer3.21.bn3.running_mean", "layer3.21.bn3.running_var", "layer3.22.conv1.weight", "layer3.22.bn1.weight", "layer3.22.bn1.bias", "layer3.22.bn1.running_mean", "layer3.22.bn1.running_var", "layer3.22.conv2.weight", "layer3.22.bn2.weight", "layer3.22.bn2.bias", "layer3.22.bn2.running_mean", "layer3.22.bn2.running_var", "layer3.22.conv3.weight", "layer3.22.bn3.weight", "layer3.22.bn3.bias", "layer3.22.bn3.running_mean", "layer3.22.bn3.running_var", "layer3.23.conv1.weight", "layer3.23.bn1.weight", "layer3.23.bn1.bias", "layer3.23.bn1.running_mean", "layer3.23.bn1.running_var", "layer3.23.conv2.weight", "layer3.23.bn2.weight", "layer3.23.bn2.bias", "layer3.23.bn2.running_mean", "layer3.23.bn2.running_var", "layer3.23.conv3.weight", "layer3.23.bn3.weight", "layer3.23.bn3.bias", "layer3.23.bn3.running_mean", "layer3.23.bn3.running_var", "layer3.24.conv1.weight", "layer3.24.bn1.weight", "layer3.24.bn1.bias", "layer3.24.bn1.running_mean", "layer3.24.bn1.running_var", "layer3.24.conv2.weight", "layer3.24.bn2.weight", "layer3.24.bn2.bias", "layer3.24.bn2.running_mean", "layer3.24.bn2.running_var", "layer3.24.conv3.weight", "layer3.24.bn3.weight", "layer3.24.bn3.bias", "layer3.24.bn3.running_mean", "layer3.24.bn3.running_var", "layer3.25.conv1.weight", "layer3.25.bn1.weight", "layer3.25.bn1.bias", "layer3.25.bn1.running_mean", "layer3.25.bn1.running_var", "layer3.25.conv2.weight", "layer3.25.bn2.weight", "layer3.25.bn2.bias", "layer3.25.bn2.running_mean", "layer3.25.bn2.running_var", "layer3.25.conv3.weight", "layer3.25.bn3.weight", "layer3.25.bn3.bias", "layer3.25.bn3.running_mean", "layer3.25.bn3.running_var", "layer3.26.conv1.weight", "layer3.26.bn1.weight", "layer3.26.bn1.bias", "layer3.26.bn1.running_mean", "layer3.26.bn1.running_var", "layer3.26.conv2.weight", "layer3.26.bn2.weight", "layer3.26.bn2.bias", "layer3.26.bn2.running_mean", "layer3.26.bn2.running_var", "layer3.26.conv3.weight", "layer3.26.bn3.weight", "layer3.26.bn3.bias", "layer3.26.bn3.running_mean", "layer3.26.bn3.running_var", "layer3.27.conv1.weight", "layer3.27.bn1.weight", "layer3.27.bn1.bias", "layer3.27.bn1.running_mean", "layer3.27.bn1.running_var", "layer3.27.conv2.weight", "layer3.27.bn2.weight", "layer3.27.bn2.bias", "layer3.27.bn2.running_mean", "layer3.27.bn2.running_var", "layer3.27.conv3.weight", "layer3.27.bn3.weight", "layer3.27.bn3.bias", "layer3.27.bn3.running_mean", "layer3.27.bn3.running_var", "layer3.28.conv1.weight", "layer3.28.bn1.weight", "layer3.28.bn1.bias", "layer3.28.bn1.running_mean", "layer3.28.bn1.running_var", "layer3.28.conv2.weight", "layer3.28.bn2.weight", "layer3.28.bn2.bias", "layer3.28.bn2.running_mean", "layer3.28.bn2.running_var", "layer3.28.conv3.weight", "layer3.28.bn3.weight", "layer3.28.bn3.bias", "layer3.28.bn3.running_mean", "layer3.28.bn3.running_var", "layer3.29.conv1.weight", "layer3.29.bn1.weight", "layer3.29.bn1.bias", "layer3.29.bn1.running_mean", "layer3.29.bn1.running_var", "layer3.29.conv2.weight", "layer3.29.bn2.weight", "layer3.29.bn2.bias", "layer3.29.bn2.running_mean", "layer3.29.bn2.running_var", "layer3.29.conv3.weight", "layer3.29.bn3.weight", "layer3.29.bn3.bias", "layer3.29.bn3.running_mean", "layer3.29.bn3.running_var", "layer3.30.conv1.weight", "layer3.30.bn1.weight", "layer3.30.bn1.bias", "layer3.30.bn1.running_mean", "layer3.30.bn1.running_var", "layer3.30.conv2.weight", "layer3.30.bn2.weight", "layer3.30.bn2.bias", "layer3.30.bn2.running_mean", "layer3.30.bn2.running_var", "layer3.30.conv3.weight", "layer3.30.bn3.weight", "layer3.30.bn3.bias", "layer3.30.bn3.running_mean", "layer3.30.bn3.running_var", "layer3.31.conv1.weight", "layer3.31.bn1.weight", "layer3.31.bn1.bias", "layer3.31.bn1.running_mean", "layer3.31.bn1.running_var", "layer3.31.conv2.weight", "layer3.31.bn2.weight", "layer3.31.bn2.bias", "layer3.31.bn2.running_mean", "layer3.31.bn2.running_var", "layer3.31.conv3.weight", "layer3.31.bn3.weight", "layer3.31.bn3.bias", "layer3.31.bn3.running_mean", "layer3.31.bn3.running_var", "layer3.32.conv1.weight", "layer3.32.bn1.weight", "layer3.32.bn1.bias", "layer3.32.bn1.running_mean", "layer3.32.bn1.running_var", "layer3.32.conv2.weight", "layer3.32.bn2.weight", "layer3.32.bn2.bias", "layer3.32.bn2.running_mean", "layer3.32.bn2.running_var", "layer3.32.conv3.weight", "layer3.32.bn3.weight", "layer3.32.bn3.bias", "layer3.32.bn3.running_mean", "layer3.32.bn3.running_var", "layer3.33.conv1.weight", "layer3.33.bn1.weight", "layer3.33.bn1.bias", "layer3.33.bn1.running_mean", "layer3.33.bn1.running_var", "layer3.33.conv2.weight", "layer3.33.bn2.weight", "layer3.33.bn2.bias", "layer3.33.bn2.running_mean", "layer3.33.bn2.running_var", "layer3.33.conv3.weight", "layer3.33.bn3.weight", "layer3.33.bn3.bias", "layer3.33.bn3.running_mean", "layer3.33.bn3.running_var", "layer3.34.conv1.weight", "layer3.34.bn1.weight", "layer3.34.bn1.bias", "layer3.34.bn1.running_mean", "layer3.34.bn1.running_var", "layer3.34.conv2.weight", "layer3.34.bn2.weight", "layer3.34.bn2.bias", "layer3.34.bn2.running_mean", "layer3.34.bn2.running_var", "layer3.34.conv3.weight", "layer3.34.bn3.weight", "layer3.34.bn3.bias", "layer3.34.bn3.running_mean", "layer3.34.bn3.running_var", "layer3.35.conv1.weight", "layer3.35.bn1.weight", "layer3.35.bn1.bias", "layer3.35.bn1.running_mean", "layer3.35.bn1.running_var", "layer3.35.conv2.weight", "layer3.35.bn2.weight", "layer3.35.bn2.bias", "layer3.35.bn2.running_mean", "layer3.35.bn2.running_var", "layer3.35.conv3.weight", "layer3.35.bn3.weight", "layer3.35.bn3.bias", "layer3.35.bn3.running_mean", "layer3.35.bn3.running_var", "layer4.0.conv1.weight", "layer4.0.bn1.weight", "layer4.0.bn1.bias", "layer4.0.bn1.running_mean", "layer4.0.bn1.running_var", "layer4.0.conv2.weight", "layer4.0.bn2.weight", "layer4.0.bn2.bias", "layer4.0.bn2.running_mean", "layer4.0.bn2.running_var", "layer4.0.conv3.weight", "layer4.0.bn3.weight", "layer4.0.bn3.bias", "layer4.0.bn3.running_mean", "layer4.0.bn3.running_var", "layer4.0.downsample.0.weight", "layer4.0.downsample.1.weight", "layer4.0.downsample.1.bias", "layer4.0.downsample.1.running_mean", "layer4.0.downsample.1.running_var", "layer4.1.conv1.weight", "layer4.1.bn1.weight", "layer4.1.bn1.bias", "layer4.1.bn1.running_mean", "layer4.1.bn1.running_var", "layer4.1.conv2.weight", "layer4.1.bn2.weight", "layer4.1.bn2.bias", "layer4.1.bn2.running_mean", "layer4.1.bn2.running_var", "layer4.1.conv3.weight", "layer4.1.bn3.weight", "layer4.1.bn3.bias", "layer4.1.bn3.running_mean", "layer4.1.bn3.running_var", "layer4.2.conv1.weight", "layer4.2.bn1.weight", "layer4.2.bn1.bias", "layer4.2.bn1.running_mean", "layer4.2.bn1.running_var", "layer4.2.conv2.weight", "layer4.2.bn2.weight", "layer4.2.bn2.bias", "layer4.2.bn2.running_mean", "layer4.2.bn2.running_var", "layer4.2.conv3.weight", "layer4.2.bn3.weight", "layer4.2.bn3.bias", "layer4.2.bn3.running_mean", "layer4.2.bn3.running_var", "fc.weight", "fc.bias". 
	Unexpected key(s) in state_dict: "module.conv1.weight", "module.bn1.weight", "module.bn1.bias", "module.bn1.running_mean", "module.bn1.running_var", "module.bn1.num_batches_tracked", "module.layer1.0.conv1.weight", "module.layer1.0.bn1.weight", "module.layer1.0.bn1.bias", "module.layer1.0.bn1.running_mean", "module.layer1.0.bn1.running_var", "module.layer1.0.bn1.num_batches_tracked", "module.layer1.0.conv2.weight", "module.layer1.0.bn2.weight", "module.layer1.0.bn2.bias", "module.layer1.0.bn2.running_mean", "module.layer1.0.bn2.running_var", "module.layer1.0.bn2.num_batches_tracked", "module.layer1.0.conv3.weight", "module.layer1.0.bn3.weight", "module.layer1.0.bn3.bias", "module.layer1.0.bn3.running_mean", "module.layer1.0.bn3.running_var", "module.layer1.0.bn3.num_batches_tracked", "module.layer1.0.downsample.0.weight", "module.layer1.0.downsample.1.weight", "module.layer1.0.downsample.1.bias", "module.layer1.0.downsample.1.running_mean", "module.layer1.0.downsample.1.running_var", "module.layer1.0.downsample.1.num_batches_tracked", "module.layer1.1.conv1.weight", "module.layer1.1.bn1.weight", "module.layer1.1.bn1.bias", "module.layer1.1.bn1.running_mean", "module.layer1.1.bn1.running_var", "module.layer1.1.bn1.num_batches_tracked", "module.layer1.1.conv2.weight", "module.layer1.1.bn2.weight", "module.layer1.1.bn2.bias", "module.layer1.1.bn2.running_mean", "module.layer1.1.bn2.running_var", "module.layer1.1.bn2.num_batches_tracked", "module.layer1.1.conv3.weight", "module.layer1.1.bn3.weight", "module.layer1.1.bn3.bias", "module.layer1.1.bn3.running_mean", "module.layer1.1.bn3.running_var", "module.layer1.1.bn3.num_batches_tracked", "module.layer1.2.conv1.weight", "module.layer1.2.bn1.weight", "module.layer1.2.bn1.bias", "module.layer1.2.bn1.running_mean", "module.layer1.2.bn1.running_var", "module.layer1.2.bn1.num_batches_tracked", "module.layer1.2.conv2.weight", "module.layer1.2.bn2.weight", "module.layer1.2.bn2.bias", "module.layer1.2.bn2.running_mean", "module.layer1.2.bn2.running_var", "module.layer1.2.bn2.num_batches_tracked", "module.layer1.2.conv3.weight", "module.layer1.2.bn3.weight", "module.layer1.2.bn3.bias", "module.layer1.2.bn3.running_mean", "module.layer1.2.bn3.running_var", "module.layer1.2.bn3.num_batches_tracked", "module.layer2.0.conv1.weight", "module.layer2.0.bn1.weight", "module.layer2.0.bn1.bias", "module.layer2.0.bn1.running_mean", "module.layer2.0.bn1.running_var", "module.layer2.0.bn1.num_batches_tracked", "module.layer2.0.conv2.weight", "module.layer2.0.bn2.weight", "module.layer2.0.bn2.bias", "module.layer2.0.bn2.running_mean", "module.layer2.0.bn2.running_var", "module.layer2.0.bn2.num_batches_tracked", "module.layer2.0.conv3.weight", "module.layer2.0.bn3.weight", "module.layer2.0.bn3.bias", "module.layer2.0.bn3.running_mean", "module.layer2.0.bn3.running_var", "module.layer2.0.bn3.num_batches_tracked", "module.layer2.0.downsample.0.weight", "module.layer2.0.downsample.1.weight", "module.layer2.0.downsample.1.bias", "module.layer2.0.downsample.1.running_mean", "module.layer2.0.downsample.1.running_var", "module.layer2.0.downsample.1.num_batches_tracked", "module.layer2.1.conv1.weight", "module.layer2.1.bn1.weight", "module.layer2.1.bn1.bias", "module.layer2.1.bn1.running_mean", "module.layer2.1.bn1.running_var", "module.layer2.1.bn1.num_batches_tracked", "module.layer2.1.conv2.weight", "module.layer2.1.bn2.weight", "module.layer2.1.bn2.bias", "module.layer2.1.bn2.running_mean", "module.layer2.1.bn2.running_var", "module.layer2.1.bn2.num_batches_tracked", "module.layer2.1.conv3.weight", "module.layer2.1.bn3.weight", "module.layer2.1.bn3.bias", "module.layer2.1.bn3.running_mean", "module.layer2.1.bn3.running_var", "module.layer2.1.bn3.num_batches_tracked", "module.layer2.2.conv1.weight", "module.layer2.2.bn1.weight", "module.layer2.2.bn1.bias", "module.layer2.2.bn1.running_mean", "module.layer2.2.bn1.running_var", "module.layer2.2.bn1.num_batches_tracked", "module.layer2.2.conv2.weight", "module.layer2.2.bn2.weight", "module.layer2.2.bn2.bias", "module.layer2.2.bn2.running_mean", "module.layer2.2.bn2.running_var", "module.layer2.2.bn2.num_batches_tracked", "module.layer2.2.conv3.weight", "module.layer2.2.bn3.weight", "module.layer2.2.bn3.bias", "module.layer2.2.bn3.running_mean", "module.layer2.2.bn3.running_var", "module.layer2.2.bn3.num_batches_tracked", "module.layer2.3.conv1.weight", "module.layer2.3.bn1.weight", "module.layer2.3.bn1.bias", "module.layer2.3.bn1.running_mean", "module.layer2.3.bn1.running_var", "module.layer2.3.bn1.num_batches_tracked", "module.layer2.3.conv2.weight", "module.layer2.3.bn2.weight", "module.layer2.3.bn2.bias", "module.layer2.3.bn2.running_mean", "module.layer2.3.bn2.running_var", "module.layer2.3.bn2.num_batches_tracked", "module.layer2.3.conv3.weight", "module.layer2.3.bn3.weight", "module.layer2.3.bn3.bias", "module.layer2.3.bn3.running_mean", "module.layer2.3.bn3.running_var", "module.layer2.3.bn3.num_batches_tracked", "module.layer2.4.conv1.weight", "module.layer2.4.bn1.weight", "module.layer2.4.bn1.bias", "module.layer2.4.bn1.running_mean", "module.layer2.4.bn1.running_var", "module.layer2.4.bn1.num_batches_tracked", "module.layer2.4.conv2.weight", "module.layer2.4.bn2.weight", "module.layer2.4.bn2.bias", "module.layer2.4.bn2.running_mean", "module.layer2.4.bn2.running_var", "module.layer2.4.bn2.num_batches_tracked", "module.layer2.4.conv3.weight", "module.layer2.4.bn3.weight", "module.layer2.4.bn3.bias", "module.layer2.4.bn3.running_mean", "module.layer2.4.bn3.running_var", "module.layer2.4.bn3.num_batches_tracked", "module.layer2.5.conv1.weight", "module.layer2.5.bn1.weight", "module.layer2.5.bn1.bias", "module.layer2.5.bn1.running_mean", "module.layer2.5.bn1.running_var", "module.layer2.5.bn1.num_batches_tracked", "module.layer2.5.conv2.weight", "module.layer2.5.bn2.weight", "module.layer2.5.bn2.bias", "module.layer2.5.bn2.running_mean", "module.layer2.5.bn2.running_var", "module.layer2.5.bn2.num_batches_tracked", "module.layer2.5.conv3.weight", "module.layer2.5.bn3.weight", "module.layer2.5.bn3.bias", "module.layer2.5.bn3.running_mean", "module.layer2.5.bn3.running_var", "module.layer2.5.bn3.num_batches_tracked", "module.layer2.6.conv1.weight", "module.layer2.6.bn1.weight", "module.layer2.6.bn1.bias", "module.layer2.6.bn1.running_mean", "module.layer2.6.bn1.running_var", "module.layer2.6.bn1.num_batches_tracked", "module.layer2.6.conv2.weight", "module.layer2.6.bn2.weight", "module.layer2.6.bn2.bias", "module.layer2.6.bn2.running_mean", "module.layer2.6.bn2.running_var", "module.layer2.6.bn2.num_batches_tracked", "module.layer2.6.conv3.weight", "module.layer2.6.bn3.weight", "module.layer2.6.bn3.bias", "module.layer2.6.bn3.running_mean", "module.layer2.6.bn3.running_var", "module.layer2.6.bn3.num_batches_tracked", "module.layer2.7.conv1.weight", "module.layer2.7.bn1.weight", "module.layer2.7.bn1.bias", "module.layer2.7.bn1.running_mean", "module.layer2.7.bn1.running_var", "module.layer2.7.bn1.num_batches_tracked", "module.layer2.7.conv2.weight", "module.layer2.7.bn2.weight", "module.layer2.7.bn2.bias", "module.layer2.7.bn2.running_mean", "module.layer2.7.bn2.running_var", "module.layer2.7.bn2.num_batches_tracked", "module.layer2.7.conv3.weight", "module.layer2.7.bn3.weight", "module.layer2.7.bn3.bias", "module.layer2.7.bn3.running_mean", "module.layer2.7.bn3.running_var", "module.layer2.7.bn3.num_batches_tracked", "module.layer3.0.conv1.weight", "module.layer3.0.bn1.weight", "module.layer3.0.bn1.bias", "module.layer3.0.bn1.running_mean", "module.layer3.0.bn1.running_var", "module.layer3.0.bn1.num_batches_tracked", "module.layer3.0.conv2.weight", "module.layer3.0.bn2.weight", "module.layer3.0.bn2.bias", "module.layer3.0.bn2.running_mean", "module.layer3.0.bn2.running_var", "module.layer3.0.bn2.num_batches_tracked", "module.layer3.0.conv3.weight", "module.layer3.0.bn3.weight", "module.layer3.0.bn3.bias", "module.layer3.0.bn3.running_mean", "module.layer3.0.bn3.running_var", "module.layer3.0.bn3.num_batches_tracked", "module.layer3.0.downsample.0.weight", "module.layer3.0.downsample.1.weight", "module.layer3.0.downsample.1.bias", "module.layer3.0.downsample.1.running_mean", "module.layer3.0.downsample.1.running_var", "module.layer3.0.downsample.1.num_batches_tracked", "module.layer3.1.conv1.weight", "module.layer3.1.bn1.weight", "module.layer3.1.bn1.bias", "module.layer3.1.bn1.running_mean", "module.layer3.1.bn1.running_var", "module.layer3.1.bn1.num_batches_tracked", "module.layer3.1.conv2.weight", "module.layer3.1.bn2.weight", "module.layer3.1.bn2.bias", "module.layer3.1.bn2.running_mean", "module.layer3.1.bn2.running_var", "module.layer3.1.bn2.num_batches_tracked", "module.layer3.1.conv3.weight", "module.layer3.1.bn3.weight", "module.layer3.1.bn3.bias", "module.layer3.1.bn3.running_mean", "module.layer3.1.bn3.running_var", "module.layer3.1.bn3.num_batches_tracked", "module.layer3.2.conv1.weight", "module.layer3.2.bn1.weight", "module.layer3.2.bn1.bias", "module.layer3.2.bn1.running_mean", "module.layer3.2.bn1.running_var", "module.layer3.2.bn1.num_batches_tracked", "module.layer3.2.conv2.weight", "module.layer3.2.bn2.weight", "module.layer3.2.bn2.bias", "module.layer3.2.bn2.running_mean", "module.layer3.2.bn2.running_var", "module.layer3.2.bn2.num_batches_tracked", "module.layer3.2.conv3.weight", "module.layer3.2.bn3.weight", "module.layer3.2.bn3.bias", "module.layer3.2.bn3.running_mean", "module.layer3.2.bn3.running_var", "module.layer3.2.bn3.num_batches_tracked", "module.layer3.3.conv1.weight", "module.layer3.3.bn1.weight", "module.layer3.3.bn1.bias", "module.layer3.3.bn1.running_mean", "module.layer3.3.bn1.running_var", "module.layer3.3.bn1.num_batches_tracked", "module.layer3.3.conv2.weight", "module.layer3.3.bn2.weight", "module.layer3.3.bn2.bias", "module.layer3.3.bn2.running_mean", "module.layer3.3.bn2.running_var", "module.layer3.3.bn2.num_batches_tracked", "module.layer3.3.conv3.weight", "module.layer3.3.bn3.weight", "module.layer3.3.bn3.bias", "module.layer3.3.bn3.running_mean", "module.layer3.3.bn3.running_var", "module.layer3.3.bn3.num_batches_tracked", "module.layer3.4.conv1.weight", "module.layer3.4.bn1.weight", "module.layer3.4.bn1.bias", "module.layer3.4.bn1.running_mean", "module.layer3.4.bn1.running_var", "module.layer3.4.bn1.num_batches_tracked", "module.layer3.4.conv2.weight", "module.layer3.4.bn2.weight", "module.layer3.4.bn2.bias", "module.layer3.4.bn2.running_mean", "module.layer3.4.bn2.running_var", "module.layer3.4.bn2.num_batches_tracked", "module.layer3.4.conv3.weight", "module.layer3.4.bn3.weight", "module.layer3.4.bn3.bias", "module.layer3.4.bn3.running_mean", "module.layer3.4.bn3.running_var", "module.layer3.4.bn3.num_batches_tracked", "module.layer3.5.conv1.weight", "module.layer3.5.bn1.weight", "module.layer3.5.bn1.bias", "module.layer3.5.bn1.running_mean", "module.layer3.5.bn1.running_var", "module.layer3.5.bn1.num_batches_tracked", "module.layer3.5.conv2.weight", "module.layer3.5.bn2.weight", "module.layer3.5.bn2.bias", "module.layer3.5.bn2.running_mean", "module.layer3.5.bn2.running_var", "module.layer3.5.bn2.num_batches_tracked", "module.layer3.5.conv3.weight", "module.layer3.5.bn3.weight", "module.layer3.5.bn3.bias", "module.layer3.5.bn3.running_mean", "module.layer3.5.bn3.running_var", "module.layer3.5.bn3.num_batches_tracked", "module.layer3.6.conv1.weight", "module.layer3.6.bn1.weight", "module.layer3.6.bn1.bias", "module.layer3.6.bn1.running_mean", "module.layer3.6.bn1.running_var", "module.layer3.6.bn1.num_batches_tracked", "module.layer3.6.conv2.weight", "module.layer3.6.bn2.weight", "module.layer3.6.bn2.bias", "module.layer3.6.bn2.running_mean", "module.layer3.6.bn2.running_var", "module.layer3.6.bn2.num_batches_tracked", "module.layer3.6.conv3.weight", "module.layer3.6.bn3.weight", "module.layer3.6.bn3.bias", "module.layer3.6.bn3.running_mean", "module.layer3.6.bn3.running_var", "module.layer3.6.bn3.num_batches_tracked", "module.layer3.7.conv1.weight", "module.layer3.7.bn1.weight", "module.layer3.7.bn1.bias", "module.layer3.7.bn1.running_mean", "module.layer3.7.bn1.running_var", "module.layer3.7.bn1.num_batches_tracked", "module.layer3.7.conv2.weight", "module.layer3.7.bn2.weight", "module.layer3.7.bn2.bias", "module.layer3.7.bn2.running_mean", "module.layer3.7.bn2.running_var", "module.layer3.7.bn2.num_batches_tracked", "module.layer3.7.conv3.weight", "module.layer3.7.bn3.weight", "module.layer3.7.bn3.bias", "module.layer3.7.bn3.running_mean", "module.layer3.7.bn3.running_var", "module.layer3.7.bn3.num_batches_tracked", "module.layer3.8.conv1.weight", "module.layer3.8.bn1.weight", "module.layer3.8.bn1.bias", "module.layer3.8.bn1.running_mean", "module.layer3.8.bn1.running_var", "module.layer3.8.bn1.num_batches_tracked", "module.layer3.8.conv2.weight", "module.layer3.8.bn2.weight", "module.layer3.8.bn2.bias", "module.layer3.8.bn2.running_mean", "module.layer3.8.bn2.running_var", "module.layer3.8.bn2.num_batches_tracked", "module.layer3.8.conv3.weight", "module.layer3.8.bn3.weight", "module.layer3.8.bn3.bias", "module.layer3.8.bn3.running_mean", "module.layer3.8.bn3.running_var", "module.layer3.8.bn3.num_batches_tracked", "module.layer3.9.conv1.weight", "module.layer3.9.bn1.weight", "module.layer3.9.bn1.bias", "module.layer3.9.bn1.running_mean", "module.layer3.9.bn1.running_var", "module.layer3.9.bn1.num_batches_tracked", "module.layer3.9.conv2.weight", "module.layer3.9.bn2.weight", "module.layer3.9.bn2.bias", "module.layer3.9.bn2.running_mean", "module.layer3.9.bn2.running_var", "module.layer3.9.bn2.num_batches_tracked", "module.layer3.9.conv3.weight", "module.layer3.9.bn3.weight", "module.layer3.9.bn3.bias", "module.layer3.9.bn3.running_mean", "module.layer3.9.bn3.running_var", "module.layer3.9.bn3.num_batches_tracked", "module.layer3.10.conv1.weight", "module.layer3.10.bn1.weight", "module.layer3.10.bn1.bias", "module.layer3.10.bn1.running_mean", "module.layer3.10.bn1.running_var", "module.layer3.10.bn1.num_batches_tracked", "module.layer3.10.conv2.weight", "module.layer3.10.bn2.weight", "module.layer3.10.bn2.bias", "module.layer3.10.bn2.running_mean", "module.layer3.10.bn2.running_var", "module.layer3.10.bn2.num_batches_tracked", "module.layer3.10.conv3.weight", "module.layer3.10.bn3.weight", "module.layer3.10.bn3.bias", "module.layer3.10.bn3.running_mean", "module.layer3.10.bn3.running_var", "module.layer3.10.bn3.num_batches_tracked", "module.layer3.11.conv1.weight", "module.layer3.11.bn1.weight", "module.layer3.11.bn1.bias", "module.layer3.11.bn1.running_mean", "module.layer3.11.bn1.running_var", "module.layer3.11.bn1.num_batches_tracked", "module.layer3.11.conv2.weight", "module.layer3.11.bn2.weight", "module.layer3.11.bn2.bias", "module.layer3.11.bn2.running_mean", "module.layer3.11.bn2.running_var", "module.layer3.11.bn2.num_batches_tracked", "module.layer3.11.conv3.weight", "module.layer3.11.bn3.weight", "module.layer3.11.bn3.bias", "module.layer3.11.bn3.running_mean", "module.layer3.11.bn3.running_var", "module.layer3.11.bn3.num_batches_tracked", "module.layer3.12.conv1.weight", "module.layer3.12.bn1.weight", "module.layer3.12.bn1.bias", "module.layer3.12.bn1.running_mean", "module.layer3.12.bn1.running_var", "module.layer3.12.bn1.num_batches_tracked", "module.layer3.12.conv2.weight", "module.layer3.12.bn2.weight", "module.layer3.12.bn2.bias", "module.layer3.12.bn2.running_mean", "module.layer3.12.bn2.running_var", "module.layer3.12.bn2.num_batches_tracked", "module.layer3.12.conv3.weight", "module.layer3.12.bn3.weight", "module.layer3.12.bn3.bias", "module.layer3.12.bn3.running_mean", "module.layer3.12.bn3.running_var", "module.layer3.12.bn3.num_batches_tracked", "module.layer3.13.conv1.weight", "module.layer3.13.bn1.weight", "module.layer3.13.bn1.bias", "module.layer3.13.bn1.running_mean", "module.layer3.13.bn1.running_var", "module.layer3.13.bn1.num_batches_tracked", "module.layer3.13.conv2.weight", "module.layer3.13.bn2.weight", "module.layer3.13.bn2.bias", "module.layer3.13.bn2.running_mean", "module.layer3.13.bn2.running_var", "module.layer3.13.bn2.num_batches_tracked", "module.layer3.13.conv3.weight", "module.layer3.13.bn3.weight", "module.layer3.13.bn3.bias", "module.layer3.13.bn3.running_mean", "module.layer3.13.bn3.running_var", "module.layer3.13.bn3.num_batches_tracked", "module.layer3.14.conv1.weight", "module.layer3.14.bn1.weight", "module.layer3.14.bn1.bias", "module.layer3.14.bn1.running_mean", "module.layer3.14.bn1.running_var", "module.layer3.14.bn1.num_batches_tracked", "module.layer3.14.conv2.weight", "module.layer3.14.bn2.weight", "module.layer3.14.bn2.bias", "module.layer3.14.bn2.running_mean", "module.layer3.14.bn2.running_var", "module.layer3.14.bn2.num_batches_tracked", "module.layer3.14.conv3.weight", "module.layer3.14.bn3.weight", "module.layer3.14.bn3.bias", "module.layer3.14.bn3.running_mean", "module.layer3.14.bn3.running_var", "module.layer3.14.bn3.num_batches_tracked", "module.layer3.15.conv1.weight", "module.layer3.15.bn1.weight", "module.layer3.15.bn1.bias", "module.layer3.15.bn1.running_mean", "module.layer3.15.bn1.running_var", "module.layer3.15.bn1.num_batches_tracked", "module.layer3.15.conv2.weight", "module.layer3.15.bn2.weight", "module.layer3.15.bn2.bias", "module.layer3.15.bn2.running_mean", "module.layer3.15.bn2.running_var", "module.layer3.15.bn2.num_batches_tracked", "module.layer3.15.conv3.weight", "module.layer3.15.bn3.weight", "module.layer3.15.bn3.bias", "module.layer3.15.bn3.running_mean", "module.layer3.15.bn3.running_var", "module.layer3.15.bn3.num_batches_tracked", "module.layer3.16.conv1.weight", "module.layer3.16.bn1.weight", "module.layer3.16.bn1.bias", "module.layer3.16.bn1.running_mean", "module.layer3.16.bn1.running_var", "module.layer3.16.bn1.num_batches_tracked", "module.layer3.16.conv2.weight", "module.layer3.16.bn2.weight", "module.layer3.16.bn2.bias", "module.layer3.16.bn2.running_mean", "module.layer3.16.bn2.running_var", "module.layer3.16.bn2.num_batches_tracked", "module.layer3.16.conv3.weight", "module.layer3.16.bn3.weight", "module.layer3.16.bn3.bias", "module.layer3.16.bn3.running_mean", "module.layer3.16.bn3.running_var", "module.layer3.16.bn3.num_batches_tracked", "module.layer3.17.conv1.weight", "module.layer3.17.bn1.weight", "module.layer3.17.bn1.bias", "module.layer3.17.bn1.running_mean", "module.layer3.17.bn1.running_var", "module.layer3.17.bn1.num_batches_tracked", "module.layer3.17.conv2.weight", "module.layer3.17.bn2.weight", "module.layer3.17.bn2.bias", "module.layer3.17.bn2.running_mean", "module.layer3.17.bn2.running_var", "module.layer3.17.bn2.num_batches_tracked", "module.layer3.17.conv3.weight", "module.layer3.17.bn3.weight", "module.layer3.17.bn3.bias", "module.layer3.17.bn3.running_mean", "module.layer3.17.bn3.running_var", "module.layer3.17.bn3.num_batches_tracked", "module.layer3.18.conv1.weight", "module.layer3.18.bn1.weight", "module.layer3.18.bn1.bias", "module.layer3.18.bn1.running_mean", "module.layer3.18.bn1.running_var", "module.layer3.18.bn1.num_batches_tracked", "module.layer3.18.conv2.weight", "module.layer3.18.bn2.weight", "module.layer3.18.bn2.bias", "module.layer3.18.bn2.running_mean", "module.layer3.18.bn2.running_var", "module.layer3.18.bn2.num_batches_tracked", "module.layer3.18.conv3.weight", "module.layer3.18.bn3.weight", "module.layer3.18.bn3.bias", "module.layer3.18.bn3.running_mean", "module.layer3.18.bn3.running_var", "module.layer3.18.bn3.num_batches_tracked", "module.layer3.19.conv1.weight", "module.layer3.19.bn1.weight", "module.layer3.19.bn1.bias", "module.layer3.19.bn1.running_mean", "module.layer3.19.bn1.running_var", "module.layer3.19.bn1.num_batches_tracked", "module.layer3.19.conv2.weight", "module.layer3.19.bn2.weight", "module.layer3.19.bn2.bias", "module.layer3.19.bn2.running_mean", "module.layer3.19.bn2.running_var", "module.layer3.19.bn2.num_batches_tracked", "module.layer3.19.conv3.weight", "module.layer3.19.bn3.weight", "module.layer3.19.bn3.bias", "module.layer3.19.bn3.running_mean", "module.layer3.19.bn3.running_var", "module.layer3.19.bn3.num_batches_tracked", "module.layer3.20.conv1.weight", "module.layer3.20.bn1.weight", "module.layer3.20.bn1.bias", "module.layer3.20.bn1.running_mean", "module.layer3.20.bn1.running_var", "module.layer3.20.bn1.num_batches_tracked", "module.layer3.20.conv2.weight", "module.layer3.20.bn2.weight", "module.layer3.20.bn2.bias", "module.layer3.20.bn2.running_mean", "module.layer3.20.bn2.running_var", "module.layer3.20.bn2.num_batches_tracked", "module.layer3.20.conv3.weight", "module.layer3.20.bn3.weight", "module.layer3.20.bn3.bias", "module.layer3.20.bn3.running_mean", "module.layer3.20.bn3.running_var", "module.layer3.20.bn3.num_batches_tracked", "module.layer3.21.conv1.weight", "module.layer3.21.bn1.weight", "module.layer3.21.bn1.bias", "module.layer3.21.bn1.running_mean", "module.layer3.21.bn1.running_var", "module.layer3.21.bn1.num_batches_tracked", "module.layer3.21.conv2.weight", "module.layer3.21.bn2.weight", "module.layer3.21.bn2.bias", "module.layer3.21.bn2.running_mean", "module.layer3.21.bn2.running_var", "module.layer3.21.bn2.num_batches_tracked", "module.layer3.21.conv3.weight", "module.layer3.21.bn3.weight", "module.layer3.21.bn3.bias", "module.layer3.21.bn3.running_mean", "module.layer3.21.bn3.running_var", "module.layer3.21.bn3.num_batches_tracked", "module.layer3.22.conv1.weight", "module.layer3.22.bn1.weight", "module.layer3.22.bn1.bias", "module.layer3.22.bn1.running_mean", "module.layer3.22.bn1.running_var", "module.layer3.22.bn1.num_batches_tracked", "module.layer3.22.conv2.weight", "module.layer3.22.bn2.weight", "module.layer3.22.bn2.bias", "module.layer3.22.bn2.running_mean", "module.layer3.22.bn2.running_var", "module.layer3.22.bn2.num_batches_tracked", "module.layer3.22.conv3.weight", "module.layer3.22.bn3.weight", "module.layer3.22.bn3.bias", "module.layer3.22.bn3.running_mean", "module.layer3.22.bn3.running_var", "module.layer3.22.bn3.num_batches_tracked", "module.layer3.23.conv1.weight", "module.layer3.23.bn1.weight", "module.layer3.23.bn1.bias", "module.layer3.23.bn1.running_mean", "module.layer3.23.bn1.running_var", "module.layer3.23.bn1.num_batches_tracked", "module.layer3.23.conv2.weight", "module.layer3.23.bn2.weight", "module.layer3.23.bn2.bias", "module.layer3.23.bn2.running_mean", "module.layer3.23.bn2.running_var", "module.layer3.23.bn2.num_batches_tracked", "module.layer3.23.conv3.weight", "module.layer3.23.bn3.weight", "module.layer3.23.bn3.bias", "module.layer3.23.bn3.running_mean", "module.layer3.23.bn3.running_var", "module.layer3.23.bn3.num_batches_tracked", "module.layer3.24.conv1.weight", "module.layer3.24.bn1.weight", "module.layer3.24.bn1.bias", "module.layer3.24.bn1.running_mean", "module.layer3.24.bn1.running_var", "module.layer3.24.bn1.num_batches_tracked", "module.layer3.24.conv2.weight", "module.layer3.24.bn2.weight", "module.layer3.24.bn2.bias", "module.layer3.24.bn2.running_mean", "module.layer3.24.bn2.running_var", "module.layer3.24.bn2.num_batches_tracked", "module.layer3.24.conv3.weight", "module.layer3.24.bn3.weight", "module.layer3.24.bn3.bias", "module.layer3.24.bn3.running_mean", "module.layer3.24.bn3.running_var", "module.layer3.24.bn3.num_batches_tracked", "module.layer3.25.conv1.weight", "module.layer3.25.bn1.weight", "module.layer3.25.bn1.bias", "module.layer3.25.bn1.running_mean", "module.layer3.25.bn1.running_var", "module.layer3.25.bn1.num_batches_tracked", "module.layer3.25.conv2.weight", "module.layer3.25.bn2.weight", "module.layer3.25.bn2.bias", "module.layer3.25.bn2.running_mean", "module.layer3.25.bn2.running_var", "module.layer3.25.bn2.num_batches_tracked", "module.layer3.25.conv3.weight", "module.layer3.25.bn3.weight", "module.layer3.25.bn3.bias", "module.layer3.25.bn3.running_mean", "module.layer3.25.bn3.running_var", "module.layer3.25.bn3.num_batches_tracked", "module.layer3.26.conv1.weight", "module.layer3.26.bn1.weight", "module.layer3.26.bn1.bias", "module.layer3.26.bn1.running_mean", "module.layer3.26.bn1.running_var", "module.layer3.26.bn1.num_batches_tracked", "module.layer3.26.conv2.weight", "module.layer3.26.bn2.weight", "module.layer3.26.bn2.bias", "module.layer3.26.bn2.running_mean", "module.layer3.26.bn2.running_var", "module.layer3.26.bn2.num_batches_tracked", "module.layer3.26.conv3.weight", "module.layer3.26.bn3.weight", "module.layer3.26.bn3.bias", "module.layer3.26.bn3.running_mean", "module.layer3.26.bn3.running_var", "module.layer3.26.bn3.num_batches_tracked", "module.layer3.27.conv1.weight", "module.layer3.27.bn1.weight", "module.layer3.27.bn1.bias", "module.layer3.27.bn1.running_mean", "module.layer3.27.bn1.running_var", "module.layer3.27.bn1.num_batches_tracked", "module.layer3.27.conv2.weight", "module.layer3.27.bn2.weight", "module.layer3.27.bn2.bias", "module.layer3.27.bn2.running_mean", "module.layer3.27.bn2.running_var", "module.layer3.27.bn2.num_batches_tracked", "module.layer3.27.conv3.weight", "module.layer3.27.bn3.weight", "module.layer3.27.bn3.bias", "module.layer3.27.bn3.running_mean", "module.layer3.27.bn3.running_var", "module.layer3.27.bn3.num_batches_tracked", "module.layer3.28.conv1.weight", "module.layer3.28.bn1.weight", "module.layer3.28.bn1.bias", "module.layer3.28.bn1.running_mean", "module.layer3.28.bn1.running_var", "module.layer3.28.bn1.num_batches_tracked", "module.layer3.28.conv2.weight", "module.layer3.28.bn2.weight", "module.layer3.28.bn2.bias", "module.layer3.28.bn2.running_mean", "module.layer3.28.bn2.running_var", "module.layer3.28.bn2.num_batches_tracked", "module.layer3.28.conv3.weight", "module.layer3.28.bn3.weight", "module.layer3.28.bn3.bias", "module.layer3.28.bn3.running_mean", "module.layer3.28.bn3.running_var", "module.layer3.28.bn3.num_batches_tracked", "module.layer3.29.conv1.weight", "module.layer3.29.bn1.weight", "module.layer3.29.bn1.bias", "module.layer3.29.bn1.running_mean", "module.layer3.29.bn1.running_var", "module.layer3.29.bn1.num_batches_tracked", "module.layer3.29.conv2.weight", "module.layer3.29.bn2.weight", "module.layer3.29.bn2.bias", "module.layer3.29.bn2.running_mean", "module.layer3.29.bn2.running_var", "module.layer3.29.bn2.num_batches_tracked", "module.layer3.29.conv3.weight", "module.layer3.29.bn3.weight", "module.layer3.29.bn3.bias", "module.layer3.29.bn3.running_mean", "module.layer3.29.bn3.running_var", "module.layer3.29.bn3.num_batches_tracked", "module.layer3.30.conv1.weight", "module.layer3.30.bn1.weight", "module.layer3.30.bn1.bias", "module.layer3.30.bn1.running_mean", "module.layer3.30.bn1.running_var", "module.layer3.30.bn1.num_batches_tracked", "module.layer3.30.conv2.weight", "module.layer3.30.bn2.weight", "module.layer3.30.bn2.bias", "module.layer3.30.bn2.running_mean", "module.layer3.30.bn2.running_var", "module.layer3.30.bn2.num_batches_tracked", "module.layer3.30.conv3.weight", "module.layer3.30.bn3.weight", "module.layer3.30.bn3.bias", "module.layer3.30.bn3.running_mean", "module.layer3.30.bn3.running_var", "module.layer3.30.bn3.num_batches_tracked", "module.layer3.31.conv1.weight", "module.layer3.31.bn1.weight", "module.layer3.31.bn1.bias", "module.layer3.31.bn1.running_mean", "module.layer3.31.bn1.running_var", "module.layer3.31.bn1.num_batches_tracked", "module.layer3.31.conv2.weight", "module.layer3.31.bn2.weight", "module.layer3.31.bn2.bias", "module.layer3.31.bn2.running_mean", "module.layer3.31.bn2.running_var", "module.layer3.31.bn2.num_batches_tracked", "module.layer3.31.conv3.weight", "module.layer3.31.bn3.weight", "module.layer3.31.bn3.bias", "module.layer3.31.bn3.running_mean", "module.layer3.31.bn3.running_var", "module.layer3.31.bn3.num_batches_tracked", "module.layer3.32.conv1.weight", "module.layer3.32.bn1.weight", "module.layer3.32.bn1.bias", "module.layer3.32.bn1.running_mean", "module.layer3.32.bn1.running_var", "module.layer3.32.bn1.num_batches_tracked", "module.layer3.32.conv2.weight", "module.layer3.32.bn2.weight", "module.layer3.32.bn2.bias", "module.layer3.32.bn2.running_mean", "module.layer3.32.bn2.running_var", "module.layer3.32.bn2.num_batches_tracked", "module.layer3.32.conv3.weight", "module.layer3.32.bn3.weight", "module.layer3.32.bn3.bias", "module.layer3.32.bn3.running_mean", "module.layer3.32.bn3.running_var", "module.layer3.32.bn3.num_batches_tracked", "module.layer3.33.conv1.weight", "module.layer3.33.bn1.weight", "module.layer3.33.bn1.bias", "module.layer3.33.bn1.running_mean", "module.layer3.33.bn1.running_var", "module.layer3.33.bn1.num_batches_tracked", "module.layer3.33.conv2.weight", "module.layer3.33.bn2.weight", "module.layer3.33.bn2.bias", "module.layer3.33.bn2.running_mean", "module.layer3.33.bn2.running_var", "module.layer3.33.bn2.num_batches_tracked", "module.layer3.33.conv3.weight", "module.layer3.33.bn3.weight", "module.layer3.33.bn3.bias", "module.layer3.33.bn3.running_mean", "module.layer3.33.bn3.running_var", "module.layer3.33.bn3.num_batches_tracked", "module.layer3.34.conv1.weight", "module.layer3.34.bn1.weight", "module.layer3.34.bn1.bias", "module.layer3.34.bn1.running_mean", "module.layer3.34.bn1.running_var", "module.layer3.34.bn1.num_batches_tracked", "module.layer3.34.conv2.weight", "module.layer3.34.bn2.weight", "module.layer3.34.bn2.bias", "module.layer3.34.bn2.running_mean", "module.layer3.34.bn2.running_var", "module.layer3.34.bn2.num_batches_tracked", "module.layer3.34.conv3.weight", "module.layer3.34.bn3.weight", "module.layer3.34.bn3.bias", "module.layer3.34.bn3.running_mean", "module.layer3.34.bn3.running_var", "module.layer3.34.bn3.num_batches_tracked", "module.layer3.35.conv1.weight", "module.layer3.35.bn1.weight", "module.layer3.35.bn1.bias", "module.layer3.35.bn1.running_mean", "module.layer3.35.bn1.running_var", "module.layer3.35.bn1.num_batches_tracked", "module.layer3.35.conv2.weight", "module.layer3.35.bn2.weight", "module.layer3.35.bn2.bias", "module.layer3.35.bn2.running_mean", "module.layer3.35.bn2.running_var", "module.layer3.35.bn2.num_batches_tracked", "module.layer3.35.conv3.weight", "module.layer3.35.bn3.weight", "module.layer3.35.bn3.bias", "module.layer3.35.bn3.running_mean", "module.layer3.35.bn3.running_var", "module.layer3.35.bn3.num_batches_tracked", "module.layer4.0.conv1.weight", "module.layer4.0.bn1.weight", "module.layer4.0.bn1.bias", "module.layer4.0.bn1.running_mean", "module.layer4.0.bn1.running_var", "module.layer4.0.bn1.num_batches_tracked", "module.layer4.0.conv2.weight", "module.layer4.0.bn2.weight", "module.layer4.0.bn2.bias", "module.layer4.0.bn2.running_mean", "module.layer4.0.bn2.running_var", "module.layer4.0.bn2.num_batches_tracked", "module.layer4.0.conv3.weight", "module.layer4.0.bn3.weight", "module.layer4.0.bn3.bias", "module.layer4.0.bn3.running_mean", "module.layer4.0.bn3.running_var", "module.layer4.0.bn3.num_batches_tracked", "module.layer4.0.downsample.0.weight", "module.layer4.0.downsample.1.weight", "module.layer4.0.downsample.1.bias", "module.layer4.0.downsample.1.running_mean", "module.layer4.0.downsample.1.running_var", "module.layer4.0.downsample.1.num_batches_tracked", "module.layer4.1.conv1.weight", "module.layer4.1.bn1.weight", "module.layer4.1.bn1.bias", "module.layer4.1.bn1.running_mean", "module.layer4.1.bn1.running_var", "module.layer4.1.bn1.num_batches_tracked", "module.layer4.1.conv2.weight", "module.layer4.1.bn2.weight", "module.layer4.1.bn2.bias", "module.layer4.1.bn2.running_mean", "module.layer4.1.bn2.running_var", "module.layer4.1.bn2.num_batches_tracked", "module.layer4.1.conv3.weight", "module.layer4.1.bn3.weight", "module.layer4.1.bn3.bias", "module.layer4.1.bn3.running_mean", "module.layer4.1.bn3.running_var", "module.layer4.1.bn3.num_batches_tracked", "module.layer4.2.conv1.weight", "module.layer4.2.bn1.weight", "module.layer4.2.bn1.bias", "module.layer4.2.bn1.running_mean", "module.layer4.2.bn1.running_var", "module.layer4.2.bn1.num_batches_tracked", "module.layer4.2.conv2.weight", "module.layer4.2.bn2.weight", "module.layer4.2.bn2.bias", "module.layer4.2.bn2.running_mean", "module.layer4.2.bn2.running_var", "module.layer4.2.bn2.num_batches_tracked", "module.layer4.2.conv3.weight", "module.layer4.2.bn3.weight", "module.layer4.2.bn3.bias", "module.layer4.2.bn3.running_mean", "module.layer4.2.bn3.running_var", "module.layer4.2.bn3.num_batches_tracked", "module.fc.weight", "module.fc.bias". 