# BN-Inception

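About this network:

+ It is based on GoogLeNet, with batch normalization (BN) layers added.

+ The basic building block of GoogLeNet is the Inception block, structured as follows: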

![img](https://tangshusen.me/Dive-into-DL-PyTorch/img/chapter05/5.9_inception.svg)




In [1]:
import torch
import time 
from torch import nn,optim
import torch.nn.functional as F 

import sys
sys.path.append('..')

from script import d2l_pytorch as d2l 


In [2]:
class baseConv2d(nn.Module):
    def __init__(self,in_channels,out_channels,**args):
        super(baseConv2d,self).__init__()
        self.conv=nn.Conv2d(in_channels,out_channels,**args)
        self.bn=nn.BatchNorm2d(out_channels)
    def forward(self,x):
        x=self.conv(x)
        x=self.bn(x)
        return x
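
Because `nn.BatchNorm2d` subtracts the per-channel batch mean and then applies its own learnable shift, a bias term in the preceding convolution is redundant. The sketch below is one possible variant of `baseConv2d` under that observation. The class name `ConvBN` and the choice of `bias=False` are illustrative assumptions, not part of the notebook's code.

```python
import torch
from torch import nn


class ConvBN(nn.Module):
    """Convolution followed by batch normalization (hypothetical variant of baseConv2d)."""

    def __init__(self, in_channels, out_channels, **kwargs):
        super().__init__()
        # bias=False: BatchNorm2d already has a learnable per-channel shift (beta).
        self.conv = nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
        self.bn = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        return self.bn(self.conv(x))


torch.manual_seed(0)
layer = ConvBN(3, 8, kernel_size=3, padding=1)
layer.train()  # training mode: normalize with the statistics of the current batch
out = layer(torch.rand(4, 3, 16, 16))
print(out.shape)
# Right after BN, the per-channel mean over batch and spatial dims is close to 0.
print(out.mean(dim=(0, 2, 3)).abs().max().item())
```

In evaluation mode (`layer.eval()`) the layer normalizes with running statistics instead, so the per-channel mean of a single batch is no longer expected to be zero.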

In [3]:
class Inception(nn.Module):
    # 4 parallel paths
    def __init__(self,in_c,c1,c2,c3,c4,batch_norm=False):
        super(Inception,self).__init__()
        if not batch_norm:
            # Path 1: a single layer (the leftmost path in the figure)
            ## 1*1 convolutional layer, used to reduce the number of channels
            self.p1_1=nn.Conv2d(in_channels=in_c,out_channels=c1,kernel_size=1)
            # Path 2: 2 layers
            ## 1*1 convolutional layer
            self.p2_1=nn.Conv2d(in_channels=in_c,out_channels=c2[0],kernel_size=1)
            ## 3*3 convolutional layer
            self.p2_2=nn.Conv2d(c2[0],c2[1],kernel_size=3,padding=1)
            # Path 3: 2 layers
            ## 1*1 convolutional layer
            self.p3_1=nn.Conv2d(in_c,c3[0],kernel_size=1)
            ## 5*5 convolutional layer
            self.p3_2=nn.Conv2d(c3[0],c3[1],kernel_size=5,padding=2)
            # Path 4: 2 layers
            ## 3*3 max pooling layer
            self.p4_1=nn.MaxPool2d(kernel_size=3,stride=1,padding=1)
            ## 1*1 convolutional layer
            self.p4_2=nn.Conv2d(in_c,c4,kernel_size=1)
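        else:
            # Same four paths, with every convolution followed by batch normalization
            self.p1_1=baseConv2d(in_c,c1,kernel_size=1)
            # Path 2: 2 layers
            self.p2_1=baseConv2d(in_c,c2[0],kernel_size=1)
            self.p2_2=baseConv2d(c2[0],c2[1],kernel_size=3,padding=1)

            # Path 3: 2 layers
            self.p3_1=baseConv2d(in_c,c3[0],kernel_size=1)
            self.p3_2=baseConv2d(c3[0],c3[1],kernel_size=5,padding=2)

            # Path 4: 2 layers
            ## The first layer is max pooling, not a convolution, so it gets no BN
            self.p4_1=nn.MaxPool2d(kernel_size=3,stride=1,padding=1)
            self.p4_2=baseConv2d(in_c,c4,kernel_size=1)

    # ReLU activations are applied in forward()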
    def forward(self,x):
        # Path 1
        p1=F.relu(self.p1_1(x))
        # Path 2
        p2=F.relu(self.p2_2(F.relu(self.p2_1(x))))
        # Path 3
        p3=F.relu(self.p3_2(F.relu(self.p3_1(x))))
        # Path 4
        p4=F.relu(self.p4_2(self.p4_1(x)))
        # Concatenate the outputs of the four paths along the channel dimension
        # out=[p1,p2,p3,p4];torch.cat(out,1)
        return torch.cat((p1,p2,p3,p4),dim=1)  
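
Concatenating along the channel dimension only works if all four paths produce the same height and width. A small sketch of the standard output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1, shows why the padding values above fit together. The helper name `out_size` is made up for this illustration.

```python
def out_size(n, kernel_size, stride=1, padding=0):
    """Output length along one spatial axis for a conv or pooling layer (dilation 1)."""
    return (n + 2 * padding - kernel_size) // stride + 1


n = 12  # any input height/width works; 12 is arbitrary
paths = {
    "1x1 conv": out_size(n, kernel_size=1),
    "3x3 conv, padding=1": out_size(n, kernel_size=3, padding=1),
    "5x5 conv, padding=2": out_size(n, kernel_size=5, padding=2),
    "3x3 max pool, stride=1, padding=1": out_size(n, kernel_size=3, stride=1, padding=1),
}
for name, size in paths.items():
    print(f"{name}: {n} -> {size}")
```

Every path maps an input of size `n` back to `n`, so only the channel counts differ when the outputs are concatenated.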

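+ The main body of the GoogLeNet model consists of 5 modules.

    Between modules, a 3\*3 pooling layer with stride 2 reduces the height and width of the output.

    The first module uses a 7\*7 convolutional layer with 64 channels.

Modules 1 and 2 are shown in the figure: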

<img src="https://gitee.com/changyv/md-pic/raw/master/20210317211018.png" style="zoom:50%;" />

In [4]:
b1=nn.Sequential(
    # Input channels changed from 3 to 1, since the Fashion-MNIST images used below are grayscale
    nn.Conv2d(1,64,kernel_size=7,stride=2,padding=3),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3,stride=2,padding=1)
)
b2=nn.Sequential(
    # 1*1 convolutional layer
    nn.Conv2d(64,64,kernel_size=1),
    # 3*3 convolutional layer that triples the number of channels (64 -> 192)
    nn.Conv2d(64,192,kernel_size=3,padding=1),
    # Pooling layer
    nn.MaxPool2d(kernel_size=3,stride=2,padding=1)
)

Module 3 chains 2 complete Inception blocks.

The first block has 64+128+32+32=256 output channels.

The second block has 128+192+96+64=480 output channels.

As shown in the figure:

<img src="https://gitee.com/changyv/md-pic/raw/master/20210317212459.png" alt="image-20210317212456643" style="zoom:50%;" />



In [5]:
b3=nn.Sequential(
    Inception(192,64,(96,128),(16,32),32),
    Inception(256,128,(128,192),(32,96),64),
    nn.MaxPool2d(kernel_size=3,stride=2,padding=1)
)

Module 4 chains 5 Inception blocks, whose output channel counts are:

+ 192+208+48+64=512

+ 160+224+64+64=512

+ 128+256+64+64=512

+ 112+288+64+64=528

+ 256+320+128+128=832

In [6]:
b4 = nn.Sequential(
    Inception(480, 192, (96, 208), (16, 48), 64),
    Inception(512, 160, (112, 224), (24, 64), 64),
    Inception(512, 128, (128, 256), (24, 64), 64),
    Inception(512, 112, (144, 288), (32, 64), 64),
    Inception(528, 256, (160, 320), (32, 128), 128),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
)

Module 5 chains 2 Inception blocks, whose output channel counts are:

+ 256+320+128+128=832

+ 384+384+128+128=1024

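Module 5 is followed by the output layer. A global average pooling layer reduces the height and width of every channel to 1, the result is flattened into a two-dimensional array, and a fully connected layer with one output per class comes last.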
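
A minimal sketch of this output head using only built-in PyTorch layers. It assumes that `d2l.GlobalAvgPool2d` averages each channel over its height and width and that `d2l.FlattenLayer` reshapes to `(batch, features)`. `nn.AdaptiveAvgPool2d(1)` and `nn.Flatten()` are used here as stand-ins, and the 3\*3 input size is only an example.

```python
import torch
from torch import nn

head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),  # (N, C, H, W) -> (N, C, 1, 1)
    nn.Flatten(),             # (N, C, 1, 1) -> (N, C)
    nn.Linear(1024, 10),      # one output per class
)

features = torch.rand(2, 1024, 3, 3)  # stand-in for the output of the last Inception block
pooled = head[1](head[0](features))   # pooling and flattening only
logits = head(features)               # full head, including the linear layer
print(pooled.shape, logits.shape)
```

Because the pooling layer adapts to whatever height and width it receives, the same head works for other input resolutions as long as the channel count stays at 1024.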

In [7]:
b5 = nn.Sequential(
    Inception(832, 256, (160, 320), (32, 128), 128),
    Inception(832, 384, (192, 384), (48, 128), 128),
    d2l.GlobalAvgPool2d()
)
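
The channel sums listed for modules 3, 4 and 5 can be checked mechanically. The sketch below copies the `Inception(...)` arguments from the cells above and verifies that each block's `in_c` matches the previous block's output. The helper `inception_out_channels` is a hypothetical name introduced for this check.

```python
def inception_out_channels(c1, c2, c3, c4):
    """Channels after concatenation: only the last layer of each path contributes."""
    return c1 + c2[1] + c3[1] + c4


# (in_c, c1, c2, c3, c4) copied from modules 3, 4 and 5
configs = [
    (192, 64, (96, 128), (16, 32), 32),
    (256, 128, (128, 192), (32, 96), 64),
    (480, 192, (96, 208), (16, 48), 64),
    (512, 160, (112, 224), (24, 64), 64),
    (512, 128, (128, 256), (24, 64), 64),
    (512, 112, (144, 288), (32, 64), 64),
    (528, 256, (160, 320), (32, 128), 128),
    (832, 256, (160, 320), (32, 128), 128),
    (832, 384, (192, 384), (48, 128), 128),
]

outputs = [inception_out_channels(*cfg[1:]) for cfg in configs]
# Each block's in_c must equal the previous block's output channel count
# (the max pooling layers between modules do not change the channel count).
for (in_c, *_), prev_out in zip(configs[1:], outputs[:-1]):
    assert in_c == prev_out, (in_c, prev_out)
print(outputs)  # [256, 480, 512, 512, 512, 528, 832, 832, 1024]
```

The last value, 1024, is the feature size that the final fully connected layer has to accept.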

Assemble the full network, ending with the output layer

In [8]:
net = nn.Sequential(
    b1, b2, b3, b4, b5, 
    d2l.FlattenLayer(), 
    nn.Linear(1024, 10)
)

In [9]:
print(net)

Sequential(
  (0): Sequential(
    (0): Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
    (1): ReLU()
    (2): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  )
  (1): Sequential(
    (0): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
    (1): Conv2d(64, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (2): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  )
  (2): Sequential(
    (0): Inception(
      (p1_1): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1))
      (p2_1): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1))
      (p2_2): Conv2d(96, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (p3_1): Conv2d(192, 16, kernel_size=(1, 1), stride=(1, 1))
      (p3_2): Conv2d(16, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
      (p4_1): MaxPool2d(kernel_size=3, stride=1, padding=1, dilation=1, ceil_mode=False)
      (p4_2): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1))


In [10]:
# test
# Reduce the input height and width from 224 to 96 to simplify computation

# X=torch.rand(1,3,96,96)
X=torch.rand(1,1,96,96)

for blk in net.children():
    X=blk(X)
    print('out shape:',X.shape)

out shape: torch.Size([1, 64, 24, 24])
out shape: torch.Size([1, 192, 12, 12])
out shape: torch.Size([1, 480, 6, 6])
out shape: torch.Size([1, 832, 3, 3])
out shape: torch.Size([1, 1024, 1, 1])
out shape: torch.Size([1, 1024])
out shape: torch.Size([1, 10])
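
The spatial sizes in this output can be reproduced by hand. The sketch below follows only the stride-2 layers (the 7\*7 convolution and the 3\*3 max pooling layers), since all other layers keep height and width unchanged. The function names `out_size` and `trace` are illustrative.

```python
def out_size(n, kernel_size, stride, padding):
    """Output length along one spatial axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel_size) // stride + 1


def trace(n):
    """Height/width after b1, b2, b3 and b4 (b5 ends with global average pooling)."""
    sizes = []
    n = out_size(n, kernel_size=7, stride=2, padding=3)  # b1: 7*7 conv, stride 2
    n = out_size(n, kernel_size=3, stride=2, padding=1)  # b1: 3*3 max pool, stride 2
    sizes.append(n)
    for _ in range(3):  # the max pooling layer that closes b2, b3 and b4
        n = out_size(n, kernel_size=3, stride=2, padding=1)
        sizes.append(n)
    return sizes


print(trace(96))   # [24, 12, 6, 3]
print(trace(224))  # [56, 28, 14, 7]
```

With a 96\*96 input the last Inception blocks work on 3\*3 feature maps, compared with 7\*7 for a 224\*224 input, which is where the savings in computation come from.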


In [11]:
# Load the data and train the model

batch_size=64

train_iter,test_iter=d2l.load_data_fashion_mnist_resize(batch_size,resize=96)

lr,num_epochs=0.001,2
optimizer=torch.optim.Adam(net.parameters(),lr=lr)

In [12]:
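# Use the built-in next(); the iterator's .next() method is not available in recent PyTorch versions
x,y=next(iter(train_iter))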

In [13]:
x[:1].shape

torch.Size([1, 1, 96, 96])

In [14]:
device=torch.device('cuda' if torch.cuda.is_available() else 'cpu')

d2l.train(net,train_iter,test_iter,batch_size,optimizer,device,num_epochs)

training on  cpu
epoch 0/2, iter 0/937, loss 2.303
epoch 0/2, iter 1/937, loss 2.309
epoch 0/2, iter 2/937, loss 2.302
epoch 0/2, iter 3/937, loss 2.310
epoch 0/2, iter 4/937, loss 2.320
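
`d2l.train` comes from the local `script/d2l_pytorch.py` helper, whose body is not shown here. As a rough idea of what one optimization step in such a loop involves, here is a generic sketch with cross-entropy loss and Adam on random data. The tiny stand-in model, the loss function and the tensor shapes are assumptions chosen so the snippet runs quickly; this is not the notebook's actual training code.

```python
import torch
from torch import nn

torch.manual_seed(0)
# Tiny stand-in classifier with the same interface as `net`:
# (N, 1, 96, 96) images in, 10 class scores out.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=7, stride=2, padding=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

X = torch.rand(4, 1, 96, 96)    # fake batch of images
y = torch.randint(0, 10, (4,))  # fake integer class labels

before = model[-1].weight.detach().clone()
model.train()
optimizer.zero_grad()           # clear gradients from the previous step
loss = loss_fn(model(X), y)     # forward pass and loss
loss.backward()                 # backpropagate
optimizer.step()                # update parameters
print(f"loss {loss.item():.3f}")
```

The first logged losses of about 2.303 in the output above match ln 10, the cross-entropy of a uniform guess over 10 classes, which is what an untrained classifier would be expected to produce if that loss is the one being used.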


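## Summary

+ An Inception block has 4 parallel paths. Convolutional layers with different window sizes and a max pooling layer extract information in parallel, and 1\*1 convolutional layers reduce the number of channels.

+ GoogLeNet chains multiple Inception blocks together with other layers. The ratio of channels allocated to the paths of each Inception block was obtained through extensive experiments on ImageNet.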
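
To make the effect of the 1\*1 channel reduction concrete, the sketch below counts convolution weights for path 3 of the first Inception block in module 3 (192 input channels, reduced to 16, then expanded to 32 by the 5\*5 convolution) and compares it with a hypothetical 5\*5 convolution applied directly to all 192 channels. The helper `conv_weights` is an illustrative name, and bias terms are ignored.

```python
def conv_weights(in_c, out_c, k):
    """Number of weights in a k*k convolution (bias terms ignored)."""
    return in_c * out_c * k * k


# 5*5 convolution applied directly to all 192 input channels
direct = conv_weights(192, 32, 5)
# 1*1 reduction to 16 channels, followed by the 5*5 convolution
reduced = conv_weights(192, 16, 1) + conv_weights(16, 32, 5)
print(direct, reduced)  # 153600 15872
```

Under these assumptions the reduced path needs roughly a tenth of the weights of the direct one.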
