# NiN
## NiN Block

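The input and output of a convolutional layer are usually four-dimensional arrays `(sample, channel, height, width)`, whereas the input and output of a fully connected layer are usually two-dimensional arrays `(sample, feature)`.

To **place a convolutional layer after a fully connected layer, the output of the fully connected layer must first be reshaped to four dimensions**.

NiN uses `1×1` convolutional layers in place of fully connected layers, so that spatial information passes naturally to the later layers.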
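To make the substitution concrete, here is a minimal sketch showing that a `1×1` convolution gives the same result as a fully connected layer applied to the channel vector at every pixel. The tensor sizes and the names `conv`, `fc`, `y_conv`, and `y_fc` are chosen only for illustration and do not come from the notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
in_channels, out_channels = 3, 5
x = torch.rand(2, in_channels, 4, 4)  # (sample, channel, height, width)

conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)
fc = nn.Linear(in_channels, out_channels)

# A 1x1 kernel of shape (out, in, 1, 1) holds the same numbers as a
# weight matrix of shape (out, in), so copy the parameters across.
with torch.no_grad():
    fc.weight.copy_(conv.weight.view(out_channels, in_channels))
    fc.bias.copy_(conv.bias)

y_conv = conv(x)
# Move channels last so that Linear acts on each pixel's channel vector,
# then move the channel axis back to position 1.
y_fc = fc(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

same = torch.allclose(y_conv, y_fc, atol=1e-5)
print(y_conv.shape, same)
```

The output stays four-dimensional, so no reshaping is needed before the next convolutional layer.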

Convolutional layers in one NiN block:

`Layer 1: kernel_size`

`Layer 2: 1x1`

`Layer 3: 1x1`

<img style="float: center;" src="./pics/4.nin.png" width=400 height=400>

In [1]:
import time
import torch
from torch import nn,optim

import sys
sys.path.append("..")
import d2lzh_pytorch as d2l
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print(device)

def nin_block(in_channels, out_channels, kernel_size, stride, padding):
    # One ordinary convolution followed by two 1x1 convolutions,
    # which act as per-pixel fully connected layers.
    blk = nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding),
        nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=1),
        nn.ReLU(),
        nn.Conv2d(out_channels, out_channels, kernel_size=1),
        nn.ReLU(),
    )
    return blk



cuda


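## NiN Model
NiN uses convolutional layers with window shapes of `11×11`, `5×5`, and `3×3`. Each NiN block is followed by a max pooling layer with a `stride of 2` and a `3×3 window`.

NiN removes the last three fully connected layers of AlexNet.

Instead, NiN uses `a NiN block whose number of output channels equals the number of label classes`, then applies a `global average pooling layer` that averages all elements in each channel and uses the result directly for classification.

Here, a global average pooling layer is an average pooling layer whose window shape equals the spatial shape of its input. The benefit of this design is that it can `significantly reduce the number of model parameters, which mitigates overfitting`.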
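The sketch below illustrates both points under stated assumptions. It first checks that pooling with a window equal to the input's spatial shape matches a plain per-channel mean (and PyTorch's `nn.AdaptiveAvgPool2d(1)`). It then counts the parameters of a NiN-style classification head and compares them with a fully connected head. The fully connected widths (4096, 4096, 10) on a `384 × 5 × 5` feature map are an assumption in the style of AlexNet, not something taken from this notebook, and `count_params` is a helper introduced only for this example.

```python
import torch
import torch.nn.functional as F
from torch import nn

x = torch.rand(2, 10, 5, 5)  # e.g. a (batch, classes, height, width) feature map

# Three equivalent ways to average each channel over its full spatial extent.
gap_window = F.avg_pool2d(x, kernel_size=x.size()[2:])
gap_mean = x.mean(dim=(2, 3), keepdim=True)
gap_adaptive = nn.AdaptiveAvgPool2d(1)(x)
print(gap_window.shape)

def count_params(module):
    """Total number of learnable scalars in a module (illustrative helper)."""
    return sum(p.numel() for p in module.parameters())

# NiN-style head: a 3x3 convolution plus two 1x1 convolutions, 384 -> 10 channels.
# The global average pooling that follows adds no parameters at all.
nin_head = nn.Sequential(
    nn.Conv2d(384, 10, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(10, 10, kernel_size=1), nn.ReLU(),
    nn.Conv2d(10, 10, kernel_size=1), nn.ReLU(),
)
nin_params = count_params(nin_head)

# ASSUMPTION: a hypothetical fully connected head with AlexNet-like widths,
# counted by arithmetic (weights + biases) instead of allocating the layers.
flat = 384 * 5 * 5
fc_params = (flat * 4096 + 4096) + (4096 * 4096 + 4096) + (4096 * 10 + 10)

print(nin_params, fc_params)
```

Under these assumed widths, the convolutional head is smaller by roughly three orders of magnitude.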

In [2]:
import torch.nn.functional as F
class GlobalAvgPool2d(nn.Module):
    def __init__(self):
        super(GlobalAvgPool2d,self).__init__()
    def forward(self,x):
        return F.avg_pool2d(x,kernel_size = x.size()[2:])
    
net = nn.Sequential(
    nin_block(1,96,kernel_size = 11,stride = 4,padding = 0),
    nn.MaxPool2d(kernel_size = 3 ,stride = 2),
    nin_block(96, 256, kernel_size=5, stride=1, padding=2),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nin_block(256, 384, kernel_size=3, stride=1, padding=1),
    nn.MaxPool2d(kernel_size=3, stride=2), 
    nn.Dropout(0.5),
    # The number of label classes is 10
    nin_block(384, 10, kernel_size=3, stride=1, padding=1),
    GlobalAvgPool2d(), 
    # Convert the four-dimensional output to two dimensions, with shape (batch size, 10)
    d2l.FlattenLayer()
)

In [3]:
X = torch.rand(1, 1, 224, 224)
for name, blk in net.named_children(): 
    X = blk(X)
    print(name, 'output shape: ', X.shape)

0 output shape:  torch.Size([1, 96, 54, 54])
1 output shape:  torch.Size([1, 96, 26, 26])
2 output shape:  torch.Size([1, 256, 26, 26])
3 output shape:  torch.Size([1, 256, 12, 12])
4 output shape:  torch.Size([1, 384, 12, 12])
5 output shape:  torch.Size([1, 384, 5, 5])
6 output shape:  torch.Size([1, 384, 5, 5])
7 output shape:  torch.Size([1, 10, 5, 5])
8 output shape:  torch.Size([1, 10, 1, 1])
9 output shape:  torch.Size([1, 10])
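The heights and widths printed above can be reproduced by hand with the usual output-size formula for convolution and pooling, `floor((n + 2·padding − kernel_size) / stride) + 1`. The following is a small sketch in plain Python. The helper `out_size` and the `layers` table are introduced here for illustration, with kernel sizes, strides, and paddings copied from the `net` definition. The `1×1` convolutions inside each block, the dropout layer, and the activations leave the spatial size unchanged, so they are omitted.

```python
def out_size(n, kernel_size, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (floor division)."""
    return (n + 2 * padding - kernel_size) // stride + 1

# (description, kernel_size, stride, padding) for the layers that can
# change height and width, in the order they appear in `net`.
layers = [
    ("nin_block 11x11", 11, 4, 0),
    ("max pool 3x3", 3, 2, 0),
    ("nin_block 5x5", 5, 1, 2),
    ("max pool 3x3", 3, 2, 0),
    ("nin_block 3x3", 3, 1, 1),
    ("max pool 3x3", 3, 2, 0),
    ("nin_block 3x3", 3, 1, 1),
]

size = 224  # input height and width
sizes = []
for name, k, s, p in layers:
    size = out_size(size, k, s, p)
    sizes.append(size)
    print(f"{name}: {size} x {size}")
```

Global average pooling then reduces the final `5 × 5` map to `1 × 1`, which matches the `[1, 10, 1, 1]` shape in the output.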


In [4]:
batch_size = 128
# If an "out of memory" error occurs, reduce batch_size or resize
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size, resize=224, root=r"D:\JupyterFile\Machine_Learning\Pytorch\Datasets\FashionMNIST")

lr, num_epochs = 0.002, 5
optimizer = torch.optim.Adam(net.parameters(), lr=lr)
d2l.train_ch5(net, train_iter, test_iter, batch_size, optimizer, device, num_epochs)

training on  cuda
epoch 1, loss 1.4093, train acc 0.463, test acc 0.755, time 95.1 sec
epoch 2, loss 0.5745, train acc 0.792, test acc 0.814, time 92.0 sec
epoch 3, loss 0.4624, train acc 0.832, test acc 0.845, time 91.1 sec
epoch 4, loss 0.4127, train acc 0.849, test acc 0.850, time 90.2 sec
epoch 5, loss 0.3731, train acc 0.864, test acc 0.864, time 90.4 sec
