### Deep Convolutional Neural Networks (AlexNet)

A neural network can classify images directly from their raw pixels; this approach is called end-to-end learning.

A cleaner dataset and more effective features can affect image classification results even more than the choice of machine learning model.

### AlexNet

![image.png](attachment:image.png)

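AlexNet uses 5 convolutional layers and 2 fully connected hidden layers, followed by a fully connected output layer. It also replaces the sigmoid activation function with the simpler ReLU activation function.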
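
One commonly cited reason ReLU trains more easily than sigmoid is that its gradient does not shrink for large positive inputs. The sketch below compares the two derivatives at a few points in plain Python. It is an illustration only, not part of the original notebook, and the helper names `sigmoid_grad` and `relu_grad` are made up here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)); never exceeds 0.25
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # d/dx max(0, x): 1 for positive inputs, 0 otherwise
    return 1.0 if x > 0 else 0.0

for x in (-10.0, -1.0, 1.0, 10.0):
    print(f"x={x:6.1f}  sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}")
```

The sigmoid derivative is close to zero at both ends of the range, while the ReLU derivative stays at 1 for every positive input.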

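The convolutional part of AlexNet:
* The first convolutional layer uses an 11 × 11 kernel, the second a 5 × 5 kernel, and the remaining layers 3 × 3 kernels. In addition, the first, second, and fifth convolutional layers are each followed by a max-pooling layer with a 3 × 3 window and a stride of 2.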
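
To see how these kernel and pooling sizes shrink the feature map, the sketch below applies the standard output-size formula, floor((n + 2p - k) / s) + 1, layer by layer. The strides and paddings are taken from the implementation later in this notebook, and the 224 × 224 input matches the `resize=224` used there; the helper name `out_size` is made up for this illustration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding), matching the layers in the implementation
layers = [
    ("conv1 11x11", 11, 4, 0),
    ("pool1 3x3", 3, 2, 0),
    ("conv2 5x5", 5, 1, 2),
    ("pool2 3x3", 3, 2, 0),
    ("conv3 3x3", 3, 1, 1),
    ("conv4 3x3", 3, 1, 1),
    ("conv5 3x3", 3, 1, 1),
    ("pool5 3x3", 3, 2, 0),
]

size = 224
sizes = []
for name, k, s, p in layers:
    size = out_size(size, k, s, p)
    sizes.append(size)
    print(f"{name}: {size} x {size}")

# The last conv block has 256 channels, so the flattened feature length is:
print("flattened features:", 256 * size * size)
```

The final 5 × 5 map with 256 channels gives the `256*5*5` input size of the first fully connected layer in the implementation.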

The fully connected part of AlexNet:
* The hidden layers use dropout to control the model complexity of the fully connected layers.
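
As a small illustration of what dropout does, the sketch below passes a vector of ones through `nn.Dropout(p=0.5)` in training mode and in evaluation mode. The vector length and the seed are arbitrary choices made for this example.

```python
import torch
from torch import nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()          # training mode: elements are zeroed at random
y_train = drop(x)

drop.eval()           # evaluation mode: dropout is a no-op
y_eval = drop(x)

print("training: fraction zeroed =", (y_train == 0).float().mean().item())
print("training: surviving value =", y_train.max().item())
print("eval: output unchanged =", torch.equal(y_eval, x))
```

In training mode, surviving elements are scaled by 1 / (1 - p) so the expected activation is unchanged. This is also why the evaluation function in this notebook switches the network to `eval()` before measuring accuracy.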

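AlexNet also relies on extensive image augmentation, such as flipping, cropping, and color changes, to further enlarge the dataset and mitigate overfitting.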
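
The three kinds of augmentation can be sketched with plain tensor operations, as below. This is a simplified illustration rather than the pipeline AlexNet actually used: the function name `augment`, the crop size, and the brightness range are assumptions made here, and in practice the ready-made transforms in `torchvision.transforms` are the more usual choice.

```python
import torch

def augment(img, crop=24):
    """Randomly flip, crop, and rescale the brightness of a (C, H, W) image in [0, 1]."""
    _, h, w = img.shape
    # Horizontal flip with probability 0.5
    if torch.rand(1).item() < 0.5:
        img = torch.flip(img, dims=[-1])
    # Random crop of size crop x crop
    top = torch.randint(0, h - crop + 1, (1,)).item()
    left = torch.randint(0, w - crop + 1, (1,)).item()
    img = img[:, top:top + crop, left:left + crop]
    # Simple color change: scale brightness by a random factor (assumed range)
    factor = torch.empty(1).uniform_(0.8, 1.2).item()
    return (img * factor).clamp(0.0, 1.0)

torch.manual_seed(0)
image = torch.rand(3, 28, 28)                   # stand-in for a real image
views = [augment(image) for _ in range(4)]      # four different views of one image
print([tuple(v.shape) for v in views])
```

Each call produces a slightly different view of the same image, which is how augmentation effectively enlarges the training set.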


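**AlexNet is structurally similar to LeNet, but it uses more convolutional layers and a larger parameter space to fit the large-scale ImageNet dataset. It marks the dividing line between shallow and deep neural networks.**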

### A simple implementation

In [1]:
import torch
from torch import nn
import utils
import time

In [2]:
# Load the data
batch_size=128
train_iter,test_iter=utils.load_data_fashion_mnist(batch_size,resize=224)


In [3]:
# Define the model
class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet,self).__init__()
        self.conv=nn.Sequential(
            nn.Conv2d(1,96,11,4),
            nn.ReLU(),
            nn.MaxPool2d(3,2),
            nn.Conv2d(96,256,5,1,2),
            nn.ReLU(),
            nn.MaxPool2d(3,2),
            nn.Conv2d(256,384,3,1,1),
            nn.ReLU(),
            nn.Conv2d(384,384,3,1,1),
            nn.ReLU(),
            nn.Conv2d(384,256,3,1,1),
            nn.ReLU(),
            nn.MaxPool2d(3,2)
        )
        self.fc=nn.Sequential(
            nn.Linear(256*5*5,4096),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096,4096),
            nn.ReLU(),
            nn.Dropout(0.5),
            # Output layer
            nn.Linear(4096,10),
        )
    def forward(self,img):
        feature=self.conv(img)
        output=self.fc(feature.view(img.shape[0],-1))
        return output

In [9]:
# Define the evaluation function
def evaluate_accuracy(data_iter,net,device=None):
    # Default to the device that holds the model parameters
    if device is None and isinstance(net,nn.Module):
        device=list(net.parameters())[0].device
    acc_sum,n=0.0,0
    net.eval()  # evaluation mode: disables dropout
    with torch.no_grad():
        for X,y in data_iter:
            acc_sum+=(net(X.to(device)).argmax(dim=1)==y.to(device)).float().sum().cpu().item()
            n+=y.shape[0]
    net.train()  # switch back to training mode
    return acc_sum/n

In [10]:
# Define the training function
def train_ch5(net,train_iter,test_iter,num_epochs,device,optimizer):
    net=net.to(device)
    loss=nn.CrossEntropyLoss()
    print('train on',device)
    for epoch in range(num_epochs):
        train_l_sum,train_acc_sum,n,batch_count,start=0.0,0.0,0,0,time.time()
        for X,y in train_iter:
            X=X.to(device)
            y=y.to(device)
            y_hat=net(X)
            l=loss(y_hat,y)
            optimizer.zero_grad()
            l.backward()
            optimizer.step()
            
            train_l_sum+=l.cpu().item()
            train_acc_sum+=(y_hat.argmax(dim=1)==y).float().sum().cpu().item()
            n+=y.shape[0]
            batch_count+=1
        test_acc=evaluate_accuracy(test_iter,net)
        print('epoch %d,train loss %.4f,train acc %.4f,test acc %.4f,time %.1f sec'
              %(epoch+1,train_l_sum/batch_count,train_acc_sum/n,test_acc,time.time()-start))
        

In [11]:
# Train
lr,num_epochs=0.001,5
net=AlexNet()
optimizer=torch.optim.Adam(net.parameters(),lr=lr)
device=torch.device('cuda' if torch.cuda.is_available() else 'cpu')

train_ch5(net,train_iter,test_iter,num_epochs,device,optimizer)

train on cuda
epoch 1,train loss 0.6138,train acc 0.7664,test acc 0.8431,time 56.7 sec
epoch 2,train loss 0.3416,train acc 0.8723,test acc 0.8837,time 55.5 sec
epoch 3,train loss 0.2991,train acc 0.8880,test acc 0.8991,time 55.1 sec
epoch 4,train loss 0.2694,train acc 0.8993,test acc 0.9042,time 55.7 sec
epoch 5,train loss 0.2505,train acc 0.9073,test acc 0.9040,time 56.1 sec
