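AlexNet consists of eight layers: five convolutional layers, two fully connected hidden layers, and one fully connected output layer. It also uses ReLU rather than sigmoid as its activation function.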
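
One commonly cited reason for preferring ReLU is that its gradient stays at 1 for every positive input, whereas the sigmoid gradient shrinks toward zero on both tails. The sketch below illustrates this with plain Python; the helper names `sigmoid_grad` and `relu_grad` are illustrative and not part of the notebook.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)), at most 0.25
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of max(0, x); the value at exactly 0 is a convention (0 here)
    return 1.0 if x > 0 else 0.0

for x in (-10.0, -1.0, 0.5, 1.0, 10.0):
    print(f"x={x:6.1f}  sigmoid'={sigmoid_grad(x):.6f}  relu'={relu_grad(x):.1f}")
```

For large positive or negative inputs the sigmoid gradient is close to zero, which can slow learning in deep stacks of layers, while the ReLU gradient is either 0 or 1.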

In [1]:
import torch
from torch import nn
from d2l import torch as d2l

In [2]:
net = nn.Sequential(nn.Conv2d(1, 96, kernel_size=11, stride=4, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=3, stride=2),

                    nn.Conv2d(96, 256, kernel_size=5, padding=2),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=3, stride=2),

                    nn.Conv2d(256, 384, kernel_size=3, padding=1),
                    nn.ReLU(),
                    nn.Conv2d(384, 384, kernel_size=3, padding=1),
                    nn.ReLU(),
                    nn.Conv2d(384, 256, kernel_size=3, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(kernel_size=3, stride=2),

                    nn.Flatten(),
                    # 6400 = 256 channels * 5 * 5 positions left after the last pooling layer
                    nn.Linear(6400, 4096),
                    nn.ReLU(),
                    # Dropout layers are used to reduce overfitting
                    nn.Dropout(p=0.5),
                    nn.Linear(4096, 4096),
                    nn.ReLU(),
                    nn.Dropout(p=0.5),
                    # Output layer. Fashion-MNIST has 10 classes, rather than the 1000 in the paper
                    nn.Linear(4096, 10)
                    )

# Create a single-channel input with height and width 224 to observe the output shape of each layer
X=torch.randn(1,1,224,224)
for layer in net:
    X=layer(X)
    print(layer.__class__.__name__,'output shape:\t ',X.shape)

Conv2d output shape:	  torch.Size([1, 96, 54, 54])
ReLU output shape:	  torch.Size([1, 96, 54, 54])
MaxPool2d output shape:	  torch.Size([1, 96, 26, 26])
Conv2d output shape:	  torch.Size([1, 256, 26, 26])
ReLU output shape:	  torch.Size([1, 256, 26, 26])
MaxPool2d output shape:	  torch.Size([1, 256, 12, 12])
Conv2d output shape:	  torch.Size([1, 384, 12, 12])
ReLU output shape:	  torch.Size([1, 384, 12, 12])
Conv2d output shape:	  torch.Size([1, 384, 12, 12])
ReLU output shape:	  torch.Size([1, 384, 12, 12])
Conv2d output shape:	  torch.Size([1, 256, 12, 12])
ReLU output shape:	  torch.Size([1, 256, 12, 12])
MaxPool2d output shape:	  torch.Size([1, 256, 5, 5])
Flatten output shape:	  torch.Size([1, 6400])
Linear output shape:	  torch.Size([1, 4096])
ReLU output shape:	  torch.Size([1, 4096])
Dropout output shape:	  torch.Size([1, 4096])
Linear output shape:	  torch.Size([1, 4096])
ReLU output shape:	  torch.Size([1, 4096])
Dropout output shape:	  torch.Size([1, 4096])
Linear output shape:	  torch.Size([1, 10])
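
The spatial sizes printed above can be reproduced by hand with the usual output-size formula for convolution and pooling layers, floor((n + 2p - k) / s) + 1. The following is a minimal sketch in plain Python; `out_size` and the `stages` table are illustrative names, with the kernel, stride, and padding values copied from the network definition.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1

# (name, out_channels or None to keep the channel count, kernel, stride, padding)
stages = [
    ("Conv2d", 96, 11, 4, 1),
    ("MaxPool2d", None, 3, 2, 0),
    ("Conv2d", 256, 5, 1, 2),
    ("MaxPool2d", None, 3, 2, 0),
    ("Conv2d", 384, 3, 1, 1),
    ("Conv2d", 384, 3, 1, 1),
    ("Conv2d", 256, 3, 1, 1),
    ("MaxPool2d", None, 3, 2, 0),
]

channels, side = 1, 224  # single-channel 224 x 224 input
shapes = []
for name, out_channels, kernel, stride, padding in stages:
    if out_channels is not None:
        channels = out_channels
    side = out_size(side, kernel, stride, padding)
    shapes.append((name, channels, side))
    print(f"{name:10s} -> ({channels}, {side}, {side})")

flat_features = channels * side * side
print("Flatten    ->", flat_features)
```

The last pooling layer leaves 256 channels of size 5 x 5, which is where the 6400 input features of the first `nn.Linear` layer come from.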

In [3]:
batch_size,lr,num_epochs=128,0.01,10

# device=torch.device('mps')
device=torch.device('cpu')

# Fashion-MNIST images have a resolution of 28 x 28 pixels, lower than ImageNet images.
# Upsample them to 224 x 224
train_iter,test_iter=d2l.load_data_fashion_mnist(batch_size,resize=224)
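
Resizing is needed for more than matching the `nn.Linear(6400, 4096)` layer: with the kernel sizes and strides used here, a raw 28 x 28 image shrinks below the pooling window before it reaches the end of the convolutional stack. The sketch below checks this with plain Python; `STAGES` and `final_side` are illustrative names, and the tuples are copied from the network definition.

```python
# (kernel, stride, padding) for each convolution/pooling stage of the network
STAGES = [(11, 4, 1), (3, 2, 0), (5, 1, 2), (3, 2, 0),
          (3, 1, 1), (3, 1, 1), (3, 1, 1), (3, 2, 0)]

def final_side(side):
    """Return the final feature-map side, or None if a stage cannot be applied."""
    for kernel, stride, padding in STAGES:
        if side + 2 * padding < kernel:
            return None  # window is larger than the (padded) input
        side = (side + 2 * padding - kernel) // stride + 1
    return side

for side in (28, 224):
    print(side, "->", final_side(side))
```

For a 28 x 28 input the feature map is already 2 x 2 when it reaches the second 3 x 3 pooling layer, so that layer cannot be applied; PyTorch would be expected to raise an error at that point.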

In [None]:
d2l.train_ch6(net,train_iter,test_iter,num_epochs,lr,device)

training on cpu
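
To get a feel for why training this network on a CPU is slow, one can estimate its size from the layer definitions. The sketch below counts parameters and multiply-accumulate operations for one forward pass of a single image. It is a rough estimate that ignores ReLU, pooling, dropout, and the backward pass; the output sides are taken from the shapes printed earlier.

```python
# (in_channels, out_channels, kernel, output side) for each Conv2d layer
convs = [(1, 96, 11, 54), (96, 256, 5, 26), (256, 384, 3, 12),
         (384, 384, 3, 12), (384, 256, 3, 12)]
# (in_features, out_features) for each Linear layer
linears = [(6400, 4096), (4096, 4096), (4096, 10)]

# Weights plus one bias per output channel/feature
conv_params = sum(cin * cout * k * k + cout for cin, cout, k, _ in convs)
fc_params = sum(fin * fout + fout for fin, fout in linears)

# One multiply-accumulate per weight per output position
conv_macs = sum(cin * cout * k * k * side * side for cin, cout, k, side in convs)
fc_macs = sum(fin * fout for fin, fout in linears)

total_params = conv_params + fc_params
total_macs = conv_macs + fc_macs
print(f"parameters: {total_params:,} ({fc_params / total_params:.0%} in Linear layers)")
print(f"multiply-accumulates per image: {total_macs:,}")
```

Under these assumptions the network has about 46.8 million parameters, roughly 92% of them in the fully connected layers, and needs on the order of 0.94 billion multiply-accumulates per image in the forward pass alone.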


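- AlexNet's architecture resembles LeNet's, but it uses more convolutional layers and many more parameters to fit the large-scale ImageNet dataset.

- Today AlexNet has been surpassed by more effective architectures, but it was a key step from shallow to deep networks.

- Although AlexNet's code is only a few lines longer than LeNet's, it took the academic community many years to accept deep learning and to make use of its excellent experimental results. This was also due to the lack of efficient computational tools.

- Dropout, ReLU, and preprocessing were the other key steps in improving performance on computer vision tasks.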
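
As an illustration of the dropout step mentioned above, here is a minimal sketch of inverted dropout on a plain Python list. It is not the implementation behind `nn.Dropout`; the function `dropout` and its `rng` parameter are illustrative. It does mirror the documented behavior: during training each element is zeroed with probability `p` and the survivors are scaled by 1 / (1 - p), while in evaluation mode the input passes through unchanged.

```python
import random

def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout on a list of floats (illustrative sketch)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("this sketch requires 0 <= p < 1")
    if not training or p == 0.0:
        return list(values)
    keep_scale = 1.0 / (1.0 - p)  # keeps the expected value unchanged
    return [0.0 if rng.random() < p else v * keep_scale for v in values]

rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 8
print(dropout(activations, p=0.5, rng=rng))          # each entry is 0.0 or 2.0
print(dropout(activations, p=0.5, training=False))   # unchanged in evaluation mode
```

Because the scaling happens at training time, no rescaling is needed at inference time.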