### AlexNet

[ImageNet Classification with Deep Convolutional Neural Networks](https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf)

*2024/11/12*

AlexNet was a landmark for deep learning, particularly in image recognition. It was the first deep neural network to achieve a substantial breakthrough on the ImageNet dataset: it won ILSVRC 2012 with a top-5 test error of 15.3%, against 26.2% for the second-best entry. The model demonstrated the effectiveness of deep convolutional neural networks with an 8-layer architecture of 5 convolutional layers followed by 3 fully connected layers.
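To make the 5-convolution feature extractor concrete, the sketch below traces the spatial size of a 224x224 input through each convolution and pooling stage using the standard formula `floor((size + 2*padding - kernel) / stride) + 1`. The kernel, stride, and padding values are assumed from torchvision's `alexnet`, which differs slightly from the two-GPU layout in the paper; the helper names are illustrative only.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding), assumed from torchvision's alexnet
FEATURE_LAYERS = [
    ("conv1", 11, 4, 2),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]


def trace_sizes(input_size=224):
    """Return the height/width of the feature map after each layer."""
    sizes = {}
    size = input_size
    for name, kernel, stride, padding in FEATURE_LAYERS:
        size = out_size(size, kernel, stride, padding)
        sizes[name] = size
    return sizes


sizes = trace_sizes()
for name, size in sizes.items():
    print(f"{name}: {size}x{size}")
```

Under these assumptions the first convolution alone shrinks the input from 224 to 55 pixels per side, and the final pooled map is 6x6, which is what the first fully connected layer receives after flattening.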

AlexNet popularized the **ReLU** activation function, which trains much faster than saturating activations such as tanh, and used **Dropout** in the fully connected layers to reduce overfitting. It was also one of the first models to show that training a large convolutional network on GPUs was practical; the original network was split across two GPUs.

From a modern perspective, however, AlexNet has several limitations. Its first layer uses a large 11x11 convolutional kernel with a stride of 4, which downsamples the input aggressively and can discard fine detail, especially at the edges of images. Large kernels also need more weights per filter than the stacks of 3x3 kernels favored by later architectures. The main source of parameters, though, is the three large fully connected layers. Together these factors lead to high computational and storage costs.
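To put the parameter-count remarks in numbers, here is a small sketch that counts weights and biases per layer by hand. The channel and feature sizes are assumed from torchvision's `alexnet` (64/192/384/256/256 channels and a 1000-class output), not from the paper's original two-GPU configuration, and the helper names are hypothetical.

```python
def conv_params(c_in, c_out, kernel):
    """Weights plus biases of a square 2D convolution."""
    return c_in * c_out * kernel * kernel + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# (in_channels, out_channels, kernel), assumed from torchvision's alexnet
CONV_LAYERS = [(3, 64, 11), (64, 192, 5), (192, 384, 3), (384, 256, 3), (256, 256, 3)]
# (in_features, out_features); the first layer sees a flattened 256x6x6 map
FC_LAYERS = [(256 * 6 * 6, 4096), (4096, 4096), (4096, 1000)]

conv_total = sum(conv_params(*layer) for layer in CONV_LAYERS)
fc_total = sum(linear_params(*layer) for layer in FC_LAYERS)
total = conv_total + fc_total

print(f"conv layers: {conv_total:,}")
print(f"fc layers:   {fc_total:,}")
print(f"fc share:    {fc_total / total:.1%}")
```

With these sizes the 11x11 first layer accounts for only 23,296 parameters, while the fully connected layers hold roughly 96% of the total, so most of the storage cost comes from the classifier rather than the convolutions.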

*Code*

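```python
import torch
from torchvision.models import alexnet
from torchsummary import summary

# Use the GPU when one is available; torchsummary accepts "cuda" or "cpu".
device = "cuda" if torch.cuda.is_available() else "cpu"
model = alexnet().to(device)

# input_size excludes the batch dimension: (channels, height, width).
summary(model, (3, 224, 224), device=device)
```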

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 55, 55]          23,296
              ReLU-2           [-1, 64, 55, 55]               0
         MaxPool2d-3           [-1, 64, 27, 27]               0
            Conv2d-4          [-1, 192, 27, 27]         307,392
              ReLU-5          [-1, 192, 27, 27]               0
         MaxPool2d-6          [-1, 192, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         663,936
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 256, 13, 13]         884,992
             ReLU-10          [-1, 256, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         590,080
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 25