## The architecture of AlexNet

AlexNet has 8 learned layers: **5 convolutional layers** followed by **3 fully-connected layers**. To make training easier, AlexNet uses the **ReLU** activation function instead of the sigmoid.
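
One commonly cited reason ReLU is easier to train with: the sigmoid's gradient never exceeds 0.25 and shrinks toward zero for large inputs, while ReLU's gradient is exactly 1 for any positive input. The following is a small illustrative sketch (the input values are arbitrary, chosen only for demonstration):

```python
import torch

# Arbitrary pre-activation values, chosen only for illustration.
x = torch.tensor([-10.0, -1.0, 1.0, 10.0], requires_grad=True)

# Gradient of sigmoid(x) with respect to x.
torch.sigmoid(x).sum().backward()
sigmoid_grad = x.grad.clone()

# Reset the accumulated gradient, then do the same for ReLU.
x.grad.zero_()
torch.relu(x).sum().backward()
relu_grad = x.grad.clone()

print("sigmoid grad:", sigmoid_grad)  # tiny at |x| = 10
print("relu grad:   ", relu_grad)     # 0 for negative inputs, 1 for positive
```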

AlexNet controls the model complexity of the fully-connected layers with **dropout**.
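
As a rough illustration of what `nn.Dropout(p=0.5)` does, the sketch below applies it to a tensor of ones. In training mode each element is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`; in evaluation mode the layer is the identity. The tensor size and seed are arbitrary choices for the demonstration.

```python
import torch
from torch import nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability
drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

drop.train()
y_train = drop(x)  # each element is either 0 or 1 / (1 - 0.5) = 2

drop.eval()
y_eval = drop(x)   # dropout is disabled at evaluation time

print(y_train)
print(y_eval)
```

This is why a model should be switched with `model.train()` before training and `model.eval()` before testing.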

![](alexnet.svg)

In [1]:
import torch
from torch import nn
import torchvision

## AlexNet Model

In [2]:
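alexnet = nn.Sequential(
    # 1 input channel (Fashion-MNIST is grayscale); the large 11x11 window
    # with stride 4 quickly shrinks the 224x224 input
    nn.Conv2d(1, 96, kernel_size=11, stride=4, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    # Smaller 5x5 window; padding=2 keeps height and width unchanged
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    # Three consecutive 3x3 convolutions; padding=1 keeps the size unchanged
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    # 256 channels * 5 * 5 = 6400 features; dropout follows each hidden FC layer
    nn.Linear(6400, 4096), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(p=0.5),
    # 10 output classes for Fashion-MNIST
    nn.Linear(4096, 10))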

In [3]:
X = torch.randn(1, 1, 224, 224)
for layer in alexnet:
    X = layer(X)
    print(layer.__class__.__name__, 'output shape:\t', X.shape)

Conv2d output shape:	 torch.Size([1, 96, 54, 54])
ReLU output shape:	 torch.Size([1, 96, 54, 54])
MaxPool2d output shape:	 torch.Size([1, 96, 26, 26])
Conv2d output shape:	 torch.Size([1, 256, 26, 26])
ReLU output shape:	 torch.Size([1, 256, 26, 26])
MaxPool2d output shape:	 torch.Size([1, 256, 12, 12])
Conv2d output shape:	 torch.Size([1, 384, 12, 12])
ReLU output shape:	 torch.Size([1, 384, 12, 12])
Conv2d output shape:	 torch.Size([1, 384, 12, 12])
ReLU output shape:	 torch.Size([1, 384, 12, 12])
Conv2d output shape:	 torch.Size([1, 256, 12, 12])
ReLU output shape:	 torch.Size([1, 256, 12, 12])
MaxPool2d output shape:	 torch.Size([1, 256, 5, 5])
Flatten output shape:	 torch.Size([1, 6400])
Linear output shape:	 torch.Size([1, 4096])
ReLU output shape:	 torch.Size([1, 4096])
Dropout output shape:	 torch.Size([1, 4096])
Linear output shape:	 torch.Size([1, 4096])
ReLU output shape:	 torch.Size([1, 4096])
Dropout output shape:	 torch.Size([1, 4096])
Linear output shape:	 torch.Size([1, 10])
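
The spatial sizes printed above follow the standard formula for convolution and pooling layers, $\lfloor (n + 2p - k) / s \rfloor + 1$, where $n$ is the input size, $k$ the kernel size, $s$ the stride, and $p$ the padding. A minimal sketch that retraces the numbers in plain Python (the helper name `out_size` is hypothetical, introduced only for this check):

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor rounding)."""
    return (n + 2 * padding - kernel) // stride + 1

n = 224
n = out_size(n, 11, stride=4, padding=1)  # first conv:  224 -> 54
n = out_size(n, 3, stride=2)              # max pool:     54 -> 26
n = out_size(n, 5, padding=2)             # second conv:  26 -> 26
n = out_size(n, 3, stride=2)              # max pool:     26 -> 12
n = out_size(n, 3, padding=1)             # 3x3 convs:    12 -> 12
n = out_size(n, 3, stride=2)              # max pool:     12 -> 5

flat = 256 * n * n                        # 256 channels * 5 * 5
print(n, flat)
```

The final product, 6400, is the input width of the first `nn.Linear` layer.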

## Reading the dataset
We again use the Fashion-MNIST dataset to train and test the AlexNet model.

Because Fashion-MNIST images are only $28 \times 28$, we adjust our dataset loader to resize them to $224 \times 224$, the input size this network is built for.

In [4]:
import sys
sys.path.append("../dlutils")
from dataset import load_fashion_mnist_dataset

batch_size = 256
train_loader, test_loader = load_fashion_mnist_dataset(batch_size, resize=224)

## Optimizer

In [8]:
lr = 0.05
trainer = torch.optim.SGD(alexnet.parameters(), lr=lr)
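
With no momentum or weight decay, `torch.optim.SGD` applies the plain update $w \leftarrow w - \text{lr} \cdot \nabla w$. A small sketch on a toy parameter (the values and the quadratic objective are arbitrary, chosen so the gradient is easy to check by hand):

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.05)

objective = (w ** 2).sum()
objective.backward()                   # gradient is 2w = [2, -4]

# Compute the expected result by hand before the optimizer changes w.
expected = w.detach() - 0.05 * w.grad  # [0.9, -1.8]
opt.step()

print(w.detach(), expected)
```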

## Loss

In [6]:
loss = torch.nn.CrossEntropyLoss()
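
`CrossEntropyLoss` expects raw, unnormalized scores (logits), which is why the model ends with a plain `nn.Linear(4096, 10)` and no softmax layer. The sketch below, using random logits and made-up labels, checks that it matches a log-softmax followed by the negative log-likelihood loss:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)                  # arbitrary seed
logits = torch.randn(4, 10)           # 4 samples, 10 classes
labels = torch.tensor([0, 3, 9, 1])   # made-up class indices

ce = torch.nn.CrossEntropyLoss()(logits, labels)
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(ce.item(), manual.item())
```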

## Training

In [None]:
from train import train_3ch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
alexnet.to(device)
loss.to(device)
num_epochs = 10
train_3ch(alexnet, loss, num_epochs, train_loader, trainer, test_loader, device)

epoch 0, training loss 0.002482, training accuracy 0.773767, testing loss 0.002657, testing accuracy 0.765400
epoch 1, training loss 0.002101, training accuracy 0.801417, testing loss 0.002350, testing accuracy 0.788900
epoch 2, training loss 0.001853, training accuracy 0.816150, testing loss 0.002068, testing accuracy 0.802100
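
The `train_3ch` helper is imported from the local `train` module and is not shown. As a rough sketch of what one epoch of such a loop usually involves, the hypothetical function `train_one_epoch` below runs the forward pass, backpropagation, and the optimizer step for every batch. Its name, signature, and the way it averages the loss are assumptions, so its numbers need not match the scale of the log above.

```python
import torch
from torch import nn

def train_one_epoch(model, loss_fn, optimizer, loader, device):
    """Hypothetical single-epoch loop; returns (mean loss, accuracy)."""
    model.train()  # enable dropout during training
    total_loss, correct, count = 0.0, 0, 0
    for X, y in loader:
        X, y = X.to(device), y.to(device)
        optimizer.zero_grad()          # clear gradients from the last step
        out = model(X)
        batch_loss = loss_fn(out, y)
        batch_loss.backward()
        optimizer.step()
        total_loss += batch_loss.item() * y.size(0)
        correct += (out.argmax(dim=1) == y).sum().item()
        count += y.size(0)
    return total_loss / count, correct / count

# Tiny stand-in model and random data, just to exercise the loop.
tiny_model = nn.Linear(4, 3)
data = [(torch.randn(8, 4), torch.randint(0, 3, (8,))) for _ in range(2)]
opt = torch.optim.SGD(tiny_model.parameters(), lr=0.05)
mean_loss, acc = train_one_epoch(
    tiny_model, nn.CrossEntropyLoss(), opt, data, torch.device("cpu"))
print(mean_loss, acc)
```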
