# AlexNet
This notebook was motivated by:
[1] Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton. ‘ImageNet Classification with Deep Convolutional Neural Networks’. In: Advances in Neural Information Processing Systems 25 (2012). URL: https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf.

https://papers.nips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html

Implementation: Oleh Bakumenko, University of Duisburg-Essen



In [4]:
import sys
sys.path.append("../")
import os
import numpy as np
import time
import matplotlib.pyplot as plt
import torch, torch.nn as nn
import torchvision, torchvision.transforms as tt
from torchsummary import summary
from torch.multiprocessing import Manager
torch.multiprocessing.set_sharing_strategy("file_system")
from pathlib import Path

from utility import utils as uu
from utility.eval import evaluate_classifier_model
from utility.confusion_matrix import calculate_confusion_matrix
from utility.trainLoopClassifier import *
from utility.plotImageModel import *

# Data augmentations
Data augmentation is a technique used to artificially increase the size of a dataset by transforming existing data points to create new, similar instances. This can help prevent overfitting in machine learning models, as well as improve their ability to generalize to unseen data. Common types of data augmentation include flipping, rotation, scaling, and adding noise to images.
We can build the list of augmentations with the `torchvision.transforms` module.

The random crop to 224×224 pixels is used for AlexNet only and is left out for the other models.

In [None]:
data_augments = torchvision.transforms.Compose([ 
    torchvision.transforms.RandomHorizontalFlip(p = .5),
    torchvision.transforms.RandomVerticalFlip(p = .5),
    torchvision.transforms.ColorJitter(brightness=(0.5,1.5), contrast=(1), hue=(-0.1,0.1)),
    torchvision.transforms.RandomCrop((224, 224)),
    ])
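To make the cropping and flipping steps concrete, here is a minimal plain-Python sketch of what a random 224×224 crop and a horizontal flip do to a 256×256 image. It works on nested lists instead of tensors, and the helper names `random_crop` and `horizontal_flip` are illustrative only; they are not part of torchvision.

```python
import random

def random_crop(image, size, rng):
    """Cut a size x size window at a uniformly random position."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)   # randint bounds are inclusive
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]

def horizontal_flip(image):
    """Mirror every row left to right."""
    return [row[::-1] for row in image]

rng = random.Random(0)
# Dummy 256 x 256 "image" whose pixel value encodes its own position
image = [[r * 256 + c for c in range(256)] for r in range(256)]

crop = random_crop(image, 224, rng)
flipped = horizontal_flip(crop)

crop_shape = (len(crop), len(crop[0]))
# Number of distinct crop positions: 33 offsets per axis
n_positions = (256 - 224 + 1) ** 2
```

Every epoch therefore sees a slightly different window of each slice, which is where the regularising effect of the crop comes from.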


Load the dataset with the helper class from `utility.utils`.

In [None]:
cur_path = Path("plots_and_graphs.ipynb")
parent_dir = cur_path.parent.absolute()
masterThesis_folder = str(parent_dir.parent.absolute())+'/'
data_dir = masterThesis_folder+"data/Clean_LiTS/"

cache_me = False
if cache_me is True:
    cache_mgr = Manager()
    cache_mgr.data = cache_mgr.dict()
    cache_mgr.cached = cache_mgr.dict()
    for k in ["train", "val", "test"]:
        cache_mgr.data[k] = cache_mgr.dict()
        cache_mgr.cached[k] = False
# function from utils, credit: Institute for Artificial Intelligence in Medicine. url: https://mml.ikim.nrw/
# dataset outputs a tensor image (dimensions [1,256,256]) and a tensor target (0, 1 or 2)

ds = uu.LiTS_Classification_Dataset(
    data_dir=data_dir,
    transforms=data_augments,
    verbose=True,
    cache_data=cache_me,
    cache_mgr=(cache_mgr if cache_me is True else None),
    debug=True,
)

### Hyperparameters

In [None]:
# Default settings
batch_size = 32
learning_rate = 1e-4
weight_decay = 5e-5
epochs = 15
run_name = "AlexNet"
device = ("cuda" if torch.cuda.is_available() else "cpu")
time_me  = True

The `torch.utils.data.DataLoader` class wraps a dataset and serves it in batches, optionally loading the samples in parallel worker processes. We specify the dataset, the batch size (32 here), and whether the data is reshuffled at the start of every epoch. Further parameters, such as `num_workers`, `pin_memory` and `prefetch_factor`, tune the loading process.

In [None]:
# Dataloader
dl = torch.utils.data.DataLoader(
    dataset = ds, 
    batch_size = batch_size, 
    num_workers = 4, 
    shuffle = True, 
    drop_last = False, 
    pin_memory = True,
    persistent_workers = (not cache_me),
    prefetch_factor = 1
    )
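The effect of `batch_size` and `drop_last` on the number of training steps can be checked with a little arithmetic. The sketch below is plain Python, and the dataset size of 1000 samples is a made-up value for illustration; the real size depends on the contents of `data_dir`.

```python
import math

def batches_per_epoch(n_samples, batch_size, drop_last):
    """Number of batches a loader yields in one pass over the data."""
    if drop_last:
        # An incomplete final batch is discarded
        return n_samples // batch_size
    # An incomplete final batch is kept
    return math.ceil(n_samples / batch_size)

n_samples = 1000   # hypothetical dataset size
batch_size = 32

kept = batches_per_epoch(n_samples, batch_size, drop_last=False)
dropped = batches_per_epoch(n_samples, batch_size, drop_last=True)
# Size of the final, smaller batch when drop_last is False
last_batch = n_samples - dropped * batch_size
```

With `drop_last = False`, as used above, the final batch may be smaller than `batch_size`, so any code that assumes a fixed batch dimension should read it from the tensor instead.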

# AlexNet

AlexNet is a deep convolutional neural network architecture introduced in 2012 by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton. It was one of the first successful models to use deep convolutional neural networks for image classification and won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012. AlexNet consists of eight learned layers, five convolutional and three fully connected, and uses ReLU activation functions together with max-pooling layers that reduce the spatial size of the feature maps.

Key points:
1. ReLU activation function instead of tanh.
2. Dropout to reduce overfitting.
3. Main idea: convolution followed by max pooling, with these layers stacked; dropout is applied to the fully connected layers.
4. Parallel computation on multiple GPUs (not included here).
5. Local Response Normalization, which is rarely used in later networks; see `torch.nn.LocalResponseNorm`.
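As a rough illustration of point 5, the sketch below normalises the activations of a single pixel across neighbouring channels. It is written in plain Python and follows the formula given in the PyTorch documentation for `torch.nn.LocalResponseNorm` as I understand it; the default values for `alpha`, `beta` and `k` are assumed to mirror the PyTorch defaults, and the function name is illustrative.

```python
def local_response_norm(channels, size=5, alpha=1e-4, beta=0.75, k=1.0):
    """Normalise one pixel's activations across neighbouring channels.

    channels: list of activations, one per channel, at a fixed (x, y).
    """
    n = len(channels)
    out = []
    for c, a in enumerate(channels):
        # Window of up to `size` channels centred on channel c
        lo = max(0, c - size // 2)
        hi = min(n - 1, c + (size - 1) // 2)
        sq_sum = sum(channels[j] ** 2 for j in range(lo, hi + 1))
        # Strong neighbours enlarge the denominator and damp the response
        out.append(a / (k + alpha / size * sq_sum) ** beta)
    return out

normalized = local_response_norm([1.0, 2.0, 3.0])
```

Because the denominator is at least `k ** beta`, each activation can only shrink in magnitude when `k = 1`, and it shrinks more when the neighbouring channels respond strongly.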

Overall architecture and channel sizes can be found in [1].
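The feature-map sizes of this network can be worked out by hand with the standard output-size formula for convolution and pooling, `floor((n + 2p - k) / s) + 1`. The plain-Python sketch below traces a 224×224 input through the kernel, stride and padding settings used in the implementation in this notebook; the helper names `conv_out` and `conv_params` are illustrative.

```python
def conv_out(n, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

def conv_params(c_in, c_out, kernel):
    """Weights plus one bias per output channel of a square Conv2d."""
    return c_out * (c_in * kernel * kernel + 1)

# (name, kernel, stride, padding) in forward order
layers = [
    ("conv1", 11, 4, 1),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 1),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool4", 3, 2, 0),
]

size = 224
trace = {}
for name, kernel, stride, padding in layers:
    size = conv_out(size, kernel, stride, padding)
    trace[name] = size

# 256 channels leave the last pooling layer
flat_features = 256 * size * size

conv1_params = conv_params(1, 96, 11)
conv2_params = conv_params(96, 256, 5)
```

The trace gives 54, 26, 24, 11, 11, 11, 11 and 5, so the flattened vector has 256 · 5 · 5 = 6400 entries, which is the input size of the first fully connected layer.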

In [5]:
# AlexNet Class
#       - constructs a convolutional neural network as described in [1]
# Input:    Tensor: [Batch,1,Height,Width]
# Output:   Tensor: [Batch,3]
class AlexNetMLMed(torch.nn.Module):
    def __init__(self):
        super(AlexNetMLMed, self).__init__()
        self.relu = torch.nn.ReLU()
        self.conv1 = torch.nn.Conv2d(in_channels = 1, out_channels = 96, kernel_size = (11,11), stride = (4,4), padding=1)        
        self.pool1 = torch.nn.MaxPool2d(kernel_size = 3, stride = 2)
        self.responseNorm = torch.nn.LocalResponseNorm(5)
        self.conv2 = torch.nn.Conv2d(in_channels = 96, out_channels = 256, kernel_size = (5, 5), stride = (1,1), padding=1)
        self.pool2 = torch.nn.MaxPool2d(kernel_size = 3, stride = 2)

        self.conv3 = torch.nn.Conv2d(in_channels = 256, out_channels = 384, kernel_size = (3, 3), stride = (1,1), padding=1)
        
        self.conv4 = torch.nn.Conv2d(in_channels = 384, out_channels = 384, kernel_size = (3, 3), stride = (1,1), padding=1)

        self.conv5 = torch.nn.Conv2d(in_channels = 384, out_channels = 256, kernel_size = (3, 3), stride = (1,1), padding=1)

        self.pool4 = torch.nn.MaxPool2d(kernel_size = 3, stride = 2)
        
        self.dropout = torch.nn.Dropout(p=0.5) 

        self.fc1 = nn.Linear(6400,4096)
        self.fc2 = nn.Linear(4096,4096)
        self.fc3 = nn.Linear(4096,3)

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.pool1(out)
        out = self.responseNorm(out)

        out = self.relu(self.conv2(out))
        out = self.pool2(out)
        out = self.responseNorm(out)


        out = self.relu(self.conv3(out))
        out = self.relu(self.conv4(out))
        out = self.relu(self.conv5(out))
        out = self.pool4(out)

        out = out.flatten(start_dim=1)

        out = self.dropout(out)
        out = self.relu(self.fc1(out))
        out = self.dropout(out)
        out = self.relu(self.fc2(out))
        out = self.dropout(out)
        out = self.fc3(out)  # raw logits: nn.CrossEntropyLoss applies log-softmax itself
        
        return out

In [6]:
model = AlexNetMLMed()
summary(model, (1, 224, 224) )

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 96, 54, 54]          11,712
              ReLU-2           [-1, 96, 54, 54]               0
         MaxPool2d-3           [-1, 96, 26, 26]               0
 LocalResponseNorm-4           [-1, 96, 26, 26]               0
            Conv2d-5          [-1, 256, 24, 24]         614,656
              ReLU-6          [-1, 256, 24, 24]               0
         MaxPool2d-7          [-1, 256, 11, 11]               0
 LocalResponseNorm-8          [-1, 256, 11, 11]               0
            Conv2d-9          [-1, 384, 11, 11]         885,120
             ReLU-10          [-1, 384, 11, 11]               0
           Conv2d-11          [-1, 384, 11, 11]       1,327,488
             ReLU-12          [-1, 384, 11, 11]               0
           Conv2d-13          [-1, 256, 11, 11]         884,992
             ReLU-14          [-1, 256,

In [None]:
# Sanity check: push a single batch through the model and inspect the output shape
data, targets = next(iter(dl))
data, targets = data.to(device), targets.to(device)

model = model.to(device)
model(data).shape  # [Batch, 3], one score per class

In [None]:
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate, weight_decay = weight_decay)
criterion = nn.CrossEntropyLoss()
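`nn.CrossEntropyLoss` takes the raw class scores (logits) of the network and combines a log-softmax with the negative log-likelihood, averaged over the batch by default. A minimal plain-Python sketch of that computation for the three classes used here, with illustrative function names and made-up logits:

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def batch_cross_entropy(batch_logits, targets):
    """Mean loss over a batch, matching the default 'mean' reduction."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)

# Made-up logits for two samples with three classes each
batch_logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
targets = [0, 2]
loss = batch_cross_entropy(batch_logits, targets)
```

An untrained model that scores all three classes equally has a loss of `log(3) ≈ 1.10`, which is a useful reference value when reading the first epochs of the training log.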

In [None]:
training_loop_conf_matr(
    epochs = epochs,
    optimizer = optimizer,
    model = model,
    criterion = criterion,
    ds = ds,
    dl = dl,
    batch_size = batch_size,
    run_name = run_name,
    device = device,
    time_me = time_me,
    time=time)
