This notebook was motivated by

[3] Karen Simonyan and Andrew Zisserman. ‘Very Deep Convolutional Networks for Large-Scale Image Recognition’. In: (2014). doi: 10.48550/ARXIV.1409.1556. url: https://arxiv.org/abs/1409.1556.

Implementation: Oleh Bakumenko, University of Duisburg-Essen

In [1]:
import sys
sys.path.append("../")
import os
import numpy as np
import time
import matplotlib.pyplot as plt
import torch, torch.nn as nn
import torchvision, torchvision.transforms as tt
from torchsummary import summary
from torch.multiprocessing import Manager
torch.multiprocessing.set_sharing_strategy("file_system")
from pathlib import Path

from utility import utils as uu
from utility.eval import evaluate_classifier_model
from utility.confusion_matrix import calculate_confusion_matrix
from utility.trainLoopClassifier import *
from utility.plotImageModel import *

Data augmentation is a technique used to artificially increase the size of a dataset by transforming existing data points to create new, similar instances. This can help prevent overfitting in machine learning models, as well as improve their ability to generalize to unseen data. Common types of data augmentation include flipping, rotation, scaling, and adding noise to images.
We build the augmentation pipeline with the `torchvision.transforms` module.


In [2]:
data_augments = torchvision.transforms.Compose([
    torchvision.transforms.RandomHorizontalFlip(p = .5),
    torchvision.transforms.RandomVerticalFlip(p = .5),
    # brightness factor is drawn from [0.5, 1.5]; a scalar contrast of 1 draws the factor from [0, 2]
    torchvision.transforms.ColorJitter(brightness=(0.5, 1.5), contrast=1, hue=(-0.1, 0.1)),
    # torchvision.transforms.RandomCrop((224, 224)),
    ])

Load the dataset with the helper class from `utility.utils`.

In [3]:
cur_path = Path("plots_and_graphs.ipynb")
parent_dir = cur_path.parent.absolute()
masterThesis_folder = str(parent_dir.parent.absolute())+'/'
data_dir = masterThesis_folder+"data/Clean_LiTS/"

cache_me = False
if cache_me is True:
    cache_mgr = Manager()
    cache_mgr.data = cache_mgr.dict()
    cache_mgr.cached = cache_mgr.dict()
    for k in ["train", "val", "test"]:
        cache_mgr.data[k] = cache_mgr.dict()
        cache_mgr.cached[k] = False
# Dataset class from utils, credit: Institute for Artificial Intelligence in Medicine. url: https://mml.ikim.nrw/
# Each sample is an image tensor of shape [1, 256, 256] and a target tensor (0, 1, or 2).

ds = uu.LiTS_Classification_Dataset(
    data_dir=data_dir,
    transforms=data_augments,
    verbose=True,
    cache_data=cache_me,
    cache_mgr=(cache_mgr if cache_me is True else None),
    debug=True,
)

Dataset initialization complete.


# Hyperparameters


In [4]:
# Default settings
batch_size = 32
learning_rate = 1e-4
weight_decay = 1e-5
epochs = 15
run_name = "VGG19"
device = ("cuda" if torch.cuda.is_available() else "cpu")
time_me = True

The `torch.utils.data.DataLoader` class wraps a dataset and yields it in batches. We specify the dataset, the batch size (32 here), and whether the samples are reshuffled at the start of each epoch. Further parameters control parallel loading (`num_workers`), page-locked host memory for faster host-to-GPU copies (`pin_memory`), and whether a final incomplete batch is dropped (`drop_last`).

In [5]:
# Dataloader
dl = torch.utils.data.DataLoader(
    dataset = ds, 
    batch_size = batch_size, 
    num_workers = 4, 
    shuffle = True, 
    drop_last = False, 
    pin_memory = True,
    persistent_workers = (not cache_me),
    prefetch_factor = 1
    )
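
As a rough illustration of how batching behaves with `drop_last = False`, the following sketch runs a `DataLoader` over a toy dataset. The dataset `toy_ds`, its size of 70 samples, and the 8x8 image size are assumptions made for the example; `num_workers=0` keeps it single-process.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset: 70 single-channel 8x8 "images" with labels in {0, 1, 2}.
images = torch.zeros(70, 1, 8, 8)
labels = torch.arange(70) % 3
toy_ds = TensorDataset(images, labels)

# Same batch size and drop_last setting as in the notebook, but without worker processes.
toy_dl = DataLoader(toy_ds, batch_size=32, shuffle=True, drop_last=False, num_workers=0)

# Collect the number of samples in every batch of one epoch.
batch_sizes = [batch_images.shape[0] for batch_images, _ in toy_dl]
print(batch_sizes)  # [32, 32, 6]
```

With `drop_last = False` the final, smaller batch is kept, so every sample is seen once per epoch.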

# VGG

VGG19 is a convolutional neural network architecture with 19 weight layers: 16 convolutional layers and 3 fully connected layers. It was developed by the Visual Geometry Group at the University of Oxford and has been widely used for image classification. The architecture uses small 3x3 convolutional filters with a stride of 1, and 2x2 max pooling layers with a stride of 2. In the ImageNet Challenge 2014, the VGG team's submissions took first place in the localisation track and second place in the classification track [3].
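
The effect of these layer settings on the feature-map size can be worked out by hand. The sketch below assumes a 256x256 input and a padding of 1 for every 3x3 convolution, as used in this notebook's implementation; the helper names `conv_out` and `pool_out` are illustrative.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output size of an unpadded max pooling along one spatial dimension."""
    return (size - kernel) // stride + 1

size = 256
sizes = [size]
for _ in range(5):          # five conv stages, each followed by one max pool
    size = conv_out(size)   # 3x3 conv, stride 1, padding 1: size is unchanged
    size = pool_out(size)   # 2x2 max pool, stride 2: size is halved
    sizes.append(size)
print(sizes)  # [256, 128, 64, 32, 16, 8]

# The notebook's classifier expects 512 channels on a 7x7 grid after adaptive pooling.
flat_features = 512 * 7 * 7
print(flat_features)  # 25088
```

Because the convolutions preserve the spatial size, only the five pooling layers shrink the image, by a total factor of 32.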

The original network was used in the ImageNet Challenge to classify 1000 classes. In this exercise we use only 3 classes:

- 0: Image does not include the liver.
- 1: Liver is visible.
- 2: Liver is visible and a lesion is visible.

In [2]:
# VGGBlock class
#       - constructs a [conv -> relu] block that is stacked in the network
# Input:    int: n_chans - number of input and output channels
# Output:   Tensor with the same shape as the input

class VGGBlock(nn.Module):
    def __init__(self, n_chans):
        super().__init__()
        self.conv1 = nn.Conv2d(n_chans, n_chans, kernel_size=3, padding=1)
        self.relu = torch.nn.ReLU()
        torch.nn.init.kaiming_normal_(self.conv1.weight,nonlinearity='relu')
    def forward(self, x):
        out = self.relu(self.conv1(x))
        return out
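
The block initializes its convolution with Kaiming (He) normal initialization. As a sketch of what that call does: with the default `fan_in` mode and `nonlinearity='relu'`, weights are drawn from a zero-mean normal distribution with standard deviation sqrt(2 / fan_in), where fan_in is the number of input channels times the kernel area. The channel count of 64 and the fixed seed are choices made for this example.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed only to make the example repeatable

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
nn.init.kaiming_normal_(conv.weight, nonlinearity="relu")

# fan_in = input channels * kernel height * kernel width
fan_in = conv.in_channels * 3 * 3
expected_std = math.sqrt(2.0 / fan_in)
observed_std = conv.weight.std().item()
print(f"expected {expected_std:.4f}, observed {observed_std:.4f}")
```

The observed value is only an estimate from the sampled weights, so it matches the expected value approximately rather than exactly.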

In [3]:
# VGG19MLMed class
#       - constructs a VGG19 as described in [3, Table 1]
# Input:    Tensor: [Batch,1,Height,Width]
# Output:   Tensor: [Batch,3]
class VGG19MLMed(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.max_pool = torch.nn.MaxPool2d(kernel_size = 2, stride = 2, padding=0)
        self.relu = torch.nn.ReLU()

        self.conv1 = torch.nn.Conv2d(in_channels = 1,   out_channels = 64, kernel_size =3, stride =1, padding=1)
        self.layer1 = VGGBlock(n_chans=64)
        self.conv2 = torch.nn.Conv2d(in_channels = 64,  out_channels = 128, kernel_size =3, stride =1, padding=1)
        self.layer2 = VGGBlock(n_chans=128)
        self.conv3 = torch.nn.Conv2d(in_channels = 128, out_channels = 256, kernel_size =3, stride =1, padding=1)
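        # Build every block separately: `3 * [VGGBlock(...)]` repeats one instance, so all copies would share weights
        self.layer3 = nn.Sequential(*[VGGBlock(n_chans=256) for _ in range(3)])
        self.conv4 = torch.nn.Conv2d(in_channels = 256, out_channels = 512, kernel_size =3, stride =1, padding=1)
        self.layer4 = nn.Sequential(*[VGGBlock(n_chans=512) for _ in range(3)])

        self.layer5 = nn.Sequential(*[VGGBlock(n_chans=512) for _ in range(4)])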

        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        self.fc = nn.Sequential(
            nn.Linear(in_features=25088, out_features=4096, bias=True),
            torch.nn.ReLU(),
            nn.Dropout(p = 0.5),
            nn.Linear(in_features=4096, out_features=4096, bias=True),
            torch.nn.ReLU(),
            nn.Dropout(p = 0.5),
            nn.Linear(in_features=4096, out_features=3, bias=True),
        )


    def forward(self, x):
        out_1 = self.relu(self.conv1(x))
        out_1 = self.max_pool(self.layer1(out_1))

        out_2 = self.relu(self.conv2(out_1))
        out_2 = self.max_pool(self.layer2(out_2))

        out_3 = self.relu(self.conv3(out_2))
        out_3 = self.max_pool(self.layer3(out_3))

        out_4 = self.relu(self.conv4(out_3))
        out_4 = self.max_pool(self.layer4(out_4))

        out_5 = self.max_pool(self.layer5(out_4))
        out_6 = self.avgpool(out_5)
        out_6 = torch.flatten(out_6, 1)

        out_6 = self.fc(out_6)

        return out_6
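
When a stage stacks several blocks of the same shape, each position needs its own module instance if the layers are meant to learn separate weights. The following sketch contrasts repeating one instance with building a fresh instance per position. The helper names `make_conv` and `count_unique_params` and the channel count of 8 are illustrative and not part of the notebook.

```python
import torch.nn as nn

def make_conv():
    """Create a small 3x3 convolution (8 channels chosen only for illustration)."""
    return nn.Conv2d(8, 8, kernel_size=3, padding=1)

def count_unique_params(module):
    """Count parameters; module.parameters() yields a shared tensor only once."""
    return sum(p.numel() for p in module.parameters())

# List multiplication repeats the SAME module object three times.
repeated = nn.Sequential(*(3 * [make_conv()]))
# A comprehension calls the constructor three times, giving three modules.
independent = nn.Sequential(*[make_conv() for _ in range(3)])

print(repeated[0] is repeated[2])        # True: one shared layer
print(independent[0] is independent[2])  # False: separate layers
print(count_unique_params(repeated), count_unique_params(independent))  # 584 1752
```

Both containers apply three convolutions in the forward pass, but only the second one has three sets of trainable weights.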

In [4]:
model = VGG19MLMed()

In [5]:
# torchsummary defaults to device="cuda"; the model is still on the CPU at this point
summary(model, (1, 256, 256), device="cpu")

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 256, 256]             640
              ReLU-2         [-1, 64, 256, 256]               0
            Conv2d-3         [-1, 64, 256, 256]          36,928
              ReLU-4         [-1, 64, 256, 256]               0
          VGGBlock-5         [-1, 64, 256, 256]               0
         MaxPool2d-6         [-1, 64, 128, 128]               0
            Conv2d-7        [-1, 128, 128, 128]          73,856
              ReLU-8        [-1, 128, 128, 128]               0
            Conv2d-9        [-1, 128, 128, 128]         147,584
             ReLU-10        [-1, 128, 128, 128]               0
         VGGBlock-11        [-1, 128, 128, 128]               0
        MaxPool2d-12          [-1, 128, 64, 64]               0
           Conv2d-13          [-1, 256, 64, 64]         295,168
             ReLU-14          [-1, 256,
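
The parameter counts listed in the summary can be reproduced by hand: a convolution has one k x k kernel per pair of input and output channels, plus one bias per output channel. A short sketch, with `conv2d_params` as an illustrative helper name:

```python
def conv2d_params(in_channels, out_channels, kernel=3):
    """Weights (kernel * kernel * in * out) plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

# (in_channels, out_channels) of the first five convolutions in the network
layers = [(1, 64), (64, 64), (64, 128), (128, 128), (128, 256)]
counts = [conv2d_params(i, o) for i, o in layers]
print(counts)  # [640, 36928, 73856, 147584, 295168]
```

These values match the `Param #` column for Conv2d-1, Conv2d-3, Conv2d-7, Conv2d-9, and Conv2d-13.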

In [10]:
# Fetch batches up to step 1 and keep the last one on the target device for a forward-pass check
for step, (data, targets) in enumerate(dl):
    data, targets = data.to(device), targets.to(device)
    if step == 1:
        break

In [11]:
model = model.to(device)
model(data).shape

torch.Size([2, 3])

In [None]:
optimizer = torch.optim.Adam(model.parameters(), lr = learning_rate, weight_decay = weight_decay)
criterion = nn.CrossEntropyLoss()
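
`nn.CrossEntropyLoss` takes the raw network outputs (logits) and the integer class index, and by default averages the per-sample losses over the batch. For a single sample the loss is the negative log of the softmax probability of the true class. The sketch below computes this in plain Python for hypothetical logits; the function name `cross_entropy` and the example values are assumptions for illustration.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax of the target class for one sample."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Hypothetical logits for the three classes (no liver, liver, liver with lesion).
logits = [2.0, 0.5, -1.0]
loss_if_class_0 = cross_entropy(logits, target=0)  # true class has the largest logit
loss_if_class_2 = cross_entropy(logits, target=2)  # true class has the smallest logit
uniform_loss = cross_entropy([0.0, 0.0, 0.0], target=1)  # equals log(3)

print(loss_if_class_0 < loss_if_class_2)
print(math.isclose(uniform_loss, math.log(3)))
```

The loss is small when the true class receives the largest logit and grows as the network assigns it less probability; a network that outputs equal logits for all three classes has a loss of log(3).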

In [None]:
training_loop(
    epochs = epochs,
    optimizer = optimizer,
    model = model,
    criterion = criterion,
    ds = ds,
    dl = dl,
    batch_size = batch_size,
    run_name = run_name,
    cache_me = cache_me,
    device = device,
    mod_step = 500,
    time_me = time_me,
    time = time)