## Importing libraries

In [1]:
import time
from copy import deepcopy

import numpy as np
import torch
from dlordinal.output_layers import StickBreakingLayer
from dlordinal.losses import TriangularCrossEntropyLoss
from dlordinal.datasets import FGNet
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             confusion_matrix, mean_absolute_error)
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.utils import class_weight
from torch import cuda
from torch.optim import Adam
from torch.utils.data import DataLoader, Subset
from torchvision import models
from torchvision.transforms import Compose, ToTensor
from tqdm import tqdm

## Loading and preprocessing the FGNet dataset

First, we define the configuration parameters for the experiment and the number of workers for the *DataLoader*, that is, the number of subprocesses used to load the data (here, the images).

In [2]:
optimiser_params = {
    'lr': 1e-3,
    'bs': 200,
    'epochs': 5,
    's': 2,
    'c': 0.2,
    'beta': 0.5
}

workers = 3

Next, we use the *FGNet* dataset class to download and preprocess the images. From the training data we hold out a validation partition comprising 15% of the samples using *StratifiedShuffleSplit*, which keeps the class proportions similar in both partitions. Finally, each partition is wrapped in a *DataLoader*, which serves the images in batches.
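The effect of stratification can be checked on synthetic labels. The following is a minimal sketch, assuming hypothetical toy targets (`toy_targets`) rather than the FGNet labels; it only illustrates that a 15% stratified hold-out keeps class frequencies close in both partitions.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical toy labels: 6 ordinal classes with unequal frequencies.
rng = np.random.default_rng(0)
toy_targets = rng.choice(6, size=600, p=[0.1, 0.3, 0.15, 0.2, 0.15, 0.1])

# Same split configuration as in the notebook: one split, 15% held out.
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.15, random_state=0)
toy_train_idx, toy_val_idx = next(sss.split(np.zeros(len(toy_targets)), toy_targets))

# Relative class frequencies in each partition.
train_freq = np.bincount(toy_targets[toy_train_idx], minlength=6) / len(toy_train_idx)
val_freq = np.bincount(toy_targets[toy_val_idx], minlength=6) / len(toy_val_idx)

print(np.round(train_freq, 3))
print(np.round(val_freq, 3))
```

The two printed vectors should be nearly identical, which is the property that makes the validation scores comparable across classes.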

In [3]:
fgnet_trainval = FGNet(
    root="./datasets",
    download=True,
    train=True,
    target_transform=np.array,
    transform=Compose([ToTensor()]),
)

test_data = FGNet(
    root="./datasets",
    download=True,
    train=False,
    target_transform=np.array,
    transform=Compose([ToTensor()]),
)

num_classes = len(fgnet_trainval.classes)
classes = fgnet_trainval.classes
targets = fgnet_trainval.targets

# Create a validation split
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.15, random_state=0)
sss_splits = list(
    sss.split(X=np.zeros(len(fgnet_trainval)), y=fgnet_trainval.targets)
)
train_idx, val_idx = sss_splits[0]

# Create subsets for training and validation
train_data = Subset(fgnet_trainval, train_idx)
val_data = Subset(fgnet_trainval, val_idx)

# Get CUDA device
device = "cuda" if cuda.is_available() else "cpu"
print(f"Using {device} device")

# Create dataloaders
train_dataloader = DataLoader(
    train_data, batch_size=optimiser_params["bs"], shuffle=True, num_workers=workers
)
val_dataloader = DataLoader(
    val_data, batch_size=optimiser_params["bs"], shuffle=False, num_workers=workers
)
test_dataloader = DataLoader(
    test_data, batch_size=optimiser_params["bs"], shuffle=False, num_workers=workers
)

# Get image shape
img_shape = None
for X, _ in train_dataloader:
    img_shape = list(X.shape[1:])
    break
print(f"Detected image shape: {img_shape}")

# Define class weights for imbalanced datasets
classes_array = np.array([int(c) for c in classes])

class_weights = class_weight.compute_class_weight(
    "balanced", classes=classes_array, y=targets
)
print(f"{class_weights=}")
class_weights = (
    torch.from_numpy(class_weights).float().to(device)
)  # Transform to Tensor

Files already downloaded and verified
Files already processed and verified
Files already split and verified
Files already downloaded and verified
Files already processed and verified
Files already split and verified
Using cpu device
Detected image shape: [3, 128, 128]
class_weights=array([1.01908397, 1.53448276, 0.79464286, 1.13135593, 0.55165289,
       2.42727273])
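
The `"balanced"` weights printed above follow the rule `n_samples / (n_classes * count_per_class)`. A minimal sketch on hypothetical toy labels (`toy_targets` is not part of the notebook) shows the rule next to the scikit-learn call:

```python
import numpy as np
from sklearn.utils import class_weight

# Hypothetical toy labels: class 1 is rare, class 2 is frequent.
toy_targets = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 2])
toy_classes = np.unique(toy_targets)

# Weights as computed by scikit-learn.
weights = class_weight.compute_class_weight(
    "balanced", classes=toy_classes, y=toy_targets
)

# The same weights computed by hand: n_samples / (n_classes * class counts).
manual = len(toy_targets) / (len(toy_classes) * np.bincount(toy_targets))

print(weights)
print(manual)
```

Rare classes receive weights above 1 and frequent classes weights below 1, which matches the pattern in the output above.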


## Model

We use a *ResNet-18* model pretrained on ImageNet and replace its last fully connected layer with a *Stick Breaking* output layer.

The stick breaking approach considers the problem of breaking a stick of length 1 into J segments. This methodology is related to non-parametric Bayesian methods and can be considered a subset of the random allocation processes [1].
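
The construction can be sketched in a few lines of NumPy. This is a generic illustration of stick breaking under the assumption that each break fraction comes from a sigmoid; the helper `stick_breaking_probs` is hypothetical and is not claimed to reproduce the internals of `StickBreakingLayer`.

```python
import numpy as np


def stick_breaking_probs(logits):
    """Map J-1 real values to J probabilities by breaking a unit stick."""
    # Fraction of the remaining stick broken off at each step (assumed sigmoid).
    v = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    # Length of stick left before each break, starting from the full stick.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))
    # Each segment is a fraction of what remains; the last segment is the leftover.
    return np.concatenate((v * remaining[:-1], remaining[-1:]))


probs = stick_breaking_probs([0.0, 0.0, 0.0])
print(probs)        # [0.5   0.25  0.125 0.125]
print(probs.sum())  # 1.0
```

Because every segment is cut from what is left of the stick, the segment lengths are non-negative and always sum to 1, so they can be read as class probabilities.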

Finally, we define the *Adam* optimiser, which adjusts the network's weights to minimise the loss function.

[1]: Vargas, Víctor Manuel et al. (2022). *Unimodal regularisation based on beta distribution for deep ordinal regression.* Pattern Recognition, 122, 108310. Elsevier. doi.org/10.1016/j.patcog.2021.108310

In [4]:
model = models.resnet18(weights='IMAGENET1K_V1')
model.fc = StickBreakingLayer(model.fc.in_features, num_classes)
model = model.to(device)

# Optimizer (no learning-rate scheduler is used)
optimizer = Adam(model.parameters(), lr=optimiser_params["lr"])

## Loss Function

$$
\begin{align*}
f(x; a_j, b_j, c_j) &= \begin{cases}
   0, & x < a_j, \\
   \frac{2(x - a_j)}{(b_j - a_j)(c_j - a_j)}, & a_j \leq x < c_j, \\
   \frac{2(b_j - x)}{(b_j - a_j)(b_j - c_j)}, & c_j \leq x < b_j, \\
   0, & b_j \leq x,
\end{cases}
\end{align*}
$$


The triangular distribution [1] is determined by three parameters $a$, $b$ and $c$, which define the lower limit, upper limit, and mode, respectively. These parameters are also the $x$ coordinates of the three vertices of the triangle.

The distributions used for the extreme classes should differ from those used for the intermediate ones. Consequently, the distributions for the first and last classes should allocate their probability mass in only one direction: positively for the first class and negatively for the last one.

[1]: Víctor Manuel Vargas, Pedro Antonio Gutiérrez, Javier Barbero-Gómez, and César Hervás-Martínez (2023). *Soft Labelling Based on Triangular Distributions for Ordinal Classification.* Information Fusion, 93, 258--267. doi.org/10.1016/j.inffus.2023.01.003
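
The density above can be evaluated directly. The sketch below is a plain NumPy transcription of the piecewise formula, assuming `a < c < b`; the function name `triangular_pdf` and the parameter values are hypothetical and are not taken from `TriangularCrossEntropyLoss`.

```python
import numpy as np


def triangular_pdf(x, a, b, c):
    """Triangular density with lower limit a, upper limit b and mode c (a < c < b)."""
    x = np.asarray(x, dtype=float)
    rising = 2.0 * (x - a) / ((b - a) * (c - a))   # branch for a <= x < c
    falling = 2.0 * (b - x) / ((b - a) * (b - c))  # branch for c <= x < b
    return np.where(
        (x >= a) & (x < c), rising, np.where((x >= c) & (x < b), falling, 0.0)
    )


# Hypothetical parameters chosen only for illustration.
a, b, c = 0.0, 1.0, 0.25
xs = np.linspace(a, b, 100001)
density = triangular_pdf(xs, a, b, c)

# Trapezoidal rule: the density should integrate to approximately 1.
area = np.sum((density[1:] + density[:-1]) / 2.0 * np.diff(xs))

print(float(triangular_pdf(c, a, b, c)))  # peak height 2 / (b - a) = 2.0
print(round(float(area), 4))
```

The peak sits at the mode $c$ with height $2 / (b - a)$, and the density is zero outside $[a, b)$.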


In [5]:
loss_fn = TriangularCrossEntropyLoss(num_classes=num_classes).to(device)

## Metrics

In [6]:
# Metrics computation


def compute_metrics(y_true: np.ndarray, y_pred: np.ndarray, num_classes: int):

    if len(y_true.shape) > 1:
        y_true = np.argmax(y_true, axis=1)

    if len(y_pred.shape) > 1:
        y_pred = np.argmax(y_pred, axis=1)

    labels = range(0, num_classes)

    # Metrics calculation
    qwk = cohen_kappa_score(y_true, y_pred, weights='quadratic', labels=labels)
    ms = minimum_sensitivity(y_true, y_pred, labels=labels)
    mae = mean_absolute_error(y_true, y_pred)
    acc = accuracy_score(y_true, y_pred)
    off1 = accuracy_off1(y_true, y_pred, labels=labels)
    conf_mat = confusion_matrix(y_true, y_pred, labels=labels)

    metrics = {
        'QWK': qwk,
        'MS': ms,
        'MAE': mae,
        'CCR': acc,
        '1-off': off1,
        'Confusion matrix': conf_mat
    }

    return metrics


def _compute_sensitivities(y_true, y_pred, labels=None):
    if len(y_true.shape) > 1:
        y_true = np.argmax(y_true, axis=1)
    if len(y_pred.shape) > 1:
        y_pred = np.argmax(y_pred, axis=1)

    conf_mat = confusion_matrix(y_true, y_pred, labels=labels)

    # Per-class recall: correct predictions over samples of each true class
    class_totals = np.sum(conf_mat, axis=1)
    mask = np.eye(conf_mat.shape[0], conf_mat.shape[1])
    correct = np.sum(conf_mat * mask, axis=1)
    sensitivities = correct / class_totals

    # Drop classes with no samples (0 / 0 gives NaN)
    sensitivities = sensitivities[~np.isnan(sensitivities)]

    return sensitivities


def minimum_sensitivity(y_true, y_pred, labels=None):
    return np.min(_compute_sensitivities(y_true, y_pred, labels=labels))


def accuracy_off1(y_true, y_pred, labels=None):
    if len(y_true.shape) > 1:
        y_true = np.argmax(y_true, axis=1)
    if len(y_pred.shape) > 1:
        y_pred = np.argmax(y_pred, axis=1)

    conf_mat = confusion_matrix(y_true, y_pred, labels=labels)
    n = conf_mat.shape[0]
    # Tridiagonal mask: exact hits plus predictions one class away
    mask = np.eye(n, n) + np.eye(n, n, k=1) + np.eye(n, n, k=-1)
    correct = mask * conf_mat

    return 1.0 * np.sum(correct) / np.sum(conf_mat)


def print_metrics(metrics):
    print("")
    print('Confusion matrix :\n{}'.format(metrics['Confusion matrix']))
    print("")
    print('MS: {:.4f}'.format(metrics['MS']))
    print("")
    print('QWK: {:.4f}'.format(metrics['QWK']))
    print("")
    print('MAE: {:.4f}'.format(metrics['MAE']))
    print("")
    print('CCR: {:.4f}'.format(metrics['CCR']))
    print("")
    print('1-off: {:.4f}'.format(metrics['1-off']))

## Training Process

In [7]:
def train(
    dataloader: torch.utils.data.DataLoader,
    model: torch.nn.Module,
    loss_fn: torch.nn.Module,
    optimizer: torch.optim.Optimizer,
    device: torch.device,
    H: dict,
    num_classes: int,
):
    num_batches = len(dataloader)
    size = len(dataloader.dataset)
    progress_bar = tqdm(total=num_batches, ncols=100, position=0, desc="Train progress")
    model.train()
    mean_loss, accuracy = 0, 0
    y_pred, y_true = None, None

    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)  # Inputs and labels to device

        # Compute prediction error and accuracy of the training process
        pred = model(X)
        loss = loss_fn(pred, y)

        mean_loss += loss.item()  # accumulate a float so the autograd graph is not retained
        accuracy += (pred.argmax(1) == y).type(torch.float).sum().item()

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Stack predictions and true labels to determine the confusion matrix
        pred_np = pred.argmax(1).cpu().detach().numpy()
        true_np = y.cpu().detach().numpy()
        if y_pred is None:
            y_pred = pred_np
        else:
            y_pred = np.concatenate((y_pred, pred_np))

        if y_true is None:
            y_true = true_np
        else:
            y_true = np.concatenate((y_true, true_np))

        # Update progress bar
        progress_bar.set_postfix(loss=loss.item(), accuracy=accuracy)
        progress_bar.update(1)

    accuracy /= size
    mean_loss /= num_batches

    H["train_loss"].append(float(mean_loss))  # epoch mean, not the last batch loss
    H["train_acc"].append(accuracy)

    # Confusion matrix for training
    labels = range(0, num_classes)
    conf_mat = confusion_matrix(y_true, y_pred, labels=labels)
    print("")
    print("Train Confusion matrix :\n{}".format(conf_mat))
    print("")

    return accuracy, mean_loss

## Test Process

In [8]:
def test(
    test_dataloader: torch.utils.data.DataLoader,
    model: torch.nn.Module,
    loss_fn: torch.nn.Module,
    device: torch.device,
    num_classes: int,
):
    num_batches = len(test_dataloader)
    progress_bar = tqdm(total=num_batches, ncols=100, position=0, desc="Test progress")
    model.eval()
    test_loss = 0
    y_pred, y_true = None, None

    with torch.no_grad():
        for batch, (X, y) in enumerate(test_dataloader):
            X, y = X.to(device), y.to(device)  # inputs and labels to device
            pred = model(X)
            test_loss += loss_fn(pred, y).item()

            # Stack predictions and true labels
            pred_np = pred.argmax(1).cpu().detach().numpy()
            true_np = y.cpu().detach().numpy()
            if y_pred is None:
                y_pred = pred_np
            else:
                y_pred = np.concatenate((y_pred, pred_np))

            if y_true is None:
                y_true = true_np
            else:
                y_true = np.concatenate((y_true, true_np))

            # Update progress bar
            progress_bar.set_postfix(loss=test_loss / (batch + 1))
            progress_bar.update(1)

    test_loss /= num_batches
    metrics = compute_metrics(y_true, y_pred, num_classes)
    print_metrics(metrics)

    return metrics, test_loss

## Validation Process

In [9]:
def validate(
    dataloader: torch.utils.data.DataLoader,
    model: torch.nn.Module,
    loss_fn: torch.nn.Module,
    device: torch.device,
    H: dict,
    num_classes: int,
):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    loss, accuracy = 0, 0
    y_pred, y_true = None, None

    with torch.no_grad():
        for batch, (X, y) in enumerate(dataloader):
            X, y = X.to(device), y.to(device)
            pred = model(X)
            loss += loss_fn(pred, y)
            accuracy += (pred.argmax(1) == y).type(torch.float).sum().item()

            pred_np = pred.argmax(1).cpu().detach().numpy()
            true_np = y.cpu().detach().numpy()
            if y_pred is None:
                y_pred = pred_np
            else:
                y_pred = np.concatenate((y_pred, pred_np))

            if y_true is None:
                y_true = true_np
            else:
                y_true = np.concatenate((y_true, true_np))

    accuracy /= size
    loss /= num_batches

    H["val_loss"].append(loss.cpu().detach().numpy())
    H["val_acc"].append(accuracy)

    metrics = compute_metrics(y_true, y_pred, num_classes)

    return metrics, accuracy, loss

## Results

In [10]:
H = {"train_loss": [], "train_acc": [], "val_loss": [], "val_acc": []}

# To store validation metrics
validation_metrics = {}

# Definition to store best model weights
best_model_weights = deepcopy(model.state_dict())  # copy, so later training does not alter it
best_qwk = 0.0

# Start time
start_time = time.time()

for e in range(optimiser_params["epochs"]):
    train_acc, train_loss = train(
        train_dataloader, model, loss_fn, optimizer, device, H, num_classes=num_classes
    )
    validation_metrics, val_acc, val_loss = validate(
        val_dataloader, model, loss_fn, device, H, num_classes=num_classes
    )

    if validation_metrics["QWK"] >= best_qwk:
        best_qwk = validation_metrics["QWK"]
        best_model_weights = deepcopy(model.state_dict())

    print("[INFO] EPOCH: {}/{}".format(e + 1, optimiser_params["epochs"]))
    print("Train loss: {:.6f}, Train accuracy: {:.4f}".format(train_loss, train_acc))
    print("Val loss: {:.6f}, Val accuracy: {:.4f}\n".format(val_loss, val_acc))

# Store last train error
train_error = H["train_loss"][-1]

# Restore best weights
model.load_state_dict(best_model_weights)

# Start evaluation
print("[INFO] Network evaluation ...")

test_metrics, test_loss = test(
    test_dataloader, model, loss_fn, device, num_classes=num_classes
)

# End time
end_time = time.time()
print("\n[INFO] Total training time: {:.2f}s".format(end_time - start_time))

Train progress: 100%|████████████████████████| 4/4 [00:11<00:00,  2.81s/it, accuracy=118, loss=1.63]


Train Confusion matrix :
[[ 64   4   5   0   1   0]
 [143  20  20  13   9   0]
 [ 73  12   9  11   6   0]
 [ 92  12  12  10  16   1]
 [ 56   8   8  15  12   1]
 [ 29   3   1   4   7   3]]






[INFO] EPOCH: 1/5
Train loss: 1.847242, Train accuracy: 0.1735
Val loss: 2.574011, Val accuracy: 0.1074



Train progress: 100%|████████████████████████| 4/4 [00:09<00:00,  2.28s/it, accuracy=379, loss=1.33]


Train Confusion matrix :
[[ 60  12   0   2   0   0]
 [ 26 109  22  31  16   1]
 [  4  14  18  55  20   0]
 [  7   4   1 106  24   1]
 [  2   0   0  32  63   3]
 [  0   1   1   7  15  23]]






[INFO] EPOCH: 2/5
Train loss: 1.367832, Train accuracy: 0.5574
Val loss: 2.786200, Val accuracy: 0.1074



Train progress: 100%|█████████████████████████| 4/4 [00:09<00:00,  2.41s/it, accuracy=422, loss=1.2]


Train Confusion matrix :
[[ 60  14   0   0   0   0]
 [ 23 162   3  11   5   1]
 [  9  44   7  36  13   2]
 [  7  17   0 105  14   0]
 [  3   9   0  17  51  20]
 [  2   0   0   1   7  37]]






[INFO] EPOCH: 3/5
Train loss: 1.251675, Train accuracy: 0.6206
Val loss: 2.030842, Val accuracy: 0.1901



Train progress: 100%|████████████████████████| 4/4 [00:09<00:00,  2.40s/it, accuracy=479, loss=1.22]


Train Confusion matrix :
[[ 63  11   0   0   0   0]
 [ 13 174   7   9   2   0]
 [  6  31  28  38   8   0]
 [  4   2  13 118   5   1]
 [  7   4   0  18  60  11]
 [  0   2   0   0   9  36]]






[INFO] EPOCH: 4/5
Train loss: 1.168523, Train accuracy: 0.7044
Val loss: 1.750762, Val accuracy: 0.2562



Train progress: 100%|████████████████████████| 4/4 [00:08<00:00,  2.23s/it, accuracy=534, loss=1.12]


Train Confusion matrix :
[[ 69   5   0   0   0   0]
 [ 15 178   7   4   1   0]
 [ 10  27  42  27   5   0]
 [  1   1  11 124   6   0]
 [  2   3   2   8  82   3]
 [  0   0   0   0   8  39]]






[INFO] EPOCH: 5/5
Train loss: 1.091701, Train accuracy: 0.7853
Val loss: 1.488901, Val accuracy: 0.3554

[INFO] Network evaluation ...


Test progress: 100%|███████████████████████████████████████| 2/2 [00:01<00:00,  1.36it/s, loss=1.63]


Confusion matrix :
[[22  0  0  0  0  0]
 [33 22  4  1  0  0]
 [ 5 16  4  6  2  0]
 [ 8  9  3 14  6  2]
 [ 2  1  3  9 12  3]
 [ 2  0  0  0  8  4]]

MS: 0.1212

QWK: 0.6697

MAE: 0.8806

CCR: 0.3881

1-off: 0.8259

[INFO] Total training time: 55.39s
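
The reported test metrics can be recomputed from the printed test confusion matrix alone. The sketch below copies that matrix and derives CCR, 1-off accuracy, MAE, MS and QWK with plain NumPy; it is an independent cross-check, not the code path used by `compute_metrics`.

```python
import numpy as np

# Test confusion matrix copied from the output above (rows: true, columns: predicted).
conf_mat = np.array([
    [22, 0, 0, 0, 0, 0],
    [33, 22, 4, 1, 0, 0],
    [5, 16, 4, 6, 2, 0],
    [8, 9, 3, 14, 6, 2],
    [2, 1, 3, 9, 12, 3],
    [2, 0, 0, 0, 8, 4],
])
n = conf_mat.shape[0]
total = conf_mat.sum()

# Absolute distance between true and predicted class for every cell.
i, j = np.indices((n, n))
dist = np.abs(i - j)

ccr = np.trace(conf_mat) / total                        # exact hits
off1 = conf_mat[dist <= 1].sum() / total                # hits within one class
mae = (dist * conf_mat).sum() / total                   # mean absolute class error
ms = (np.diag(conf_mat) / conf_mat.sum(axis=1)).min()   # worst per-class recall

# Quadratic weighted kappa: observed vs. chance-expected squared disagreement.
expected = np.outer(conf_mat.sum(axis=1), conf_mat.sum(axis=0)) / total
qwk = 1.0 - (dist**2 * conf_mat).sum() / (dist**2 * expected).sum()

print(f"CCR={ccr:.4f} 1-off={off1:.4f} MAE={mae:.4f} MS={ms:.4f} QWK={qwk:.4f}")
```

The values agree with the output above: the low MS comes from the third class, where only 4 of 33 samples are classified correctly, while most errors fall in adjacent classes, which explains the high 1-off score.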



