<a href="https://colab.research.google.com/github/avkornaev/Sleep_Stages/blob/main/SleepStages.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sleep Stages
*March 21, 2025.*

## Problem Statement

Data files are provided for several volunteers. Each file contains multi-sensor time-series data (20 samples per second) together with its processing results (19 columns in total), plus a column holding the label for each time step. The label is a sleep stage (NREM1, NREM2, NREM3, REM, or Wakefulness, coded in the files as N1, N2, N3, R, W). Some of the data is missing. See the project [repo](https://github.com/avkornaev/Sleep_Stages).
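
To make the data layout concrete, the sketch below groups per-sample labels recorded at 20 samples per second into fixed-length windows with one label each. It is an illustration under stated assumptions, not part of the original pipeline: the 30-second window length, the function name `to_epochs`, and the majority-vote labelling are all hypothetical choices, and labels are assumed to be non-negative integer codes already.

```python
import numpy as np

SAMPLE_RATE_HZ = 20   # from the problem statement
EPOCH_SECONDS = 30    # assumption: window length chosen for illustration

def to_epochs(features, labels, epoch_len=SAMPLE_RATE_HZ * EPOCH_SECONDS):
    """Split (n_samples, n_features) data into non-overlapping windows.

    Returns an array of shape (n_epochs, epoch_len, n_features) and one
    majority label per window. Trailing samples that do not fill a whole
    window are dropped.
    """
    n_epochs = len(features) // epoch_len
    usable = n_epochs * epoch_len
    x = features[:usable].reshape(n_epochs, epoch_len, features.shape[1])
    y = labels[:usable].reshape(n_epochs, epoch_len)
    # Majority label inside each window (ties go to the smallest code)
    y_epoch = np.array([np.bincount(row).argmax() for row in y])
    return x, y_epoch
```

Whether windowing helps here depends on the model; a per-sample classifier, as used later in this notebook, does not need it.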

## Tasks and Requirements  

- Review the [Lightning framework](https://lightning.ai/docs/pytorch/stable/) (Level Up, Core API, Optional API sections of the manual).  
- Briefly review the [ClearML](https://clear.ml/docs/latest/docs/integrations/pytorch_lightning/) documentation.

# Preparation of simulation models

## Import and Install Libraries

In [1]:
!pip install pytorch-lightning clearml

Collecting pytorch-lightning
  Downloading pytorch_lightning-2.5.1-py3-none-any.whl.metadata (20 kB)
Collecting clearml
  Downloading clearml-1.18.0-py2.py3-none-any.whl.metadata (18 kB)
Collecting torchmetrics>=0.7.0 (from pytorch-lightning)
  Downloading torchmetrics-1.7.0-py3-none-any.whl.metadata (21 kB)
Collecting lightning-utilities>=0.10.0 (from pytorch-lightning)
  Downloading lightning_utilities-0.14.2-py3-none-any.whl.metadata (5.6 kB)
Collecting furl>=2.0.0 (from clearml)
  Downloading furl-2.1.4-py2.py3-none-any.whl.metadata (25 kB)
Collecting pathlib2>=2.3.0 (from clearml)
  Downloading pathlib2-2.3.7.post1-py2.py3-none-any.whl.metadata (3.5 kB)
Collecting pyjwt<2.10.0,>=2.4.0 (from clearml)
  Downloading PyJWT-2.9.0-py3-none-any.whl.metadata (3.0 kB)
Collecting orderedmultidict>=1.0.1 (from furl>=2.0.0->clearml)
  Downloading orderedmultidict-1.0.1-py2.py3-none-any.whl.metadata (1.3 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=2.1.0->pytorch-lightning)

In [11]:
#Pytorch modules
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import Dataset, DataLoader, random_split, TensorDataset
from torchvision import datasets, transforms, models
#scipy
from scipy.stats import mode
#sklearn
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.model_selection import LeaveOneGroupOut

#Numpy
import numpy as np
#Pandas
import pandas as pd
#Lightning & logging
import pytorch_lightning as pl
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint
#Data observation
import os
import sys
import pickle
import requests
from pathlib import Path
from collections import defaultdict
#Plotting
import matplotlib.pyplot as plt
import seaborn as sns
#Logging
from clearml import Task

## Set the Models

### Simulation Settings

Check the current directory

In [None]:
os.getcwd() #returns the current working directory

'/content'

In [None]:
# Path to the folder where the pretrained models are saved
CHECKPOINT_PATH = os.environ.get("PATH_CHECKPOINT", "saved_models/")
print(f'CHECKPOINT_PATH: {CHECKPOINT_PATH}')

os.makedirs(CHECKPOINT_PATH, exist_ok=True)

CHECKPOINT_PATH: saved_models/


Set the reproducibility options

In [None]:
# Seeds for repeated (ensemble) runs
SEEDS = [42]  # e.g. [42, 0, 17, 9, 3, 16, 2]
SEED = 42  # default random seed
pl.seed_everything(SEED)

# Determine the device (GPU if available, otherwise CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Prioritizes speed but may reduce precision
torch.set_float32_matmul_precision('high')

INFO:lightning_fabric.utilities.seed:Seed set to 42


### Logging

To configure ClearML in your Colab environment, follow these steps:

---

*Step 1: Create a ClearML Account*
1. Go to the [ClearML website](https://clear.ml/).
2. Sign up for a free account if you don’t already have one.
3. Once registered, log in to your ClearML account.

---

*Step 2: Get Your ClearML Credentials*
1. After logging in, navigate to the **Settings** page (click on your profile icon in the top-right corner and select **Settings**).
2. Under the **Workspace** section, click **+ Create new credentials**.
3. Copy the credentials for a Jupyter notebook into the code cell below.

---

*Step 3: Accessing the ClearML Dashboard*
1. Go to your ClearML dashboard (https://app.clear.ml).
2. Navigate to the **Projects** section to see your experiments.
3. Click on the experiment (e.g., `Lab_1`) to view detailed metrics, logs, and artifacts.

---
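
Credentials are safer when they are typed in at run time rather than stored in the notebook source. The following is a minimal sketch, assuming interactive entry with `getpass`; the helper name `set_clearml_env` is hypothetical, while the environment variable names and host URLs are the ones used in this notebook.

```python
import os
from getpass import getpass

# Host settings as used in this notebook
CLEARML_HOSTS = {
    "CLEARML_WEB_HOST": "https://app.clear.ml/",
    "CLEARML_API_HOST": "https://api.clear.ml",
    "CLEARML_FILES_HOST": "https://files.clear.ml",
}

def set_clearml_env(access_key, secret_key, env=None):
    """Write the ClearML settings into `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    env.update(CLEARML_HOSTS)
    env["CLEARML_API_ACCESS_KEY"] = access_key
    env["CLEARML_API_SECRET_KEY"] = secret_key
    return env

# Interactive use in a notebook (the prompts do not echo the input):
# set_clearml_env(getpass("ClearML access key: "), getpass("ClearML secret key: "))
```

Keys that have already been committed or shared should be revoked in the ClearML settings page and replaced with new ones.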

In [None]:
# Enter your own credentials from Step 2 of the logging instructions, in the form shown below
%env CLEARML_WEB_HOST=https://app.clear.ml/
%env CLEARML_API_HOST=https://api.clear.ml
%env CLEARML_FILES_HOST=https://files.clear.ml
%env CLEARML_API_ACCESS_KEY=<your-access-key>
%env CLEARML_API_SECRET_KEY=<your-secret-key>

env: CLEARML_WEB_HOST=https://app.clear.ml/
env: CLEARML_API_HOST=https://api.clear.ml
env: CLEARML_FILES_HOST=https://files.clear.ml
env: CLEARML_API_ACCESS_KEY=<your-access-key>
env: CLEARML_API_SECRET_KEY=<your-secret-key>


### Dataset

Summary

In [10]:
DATASET = 'Sleep_Stages' # dataset with the real-world noise
#Clone the GitHub repository
repo_url = "https://github.com/avkornaev/Sleep_Stages"  # Replace with your repo URL
!git clone {repo_url}

#Navigate to the data folder
repo_name = repo_url.split("/")[-1].replace(".git", "")  # Extract repo name
data_dir = os.path.join(repo_name, "data")  # Replace "data" with your folder name
# os.chdir(data_dir)  # Change working directory to the data folder

# Verify the data directory
if os.path.exists(data_dir):
    print(f"Data directory found: {data_dir}")
else:
    print(f"Data directory not found: {data_dir}")

Cloning into 'Sleep_Stages'...
remote: Enumerating objects: 11, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 11 (delta 0), reused 8 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (11/11), 5.93 MiB | 13.81 MiB/s, done.
Data directory found: Sleep_Stages/data


### Collect parameters

In [None]:
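#Model parameters
LOSS_FUN = 'CE' # 'CE','CELoss'(custom), 'N', 'B', etc.
ARCHITECTURE = 'ResNet50' # 'CNN', 'ResNet50', 'ViT', etc.

# NOTE: the four names below are used in this cell but are not defined elsewhere in the notebook.
# The values are assumptions that match the CIFAR-10 template code further down; adjust as needed.
SIZE = 32                           # assumed input size for 'CNN' (CIFAR-10 images are 32x32)
NUM_CLASSES = 10                    # assumed number of classes (CIFAR-10)
NOISE_TYPE = 'worse_label'          # assumed; same default as in CIFAR10DataModule
NS = {'train': 45000, 'val': 5000}  # assumed split of the 50,000 CIFAR-10 training images

#Collect the parameters (hyperparams and others)
im_size = SIZE if ARCHITECTURE == 'CNN' else 224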
hparams = {
    "seed": SEED,
    "lr": 0.001,
    'weight_decay': 0.0,
    "dropout": 0.0,
    "bs": 128,
    "num_workers": 2,
    "num_epochs": 2,
    "criterion": LOSS_FUN,
    "architecture": ARCHITECTURE,
    "num_samples": NS,
    "im_size": im_size,
    "mean": np.array([0.4914, 0.4822, 0.4465]),
    "std": np.array([0.2470, 0.2435, 0.2616]),
    'randResCrop': {'size': (im_size, im_size), 'scale': (0.8, 1.0), 'ratio': (0.9, 1.1)},
    'label_smoothing': 0.0,
    "n_classes": NUM_CLASSES,
    "noise_path": './data/CIFAR-10_human.pt',
    "noise_type": NOISE_TYPE  # Can be 'clean_label', 'worse_label', 'aggre_label', etc.
}

#Visualization
vis_params = {
    'fig_size': 5,
    'num_samples': 5,
    'num_bins': 50,
}

## Functions

### Lightning

Data module

In [12]:
def load_and_merge_volunteer_data(directory):
    """
    Loads and merges volunteer data from .gz files in the specified directory.
    """
    volunteer_data = defaultdict(list)

    for filename in os.listdir(directory):
        if filename.endswith(".csv.gz") and filename.startswith("Vol_"):
            volunteer_id = filename.split("_")[1].split(".")[0]
            file_path = os.path.join(directory, filename)
            df = pd.read_csv(file_path, compression='gzip')
            volunteer_data[volunteer_id].append(df)

    # Merge DataFrames for each volunteer
    merged_data = {}
    for volunteer_id, dfs in volunteer_data.items():
        merged_df = pd.concat(dfs, ignore_index=True)
        merged_data[f"Vol_{volunteer_id}"] = merged_df

        # Print label distribution
        print(f"\nVolunteer ID: Vol_{volunteer_id}")
        print("Label distribution:")
        print(merged_df['label'].value_counts())

    return merged_data

# Load and merge data
merged_data = load_and_merge_volunteer_data(data_dir)


Volunteer ID: Vol_02
Label distribution:
label
N2    72000
R     47999
W     13329
Name: count, dtype: int64

Volunteer ID: Vol_03
Label distribution:
label
N2    677854
R     292200
N3    288013
W     192065
N1     11953
Name: count, dtype: int64

Volunteer ID: Vol_01
Label distribution:
label
N2    155954
N3    107999
R     100018
Name: count, dtype: int64
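
The distributions above show that not every volunteer has every stage. If labels are turned into integers separately for each volunteer, the same stage can therefore receive different codes. The sketch below illustrates the effect and a fixed mapping that avoids it; the helper names are hypothetical, and the class order is an assumption taken from the stages printed above.

```python
# Fixed class order, taken from the stages seen in the printed distributions
STAGES = ["N1", "N2", "N3", "R", "W"]
STAGE_TO_INDEX = {stage: i for i, stage in enumerate(STAGES)}

def encode_fixed(labels):
    """Encode with one shared mapping, identical for every volunteer."""
    return [STAGE_TO_INDEX[label] for label in labels]

def encode_per_subject(labels):
    """Mimic per-DataFrame category codes: index within the sorted unique labels."""
    local = {stage: i for i, stage in enumerate(sorted(set(labels)))}
    return [local[label] for label in labels]

vol_01 = ["N2", "N3", "R"]   # stages present for Vol_01
vol_02 = ["N2", "R", "W"]    # stages present for Vol_02

print(encode_per_subject(vol_01), encode_per_subject(vol_02))  # "R" gets 2 and 1
print(encode_fixed(vol_01), encode_fixed(vol_02))              # "R" gets 3 in both
```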


In [13]:
class SleepStageDataset(Dataset):
    """Custom Dataset for sleep stage classification."""
    def __init__(self, data, labels):
        self.data = data
        self.labels = labels

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

class SleepStageDataModule(pl.LightningDataModule):
    """PyTorch Lightning DataModule for sleep stage classification."""
    def __init__(self, merged_data, batch_size=32):
        super().__init__()
        self.merged_data = merged_data
        self.batch_size = batch_size
        self.setup()

    def setup(self, stage=None):
        """Prepare data for LOSO cross-validation."""
        self.datasets = {}
        self.subject_ids = []

        # Prepare data for each volunteer
        for vol_id, df in self.merged_data.items():
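            # Extract features and labels
            features = df.iloc[:, :-1].values.astype(np.float32)  # All columns except the last one
            # Encode labels with one fixed class order so that the codes agree across volunteers
            # (per-volunteer category codes differ when a stage is absent); int64 is required by CrossEntropyLoss
            stages = ['N1', 'N2', 'N3', 'R', 'W']  # stages observed in the label distributions
            labels = pd.Categorical(df['label'], categories=stages).codes.astype(np.int64)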

            # Store datasets and subject IDs
            self.datasets[vol_id] = SleepStageDataset(features, labels)
            self.subject_ids.append(vol_id)

    def train_dataloader(self, subject_id):
        """Create a DataLoader for training (all subjects except one)."""
        train_data = []
        train_labels = []

        for vol_id, dataset in self.datasets.items():
            if vol_id != subject_id:
                train_data.append(dataset.data)
                train_labels.append(dataset.labels)

        train_data = np.concatenate(train_data)
        train_labels = np.concatenate(train_labels)

        train_dataset = SleepStageDataset(train_data, train_labels)
        return DataLoader(train_dataset, batch_size=self.batch_size, shuffle=True)

    def val_dataloader(self, subject_id):
        """Create a DataLoader for validation (one subject)."""
        val_dataset = self.datasets[subject_id]
        return DataLoader(val_dataset, batch_size=self.batch_size)

# Example usage
data_module = SleepStageDataModule(merged_data, batch_size=32)

In [None]:
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping

# Define a simple PyTorch Lightning model
class SleepStageModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Sequential(
            torch.nn.Linear(18, 64),  # Adjust input size based on your data
            torch.nn.ReLU(),
            torch.nn.Linear(64, 5)    # 5 output classes, matching the stages in the data (N1, N2, N3, R, W)
        )
        self.loss_fn = torch.nn.CrossEntropyLoss()

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.loss_fn(y_hat, y)
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.loss_fn(y_hat, y)
        self.log("val_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Perform LOSO cross-validation
for subject_id in data_module.subject_ids:
    print(f"\nTraining on all subjects except {subject_id}...")

    # Initialize model and trainer
    model = SleepStageModel()
    trainer = Trainer(max_epochs=10, callbacks=[EarlyStopping(monitor="val_loss", patience=3)])

    # Train and validate
    trainer.fit(model, data_module.train_dataloader(subject_id), data_module.val_dataloader(subject_id))
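
LOSO cross-validation yields one validation score per held-out subject, and the loop above does not yet aggregate them. The sketch below shows one way to summarise the folds; `summarize_folds` is a hypothetical helper, and the scores in the usage line are illustrative placeholders, not results from this notebook.

```python
from statistics import mean, stdev

def summarize_folds(fold_scores):
    """Summarise LOSO results.

    fold_scores: dict mapping held-out subject ID -> validation score.
    Returns the mean, the sample standard deviation, and the fold count.
    """
    values = list(fold_scores.values())
    return {
        "mean": mean(values),
        "std": stdev(values) if len(values) > 1 else 0.0,
        "n_folds": len(values),
    }

# Placeholder numbers, for illustration only
summary = summarize_folds({"Vol_01": 1.0, "Vol_02": 2.0, "Vol_03": 3.0})
print(summary)
```

Inside the loop, a per-fold value could be collected after `trainer.fit`, for example from `trainer.callback_metrics["val_loss"]`, and stored under the held-out `subject_id`.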

In [None]:
class CIFAR10DataModule(pl.LightningDataModule):
    def __init__(self, params):
        super().__init__()
        self.seed = params['seed']
        self.batch_size = params['bs']
        self.num_workers = params['num_workers']
        self.mean = params['mean']
        self.std = params['std']
        self.ns = params['num_samples']
        self.rand_res_crop = params['randResCrop']
        self.noise_path = params.get('noise_path', './data/CIFAR-10_human.pt')
        self.noise_type = params.get('noise_type', 'worse_label')  # Default to 'worse_label'

        # Ensure the data directory exists
        os.makedirs(os.path.dirname(self.noise_path), exist_ok=True)

        # Download the CIFAR-10_human.pt file if it doesn't exist
        if not os.path.exists(self.noise_path):
            print(f"Downloading CIFAR-10_human.pt from GitHub...")
            download_file(
                url="https://github.com/UCSC-REAL/cifar-10-100n/raw/main/data/CIFAR-10_human.pt",
                save_path=self.noise_path
            )

        self.transform = transforms.Compose([
            transforms.RandomResizedCrop(size=self.rand_res_crop['size'],
                                         scale=self.rand_res_crop['scale'],
                                         ratio=self.rand_res_crop['ratio']),
            transforms.ToTensor(),
            transforms.Normalize(self.mean, self.std)
        ])

    def prepare_data(self):
        # Download CIFAR-10 dataset
        datasets.CIFAR10(root='./data', train=True, download=True)
        datasets.CIFAR10(root='./data', train=False, download=True)

    def setup(self, stage=None):
        # Load noisy labels
        noise_file = torch.load(self.noise_path)
        clean_label = noise_file['clean_label']
        noisy_label = noise_file[self.noise_type]

        # Split dataset into train and validation sets
        cifar10_full = CIFAR10(root='./data', train=True, transform=self.transform,
                               noise_type=self.noise_type, noise_path=self.noise_path, is_human=True)
        pl.seed_everything(self.seed)
        self.cifar10_train, self.cifar10_val = random_split(cifar10_full,
                                                            [self.ns['train'],
                                                             self.ns['val']])
        self.cifar10_test = CIFAR10(root='./data', train=False, transform=self.transform)

    def train_dataloader(self):
        return DataLoader(self.cifar10_train, batch_size=self.batch_size,
                          num_workers=self.num_workers, shuffle=True)

    def val_dataloader(self):
        return DataLoader(self.cifar10_val, batch_size=self.batch_size,
                          num_workers=self.num_workers)

    def test_dataloader(self):
        return DataLoader(self.cifar10_test, batch_size=self.batch_size,
                          shuffle=False)
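
The data module above calls a `download_file(url, save_path)` helper that is not defined in this notebook. A minimal sketch of such a helper follows, assuming a plain HTTP(S) download with the standard library; the original implementation is not shown, so the body here is a guess that only matches the call signature and the printed message.

```python
import os
import shutil
import urllib.request

def download_file(url, save_path):
    """Download `url` to `save_path`, creating the parent directory if needed."""
    parent = os.path.dirname(save_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # Stream the response to disk instead of holding it all in memory
    with urllib.request.urlopen(url) as response, open(save_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    print(f"File downloaded and saved to {save_path}")
```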

Training module

In [None]:
class train_model(pl.LightningModule):
    def __init__(self, model=None, loss=None, hparams=hparams):
        super().__init__()
        self.save_hyperparameters(hparams)
        self.model = model
        self.loss_fn = loss
        self.nc = hparams['n_classes']
        self.lr = hparams['lr']
        self.wd = hparams['weight_decay']

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        x, y, _ = batch  # Unpack batch (ignore indices for now)
        logits = self(x)
        loss = self.loss_fn(logits, y)

        # Log training loss and accuracy
        # preds = torch.argmax(logits[:, :self.nc], dim=1)
        # acc = (preds == y).float().mean()
        self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
        # self.log('train_acc', acc, on_step=True, on_epoch=True, prog_bar=True)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y, _ = batch  # Unpack batch (ignore indices for now)
        logits = self(x)
        loss = self.loss_fn(logits, y)

        # Log validation loss and accuracy
        # preds = torch.argmax(logits[:, :self.nc], dim=1)
        # acc = (preds == y).float().mean()
        self.log('val_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
        # self.log('val_acc', acc, on_step=True, on_epoch=True, prog_bar=True)
        return loss

    def test_step(self, batch, batch_idx):
        x, y, _ = batch  # Unpack batch (ignore indices for now)
        logits = self(x)
        loss = self.loss_fn(logits, y)

        # Log test loss and accuracy
        preds = torch.argmax(logits[:, :self.nc], dim=1)
        acc = (preds == y).float().mean()
        self.log('test_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
        self.log('test_acc', acc, on_step=True, on_epoch=True, prog_bar=True)
        return {'loss': loss, 'preds': preds, 'y': y}

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.lr, weight_decay=self.wd)

        # Optional learning-rate scheduler; with gamma=1.0 the learning rate stays constant
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=1.0)
        return [optimizer], [scheduler]
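
For reference, the step-decay rule used by `StepLR` multiplies the learning rate by `gamma` once every `step_size` epochs. The small function below reproduces that rule in plain Python; the name `step_lr` is hypothetical and the function is only an illustration of the formula, not a replacement for the scheduler.

```python
def step_lr(initial_lr, step_size, gamma, epoch):
    """Learning rate under step decay after `epoch` completed epochs."""
    return initial_lr * gamma ** (epoch // step_size)

# With gamma=1.0, as configured above, nothing changes over time
print(step_lr(0.001, step_size=10, gamma=1.0, epoch=25))
# With gamma=0.5 the rate would be halved at epochs 10 and 20
print(step_lr(0.1, step_size=10, gamma=0.5, epoch=25))
```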

### Models

CNN from the paper by [Xia](https://arxiv.org/abs/2106.00445)

In [None]:
def call_bn(bn, x):
    return bn(x)

class CNN(nn.Module):
    def __init__(self, input_channel=3, n_outputs=10, dropout_rate=0.25, top_bn=False):
        super().__init__()  # initialise nn.Module before assigning attributes
        self.dropout_rate = dropout_rate
        self.top_bn = top_bn
        self.c1=nn.Conv2d(input_channel,128,kernel_size=3,stride=1, padding=1)
        self.c2=nn.Conv2d(128,128,kernel_size=3,stride=1, padding=1)
        self.c3=nn.Conv2d(128,128,kernel_size=3,stride=1, padding=1)
        self.c4=nn.Conv2d(128,256,kernel_size=3,stride=1, padding=1)
        self.c5=nn.Conv2d(256,256,kernel_size=3,stride=1, padding=1)
        self.c6=nn.Conv2d(256,256,kernel_size=3,stride=1, padding=1)
        self.c7=nn.Conv2d(256,512,kernel_size=3,stride=1, padding=0)
        self.c8=nn.Conv2d(512,256,kernel_size=3,stride=1, padding=0)
        self.c9=nn.Conv2d(256,128,kernel_size=3,stride=1, padding=0)
        self.l_c1=nn.Linear(128,n_outputs)
        self.bn1=nn.BatchNorm2d(128)
        self.bn2=nn.BatchNorm2d(128)
        self.bn3=nn.BatchNorm2d(128)
        self.bn4=nn.BatchNorm2d(256)
        self.bn5=nn.BatchNorm2d(256)
        self.bn6=nn.BatchNorm2d(256)
        self.bn7=nn.BatchNorm2d(512)
        self.bn8=nn.BatchNorm2d(256)
        self.bn9=nn.BatchNorm2d(128)
        self.bn_c1=nn.BatchNorm1d(n_outputs)  # used in forward() when top_bn=True

    def forward(self, x,):
        h=x
        h=self.c1(h)
        h=F.leaky_relu(call_bn(self.bn1, h), negative_slope=0.01)
        h=self.c2(h)
        h=F.leaky_relu(call_bn(self.bn2, h), negative_slope=0.01)
        h=self.c3(h)
        h=F.leaky_relu(call_bn(self.bn3, h), negative_slope=0.01)
        h=F.max_pool2d(h, kernel_size=2, stride=2)
        h=F.dropout2d(h, p=self.dropout_rate, training=self.training)  # dropout only in training mode

        h=self.c4(h)
        h=F.leaky_relu(call_bn(self.bn4, h), negative_slope=0.01)
        h=self.c5(h)
        h=F.leaky_relu(call_bn(self.bn5, h), negative_slope=0.01)
        h=self.c6(h)
        h=F.leaky_relu(call_bn(self.bn6, h), negative_slope=0.01)
        h=F.max_pool2d(h, kernel_size=2, stride=2)
        h=F.dropout2d(h, p=self.dropout_rate, training=self.training)  # dropout only in training mode

        h=self.c7(h)
        h=F.leaky_relu(call_bn(self.bn7, h), negative_slope=0.01)
        h=self.c8(h)
        h=F.leaky_relu(call_bn(self.bn8, h), negative_slope=0.01)
        h=self.c9(h)
        h=F.leaky_relu(call_bn(self.bn9, h), negative_slope=0.01)
        h=F.avg_pool2d(h, kernel_size=h.shape[2])

        h = h.view(h.size(0), h.size(1))
        logit=self.l_c1(h)
        if self.top_bn:
            logit=call_bn(self.bn_c1, logit)
        return logit

ResNet50

In [None]:
class ResNet50(nn.Module):
    def __init__(self, n_outputs, freeze=True):
        """
        Args:
            n_outputs (int): Number of output classes.
            freeze (bool): If True, freeze all layers except the head.
        """
        super(ResNet50, self).__init__()
        self.n_outputs = n_outputs
        self.freeze = freeze

        # Load the pre-trained ResNet50 model
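        # `pretrained=True` is deprecated in recent torchvision; `weights=` selects the same ImageNet checkpoint
        self.resnet50 = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)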

        # Modify the final layer to match the number of outputs
        self.resnet50.fc = nn.Linear(self.resnet50.fc.in_features, n_outputs)

        # Freeze all layers except the head if freeze=True
        if self.freeze:
            self._freeze_layers()

    def _freeze_layers(self):
        """
        Freeze all layers except the head.
        """
        # Freeze all parameters in the model
        for param in self.resnet50.parameters():
            param.requires_grad = False

        # Unfreeze the final classification layer (head)
        for param in self.resnet50.fc.parameters():
            param.requires_grad = True

    def forward(self, x):
        return self.resnet50(x)

def count_trainable_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Example usage
model_frozen = ResNet50(n_outputs=10, freeze=True)
print(f"Trainable parameters (freeze=True): {count_trainable_parameters(model_frozen)}")

model_unfrozen = ResNet50(n_outputs=10, freeze=False)
print(f"Trainable parameters (freeze=False): {count_trainable_parameters(model_unfrozen)}")

Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /root/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
100%|██████████| 97.8M/97.8M [00:00<00:00, 119MB/s]


Trainable parameters (freeze=True): 20490
Trainable parameters (freeze=False): 23528522


ViT

In [None]:
class ViT(nn.Module):
    def __init__(self, n_outputs, freeze=True):
        """
        Args:
            n_outputs (int): Number of output classes.
            freeze (bool): If True, freeze all layers except the head.
        """
        super(ViT, self).__init__()
        self.n_outputs = n_outputs
        self.freeze = freeze

        # Load the pre-trained ViT model
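        # `pretrained=True` is deprecated in recent torchvision; `weights=` selects the same ImageNet checkpoint
        self.vit = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)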

        # Modify the final layer to match the number of outputs
        self.vit.heads.head = nn.Linear(self.vit.heads.head.in_features, n_outputs)

        # Freeze all layers except the head if freeze=True
        if self.freeze:
            self._freeze_layers()

    def _freeze_layers(self):
        """
        Freeze all layers except the head.
        """
        # Freeze all parameters in the model
        for param in self.vit.parameters():
            param.requires_grad = False

        # Unfreeze the head (final classification layer)
        for param in self.vit.heads.head.parameters():
            param.requires_grad = True

    def forward(self, x):
        return self.vit(x)

def count_trainable_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Example usage
model_frozen = ViT(n_outputs=10, freeze=True)
print(f"Trainable parameters (freeze=True): {count_trainable_parameters(model_frozen)}")

model_unfrozen = ViT(n_outputs=10, freeze=False)
print(f"Trainable parameters (freeze=False): {count_trainable_parameters(model_unfrozen)}")

Downloading: "https://download.pytorch.org/models/vit_b_16-c867db91.pth" to /root/.cache/torch/hub/checkpoints/vit_b_16-c867db91.pth
100%|██████████| 330M/330M [00:05<00:00, 62.3MB/s]


Trainable parameters (freeze=True): 7690
Trainable parameters (freeze=False): 85806346
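
The frozen counts printed for both models can be checked by hand: with `freeze=True` only the new linear head is trainable, and a linear layer has `in_features * out_features` weights plus `out_features` biases. The sketch below does the arithmetic, assuming the standard head input sizes of 2048 for ResNet50 and 768 for ViT-B/16; `linear_params` is a hypothetical helper name.

```python
def linear_params(in_features, out_features, bias=True):
    """Number of parameters in a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)

print(linear_params(2048, 10))  # ResNet50 head -> 20490
print(linear_params(768, 10))   # ViT-B/16 head -> 7690
```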


### Loss functions

Create a loss function class, or use a standard one.

In [None]:
# Cross-entropy loss written from scratch (just in case)
class CELoss(nn.Module):
    def __init__(self, params=hparams):
        super(CELoss, self).__init__()
        self.smoothing = params.get('label_smoothing', 0.1)  # Default smoothing value
        self.num_classes = params.get('n_classes', 10)
        self.inv_smoothing = 1.0 - self.smoothing  # Probability for the correct class

    def forward(self, x, y):
        """
        x: Model output (logits)
            - Shape: (batch_size, num_classes)
        y: Labels
            - Shape: (batch_size,)
        """
        # Apply label smoothing to the one-hot encoded labels
        with torch.no_grad():
            yoh = torch.zeros_like(x)  # Create a one-hot encoded version of y
            yoh.fill_(self.smoothing / (self.num_classes - 1))  # Fill with smoothed values
            yoh.scatter_(1, y.unsqueeze(1), self.inv_smoothing)  # Set correct class to 1 - smoothing

        # Compute the cross-entropy loss between logits and smoothed labels
        log_probs = F.log_softmax(x, dim=1)  # Log probabilities
        loss = -(yoh * log_probs).sum(dim=1).mean()  # Sum over classes and mean over batch

        return loss
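
The label-smoothing arithmetic in `CELoss` can be followed for a single sample without tensors. The sketch below mirrors the same steps in plain Python (smoothed target, log-softmax, weighted sum); the function names are hypothetical and the code is meant only as an illustration of the computation.

```python
import math

def smoothed_targets(label, num_classes, smoothing):
    """1 - smoothing for the true class, smoothing / (K - 1) for every other class."""
    off_value = smoothing / (num_classes - 1)
    return [1.0 - smoothing if k == label else off_value for k in range(num_classes)]

def log_softmax(logits):
    """Numerically stable log-softmax of a list of logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum_exp for v in logits]

def smoothed_cross_entropy(logits, label, smoothing=0.0):
    """Cross-entropy between the smoothed target and the softmax of `logits`."""
    targets = smoothed_targets(label, len(logits), smoothing)
    return -sum(t * lp for t, lp in zip(targets, log_softmax(logits)))

print(smoothed_targets(1, 4, 0.1))
print(smoothed_cross_entropy([2.0, 0.5, -1.0], label=0, smoothing=0.1))
```

Note that this scheme differs slightly from the `label_smoothing` argument of `torch.nn.CrossEntropyLoss`, which mixes the one-hot target with a uniform distribution over all K classes, so the true class keeps `1 - smoothing + smoothing / K`.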

In [None]:
class NLoss(nn.Module):
    def __init__(self, params=hparams):
        super(NLoss, self).__init__()
        self.smoothing =   params.get('label_smoothing', 0.0)
        self.num_classes = params.get('n_classes', 10)
        self.inv_smoothing = 1.0 - self.smoothing  # Probability for the correct class

    def forward(self, x, y):
        """
        x: Model output (logits + log variance)
            - x[:, :self.num_classes]: Logits for class probabilities (h)
            - x[:, self.num_classes:]: Logarithmic variance (s)
        y: Labels
        """
        # Split the model output into predictions (h) and log variance (s)
        logits = x[:, :self.num_classes]  # Predictions (h)
        log_var = x[:, self.num_classes:]  # Logarithmic variance (s)

        # Apply label smoothing to the one-hot encoded labels
        with torch.no_grad():
            yoh = torch.zeros_like(logits)
            yoh.fill_(self.smoothing / (self.num_classes - 1))
            yoh.scatter_(1, y.data.unsqueeze(1), self.inv_smoothing)

        # Compute the squared differences between predictions and smoothed labels
        squared_diff = torch.pow(yoh - logits, 2)  # (y_k - h_k)^2

        # Compute the exponential of the negative log variance (e^{-s})
        exp_neg_log_var = torch.exp(-log_var)

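        # Compute the first term of the loss: e^{-s} * sum((y_k - h_k)^2)
        # keepdim=True keeps shape (batch, 1) so it matches log_var instead of broadcasting to (batch, batch)
        term1 = exp_neg_log_var * squared_diff.sum(dim=1, keepdim=True)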

        # Compute the second term of the loss: N * s
        term2 = self.num_classes * log_var

        # Combine the terms and compute the mean over the batch
        loss = (term1 + term2).mean()

        return loss
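
Written out, the per-sample terms described in the comments of `NLoss.forward` combine as follows. This is a transcription of the code, not a formula taken from a reference. The symbols are introduced here for readability: $B$ is the batch size, $K$ the number of classes (`num_classes`, called N in the comments), $h_{ik}$ the first $K$ outputs, $s_i$ the extra log-variance output, and $\varepsilon$ the label-smoothing value.

$$
\mathcal{L}_{N} = \frac{1}{B}\sum_{i=1}^{B}\left[ e^{-s_i}\sum_{k=1}^{K}\left(\tilde{y}_{ik} - h_{ik}\right)^{2} + K\,s_i \right],
\qquad
\tilde{y}_{ik} =
\begin{cases}
1-\varepsilon, & k = y_i \\
\dfrac{\varepsilon}{K-1}, & k \neq y_i
\end{cases}
$$

The first term shrinks the squared error for samples with large predicted variance, while the second term penalises large $s_i$, so the model cannot lower the loss simply by predicting high variance everywhere.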

In [None]:
class BLoss(nn.Module):
    def __init__(self, params=hparams):
        super(BLoss, self).__init__()
        self.smoothing =   params.get('label_smoothing', 0.0)
        self.num_classes = params.get('n_classes', 10)
        self.inv_smoothing = 1.0 - self.smoothing  # Probability for the correct class


    def forward(self, x, y):
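        # Extract certainty and probabilities from the model output
        # squeeze(1) gives shape (batch,), matching the per-sample cosine similarity computed below
        certainty = torch.sigmoid(x[:, self.num_classes:]).squeeze(1)  # Certainty values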
        logits = x[:, :self.num_classes]  # Logits for class probabilities
        prob = F.softmax(logits, dim=1)  # Softmax probabilities

        # Compute cosine similarity between predictions and labels
        cos = nn.CosineSimilarity(dim=1)

        # Apply label smoothing to the one-hot encoded labels
        with torch.no_grad():
            yoh = torch.zeros_like(logits)
            yoh.fill_(self.smoothing / (self.num_classes - 1))
            yoh.scatter_(1, y.data.unsqueeze(1), self.inv_smoothing)


        # Compute the terms of the loss
        cosyh = cos(yoh, prob)
        delta = yoh * prob  # Element-wise product of one-hot labels and probabilities
        entropy_term = delta * torch.log(delta + 1e-10)  # Entropy term (avoid log(0))

        # Loss terms
        loss0 = -cosyh * torch.log(certainty / self.num_classes + 1e-10)  # First term
        loss1 = -(self.num_classes - 1) * (1 - cosyh) * torch.log((1 - certainty) / self.num_classes + 1e-10)  # Second term

        # Combine the terms and compute the mean over the batch
        loss = (entropy_term.sum(dim=1) + loss0 + loss1).mean()

        return loss
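
For reference, the terms computed in `BLoss.forward` can be written as a single expression. This is a transcription of the code, with notation introduced here: $B$ is the batch size, $K$ the number of classes, $p_i$ the softmax of the first $K$ outputs, $c_i$ the sigmoid of the extra output, $\tilde{y}_i$ the smoothed one-hot label, and $\eta = 10^{-10}$ the constant that guards the logarithms.

$$
\mathcal{L}_{B} = \frac{1}{B}\sum_{i=1}^{B}\left[
\sum_{k=1}^{K}\delta_{ik}\log\left(\delta_{ik}+\eta\right)
- \cos_i \,\log\!\left(\frac{c_i}{K}+\eta\right)
- (K-1)\left(1-\cos_i\right)\log\!\left(\frac{1-c_i}{K}+\eta\right)
\right]
$$

$$
\delta_{ik} = \tilde{y}_{ik}\,p_{ik},
\qquad
\cos_i = \frac{\tilde{y}_i \cdot p_i}{\lVert \tilde{y}_i \rVert\,\lVert p_i \rVert}
$$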

### Models zoo

Architectures and loss functions

In [None]:
def get_arch_and_loss(hparams):
    """
    Returns the architecture and loss function based on the provided hparams.

    Args:
        hparams (dict): Hyperparameters dictionary, including 'architecture' and 'criterion'.

    Returns:
        arch: The model architecture.
        loss: The loss function.
    """
    # Determine the number of outputs based on the loss function
    if hparams['criterion'] in ['B', 'N']:
        n_outputs = hparams['n_classes'] + 1  # Add 1 output neuron for BLoss or NLoss
    else:
        n_outputs = hparams['n_classes']  # Default number of outputs

    # Define the architectures lazily, so that only the requested model is built
    # (and only its pretrained weights are downloaded)
    architectures = {
        'CNN': lambda: CNN(n_outputs=n_outputs),
        'ResNet50': lambda: ResNet50(n_outputs=n_outputs, freeze=hparams.get('freeze', True)),
        'ViT': lambda: ViT(n_outputs=n_outputs, freeze=hparams.get('freeze', True)),
    }

    # Define the loss functions, configured from the hparams passed to this function
    losses = {
        'CE': lambda: CELoss(hparams),
        'B': lambda: BLoss(hparams),
        'N': lambda: NLoss(hparams),
    }

    # Look up the requested architecture and loss
    arch_fn = architectures.get(hparams['architecture'])
    loss_fn = losses.get(hparams['criterion'])

    if arch_fn is None:
        raise ValueError(f"Architecture '{hparams['architecture']}' is not supported.")
    if loss_fn is None:
        raise ValueError(f"Loss function '{hparams['criterion']}' is not supported.")

    return arch_fn(), loss_fn()


### Metrics

In [None]:
def metrics(dataloader,model,hparams=hparams,loss_fn_red=None):
    # Collect images, predictions, and losses
    # images = []
    preds  = []
    labels = []
    losses = []
    correct= 0
    total  = 0
    model.eval()  # disable dropout and use running batch-norm statistics
    for batch in dataloader:
        x, y, _ = batch
        with torch.no_grad():
            logits = model(x)
            # loss = loss_fn_red(h,y)
            pred = torch.argmax(logits[:,:hparams['n_classes']], dim=1)
        correct += (pred == y).sum().item()  # Number of correct predictions
        total += y.size(0)  # Total number of samples

        # images.extend(x.cpu())
        preds.extend(pred.cpu().numpy())
        labels.extend(y.cpu().numpy())
        # losses.extend(loss.cpu().numpy())
    acc = correct / total
    return preds, labels, acc
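
The predictions and labels returned by `metrics` are enough to build a confusion matrix and per-class recall, which is more informative than accuracy alone when the classes are imbalanced. The sketch below uses NumPy only; the function names are hypothetical, and rows are taken as true classes and columns as predicted classes, the same layout as `sklearn.metrics.confusion_matrix`.

```python
import numpy as np

def confusion_counts(labels, preds, n_classes):
    """Confusion matrix with true classes as rows and predictions as columns."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    # np.add.at accumulates correctly when the same (row, col) pair repeats
    np.add.at(cm, (np.asarray(labels), np.asarray(preds)), 1)
    return cm

def per_class_recall(cm):
    """Fraction of each true class that was predicted correctly (0 if the class is absent)."""
    support = cm.sum(axis=1)
    return np.divide(np.diag(cm), support, out=np.zeros(len(cm)), where=support > 0)

cm = confusion_counts(labels=[0, 0, 1, 2], preds=[0, 1, 1, 1], n_classes=3)
print(cm)
print(per_class_recall(cm))
```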

### Visualization
Note: this requires the loss value of each individual sample to be collected.

In [None]:
# Plot image samples with top loss values
def top_losses_vis(vis_params, images, preds, labels, losses):
    num_imgs = vis_params['num_samples']
    top_loss_indices = np.argsort(losses)[-num_imgs:]

    plt.figure(figsize=(num_imgs*2, 2))
    for i, idx in enumerate(top_loss_indices):
        plt.subplot(1, num_imgs, i + 1)
        plt.imshow(images[idx].squeeze(), cmap='gray')
        plt.title(f'True: {labels[idx]}\nPred: {preds[idx]}\nLoss: {losses[idx]:.2f}')
        plt.axis('off')
    plt.show()

# Plot confusion matrix
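def conf_mat(cm, figsize, class_names=None):
    # cm: confusion matrix (e.g. from sklearn.metrics.confusion_matrix); figsize: (width, height) tuple
    plt.figure(figsize=figsize)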
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                xticklabels=class_names, yticklabels=class_names)
    plt.xlabel('Predicted')
    plt.ylabel('True')
    plt.title('Confusion Matrix')
    plt.show()
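
To make the layout of the plotted matrix concrete, the following sketch builds a small confusion matrix by hand and derives per-class recall from it. It uses toy labels and a hypothetical helper name. Rows index the true class and columns the predicted class, which is the convention `sklearn.metrics.confusion_matrix` follows and which matches the axis labels in `conf_mat`.

```python
import numpy as np


def confusion_counts(true_labels, pred_labels, num_classes):
    """Count matrix with true classes in rows and predicted classes in columns."""
    counts = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair repeats.
    np.add.at(counts, (np.asarray(true_labels), np.asarray(pred_labels)), 1)
    return counts


toy_true = [0, 0, 1, 1, 2, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0, 2]

toy_cm = confusion_counts(toy_true, toy_pred, num_classes=3)
# Recall of class k = correct predictions of k / number of true samples of k.
per_class_recall = toy_cm.diagonal() / toy_cm.sum(axis=1)
```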

# Ensembling
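Several copies of the same architecture are trained with different random seeds, and their test-set predictions are combined by majority voting. The diversity introduced by the seeds is expected to make the ensemble more robust and may improve overall accuracy on the test set.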
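
As a minimal illustration of combining per-seed predictions by majority vote, the sketch below uses three hypothetical models and toy labels. The helper name `majority_vote` and the tie-breaking rule (the smallest class index wins) are assumptions of this sketch, not necessarily the behavior of the notebook's evaluation code.

```python
import numpy as np


def majority_vote(label_matrix, num_classes):
    """label_matrix: shape (num_models, num_samples), entries are class indices."""
    label_matrix = np.asarray(label_matrix)
    # votes[c, j] = number of models that predicted class c for sample j.
    votes = np.stack([(label_matrix == c).sum(axis=0) for c in range(num_classes)])
    # argmax returns the smallest class index when several classes tie.
    return votes.argmax(axis=0)


# Three hypothetical models, four samples, three classes.
toy_predictions = [[0, 1, 2, 2],
                   [0, 2, 2, 1],
                   [1, 1, 2, 0]]
voted = majority_vote(toy_predictions, num_classes=3)
```

In the last sample every class receives one vote, so the result depends entirely on the tie-breaking rule.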

## Create Dataset and Data Loaders

Initialize the data module, which provides the datasets and the data loaders.

In [None]:
data_module = CIFAR10DataModule(hparams)

Downloading CIFAR-10_human.pt from GitHub...
File downloaded and saved to ./data/CIFAR-10_human.pt


## Train the Ensemble

Loop over different seeds

In [None]:
# List to store predictions from each model
all_predictions = []

In [None]:
for seed in SEEDS:
    # Set seed for reproducibility at the VERY BEGINNING
    pl.seed_everything(seed)

    # Reinitialize the model architecture for each seed
    arch, loss_fn = get_arch_and_loss(hparams)
    # archs_and_losses = get_arch_and_loss(hparams)

    # arch, loss_fn = archs_and_losses[hparams['criterion']]['arch']
    # loss_fn = archs_and_losses[hparams['criterion']]['loss']


    checkpoint_callback_img = ModelCheckpoint(
        monitor='val_loss',       # Monitor validation loss
        dirpath=CHECKPOINT_PATH,  # Directory to save checkpoints
        filename=f'best_model_{ARCHITECTURE}_{LOSS_FUN}_{seed}_{NOISE_TYPE}',  # Checkpoint filename
        save_top_k=1,             # Save only the best model
        mode='min',               # Minimize validation loss
    )

    task = Task.init(project_name="ICML-2025",
                     task_name=f'arch_{ARCHITECTURE}_loss_{LOSS_FUN}_seed_{seed}_noise_{NOISE_TYPE}')

    # Initialize the model with the reinitialized architecture
    model = train_model(model=arch, loss=loss_fn)

    # Log hyperparameters to ClearML
    task.connect(model.hparams)

    trainer = Trainer(max_epochs=hparams['num_epochs'],
                      callbacks=[checkpoint_callback_img],
                      accelerator="auto", devices="auto")
    trainer.fit(model, data_module)

    best_model_path = checkpoint_callback_img.best_model_path
    task.update_output_model(model_path=best_model_path, auto_delete_file=False)
    best_model = train_model.load_from_checkpoint(best_model_path,
                                                  model=arch,
                                                  loss=loss_fn)

    # Test set
    test_dataloader = data_module.test_dataloader()
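    # Move the model to the correct device and switch to evaluation mode
    # (so that layers such as batch normalization and dropout behave deterministically)
    best_model = best_model.to(device)
    best_model.eval()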
    predictions = []
    with torch.no_grad():
        for batch in test_dataloader:
            x, _, _, = batch  # We only need the input data, not the labels
            logits = best_model(x.to(device))
            preds = torch.argmax(logits[:, :NUM_CLASSES], dim=1)
            predictions.append(preds.cpu().numpy())
    predictions = np.concatenate(predictions)  # Combine all batch predictions
    all_predictions.append(predictions)

    # Keep the last task open so that the test metrics can be attached to it later
    if seed != SEEDS[-1]:
        task.close()
        del model, best_model, task, arch, loss_fn

INFO:lightning_fabric.utilities.seed:Seed set to 42


ClearML Task: created new task id=0615a2ba5e3943bb99577e0a264a8476
2025-02-16 18:42:44,354 - clearml.Task - INFO - Storing jupyter notebook directly as code
ClearML results page: https://app.clear.ml/projects/ccaa059e6de442b6abe578eab9e214c8/experiments/0615a2ba5e3943bb99577e0a264a8476/output/log


INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


100%|██████████| 170M/170M [00:05<00:00, 31.0MB/s]


Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified



You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.



2025-02-16 18:43:01,193 - clearml.model - INFO - Selected model id: 1cc66cd032514d479ec921f8acfd3445


INFO:lightning_fabric.utilities.seed:Seed set to 42


Loaded worse_label from ./data/CIFAR-10_human.pt.
The overall noise rate is 0.40208


INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name    | Type     | Params | Mode 
---------------------------------------------
0 | model   | ResNet50 | 23.5 M | train
1 | loss_fn | CELoss   | 0      | train
---------------------------------------------
20.5 K    Trainable params
23.5 M    Non-trainable params
23.5 M    Total params
94.114    Total estimated model params size (MB)
153       Modules in train mode
0         Modules in eval mode


2025-02-16 18:45:59,808 - clearml.frameworks - INFO - Found existing registered model id=51761c246edc4427a1b9fa7a875d58c7 [/content/saved_models/best_model_ResNet50_CE_42_worse_label.ckpt] reusing it.


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=2` reached.


2025-02-16 18:49:01,351 - clearml.storage - INFO - Uploading: 90.23MB to /content/saved_models/best_model_ResNet50_CE_42_worse_label.ckpt



Attribute 'model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['model'])`.


Attribute 'loss' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['loss'])`.


clamping frac to range [0, 1]

███████████████████████████████ 100% | 90.23/90.23 MB [00:02<00:00, 32.96MB/s]: 

2025-02-16 18:49:04,116 - clearml.Task - INFO - Completed model upload to https://files.clear.ml/ICML-2025/arch_ResNet50_loss_CE_seed_42_noise_worse_label.0615a2ba5e3943bb99577e0a264a8476/models/best_model_ResNet50_CE_42_worse_label.ckpt



INFO:lightning_fabric.utilities.seed:Seed set to 0

The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.


Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet50_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet50_Weights.DEFAULT` to get the most up-to-date weights.


Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ViT_B_16_Weights.IMAGENET1K_V1`. You can also use `weights=ViT_B_16_Weights.DEFAULT` to get the most up-to-date weights.



ClearML Task: created new task id=551c3a6f3572475fbc3c0a6680905257
ClearML results page: https://app.clear.ml/projects/ccaa059e6de442b6abe578eab9e214c8/experiments/551c3a6f3572475fbc3c0a6680905257/output/log


Parameters must be of builtin type (General/mean[ndarray], General/std[ndarray])
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


Files already downloaded and verified
Files already downloaded and verified



INFO:lightning_fabric.utilities.seed:Seed set to 42


Loaded worse_label from ./data/CIFAR-10_human.pt.
The overall noise rate is 0.40208



Checkpoint directory /content/saved_models exists and is not empty.

INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name    | Type     | Params | Mode 
---------------------------------------------
0 | model   | ResNet50 | 23.5 M | train
1 | loss_fn | CELoss   | 0      | train
---------------------------------------------
20.5 K    Trainable params
23.5 M    Non-trainable params
23.5 M    Total params
94.114    Total estimated model params size (MB)
153       Modules in train mode
0         Modules in eval mode


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=2` reached.

Attribute 'model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['model'])`.


Attribute 'loss' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['loss'])`.


clamping frac to range [0, 1]

███████████████████████████████ 100% | 90.23/90.23 MB [00:02<00:00, 33.70MB/s]: 
INFO:lightning_fabric.utilities.seed:Seed set to 17

ClearML Task: created new task id=943411caf4fa4a97a1e924ac3d9d5b60
ClearML results page: https://app.clear.ml/projects/ccaa059e6de442b6abe578eab9e214c8/experiments/943411caf4fa4a97a1e924ac3d9d5b60/output/log


Parameters must be of builtin type (General/mean[ndarray], General/std[ndarray])
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


Files already downloaded and verified
Files already downloaded and verified



INFO:lightning_fabric.utilities.seed:Seed set to 42


Loaded worse_label from ./data/CIFAR-10_human.pt.
The overall noise rate is 0.40208



Checkpoint directory /content/saved_models exists and is not empty.

INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name    | Type     | Params | Mode 
---------------------------------------------
0 | model   | ResNet50 | 23.5 M | train
1 | loss_fn | CELoss   | 0      | train
---------------------------------------------
20.5 K    Trainable params
23.5 M    Non-trainable params
23.5 M    Total params
94.114    Total estimated model params size (MB)
153       Modules in train mode
0         Modules in eval mode


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=2` reached.

Attribute 'model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['model'])`.


Attribute 'loss' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['loss'])`.


clamping frac to range [0, 1]

███████████████████████████████ 100% | 90.23/90.23 MB [00:02<00:00, 33.99MB/s]: 
INFO:lightning_fabric.utilities.seed:Seed set to 9

ClearML Task: created new task id=34e438b3c6724545808fa8a1702ddee1
ClearML results page: https://app.clear.ml/projects/ccaa059e6de442b6abe578eab9e214c8/experiments/34e438b3c6724545808fa8a1702ddee1/output/log


Parameters must be of builtin type (General/mean[ndarray], General/std[ndarray])
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


Files already downloaded and verified
Files already downloaded and verified



INFO:lightning_fabric.utilities.seed:Seed set to 42


Loaded worse_label from ./data/CIFAR-10_human.pt.
The overall noise rate is 0.40208



Checkpoint directory /content/saved_models exists and is not empty.

INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name    | Type     | Params | Mode 
---------------------------------------------
0 | model   | ResNet50 | 23.5 M | train
1 | loss_fn | CELoss   | 0      | train
---------------------------------------------
20.5 K    Trainable params
23.5 M    Non-trainable params
23.5 M    Total params
94.114    Total estimated model params size (MB)
153       Modules in train mode
0         Modules in eval mode


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=2` reached.

Attribute 'model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['model'])`.


Attribute 'loss' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['loss'])`.


clamping frac to range [0, 1]

███████████████████████████████ 100% | 90.23/90.23 MB [00:02<00:00, 33.55MB/s]: 
INFO:lightning_fabric.utilities.seed:Seed set to 3

ClearML Task: created new task id=61dbe3bc20b849ad91a785b27d8486ef
ClearML results page: https://app.clear.ml/projects/ccaa059e6de442b6abe578eab9e214c8/experiments/61dbe3bc20b849ad91a785b27d8486ef/output/log


Parameters must be of builtin type (General/mean[ndarray], General/std[ndarray])
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs


Files already downloaded and verified
Files already downloaded and verified



INFO:lightning_fabric.utilities.seed:Seed set to 42


Loaded worse_label from ./data/CIFAR-10_human.pt.
The overall noise rate is 0.40208



Checkpoint directory /content/saved_models exists and is not empty.

INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name    | Type     | Params | Mode 
---------------------------------------------
0 | model   | ResNet50 | 23.5 M | train
1 | loss_fn | CELoss   | 0      | train
---------------------------------------------
20.5 K    Trainable params
23.5 M    Non-trainable params
23.5 M    Total params
94.114    Total estimated model params size (MB)
153       Modules in train mode
0         Modules in eval mode


INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=2` reached.

Attribute 'model' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['model'])`.


Attribute 'loss' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['loss'])`.


clamping frac to range [0, 1]

███████████████████████████████ 100% | 90.23/90.23 MB [00:02<00:00, 34.36MB/s]: 


## Test the models and the ensemble of the models

In [None]:
all_predictions

[array([5, 8, 8, ..., 5, 1, 7]),
 array([5, 8, 8, ..., 5, 1, 7]),
 array([5, 8, 8, ..., 5, 1, 7]),
 array([5, 8, 8, ..., 5, 1, 7]),
 array([5, 8, 8, ..., 5, 1, 7])]

Individual models

In [None]:
# List to store individual model accuracies
individual_accuracies = []

# Compute accuracy for each model
for i, predictions in enumerate(all_predictions):
    # Get predictions for the current model
    model_predictions = predictions  # Shape: (num_samples,)

    # True labels of the test set (identical for every model)
    true_labels = np.array(data_module.cifar10_test.targets)

    # Calculate accuracy for the current model
    accuracy = accuracy_score(true_labels, model_predictions)
    individual_accuracies.append(accuracy)
    print(f'Model {i+1} Accuracy: {accuracy:.4f}')

# Convert to numpy array for easier calculations
individual_accuracies = np.array(individual_accuracies)

# Compute mean accuracy
mean_accuracy = np.mean(individual_accuracies)

# Compute standard deviation of accuracy
std_accuracy = np.std(individual_accuracies)

print(f'Mean Accuracy: {mean_accuracy:.4f}')
print(f'Standard Deviation of Accuracy: {std_accuracy:.4f}')

Model 1 Accuracy: 0.6987
Model 2 Accuracy: 0.6979
Model 3 Accuracy: 0.6985
Model 4 Accuracy: 0.6990
Model 5 Accuracy: 0.6998
Mean Accuracy: 0.6988
Standard Deviation of Accuracy: 0.0006
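
One detail worth keeping in mind when reporting spread over a handful of seeds: `np.std` defaults to `ddof=0` (the population formula), whereas `ddof=1` applies Bessel's correction and gives a slightly larger value for small samples. The sketch below recomputes both from the five accuracies printed above. Which convention to report is a choice left to the reader.

```python
import numpy as np

# Per-seed accuracies copied from the output above.
accuracies = np.array([0.6987, 0.6979, 0.6985, 0.6990, 0.6998])

mean_acc = accuracies.mean()
population_std = accuracies.std()    # ddof=0, the np.std default
sample_std = accuracies.std(ddof=1)  # ddof=1, Bessel's correction

print(f"mean={mean_acc:.4f}  std(ddof=0)={population_std:.4f}  std(ddof=1)={sample_std:.4f}")
```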


Ensemble

In [None]:
# Stack predictions from all models
all_predictions = np.stack(all_predictions)  # Shape: (num_models, num_samples)

# Ensemble by majority voting over the predicted class labels
# (averaging is only meaningful for probabilities or logits, not for class indices)
final_predictions, _ = mode(all_predictions, axis=0)  # Most frequent label per sample
final_predictions = final_predictions.flatten()  # Flatten to 1D array

# Get true labels from the CIFAR-10 data set
test_labels = np.array(data_module.cifar10_test.targets)
# test_labels = data_module.test_dataset.labels  # Adjust this based on your dataset

# Calculate accuracy
accuracy = accuracy_score(test_labels, final_predictions)
print(f'Ensemble Accuracy: {accuracy:.4f}')

# Compute confusion matrix
cm = confusion_matrix(test_labels, final_predictions)

Ensemble Accuracy: 0.7001
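
The cell above combines hard labels by majority vote. An alternative that is not used in this notebook is soft voting: average the class probabilities of the models and take the argmax. It requires storing logits or probabilities per model instead of class indices. The sketch below uses hypothetical logits and shows that the two schemes can disagree when one model is much more confident than the others.

```python
import numpy as np


def softmax(logits):
    """Softmax over the last axis, shifted by the maximum for stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical logits with shape (num_models, num_samples, num_classes).
toy_logits = np.array([
    [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.5, 0.0], [0.0, 3.0, 0.0]],
    [[0.0, 0.6, 0.0], [0.0, 0.0, 0.2]],
])

per_model_labels = toy_logits.argmax(axis=-1)  # hard labels, (num_models, num_samples)
mean_probs = softmax(toy_logits).mean(axis=0)  # (num_samples, num_classes)
soft_vote = mean_probs.argmax(axis=1)          # (num_samples,)
```

For the first sample two of the three models pick class 1, so a majority vote returns 1, while the averaged probabilities favor class 0 because the first model is far more confident.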


In [None]:
# Test metrics computed in the cells above
test_metrics = {
    "Mean Accuracy (individual)": mean_accuracy,
    "Standard Deviation of Accuracy (individual)": std_accuracy,
    "Ensemble Accuracy": accuracy,
}

task.connect(test_metrics)

{'Mean Accuracy (individual)': 0.69878,
 'Standard Deviation of Accuracy (individual)': 0.0006241794613730883,
 'Ensemble Accuracy': 0.7001}

In [None]:
task.close()