<a href="https://www.kaggle.com/code/pietrocaforio/unimodal-ct-training-kaggle?scriptVersionId=196992587" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Train unimodal CT

In [1]:
!git clone https://github.com/PietroCaforio/research-biocv-proj
!cd research-biocv-proj && git switch dev

Cloning into 'research-biocv-proj'...
remote: Enumerating objects: 178, done.
remote: Counting objects: 100% (178/178), done.
remote: Compressing objects: 100% (136/136), done.
remote: Total 178 (delta 94), reused 106 (delta 34), pack-reused 0 (from 0)
Receiving objects: 100% (178/178), 3.42 MiB | 29.39 MiB/s, done.
Resolving deltas: 100% (94/94), done.
Branch 'dev' set up to track remote branch 'dev' from 'origin'.
Switched to a new branch 'dev'


In [2]:
!cd research-biocv-proj && git pull

Already up to date.


In [3]:
!pip install wandb



In [4]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("wandb_api_key")

In [5]:
import wandb
wandb.login(key=secret_value_0)

wandb: W&B API key is configured. Use `wandb login --relogin` to force relogin
wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

In [6]:
import sys
from pathlib import Path

# Add the repository root to sys.path so the `data` package can be imported
sys.path.append(str(Path('research-biocv-proj').resolve()))
from data.unimodal import *

import numpy as np
import torch
from torch.utils.data import DataLoader

### Train ResNet model

In [7]:
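def train(model, config, run_name=None):
  """Train `model` and log validation metrics to W&B.

  Relies on the notebook-level globals `train_loader`, `val_loader` and `device`,
  and on `nn` / `optim`, which must all be defined before this function is called.
  """
  wandb.init(
    # set the wandb project where this run will be logged
    project="unimodal_ct_training",
    name=run_name,
    # track hyperparameters and run metadata
    config=config
  )
  optimizer = optim.Adam(model.parameters(), lr=config["learning_rate"], weight_decay=config["weight_decay"])
  criterion = nn.CrossEntropyLoss()
  # Reduce the learning rate when the monitored validation loss stops decreasing
  scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, 'min', factor=config["reduce_lr_factor"], patience=config["patience"])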
  # Training loop
  num_epochs = config["epochs"]
  for epoch in range(num_epochs):
      model.train()
      running_loss = 0.0

      for batch in train_loader:
          frames = batch['frame'].float().to(device)
          labels = batch['label'].long().to(device)

          optimizer.zero_grad()
          outputs = model(frames)
          loss = criterion(outputs.logits, labels)

          loss.backward()
          optimizer.step()

          running_loss += loss.item()

      print(f"Epoch {epoch+1}, Loss: {running_loss/len(train_loader)}")

      # Validation loop
      model.eval()
      val_loss = 0.0
      correct = 0
      total = 0
      # Initialize counters for each class (G1, G2, G3)
      correct_per_class = [0, 0, 0]  # For G1, G2, G3
      total_per_class = [0, 0, 0]  # For G1, G2, G3

      with torch.no_grad():
          for batch in val_loader:
              frames = batch['frame'].float().to(device)
              labels = batch['label'].long().to(device)

              outputs = model(frames)
              loss = criterion(outputs.logits, labels)

              val_loss += loss.item()
              _, predicted = torch.max(outputs.logits, 1)
              total += labels.size(0)
              correct += (predicted == labels).sum().item()

              # Calculate accuracy per class
              for i in range(3):  # We have 3 classes: G1 (0), G2 (1), G3 (2)
                  correct_per_class[i] += ((predicted == i) & (labels == i)).sum().item()
                  total_per_class[i] += (labels == i).sum().item()
      scheduler.step(val_loss)
      # Compute total accuracy and per-class accuracy
      total_accuracy = 100 * correct / total
      class_accuracy = [(100 * correct_per_class[i] / total_per_class[i]) if total_per_class[i] > 0 else 0 for i in range(3)]
      print(f"Validation Loss: {val_loss/len(val_loader)}, Total Accuracy: {total_accuracy:.2f}%")
      print(f"Accuracy per class - G1: {class_accuracy[0]:.2f}%, G2: {class_accuracy[1]:.2f}%, G3: {class_accuracy[2]:.2f}%")
      # log metrics to wandb
      wandb.log({"Total Accuracy": total_accuracy, "Validation Loss": val_loss/len(val_loader), "G1_Acc":class_accuracy[0], "G2_Acc":class_accuracy[1], "G3_Acc":class_accuracy[2]})
  wandb.finish()  
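
The validation loop above counts, for each grade, how many samples of that class were predicted correctly. A minimal pure-Python sketch of the same bookkeeping, using plain lists instead of tensors (the helper name `per_class_accuracy` and the toy inputs are hypothetical and not part of the repository):

```python
def per_class_accuracy(predicted, labels, num_classes=3):
    """Return the accuracy (%) for each class, i.e. the share of samples of
    that class that were predicted correctly. Classes with no samples get 0."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for p, y in zip(predicted, labels):
        total[y] += 1
        if p == y:
            correct[y] += 1
    return [100 * c / t if t > 0 else 0.0 for c, t in zip(correct, total)]


# Toy example: class indices 0, 1, 2 stand for G1, G2, G3
preds = [1, 1, 2, 1, 2, 1]
labels = [0, 1, 2, 1, 1, 2]
acc = per_class_accuracy(preds, labels)
print(acc)
```

In the toy example the single G1 sample is misclassified as G2, so the G1 accuracy is 0 even though most predictions overall are correct.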

In [8]:
import torch.nn as nn
import torch.optim as optim
from transformers import ResNetForImageClassification
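
`train` steps `ReduceLROnPlateau` once per epoch with the summed validation loss. The simplified simulation below illustrates how `reduce_lr_factor` and `patience` interact. It is a sketch, not PyTorch's implementation: it assumes the default relative threshold of 1e-4 and no cooldown, ignores `min_lr` and `eps`, and the loss values are made up for illustration.

```python
def simulate_reduce_on_plateau(losses, lr=1e-3, factor=0.2, patience=10, threshold=1e-4):
    """Return the learning rate in effect after each epoch's scheduler step."""
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in losses:
        # An epoch counts as an improvement only if it beats the best loss
        # by more than the relative threshold.
        if loss < best * (1 - threshold):
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        # The rate is cut once the number of bad epochs exceeds `patience`.
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
        history.append(lr)
    return history


# Hypothetical loss curve: three improving epochs, then a 12-epoch plateau
losses = [3.0, 2.0, 1.0] + [1.5] * 12
lrs = simulate_reduce_on_plateau(losses)
print(lrs)
```

With `patience=10` the rate drops only on the 11th consecutive epoch without improvement, so a noisy validation loss can keep the initial learning rate in place for a large share of a 100-epoch run.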

In [9]:
train_dataset = UnimodalCTDataset(split='train',dataset_path = "/kaggle/input/preprocessed57patientscptacpda/processed/" )
val_dataset = UnimodalCTDataset(split='val',dataset_path = "/kaggle/input/preprocessed57patientscptacpda/processed/")

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

In [10]:
print(f"Training set stats:{train_dataset.stats()}")
print(f"Validation set stats:{val_dataset.stats()}")

Training set stats:{'length': 2292, 'class_frequency': {'G1': 78, 'G2': 1543, 'G3': 671}}
Validation set stats:{'length': 335, 'class_frequency': {'G1': 37, 'G2': 166, 'G3': 132}}
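
The stats above show a strong class imbalance: G1 makes up 78 of 2292 training frames. As a reference point for the accuracies reported later, the sketch below computes the accuracy of always predicting the most frequent class, plus inverse-frequency class weights. The weighting scheme N / (K * n_c) and both helper names are assumptions added for illustration; the notebook itself trains with an unweighted loss.

```python
# Class counts copied from the stats printed above
train_counts = {"G1": 78, "G2": 1543, "G3": 671}
val_counts = {"G1": 37, "G2": 166, "G3": 132}


def majority_baseline(counts):
    """Accuracy (%) obtained by always predicting the most frequent class."""
    return 100 * max(counts.values()) / sum(counts.values())


def inverse_frequency_weights(counts):
    """Weight each class by N / (K * n_c), so that rarer classes weigh more."""
    n, k = sum(counts.values()), len(counts)
    return {c: n / (k * n_c) for c, n_c in counts.items()}


val_baseline = majority_baseline(val_counts)
weights = inverse_frequency_weights(train_counts)
print(f"Validation majority-class baseline: {val_baseline:.2f}%")
print({c: round(w, 3) for c, w in weights.items()})
```

Weights like these could be passed as a tensor to the `weight` argument of `nn.CrossEntropyLoss` if one wanted to penalize G1 errors more heavily; whether that helps on this dataset is untested here.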


### ResNet-50

In [11]:
model = ResNetForImageClassification.from_pretrained('microsoft/resnet-50')
# Replace the final layer so the output size matches the dataset's number of classes
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, UnimodalCTDataset.num_classes)


In [12]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

ResNetForImageClassification(
  (resnet): ResNetModel(
    (embedder): ResNetEmbeddings(
      (embedder): ResNetConvLayer(
        (convolution): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
        (normalization): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (activation): ReLU()
      )
      (pooler): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    )
    (encoder): ResNetEncoder(
      (stages): ModuleList(
        (0): ResNetStage(
          (layers): Sequential(
            (0): ResNetBottleNeckLayer(
              (shortcut): ResNetShortCut(
                (convolution): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
                (normalization): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
              )
              (layer): Sequential(
                (0): ResNetConvLayer(
                  (convolution): Conv2d(64

In [13]:
config={
    "learning_rate": 1e-3,
    "architecture": "microsoft/resnet-50 new",
    "epochs": 100,
    "weight_decay": 1e-3,
    "reduce_lr_factor": 0.2,
    "patience": 10
    }
train(model, config, run_name=config["architecture"])

wandb: Currently logged in as: pietro-caforio (pietro-caforio-politecnico-di-milano). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.18.1 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.17.7
wandb: Run data is saved locally in /kaggle/working/wandb/run-20240917_093200-0zfhmuc1
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run microsoft/resnet-50 new
wandb: ⭐️ View project at https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training
wandb: 🚀 View run at https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training/runs/0zfhmuc1


Epoch 1, Loss: 0.31855842446546173
Validation Loss: 20.33497434448112, Total Accuracy: 62.39%
Accuracy per class - G1: 0.00%, G2: 75.30%, G3: 63.64%
Epoch 2, Loss: 0.034177304352245606
Validation Loss: 75.31478070543909, Total Accuracy: 61.79%
Accuracy per class - G1: 13.51%, G2: 79.52%, G3: 53.03%
Epoch 3, Loss: 0.028254963575616583
Validation Loss: 13.86516178958118, Total Accuracy: 63.88%
Accuracy per class - G1: 0.00%, G2: 81.93%, G3: 59.09%
Epoch 4, Loss: 0.05973166145364909
Validation Loss: 6.552539239132205, Total Accuracy: 74.63%
Accuracy per class - G1: 0.00%, G2: 85.54%, G3: 81.82%
Epoch 5, Loss: 0.03177825947770745
Validation Loss: 3.6120613618669184, Total Accuracy: 74.63%
Accuracy per class - G1: 0.00%, G2: 89.76%, G3: 76.52%
Epoch 6, Loss: 0.016632634853724286
Validation Loss: 2.4240675855597313, Total Accuracy: 67.16%
Accuracy per class - G1: 0.00%, G2: 75.30%, G3: 75.76%
Epoch 7, Loss: 0.04171222469287588
Validation Loss: 1.8789448911662807, Total Accuracy: 61.19%
Accur

wandb:
wandb: Run history:
wandb:          G1_Acc ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:          G2_Acc ▆▆▆▇▆▇▅▇▆▆▁▆██▆▆▆▆▇▇▇▇▇▇▇▇▇▇▇▆▇▇▇▇▇▇▇▇▇▇
wandb:          G3_Acc ▅▅▇▅▄▅▆▄▁▂▄▂▁▁████████████▇▇▇▇▇▇▇▇▇▇▇▇▇▇
wandb:  Total Accuracy ▆▆▆▇▆▆▅▆▄▅▁▄▆▆▇▇▇▇███████████▇▇█████▇█▇█
wandb: Validation Loss █▆▂▂▂▃▂▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:
wandb: Run summary:
wandb:          G1_Acc 0.0
wandb:          G2_Acc 90.96386
wandb:          G3_Acc 75.75758
wandb:  Total Accuracy 74.92537
wandb: Validation Loss 1.65964
wandb:
wandb: 🚀 View run microsoft/resnet-50 new at: https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training/runs/0zfhmuc1

### ResNet-18

In [14]:
model = ResNetForImageClassification.from_pretrained('microsoft/resnet-18')
# Replace the final layer so the output size matches the dataset's number of classes
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, UnimodalCTDataset.num_classes)


In [15]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

ResNetForImageClassification(
  (resnet): ResNetModel(
    (embedder): ResNetEmbeddings(
      (embedder): ResNetConvLayer(
        (convolution): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
        (normalization): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (activation): ReLU()
      )
      (pooler): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    )
    (encoder): ResNetEncoder(
      (stages): ModuleList(
        (0): ResNetStage(
          (layers): Sequential(
            (0): ResNetBasicLayer(
              (shortcut): Identity()
              (layer): Sequential(
                (0): ResNetConvLayer(
                  (convolution): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
                  (normalization): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
                  (activation): ReLU()
           

In [16]:
config={
    "learning_rate": 1e-3,
    "architecture": "microsoft/resnet-18 new",
    "epochs": 100,
    "weight_decay": 1e-3,
    "reduce_lr_factor": 0.2,
    "patience": 10
    }
train(model, config, run_name=config["architecture"])

wandb: wandb version 0.18.1 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.17.7
wandb: Run data is saved locally in /kaggle/working/wandb/run-20240917_101107-ozpj3373
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run microsoft/resnet-18 new
wandb: ⭐️ View project at https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training
wandb: 🚀 View run at https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training/runs/ozpj3373


Epoch 1, Loss: 0.23343877721991804
Validation Loss: 2.2102840433839117, Total Accuracy: 64.48%
Accuracy per class - G1: 0.00%, G2: 75.30%, G3: 68.94%
Epoch 2, Loss: 0.05873192230743977
Validation Loss: 2.0643023872240023, Total Accuracy: 61.19%
Accuracy per class - G1: 0.00%, G2: 75.30%, G3: 60.61%
Epoch 3, Loss: 0.04576335646076283
Validation Loss: 2.0903198697434906, Total Accuracy: 55.52%
Accuracy per class - G1: 0.00%, G2: 36.75%, G3: 94.70%
Epoch 4, Loss: 0.0012593339874405905
Validation Loss: 1.6908757906745782, Total Accuracy: 74.63%
Accuracy per class - G1: 0.00%, G2: 75.30%, G3: 94.70%
Epoch 5, Loss: 0.0006579747636755605
Validation Loss: 1.5838441643863916, Total Accuracy: 75.22%
Accuracy per class - G1: 0.00%, G2: 78.92%, G3: 91.67%
Epoch 6, Loss: 0.0019885401911273626
Validation Loss: 1.9976299589669162, Total Accuracy: 49.25%
Accuracy per class - G1: 0.00%, G2: 34.34%, G3: 81.82%
Epoch 7, Loss: 0.06524496578115052
Validation Loss: 2.0845895165707176, Total Accuracy: 68.06%

wandb:
wandb: Run history:
wandb:          G1_Acc ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:          G2_Acc ▆▁▁█▇█▆▆▄█▃▆▅▆▆▅▆▆▆▆▇▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆
wandb:          G3_Acc ▄▇▆▅▆▅▆█▅▁▆▆▆▆▆▆▆▅▅▅▄▄▄▄▄▅▄▄▅▆▅▄▄▅▆▄▅▅▅▅
wandb:  Total Accuracy ▅▃▁▇▇▇▆█▃▄▄▆▅▆▆▅▆▆▆▆▆▅▅▅▅▆▅▅▆▆▆▅▅▅▆▅▅▅▆▆
wandb: Validation Loss ▄▃▃█▃▃▃▁▅▅▂▃▃▂▃▄▃▃▃▃▂▂▂▂▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃
wandb:
wandb: Run summary:
wandb:          G1_Acc 0.0
wandb:          G2_Acc 77.71084
wandb:          G3_Acc 78.0303
wandb:  Total Accuracy 69.25373
wandb: Validation Loss 1.74507
wandb:
wandb: 🚀 View run microsoft/resnet-18 new at: https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training/runs/ozpj3373

### ResNet-34

In [17]:

model = ResNetForImageClassification.from_pretrained('microsoft/resnet-34')
# Replace the final layer so the output size matches the dataset's number of classes
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, UnimodalCTDataset.num_classes)


In [18]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

ResNetForImageClassification(
  (resnet): ResNetModel(
    (embedder): ResNetEmbeddings(
      (embedder): ResNetConvLayer(
        (convolution): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
        (normalization): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (activation): ReLU()
      )
      (pooler): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    )
    (encoder): ResNetEncoder(
      (stages): ModuleList(
        (0): ResNetStage(
          (layers): Sequential(
            (0): ResNetBasicLayer(
              (shortcut): Identity()
              (layer): Sequential(
                (0): ResNetConvLayer(
                  (convolution): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
                  (normalization): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
                  (activation): ReLU()
           

In [19]:
config={
    "learning_rate": 1e-3,
    "architecture": "microsoft/resnet-34 new",
    "epochs": 100,
    "weight_decay": 1e-3,
    "reduce_lr_factor": 0.2,
    "patience": 20
    }
train(model, config, run_name=config["architecture"])

wandb: wandb version 0.18.1 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.17.7
wandb: Run data is saved locally in /kaggle/working/wandb/run-20240917_103632-k727aj4g
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run microsoft/resnet-34 new
wandb: ⭐️ View project at https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training
wandb: 🚀 View run at https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training/runs/k727aj4g


Epoch 1, Loss: 0.20202610432493706
Validation Loss: 5.608174627477473, Total Accuracy: 54.93%
Accuracy per class - G1: 0.00%, G2: 100.00%, G3: 13.64%
Epoch 2, Loss: 0.07349865093581481
Validation Loss: 2.0374332565221596, Total Accuracy: 65.67%
Accuracy per class - G1: 0.00%, G2: 68.07%, G3: 81.06%
Epoch 3, Loss: 0.04099719719346871
Validation Loss: 3.1141539784994965, Total Accuracy: 42.39%
Accuracy per class - G1: 8.11%, G2: 75.30%, G3: 10.61%
Epoch 4, Loss: 0.0738862558436166
Validation Loss: 9.00469750652767, Total Accuracy: 30.45%
Accuracy per class - G1: 0.00%, G2: 0.00%, G3: 77.27%
Epoch 5, Loss: 0.04712846941967857
Validation Loss: 3.6497657205062835, Total Accuracy: 57.01%
Accuracy per class - G1: 0.00%, G2: 75.30%, G3: 50.00%
Epoch 6, Loss: 0.004149384109243531
Validation Loss: 3.3041919232557784, Total Accuracy: 58.81%
Accuracy per class - G1: 0.00%, G2: 75.30%, G3: 54.55%
Epoch 7, Loss: 0.05527411167925796
Validation Loss: 4.507976371921938, Total Accuracy: 40.90%
Accuracy 

wandb:
wandb: Run history:
wandb:          G1_Acc ▁▂▁▁▁▁██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:          G2_Acc █▄▄▁▃▄▄▆▅▄▅▄▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆
wandb:          G3_Acc ▁▁▅█▇▆▁▆▆▅▄▆▅▅▆▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅
wandb:  Total Accuracy ▃▁▄▅▅▅▃█▅▄▄▅▅▅▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆▆
wandb: Validation Loss █▄▄▁▂▂▁▁▄▃▃▁▂▃▁▂▂▂▃▃▄▃▆▄▄▄▇▅▅▄▄▄▄▄▄▄▄▅▅▄
wandb:
wandb: Run summary:
wandb:          G1_Acc 0.0
wandb:          G2_Acc 89.75904
wandb:          G3_Acc 65.15152
wandb:  Total Accuracy 70.14925
wandb: Validation Loss 3.02514
wandb:
wandb: 🚀 View run microsoft/resnet-34 new at: https://wandb.ai/pietro-caforio-politecnico-di-milano/unimodal_ct_training/runs/k727aj4g