# Other Non-causal Baseline Models for Comparison with CAPRI-CT

The non-causal baseline models serve as a reference point for comparison with the causal-aware CAPRI-CT model. Unlike CAPRI-CT, which integrates causal reasoning to understand how interventions affect outcomes, the baseline models rely solely on correlational patterns in the data and do not explicitly model causal relationships.

The baselines are a ResNet model, a SqueezeNet model, and a DenseNet model (trained here but excluded from the final comparison). Each predicts the Signal-to-Noise Ratio (SNR) from observed features, without accounting for causal interventions.

While effective at capturing associations, the non-causal baselines lack robustness to changes caused by interventions or shifts in the data distribution. They therefore provide a meaningful benchmark for demonstrating the advantages of causal-aware models like CAPRI-CT in terms of interpretability, generalization, and handling of counterfactual scenarios.
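
To make the distinction between correlational and interventional behavior concrete, here is a minimal toy sketch in plain Python. It is not part of CAPRI-CT or the baselines: the variables `z`, `x`, `y` and the effect sizes are assumptions chosen only for illustration. A hidden confounder drives both the feature and the outcome, so a purely correlational fit overestimates the effect that an intervention on the feature would actually have.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

rng = random.Random(0)
n = 5000
TRUE_EFFECT = 1.0  # assumed causal effect of x on y in this toy model

# Observational data: a hidden confounder z drives both x and y.
z = [rng.gauss(0, 1) for _ in range(n)]
x_obs = [zi + rng.gauss(0, 0.1) for zi in z]
y_obs = [TRUE_EFFECT * xi + 3.0 * zi + rng.gauss(0, 0.1) for xi, zi in zip(x_obs, z)]

# Interventional data: x is set independently of z, as in do(x).
x_do = [rng.gauss(0, 1) for _ in range(n)]
y_do = [TRUE_EFFECT * xi + 3.0 * zi + rng.gauss(0, 0.1) for xi, zi in zip(x_do, z)]

obs_slope = slope(x_obs, y_obs)  # absorbs the confounder's influence
do_slope = slope(x_do, y_do)     # stays close to TRUE_EFFECT
print(f"observational slope: {obs_slope:.2f}, interventional slope: {do_slope:.2f}")
```

A model fitted only on the observational data would carry the inflated slope into any setting where `x` is changed deliberately, which is the kind of failure the baselines are expected to show.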


In [None]:
######################################################################################
# Importing the required libraries for the other non-causal baseline models
######################################################################################

import pandas as pd
import os
from PIL import Image
import random
import numpy as np
import torch
import torch.nn as nn
from pathlib import Path
import torch.nn.functional as F
from torchvision import transforms
from torch.utils.data import Dataset, DataLoader, Subset
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from torch.optim import Adam, lr_scheduler
from torch.optim.lr_scheduler import ReduceLROnPlateau
from torchvision.models import densenet121, DenseNet121_Weights
from torchvision.models import squeezenet1_1, SqueezeNet1_1_Weights

In [None]:
##########################################################################################
# Below is the Class CTDataset 
# Combining the CT image with the metadata using Dataset package
# for SNR prediction
##########################################################################################

class CTDataset(Dataset):
    """
    PyTorch Dataset for loading CT scan images and associated metadata.

    Args:
        metadata_csv (str or Path): Path to the CSV file containing metadata.
        img_folder_path (str or Path): Directory containing CT scan image files.
        transform (callable, optional): Transformations to apply to the images.

    Attributes:
        img_data (pd.DataFrame): DataFrame containing the metadata.
        img_folder (Path): Path to the image folder.
        transform (callable or None): Optional transform to apply to images.

    Methods:
        __getitem__(idx): Returns a single data sample consisting of:
            - transformed image tensor (grayscale),
            - one-hot encoded agent vector (tensor),
            - voltage (tensor),
            - time (tensor),
            - SNR (tensor).
        __len__(): Returns the total number of samples.
    """
    def __init__(self, metadata_csv, img_folder_path, transform=None):
        self.img_data = pd.read_csv(metadata_csv)
        self.img_folder = img_folder_path
        self.transform = transform

    def __getitem__(self, idx):
        row = self.img_data.iloc[idx]
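        img = Image.open(os.path.join(self.img_folder, row['Filename'])).convert('L')
        # Fall back to a plain tensor conversion so `image` is always defined,
        # even when no transform is supplied.
        image = self.transform(img) if self.transform else transforms.ToTensor()(img)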

        agent_dict = {'Iodine': 0, 'BiNPs 50nm': 1, 'BiNPs 100nm': 2}
        agent_vector = torch.zeros(len(agent_dict))
        agent_vector[agent_dict[row['Classification']]] = 1

        voltage = torch.tensor([row['Voltage']], dtype=torch.float32)
        time = torch.tensor([row['Time']], dtype=torch.float32)
        snr = torch.tensor([row['SNR']], dtype=torch.float32)

        return image, agent_vector, voltage, time, snr

    def __len__(self):
        return len(self.img_data)
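
The metadata returned by `CTDataset` is later concatenated by the models in the order voltage, time, agent one-hot, which is where the metadata dimension of 5 comes from. The sketch below reproduces that layout in plain Python. The helper `encode_metadata` and the example settings are hypothetical and exist only for illustration; the agent mapping is copied from `CTDataset`.

```python
# Same agent-to-index mapping as in CTDataset.__getitem__.
AGENT_INDEX = {'Iodine': 0, 'BiNPs 50nm': 1, 'BiNPs 100nm': 2}

def encode_metadata(agent, voltage, time):
    """Return [voltage, time, one-hot agent] as a flat list of floats."""
    one_hot = [0.0] * len(AGENT_INDEX)
    one_hot[AGENT_INDEX[agent]] = 1.0
    return [float(voltage), float(time)] + one_hot

# Hypothetical acquisition settings, for illustration only.
meta = encode_metadata('BiNPs 50nm', voltage=80, time=2)
print(meta)  # [80.0, 2.0, 0.0, 1.0, 0.0]
```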

In [None]:
############################################################################################
# get_data_loaders function:
# Loads a CT dataset with images and metadata.
# Splits the dataset into training and validation sets.
# Applies image transformations.
############################################################################################

def get_data_loaders(seed, test_ind=False, model_name='sample'):
    """
    Returns PyTorch DataLoaders for training and validation splits of the CT dataset.

    Args:
        seed (int): Random seed for reproducibility of dataset splits.
        test_ind (bool): If True, returns a small subset of the dataset for testing/debugging.
        model_name (str): Specifies model type ('resnet', 'squeezenet', 'densenet', or 'sample') 
                          to adjust input image resolution accordingly.

    Returns:
        Tuple[DataLoader, DataLoader]: DataLoaders for training and validation sets.
    """
    torch.manual_seed(seed)
    generator = torch.Generator().manual_seed(seed)

    # SqueezeNet and DenseNet use 128x128 inputs; the ResNet baseline and the
    # default case use 9x9 inputs.
    size = (128, 128) if model_name in ('squeezenet', 'densenet') else (9, 9)
    transform = transforms.Compose([
        transforms.Resize(size),
        transforms.ToTensor()
    ])
    

    base_path = Path("../dataset")
    
    
    dataset = CTDataset(
        metadata_csv= base_path / "final_dataset.csv",
        img_folder_path= base_path / "img" ,
        transform=transform
    )

    if test_ind:
        tiny_subset = torch.utils.data.Subset(dataset, indices=list(range(20)))
        train_set, val_set = torch.utils.data.random_split(tiny_subset, [16, 4])
        train_loader = DataLoader(train_set, batch_size=2, shuffle=True)
        val_loader = DataLoader(val_set, batch_size=2)

    else:
        # Split
        train_indices, val_indices = train_test_split(list(range(len(dataset))), test_size=0.2, random_state=seed)
        train_subset = Subset(dataset, train_indices)
        val_subset = Subset(dataset, val_indices)

        # DataLoaders
        train_loader = DataLoader(train_subset, batch_size=16, shuffle=True, generator=generator)
        val_loader = DataLoader(val_subset, batch_size=16, shuffle=False)
    return train_loader, val_loader

In [None]:
###############################################################
# Setting the seed value for each training loop
###############################################################

def set_seed(seed):
    """
    Sets the random seed across Python, NumPy, and PyTorch (CPU and GPU) for reproducibility.

    Parameters
    ----------
    seed : int
        The seed value to ensure deterministic behavior across runs.
    """
    
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

In [None]:
#################################################################################
# Below is our simple ResNet non-causal baseline model
#################################################################################

class SimpleResidualBlock1(nn.Module):
    """
    A basic residual block with two convolutional layers and an optional downsampling path.

    Args:
        in_channels (int): Number of input channels.
        out_channels (int): Number of output channels.
        downsample (bool): Whether to downsample the input (by stride=2) to match output size.

    Structure:
        - Conv → BN → ReLU → Conv → BN
        - Optional downsampling for residual path
        - Residual connection added to output
    """

    def __init__(self, in_channels, out_channels, downsample=False):
        super().__init__()
        stride = 2 if downsample else 1

        self.conv1 = nn.Conv2d(in_channels, out_channels, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)

        self.downsample = None
        if downsample or in_channels != out_channels:
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        identity = x

        out = F.relu(self.bn1(self.conv1(x)), inplace=True)
        out = self.bn2(self.conv2(out))

        if self.downsample is not None:
            identity = self.downsample(identity)

        out += identity
        return F.relu(out, inplace=True)

class CTResNetModel(nn.Module):
    """
    A ResNet-based model for CT image analysis with metadata fusion for SNR prediction.

    Args:
        image_channels (int): Number of input image channels (default is 1 for grayscale CT).
        meta_dim (int): Dimension of the metadata vector (default is 5).

    Architecture:
        - Convolutional stem followed by stacked residual blocks
        - Adaptive pooling to flatten image features
        - Metadata (e.g., voltage, time, agent) passed through a feedforward network
        - Image and metadata features are concatenated and passed through fully connected layers
        - Outputs a single SNR prediction (regression)

    Forward Inputs:
        - image (Tensor): CT image tensor of shape (B, C, H, W)
        - agent_vector (Tensor): Encoded vector for contrast agent
        - voltage (Tensor): Voltage metadata (B, 1)
        - time (Tensor): Time metadata (B, 1)

    Returns:
        - snr (Tensor): Predicted SNR value for each input image (B, 1)
    """
    
    def __init__(self, image_channels=1, meta_dim=5):
        super().__init__()

        self.stem_channels = 64

        self.stem = nn.Sequential(
            nn.Conv2d(image_channels, self.stem_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(self.stem_channels),
            nn.ReLU(inplace=True)
        )

        self.blocks = nn.Sequential(
            SimpleResidualBlock1(self.stem_channels, 64, downsample=False),   # 9x9 -> 9x9
            SimpleResidualBlock1(64, 128, downsample=True),                   # 9x9 -> 5x5
            SimpleResidualBlock1(128, 128, downsample=False),                 # 5x5 -> 5x5
            SimpleResidualBlock1(128, 256, downsample=True),                  # 5x5 -> 3x3
            SimpleResidualBlock1(256, 256, downsample=False),                 # 3x3 -> 3x3
        )

        self.adaptive_pool = nn.AdaptiveAvgPool2d((1, 1))
        self.flat_dim = 256

        # Meta network stays the same but outputs 32 dims
        self.meta_net = nn.Sequential(
            nn.Linear(meta_dim, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, 32),
            nn.ReLU(inplace=True)
        )

        # Fully connected layers now take fused feature + meta together
        self.fc1 = nn.Sequential(
            nn.Linear(self.flat_dim + 32, 256),
            nn.BatchNorm1d(256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.2)
        )
        self.fc2 = nn.Sequential(
            nn.Linear(256, 128),
            nn.BatchNorm1d(128),
            nn.ReLU(inplace=True),
            nn.Dropout(0.2)
        )

        self.head_snr = nn.Linear(128, 1)

        self._init_weights()

    def _init_weights(self):
        for m in self.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)

    def forward(self, image, agent_vector, voltage, time):
        x = self.stem(image)
        x = self.blocks(x)
        x = self.adaptive_pool(x)
        x = torch.flatten(x, 1)

        meta = torch.cat([voltage, time, agent_vector], dim=1)
        meta = self.meta_net(meta)

        fused = torch.cat([x, meta], dim=1)
        fused = self.fc1(fused)
        fused = self.fc2(fused)

        snr = self.head_snr(fused)

        return snr
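
The spatial sizes noted next to the residual blocks (9x9 -> 5x5 -> 3x3) follow from the standard convolution output-size formula, floor((size + 2*padding - kernel) / stride) + 1. The short check below is a sketch that applies that formula to the kernel, stride, and padding values used in `SimpleResidualBlock1`. The helper `conv_out` is introduced here for illustration and is not part of the notebook.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

size = 9  # input resolution used for the ResNet baseline
sizes = [size]
for downsample in (False, True, False, True, False):  # the five blocks in order
    stride = 2 if downsample else 1
    main = conv_out(size, kernel=3, stride=stride, padding=1)  # first 3x3 conv
    main = conv_out(main, kernel=3, stride=1, padding=1)       # second 3x3 conv
    # Skip path: 1x1 projection, or identity when stride is 1 (same size).
    skip = conv_out(size, kernel=1, stride=stride, padding=0)
    assert main == skip  # both paths must agree before the addition
    size = main
    sizes.append(size)

print(sizes)  # [9, 9, 5, 5, 3, 3]
```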

In [None]:
#################################################################################
# Below is our SqueezeNet non-causal baseline model
#################################################################################

class SqueezeNetSNR(nn.Module):
    """
    A lightweight SqueezeNet-based model for predicting Signal-to-Noise Ratio (SNR) from CT images and metadata.

    Args:
        input_dim (int): Dimension of metadata input (default=5), typically from agent_vector, voltage, and time.

    Architecture:
        - Uses a pretrained SqueezeNet (v1.1) backbone with modified first layer for grayscale images.
        - Extracted CNN features are combined with metadata.
        - Fused features are passed through fully connected layers to predict a single SNR value.

    Forward Inputs:
        - image (Tensor): Grayscale CT image tensor of shape (B, 1, H, W)
        - agent_vector (Tensor): One-hot or embedded contrast agent vector (B, *)
        - voltage (Tensor): Scalar voltage input (B, 1)
        - time (Tensor): Scalar acquisition time (B, 1)

    Returns:
        - snr_out (Tensor): Predicted SNR values for each image in the batch (B, 1)
    """
    
    def __init__(self, input_dim=5):  # input_dim = len(agent_vector + voltage + time)
        super(SqueezeNetSNR, self).__init__()

        weights = SqueezeNet1_1_Weights.DEFAULT  
        self.backbone = squeezenet1_1(weights=weights)

        # Modify the first conv layer to accept 1-channel (grayscale) input
        self.backbone.features[0] = nn.Conv2d(1, 64, kernel_size=3, stride=2)

        self.backbone.classifier = nn.Identity()

        self.global_avg_pool = nn.AdaptiveAvgPool2d((1, 1))

        # Combine CNN features with metadata (5 inputs)
        self.fc1 = nn.Linear(512 + input_dim, 256)
        self.fc2 = nn.Linear(256, 64)
        self.fc_out = nn.Linear(64, 1) 

    def forward(self, image, agent_vector, voltage, time):
        # Feature extraction
        x = self.backbone.features(image)            # (B, 512, H, W)
        x = self.global_avg_pool(x)                  # (B, 512, 1, 1)
        x = torch.flatten(x, 1)                      # (B, 512)

        # Metadata
        meta = torch.cat([voltage, time, agent_vector], dim=1)  # (B, 5)

        # Fully connected layers
        x = torch.cat([x, meta], dim=1)              # (B, 512 + 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        snr_out = self.fc_out(x)

        return snr_out


In [None]:
##############################################################
# Train one epoch
##############################################################
def train_one_epoch(model, dataloader, optimizer, criterion, device):
    """
    Trains the model for one epoch.

    Args:
        model (nn.Module): The neural network model.
        dataloader (DataLoader): DataLoader for training data.
        optimizer (torch.optim.Optimizer): Optimizer used for training.
        criterion (nn.Module): Loss function.
        device (torch.device): Device to run computations on (CPU or CUDA).

    Returns:
        float: Average training loss for the epoch.
    """
    model.train()
    running_loss = 0.0
    for images, agent_vector, voltage, time, snr_targets in dataloader:
        images = images.to(device)
        voltage = voltage.to(device)
        time = time.to(device)
        agent_vector = agent_vector.to(device)
        snr_targets = snr_targets.to(device)
        optimizer.zero_grad()
        snr_preds = model(images, agent_vector, voltage, time)
        loss = criterion(snr_preds.squeeze(), snr_targets.squeeze())
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)

    epoch_loss = running_loss / len(dataloader.dataset)
    return epoch_loss

############################################################################
# Evaluates the model on validation data for one epoch.
############################################################################
def validate_one_epoch(model, dataloader, criterion, device):
    """
    Evaluates the model on validation data for one epoch.

    Args:
        model (nn.Module): The neural network model.
        dataloader (DataLoader): DataLoader for validation data.
        criterion (nn.Module): Loss function.
        device (torch.device): Device to run computations on (CPU or CUDA).

    Returns:
        float: Average validation loss for the epoch.
    """
    model.eval()
    running_loss = 0.0
    with torch.no_grad():
        for images, agent_vector, voltage, time, snr_targets in dataloader:
            images = images.to(device)
            voltage = voltage.to(device)
            time = time.to(device)
            agent_vector = agent_vector.to(device)
            snr_targets = snr_targets.to(device)
            snr_preds = model(images, agent_vector, voltage, time)
            loss = criterion(snr_preds.squeeze(), snr_targets.squeeze())
            running_loss += loss.item() * images.size(0)

    epoch_loss = running_loss / len(dataloader.dataset)
    return epoch_loss

############################################################################################
# Trains the model over multiple epochs with early stopping and learning rate scheduling
############################################################################################
def train_model_new(model, train_loader, val_loader, device, epochs=100, lr=1e-4, patience=7):
    """
    Trains the model over multiple epochs with early stopping and learning rate scheduling.

    Args:
        model (nn.Module): The neural network model.
        train_loader (DataLoader): DataLoader for training data.
        val_loader (DataLoader): DataLoader for validation data.
        device (torch.device): Device to run computations on (CPU or CUDA).
        epochs (int): Maximum number of training epochs.
        lr (float): Initial learning rate.
        patience (int): Number of epochs to wait for improvement before early stopping.

    Saves:
        best_model.pth: The model weights with the lowest validation loss.

    Returns:
        None
    """
    model.to(device)
    optimizer = Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=5)

    best_val_loss = float('inf')
    epochs_no_improve = 0

    for epoch in range(1, epochs + 1):
        train_loss = train_one_epoch(model, train_loader, optimizer, criterion, device)
        val_loss = validate_one_epoch(model, val_loader, criterion, device)

        print(f"Epoch {epoch}/{epochs} — Train Loss: {train_loss:.4f}, Validation Loss: {val_loss:.4f}")
        scheduler.step(val_loss)
        print(f"Learning rate: {scheduler.optimizer.param_groups[0]['lr']:.6f}")

        if val_loss < best_val_loss:
            best_val_loss = val_loss
            epochs_no_improve = 0
            torch.save(model.state_dict(), "best_model.pth")
            print("Best model saved.")
        else:
            epochs_no_improve += 1
            print(f"No improvement for {epochs_no_improve} epoch(s).")

        if epochs_no_improve >= patience:
            print(f"Early stopping triggered after {epoch} epochs.")
            break
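
The early-stopping rule in `train_model_new` can be replayed on a list of validation losses without training anything. The sketch below does that in plain Python; the helper `early_stopping_trace` is hypothetical, and the loss values are the first nine ResNet validation losses from the training log further down (seed 42).

```python
def early_stopping_trace(val_losses, patience):
    """Replay the early-stopping rule on per-epoch validation losses.

    Returns (best_epoch, stop_epoch). stop_epoch is None if training would
    not have stopped within the given losses. Epochs are 1-indexed.
    """
    best_loss = float('inf')
    best_epoch = None
    epochs_no_improve = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_no_improve = 0
        else:
            epochs_no_improve += 1
        if epochs_no_improve >= patience:
            return best_epoch, epoch
    return best_epoch, None

losses = [71287.0727, 71054.2485, 70771.4437, 70483.0904, 69884.3221,
          70370.6703, 69693.2765, 63758.9907, 67582.0668]

print(early_stopping_trace(losses, patience=7))  # (8, None)
print(early_stopping_trace(losses, patience=1))  # (5, 6)
```

With the default `patience=7`, these nine epochs do not trigger a stop and the best checkpoint so far is epoch 8. A patience of 1 would have stopped at epoch 6 and kept the epoch-5 weights.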

In [None]:
#################################################################################
# Below is our DenseNet non-causal baseline model
#################################################################################

class DenseNetSNR(nn.Module):
    """
    DenseNet-based neural network for predicting SNR from grayscale CT images and metadata.

    Combines image features extracted using a pre-trained DenseNet121 (adapted for 1-channel input)
    with additional input metadata (contrast agent vector, voltage, and time).

    Args:
        input_dim (int): Dimensionality of metadata input (default is 5, combining voltage, time, and agent vector).

    Forward Inputs:
        image (Tensor): Grayscale CT image tensor of shape (B, 1, H, W).
        agent_vector (Tensor): Encoded contrast agent metadata of shape (B, n).
        voltage (Tensor): Voltage values of shape (B, 1).
        time (Tensor): Exposure time values of shape (B, 1).

    Returns:
        Tensor: Predicted SNR values of shape (B, 1).
    """
    
    def __init__(self, input_dim=5):  # input_dim = len(agent_vector + voltage + time)
        super(DenseNetSNR, self).__init__()

        # Load pre-trained DenseNet121
        weights = DenseNet121_Weights.DEFAULT
        self.backbone = densenet121(weights=weights)

        # Modify input layer if grayscale
        self.backbone.features.conv0 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)

        self.backbone.classifier = nn.Identity()
        self.global_pool = nn.AdaptiveAvgPool2d((1, 1))

        # Combine CNN features with metadata
        self.fc1 = nn.Linear(1024 + input_dim, 256)
        self.fc2 = nn.Linear(256, 64)
        self.fc_out = nn.Linear(64, 1) 

    def forward(self, image, agent_vector, voltage, time):
        x = self.backbone.features(image)         # (B, 1024, H, W)
        x = self.global_pool(x)                   # (B, 1024, 1, 1)
        x = torch.flatten(x, 1)                   # (B, 1024)

        # Metadata (voltage, time, agent_vector)
        meta = torch.cat([voltage, time, agent_vector], dim=1)  # (B, input_dim)
        x = torch.cat([x, meta], dim=1)                         # (B, 1029)

        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        snr_out = self.fc_out(x)

        return snr_out


In [None]:
##################################################################################################
# Trains an ensemble of deep learning models on CT image data for SNR prediction.
##################################################################################################

def train_deep_ensemble(num_models=5, base_seed=42, pretrained=True, model_name='default'):
    """
    Trains an ensemble of deep learning models on CT image data for SNR prediction.

    Each model in the ensemble is trained with a different random seed to encourage diversity.

    Args:
        num_models (int): Number of models to train in the ensemble. Default is 5.
        base_seed (int): Base seed for reproducibility. Each model will use (base_seed + i). Default is 42.
        pretrained (bool): Whether to use pre-trained weights for backbone networks. Currently not used. Default is True.
        model_name (str): Model architecture to use — 'resnet', 'squeezenet', or 'densenet'. Default is 'default'.

    Returns:
        list: A list of trained model instances forming the ensemble.
    """
    ensemble_models_list = []
    
    for i in range(num_models):
        seed = base_seed + i
        print(f"\n🔁 Training model {i+1}/{num_models} with seed {seed}")
        set_seed(seed)

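        if model_name == 'resnet':
            model = CTResNetModel()
            lr_value = 1e-4
        elif model_name == 'squeezenet':
            model = SqueezeNetSNR(input_dim=5)
            lr_value = 1e-3
        elif model_name == 'densenet':
            model = DenseNetSNR(input_dim=5)
            lr_value = 1e-3
        else:
            # Fail early: without this branch an unknown name (including the
            # default 'default') would leave `model` undefined.
            raise ValueError(f"Unknown model_name: {model_name!r}")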
            
        train_loader, val_loader = get_data_loaders(seed=seed, test_ind=False, model_name=model_name)
        train_model_new(model, train_loader, val_loader,device='cpu',lr=lr_value)
        ensemble_models_list.append(model)

    return ensemble_models_list

In [None]:
##############################################################
# Evaluates a trained model on a given dataloader
# computes regression metrics for SNR prediction.
##############################################################
def evaluate_model(model, dataloader, device):
    """
    Evaluates a trained model on a given dataloader and computes regression metrics for SNR prediction.

    Args:
        model (torch.nn.Module): Trained model to be evaluated.
        dataloader (DataLoader): DataLoader containing the test/validation dataset.
        device (torch.device or str): Device to run evaluation on (e.g., 'cpu' or 'cuda').

    Returns:
        dict: Dictionary containing Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R² score for SNR prediction.
    """
    model.eval()
    
    all_snr_true = []
    all_snr_pred = []

    with torch.no_grad():
        for batch in dataloader:
            images = batch[0].to(device)
            agent_vector = batch[1].to(device)
            voltage = batch[2].to(device)
            time = batch[3].to(device)
            y_true_snr = batch[4].to(device)

            y_pred_snr = model(images, agent_vector, voltage, time)

            all_snr_true.append(y_true_snr.cpu().numpy())
            all_snr_pred.append(y_pred_snr.cpu().numpy())

    # Convert lists to numpy arrays
    all_snr_true = np.concatenate(all_snr_true).flatten()
    all_snr_pred = np.concatenate(all_snr_pred).flatten()

    # Calculate metrics for SNR
    snr_mae = mean_absolute_error(all_snr_true, all_snr_pred)
    snr_rmse = np.sqrt(mean_squared_error(all_snr_true, all_snr_pred))
    snr_r2 = r2_score(all_snr_true, all_snr_pred)

    print(f"SNR -> MAE: {snr_mae:.4f}, RMSE: {snr_rmse:.4f}, R2: {snr_r2:.4f}")

    return {
        'snr_mae': snr_mae,
        'snr_rmse': snr_rmse,
        'snr_r2': snr_r2
    }
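
For reference, the three metrics reported by `evaluate_model` can be written out by hand. The sketch below is a plain-Python version that uses the usual definitions (R² = 1 - SS_res / SS_tot). The function `regression_metrics` and the target values are hypothetical. It also shows why R² can become strongly negative when predictions are far from the targets.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, math.sqrt(ss_res / n), r2

y_true = [100.0, 200.0, 300.0, 400.0]                # hypothetical SNR targets
perfect = regression_metrics(y_true, y_true)         # R^2 = 1
mean_only = regression_metrics(y_true, [250.0] * 4)  # R^2 = 0
far_off = regression_metrics(y_true, [5000.0] * 4)   # R^2 far below 0

for name, m in [('perfect', perfect), ('mean_only', mean_only), ('far_off', far_off)]:
    print(f"{name}: MAE={m[0]:.2f}, RMSE={m[1]:.2f}, R2={m[2]:.2f}")
```

Predicting the mean of the targets gives R² = 0, so any model with a negative R² is doing worse than that constant predictor.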


In [None]:
################################################################
# Evaluates an ensemble of trained models on a validation set
# estimates prediction uncertainty
################################################################

def evaluate_models(ensemble_models, seed, device='cpu', model_name='sample'):
    """
    Evaluates an ensemble of trained models on a validation set and estimates prediction uncertainty.

    Args:
        ensemble_models (list): List of trained PyTorch models.
        seed (int): Random seed used for reproducibility in data loading.
        device (str): Device to run the evaluation on ('cpu' or 'cuda').
        model_name (str): Architecture name forwarded to get_data_loaders so that
            validation images are resized the same way as during training.

    Returns:
        tuple:
            - preds_mean (np.ndarray): Mean SNR predictions across ensemble models.
            - preds_std (np.ndarray): Standard deviation of SNR predictions (uncertainty estimate).
            - targets (np.ndarray): Ground truth SNR values.
    """
    _, val_loader = get_data_loaders(seed, model_name=model_name)

    preds_mean, preds_std, targets = [], [], []

    for image, agent_vector, voltage, time, snr in val_loader:
        image = image.to(device)
        agent_vector = agent_vector.to(device)
        voltage = voltage.to(device)
        time = time.to(device)
        snr = snr.to(device)

        batch_preds = []
        with torch.no_grad():
            for model in ensemble_models:
                model.eval()
                model.to(device)
                output = model(image, agent_vector, voltage, time)
                batch_preds.append(output.cpu())

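        batch_preds = torch.stack(batch_preds)  # [num_models, B, 1]
        # squeeze(-1) drops only the trailing dimension, so a final batch of
        # size 1 still yields a 1-D array that extend() can iterate over.
        mean_pred = batch_preds.mean(dim=0).squeeze(-1).numpy()   # [B]
        std_pred = batch_preds.std(dim=0).squeeze(-1).numpy()     # [B]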

        preds_mean.extend(mean_pred)
        preds_std.extend(std_pred)
        targets.extend(snr.cpu().numpy())

    preds_mean = np.array(preds_mean)
    preds_std = np.array(preds_std)
    targets = np.array(targets)

    

    return preds_mean, preds_std, targets
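
The uncertainty estimate in `evaluate_models` is the per-sample spread of the ensemble's predictions. The sketch below mirrors that computation with the standard library. `statistics.stdev` is the sample standard deviation (divides by n - 1), which matches the default behavior of `torch.Tensor.std`. The helper `ensemble_summary` and the prediction values are hypothetical.

```python
import statistics

def ensemble_summary(per_model_preds):
    """per_model_preds[m][b] is model m's prediction for sample b."""
    per_sample = list(zip(*per_model_preds))          # regroup by sample
    means = [statistics.mean(p) for p in per_sample]
    stds = [statistics.stdev(p) for p in per_sample]  # sample std (n - 1)
    return means, stds

# Hypothetical predictions from three models on two samples.
preds = [[100.0, 250.0],
         [110.0, 250.0],
         [90.0, 250.0]]
means, stds = ensemble_summary(preds)
print(means, stds)
```

The first sample, on which the models disagree, gets a nonzero standard deviation. The second, on which they agree exactly, gets zero.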

In [None]:
#############################################
# Train the resnet model 
#############################################
resnet_models = train_deep_ensemble(num_models=5,model_name='resnet')


🔁 Training model 1/5 with seed 42
Epoch 1/100 — Train Loss: 62033.3712, Validation Loss: 71287.0727
Learning rate: 0.000100
Best model saved.
Epoch 2/100 — Train Loss: 61720.8282, Validation Loss: 71054.2485
Learning rate: 0.000100
Best model saved.
Epoch 3/100 — Train Loss: 61516.8896, Validation Loss: 70771.4437
Learning rate: 0.000100
Best model saved.
Epoch 4/100 — Train Loss: 61291.5965, Validation Loss: 70483.0904
Learning rate: 0.000100
Best model saved.
Epoch 5/100 — Train Loss: 61025.5429, Validation Loss: 69884.3221
Learning rate: 0.000100
Best model saved.
Epoch 6/100 — Train Loss: 60557.6296, Validation Loss: 70370.6703
Learning rate: 0.000100
No improvement for 1 epoch(s).
Epoch 7/100 — Train Loss: 59971.5350, Validation Loss: 69693.2765
Learning rate: 0.000100
Best model saved.
Epoch 8/100 — Train Loss: 59155.7133, Validation Loss: 63758.9907
Learning rate: 0.000100
Best model saved.
Epoch 9/100 — Train Loss: 58138.4520, Validation Loss: 67582.0668
Learning rate: 0.00010

In [None]:
################################################
# Results of each individual Resnet model
################################################

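# Note: every model is evaluated on the seed-42 validation split, although model i
# was trained on the split for seed 42 + i (see train_deep_ensemble). Models 2-5
# may therefore have seen some of these validation images during training.
for i in range(5):
    seed = 42
    _, val_loader = get_data_loaders(seed=seed, test_ind=False, model_name='resnet')
    evaluate_model(model=resnet_models[i], dataloader=val_loader, device='cpu')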

SNR -> MAE: 115.5818, RMSE: 180.9482, R2: 0.4706
SNR -> MAE: 136.5066, RMSE: 219.0683, R2: 0.2241
SNR -> MAE: 79.3188, RMSE: 117.2559, R2: 0.7777
SNR -> MAE: 137.8658, RMSE: 221.3034, R2: 0.2081
SNR -> MAE: 136.0369, RMSE: 219.1944, R2: 0.2232
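
Because each ensemble member is trained on a split drawn with its own seed, a validation set drawn with a different seed can overlap that member's training data. The sketch below illustrates the effect with a stand-in split based on Python's `random` module. It does not reproduce scikit-learn's `train_test_split` permutation, and the dataset size and the helper `split_indices` are assumptions for illustration only.

```python
import random

def split_indices(n, seed, val_fraction=0.2):
    """Stand-in for a seeded random train/validation split (not sklearn's algorithm)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_val = int(n * val_fraction)
    return set(indices[n_val:]), set(indices[:n_val])  # (train, val)

n = 1000  # hypothetical dataset size
train_42, val_42 = split_indices(n, seed=42)
train_43, _ = split_indices(n, seed=43)

# Within one seed the two sets are disjoint; across seeds they are not.
leaked = val_42 & train_43
print(f"{len(leaked)} of {len(val_42)} seed-42 validation samples "
      f"are in the seed-43 training set")
```

Evaluating model `i` on the validation split drawn with its own seed (`42 + i`) avoids this overlap.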


In [None]:
#############################################
# Train the Squeezenet model 
#############################################
squeezenet_models = train_deep_ensemble(num_models=5, model_name='squeezenet')


🔁 Training model 1/5 with seed 42
Epoch 1/100 — Train Loss: 55654.7663, Validation Loss: 68966.9488
Learning rate: 0.001000
Best model saved.
Epoch 2/100 — Train Loss: 55569.8090, Validation Loss: 61135.3118
Learning rate: 0.001000
Best model saved.
Epoch 3/100 — Train Loss: 55044.1738, Validation Loss: 62999.9551
Learning rate: 0.001000
No improvement for 1 epoch(s).
Epoch 4/100 — Train Loss: 55234.9219, Validation Loss: 60939.9731
Learning rate: 0.001000
Best model saved.
Epoch 5/100 — Train Loss: 54493.9528, Validation Loss: 61625.4676
Learning rate: 0.001000
No improvement for 1 epoch(s).
Epoch 6/100 — Train Loss: 54254.0632, Validation Loss: 59931.9180
Learning rate: 0.001000
Best model saved.
Epoch 7/100 — Train Loss: 53940.9768, Validation Loss: 59566.0296
Learning rate: 0.001000
Best model saved.
Epoch 8/100 — Train Loss: 53621.3525, Validation Loss: 60182.3508
Learning rate: 0.001000
No improvement for 1 epoch(s).
Epoch 9/100 — Train Loss: 53400.3672, Validation Loss: 59523.2

In [None]:
################################################
# Results of each individual Squeezenet model
################################################

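# Note: every model is evaluated on the seed-42 validation split, although model i
# was trained on the split for seed 42 + i (see train_deep_ensemble). Models 2-5
# may therefore have seen some of these validation images during training.
for i in range(5):
    seed = 42
    _, val_loader = get_data_loaders(seed=seed, test_ind=False, model_name='squeezenet')
    evaluate_model(model=squeezenet_models[i], dataloader=val_loader, device='cpu')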

SNR -> MAE: 100.1274, RMSE: 145.5695, R2: 0.6574
SNR -> MAE: 100.1073, RMSE: 147.7848, R2: 0.6469
SNR -> MAE: 101.5361, RMSE: 148.8427, R2: 0.6418
SNR -> MAE: 101.3464, RMSE: 146.9787, R2: 0.6507
SNR -> MAE: 94.2543, RMSE: 139.2030, R2: 0.6867


In [None]:
#############################################
# Train the densenet model 
#############################################
densenet_models = train_deep_ensemble(num_models=5, model_name='densenet')


🔁 Training model 1/5 with seed 42
Epoch 1/100 — Train Loss: 54852.8147, Validation Loss: 61337.8161
Learning rate: 0.001000
Best model saved.
Epoch 2/100 — Train Loss: 53326.5396, Validation Loss: 59946.9681
Learning rate: 0.001000
Best model saved.
Epoch 3/100 — Train Loss: 53266.7510, Validation Loss: 164628.3750
Learning rate: 0.001000
No improvement for 1 epoch(s).
Epoch 4/100 — Train Loss: 52578.3738, Validation Loss: 57924.0334
Learning rate: 0.001000
Best model saved.
Epoch 5/100 — Train Loss: 51719.4950, Validation Loss: 56702.6766
Learning rate: 0.001000
Best model saved.
Epoch 6/100 — Train Loss: 49994.5407, Validation Loss: 76025.3449
Learning rate: 0.001000
No improvement for 1 epoch(s).
Epoch 7/100 — Train Loss: 44353.6264, Validation Loss: 51655.7658
Learning rate: 0.001000
Best model saved.
Epoch 8/100 — Train Loss: 31616.4647, Validation Loss: 27470.4952
Learning rate: 0.001000
Best model saved.
Epoch 9/100 — Train Loss: 23355.9703, Validation Loss: 84142.3860
Learning

In [None]:
################################################
# Results of each individual Densenet model
################################################

for i in range(5):
    seed = 42+i
    _, val_loader = get_data_loaders(seed=seed, test_ind=False, model_name='densenet')
    evaluate_model(model=densenet_models[i], dataloader=val_loader, device='cpu')

SNR -> MAE: 101.9847, RMSE: 144.8510, R2: 0.6608
SNR -> MAE: 184.5575, RMSE: 234.6132, R2: 0.0330
SNR -> MAE: 104.3289, RMSE: 152.7661, R2: 0.6503
SNR -> MAE: 133.2465, RMSE: 176.5018, R2: 0.3771
SNR -> MAE: 123571.7500, RMSE: 132504.3906, R2: -323457.9150


Of the baseline models trained above, only ResNet and SqueezeNet are used for comparison, because the DenseNet results are unstable: across its five runs, R² ranges from 0.6608 down to -323457.9150.

| Architecture     | Epochs | MAE      | RMSE     | R²     |
|------------------|--------|----------|----------|--------|
| CAPRI-CT (ours)  | 54     | 68.0280  | 106.4930 | 0.7990 |
| CNN              | 26     | 94.7015  | 141.2704 | 0.6773 |
| ResNet           | 93     | 89.3858  | 129.1232 | 0.7502 |
| SqueezeNet       | 93     | 94.2543  | 139.2030 | 0.6867 |

We compared the proposed CAPRI-CT framework against several baseline models: a CNN baseline, ResNet, and SqueezeNet. Among the baselines, SqueezeNet demonstrated the most stable performance across all five seeded runs. Its MAE values ranged narrowly from 91.45 to 102.93 and its RMSE values from 140.08 to 148.71. The R² scores also remained consistent, varying between 0.608 and 0.669.
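
One way to read the comparison table is as a relative error reduction of CAPRI-CT over each baseline. The sketch below computes that from the MAE and RMSE values in the table. The helper `relative_reduction` is introduced here for illustration, and the derived percentages are simple arithmetic on the reported numbers rather than additional results.

```python
# MAE and RMSE values copied from the comparison table above.
results = {
    'CAPRI-CT':   {'mae': 68.0280, 'rmse': 106.4930},
    'CNN':        {'mae': 94.7015, 'rmse': 141.2704},
    'ResNet':     {'mae': 89.3858, 'rmse': 129.1232},
    'SqueezeNet': {'mae': 94.2543, 'rmse': 139.2030},
}

def relative_reduction(baseline, ours):
    """Fractional error reduction of `ours` relative to `baseline`."""
    return (baseline - ours) / baseline

ours = results['CAPRI-CT']
reductions = {
    name: {m: relative_reduction(vals[m], ours[m]) for m in ('mae', 'rmse')}
    for name, vals in results.items() if name != 'CAPRI-CT'
}
for name, r in reductions.items():
    print(f"{name}: MAE reduced by {r['mae']:.1%}, RMSE reduced by {r['rmse']:.1%}")
```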