# 2025 Kaggle Competition of AI Applied to Medicine at UC3M





Welcome to the **2025 Kaggle competition of AI applied to medicine at UC3M**. This project is set up as an **internal Kaggle competition** in which all students will participate. Our real-world challenge for this course will revolve around the **ISIC 2024** dataset, a large collection of skin images used for research in dermatology.


This notebook extends the course starter into an **EfficientNet-B3** training pipeline. Below we:
1. **Define** a dataset class that loads images from HDF5 files.
2. **Load** the training metadata and build augmented, class-balanced data loaders.
3. **Fine-tune** a pretrained **EfficientNet-B3** model with focal loss, using stratified 5-fold cross-validation.
4. **Evaluate** each fold by validation AUC and keep the best checkpoint.
5. **Generate** a submission CSV by averaging the fold models' predictions on the test set.

> **Note**: The settings used here (5 epochs per fold, 6,000 sampled images per epoch, batch size 16) are starting points rather than tuned values. Feel free to modify them (e.g., add custom layers, change the augmentations, train for longer, etc.).

## ISIC 2024 Competition Overview

The **International Skin Imaging Collaboration (ISIC)** has launched this competition to advance automated skin cancer detection by:
- **Improving accuracy** in distinguishing malignant from benign lesions  
- **Enhancing efficiency** in clinical workflows  
- **Developing algorithms** that prioritize high-risk lesions  
- **Reducing mortality rates** by enabling earlier detection  

### Primary Task
You need to **classify skin lesions** as **benign** or **malignant**. For each lesion image (identified by `isic_id`), predict a **probability** in the range [0, 1] indicating the chance that the lesion is malignant.
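
A classifier usually produces an unbounded score (a logit), which is then squashed into such a probability with the logistic sigmoid. The sketch below is only illustrative; the helper name `logit_to_probability` is our own and is not part of the competition code.

```python
import math

def logit_to_probability(logit: float) -> float:
    """Map an unbounded model score to a probability in [0, 1] (logistic sigmoid)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)

for score in (-4.0, 0.0, 4.0):
    print(f"logit={score:+.1f} -> P(malignant)={logit_to_probability(score):.3f}")
```

A logit of 0 corresponds to a probability of exactly 0.5; large negative and positive logits approach 0 and 1 respectively.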

### High-Level Data Summary
- The dataset is called **SLICE-3D**, containing **skin lesion images** (JPEG files) cropped from 3D Total Body Photography (TBP).  
- Each image has metadata in a corresponding `.csv` file, including:  
  - **Binary diagnostic label** (`target` = 0 or 1)  
  - **Patient data** (e.g., `age_approx`, `sex`, `anatom_site_general`)  
  - **Additional attributes** (image source, diagnosis type)

![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4972760%2F349a3ae1149d15dc5642063a2d742c88%2Fimage%20type_noexif_240425.jpg?generation=1714060307710359&alt=media)

This challenge dataset mimics **non-dermoscopic images** using standardized 15x15 mm “tiles” of lesions from a 3D TBP system. Thousands of patients from multiple continents are represented, creating a broad, diverse dataset.

### Task Description & Clinical Context
- **Why it matters**: Skin cancer can be deadly if not detected early. Many people lack access to dermatologic care, so accurate AI systems for image-based triage can improve outcomes.  
- **Key goal**: Develop a binary classifier that identifies malignant lesions from a set of smartphone-quality images.  
- **Impact**: This technology could help prioritize suspicious lesions (top K) for clinical review, especially in low-resource settings, potentially **saving lives** through earlier detection.
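
The top-K triage idea can be sketched as a simple ranking by predicted probability. The IDs, scores, and the helper `top_k_lesions` below are hypothetical and serve only to illustrate the ranking step.

```python
import numpy as np

def top_k_lesions(isic_ids, probabilities, k):
    """Return the k (id, probability) pairs with the highest predicted risk."""
    probabilities = np.asarray(probabilities, dtype=float)
    # argsort is ascending, so sort the negated scores; a stable sort keeps input order on ties.
    order = np.argsort(-probabilities, kind="stable")[:k]
    return [(isic_ids[i], float(probabilities[i])) for i in order]

# Hypothetical IDs and scores, for illustration only.
ids = ["ISIC_A", "ISIC_B", "ISIC_C", "ISIC_D"]
probs = [0.02, 0.91, 0.15, 0.47]
print(top_k_lesions(ids, probs, k=2))
```

Because only the ordering matters for this use, a ranking metric such as AUC is a natural way to evaluate the model.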

### Importance of 3D TBP
The **3D Total Body Photography (TBP)** approach captures the entire skin surface in macro resolution. Each lesion on the patient’s body is automatically cropped as a 15x15 mm image tile. These images more closely resemble photos taken by a regular smartphone camera, as opposed to specialized dermoscopy devices.

![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4972760%2F169b1f691322233e7b31aabaf6716ff3%2Fex-tiles.png?generation=1717700538524806&alt=media)

### Clinical Background
1. **Major skin cancer types**: Basal Cell Carcinoma (BCC), Squamous Cell Carcinoma (SCC), and Melanoma (most lethal).  
2. **Early detection** is crucial: Minor surgery can cure many skin cancers if caught in time.  
3. **Telemedicine implications**: With the rise in remote healthcare, patients often submit low-quality images captured at home. Robust AI models are needed to handle this variability.

### Summary
- You will build a model to **classify skin lesions** (benign vs. malignant) with probabilities.  
- The dataset includes **every lesion** from thousands of patients, reflecting real-world diversity.  
- **3D TBP** and the “ugly duckling sign” (a lesion that looks unlike the same patient’s other lesions deserves closer attention) illustrate the importance of comparing each lesion against the patient’s total lesion landscape.  
- **Your work** can help improve early detection, prioritizing high-risk cases for clinical evaluation and potentially saving lives.
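
One possible way to encode the "compare each lesion against the patient's other lesions" idea is a per-patient z-score of a metadata feature. This is a sketch on made-up numbers, not part of the notebook's pipeline; the helper `patient_zscore` and the toy values are assumptions, while the column names come from the metadata file.

```python
import pandas as pd

# Hypothetical toy data: two patients, one metadata feature per lesion.
lesions = pd.DataFrame({
    "patient_id": ["P1", "P1", "P1", "P1", "P2", "P2", "P2"],
    "isic_id": ["a", "b", "c", "d", "e", "f", "g"],
    "tbp_lv_deltaLBnorm": [5.0, 5.5, 4.5, 12.0, 8.0, 8.5, 7.5],
})

def patient_zscore(df: pd.DataFrame, column: str) -> pd.Series:
    """Z-score a feature within each patient; 0 where the spread is zero or undefined."""
    grouped = df.groupby("patient_id")[column]
    centered = df[column] - grouped.transform("mean")
    # Sample standard deviation per patient; zero spread becomes NaN to avoid dividing by 0.
    spread = grouped.transform("std").replace(0.0, float("nan"))
    return (centered / spread).fillna(0.0)

lesions["deltaLBnorm_z"] = patient_zscore(lesions, "tbp_lv_deltaLBnorm")
print(lesions.sort_values("deltaLBnorm_z", ascending=False).head(3))
```

Lesion `d` stands out within patient `P1` even though its raw value alone says nothing about how typical it is for that patient.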


In [1]:
METADATA_COL2DESC = {
    "isic_id": "Unique identifier for each image case.",
    "target": "Binary class label (0 = benign, 1 = malignant).",
    "patient_id": "Unique identifier for each patient.",
    "age_approx": "Approximate age of the patient at time of imaging.",
    "sex": "Sex of the patient (male or female).",
    "anatom_site_general": "General location of the lesion on the patient's body.",
    "clin_size_long_diam_mm": "Maximum diameter of the lesion (mm).",
    "image_type": "Type of image captured, as defined in the ISIC Archive.",
    "tbp_tile_type": "Lighting modality of the 3D Total Body Photography (TBP) source image.",
    "tbp_lv_A": "Color channel A (green-red axis in LAB space) inside the lesion.",
    "tbp_lv_Aext": "Color channel A outside the lesion.",
    "tbp_lv_B": "Color channel B (blue-yellow axis in LAB space) inside the lesion.",
    "tbp_lv_Bext": "Color channel B outside the lesion.",
    "tbp_lv_C": "Chroma value inside the lesion.",
    "tbp_lv_Cext": "Chroma value outside the lesion.",
    "tbp_lv_H": "Hue value inside the lesion (LAB color space).",
    "tbp_lv_Hext": "Hue value outside the lesion.",
    "tbp_lv_L": "Luminance inside the lesion (LAB color space).",
    "tbp_lv_Lext": "Luminance outside the lesion.",
    "tbp_lv_areaMM2": "Area of the lesion in mm².",
    "tbp_lv_area_perim_ratio": "Ratio of the lesion's perimeter to its area (border jaggedness).",
    "tbp_lv_color_std_mean": "Mean color irregularity within the lesion.",
    "tbp_lv_deltaA": "Average contrast in color channel A between inside and outside.",
    "tbp_lv_deltaB": "Average contrast in color channel B between inside and outside.",
    "tbp_lv_deltaL": "Average contrast in luminance between inside and outside.",
    "tbp_lv_deltaLB": "Combined contrast between the lesion and surrounding skin.",
    "tbp_lv_deltaLBnorm": "Normalized contrast (LAB color space).",
    "tbp_lv_eccentricity": "Eccentricity of the lesion (how elongated it is).",
    "tbp_lv_location": "Detailed anatomical location (e.g., Upper Arm).",
    "tbp_lv_location_simple": "Simplified anatomical location (e.g., Arm).",
    "tbp_lv_minorAxisMM": "Smallest diameter of the lesion in mm.",
    "tbp_lv_nevi_confidence": "Confidence score (0-100) for the lesion being a nevus.",
    "tbp_lv_norm_border": "Normalized border irregularity (0-10 scale).",
    "tbp_lv_norm_color": "Normalized color variation (0-10 scale).",
    "tbp_lv_perimeterMM": "Perimeter of the lesion in mm.",
    "tbp_lv_radial_color_std_max": "Color asymmetry within the lesion, measured radially.",
    "tbp_lv_stdL": "Std. deviation of luminance inside the lesion.",
    "tbp_lv_stdLExt": "Std. deviation of luminance outside the lesion.",
    "tbp_lv_symm_2axis": "Asymmetry about a second axis of symmetry.",
    "tbp_lv_symm_2axis_angle": "Angle of that second axis of symmetry.",
    "tbp_lv_x": "X-coordinate in the 3D TBP model.",
    "tbp_lv_y": "Y-coordinate in the 3D TBP model.",
    "tbp_lv_z": "Z-coordinate in the 3D TBP model.",
    "attribution": "Image source or institution.",
    "copyright_license": "License information.",
    "lesion_id": "Unique ID for lesions of interest.",
    "iddx_full": "Full diagnosis classification.",
    "iddx_1": "First-level (broad) diagnosis.",
    "iddx_2": "Second-level diagnosis.",
    "iddx_3": "Third-level diagnosis.",
    "iddx_4": "Fourth-level diagnosis.",
    "iddx_5": "Fifth-level diagnosis.",
    "mel_mitotic_index": "Mitotic index of invasive malignant melanomas.",
    "mel_thick_mm": "Thickness of melanoma invasion in mm.",
    "tbp_lv_dnn_lesion_confidence": "Lesion confidence score (0-100) from a DNN classifier."
}

METADATA_COL2NAME = {
    "isic_id": "Unique Case Identifier",
    "target": "Binary Lesion Classification",
    "patient_id": "Unique Patient Identifier",
    "age_approx": "Approximate Age",
    "sex": "Sex",
    "anatom_site_general": "General Anatomical Location",
    "clin_size_long_diam_mm": "Clinical Size (Longest Diameter in mm)",
    "image_type": "Image Type",
    "tbp_tile_type": "TBP Tile Type",
    "tbp_lv_A": "Color Channel A (Inside)",
    "tbp_lv_Aext": "Color Channel A (Outside)",
    "tbp_lv_B": "Color Channel B (Inside)",
    "tbp_lv_Bext": "Color Channel B (Outside)",
    "tbp_lv_C": "Chroma (Inside)",
    "tbp_lv_Cext": "Chroma (Outside)",
    "tbp_lv_H": "Hue (Inside)",
    "tbp_lv_Hext": "Hue (Outside)",
    "tbp_lv_L": "Luminance (Inside)",
    "tbp_lv_Lext": "Luminance (Outside)",
    "tbp_lv_areaMM2": "Lesion Area (mm²)",
    "tbp_lv_area_perim_ratio": "Area-to-Perimeter Ratio",
    "tbp_lv_color_std_mean": "Mean Color Irregularity",
    "tbp_lv_deltaA": "Delta A",
    "tbp_lv_deltaB": "Delta B",
    "tbp_lv_deltaL": "Delta L",
    "tbp_lv_deltaLB": "Delta LB",
    "tbp_lv_deltaLBnorm": "Normalized Delta LB",
    "tbp_lv_eccentricity": "Eccentricity",
    "tbp_lv_location": "Detailed Location",
    "tbp_lv_location_simple": "Simplified Location",
    "tbp_lv_minorAxisMM": "Smallest Diameter (mm)",
    "tbp_lv_nevi_confidence": "Nevus Confidence Score",
    "tbp_lv_norm_border": "Normalized Border Irregularity",
    "tbp_lv_norm_color": "Normalized Color Variation",
    "tbp_lv_perimeterMM": "Lesion Perimeter (mm)",
    "tbp_lv_radial_color_std_max": "Radial Color Deviation",
    "tbp_lv_stdL": "Std. Dev. Luminance (Inside)",
    "tbp_lv_stdLExt": "Std. Dev. Luminance (Outside)",
    "tbp_lv_symm_2axis": "Symmetry (Second Axis)",
    "tbp_lv_symm_2axis_angle": "Symmetry Angle (Second Axis)",
    "tbp_lv_x": "X-Coordinate",
    "tbp_lv_y": "Y-Coordinate",
    "tbp_lv_z": "Z-Coordinate",
    "attribution": "Image Source",
    "copyright_license": "Copyright",
    "lesion_id": "Unique Lesion ID",
    "iddx_full": "Full Diagnosis",
    "iddx_1": "Diagnosis Level 1",
    "iddx_2": "Diagnosis Level 2",
    "iddx_3": "Diagnosis Level 3",
    "iddx_4": "Diagnosis Level 4",
    "iddx_5": "Diagnosis Level 5",
    "mel_mitotic_index": "Mitotic Index (Melanoma)",
    "mel_thick_mm": "Melanoma Thickness (mm)",
    "tbp_lv_dnn_lesion_confidence": "Lesion DNN Confidence"

}
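
A name mapping like this is handy for producing readable tables and plot labels. The snippet below is a small sketch with a hypothetical three-entry subset of the mapping and toy rows, so that it runs on its own.

```python
import pandas as pd

# Small hypothetical subset of the mapping, repeated here so the snippet is self-contained.
COL2NAME_SUBSET = {
    "age_approx": "Approximate Age",
    "sex": "Sex",
    "tbp_lv_areaMM2": "Lesion Area (mm²)",
}

# Toy rows standing in for the real metadata file.
toy_df = pd.DataFrame({
    "age_approx": [55.0, 70.0],
    "sex": ["female", "male"],
    "tbp_lv_areaMM2": [3.2, 9.8],
    "unmapped_column": [1, 2],
})

# rename() leaves columns without a mapping entry untouched.
display_df = toy_df.rename(columns=COL2NAME_SUBSET)
print(display_df.columns.tolist())
```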

In [2]:
import os
import h5py
import cv2
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, WeightedRandomSampler
import torchvision.transforms as T
from torchvision import models
from sklearn.metrics import roc_auc_score
from tqdm.auto import tqdm
import albumentations as A
from albumentations.pytorch import ToTensorV2
import matplotlib.pyplot as plt

# 1. Device Setup
# ---------------------------
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

# 2. Dataset Class
# ---------------------------
class ISIC_HDF5_Dataset(Dataset):
    def __init__(self, df: pd.DataFrame, hdf5_path: str, transform=None, is_labelled: bool = True):
        self.df = df.reset_index(drop=True)
        self.hdf5_path = hdf5_path
        self.transform = transform
        self.is_labelled = is_labelled
        self.hdf5_file = None  # Opened lazily in __getitem__ so each DataLoader worker gets its own handle

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        if self.hdf5_file is None:
            self.hdf5_file = h5py.File(self.hdf5_path, 'r')  # Open once per worker

        row = self.df.iloc[idx]
        isic_id = row["isic_id"]
        encoded_bytes = self.hdf5_file[isic_id][()]
        image_bgr = cv2.imdecode(encoded_bytes, cv2.IMREAD_COLOR)
        image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

        if self.transform:
            augmented = self.transform(image=image_rgb)
            image = augmented["image"]
        else:
            image = torch.from_numpy(image_rgb).permute(2, 0, 1).float()

        if self.is_labelled:
            label = torch.tensor(row["target"]).float()
            return image, label, isic_id
        else:
            return image, isic_id


    def _load_image_from_hdf5(self, isic_id: str):
        with h5py.File(self.hdf5_path, 'r') as hf:
            encoded_bytes = hf[isic_id][()]
        image_bgr = cv2.imdecode(encoded_bytes, cv2.IMREAD_COLOR)
        image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
        return image_rgb

# 3. Load CSVs and Partition Dataset
# ---------------------------
from google.colab import drive
drive.mount('/content/drive')
path = '/content/drive/My Drive/KaggleChallenge'

TRAIN_CSV = path+"/new-train-metadata.csv"
TEST_CSV  = path+"/students-test-metadata.csv"
TRAIN_HDF5 = path+"/train-image.hdf5"
TEST_HDF5  = path+"/test-image.hdf5"

train_df = pd.read_csv(TRAIN_CSV)
test_df = pd.read_csv(TEST_CSV)

from sklearn.model_selection import train_test_split
train_df_sub, valid_df_sub = train_test_split(train_df, test_size=0.2, stratify=train_df['target'], random_state=42)

# 4. Data Augmentation with Albumentations
# ---------------------------
train_transform = A.Compose([
    A.Resize(224,224),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.ShiftScaleRotate(shift_limit=0.1, scale_limit=0.2, rotate_limit=30, p=0.8),  # More aggressive rotate/scale
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.7),
    A.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=0.5),
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ToTensorV2()
])

valid_transform = A.Compose([
    A.Resize(224, 224),
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    ToTensorV2()
])




Using device: cuda
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
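
Both transform pipelines above end with `A.Normalize` using the ImageNet channel means and standard deviations. As we understand the default behaviour (pixel values divided by 255, then standardized per channel), the arithmetic can be sketched in NumPy as follows. The helper `normalize_rgb` is our own name, and the gray test image is a stand-in for a lesion tile.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_rgb(image_uint8: np.ndarray) -> np.ndarray:
    """Scale an HxWx3 uint8 image to [0, 1], then standardize each channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    # Broadcasting applies the per-channel mean/std along the last axis.
    return (scaled - IMAGENET_MEAN) / IMAGENET_STD

# A flat mid-gray image (value 128) as a stand-in for a lesion tile.
gray = np.full((224, 224, 3), 128, dtype=np.uint8)
normalized = normalize_rgb(gray)
print(normalized.shape, normalized[0, 0])
```

Using the same statistics as the pretrained backbone keeps the input distribution close to what the network saw during ImageNet pretraining.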




In [None]:
# 5. Dataset Instantiation
# ---------------------------
train_dataset = ISIC_HDF5_Dataset(train_df_sub, TRAIN_HDF5, transform=train_transform, is_labelled=True)
valid_dataset = ISIC_HDF5_Dataset(valid_df_sub, TRAIN_HDF5, transform=valid_transform, is_labelled=True)
test_dataset  = ISIC_HDF5_Dataset(test_df, TEST_HDF5, transform=valid_transform, is_labelled=False)

# 6. Weighted Sampler to Balance Classes
# ---------------------------
class_counts = train_df_sub['target'].value_counts()
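# Inverse-frequency weight per sample; converted to NumPy so the sampler does not depend on the DataFrame index
weights = train_df_sub['target'].apply(lambda x: 1.0 / class_counts[x]).to_numpy()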
sampler = WeightedRandomSampler(weights=weights, num_samples=6000, replacement=True)

# 7. DataLoaders
# ---------------------------
BATCH_SIZE = 16
train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, sampler=sampler, num_workers=2)
valid_loader = DataLoader(valid_dataset, batch_size=BATCH_SIZE, shuffle=False, num_workers=2)
test_loader  = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False, num_workers=2)
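
To see why inverse-frequency weights balance the classes, here is a NumPy-only sketch on hypothetical labels (990 benign, 10 malignant). It mimics weighted sampling with replacement using `numpy.random.Generator.choice` rather than the PyTorch sampler itself.

```python
import numpy as np

# Hypothetical, heavily imbalanced labels: 990 benign, 10 malignant.
targets = np.array([0] * 990 + [1] * 10)

class_counts = np.bincount(targets)        # counts per class: 990 benign, 10 malignant
weights = 1.0 / class_counts[targets]      # one weight per sample

# Each class ends up with the same total weight, so draws are balanced in expectation.
probabilities = weights / weights.sum()
rng = np.random.default_rng(0)
drawn = rng.choice(len(targets), size=6000, replace=True, p=probabilities)
print("malignant fraction in sample:", targets[drawn].mean())
```

With so few positives, each malignant image is drawn many times per epoch, which is one reason the training pipeline relies on fairly strong augmentation.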


In [3]:
class FocalLoss(nn.Module):
    def __init__(self, gamma=2):
        super().__init__()
        self.gamma = gamma
        self.bce = nn.BCEWithLogitsLoss(reduction='none')  # important: no reduction here!

    def forward(self, inputs, targets):
        bce_loss = self.bce(inputs, targets)
        probs = torch.sigmoid(inputs)
        p_t = probs * targets + (1 - probs) * (1 - targets)  # p_t: prob of true class
        loss = (1 - p_t) ** self.gamma * bce_loss
        return loss.mean()
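
The same formula can be checked by hand in NumPy. The sketch below (function name `focal_loss_numpy` is ours) returns per-sample values and shows that `gamma=0` recovers plain binary cross-entropy, while `gamma=2` shrinks the loss of well-classified samples the most.

```python
import numpy as np

def focal_loss_numpy(logits, targets, gamma=2.0):
    """Per-sample binary focal loss: (1 - p_t)^gamma * BCE, without the mean reduction."""
    logits = np.asarray(logits, dtype=float)
    targets = np.asarray(targets, dtype=float)
    probs = 1.0 / (1.0 + np.exp(-logits))
    p_t = probs * targets + (1.0 - probs) * (1.0 - targets)  # probability of the true class
    bce = -np.log(p_t)  # binary cross-entropy equals -log(p_t)
    return (1.0 - p_t) ** gamma * bce

logits = np.array([4.0, 0.0, -4.0])   # confident, unsure, confidently wrong
targets = np.array([1.0, 1.0, 1.0])   # all three samples are malignant
print("BCE  :", focal_loss_numpy(logits, targets, gamma=0.0))
print("focal:", focal_loss_numpy(logits, targets, gamma=2.0))
```

This naive version is for illustration only; for extreme logits the `BCEWithLogitsLoss`-based implementation above is the numerically safer choice.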

In [5]:
# 8. Model: EfficientNet + Dropout
# ---------------------------
!pip install efficientnet_pytorch
from efficientnet_pytorch import EfficientNet
model = EfficientNet.from_pretrained("efficientnet-b3")
model._fc = nn.Sequential(
    nn.Dropout(p=0.4),
    nn.Linear(model._fc.in_features, 1)
)
model = model.to(device)

# 9. Optimizer, Loss, Scheduler
# ---------------------------
criterion = FocalLoss(gamma=2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', patience=2, factor=0.5)


Loaded pretrained weights for efficientnet-b3


In [None]:
# 10. Stratified K-Fold Training
# ---------------------------
from sklearn.model_selection import StratifiedKFold

N_FOLDS = 5
skf = StratifiedKFold(n_splits=N_FOLDS, shuffle=True, random_state=42)

EPOCHS = 5
PATIENCE = 3
BATCH_SIZE = 16

oof_preds = np.zeros(len(train_df))
oof_targets = train_df["target"].values

for fold, (train_idx, val_idx) in enumerate(skf.split(train_df, train_df["target"])):
    print(f"\n--- Fold {fold+1} ---")

    train_df_sub = train_df.iloc[train_idx].reset_index(drop=True)
    valid_df_sub = train_df.iloc[val_idx].reset_index(drop=True)

    # Datasets and Dataloaders
    train_dataset = ISIC_HDF5_Dataset(train_df_sub, TRAIN_HDF5, transform=train_transform)
    valid_dataset = ISIC_HDF5_Dataset(valid_df_sub, TRAIN_HDF5, transform=valid_transform)

    class_counts = train_df_sub['target'].value_counts()
    weights = train_df_sub['target'].apply(lambda x: 1.0 / class_counts[x])
    sampler = WeightedRandomSampler(weights=weights, num_samples=6000, replacement=True)

    train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, sampler=sampler, num_workers=2)
    valid_loader = DataLoader(valid_dataset, batch_size=BATCH_SIZE, shuffle=False, num_workers=2)

    # Model, loss, optimizer
    model = EfficientNet.from_pretrained("efficientnet-b3")
    model._fc = nn.Sequential(nn.Dropout(0.4), nn.Linear(model._fc.in_features, 1))
    model = model.to(device)

    criterion = FocalLoss(gamma=2)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="max", patience=2, factor=0.5)

    best_auc = 0
    patience_counter = 0

    for epoch in range(1, EPOCHS+1):
        model.train()
        train_losses = []

        for images, labels, _ in tqdm(train_loader, desc=f"Epoch {epoch}"):
            images, labels = images.to(device), labels.to(device)

            optimizer.zero_grad()
            logits = model(images).view(-1)
            loss = criterion(logits, labels)
            loss.backward()
            optimizer.step()

            train_losses.append(loss.item())

        model.eval()
        val_probs, val_labels = [], []

        with torch.no_grad():
            for images, labels, _ in valid_loader:
                images = images.to(device)
                logits = model(images).view(-1)
                # Store sigmoid probabilities rather than raw logits
                val_probs.extend(torch.sigmoid(logits).cpu().numpy())
                val_labels.extend(labels.numpy())

        val_auc = roc_auc_score(val_labels, val_probs)
        scheduler.step(val_auc)

        print(f"Epoch {epoch}: Train Loss={np.mean(train_losses):.4f}, Val AUC={val_auc:.4f}")

        if val_auc > best_auc:
            best_auc = val_auc
            torch.save(model.state_dict(), f"best_model_fold{fold}.pt")
            # Keep the OOF predictions of the best epoch so they match the saved checkpoint
            oof_preds[val_idx] = val_probs
            patience_counter = 0
        else:
            patience_counter += 1

        # Free cached GPU memory before deciding whether to stop
        torch.cuda.empty_cache()
        if patience_counter >= PATIENCE:
            print("Early stopping!")
            break




--- Fold 1 ---
Loaded pretrained weights for efficientnet-b3
Epoch 1: Train Loss=0.1245, Val AUC=0.9182
Epoch 2: Train Loss=0.0897, Val AUC=0.9147
Epoch 3: Train Loss=0.0729, Val AUC=0.9281
Epoch 4: Train Loss=0.0601, Val AUC=0.9292
Epoch 5: Train Loss=0.0509, Val AUC=0.9137

--- Fold 2 ---
Loaded pretrained weights for efficientnet-b3
Epoch 1: Train Loss=0.1275, Val AUC=0.9418
Epoch 2: Train Loss=0.0890, Val AUC=0.9516

Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7b2687d2c9a0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py", line 1618, in __del__
    self._shutdown_workers()
  File "/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py", line 1601, in _shutdown_workers
    if w.is_alive():
  File "/usr/lib/python3.11/multiprocessing/process.py", line 160, in is_alive
    assert self._parent_pid == os.getpid(), 'can only test a child process'

Epoch 3: Train Loss=0.0767, Val AUC=0.9581


Epoch 4:   0%|          | 0/375 [00:00<?, ?it/s]
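
Once every fold has finished, `oof_preds` holds one held-out prediction per training row, so a single pooled AUC can be reported next to the per-fold values (in the notebook this would be `roc_auc_score(oof_targets, oof_preds)`). As a cross-check of what that number means, the sketch below computes AUC from its pairwise definition on hypothetical toy values. It builds a full positive-by-negative matrix, so it is only suitable for small arrays.

```python
import numpy as np

def pairwise_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]                   # every positive/negative pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()    # ties count as half a win
    return wins / (len(pos) * len(neg))

# Hypothetical out-of-fold targets and predictions.
toy_targets = np.array([0, 0, 1, 1])
toy_preds = np.array([0.10, 0.40, 0.35, 0.80])
print(pairwise_auc(toy_targets, toy_preds))
```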

In [None]:
# 11. Inference using all folds
# -----------------------------
all_fold_preds = []

for fold in range(N_FOLDS):
    print(f"Loading model from fold {fold}...")
    model = EfficientNet.from_pretrained("efficientnet-b3")
    model._fc = nn.Sequential(nn.Dropout(0.4), nn.Linear(model._fc.in_features, 1))
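    # map_location lets the checkpoint load even if it was saved on a different device
    model.load_state_dict(torch.load(f"best_model_fold{fold}.pt", map_location=device))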
    model = model.to(device)
    model.eval()

    fold_preds = []

    with torch.no_grad():
        for images, isic_ids in tqdm(test_loader, desc=f"Inference Fold {fold}"):
            images = images.to(device)
            probs = torch.sigmoid(model(images).view(-1)).cpu().numpy()
            fold_preds.extend(probs)

    all_fold_preds.append(fold_preds)
    torch.cuda.empty_cache()

# Average predictions across all folds
avg_preds = np.mean(all_fold_preds, axis=0)

# Prepare submission (test_loader is not shuffled, so predictions follow test_df row order)
submission_df = pd.DataFrame({
    "isic_id": test_df["isic_id"].values,
    "target": avg_preds
})
submission_df = submission_df.sort_values(by="isic_id").reset_index(drop=True)
submission_df.to_csv(path + "/submission_cv_ensemble.csv", index=False)
print("Ensemble submission saved.")




Submission saved.
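
Before uploading, it can be worth sanity-checking the averaged predictions. The sketch below uses hypothetical fold outputs and a helper of our own (`check_submission`) to illustrate the fold averaging and a few basic format checks; the exact rules enforced by the competition may differ.

```python
import numpy as np

# Hypothetical predictions: 3 folds x 4 test images.
fold_preds = np.array([
    [0.10, 0.80, 0.30, 0.05],
    [0.20, 0.70, 0.40, 0.05],
    [0.30, 0.90, 0.20, 0.05],
])
test_ids = ["ISIC_1", "ISIC_2", "ISIC_3", "ISIC_4"]

# axis=0 averages over folds, leaving one probability per image.
avg = fold_preds.mean(axis=0)

def check_submission(ids, probs):
    """Raise AssertionError if the submission looks malformed."""
    probs = np.asarray(probs, dtype=float)
    assert len(ids) == len(probs), "one prediction per image"
    assert len(set(ids)) == len(ids), "duplicate isic_id"
    assert np.isfinite(probs).all(), "NaN or inf in predictions"
    assert ((probs >= 0.0) & (probs <= 1.0)).all(), "probability outside [0, 1]"

check_submission(test_ids, avg)
print(dict(zip(test_ids, avg.round(3))))
```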
