# Plant Pathology 2020 - FGVC7 | MLE-Benchmark Medal Plan

Objective: Win a medal by maximizing mean-column-wise-roc-auc. Default target hint says `multiple_diseases`, but the original competition is multi-label with columns: healthy, multiple_diseases, rust, scab. I will explore train.csv to confirm targets and shape approach accordingly.

Plan:
- Data audit: load CSVs, verify columns, image paths, counts, label distributions; sanity-check image readability.
- Baseline CV: Stratified KFold (multilabel stratification if multi-label; else standard stratify) with strong image model.
- Modeling v1: pretrained CNN (timm, e.g., tf_efficientnet_b0/b3) with 384px, BCEWithLogits loss, Mixup/CutMix off initially, simple augmentations.
- Training details: 5 folds, 2-stage training (256px warmup then 384px), early stopping, cosine LR, weight decay; log fold times.
- Ensembling: if time permits, average 2 backbones (b0 + b3) or 2 seeds.
- Submission: out-of-fold AUC tracking; generate test preds and submission.csv.

Experiment Log:
- [ ] E0: Data loading + EDA
- [ ] E1: EfficientNet-b0 256px 5-fold baseline
- [ ] E2: Resize 384px finetune, TTA
- [ ] E3: Upgrade to b3/b4 if time allows
- [ ] E4: Seed ensemble / test-time augmentations

Checkpoints: I will request expert review after plan, after EDA, after first baseline CV, and if scores stall.

In [1]:
# E0: Data loading + quick audit
import os, sys, time, json, math, gc, random
import pandas as pd
import numpy as np
from pathlib import Path

DATA_DIR = Path('.')
IMG_DIR = DATA_DIR / 'images'

t0 = time.time()
print('CWD:', os.getcwd())
print('Images dir exists:', IMG_DIR.exists())
print('Num images:', len(list(IMG_DIR.glob('*.jpg'))))

train = pd.read_csv(DATA_DIR / 'train.csv')
test = pd.read_csv(DATA_DIR / 'test.csv')
print('train.shape:', train.shape, 'test.shape:', test.shape)
print('train.columns:', train.columns.tolist())
print('test.columns:', test.columns.tolist())
print('\ntrain.head():')
display(train.head())

# Identify target columns (original competition has 4 one-hot label columns)
potential_targets = ['healthy','multiple_diseases','rust','scab']
target_cols = [c for c in potential_targets if c in train.columns]
if len(target_cols) == 0 and 'multiple_diseases' in train.columns:
    target_cols = ['multiple_diseases']
print('Detected target columns:', target_cols)

# Identify image id column
img_col_candidates = [c for c in ['image_id','image','id','filename'] if c in train.columns]
img_col = img_col_candidates[0] if img_col_candidates else None
print('Image column:', img_col)

# Check image file existence for a few samples
def img_path_from_row(row):
    if img_col is None:
        return None
    # common pattern: {image_id}.jpg
    name = str(row[img_col])
    p = IMG_DIR / f"{name}.jpg"
    if not p.exists():
        # try uppercase/lowercase differences or prefixes
        alt = list(IMG_DIR.glob(f"{name}*.jpg"))
        if len(alt):
            return alt[0]
    return p

exist_checks = []
for i, row in train.head(10).iterrows():
    p = img_path_from_row(row)
    exist_checks.append((row[img_col] if img_col else None, p.name if p else None, p.exists() if p else False))
print('Sample image existence checks (first 10):')
for rec in exist_checks:
    print(rec)

# Label distribution
if len(target_cols) > 1:
    print('\nLabel sums (multi-label one-hot):')
    print(train[target_cols].sum())
else:
    # if single target provided as numeric/class, show value_counts
    print('\nSingle target distribution:')
    print(train[target_cols[0]].value_counts(dropna=False))

print(f'Elapsed: {time.time()-t0:.2f}s')

CWD: /app/agent_run_states/plant-pathology-2020-fgvc7-spray-20250908-012355
Images dir exists: True
Num images: 1821
train.shape: (1638, 5) test.shape: (183, 1)
train.columns: ['image_id', 'healthy', 'multiple_diseases', 'rust', 'scab']
test.columns: ['image_id']

train.head():


Unnamed: 0,image_id,healthy,multiple_diseases,rust,scab
0,Train_0,0,0,1,0
1,Train_1,1,0,0,0
2,Train_2,0,0,1,0
3,Train_3,1,0,0,0
4,Train_4,0,0,1,0


Detected target columns: ['healthy', 'multiple_diseases', 'rust', 'scab']
Image column: image_id
Sample image existence checks (first 10):
('Train_0', 'Train_0.jpg', True)
('Train_1', 'Train_1.jpg', True)
('Train_2', 'Train_2.jpg', True)
('Train_3', 'Train_3.jpg', True)
('Train_4', 'Train_4.jpg', True)
('Train_5', 'Train_5.jpg', True)
('Train_6', 'Train_6.jpg', True)
('Train_7', 'Train_7.jpg', True)
('Train_8', 'Train_8.jpg', True)
('Train_9', 'Train_9.jpg', True)

Label sums (multi-label one-hot):
healthy              468
multiple_diseases     85
rust                 551
scab                 534
dtype: int64
Elapsed: 0.02s


In [3]:
# Setup: installs, imports, seeds, utilities
import sys, subprocess, importlib, time, os, random

def ensure_package(pkg, import_name=None, version=None):
    import_name = import_name or pkg
    try:
        importlib.import_module(import_name)
        print(f"OK: {pkg} already installed")
    except Exception as e:
        pip_cmd = [sys.executable, '-m', 'pip', 'install', pkg + (f'=={version}' if version else '')]
        print('Installing', ' '.join(pip_cmd)); subprocess.check_call(pip_cmd)
        importlib.invalidate_caches()

# Core libs
ensure_package('timm')
ensure_package('albumentations')
ensure_package('iterative-stratification', 'iterstrat')

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from sklearn.metrics import roc_auc_score
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold
import albumentations as A
from albumentations.pytorch import ToTensorV2
from PIL import Image

SEED = 42
def set_seed(seed=SEED):
    random.seed(seed); np.random.seed(seed); torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = True
set_seed(SEED)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Device:', device)

# Metric helper
def mean_columnwise_roc_auc(y_true, y_pred):
    scores = []
    for i in range(y_true.shape[1]):
        try:
            s = roc_auc_score(y_true[:, i], y_pred[:, i])
        except ValueError:
            s = np.nan
        scores.append(s)
    return float(np.nanmean(scores)), scores

print('Setup complete.')

  from .autonotebook import tqdm as notebook_tqdm


OK: timm already installed


OK: albumentations already installed
OK: iterative-stratification already installed


Device: cuda
Setup complete.


In [4]:
# Modeling pipeline: dataset, transforms, model, CV training/eval, inference
import os, time, math, gc, copy, warnings
from pathlib import Path

# Set writable cache dirs BEFORE importing timm/huggingface
CACHE_DIR = Path('./timm_cache')
CACHE_DIR.mkdir(parents=True, exist_ok=True)
os.environ['HF_HOME'] = str(CACHE_DIR)
os.environ['HF_HUB_CACHE'] = str(CACHE_DIR / 'hub')
os.environ['XDG_CACHE_HOME'] = str(CACHE_DIR)
os.environ['TIMM_HOME'] = str(CACHE_DIR)
os.environ['TORCH_HOME'] = str(CACHE_DIR / 'torch')
os.environ['PYTORCH_HUB'] = str(CACHE_DIR / 'pytorch_hub')

import numpy as np
import pandas as pd
import cv2
import timm
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader, WeightedRandomSampler
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR
from sklearn.metrics import roc_auc_score
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold
import albumentations as A
from albumentations.pytorch import ToTensorV2

# Device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
torch.backends.cudnn.benchmark = True
print('Device:', device, flush=True)

warnings.filterwarnings('ignore')
IMG_DIR = Path('images')
TARGET_COLS = ['healthy','multiple_diseases','rust','scab']
N_CLASSES = len(TARGET_COLS)

# Transforms
IMG_SIZE = 384
def get_transforms(img_size=IMG_SIZE):
    train_tfms = A.Compose([
        A.RandomResizedCrop(size=(img_size, img_size), scale=(0.8, 1.0), ratio=(0.75, 1.333), p=1.0),
        A.HorizontalFlip(p=0.5),
        A.VerticalFlip(p=0.3),
        A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1, rotate_limit=25, p=0.6),
        A.RandomBrightnessContrast(p=0.5),
        A.HueSaturationValue(p=0.5),
        A.ColorJitter(p=0.3),
        A.CoarseDropout(max_holes=8, max_height=int(img_size*0.1), max_width=int(img_size*0.1), fill_value=0, p=0.5),
        A.Normalize(mean=(0.485,0.456,0.406), std=(0.229,0.224,0.225)),
        ToTensorV2(),
    ])
    valid_tfms = A.Compose([
        A.Resize(height=img_size, width=img_size),
        A.Normalize(mean=(0.485,0.456,0.406), std=(0.229,0.224,0.225)),
        ToTensorV2(),
    ])
    return train_tfms, valid_tfms

class PlantDataset(Dataset):
    def __init__(self, df, img_dir, img_col='image_id', target_cols=None, transforms=None, is_train=True):
        self.df = df.reset_index(drop=True)
        self.img_dir = Path(img_dir)
        self.img_col = img_col
        self.target_cols = target_cols
        self.transforms = transforms
        self.is_train = is_train
    def __len__(self):
        return len(self.df)
    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        img_path = self.img_dir / f"{row[self.img_col]}.jpg"
        img = cv2.imread(str(img_path))
        if img is None:
            raise FileNotFoundError(f"Image not found: {img_path}")
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        if self.transforms:
            img = self.transforms(image=img)['image']
        if self.is_train and self.target_cols is not None:
            y = row[self.target_cols].values.astype('float32')
            return img, torch.from_numpy(y)
        else:
            return img, row[self.img_col]

# Model
class Model(nn.Module):
    def __init__(self, backbone='tf_efficientnet_b3_ns', num_classes=N_CLASSES, drop=0.2):
        super().__init__()
        # ensure pretrained weights cache in writable dir
        self.net = timm.create_model(backbone, pretrained=True, num_classes=num_classes, drop_rate=drop, in_chans=3, cache_dir=str(CACHE_DIR))
    def forward(self, x):
        return self.net(x)

# EMA utility
class EMA:
    def __init__(self, model, decay=0.9998):
        self.ema = copy.deepcopy(model).eval()
        for p in self.ema.parameters():
            p.requires_grad_(False)
        self.decay = decay
    @torch.no_grad()
    def update(self, model):
        msd = model.state_dict()
        for k, v in self.ema.state_dict().items():
            if k in msd:
                v.copy_(v * self.decay + msd[k] * (1.0 - self.decay))

# Asymmetric Loss for multi-label
class AsymmetricLossMultiLabel(nn.Module):
    def __init__(self, gamma_pos=0.0, gamma_neg=4.0, clip=0.05, eps=1e-8):
        super().__init__()
        self.gamma_pos = gamma_pos
        self.gamma_neg = gamma_neg
        self.clip = clip
        self.eps = eps
    def forward(self, logits, targets):
        x_sigmoid = torch.sigmoid(logits)
        xs_pos = x_sigmoid
        xs_neg = 1.0 - x_sigmoid
        if self.clip is not None and self.clip > 0:
            xs_neg = (xs_neg + self.clip).clamp(max=1)
        # Basic CE
        loss = targets * torch.log(xs_pos.clamp(min=self.eps)) + (1 - targets) * torch.log(xs_neg.clamp(min=self.eps))
        # Focusing
        if self.gamma_pos > 0 or self.gamma_neg > 0:
            pt = targets * xs_pos + (1 - targets) * xs_neg
            one_sided = (1 - pt) ** (self.gamma_pos * targets + self.gamma_neg * (1 - targets))
            loss *= one_sided
        return -loss.mean()

def compute_auc(y_true, y_pred):
    scores = []
    for i in range(y_true.shape[1]):
        try:
            scores.append(roc_auc_score(y_true[:, i], y_pred[:, i]))
        except Exception:
            scores.append(np.nan)
    return float(np.nanmean(scores)), scores

# Mixup
def do_mixup(images, targets, alpha=0.3):
    if alpha <= 0:
        return images, targets
    lam = np.random.beta(alpha, alpha)
    index = torch.randperm(images.size(0), device=images.device)
    mixed_images = lam * images + (1 - lam) * images[index]
    mixed_targets = lam * targets + (1 - lam) * targets[index]
    return mixed_images, mixed_targets

# Train/valid loops
def train_one_epoch(model, ema, loader, optimizer, scaler, loss_fn, device, scheduler=None, mixup_alpha=0.3, accum_steps=1):
    model.train()
    running = 0.0
    optimizer.zero_grad(set_to_none=True)
    for step, (images, targets) in enumerate(loader):
        images = images.to(device, non_blocking=True)
        targets = targets.to(device, non_blocking=True)
        if mixup_alpha and mixup_alpha > 0:
            images, targets = do_mixup(images, targets, alpha=mixup_alpha)
        with torch.cuda.amp.autocast(enabled=True):
            logits = model(images)
            loss = loss_fn(logits, targets) / accum_steps
        scaler.scale(loss).backward()
        # gradient clipping for stability
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        if ((step + 1) % accum_steps) == 0:
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad(set_to_none=True)
        if scheduler is not None:
            # OneCycleLR step per batch
            scheduler.step()
        if ema is not None:
            ema.update(model)
        running += loss.item() * accum_steps
        if (step+1) % 25 == 0:
            print(f"  step {step+1}/{len(loader)} loss {running/(step+1):.4f}", flush=True)
    return running / max(1, len(loader))

@torch.no_grad()
def validate(model, loader, device, use_ema=False, ema=None):
    model.eval()
    preds = []
    targs = []
    net = ema.ema if (use_ema and ema is not None) else model
    for images, targets in loader:
        images = images.to(device, non_blocking=True)
        logits = net(images)
        probs = torch.sigmoid(logits).float().cpu().numpy()
        preds.append(probs)
        targs.append(targets.numpy())
    preds = np.concatenate(preds, axis=0)
    targs = np.concatenate(targs, axis=0)
    mauc, per = compute_auc(targs, preds)
    return mauc, per, preds

# Cross-validation training
def run_cv(train_df, img_dir=IMG_DIR, n_folds=5, seed=42, epochs=18, batch_size=16, lr=3e-4, wd=1e-4, patience=5, backbone='tf_efficientnet_b3_ns', use_ema=True, device=device, mixup_alpha=0.3, oversample=True, ema_decay=0.9998, accum_steps=1):
    train_tfms, valid_tfms = get_transforms(IMG_SIZE)
    X = train_df[['image_id']].values
    y = train_df[TARGET_COLS].values
    mskf = MultilabelStratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    folds = list(mskf.split(X, y))
    # sanity check rare class distribution
    for fi, (tr_idx, va_idx) in enumerate(folds):
        cnt = train_df.iloc[va_idx]['multiple_diseases'].sum()
        print(f"Fold {fi}: val multiple_diseases count = {int(cnt)}", flush=True)
    oof = np.zeros((len(train_df), N_CLASSES), dtype=np.float32)
    per_fold_scores = []
    for fi, (tr_idx, va_idx) in enumerate(folds):
        print(f"\n=== Fold {fi+1}/{n_folds} ===", flush=True)
        t_start = time.time()
        tr_df = train_df.iloc[tr_idx].reset_index(drop=True)
        va_df = train_df.iloc[va_idx].reset_index(drop=True)
        ds_tr = PlantDataset(tr_df, img_dir, img_col='image_id', target_cols=TARGET_COLS, transforms=train_tfms, is_train=True)
        ds_va = PlantDataset(va_df, img_dir, img_col='image_id', target_cols=TARGET_COLS, transforms=valid_tfms, is_train=True)
        # Oversampling for multiple_diseases class
        sampler = None
        if oversample:
            md = tr_df['multiple_diseases'].values.astype(np.float32)
            # weight boost for md positives
            w = np.ones(len(tr_df), dtype=np.float32)
            # inverse frequency style boost, capped at 10x
            pos_frac = md.mean() if md.mean() > 0 else 0.05
            boost = min(10.0, max(3.0, 1.0 / max(1e-3, pos_frac)))
            print(f"  Oversample: pos_frac={pos_frac:.4f}, boost={boost:.2f}", flush=True)
            w[md > 0.5] *= boost
            sampler = WeightedRandomSampler(weights=torch.from_numpy(w), num_samples=len(tr_df), replacement=True)
        dl_tr = DataLoader(ds_tr, batch_size=batch_size, shuffle=(sampler is None), sampler=sampler, num_workers=2, pin_memory=True, drop_last=True)
        dl_va = DataLoader(ds_va, batch_size=batch_size*2, shuffle=False, num_workers=2, pin_memory=True)
        model = Model(backbone=backbone, num_classes=N_CLASSES).to(device)
        optimizer = AdamW(model.parameters(), lr=lr, weight_decay=wd)
        # OneCycleLR per batch
        steps_per_epoch = max(1, len(dl_tr))
        scheduler = OneCycleLR(optimizer, max_lr=lr, epochs=epochs, steps_per_epoch=steps_per_epoch, pct_start=0.3, anneal_strategy='cos', div_factor=25.0, final_div_factor=1e4)
        scaler = torch.cuda.amp.GradScaler(enabled=True)
        ema = EMA(model, decay=ema_decay) if use_ema else None
        loss_fn = AsymmetricLossMultiLabel(gamma_pos=0.0, gamma_neg=4.0, clip=0.05)
        best = -1.0
        best_state = None
        best_epoch = -1
        no_improve = 0
        for epoch in range(epochs):
            ep_s = time.time()
            tr_loss = train_one_epoch(model, ema, dl_tr, optimizer, scaler, loss_fn, device, scheduler=scheduler, mixup_alpha=mixup_alpha, accum_steps=accum_steps)
            mauc, per, preds = validate(model, dl_va, device, use_ema=True, ema=ema)
            elapsed = time.time()-ep_s
            print(f"Fold {fi} Epoch {epoch+1}/{epochs} | tr_loss {tr_loss:.4f} | val mAUC {mauc:.5f} | per {np.round(per,4)} | {elapsed:.1f}s", flush=True)
            if mauc > best:
                best = mauc; best_state = copy.deepcopy((ema.ema if use_ema else model)).state_dict(); best_epoch = epoch; no_improve = 0
            else:
                no_improve += 1
            if no_improve >= patience:
                print(f"Early stopping at epoch {epoch+1}", flush=True)
                break
        # load best and predict OOF
        (ema.ema if use_ema else model).load_state_dict(best_state, strict=True)
        mauc, per, preds = validate(model, dl_va, device, use_ema=True, ema=ema)
        oof[va_idx] = preds
        per_fold_scores.append((best, per))
        # save fold weights
        ckpt_path = f"model_fold{fi}.pth"
        torch.save(best_state, ckpt_path)
        print(f"Fold {fi} best mAUC {best:.5f} at epoch {best_epoch+1}. Saved {ckpt_path}. Fold time {time.time()-t_start:.1f}s", flush=True)
        del model, optimizer, scheduler, scaler, ema, dl_tr, dl_va, ds_tr, ds_va; gc.collect(); torch.cuda.empty_cache()
    # OOF score
    mauc, per = compute_auc(train_df[TARGET_COLS].values, oof)
    print(f"OOF mAUC: {mauc:.5f}; per-class: {np.round(per,5)}", flush=True)
    np.save('oof_preds.npy', oof)
    train_df[[*TARGET_COLS]] = train_df[TARGET_COLS]
    return oof, per_fold_scores

# Inference with TTA (orig + hflip + vflip + hvflip)
@torch.no_grad()
def predict_test(models_paths, test_df, img_dir=IMG_DIR, batch_size=32, backbone='tf_efficientnet_b3_ns', device=device):
    _, valid_tfms = get_transforms(IMG_SIZE)
    ds_te = PlantDataset(test_df, img_dir, img_col='image_id', target_cols=None, transforms=valid_tfms, is_train=False)
    dl_te = DataLoader(ds_te, batch_size=batch_size, shuffle=False, num_workers=2, pin_memory=True)
    # prepare models
    models = []
    for p in models_paths:
        m = Model(backbone=backbone, num_classes=N_CLASSES).to(device)
        sd = torch.load(p, map_location=device)
        m.load_state_dict(sd, strict=True)
        m.eval()
        models.append(m)
    all_preds = []
    for images, ids in dl_te:
        images = images.to(device, non_blocking=True)
        logits_sum = None
        for m in models:
            logits0 = m(images)
            logits1 = m(torch.flip(images, dims=[3]))  # hflip
            logits2 = m(torch.flip(images, dims=[2]))  # vflip
            logits3 = m(torch.flip(images, dims=[2,3]))  # hvflip
            logits_tta = (logits0 + logits1 + logits2 + logits3) / 4.0
            logits_sum = logits_tta if logits_sum is None else logits_sum + logits_tta
        logits_avg = logits_sum / len(models)
        probs = torch.sigmoid(logits_avg).float().cpu().numpy()
        all_preds.append(probs)
    preds = np.concatenate(all_preds, axis=0)
    sub = pd.DataFrame(preds, columns=TARGET_COLS)
    sub.insert(0, 'image_id', test_df['image_id'].values)
    return sub

print('Pipeline defined. Ready to run CV training next.')

Device: cuda


Pipeline defined. Ready to run CV training next.


In [6]:
# Run 5-fold CV training and generate submission
import pandas as pd, numpy as np, time, os, gc

train_df = pd.read_csv('train.csv')
test_df = pd.read_csv('test.csv')

# sanity: ensure columns exist
assert all([c in train_df.columns for c in TARGET_COLS])

SEED = 42
N_FOLDS = 5
EPOCHS = 18
BATCH_SIZE = 16
LR = 4e-4  # safer max_lr for OneCycle per expert audit
WD = 1e-4
PATIENCE = 5
BACKBONE = 'tf_efficientnet_b3_ns'
MIXUP_ALPHA = 0.3
EMA_DECAY = 0.999
OVERSAMPLE = True

t0 = time.time()
print('Starting CV with ASL + Mixup + Oversampling (cap 10x) + OneCycleLR + EMA...')
oof, per_fold_scores = run_cv(
    train_df, img_dir=IMG_DIR, n_folds=N_FOLDS, seed=SEED,
    epochs=EPOCHS, batch_size=BATCH_SIZE, lr=LR, wd=WD, patience=PATIENCE,
    backbone=BACKBONE, use_ema=True, mixup_alpha=MIXUP_ALPHA, oversample=OVERSAMPLE, ema_decay=EMA_DECAY, accum_steps=2
)
print(f'CV finished in {time.time()-t0:.1f}s')

# Collect fold checkpoints
ckpts = [f'model_fold{i}.pth' for i in range(N_FOLDS) if os.path.exists(f'model_fold{i}.pth')]
print('Found checkpoints:', ckpts)

print('Predicting test with 4-way TTA (orig, hflip, vflip, hvflip) and fold-ensemble...')
sub = predict_test(ckpts, test_df, img_dir=IMG_DIR, batch_size=32, backbone=BACKBONE)
sub.to_csv('submission.csv', index=False)
print('Saved submission.csv with shape:', sub.shape)
sub.head()

Starting CV with ASL + Mixup + Oversampling (cap 10x) + OneCycleLR + EMA...
Fold 0: val multiple_diseases count = 17


Fold 1: val multiple_diseases count = 17


Fold 2: val multiple_diseases count = 17


Fold 3: val multiple_diseases count = 17


Fold 4: val multiple_diseases count = 17



=== Fold 1/5 ===


  Oversample: pos_frac=0.0519, boost=10.00


  step 25/81 loss 0.4439


  step 50/81 loss 0.4345


  step 75/81 loss 0.4127


Fold 0 Epoch 1/18 | tr_loss 0.4113 | val mAUC 0.59683 | per [0.6625 0.5926 0.5161 0.6162] | 74.6s


  step 25/81 loss 0.3353


  step 50/81 loss 0.2946


  step 75/81 loss 0.2820


Fold 0 Epoch 2/18 | tr_loss 0.2767 | val mAUC 0.60100 | per [0.6724 0.5935 0.524  0.6141] | 23.7s


  step 25/81 loss 0.2166


  step 50/81 loss 0.1950


  step 75/81 loss 0.1869


Fold 0 Epoch 3/18 | tr_loss 0.1842 | val mAUC 0.62270 | per [0.6917 0.6074 0.5609 0.6308] | 23.4s


  step 25/81 loss 0.1443


  step 50/81 loss 0.1409


  step 75/81 loss 0.1395


Fold 0 Epoch 4/18 | tr_loss 0.1438 | val mAUC 0.65719 | per [0.7244 0.6247 0.6086 0.6711] | 23.5s


  step 25/81 loss 0.1566


  step 50/81 loss 0.1575


  step 75/81 loss 0.1471


Fold 0 Epoch 5/18 | tr_loss 0.1451 | val mAUC 0.70641 | per [0.7731 0.6509 0.6614 0.7403] | 23.7s


  step 25/81 loss 0.1230


  step 50/81 loss 0.1221


  step 75/81 loss 0.1174


Fold 0 Epoch 6/18 | tr_loss 0.1183 | val mAUC 0.75626 | per [0.8285 0.6715 0.7214 0.8036] | 23.4s


  step 25/81 loss 0.1132


  step 50/81 loss 0.1183


  step 75/81 loss 0.1147


Fold 0 Epoch 7/18 | tr_loss 0.1116 | val mAUC 0.80095 | per [0.8778 0.6977 0.7756 0.8527] | 23.4s


  step 25/81 loss 0.0772


  step 50/81 loss 0.0816


  step 75/81 loss 0.0794


Fold 0 Epoch 8/18 | tr_loss 0.0796 | val mAUC 0.84250 | per [0.9161 0.7171 0.838  0.8988] | 23.4s


  step 25/81 loss 0.0878


  step 50/81 loss 0.0866


  step 75/81 loss 0.0867


Fold 0 Epoch 9/18 | tr_loss 0.0850 | val mAUC 0.87803 | per [0.9399 0.7457 0.8904 0.9361] | 23.3s


  step 25/81 loss 0.0857


  step 50/81 loss 0.0732


  step 75/81 loss 0.0712


Fold 0 Epoch 10/18 | tr_loss 0.0733 | val mAUC 0.90905 | per [0.9591 0.7816 0.9357 0.9598] | 23.8s


  step 25/81 loss 0.0637


  step 50/81 loss 0.0628


  step 75/81 loss 0.0621


Fold 0 Epoch 11/18 | tr_loss 0.0639 | val mAUC 0.93180 | per [0.9708 0.8207 0.9623 0.9734] | 23.5s


  step 25/81 loss 0.0622


  step 50/81 loss 0.0645


  step 75/81 loss 0.0575


Fold 0 Epoch 12/18 | tr_loss 0.0602 | val mAUC 0.94909 | per [0.98   0.8575 0.9776 0.9812] | 23.5s


  step 25/81 loss 0.0519


  step 50/81 loss 0.0521


  step 75/81 loss 0.0506


Fold 0 Epoch 13/18 | tr_loss 0.0515 | val mAUC 0.96323 | per [0.9856 0.8958 0.986  0.9855] | 23.4s


  step 25/81 loss 0.0393


  step 50/81 loss 0.0436


  step 75/81 loss 0.0500


Fold 0 Epoch 14/18 | tr_loss 0.0498 | val mAUC 0.97340 | per [0.9889 0.9254 0.9907 0.9886] | 23.4s


  step 25/81 loss 0.0593


  step 50/81 loss 0.0523


  step 75/81 loss 0.0520


Fold 0 Epoch 15/18 | tr_loss 0.0512 | val mAUC 0.97941 | per [0.9909 0.9419 0.9941 0.9907] | 23.5s


  step 25/81 loss 0.0471


  step 50/81 loss 0.0411


  step 75/81 loss 0.0376


Fold 0 Epoch 16/18 | tr_loss 0.0384 | val mAUC 0.98374 | per [0.9924 0.9545 0.9965 0.9916] | 23.4s


  step 25/81 loss 0.0399


  step 50/81 loss 0.0435


  step 75/81 loss 0.0458


Fold 0 Epoch 17/18 | tr_loss 0.0449 | val mAUC 0.98510 | per [0.9931 0.9577 0.9971 0.9925] | 23.4s


  step 25/81 loss 0.0372


  step 50/81 loss 0.0429


  step 75/81 loss 0.0485


Fold 0 Epoch 18/18 | tr_loss 0.0485 | val mAUC 0.98575 | per [0.9935 0.9588 0.9978 0.9929] | 23.5s


Fold 0 best mAUC 0.98575 at epoch 18. Saved model_fold0.pth. Fold time 480.4s



=== Fold 2/5 ===


  Oversample: pos_frac=0.0519, boost=10.00


  step 25/81 loss 0.5174


  step 50/81 loss 0.4703


  step 75/81 loss 0.4238


Fold 1 Epoch 1/18 | tr_loss 0.4239 | val mAUC 0.53499 | per [0.4516 0.5659 0.5106 0.6118] | 24.8s


  step 25/81 loss 0.3130


  step 50/81 loss 0.3013


  step 75/81 loss 0.2773


Fold 1 Epoch 2/18 | tr_loss 0.2750 | val mAUC 0.54225 | per [0.4585 0.5644 0.5228 0.6233] | 24.4s


  step 25/81 loss 0.2035


  step 50/81 loss 0.1854


  step 75/81 loss 0.1755


Fold 1 Epoch 3/18 | tr_loss 0.1748 | val mAUC 0.56896 | per [0.4952 0.5812 0.5481 0.6513] | 24.4s


  step 25/81 loss 0.1696


  step 50/81 loss 0.1672


  step 75/81 loss 0.1656


Fold 1 Epoch 4/18 | tr_loss 0.1630 | val mAUC 0.60953 | per [0.5584 0.6054 0.5853 0.689 ] | 24.6s


  step 25/81 loss 0.1395


  step 50/81 loss 0.1369


  step 75/81 loss 0.1323


Fold 1 Epoch 5/18 | tr_loss 0.1309 | val mAUC 0.65521 | per [0.6148 0.6524 0.6159 0.7378] | 24.2s


  step 25/81 loss 0.1116


  step 50/81 loss 0.1149


  step 75/81 loss 0.1125


Fold 1 Epoch 6/18 | tr_loss 0.1116 | val mAUC 0.70520 | per [0.6675 0.7036 0.6634 0.7863] | 24.4s


  step 25/81 loss 0.1054


  step 50/81 loss 0.1089


  step 75/81 loss 0.1149


Fold 1 Epoch 7/18 | tr_loss 0.1143 | val mAUC 0.76232 | per [0.7278 0.7611 0.7301 0.8303] | 24.3s


  step 25/81 loss 0.1154


  step 50/81 loss 0.1017


  step 75/81 loss 0.0927


Fold 1 Epoch 8/18 | tr_loss 0.0935 | val mAUC 0.82491 | per [0.7965 0.8061 0.8163 0.8807] | 24.4s


  step 25/81 loss 0.0942


  step 50/81 loss 0.0838


  step 75/81 loss 0.0821


Fold 1 Epoch 9/18 | tr_loss 0.0828 | val mAUC 0.87864 | per [0.8594 0.8468 0.892  0.9164] | 24.3s


  step 25/81 loss 0.0618


  step 50/81 loss 0.0684


  step 75/81 loss 0.0681


Fold 1 Epoch 10/18 | tr_loss 0.0677 | val mAUC 0.91759 | per [0.9073 0.8748 0.9484 0.9398] | 24.3s


  step 25/81 loss 0.0762


  step 50/81 loss 0.0701


  step 75/81 loss 0.0680


Fold 1 Epoch 11/18 | tr_loss 0.0678 | val mAUC 0.94305 | per [0.9375 0.8981 0.979  0.9577] | 24.2s


  step 25/81 loss 0.0685


  step 50/81 loss 0.0623


  step 75/81 loss 0.0612


Fold 1 Epoch 12/18 | tr_loss 0.0609 | val mAUC 0.95970 | per [0.9585 0.9187 0.9907 0.971 ] | 24.3s


  step 25/81 loss 0.0716


  step 50/81 loss 0.0615


  step 75/81 loss 0.0617


Fold 1 Epoch 13/18 | tr_loss 0.0604 | val mAUC 0.97015 | per [0.9707 0.9346 0.9948 0.9805] | 24.4s


  step 25/81 loss 0.0602


  step 50/81 loss 0.0639


  step 75/81 loss 0.0589


Fold 1 Epoch 14/18 | tr_loss 0.0582 | val mAUC 0.97714 | per [0.9792 0.9459 0.9976 0.9859] | 24.4s


  step 25/81 loss 0.0461


  step 50/81 loss 0.0451


  step 75/81 loss 0.0465


Fold 1 Epoch 15/18 | tr_loss 0.0461 | val mAUC 0.98146 | per [0.9856 0.9531 0.9991 0.9881] | 24.4s


  step 25/81 loss 0.0548


  step 50/81 loss 0.0523


  step 75/81 loss 0.0475


Fold 1 Epoch 16/18 | tr_loss 0.0464 | val mAUC 0.98460 | per [0.9905 0.9584 0.9995 0.99  ] | 24.1s


  step 25/81 loss 0.0407


  step 50/81 loss 0.0507


  step 75/81 loss 0.0476


Fold 1 Epoch 17/18 | tr_loss 0.0480 | val mAUC 0.98730 | per [0.9939 0.9646 0.9997 0.991 ] | 24.2s


  step 25/81 loss 0.0414


  step 50/81 loss 0.0452


  step 75/81 loss 0.0438


Fold 1 Epoch 18/18 | tr_loss 0.0445 | val mAUC 0.98849 | per [0.9958 0.9667 0.9998 0.9916] | 24.3s


Fold 1 best mAUC 0.98849 at epoch 18. Saved model_fold1.pth. Fold time 445.3s



=== Fold 3/5 ===


  Oversample: pos_frac=0.0519, boost=10.00


  step 25/81 loss 0.4663


  step 50/81 loss 0.4341


  step 75/81 loss 0.4152


Fold 2 Epoch 1/18 | tr_loss 0.4126 | val mAUC 0.50528 | per [0.5518 0.4956 0.5269 0.4468] | 24.5s


  step 25/81 loss 0.3276


  step 50/81 loss 0.3064


  step 75/81 loss 0.2917


Fold 2 Epoch 2/18 | tr_loss 0.2882 | val mAUC 0.50297 | per [0.5653 0.4793 0.548  0.4193] | 24.4s


  step 25/81 loss 0.2305


  step 50/81 loss 0.2304


  step 75/81 loss 0.2097


Fold 2 Epoch 3/18 | tr_loss 0.2075 | val mAUC 0.52095 | per [0.5897 0.4772 0.5987 0.4183] | 24.3s


  step 25/81 loss 0.1548


  step 50/81 loss 0.1504


  step 75/81 loss 0.1501


Fold 2 Epoch 4/18 | tr_loss 0.1462 | val mAUC 0.55524 | per [0.6249 0.4878 0.6669 0.4413] | 24.3s


  step 25/81 loss 0.1233


  step 50/81 loss 0.1231


  step 75/81 loss 0.1223


Fold 2 Epoch 5/18 | tr_loss 0.1212 | val mAUC 0.60658 | per [0.6724 0.5188 0.7393 0.4958] | 24.7s


  step 25/81 loss 0.1306


  step 50/81 loss 0.1260


  step 75/81 loss 0.1246


Fold 2 Epoch 6/18 | tr_loss 0.1221 | val mAUC 0.66409 | per [0.7341 0.561  0.807  0.5542] | 24.4s


  step 25/81 loss 0.1067


  step 50/81 loss 0.0955


  step 75/81 loss 0.0977


Fold 2 Epoch 7/18 | tr_loss 0.0979 | val mAUC 0.72856 | per [0.8069 0.6198 0.8605 0.627 ] | 24.5s


  step 25/81 loss 0.1023


  step 50/81 loss 0.0986


  step 75/81 loss 0.0948


Fold 2 Epoch 8/18 | tr_loss 0.0940 | val mAUC 0.78922 | per [0.8714 0.6673 0.906  0.7121] | 24.5s


  step 25/81 loss 0.0921


  step 50/81 loss 0.0854


  step 75/81 loss 0.0798


Fold 2 Epoch 9/18 | tr_loss 0.0791 | val mAUC 0.84426 | per [0.9161 0.7212 0.9456 0.7941] | 24.9s


  step 25/81 loss 0.0662


  step 50/81 loss 0.0723


  step 75/81 loss 0.0698


Fold 2 Epoch 10/18 | tr_loss 0.0677 | val mAUC 0.88736 | per [0.9513 0.7651 0.9706 0.8625] | 24.4s


  step 25/81 loss 0.0742


  step 50/81 loss 0.0665


  step 75/81 loss 0.0629


Fold 2 Epoch 11/18 | tr_loss 0.0622 | val mAUC 0.91522 | per [0.9705 0.7982 0.9842 0.908 ] | 24.3s


  step 25/81 loss 0.0732


  step 50/81 loss 0.0666


  step 75/81 loss 0.0686


Fold 2 Epoch 12/18 | tr_loss 0.0680 | val mAUC 0.93362 | per [0.9807 0.829  0.9907 0.9341] | 24.4s


  step 25/81 loss 0.0583


  step 50/81 loss 0.0611


  step 75/81 loss 0.0622


Fold 2 Epoch 13/18 | tr_loss 0.0630 | val mAUC 0.94648 | per [0.9876 0.8494 0.994  0.9548] | 24.3s


  step 25/81 loss 0.0568


  step 50/81 loss 0.0546


  step 75/81 loss 0.0516


Fold 2 Epoch 14/18 | tr_loss 0.0506 | val mAUC 0.95710 | per [0.9921 0.8712 0.995  0.9701] | 24.8s


  step 25/81 loss 0.0461


  step 50/81 loss 0.0487


  step 75/81 loss 0.0496


Fold 2 Epoch 15/18 | tr_loss 0.0490 | val mAUC 0.96361 | per [0.9945 0.8848 0.9957 0.9794] | 24.5s


  step 25/81 loss 0.0404


  step 50/81 loss 0.0480


  step 75/81 loss 0.0466


Fold 2 Epoch 16/18 | tr_loss 0.0452 | val mAUC 0.97001 | per [0.9959 0.9016 0.9962 0.9863] | 24.4s


  step 25/81 loss 0.0542


  step 50/81 loss 0.0494


  step 75/81 loss 0.0471


Fold 2 Epoch 17/18 | tr_loss 0.0451 | val mAUC 0.97369 | per [0.997  0.9109 0.9964 0.9904] | 24.4s


  step 25/81 loss 0.0438


  step 50/81 loss 0.0418


  step 75/81 loss 0.0425


Fold 2 Epoch 18/18 | tr_loss 0.0456 | val mAUC 0.97550 | per [0.998  0.9143 0.9967 0.993 ] | 24.5s


Fold 2 best mAUC 0.97550 at epoch 18. Saved model_fold2.pth. Fold time 447.5s



=== Fold 4/5 ===


  Oversample: pos_frac=0.0519, boost=10.00


  step 25/81 loss 0.5999


  step 50/81 loss 0.5231


  step 75/81 loss 0.4692


Fold 3 Epoch 1/18 | tr_loss 0.4658 | val mAUC 0.57402 | per [0.6215 0.5763 0.5228 0.5755] | 24.5s


  step 25/81 loss 0.3100


  step 50/81 loss 0.2956


  step 75/81 loss 0.2841


Fold 3 Epoch 2/18 | tr_loss 0.2791 | val mAUC 0.57637 | per [0.6172 0.56   0.5488 0.5796] | 24.4s


  step 25/81 loss 0.1917


  step 50/81 loss 0.1879


  step 75/81 loss 0.1730


Fold 3 Epoch 3/18 | tr_loss 0.1714 | val mAUC 0.60154 | per [0.6435 0.5564 0.6032 0.6031] | 24.3s


  step 25/81 loss 0.1650


  step 50/81 loss 0.1539


  step 75/81 loss nan


Fold 3 Epoch 4/18 | tr_loss nan | val mAUC 0.64210 | per [0.68   0.5774 0.6631 0.6478] | 24.8s


  step 25/81 loss nan


  step 50/81 loss nan


  step 75/81 loss nan


Fold 3 Epoch 5/18 | tr_loss nan | val mAUC 0.69311 | per [0.7448 0.5991 0.7137 0.7149] | 24.2s


  step 25/81 loss 0.2058


  step 50/81 loss 0.1660


  step 75/81 loss 0.1459


Fold 3 Epoch 6/18 | tr_loss 0.1434 | val mAUC 0.74993 | per [0.8398 0.593  0.7826 0.7844] | 24.1s


  step 25/81 loss 0.0920


  step 50/81 loss 0.0853


  step 75/81 loss 0.0900


Fold 3 Epoch 7/18 | tr_loss 0.0911 | val mAUC 0.80482 | per [0.9178 0.6106 0.8403 0.8505] | 24.0s


  step 25/81 loss 0.0731


  step 50/81 loss 0.0707


  step 75/81 loss 0.0702


Fold 3 Epoch 8/18 | tr_loss 0.0723 | val mAUC 0.85000 | per [0.9546 0.655  0.8854 0.9049] | 24.0s


  step 25/81 loss 0.0644


  step 50/81 loss 0.0619


  step 75/81 loss 0.0636


Fold 3 Epoch 9/18 | tr_loss 0.0635 | val mAUC 0.88298 | per [0.9713 0.6954 0.9237 0.9415] | 23.9s


  step 25/81 loss 0.0556


  step 50/81 loss 0.0547


  step 75/81 loss 0.0530


Fold 3 Epoch 10/18 | tr_loss 0.0518 | val mAUC 0.90791 | per [0.9797 0.7343 0.9554 0.9622] | 24.0s


  step 25/81 loss 0.0551


  step 50/81 loss 0.0462


  step 75/81 loss 0.0490


Fold 3 Epoch 11/18 | tr_loss 0.0487 | val mAUC 0.92644 | per [0.9862 0.7658 0.9782 0.9755] | 24.0s


  step 25/81 loss 0.0431


  step 50/81 loss 0.0450


  step 75/81 loss 0.0418


Fold 3 Epoch 12/18 | tr_loss 0.0415 | val mAUC 0.93687 | per [0.9892 0.7867 0.9897 0.9818] | 24.0s


  step 25/81 loss 0.0387


  step 50/81 loss 0.0415


  step 75/81 loss 0.0400


Fold 3 Epoch 13/18 | tr_loss 0.0396 | val mAUC 0.94215 | per [0.9915 0.7985 0.993  0.9856] | 24.0s


  step 25/81 loss 0.0364


  step 50/81 loss 0.0318


  step 75/81 loss 0.0348


Fold 3 Epoch 14/18 | tr_loss 0.0379 | val mAUC 0.94692 | per [0.9937 0.8106 0.9942 0.9892] | 24.0s


  step 25/81 loss 0.0416


  step 50/81 loss 0.0394


  step 75/81 loss 0.0395


Fold 3 Epoch 15/18 | tr_loss 0.0389 | val mAUC 0.94973 | per [0.9951 0.818  0.9947 0.9911] | 24.2s


  step 25/81 loss 0.0370


  step 50/81 loss 0.0336


  step 75/81 loss 0.0335


Fold 3 Epoch 16/18 | tr_loss 0.0342 | val mAUC 0.95276 | per [0.9959 0.8281 0.9948 0.9922] | 24.1s


  step 25/81 loss 0.0406


  step 50/81 loss 0.0378


  step 75/81 loss 0.0414


Fold 3 Epoch 17/18 | tr_loss 0.0418 | val mAUC 0.95517 | per [0.9966 0.8361 0.9947 0.9933] | 24.2s


  step 25/81 loss 0.0454


  step 50/81 loss 0.0479


  step 75/81 loss 0.0439


Fold 3 Epoch 18/18 | tr_loss 0.0432 | val mAUC 0.95687 | per [0.997  0.8419 0.9948 0.9938] | 24.0s


Fold 3 best mAUC 0.95687 at epoch 18. Saved model_fold3.pth. Fold time 441.4s



=== Fold 5/5 ===


  Oversample: pos_frac=0.0519, boost=10.00


  step 25/81 loss 0.4356


  step 50/81 loss 0.4060


  step 75/81 loss 0.4033


Fold 4 Epoch 1/18 | tr_loss 0.3995 | val mAUC 0.47685 | per [0.5617 0.513  0.4817 0.351 ] | 24.3s


  step 25/81 loss 0.3636


  step 50/81 loss 0.3254


  step 75/81 loss 0.3084


Fold 4 Epoch 2/18 | tr_loss 0.3006 | val mAUC 0.47461 | per [0.5621 0.5061 0.486  0.3442] | 24.5s


  step 25/81 loss 0.2356


  step 50/81 loss 0.2247


  step 75/81 loss 0.2054


Fold 4 Epoch 3/18 | tr_loss 0.2017 | val mAUC 0.49812 | per [0.5959 0.5131 0.5172 0.3662] | 24.3s


  step 25/81 loss 0.1447


  step 50/81 loss 0.1474


  step 75/81 loss 0.1445


Fold 4 Epoch 4/18 | tr_loss 0.1446 | val mAUC 0.54106 | per [0.6477 0.5324 0.5713 0.4129] | 24.4s


  step 25/81 loss 0.1467


  step 50/81 loss 0.1464


  step 75/81 loss 0.1393


Fold 4 Epoch 5/18 | tr_loss 0.1367 | val mAUC 0.59776 | per [0.7026 0.5642 0.6378 0.4864] | 24.7s


  step 25/81 loss 0.1321


  step 50/81 loss 0.1199


  step 75/81 loss 0.1191


Fold 4 Epoch 6/18 | tr_loss 0.1162 | val mAUC 0.66422 | per [0.7546 0.6219 0.7118 0.5686] | 24.3s


  step 25/81 loss 0.1027


  step 50/81 loss 0.1017


  step 75/81 loss 0.0991


Fold 4 Epoch 7/18 | tr_loss 0.0991 | val mAUC 0.73320 | per [0.8082 0.69   0.7823 0.6523] | 24.3s


  step 25/81 loss 0.1029


  step 50/81 loss 0.0949


  step 75/81 loss 0.0921


Fold 4 Epoch 8/18 | tr_loss 0.0911 | val mAUC 0.80194 | per [0.8599 0.7462 0.854  0.7477] | 24.3s


  step 25/81 loss 0.0944


  step 50/81 loss 0.0935


  step 75/81 loss 0.0898


Fold 4 Epoch 9/18 | tr_loss 0.0878 | val mAUC 0.85859 | per [0.9025 0.7783 0.9195 0.8341] | 24.3s


  step 25/81 loss 0.0678


  step 50/81 loss 0.0779


  step 75/81 loss 0.0735


Fold 4 Epoch 10/18 | tr_loss 0.0737 | val mAUC 0.89617 | per [0.937  0.7967 0.9621 0.8889] | 24.7s


  step 25/81 loss 0.0665


  step 50/81 loss 0.0641


  step 75/81 loss 0.0659


Fold 4 Epoch 11/18 | tr_loss 0.0662 | val mAUC 0.92343 | per [0.9608 0.8218 0.9826 0.9285] | 24.3s


  step 25/81 loss 0.0575


  step 50/81 loss 0.0555


  step 75/81 loss 0.0540


Fold 4 Epoch 12/18 | tr_loss 0.0539 | val mAUC 0.93963 | per [0.9744 0.8409 0.9904 0.9528] | 24.2s


  step 25/81 loss 0.0521


  step 50/81 loss 0.0422


  step 75/81 loss 0.0441


Fold 4 Epoch 13/18 | tr_loss 0.0463 | val mAUC 0.95144 | per [0.9829 0.8627 0.9933 0.9669] | 24.3s


  step 25/81 loss 0.0592


  step 50/81 loss 0.0580


  step 75/81 loss 0.0534


Fold 4 Epoch 14/18 | tr_loss 0.0537 | val mAUC 0.95888 | per [0.988  0.8774 0.9939 0.9762] | 24.7s


  step 25/81 loss 0.0576


  step 50/81 loss 0.0544


  step 75/81 loss 0.0554


Fold 4 Epoch 15/18 | tr_loss 0.0560 | val mAUC 0.96473 | per [0.9915 0.8909 0.9947 0.9819] | 24.3s


  step 25/81 loss 0.0444


  step 50/81 loss 0.0488


  step 75/81 loss 0.0505


Fold 4 Epoch 16/18 | tr_loss 0.0496 | val mAUC 0.96790 | per [0.9935 0.8973 0.9953 0.9855] | 24.3s


  step 25/81 loss 0.0370


  step 50/81 loss 0.0376


  step 75/81 loss 0.0404


Fold 4 Epoch 17/18 | tr_loss 0.0417 | val mAUC 0.96986 | per [0.9946 0.902  0.9956 0.9872] | 24.3s


  step 25/81 loss 0.0383


  step 50/81 loss 0.0384


  step 75/81 loss 0.0384


Fold 4 Epoch 18/18 | tr_loss 0.0387 | val mAUC 0.97173 | per [0.9955 0.9073 0.9957 0.9884] | 24.4s


Fold 4 best mAUC 0.97173 at epoch 18. Saved model_fold4.pth. Fold time 445.6s


OOF mAUC: 0.97510; per-class: [0.99578 0.91753 0.99608 0.991  ]


CV finished in 2262.1s
Found checkpoints: ['model_fold0.pth', 'model_fold1.pth', 'model_fold2.pth', 'model_fold3.pth', 'model_fold4.pth']
Predicting test with 4-way TTA (orig, hflip, vflip, hvflip) and fold-ensemble...


Saved submission.csv with shape: (183, 5)


Unnamed: 0,image_id,healthy,multiple_diseases,rust,scab
0,Test_0,0.01096,0.212108,0.012376,0.986899
1,Test_1,0.220426,0.269574,0.010789,0.879246
2,Test_2,0.081844,0.15579,0.950597,0.067517
3,Test_3,0.043459,0.07136,0.994454,0.02301
4,Test_4,0.008794,0.041318,0.999413,0.010087
