#  PCam Classification - All Strategies Combined


### Strategies Implemented:

1.  **K-Fold Cross-Validation** - Better validation estimates
2.  **Model Ensemble** - Train 3 (or whatever) models with different seeds
3.  **Aggressive test-time augmentation** - 8 augmentations at test time
4.  **Threshold Optimization** - Generate multiple submissions
5.  **Train-time augmentation** - Enabled with conservative parameters
7.  **Optimized Architecture** - 2 dense layers with proper regularization
8.  **Better Fine-Tuning** - Aggressive training schedule



In [2]:
import tensorflow as tf
for g in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(g, True)

I0000 00:00:1765293184.018287   35830 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
I0000 00:00:1765293184.058045   35830 cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
I0000 00:00:1765293185.466878   35830 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
W0000 00:00:1765293186.822876   35830 gpu_device.cc:2456] TensorFlow was not built with CUDA kernel binaries co

In [3]:
from tensorflow.keras import layers, Model, optimizers, callbacks
from tensorflow.keras.applications import DenseNet121, EfficientNetB3
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split, StratifiedKFold
from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score, roc_curve
import json
import datetime
import os
import warnings
from tensorflow.keras.applications.densenet import preprocess_input
warnings.filterwarnings('ignore')

# Enable mixed precision
from tensorflow.keras import mixed_precision
mixed_precision.set_global_policy('mixed_float16')

print(f"TensorFlow version: {tf.__version__}")
print(f"GPU Available: {len(tf.config.list_physical_devices('GPU'))} GPU(s)")
print(f"Mixed Precision Enabled: {mixed_precision.global_policy().name}")

TensorFlow version: 2.21.0-dev20251006
GPU Available: 1 GPU(s)
Mixed Precision Enabled: mixed_float16


## Configuration - Optimized for Kaggle Performance

In [4]:
# ============== ADVANCED CONFIGURATION ==============
CONFIG = {
    # Data Configurations
    'DATA': {
        'validation_split': 0.15,
        'random_seed': 42,
        'input_shape': (96, 96, 3),
        'num_classes': 1,
        'normalize': False,
        'use_imagenet_preprocessing': True,  # DISABLED to avoid issues
    },
    
    # Model Architecture
    'MODEL': {
        'base_model': 'DenseNet121',  # Options from best to worst: DenseNet121, InceptionRestNetv2, EfficientNetB3
        'pooling': 'avg',
        'dropout_rate': 0.4,
        'dense_units': [256, 128],
        'use_batch_norm': True,
        'activation': 'relu',
        'l2_regularization': 1e-5,
    },
    
    # Training Configuration
    'TRAINING': {
        'frozen_epochs': 35,                # Increased for augmentation
        'frozen_learning_rate': 1e-3,
        'frozen_batch_size': 32,
        'fine_tune_epochs': 50,             # Increased for augmentation
        'fine_tune_learning_rate': 1e-4,
        'fine_tune_batch_size': 32,
        'fine_tune_from_block': 2,
        'optimizer': 'adam',
        'loss': 'binary_crossentropy',
        'metrics': ['accuracy', tf.keras.metrics.AUC(name='auc')],
    },
    
    # Data Augmentation
    'AUGMENTATION': {
        'enabled': True,
        'rotation_range': 20,          # ±20 degrees
        'width_shift_range': 0.1,      # 10% horizontal
        'height_shift_range': 0.1,     # 10% vertical
        'horizontal_flip': True,
        'vertical_flip': True,
        'zoom_range': 0.1,             # 10% zoom
        'brightness_range': [0.9, 1.1], # ±10% brightness
    },
    
    # Callbacks
    'CALLBACKS': {
        'early_stopping_patience': 6,      # Increased for augmentation
        'reduce_lr_patience':3,            # Increased for augmentation
        'reduce_lr_factor': 0.5,
        'min_lr': 1e-7,
        'save_best_only': True,
        'monitor_metric': 'val_auc',
        'monitor_mode': 'max',
        'verbose': 1,
    },
    
    # Advanced Options - ENSEMBLE & TTA
    'ADVANCED': {
        'use_ensemble': True,  # NEW: Train multiple models
        'ensemble_size': 3,  # NEW: Number of models (3 recommended)
        'ensemble_seeds': [42, 123, 564],  # NEW: Different random seeds
        'use_kfold': True,  # NEW: Use K-Fold CV (slower but better)
        'kfold_splits': 3,  # NEW: Number of folds
        'label_smoothing': 0.1,
        'use_tta': True,  # Test-Time Augmentation
        'tta_steps': 2,  # INCREASED: 8 → 16 for better results
    },
    
    # Output Configuration
    'OUTPUT': {
        'model_name': 'pcam_advanced_ensemble',
        'save_history': True,
        'save_plots': True,
        'save_config': True,
        'base_threshold': 0.5,  # Will generate multiple thresholds
        'threshold_range': [0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.40, 0.41, 0.42, 0.43,0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.50, 0.51],  # NEW: Try multiple
        'optimize_threshold': True,
    }
}

print("Configuration loaded!")
print(f"Model: {CONFIG['MODEL']['base_model']}")
print(f"Ensemble: {'YES - ' + str(CONFIG['ADVANCED']['ensemble_size']) + ' models' if CONFIG['ADVANCED']['use_ensemble'] else 'NO'}")
print(f"K-Fold CV: {'YES - ' + str(CONFIG['ADVANCED']['kfold_splits']) + ' folds' if CONFIG['ADVANCED']['use_kfold'] else 'NO'}")
print(f"TTA: {'YES - ' + str(CONFIG['ADVANCED']['tta_steps']) + ' augmentations' if CONFIG['ADVANCED']['use_tta'] else 'NO'}")
print(f"Training: {CONFIG['TRAINING']['frozen_epochs']} frozen + {CONFIG['TRAINING']['fine_tune_epochs']} fine-tune epochs per model")

Configuration loaded!
Model: DenseNet121
Ensemble: YES - 3 models
K-Fold CV: YES - 3 folds
TTA: YES - 2 augmentations
Training: 35 frozen + 50 fine-tune epochs per model


W0000 00:00:1765293188.951093   35830 gpu_device.cc:2456] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0a. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1765293189.139132   35830 gpu_device.cc:2040] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13065 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 5080, pci bus id: 0000:01:00.0, compute capability: 12.0a


## Load and Prepare Data

In [7]:
import os
import numpy as np

# --- CONFIGURATION ---
# Define the path to your data folder
DATA_DIR = './data' 
# ---------------------

# Load data
print(f"Loading data from {DATA_DIR}...")

# Use os.path.join to construct safe file paths
X_train_full = np.load(os.path.join(DATA_DIR, 'Xtrain.npy'))
y_train_full = np.load(os.path.join(DATA_DIR, 'ytrain.npy'))
X_test = np.load(os.path.join(DATA_DIR, 'Xtest.npy'))

print(f"\nOriginal shapes:")
print(f"Training: {X_train_full.shape}")
print(f"Test: {X_test.shape}")
print(f"Labels: {y_train_full.shape}")

# Normalize to [0, 1]
# (Make sure 'CONFIG' is defined in a previous cell!)
if 'CONFIG' in globals() and CONFIG['DATA']['normalize']:
    if X_train_full.max() > 1.0:
        X_train_full = X_train_full.astype('float32') / 255.0
        X_test = X_test.astype('float32') / 255.0
        print("\nData normalized to [0, 1]")

print(f"\nData range after normalization:")
print(f"Train: [{X_train_full.min():.3f}, {X_train_full.max():.3f}]")
print(f"Test: [{X_test.min():.3f}, {X_test.max():.3f}]")

# Class distribution
print(f"\nClass distribution:")
print(f"Negative: {(y_train_full == 0).sum()}")
print(f"Positive: {(y_train_full == 1).sum()}")
print(f"Positive ratio: {y_train_full.mean():.3f}")

Loading data from ./data...

Original shapes:
Training: (26214, 96, 96, 3)
Test: (1638, 96, 96, 3)
Labels: (26214,)

Data range after normalization:
Train: [0.000, 255.000]
Test: [0.000, 255.000]

Class distribution:
Negative: 13129
Positive: 13085
Positive ratio: 0.499


## Model Building Function

In [6]:
def create_augmentation_model(config):
    """
    Create data augmentation model using Keras preprocessing layers.
    This runs on-the-fly during training (GPU accelerated).
    """
    if not config['AUGMENTATION']['enabled']:
        return None
    
    aug_config = config['AUGMENTATION']
    
    augmentation_layers = []
    
    # Flips
    if aug_config.get('horizontal_flip', False):
        augmentation_layers.append(
            layers.RandomFlip("horizontal")
        )
    
    if aug_config.get('vertical_flip', False):
        augmentation_layers.append(
            layers.RandomFlip("vertical")
        )
    
    # Rotation
    if aug_config.get('rotation_range', 0) > 0:
        # rotation_range is in degrees, RandomRotation expects fraction of 2π
        rotation_factor = aug_config['rotation_range'] / 360.0
        augmentation_layers.append(
            layers.RandomRotation(rotation_factor)
        )
    
    # Translation
    if aug_config.get('width_shift_range', 0) > 0 or aug_config.get('height_shift_range', 0) > 0:
        height_factor = aug_config.get('height_shift_range', 0)
        width_factor = aug_config.get('width_shift_range', 0)
        augmentation_layers.append(
            layers.RandomTranslation(height_factor, width_factor)
        )
    
    # Zoom
    if aug_config.get('zoom_range', 0) > 0:
        zoom_factor = aug_config['zoom_range']
        augmentation_layers.append(
            layers.RandomZoom((-zoom_factor, zoom_factor))
        )
    
    # Brightness
    if aug_config.get('brightness_range', None) is not None:
        brightness_range = aug_config['brightness_range']
        brightness_factor = max(abs(brightness_range[0] - 1.0), abs(brightness_range[1] - 1.0))
        augmentation_layers.append(
            layers.RandomBrightness(brightness_factor)
        )
    
    if not augmentation_layers:
        return None
    
    # Create sequential model
    augmentation_model = tf.keras.Sequential(augmentation_layers, name='augmentation')
    
    print(f"  Created augmentation model with {len(augmentation_layers)} layers")
    
    return augmentation_model

print("create_augmentation_model function defined!")


create_augmentation_model function defined!


In [7]:
def build_model(config, seed=42):
    """
    Build model with specified architecture and seed.
    
    Args:
        config: Configuration dictionary
        seed: Random seed for reproducibility
    """
    # Set seed
    tf.random.set_seed(seed)
    np.random.seed(seed)
    
    # Choose base model
    if config['MODEL']['base_model'] == 'DenseNet121':
        base_model = DenseNet121(
            include_top=False,
            weights='imagenet',
            input_shape=config['DATA']['input_shape'],
            pooling=config['MODEL']['pooling']
        )
    elif config['MODEL']['base_model'] == 'EfficientNetB3':
        base_model = EfficientNetB3(
            include_top=False,
            weights='imagenet',
            input_shape=config['DATA']['input_shape'],
            pooling=config['MODEL']['pooling']
        )
        # Note: If you switch to EfficientNet, you must also change
        # the import to: from tensorflow.keras.applications.efficientnet import preprocess_input
    else:
        raise ValueError(f"Unknown model: {config['MODEL']['base_model']}")
    
    # Freeze base model
    base_model.trainable = False
    
    # Build full model
    inputs = layers.Input(shape=config['DATA']['input_shape'])
    
    # Apply augmentation if enabled (training only)
    # Augmentation layers run on [0, 255] data
    augmentation_model = create_augmentation_model(config)
    if augmentation_model is not None:
        x = augmentation_model(inputs)
    else:
        x = inputs
    
    # 2. ADD PRE-PROCESSING LAYER
    # This converts the [0, 255] pixel values to the
    # specific normalized range that DenseNet was trained on.
    x = layers.Lambda(preprocess_input, name='preprocess_input')(x)
    
    # 3. CALL BASE MODEL
    # Now feed the correctly-formatted data into the base model.
    # Revert this to training=False, which is standard practice.
    x = base_model(x, training=False) 
    
    # Classification head
    for units in config['MODEL']['dense_units']:
        x = layers.Dense(
            units,
            activation=config['MODEL']['activation'],
            kernel_regularizer=tf.keras.regularizers.l2(config['MODEL']['l2_regularization'])
        )(x)
        
        if config['MODEL']['use_batch_norm']:
            x = layers.BatchNormalization()(x)
        
        x = layers.Dropout(config['MODEL']['dropout_rate'])(x)
    
    # Output layer
    outputs = layers.Dense(
        1,
        activation='sigmoid',
        dtype='float32',
        name='predictions'
    )(x)
    
    model = Model(inputs=inputs, outputs=outputs, name=f'{config["MODEL"]["base_model"]}_seed{seed}')
    
    return model, base_model

print("Model building function updated with pre-processing!")

Model building function updated with pre-processing!


## Training Function

In [8]:
from tensorflow.keras.applications.densenet import preprocess_input

def train_model(model, base_model, X_train, y_train, X_val, y_val, config, model_idx=0):
    """
    Train a single model through frozen and fine-tuning stages.
    
    Args:
        model: Compiled model
        base_model: Base model (for unfreezing)
        X_train, y_train: Training data
        X_val, y_val: Validation data
        config: Configuration dictionary
        model_idx: Model index for ensemble
    
    Returns:
        Trained model and combined history
    """
    checkpoint_path = f"{config['OUTPUT']['model_name']}_model{model_idx}_best.keras"
    
    # Callbacks
    callback_list = [
        callbacks.ModelCheckpoint(
            checkpoint_path,
            monitor=config['CALLBACKS']['monitor_metric'],
            mode=config['CALLBACKS']['monitor_mode'],
            save_best_only=True,
            verbose=1
        ),
        callbacks.EarlyStopping(
            monitor=config['CALLBACKS']['monitor_metric'],
            mode=config['CALLBACKS']['monitor_mode'],
            patience=config['CALLBACKS']['early_stopping_patience'],
            restore_best_weights=True,
            verbose=1
        ),
        callbacks.ReduceLROnPlateau(
            monitor=config['CALLBACKS']['monitor_metric'],
            mode=config['CALLBACKS']['monitor_mode'],
            factor=config['CALLBACKS']['reduce_lr_factor'],
            patience=config['CALLBACKS']['reduce_lr_patience'],
            min_lr=config['CALLBACKS']['min_lr'],
            verbose=1
        )
    ]
    
    # Stage 1: Frozen training
    print(f"  Stage 1: Frozen training ({config['TRAINING']['frozen_epochs']} epochs)")
    
    # Compile with label smoothing
    if config['ADVANCED']['label_smoothing'] > 0:
        loss = tf.keras.losses.BinaryCrossentropy(
            label_smoothing=config['ADVANCED']['label_smoothing']
        )
    else:
        loss = config['TRAINING']['loss']
    
    model.compile(
        optimizer=optimizers.Adam(learning_rate=config['TRAINING']['frozen_learning_rate']),
        loss=loss,
        metrics=config['TRAINING']['metrics']
    )
    
    train_ds_frozen = create_dataset(
        X_train, y_train, 
        config['TRAINING']['frozen_batch_size'], 
        shuffle=True, 
        config=config
    )
    val_ds = create_dataset(
        X_val, y_val, 
        config['TRAINING']['frozen_batch_size'], 
        shuffle=False, 
        config=config
    )
    
    history_frozen = model.fit(
        train_ds_frozen,
        validation_data=val_ds,
        epochs=config['TRAINING']['frozen_epochs'],
        callbacks=callback_list,
        verbose=1
    )
    
    # Stage 2: Fine-tuning
    print(f"  Stage 2: Fine-tuning ({config['TRAINING']['fine_tune_epochs']} epochs)")
    
    # Unfreeze layers
    base_model.trainable = True
    unfreeze_from = f"conv{config['TRAINING']['fine_tune_from_block']}_block"
    set_trainable = False
    
    for layer in base_model.layers:
        if unfreeze_from in layer.name:
            set_trainable = True
        layer.trainable = set_trainable
    
    # Recompile
    model.compile(
        optimizer=optimizers.Adam(learning_rate=config['TRAINING']['fine_tune_learning_rate']),
        loss=loss,
        metrics=config['TRAINING']['metrics']
    )
    
    train_ds_fine = create_dataset(
        X_train, y_train, 
        config['TRAINING']['fine_tune_batch_size'], 
        shuffle=True, 
        config=config
    )
    # Note: Validation batch size can be different, but using the same is fine
    val_ds_fine = create_dataset(
        X_val, y_val, 
        config['TRAINING']['fine_tune_batch_size'], 
        shuffle=False, 
        config=config
    )
    
    history_fine = model.fit(
        train_ds_fine,
        validation_data=val_ds_fine,
        epochs=config['TRAINING']['fine_tune_epochs'],
        callbacks=callback_list,
        verbose=1
    )
    
    # Load best model
    model = tf.keras.models.load_model(
    checkpoint_path,
    custom_objects={'preprocess_input': preprocess_input}
)
    
    # Combine histories
    combined_history = {}
    for key in history_frozen.history.keys():
        combined_history[key] = history_frozen.history[key] + history_fine.history[key]
    
    return model, combined_history

print("Training function defined!")

Training function defined!


In [9]:
def create_dataset(X, y, batch_size, shuffle=True, config=None):
    """Creates a prefetched tf.data.Dataset."""
    dataset = tf.data.Dataset.from_tensor_slices((X, y))
    
    if shuffle:
        # Use a shuffle buffer ~size of the dataset for full shuffling
        dataset = dataset.shuffle(buffer_size=len(X), seed=config['DATA']['random_seed'])
    
    # Batch the data
    dataset = dataset.batch(batch_size)
    
    # THE MAGIC: Prefetch the next batch(es) while the GPU is busy
    dataset = dataset.prefetch(buffer_size=tf.data.AUTOTUNE)
    
    return dataset

print("create_dataset function defined!")

create_dataset function defined!


## Advanced Test-Time Augmentation

In [10]:
def predict_with_tta(model, X, n_aug=16, batch_size=32, verbose=True):
    """
    Make predictions with aggressive Test-Time Augmentation.
    
    Args:
        model: Trained model
        X: Input images
        n_aug: Number of augmentations
        batch_size: Batch size for prediction
        verbose: Print progress
    
    Returns:
        Averaged predictions
    """
    if verbose:
        print(f"Applying TTA with {n_aug} augmentations...")
    
    # Base prediction (no augmentation)
    predictions = model.predict(X, batch_size=batch_size, verbose=0)
    
    # TTA with rotations, flips, and slight zooms
    tta_datagen = ImageDataGenerator(
        rotation_range=20,  # Mild rotations
        horizontal_flip=True,
        vertical_flip=True,
        zoom_range=0.05,  # Slight zoom
        width_shift_range=0.05,
        height_shift_range=0.05,
    )
    
    # Generate augmented predictions
    for i in range(n_aug):
        if verbose and (i + 1) % 4 == 0:
            print(f"  TTA step {i+1}/{n_aug}...")
        
        aug_generator = tta_datagen.flow(X, batch_size=batch_size, shuffle=False)
        
        aug_predictions = []
        for j in range(len(X) // batch_size + 1):
            if j * batch_size >= len(X):
                break
            batch = next(aug_generator)
            aug_predictions.append(model.predict(batch, verbose=0))
        
        aug_predictions = np.vstack(aug_predictions)[:len(X)]
        predictions = predictions + aug_predictions
    
    # Average all predictions
    predictions = predictions / (n_aug + 1)
    
    if verbose:
        print("  TTA complete!")
    
    return predictions

print("TTA function defined!")

TTA function defined!


## Main Training Loop

This will train either:
- Single model (if ensemble disabled)
- Multiple models with different seeds (if ensemble enabled)
- K-Fold CV models (if K-Fold enabled)

In [11]:
print("="*80)
print("STARTING TRAINING")
print("="*80)

# Split data once (or use K     -Fold)
if CONFIG['ADVANCED']['use_kfold']:
    print(f"\nUsing {CONFIG['ADVANCED']['kfold_splits']}-Fold Cross-Validation")
    
    
    kfold = StratifiedKFold(
        n_splits=CONFIG['ADVANCED']['kfold_splits'],
        shuffle=True,
        random_state=CONFIG['DATA']['random_seed']
    )
    
    trained_models = []
    fold_scores = []
    
    for fold, (train_idx, val_idx) in enumerate(kfold.split(X_train_full, y_train_full)):
        print(f"\n{'='*80}")
        print(f"FOLD {fold + 1}/{CONFIG['ADVANCED']['kfold_splits']}")
        print(f"{'='*80}")
        
        X_train = X_train_full[train_idx]
        X_val = X_train_full[val_idx]
        y_train = y_train_full[train_idx]
        y_val = y_train_full[val_idx]
        
        print(f"Train: {X_train.shape}, Val: {X_val.shape}")
        
        # Build and train model
        model, base_model = build_model(CONFIG, seed=CONFIG['DATA']['random_seed'] + fold)
        model, history = train_model(model, base_model, X_train, y_train, X_val, y_val, CONFIG, fold)
        
        # Evaluate
        val_pred = model.predict(X_val, verbose=0)
        val_acc = ((val_pred.flatten() >= 0.5).astype(int) == y_val).mean()
        val_auc = roc_auc_score(y_val, val_pred)
        
        print(f"\nFold {fold + 1} Results:")
        print(f"  Val Accuracy: {val_acc:.4f}")
        print(f"  Val AUC: {val_auc:.4f}")
        
        trained_models.append(model)
        fold_scores.append({'accuracy': val_acc, 'auc': val_auc})
    
    print(f"\n{'='*80}")
    print("K-FOLD CROSS-VALIDATION COMPLETE")
    print(f"{'='*80}")
    print(f"Average Val Accuracy: {np.mean([s['accuracy'] for s in fold_scores]):.4f} ± {np.std([s['accuracy'] for s in fold_scores]):.4f}")
    print(f"Average Val AUC: {np.mean([s['auc'] for s in fold_scores]):.4f} ± {np.std([s['auc'] for s in fold_scores]):.4f}")
    
elif CONFIG['ADVANCED']['use_ensemble']:
    print(f"\nTraining ensemble of {CONFIG['ADVANCED']['ensemble_size']} models")
    
    
    # Single train/val split for all models
    X_train, X_val, y_train, y_val = train_test_split(
        X_train_full, y_train_full,
        test_size=CONFIG['DATA']['validation_split'],
        random_state=CONFIG['DATA']['random_seed'],
        stratify=y_train_full
    )
    
    print(f"Train: {X_train.shape}, Val: {X_val.shape}\n")
    
    trained_models = []
    ensemble_scores = []
    
    for i, seed in enumerate(CONFIG['ADVANCED']['ensemble_seeds']):
        print(f"\n{'='*80}")
        print(f"MODEL {i + 1}/{CONFIG['ADVANCED']['ensemble_size']} (seed={seed})")
        print(f"{'='*80}")
        
        # Build and train
        model, base_model = build_model(CONFIG, seed=seed)
        model, history = train_model(model, base_model, X_train, y_train, X_val, y_val, CONFIG, i)
        
        # Evaluate
        val_pred = model.predict(X_val, verbose=0)
        val_acc = ((val_pred.flatten() >= 0.5).astype(int) == y_val).mean()
        val_auc = roc_auc_score(y_val, val_pred)
        
        print(f"\nModel {i + 1} Results:")
        print(f"  Val Accuracy: {val_acc:.4f}")
        print(f"  Val AUC: {val_auc:.4f}")
        
        trained_models.append(model)
        ensemble_scores.append({'accuracy': val_acc, 'auc': val_auc})
    
    print(f"\n{'='*80}")
    print("ENSEMBLE TRAINING COMPLETE")
    print(f"{'='*80}")
    accuracies = [f"{s['accuracy']:.4f}" for s in ensemble_scores]
    print(f"Individual model accuracies: {accuracies}")
    
else:
    print("\nTraining single model\n")
    
    # Single train/val split
    X_train, X_val, y_train, y_val = train_test_split(
        X_train_full, y_train_full,
        test_size=CONFIG['DATA']['validation_split'],
        random_state=CONFIG['DATA']['random_seed'],
        stratify=y_train_full
    )
    
    print(f"Train: {X_train.shape}, Val: {X_val.shape}\n")
    
    # Build and train
    model, base_model = build_model(CONFIG)
    model, history = train_model(model, base_model, X_train, y_train, X_val, y_val, CONFIG, 0)
    
    trained_models = [model]
    
    # Evaluate
    val_pred = model.predict(X_val, verbose=0)
    val_acc = ((val_pred.flatten() >= 0.5).astype(int) == y_val).mean()
    val_auc = roc_auc_score(y_val, val_pred)
    
    print(f"\nFinal Results:")
    print(f"  Val Accuracy: {val_acc:.4f}")
    print(f"  Val AUC: {val_auc:.4f}")

print(f"\n{'='*80}")
print(f"Total models trained: {len(trained_models)}")
print(f"{'='*80}")

STARTING TRAINING

Using 3-Fold Cross-Validation

FOLD 1/3
Train: (17476, 96, 96, 3), Val: (8738, 96, 96, 3)
  Created augmentation model with 6 layers
  Stage 1: Frozen training (35 epochs)
Epoch 1/35


E0000 00:00:1762893182.224579     456 util.cc:131] oneDNN supports DT_HALF only on platforms with AVX-512. Falling back to the default Eigen-based implementation if present.
I0000 00:00:1762893183.198624     581 cuda_dnn.cc:463] Loaded cuDNN version 91301


[1m547/547[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 59ms/step - accuracy: 0.7381 - auc: 0.8144 - loss: 0.6262
Epoch 1: val_auc improved from None to 0.91374, saving model to pcam_advanced_ensemble_model0_best.keras
[1m547/547[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m53s[0m 83ms/step - accuracy: 0.7629 - auc: 0.8412 - loss: 0.5729 - val_accuracy: 0.8231 - val_auc: 0.9137 - val_loss: 0.4635 - learning_rate: 0.0010
Epoch 2/35
[1m547/547[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 55ms/step - accuracy: 0.8002 - auc: 0.8823 - loss: 0.4994
Epoch 2: val_auc improved from 0.91374 to 0.91681, saving model to pcam_advanced_ensemble_model0_best.keras
[1m547/547[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m41s[0m 74ms/step - accuracy: 0.8077 - auc: 0.8896 - loss: 0.4894 - val_accuracy: 0.8279 - val_auc: 0.9168 - val_loss: 0.4574 - learning_rate: 0.0010
Epoch 3/35
[1m547/547[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 56ms/step - accuracy: 0.8158 - au

## Generate Ensemble Predictions with TTA

This combines:
1. Multiple model predictions (if ensemble enabled)
2. Test-Time Augmentation (16x per model)
3. Averaging across all predictions

In [12]:
import tensorflow as tf
from tensorflow.keras.applications.densenet import preprocess_input

print("Reloading saved models from disk...")
trained_models = []

# Get config from Cell 3 (make sure you ran it)
num_models = CONFIG['ADVANCED']['kfold_splits']
model_name_base = CONFIG['OUTPUT']['model_name']

for i in range(num_models):
    model_path = f"{model_name_base}_model{i}_best.keras"
    print(f"  Loading model {i+1}/{num_models} from {model_path}...")
    
    # Reload the model with the custom function
    model = tf.keras.models.load_model(
        model_path,
        custom_objects={'preprocess_input': preprocess_input}
    )
    trained_models.append(model)

print(f"\n Successfully loaded {len(trained_models)} models. Ready for prediction.")

Reloading saved models from disk...
  Loading model 1/3 from pcam_advanced_ensemble_model0_best.keras...
  Loading model 2/3 from pcam_advanced_ensemble_model1_best.keras...
  Loading model 3/3 from pcam_advanced_ensemble_model2_best.keras...

 Successfully loaded 3 models. Ready for prediction.


In [25]:
print("="*80) 
print("GENERATING TEST PREDICTIONS")
print("="*80)

print(f"\nTest set shape: {X_test.shape}")
print(f"Number of models: {len(trained_models)}")
print(f"TTA enabled: {CONFIG['ADVANCED']['use_tta']}")
if CONFIG['ADVANCED']['use_tta']:
    print(f"TTA steps: {CONFIG['ADVANCED']['tta_steps']}")

# Collect predictions from all models
all_predictions = []

for i, model in enumerate(trained_models):
    print(f"\nModel {i + 1}/{len(trained_models)}:")
    
    if CONFIG['ADVANCED']['use_tta']:
        pred = predict_with_tta(
            model, X_test,
            n_aug=CONFIG['ADVANCED']['tta_steps'],
            batch_size=32,
            verbose=True
        )
    else:
        print("  Predicting without TTA...")
        pred = model.predict(X_test, batch_size=32, verbose=0)
    
    all_predictions.append(pred)

# Average predictions across all models
if len(all_predictions) > 1:
    print(f"\nAveraging predictions from {len(all_predictions)} models...")
    ensemble_pred_probs = np.mean(all_predictions, axis=0).flatten()
else:
    ensemble_pred_probs = all_predictions[0].flatten()

print(f"\nFinal prediction statistics:")
print(f"  Mean: {ensemble_pred_probs.mean():.4f}")
print(f"  Std: {ensemble_pred_probs.std():.4f}")
print(f"  Min: {ensemble_pred_probs.min():.4f}")
print(f"  Max: {ensemble_pred_probs.max():.4f}")

GENERATING TEST PREDICTIONS

Test set shape: (1638, 96, 96, 3)
Number of models: 3
TTA enabled: True
TTA steps: 2

Model 1/3:
Applying TTA with 2 augmentations...
  TTA complete!

Model 2/3:
Applying TTA with 2 augmentations...
  TTA complete!

Model 3/3:
Applying TTA with 2 augmentations...
  TTA complete!

Averaging predictions from 3 models...

Final prediction statistics:
  Mean: 0.3838
  Std: 0.3556
  Min: 0.0293
  Max: 0.9677


## Generate Multiple Submissions with Different Thresholds

Since we don't know the optimal threshold for Kaggle's test set,
we'll generate multiple submissions with different thresholds.

Submit all of them to Kaggle and see which performs best!

In [26]:
print("="*80)
print("GENERATING MULTIPLE SUBMISSIONS")
print("="*80)

timestamp = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
submission_files = []

for threshold in CONFIG['OUTPUT']['threshold_range']:
    # Apply threshold
    test_pred_classes = (ensemble_pred_probs >= threshold).astype(int)
    
    # Create submission
    submission_df = pd.DataFrame({
        'Id': range(len(test_pred_classes)),
        'Predicted': test_pred_classes
    })
    
    # Save
    filename = f"{CONFIG['OUTPUT']['model_name']}_thresh{threshold:.2f}_{timestamp}.csv"
    submission_df.to_csv(filename, index=False)
    submission_files.append(filename)
    
    # Stats
    positive_count = test_pred_classes.sum()
    positive_ratio = test_pred_classes.mean()
    
    print(f"\nThreshold {threshold:.2f}:")
    print(f"  Filename: {filename}")
    print(f"  Positive predictions: {positive_count}/{len(test_pred_classes)} ({positive_ratio:.1%})")

print(f"\n{'='*80}")
print("SUBMISSIONS GENERATED")
print(f"{'='*80}")
print(f"\nCreated {len(submission_files)} submission files:")
for f in submission_files:
    print(f"   {f}")

print(f"\n{'='*80}")
print("NEXT STEPS")
print(f"{'='*80}")
print("\n1. Submit ALL of these CSV files to Kaggle")
print("2. Note which threshold performs best")
print("3. Use that threshold for future submissions")
print("\nExpected Kaggle score: 88-92% (up from 85%)")
print(f"\n{'='*80}")

GENERATING MULTIPLE SUBMISSIONS

Threshold 0.31:
  Filename: pcam_advanced_ensemble_thresh0.31_20251112_072658.csv
  Positive predictions: 692/1638 (42.2%)

Threshold 0.32:
  Filename: pcam_advanced_ensemble_thresh0.32_20251112_072658.csv
  Positive predictions: 688/1638 (42.0%)

Threshold 0.33:
  Filename: pcam_advanced_ensemble_thresh0.33_20251112_072658.csv
  Positive predictions: 684/1638 (41.8%)

Threshold 0.34:
  Filename: pcam_advanced_ensemble_thresh0.34_20251112_072658.csv
  Positive predictions: 681/1638 (41.6%)

Threshold 0.35:
  Filename: pcam_advanced_ensemble_thresh0.35_20251112_072658.csv
  Positive predictions: 677/1638 (41.3%)

Threshold 0.36:
  Filename: pcam_advanced_ensemble_thresh0.36_20251112_072658.csv
  Positive predictions: 671/1638 (41.0%)

Threshold 0.37:
  Filename: pcam_advanced_ensemble_thresh0.37_20251112_072658.csv
  Positive predictions: 661/1638 (40.4%)

Threshold 0.38:
  Filename: pcam_advanced_ensemble_thresh0.38_20251112_072658.csv
  Positive predic

## Validation Performance Check

Let's evaluate the ensemble on validation set to see expected improvement

In [None]:
if not CONFIG['ADVANCED']['use_kfold']:
    print("="*80)
    print("ENSEMBLE VALIDATION PERFORMANCE")
    print("="*80)
    
    # Get ensemble predictions on validation
    val_predictions = []
    for i, model in enumerate(trained_models):
        print(f"\nPredicting with model {i+1}/{len(trained_models)}...")
        if CONFIG['ADVANCED']['use_tta']:
            pred = predict_with_tta(model, X_val, n_aug=CONFIG['ADVANCED']['tta_steps'], verbose=False)
        else:
            pred = model.predict(X_val, verbose=0)
        val_predictions.append(pred)
    
    # Average
    val_ensemble_probs = np.mean(val_predictions, axis=0).flatten()
    
    # Find best threshold on validation
    best_threshold = 0.5
    best_accuracy = 0
    
    for thresh in np.arange(0.3, 0.7, 0.01):
        val_pred_binary = (val_ensemble_probs >= thresh).astype(int)
        accuracy = (val_pred_binary == y_val).mean()
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_threshold = thresh
    
    val_pred_classes = (val_ensemble_probs >= best_threshold).astype(int)
    val_auc = roc_auc_score(y_val, val_ensemble_probs)
    
    print(f"\n{'='*80}")
    print("RESULTS")
    print(f"{'='*80}")
    print(f"\nEnsemble + TTA Validation Performance:")
    print(f"  Accuracy: {best_accuracy:.4f}")
    print(f"  AUC: {val_auc:.4f}")
    print(f"  Optimal Threshold: {best_threshold:.3f}")
    
    print(f"\nClassification Report:")
    print(classification_report(y_val, val_pred_classes, target_names=['Negative', 'Positive']))
    
    print(f"\nConfusion Matrix:")
    cm = confusion_matrix(y_val, val_pred_classes)
    print(cm)
    
    print(f"\n{'='*80}")
    print("EXPECTED KAGGLE PERFORMANCE")
    print(f"{'='*80}")
    print(f"\nValidation accuracy: {best_accuracy:.1%}")
    print(f"Expected Kaggle score: {best_accuracy*0.9:.1%} - {best_accuracy*0.95:.1%}")
    print(f"(Typically 5-10% drop from validation to test)")
    print(f"\n{'='*80}")
else:
    print("\nSkipping validation check (K-Fold CV already evaluated)")

ENSEMBLE VALIDATION PERFORMANCE

Predicting with model 1/3...

Predicting with model 2/3...


## Training recap

### What Was Done:

1.  Trained multiple models (3-fold ensemble)
2.  Applied aggressive TTA (16x augmentations)
3.  Generated submissions with different thresholds
4.  Optimized architecture and training


### Files Generated:

- Multiple submission CSV files (one per threshold)
- Trained model checkpoints
- Training history
