# 🍕 Universal Food Classification Training
## Experiments: Inception v3 & Inception-ResNet-v2 (Head & Full Backbone)

**Select experiment by changing `EXPERIMENT_ID`:**
- 1: Inception v3 (Head only)
- 2: Inception-ResNet-v2 (Head only) 
- 3: Inception v3 (Full backbone)
- 4: Inception-ResNet-v2 (Full backbone)
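
The difference between "Head only" and "Full backbone" is which parameters receive gradient updates. The following is a minimal, framework-free sketch of that selection rule. The head attribute names (`fc`, `classif`) are the ones Cell 2 uses; the helper `trainable_parameter_names` and the shortened parameter list are hypothetical and exist only for illustration.

```python
# Classifier-head attribute names, as used in Cell 2 for the two timm models.
HEAD_ATTR = {"inception_v3": "fc", "inception_resnet_v2": "classif"}


def trainable_parameter_names(param_names, model_type, train_mode):
    """Return the parameter names that remain trainable for a training mode."""
    if train_mode == "full":
        return list(param_names)  # nothing is frozen
    if train_mode == "head":
        prefix = HEAD_ATTR[model_type] + "."
        return [name for name in param_names if name.startswith(prefix)]
    raise ValueError(f"Unknown train_mode: {train_mode!r}")


# Hypothetical, heavily shortened parameter list (illustration only).
example_names = [
    "conv2d_1a.conv.weight",
    "mixed_5b.branch0.conv.weight",
    "classif.weight",
    "classif.bias",
]

# Head-only mode keeps just the classifier parameters trainable.
print(trainable_parameter_names(example_names, "inception_resnet_v2", "head"))
# Full mode keeps every parameter trainable.
print(len(trainable_parameter_names(example_names, "inception_resnet_v2", "full")))
```

In the notebook itself the same decision is made by setting `requires_grad` on the real model parameters.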

In [22]:
# 🟦 Cell 1 – Setup and Dependencies
!git clone https://github.com/Romaha095/architectural-comparison-inception.git
%cd architectural-comparison-inception
!git checkout inception-resnet-v2_experiments
!pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
!pip install -r requirements.txt

Cloning into 'architectural-comparison-inception'...
remote: Enumerating objects: 171, done.
remote: Counting objects: 100% (44/44), done.
remote: Compressing objects: 100% (34/34), done.
remote: Total 171 (delta 24), reused 9 (delta 9), pack-reused 127 (from 1)
Receiving objects: 100% (171/171), 63.72 KiB | 7.08 MiB/s, done.
Resolving deltas: 100% (69/69), done.
/kaggle/working/architectural-comparison-inception/architectural-comparison-inception/architectural-comparison-inception/architectural-comparison-inception/architectural-comparison-inception/architectural-comparison-inception/architectural-comparison-inception/architectural-comparison-inception
Branch 'inception-resnet-v2_experiments' set up to track remote branch 'inception-resnet-v2_experiments' from 'origin'.
Switched to a new branch 'inception-resnet-v2_experiments'
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu118
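
The working directory printed above contains the repository name several times in a row, which is what happens when the setup cell is re-run from inside an earlier clone. A minimal sketch of an idempotent alternative follows. `find_repo_root` and `ensure_repo` are hypothetical helpers, not part of the repository, and the default `/kaggle/working` base directory is an assumption taken from the paths in this notebook.

```python
import os
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/Romaha095/architectural-comparison-inception.git"
REPO_NAME = "architectural-comparison-inception"
BRANCH = "inception-resnet-v2_experiments"


def find_repo_root(cwd, repo_name):
    """Return the outermost directory named `repo_name` on the path to `cwd`, or None."""
    cwd = Path(cwd)
    # Path.parents runs from the nearest to the outermost ancestor; reverse it
    # so the outermost match wins, and consider cwd itself last.
    for candidate in [*list(cwd.parents)[::-1], cwd]:
        if candidate.name == repo_name:
            return candidate
    return None


def ensure_repo(base_dir="/kaggle/working"):
    """Clone once into `base_dir`, or reuse an existing checkout (hypothetical helper)."""
    root = find_repo_root(Path.cwd(), REPO_NAME) or Path(base_dir) / REPO_NAME
    if not (root / ".git").exists():
        subprocess.run(["git", "clone", REPO_URL, str(root)], check=True)
    subprocess.run(["git", "-C", str(root), "checkout", BRANCH], check=True)
    os.chdir(root)
    return root
```

Calling `ensure_repo()` in place of the `!git clone`, `%cd`, and `!git checkout` lines would leave the working directory at the same checkout no matter how often the cell is re-run.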


In [23]:
# 🟩 Cell 2 – Universal Training Script with All Features

import os
import sys
import time
import logging
import shutil
from pathlib import Path
from datetime import datetime

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, random_split
from torch.amp import autocast, GradScaler
from torchvision import transforms, datasets

import timm
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report, precision_recall_fscore_support
from tqdm import tqdm

# ============================================================================
# 🎯 EXPERIMENT CONFIGURATION - CHANGE THIS TO SELECT EXPERIMENT
# ============================================================================
EXPERIMENT_ID = 4  # Options: 1, 2, 3, 4

# Experiment configurations
EXPERIMENTS = {
    1: {'model': 'inception_v3', 'train_mode': 'head', 'name': 'InceptionV3-Head'},
    2: {'model': 'inception_resnet_v2', 'train_mode': 'head', 'name': 'InceptionResNetV2-Head'},
    3: {'model': 'inception_v3', 'train_mode': 'full', 'name': 'InceptionV3-FullBackbone'},
    4: {'model': 'inception_resnet_v2', 'train_mode': 'full', 'name': 'InceptionResNetV2-FullBackbone'}
}

# Validate experiment ID
if EXPERIMENT_ID not in EXPERIMENTS:
    raise ValueError(f"Invalid EXPERIMENT_ID: {EXPERIMENT_ID}. Must be 1, 2, 3, or 4.")

EXP_CONFIG = EXPERIMENTS[EXPERIMENT_ID]
MODEL_TYPE = EXP_CONFIG['model']
TRAIN_MODE = EXP_CONFIG['train_mode']  # 'head' or 'full'
EXP_NAME = EXP_CONFIG['name']

print(f"\n{'='*80}")
print(f"🚀 EXPERIMENT {EXPERIMENT_ID}: {EXP_NAME}")
print(f"   Model: {MODEL_TYPE}")
print(f"   Training mode: {TRAIN_MODE} ({'Head only' if TRAIN_MODE == 'head' else 'Full backbone'})")
print(f"{'='*80}\n")

# ============================================================================
# 📋 HYPERPARAMETERS
# ============================================================================
NUM_EPOCHS = 10
BATCH_SIZE = 32
LEARNING_RATE = 0.001
IMG_SIZE = 299  # Standard for Inception architectures
NUM_WORKERS = 4
USE_AMP = True  # Mixed precision training

# Auto-detect dataset path
if os.path.exists('/kaggle/input/food-image-classification-dataset/Food Classification dataset'):
    DATA_ROOT = Path('/kaggle/input/food-image-classification-dataset/Food Classification dataset')
    print(f"✅ Found dataset at: {DATA_ROOT}")
elif os.path.exists('/kaggle/input/food-101/images'):
    DATA_ROOT = Path('/kaggle/input/food-101/images')
    print(f"✅ Found dataset at: {DATA_ROOT}")
else:
    raise FileNotFoundError("Dataset not found! Please check the Kaggle dataset path.")

OUTPUT_DIR = Path(f'/kaggle/working/exp{EXPERIMENT_ID}_{EXP_NAME}')
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

# ============================================================================
# 📊 LOGGING SETUP
# ============================================================================
log_file = OUTPUT_DIR / 'training.log'
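logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s | %(levelname)s | %(name)s | %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
    handlers=[
        logging.FileHandler(log_file),
        logging.StreamHandler(sys.stdout)
    ],
    # Replace root handlers left over from earlier runs of this cell so the
    # log file follows the current OUTPUT_DIR
    force=True
)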
logger = logging.getLogger('main')

logger.info(f"Starting Experiment {EXPERIMENT_ID}: {EXP_NAME}")
logger.info(f"Model: {MODEL_TYPE}, Training mode: {TRAIN_MODE}")
logger.info(f"Output directory: {OUTPUT_DIR}")

# ============================================================================
# 🔧 DEVICE SETUP
# ============================================================================
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
logger.info(f"Using device: {device}")
if torch.cuda.is_available():
    logger.info(f"GPU: {torch.cuda.get_device_name(0)}")

# ============================================================================
# 🖼️ DATA PREPARATION
# ============================================================================
logger.info("Preparing datasets...")

train_transform = transforms.Compose([
    transforms.Resize((IMG_SIZE, IMG_SIZE)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

val_test_transform = transforms.Compose([
    transforms.Resize((IMG_SIZE, IMG_SIZE)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Check if data is already split into train/validation/test subdirectories
has_splits = all((DATA_ROOT / split).exists() for split in ('train', 'validation', 'test'))

if has_splits:
    logger.info("Loading pre-split datasets...")
    train_dataset = datasets.ImageFolder(DATA_ROOT / 'train', transform=train_transform)
    val_dataset = datasets.ImageFolder(DATA_ROOT / 'validation', transform=val_test_transform)
    test_dataset = datasets.ImageFolder(DATA_ROOT / 'test', transform=val_test_transform)
else:
    # Load all data and split manually
    logger.info("No pre-split found. Creating train/val/test splits (70/15/15)...")
    full_dataset = datasets.ImageFolder(DATA_ROOT)
    
    # Calculate split sizes
    total_size = len(full_dataset)
    train_size = int(0.70 * total_size)
    val_size = int(0.15 * total_size)
    test_size = total_size - train_size - val_size
    
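    # Split dataset (seeded so the split is reproducible)
    train_split, val_split, test_split = random_split(
        full_dataset, [train_size, val_size, test_size],
        generator=torch.Generator().manual_seed(42)
    )
    
    # The subsets returned by random_split share one underlying dataset, so
    # assigning `.dataset.transform` three times would leave only the last
    # transform active. Wrap separately transformed ImageFolder views instead.
    train_base = datasets.ImageFolder(DATA_ROOT, transform=train_transform)
    eval_base = datasets.ImageFolder(DATA_ROOT, transform=val_test_transform)
    train_dataset = torch.utils.data.Subset(train_base, train_split.indices)
    val_dataset = torch.utils.data.Subset(eval_base, val_split.indices)
    test_dataset = torch.utils.data.Subset(eval_base, test_split.indices)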

# Create data loaders
train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, 
                         num_workers=NUM_WORKERS, pin_memory=True)
val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False,
                       num_workers=NUM_WORKERS, pin_memory=True)
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False,
                        num_workers=NUM_WORKERS, pin_memory=True)

# Get number of classes
if has_splits:
    NUM_CLASSES = len(train_dataset.classes)
    class_names = train_dataset.classes
else:
    NUM_CLASSES = len(train_dataset.dataset.classes)
    class_names = train_dataset.dataset.classes

logger.info(f"Train samples: {len(train_dataset)}")
logger.info(f"Val samples: {len(val_dataset)}")
logger.info(f"Test samples: {len(test_dataset)}")
logger.info(f"Number of classes: {NUM_CLASSES}")
logger.info(f"Classes: {class_names[:5]}... (showing first 5)")

# ============================================================================
# 🏗️ MODEL CREATION
# ============================================================================
def create_model(model_type, num_classes, train_mode):
    """Create and configure model based on type and training mode.
    
    Args:
        model_type: 'inception_v3' or 'inception_resnet_v2'
        num_classes: Number of output classes
        train_mode: 'head' (freeze backbone) or 'full' (train everything)
    """
    logger.info(f"Creating model: {model_type}")
    
    # Create model with pretrained weights
    model = timm.create_model(model_type, pretrained=True, num_classes=num_classes)
    
    if train_mode == 'head':
        # Freeze all layers except the classifier head
        logger.info("Freezing backbone layers (training head only)")
        for param in model.parameters():
            param.requires_grad = False
        
        # Unfreeze classifier head
        if model_type == 'inception_v3':
            for param in model.fc.parameters():
                param.requires_grad = True
        elif model_type == 'inception_resnet_v2':
            for param in model.classif.parameters():
                param.requires_grad = True
    else:
        # Train full backbone
        logger.info("Training full backbone (all layers trainable)")
        for param in model.parameters():
            param.requires_grad = True
    
    # Count parameters
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    
    logger.info(f"Model: {model_type}")
    logger.info(f"  Total params     = {total_params:,}")
    logger.info(f"  Trainable params = {trainable_params:,}")
    logger.info(f"  Frozen params    = {total_params - trainable_params:,}")
    
    return model

model = create_model(MODEL_TYPE, NUM_CLASSES, TRAIN_MODE).to(device)

# ============================================================================
# 🎓 TRAINING SETUP
# ============================================================================
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=LEARNING_RATE)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', factor=0.5, patience=2)
scaler = GradScaler(device='cuda', enabled=USE_AMP and device.type == 'cuda')  # no-op on CPU

logger.info(f"Optimizer: Adam (lr={LEARNING_RATE})")
logger.info(f"Mixed precision: {USE_AMP}")

# ============================================================================
# 📈 TRAINING & EVALUATION FUNCTIONS
# ============================================================================
def train_epoch(model, loader, criterion, optimizer, scaler, device, use_amp):
    """Train for one epoch."""
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    
    pbar = tqdm(loader, desc='Training', leave=False)
    for inputs, labels in pbar:
        inputs, labels = inputs.to(device), labels.to(device)
        
        optimizer.zero_grad()
        
        with autocast(device_type=device.type, enabled=use_amp and device.type == 'cuda'):
            outputs = model(inputs)
            loss = criterion(outputs, labels)
        
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
        
        running_loss += loss.item() * inputs.size(0)
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()
        
        pbar.set_postfix({'loss': f'{loss.item():.4f}', 'acc': f'{100.*correct/total:.2f}%'})
    
    epoch_loss = running_loss / total
    epoch_acc = 100. * correct / total
    return epoch_loss, epoch_acc

def evaluate(model, loader, criterion, device):
    """Evaluate model on validation/test set."""
    model.eval()
    running_loss = 0.0
    correct = 0
    total = 0
    
    with torch.no_grad():
        for inputs, labels in tqdm(loader, desc='Evaluating', leave=False):
            inputs, labels = inputs.to(device), labels.to(device)
            
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            
            running_loss += loss.item() * inputs.size(0)
            _, predicted = outputs.max(1)
            total += labels.size(0)
            correct += predicted.eq(labels).sum().item()
    
    epoch_loss = running_loss / total
    epoch_acc = 100. * correct / total
    return epoch_loss, epoch_acc

def compute_detailed_metrics(model, loader, device, num_classes):
    """Compute detailed metrics including precision, recall, F1, and throughput."""
    model.eval()
    all_preds = []
    all_labels = []
    all_times = []
    
    with torch.no_grad():
        for inputs, labels in tqdm(loader, desc='Computing metrics', leave=False):
            inputs, labels = inputs.to(device), labels.to(device)
            
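            # Measure forward-pass time; synchronize first so previously
            # queued GPU work is not counted, and skip it on CPU-only runs
            if device.type == 'cuda':
                torch.cuda.synchronize()
            start_time = time.perf_counter()
            outputs = model(inputs)
            if device.type == 'cuda':
                torch.cuda.synchronize()  # Wait for GPU to finish
            end_time = time.perf_counter()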
            
            batch_time = end_time - start_time
            all_times.append(batch_time)
            
            _, predicted = outputs.max(1)
            all_preds.extend(predicted.cpu().numpy())
            all_labels.extend(labels.cpu().numpy())
    
    # Convert to numpy arrays
    all_preds = np.array(all_preds)
    all_labels = np.array(all_labels)
    
    # Calculate metrics
    accuracy = 100. * (all_preds == all_labels).sum() / len(all_labels)
    precision, recall, f1, _ = precision_recall_fscore_support(
        all_labels, all_preds, average='macro', zero_division=0
    )
    
    # Calculate latency and throughput
    total_time = sum(all_times)
    total_images = len(all_labels)
    latency_ms = (total_time / total_images) * 1000  # ms per image
    throughput = total_images / total_time  # images per second
    
    metrics = {
        'accuracy': accuracy,
        'precision': precision * 100,
        'recall': recall * 100,
        'f1': f1 * 100,
        'latency_ms': latency_ms,
        'throughput': throughput,
        'predictions': all_preds,
        'labels': all_labels
    }
    
    return metrics

def plot_confusion_matrix(y_true, y_pred, class_names, save_path):
    """Plot and save confusion matrix."""
    cm = confusion_matrix(y_true, y_pred)
    
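    # Large figure so tick labels stay legible with many classes
    plt.figure(figsize=(20, 18))
    
    # Normalize each row by its true-label count (rows with no samples stay zero)
    row_sums = cm.sum(axis=1, keepdims=True)
    cm_normalized = np.divide(cm, row_sums, out=np.zeros(cm.shape, dtype=float), where=row_sums != 0)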
    
    sns.heatmap(cm_normalized, annot=False, fmt='.2f', cmap='Blues', 
                xticklabels=class_names, yticklabels=class_names,
                cbar_kws={'label': 'Normalized Count'})
    
    plt.title(f'Confusion Matrix - {EXP_NAME}', fontsize=16, pad=20)
    plt.ylabel('True Label', fontsize=12)
    plt.xlabel('Predicted Label', fontsize=12)
    plt.xticks(rotation=90, fontsize=6)
    plt.yticks(rotation=0, fontsize=6)
    plt.tight_layout()
    
    plt.savefig(save_path, dpi=150, bbox_inches='tight')
    logger.info(f"Confusion matrix saved to {save_path}")
    plt.close()

# ============================================================================
# 🚂 TRAINING LOOP
# ============================================================================
best_val_acc = 0.0
train_history = {'loss': [], 'acc': []}
val_history = {'loss': [], 'acc': []}

for epoch in range(1, NUM_EPOCHS + 1):
    logger.info(f"Epoch {epoch}/{NUM_EPOCHS}")
    
    # Train
    train_loss, train_acc = train_epoch(model, train_loader, criterion, optimizer, 
                                       scaler, device, USE_AMP)
    train_history['loss'].append(train_loss)
    train_history['acc'].append(train_acc)
    
    # Validate
    val_loss, val_acc = evaluate(model, val_loader, criterion, device)
    val_history['loss'].append(val_loss)
    val_history['acc'].append(val_acc)
    
    logger.info(f"  Train: loss={train_loss:.4f}, acc={train_acc:.2f}% | "
               f"Val: loss={val_loss:.4f}, acc={val_acc:.2f}%")
    
    # Save best model
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        checkpoint_path = OUTPUT_DIR / 'best_model.pt'
        torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'val_acc': val_acc,
            'val_loss': val_loss,
            'experiment_config': EXP_CONFIG,
            'hyperparameters': {
                'num_epochs': NUM_EPOCHS,
                'batch_size': BATCH_SIZE,
                'learning_rate': LEARNING_RATE,
                'img_size': IMG_SIZE
            }
        }, checkpoint_path)
        logger.info(f"  New best val acc: {val_acc:.2f}% (checkpoint: {checkpoint_path.name})")
    
    # Learning rate scheduling
    scheduler.step(val_acc)

# Save final model
final_model_path = OUTPUT_DIR / 'final_model.pt'
torch.save({
    'epoch': NUM_EPOCHS,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'train_history': train_history,
    'val_history': val_history,
    'experiment_config': EXP_CONFIG,
}, final_model_path)
logger.info(f"Final model saved to {final_model_path}")

# ============================================================================
# 🧪 FINAL TEST EVALUATION
# ============================================================================
logger.info("Training finished. Running final test evaluation...")

# Load best model
checkpoint = torch.load(OUTPUT_DIR / 'best_model.pt', map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])

# Test evaluation with cross-entropy loss
test_loss, test_acc = evaluate(model, test_loader, criterion, device)
logger.info(f"Test (CE): loss={test_loss:.4f}, acc={test_acc:.2f}%")

# Compute detailed metrics
test_metrics = compute_detailed_metrics(model, test_loader, device, NUM_CLASSES)

logger.info(f"Test metrics: accuracy={test_metrics['accuracy']:.2f}%, "
           f"precision={test_metrics['precision']:.2f}%, "
           f"recall={test_metrics['recall']:.2f}%, "
           f"f1={test_metrics['f1']:.2f}%, "
           f"latency={test_metrics['latency_ms']:.2f} ms/img, "
           f"throughput={test_metrics['throughput']:.2f} img/s")

# ============================================================================
# 📊 CONFUSION MATRIX
# ============================================================================
logger.info("Generating confusion matrix...")
cm_path = OUTPUT_DIR / 'confusion_matrix.png'
plot_confusion_matrix(test_metrics['labels'], test_metrics['predictions'], 
                     class_names, cm_path)

# ============================================================================
# 📈 PLOT TRAINING HISTORY
# ============================================================================
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))

# Loss plot
ax1.plot(range(1, NUM_EPOCHS + 1), train_history['loss'], 'b-', label='Train Loss', linewidth=2)
ax1.plot(range(1, NUM_EPOCHS + 1), val_history['loss'], 'r-', label='Val Loss', linewidth=2)
ax1.set_xlabel('Epoch', fontsize=12)
ax1.set_ylabel('Loss', fontsize=12)
ax1.set_title(f'Training & Validation Loss - {EXP_NAME}', fontsize=14)
ax1.legend(fontsize=10)
ax1.grid(True, alpha=0.3)

# Accuracy plot
ax2.plot(range(1, NUM_EPOCHS + 1), train_history['acc'], 'b-', label='Train Acc', linewidth=2)
ax2.plot(range(1, NUM_EPOCHS + 1), val_history['acc'], 'r-', label='Val Acc', linewidth=2)
ax2.set_xlabel('Epoch', fontsize=12)
ax2.set_ylabel('Accuracy (%)', fontsize=12)
ax2.set_title(f'Training & Validation Accuracy - {EXP_NAME}', fontsize=14)
ax2.legend(fontsize=10)
ax2.grid(True, alpha=0.3)

plt.tight_layout()
history_path = OUTPUT_DIR / 'training_history.png'
plt.savefig(history_path, dpi=150, bbox_inches='tight')
logger.info(f"Training history plot saved to {history_path}")
plt.close()

# ============================================================================
# 💾 SAVE FINAL RESULTS
# ============================================================================
results = {
    'experiment_id': EXPERIMENT_ID,
    'experiment_name': EXP_NAME,
    'model_type': MODEL_TYPE,
    'train_mode': TRAIN_MODE,
    'best_val_acc': best_val_acc,
    'test_metrics': {
        'loss': test_loss,
        'accuracy': test_metrics['accuracy'],
        'precision': test_metrics['precision'],
        'recall': test_metrics['recall'],
        'f1': test_metrics['f1'],
        'latency_ms': test_metrics['latency_ms'],
        'throughput': test_metrics['throughput']
    },
    'hyperparameters': {
        'num_epochs': NUM_EPOCHS,
        'batch_size': BATCH_SIZE,
        'learning_rate': LEARNING_RATE,
        'img_size': IMG_SIZE
    },
    'dataset_info': {
        'num_classes': NUM_CLASSES,
        'train_samples': len(train_dataset),
        'val_samples': len(val_dataset),
        'test_samples': len(test_dataset)
    }
}

import json
results_path = OUTPUT_DIR / 'results.json'
with open(results_path, 'w') as f:
    json.dump(results, f, indent=2)

logger.info(f"Results saved to {results_path}")
logger.info("\n" + "="*80)
logger.info(f"✅ Experiment {EXPERIMENT_ID} completed successfully!")
logger.info(f"📁 All outputs saved to: {OUTPUT_DIR}")
logger.info("="*80)

print(f"\n\n{'='*80}")
print(f"🎉 EXPERIMENT {EXPERIMENT_ID} COMPLETED!")
print(f"{'='*80}")
print(f"\n📊 FINAL RESULTS:")
print(f"  Model: {MODEL_TYPE}")
print(f"  Training mode: {TRAIN_MODE}")
print(f"  Best validation accuracy: {best_val_acc:.2f}%")
print(f"  Test accuracy: {test_metrics['accuracy']:.2f}%")
print(f"  Test precision: {test_metrics['precision']:.2f}%")
print(f"  Test recall: {test_metrics['recall']:.2f}%")
print(f"  Test F1: {test_metrics['f1']:.2f}%")
print(f"  Inference latency: {test_metrics['latency_ms']:.2f} ms/img")
print(f"  Throughput: {test_metrics['throughput']:.2f} img/s")
print(f"\n📁 Outputs saved to: {OUTPUT_DIR}")
print(f"  - best_model.pt")
print(f"  - final_model.pt")
print(f"  - confusion_matrix.png")
print(f"  - training_history.png")
print(f"  - results.json")
print(f"  - training.log")
print(f"{'='*80}\n")

2026-01-29 19:21:03 | INFO | main | Starting Experiment 4: InceptionResNetV2-FullBackbone



üöÄ EXPERIMENT 4: InceptionResNetV2-FullBackbone
   Model: inception_resnet_v2
   Training mode: full (Full backbone)

‚úÖ Found dataset at: /kaggle/input/food-image-classification-dataset/Food Classification dataset
2026-01-29 19:21:03 | INFO | main | Starting Experiment 4: InceptionResNetV2-FullBackbone


2026-01-29 19:21:03 | INFO | main | Model: inception_resnet_v2, Training mode: full


2026-01-29 19:21:03 | INFO | main | Model: inception_resnet_v2, Training mode: full


2026-01-29 19:21:03 | INFO | main | Output directory: /kaggle/working/exp4_InceptionResNetV2-FullBackbone


2026-01-29 19:21:03 | INFO | main | Output directory: /kaggle/working/exp4_InceptionResNetV2-FullBackbone


2026-01-29 19:21:03 | INFO | main | Using device: cuda


2026-01-29 19:21:03 | INFO | main | Using device: cuda


2026-01-29 19:21:03 | INFO | main | GPU: Tesla P100-PCIE-16GB


2026-01-29 19:21:03 | INFO | main | GPU: Tesla P100-PCIE-16GB


2026-01-29 19:21:03 | INFO | main | Preparing datasets...


2026-01-29 19:21:03 | INFO | main | Preparing datasets...


2026-01-29 19:21:03 | INFO | main | No pre-split found. Creating train/val/test splits (70/15/15)...


2026-01-29 19:21:03 | INFO | main | No pre-split found. Creating train/val/test splits (70/15/15)...


2026-01-29 19:21:26 | INFO | main | Train samples: 16711


2026-01-29 19:21:26 | INFO | main | Train samples: 16711


2026-01-29 19:21:26 | INFO | main | Val samples: 3580


2026-01-29 19:21:26 | INFO | main | Val samples: 3580


2026-01-29 19:21:26 | INFO | main | Test samples: 3582


2026-01-29 19:21:26 | INFO | main | Test samples: 3582


2026-01-29 19:21:26 | INFO | main | Number of classes: 34


2026-01-29 19:21:26 | INFO | main | Number of classes: 34


2026-01-29 19:21:26 | INFO | main | Classes: ['Baked Potato', 'Crispy Chicken', 'Donut', 'Fries', 'Hot Dog']... (showing first 5)


2026-01-29 19:21:26 | INFO | main | Classes: ['Baked Potato', 'Crispy Chicken', 'Donut', 'Fries', 'Hot Dog']... (showing first 5)


2026-01-29 19:21:26 | INFO | main | Creating model: inception_resnet_v2


2026-01-29 19:21:26 | INFO | main | Creating model: inception_resnet_v2
2026-01-29 19:21:26 | INFO | timm.models._builder | Loading pretrained weights from Hugging Face hub (timm/inception_resnet_v2.tf_in1k)
2026-01-29 19:21:27 | INFO | timm.models._hub | [timm/inception_resnet_v2.tf_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
2026-01-29 19:21:27 | INFO | timm.models._builder | Missing keys (classif.weight, classif.bias) discovered while loading pretrained weights. This is expected if model is being adapted.


2026-01-29 19:21:27 | INFO | main | Training full backbone (all layers trainable)


2026-01-29 19:21:27 | INFO | main | Training full backbone (all layers trainable)


2026-01-29 19:21:27 | INFO | main | Model: inception_resnet_v2


2026-01-29 19:21:27 | INFO | main | Model: inception_resnet_v2


2026-01-29 19:21:27 | INFO | main |   Total params     = 54,358,722


2026-01-29 19:21:27 | INFO | main |   Total params     = 54,358,722


2026-01-29 19:21:27 | INFO | main |   Trainable params = 54,358,722


2026-01-29 19:21:27 | INFO | main |   Trainable params = 54,358,722


2026-01-29 19:21:27 | INFO | main |   Frozen params    = 0


2026-01-29 19:21:27 | INFO | main |   Frozen params    = 0


2026-01-29 19:21:27 | INFO | main | Optimizer: Adam (lr=0.001)


2026-01-29 19:21:27 | INFO | main | Optimizer: Adam (lr=0.001)


2026-01-29 19:21:27 | INFO | main | Mixed precision: True


2026-01-29 19:21:27 | INFO | main | Mixed precision: True


2026-01-29 19:21:27 | INFO | main | Epoch 1/10


2026-01-29 19:21:27 | INFO | main | Epoch 1/10


2026-01-29 19:26:53 | INFO | main |   Train: loss=1.1728, acc=65.36% | Val: loss=1.1278, acc=65.98%


2026-01-29 19:26:53 | INFO | main |   Train: loss=1.1728, acc=65.36% | Val: loss=1.1278, acc=65.98%


2026-01-29 19:26:54 | INFO | main |   New best val acc: 65.98% (checkpoint: best_model.pt)


2026-01-29 19:26:54 | INFO | main |   New best val acc: 65.98% (checkpoint: best_model.pt)


2026-01-29 19:26:54 | INFO | main | Epoch 2/10


2026-01-29 19:26:54 | INFO | main | Epoch 2/10


2026-01-29 19:32:20 | INFO | main |   Train: loss=0.6126, acc=80.77% | Val: loss=0.7216, acc=78.04%


2026-01-29 19:32:20 | INFO | main |   Train: loss=0.6126, acc=80.77% | Val: loss=0.7216, acc=78.04%


2026-01-29 19:32:22 | INFO | main |   New best val acc: 78.04% (checkpoint: best_model.pt)


2026-01-29 19:32:22 | INFO | main |   New best val acc: 78.04% (checkpoint: best_model.pt)


2026-01-29 19:32:22 | INFO | main | Epoch 3/10


2026-01-29 19:32:22 | INFO | main | Epoch 3/10


2026-01-29 19:37:48 | INFO | main |   Train: loss=0.4303, acc=86.64% | Val: loss=0.6785, acc=79.39%


2026-01-29 19:37:48 | INFO | main |   Train: loss=0.4303, acc=86.64% | Val: loss=0.6785, acc=79.39%


2026-01-29 19:37:49 | INFO | main |   New best val acc: 79.39% (checkpoint: best_model.pt)


2026-01-29 19:37:49 | INFO | main |   New best val acc: 79.39% (checkpoint: best_model.pt)


2026-01-29 19:37:49 | INFO | main | Epoch 4/10


2026-01-29 19:37:49 | INFO | main | Epoch 4/10


2026-01-29 19:43:15 | INFO | main |   Train: loss=0.3087, acc=90.19% | Val: loss=0.6091, acc=81.76%


2026-01-29 19:43:15 | INFO | main |   Train: loss=0.3087, acc=90.19% | Val: loss=0.6091, acc=81.76%


2026-01-29 19:43:17 | INFO | main |   New best val acc: 81.76% (checkpoint: best_model.pt)


2026-01-29 19:43:17 | INFO | main |   New best val acc: 81.76% (checkpoint: best_model.pt)


2026-01-29 19:43:17 | INFO | main | Epoch 5/10


2026-01-29 19:43:17 | INFO | main | Epoch 5/10


2026-01-29 19:48:43 | INFO | main |   Train: loss=0.2297, acc=92.61% | Val: loss=0.6646, acc=81.42%


2026-01-29 19:48:43 | INFO | main |   Train: loss=0.2297, acc=92.61% | Val: loss=0.6646, acc=81.42%


2026-01-29 19:48:43 | INFO | main | Epoch 6/10


2026-01-29 19:48:43 | INFO | main | Epoch 6/10


2026-01-29 19:54:09 | INFO | main |   Train: loss=0.1939, acc=93.66% | Val: loss=0.5701, acc=84.33%


2026-01-29 19:54:09 | INFO | main |   Train: loss=0.1939, acc=93.66% | Val: loss=0.5701, acc=84.33%


2026-01-29 19:54:10 | INFO | main |   New best val acc: 84.33% (checkpoint: best_model.pt)


2026-01-29 19:54:10 | INFO | main |   New best val acc: 84.33% (checkpoint: best_model.pt)


2026-01-29 19:54:10 | INFO | main | Epoch 7/10


2026-01-29 19:54:10 | INFO | main | Epoch 7/10


2026-01-29 19:59:38 | INFO | main |   Train: loss=0.1630, acc=94.62% | Val: loss=0.6645, acc=83.44%


2026-01-29 19:59:38 | INFO | main |   Train: loss=0.1630, acc=94.62% | Val: loss=0.6645, acc=83.44%


2026-01-29 19:59:38 | INFO | main | Epoch 8/10


2026-01-29 19:59:38 | INFO | main | Epoch 8/10


2026-01-29 20:05:04 | INFO | main |   Train: loss=0.1253, acc=95.96% | Val: loss=0.9783, acc=77.26%


2026-01-29 20:05:04 | INFO | main |   Train: loss=0.1253, acc=95.96% | Val: loss=0.9783, acc=77.26%


2026-01-29 20:05:04 | INFO | main | Epoch 9/10


2026-01-29 20:05:04 | INFO | main | Epoch 9/10


2026-01-29 20:10:30 | INFO | main |   Train: loss=0.1134, acc=96.23% | Val: loss=0.7971, acc=80.36%


2026-01-29 20:10:30 | INFO | main |   Train: loss=0.1134, acc=96.23% | Val: loss=0.7971, acc=80.36%


2026-01-29 20:10:30 | INFO | main | Epoch 10/10


2026-01-29 20:10:30 | INFO | main | Epoch 10/10


2026-01-29 20:15:56 | INFO | main |   Train: loss=0.0338, acc=99.07% | Val: loss=0.4084, acc=89.11%


2026-01-29 20:15:56 | INFO | main |   Train: loss=0.0338, acc=99.07% | Val: loss=0.4084, acc=89.11%


2026-01-29 20:15:58 | INFO | main |   New best val acc: 89.11% (checkpoint: best_model.pt)


2026-01-29 20:15:58 | INFO | main |   New best val acc: 89.11% (checkpoint: best_model.pt)


2026-01-29 20:15:59 | INFO | main | Final model saved to /kaggle/working/exp4_InceptionResNetV2-FullBackbone/final_model.pt


2026-01-29 20:15:59 | INFO | main | Final model saved to /kaggle/working/exp4_InceptionResNetV2-FullBackbone/final_model.pt


2026-01-29 20:15:59 | INFO | main | Training finished. Running final test evaluation...


2026-01-29 20:15:59 | INFO | main | Training finished. Running final test evaluation...


2026-01-29 20:16:20 | INFO | main | Test (CE): loss=0.4597, acc=88.69%


2026-01-29 20:16:20 | INFO | main | Test (CE): loss=0.4597, acc=88.69%


2026-01-29 20:16:40 | INFO | main | Test metrics: accuracy=88.69%, precision=90.01%, recall=89.02%, f1=89.41%, latency=4.98 ms/img, throughput=200.61 img/s


2026-01-29 20:16:40 | INFO | main | Test metrics: accuracy=88.69%, precision=90.01%, recall=89.02%, f1=89.41%, latency=4.98 ms/img, throughput=200.61 img/s


2026-01-29 20:16:40 | INFO | main | Generating confusion matrix...


2026-01-29 20:16:40 | INFO | main | Generating confusion matrix...


2026-01-29 20:16:41 | INFO | main | Confusion matrix saved to /kaggle/working/exp4_InceptionResNetV2-FullBackbone/confusion_matrix.png


2026-01-29 20:16:41 | INFO | main | Confusion matrix saved to /kaggle/working/exp4_InceptionResNetV2-FullBackbone/confusion_matrix.png


2026-01-29 20:16:41 | INFO | main | Training history plot saved to /kaggle/working/exp4_InceptionResNetV2-FullBackbone/training_history.png


2026-01-29 20:16:41 | INFO | main | Results saved to /kaggle/working/exp4_InceptionResNetV2-FullBackbone/results.json


2026-01-29 20:16:41 | INFO | main | 


2026-01-29 20:16:41 | INFO | main | ✅ Experiment 4 completed successfully!


2026-01-29 20:16:41 | INFO | main | 📁 All outputs saved to: /kaggle/working/exp4_InceptionResNetV2-FullBackbone






🎉 EXPERIMENT 4 COMPLETED!

📊 FINAL RESULTS:
  Model: inception_resnet_v2
  Training mode: full
  Best validation accuracy: 89.11%
  Test accuracy: 88.69%
  Test precision: 90.01%
  Test recall: 89.02%
  Test F1: 89.41%
  Inference latency: 4.98 ms/img
  Throughput: 200.61 img/s

📁 Outputs saved to: /kaggle/working/exp4_InceptionResNetV2-FullBackbone
  - best_model.pt
  - final_model.pt
  - confusion_matrix.png
  - training_history.png
  - results.json
  - training.log



## 📝 How to Use This Notebook

### Running Experiments

1. **Change `EXPERIMENT_ID`** in Cell 2 (line 27) to select which experiment to run:
   - `EXPERIMENT_ID = 1` → Inception v3 (Head only)
   - `EXPERIMENT_ID = 2` → Inception-ResNet-v2 (Head only)
   - `EXPERIMENT_ID = 3` → Inception v3 (Full backbone)
   - `EXPERIMENT_ID = 4` → Inception-ResNet-v2 (Full backbone)

2. **Run all cells** to execute the selected experiment

3. **Outputs will be saved** to `/kaggle/working/exp{N}_{ModelName}/`:
   - `best_model.pt` - Best model checkpoint
   - `final_model.pt` - Final model after all epochs
   - `confusion_matrix.png` - Confusion matrix visualization
   - `training_history.png` - Training curves
   - `results.json` - Complete metrics
   - `training.log` - Detailed logs
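
The selection step above could be backed by a small lookup table. The sketch below is an assumption about how such a table might look, not the notebook's actual code: `ExperimentConfig`, `EXPERIMENTS`, and `output_dir_for` are hypothetical names, and only the experiment 4 values (`inception_resnet_v2`, `full`, `exp4_InceptionResNetV2-FullBackbone`) come from the run output. The folder tags for experiments 1 to 3 are illustrative guesses.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ExperimentConfig:
    model_name: str  # architecture identifier
    mode: str        # "head" = train the classifier head only, "full" = train the whole backbone
    tag: str         # suffix used in the output folder name


# Hypothetical table: only row 4 matches values visible in the logs.
EXPERIMENTS = {
    1: ExperimentConfig("inception_v3", "head", "InceptionV3-HeadOnly"),
    2: ExperimentConfig("inception_resnet_v2", "head", "InceptionResNetV2-HeadOnly"),
    3: ExperimentConfig("inception_v3", "full", "InceptionV3-FullBackbone"),
    4: ExperimentConfig("inception_resnet_v2", "full", "InceptionResNetV2-FullBackbone"),
}


def output_dir_for(experiment_id: int, root: str = "/kaggle/working") -> Path:
    """Build the exp{N}_{ModelName} output folder for a given experiment."""
    if experiment_id not in EXPERIMENTS:
        raise ValueError(f"EXPERIMENT_ID must be one of {sorted(EXPERIMENTS)}")
    cfg = EXPERIMENTS[experiment_id]
    return Path(root) / f"exp{experiment_id}_{cfg.tag}"


EXPERIMENT_ID = 4
print(output_dir_for(EXPERIMENT_ID))
```

Keeping the configuration in one table means an invalid `EXPERIMENT_ID` fails immediately instead of partway through training.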

### Features

✅ **Universal code** - One notebook for all 4 experiments  
✅ **Auto dataset detection** - Works with different Kaggle dataset structures  
✅ **Auto train/val/test split** - Creates splits if not pre-split  
✅ **Inception v3** support added  
✅ **Model saving** - Both best and final checkpoints  
✅ **Confusion matrix** - Visual analysis of predictions  
✅ **Detailed logging** - Timestamps, metrics, throughput  
✅ **Mixed precision** training for faster execution  
✅ **Performance metrics** - Precision, recall, F1, latency, throughput  
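
The precision, recall, and F1 values listed above can all be derived from a confusion matrix. The sketch below assumes macro averaging (an unweighted mean over classes), which the logs do not confirm; `macro_metrics` is a hypothetical helper written in plain Python for illustration, not the function the notebook calls.

```python
from typing import List, Tuple


def macro_metrics(cm: List[List[int]]) -> Tuple[float, float, float, float]:
    """Return (accuracy, precision, recall, f1) from a square confusion matrix.

    cm[i][j] counts samples of true class i that were predicted as class j.
    Precision, recall, and F1 are macro-averaged (an assumption, see above).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum: everything predicted as k
        actual_k = sum(cm[k])                          # row sum: everything that truly is k
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    accuracy = sum(cm[k][k] for k in range(n)) / total
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy two-class example: rows are true labels, columns are predictions.
acc, prec, rec, f1 = macro_metrics([[2, 0], [1, 1]])
print(f"accuracy={acc:.2%}, precision={prec:.2%}, recall={rec:.2%}, f1={f1:.2%}")
```

Under this definition F1 is the mean of the per-class F1 scores, so it need not equal the harmonic mean of the averaged precision and recall.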

### Dataset Structure

This notebook automatically detects and handles two dataset structures:

1. **Pre-split structure** (preferred):
   ```
   dataset/
   ├── train/
   │   ├── class1/
   │   ├── class2/
   │   ...
   ├── validation/
   │   ├── class1/
   │   ├── class2/
   │   ...
   └── test/
       ├── class1/
       ├── class2/
       ...
   ```

2. **Single directory structure**:
   ```
   dataset/
   ├── class1/
   ├── class2/
   ...
   ```
   The notebook automatically splits this layout 70/15/15 into train/val/test.
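
The detection and split logic described above might look like the following. This is a sketch under assumptions: `detect_layout` and `split_indices` are hypothetical helpers, the folder names `train`, `validation`, and `test` are taken from the tree above, and the fixed seed is an illustrative choice. The notebook's own implementation may differ (its imports suggest it relies on `torch.utils.data.random_split`).

```python
import random
from pathlib import Path
from typing import List, Sequence, Tuple


def detect_layout(root: Path) -> str:
    """Return "pre-split" if train/validation/test folders exist, else "single"."""
    expected = ("train", "validation", "test")
    if all((root / name).is_dir() for name in expected):
        return "pre-split"
    return "single"


def split_indices(
    n: int,
    ratios: Sequence[float] = (0.70, 0.15, 0.15),
    seed: int = 42,
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle range(n) reproducibly and cut it into train/val/test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder absorbs any rounding
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting index lists rather than files leaves the dataset on disk untouched, and the fixed seed makes the same split reproducible across the four experiments.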

### Main Logic Preserved

The core training logic remains the same as the original:
- Same data augmentation pipeline
- Same optimizer and learning rate scheduler
- Same training loop structure
- Same evaluation methodology

The only additions are experiment selection, Inception v3 support, model saving, the confusion matrix, enhanced logging, and flexible dataset handling.

In [24]:
import shutil

# Folder produced by the experiment run
folder_path = "/kaggle/working/exp4_InceptionResNetV2-FullBackbone"  # replace with your folder

# make_archive appends ".zip" to base_name and returns the path of the archive it wrote
zip_path = shutil.make_archive(base_name=folder_path, format="zip", root_dir=folder_path)

print(f"ZIP created at: {zip_path}")


ZIP created at: /kaggle/working/exp4_InceptionResNetV2-FullBackbone.zip
