# Deep Learning v1 Image Classification Development

## Objectives

- Design and implement a custom neural network from scratch
- Learn fundamental deep learning concepts through hands-on implementation
- Compare performance with shallow learning approaches
- Establish a baseline for later deep learning model improvements

## Setup and Imports

In [None]:
!pip install torchinfo

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader, random_split
from torchvision import transforms, datasets
from torchinfo import summary
import os
import sys
from PIL import Image
import pickle
from pathlib import Path
from sklearn.metrics import classification_report, confusion_matrix
import time
from typing import Dict, Any, List, Tuple

# Add parent directory to path for model core imports
sys.path.append('../..')
from ml_models_core.src.base_classifier import BaseImageClassifier
from ml_models_core.src.model_registry import ModelRegistry, ModelMetadata
from ml_models_core.src.utils import ModelUtils
from ml_models_core.src.data_loaders import get_unified_classification_data

# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

# Plot settings
plt.style.use('default')
sns.set_palette('husl')

In [None]:
class UnifiedDataset(Dataset):
    """Memory-efficient dataset wrapper for unified classification data."""
    
    def __init__(self, image_paths, labels, class_names, transform=None):
        self.image_paths = image_paths  # Store paths instead of loaded images
        self.labels = labels
        self.class_names = class_names
        self.transform = transform
    
    def __len__(self):
        return len(self.image_paths)
    
    def __getitem__(self, idx):
        # Load image on-demand
        image_path = self.image_paths[idx]
        label = self.labels[idx]
        
        try:
            # Load image from disk
            image = Image.open(image_path).convert('RGB')
        except Exception as e:
            print(f"Error loading {image_path}: {e}")
            # Create blank image if loading fails
            image = Image.new('RGB', (224, 224), (0, 0, 0))
        
        if self.transform:
            image = self.transform(image)
        
        return image, label

# Memory-efficient approach: Get image paths directly from the data manager
print("Getting dataset paths directly from data manager...")

from ml_models_core.src.data_manager import get_dataset_manager
manager = get_dataset_manager()

# Get the unified dataset path
try:
    dataset_path = manager.get_dataset_path('combined_unified_classification')
    if not dataset_path:
        print("Creating unified classification dataset...")
        available_datasets = ['oxford_pets', 'kaggle_vegetables', 'street_foods', 'musical_instruments']
        dataset_path = manager.create_combined_dataset(
            dataset_names=available_datasets,
            output_name="unified_classification",
            class_mapping=None  # Keep original class names
        )
except Exception as e:
    print(f"Error accessing unified dataset: {e}")
    # Fallback to main dataset
    dataset_path = manager.download_dataset('oxford_pets')

print(f"Using dataset at: {dataset_path}")

# Scan the dataset directory for image paths and labels
# (os and Path are already imported in the setup cell)

dataset_path = Path(dataset_path)
all_image_paths = []
all_labels = []
class_names = []
class_to_idx = {}

# Collect all class directories
class_dirs = [d for d in dataset_path.iterdir() 
             if d.is_dir() and not d.name.startswith('.')]

if not class_dirs:
    raise ValueError(f"No class directories found in {dataset_path}")

class_names = sorted([d.name for d in class_dirs])
class_to_idx = {name: idx for idx, name in enumerate(class_names)}

print(f"Found {len(class_names)} classes: {class_names}")

# Collect all images efficiently - just paths, not loading images
valid_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}

print("Scanning for image paths...")
for class_dir in class_dirs:
    class_name = class_dir.name
    class_idx = class_to_idx[class_name]
    
    image_files = [f for f in class_dir.iterdir() 
                   if f.suffix.lower() in valid_extensions]
    
    print(f"  {class_name}: {len(image_files)} images")
    
    for img_path in image_files:
        all_image_paths.append(str(img_path))
        all_labels.append(class_idx)

print(f"\nCollected {len(all_image_paths)} image paths")
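# Rough estimate: ~100 bytes per path vs. 128x128x3 float32 tensors if every image were preloaded
print(f"Memory usage: storing only paths (~{len(all_image_paths) * 100} bytes) instead of preloaded 128x128 float32 tensors (~{len(all_image_paths) * 128 * 128 * 3 * 4 / 1e9:.1f} GB)")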

# Convert labels to numpy array for consistency
all_labels = np.array(all_labels)

print(f"✅ Memory-efficient data loading completed")
print(f"Sample path: {all_image_paths[0]}")
print(f"Classes: {len(class_names)} total")
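
Because only paths and integer labels are held in memory, checking the class balance before splitting is cheap. The following is a minimal sketch of such a check, assuming labels are integers that index into `class_names`; the helper name `class_distribution` and the toy data are illustrative and not part of `ml_models_core`.

```python
from collections import Counter


def class_distribution(labels, class_names):
    """Return {class_name: count} for integer labels, including empty classes."""
    counts = Counter(int(label) for label in labels)
    return {name: counts.get(idx, 0) for idx, name in enumerate(class_names)}


# Toy data standing in for all_labels / class_names (hypothetical values)
toy_names = ["cat", "dog", "carrot"]
toy_labels = [0, 0, 1, 2, 2, 2]

distribution = class_distribution(toy_labels, toy_names)
for name, count in distribution.items():
    print(f"{name}: {count}")
```

In the notebook, the same call would take `all_labels` and `class_names` as arguments.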

In [None]:
# Define transforms with 128x128 image size for optimal balance
transform_train = transforms.Compose([
    transforms.Resize((128, 128)),  # Sweet spot between detail and model capacity
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

transform_val = transforms.Compose([
    transforms.Resize((128, 128)),  # Sweet spot between detail and model capacity
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Create memory-efficient dataset with image paths only
full_dataset = UnifiedDataset(all_image_paths, all_labels, class_names, transform=transform_train)

# Split dataset indices (not actual data)
train_size = int(0.7 * len(full_dataset))
val_size = int(0.15 * len(full_dataset))
test_size = len(full_dataset) - train_size - val_size

print(f"Dataset splits:")
print(f"  Total images: {len(full_dataset)}")
print(f"  Training: {train_size}")
print(f"  Validation: {val_size}")
print(f"  Test: {test_size}")

# Create index splits
indices = list(range(len(full_dataset)))
np.random.shuffle(indices)

train_indices = indices[:train_size]
val_indices = indices[train_size:train_size + val_size]
test_indices = indices[train_size + val_size:]

# Create separate dataset instances with appropriate transforms
train_dataset = UnifiedDataset(
    [all_image_paths[i] for i in train_indices],
    [all_labels[i] for i in train_indices],
    class_names,
    transform=transform_train
)

val_dataset = UnifiedDataset(
    [all_image_paths[i] for i in val_indices],
    [all_labels[i] for i in val_indices],
    class_names,
    transform=transform_val
)

test_dataset = UnifiedDataset(
    [all_image_paths[i] for i in test_indices],
    [all_labels[i] for i in test_indices],
    class_names,
    transform=transform_val
)

# Create data loaders
batch_size = 8  # Small batches keep memory use low; 128x128 inputs need less memory than 256x256
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=2)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=2)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=2)

print(f"\nDataLoaders created with batch_size={batch_size} for 128x128 images")
print(f"Training samples: {len(train_dataset)}")
print(f"Validation samples: {len(val_dataset)}")
print(f"Test samples: {len(test_dataset)}")
print(f"Number of classes: {len(class_names)}")

# Test loading a single batch to verify everything works
print(f"\nTesting data loading with 128x128 images...")
try:
    sample_batch = next(iter(train_loader))
    print(f"✅ Successfully loaded batch: {sample_batch[0].shape}, {sample_batch[1].shape}")
    print(f"✅ Memory-efficient loading working correctly with 128x128 images")
except Exception as e:
    print(f"❌ Error in data loading: {e}")
    
# Clear any unnecessary variables to free memory
import gc
gc.collect()
print(f"✅ Memory cleanup completed")
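
The split above operates on indices only: shuffle once, then slice into three contiguous ranges. The sketch below isolates that logic so it can be checked without any images. It is an illustration under assumptions: `split_indices` is a hypothetical helper, and it uses `random.Random` rather than NumPy so that it runs standalone, which means the resulting order differs from the notebook's `np.random.shuffle`.

```python
import random


def split_indices(n_samples, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle range(n_samples) and slice it into train/val/test index lists."""
    train_size = int(train_frac * n_samples)
    val_size = int(val_frac * n_samples)

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # local RNG, does not touch global state

    train = indices[:train_size]
    val = indices[train_size:train_size + val_size]
    test = indices[train_size + val_size:]  # remainder absorbs rounding
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Giving the remainder to the test set guarantees that the three parts always cover the whole dataset, whatever the rounding of the first two sizes.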

In [None]:
# Visualize sample images
def visualize_batch(data_loader, class_names, title="Sample Images"):
    """Visualize a batch of images."""
    data_iter = iter(data_loader)
    images, labels = next(data_iter)
    
    # Denormalize images for visualization
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    images_denorm = images * std + mean
    images_denorm = torch.clamp(images_denorm, 0, 1)
    
    fig, axes = plt.subplots(2, 4, figsize=(12, 6))
    axes = axes.ravel()
    
    for i in range(min(8, len(images))):
        img = images_denorm[i].permute(1, 2, 0)
        axes[i].imshow(img)
        axes[i].set_title(f'{class_names[labels[i]]}')
        axes[i].axis('off')
    
    plt.suptitle(title)
    plt.tight_layout()
    plt.show()

visualize_batch(train_loader, class_names, "Training Samples from Unified Dataset")
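
`visualize_batch` undoes `transforms.Normalize` before plotting: normalization computes `(x - mean) / std` per channel, so the inverse is `x * std + mean`, followed by a clamp to `[0, 1]`. As a small sketch of that round trip on a single RGB pixel (plain Python lists instead of tensors; the function names are illustrative):

```python
MEAN = [0.485, 0.456, 0.406]  # per-channel means used in the transforms above
STD = [0.229, 0.224, 0.225]   # per-channel standard deviations


def normalize(pixel):
    """Apply (x - mean) / std to each channel of one RGB pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, MEAN, STD)]


def denormalize(pixel):
    """Invert normalize() and clamp the result to the displayable range [0, 1]."""
    restored = [v * s + m for v, m, s in zip(pixel, MEAN, STD)]
    return [min(1.0, max(0.0, v)) for v in restored]


pixel = [0.2, 0.5, 0.9]
round_trip = denormalize(normalize(pixel))
print(round_trip)
```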

## Neural Network Architecture Design

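The custom CNN stacks five convolutional blocks, each applying convolution, batch normalization, ReLU, and 2x2 max pooling, followed by global average pooling, dropout, and a single linear classification layer. The number of classes, the number of input channels, and the dropout rate are constructor arguments.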

In [None]:
class DeepLearningV1(nn.Module):
    """Custom CNN architecture for image classification with 128x128 input."""
    
    def __init__(self, num_classes=70, input_channels=3, dropout_rate=0.5):
        super(DeepLearningV1, self).__init__()
        
        self.num_classes = num_classes
        self.input_channels = input_channels
        
        # Feature extraction layers - optimized for 128x128 input
        self.conv1 = nn.Conv2d(input_channels, 32, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        
        self.conv2 = nn.Conv2d(32, 64, kernel_size=6, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        
        self.conv3 = nn.Conv2d(64, 128, kernel_size=6, padding=1)
        self.bn3 = nn.BatchNorm2d(128)
        
        self.conv4 = nn.Conv2d(128, 256, kernel_size=6, padding=1)
        self.bn4 = nn.BatchNorm2d(256)
        
        self.conv5 = nn.Conv2d(256, 512, kernel_size=6, padding=1)
        self.bn5 = nn.BatchNorm2d(512)
        
        # Pooling and regularization
        self.pool = nn.MaxPool2d(2, 2)
        self.adaptive_pool = nn.AdaptiveAvgPool2d((1, 1))
        self.dropout = nn.Dropout(dropout_rate)
        
        # Classification head
        self.fc = nn.Linear(512, num_classes)
        
        # Initialize weights
        self._initialize_weights()
    
    def _initialize_weights(self):
        """Initialize model weights."""
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
    
    def forward(self, x):
        """Forward pass through the network."""
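        # Input: 3x128x128 (channels first)
        # Note: conv2-conv5 use kernel_size=6 with padding=1, so each of those
        # convolutions shrinks the feature map by 3 pixels before pooling halves it.
        
        # Block 1: 128x128 -> 64x64
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        
        # Block 2: 64x64 -> 61x61 -> 30x30
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        
        # Block 3: 30x30 -> 27x27 -> 13x13
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        
        # Block 4: 13x13 -> 10x10 -> 5x5
        x = self.pool(F.relu(self.bn4(self.conv4(x))))
        
        # Block 5: 5x5 -> 2x2 -> 1x1
        x = self.pool(F.relu(self.bn5(self.conv5(x))))
        
        # Global average pooling: already 1x1 for a 128x128 input (no-op here)
        x = self.adaptive_pool(x)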
        
        # Flatten and classify
        x = x.view(x.size(0), -1)
        x = self.dropout(x)
        x = self.fc(x)
        
        return x
    
    def get_feature_maps(self, x, layer_names=None):
        """Extract feature maps from intermediate layers."""
        features = {}
        
        # Block 1
        x1 = self.pool(F.relu(self.bn1(self.conv1(x))))
        features['block1'] = x1
        
        # Block 2
        x2 = self.pool(F.relu(self.bn2(self.conv2(x1))))
        features['block2'] = x2
        
        # Block 3
        x3 = self.pool(F.relu(self.bn3(self.conv3(x2))))
        features['block3'] = x3
        
        # Block 4
        x4 = self.pool(F.relu(self.bn4(self.conv4(x3))))
        features['block4'] = x4
        
        # Block 5
        x5 = self.pool(F.relu(self.bn5(self.conv5(x4))))
        features['block5'] = x5
        
        return features if layer_names is None else {k: v for k, v in features.items() if k in layer_names}
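
The spatial size after each block can be checked by hand with the standard output-size formula for convolution and pooling, `floor((size + 2 * padding - kernel) / stride) + 1`. The sketch below applies it to the kernel sizes and padding defined in `DeepLearningV1.__init__` (3, 6, 6, 6, 6 with `padding=1`, each followed by 2x2 max pooling). The helper names are illustrative.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output size of max pooling along one spatial dimension (no padding)."""
    return (size - kernel) // stride + 1


size = 128
kernels = [3, 6, 6, 6, 6]  # conv1..conv5 as defined in DeepLearningV1
sizes = []
for block, kernel in enumerate(kernels, start=1):
    after_conv = conv_out(size, kernel, padding=1)
    size = pool_out(after_conv)
    sizes.append(size)
    print(f"Block {block}: conv -> {after_conv}x{after_conv}, pool -> {size}x{size}")
```

For a 128x128 input this gives 64, 30, 13, 5, and 1, so the map is already 1x1 when it reaches the adaptive average pooling layer. Inputs much smaller than 128x128 would run out of spatial extent before block 5.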

In [None]:
# Create model instance with correct number of classes for 128x128 input
num_classes = len(class_names)
model = DeepLearningV1(num_classes=num_classes).to(device)

print(f"Model created for {num_classes} classes with 128x128 input")
print(f"Classes: {class_names}")

# Print model summary with correct input size
print("\nModel Architecture Summary:")
print(summary(model, input_size=(batch_size, 3, 128, 128), device=str(device)))

# Calculate total parameters
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"\nModel Parameters:")
print(f"  Total parameters: {total_params:,}")
print(f"  Trainable parameters: {trainable_params:,}")
print(f"  Model size: {total_params * 4 / 1024 / 1024:.1f} MB (32-bit)")

# Test forward pass with sample data
print(f"\nTesting forward pass...")
try:
    sample_batch = next(iter(train_loader))
    with torch.no_grad():
        sample_input = sample_batch[0].to(device)
        sample_output = model(sample_input)
        print(f"✅ Forward pass successful: {sample_input.shape} -> {sample_output.shape}")
except Exception as e:
    print(f"❌ Forward pass failed: {e}")
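
The parameter count reported by `summary` can be cross-checked by hand. A convolution with bias has `c_out * (c_in * k * k + 1)` parameters, a `BatchNorm2d` layer has `2 * c` learnable parameters (its running statistics are buffers and are not counted), and a linear layer has `n_out * (n_in + 1)`. The following sketch applies these formulas to the layer sizes of `DeepLearningV1`; the helper names are illustrative, and 70 classes is simply the constructor default.

```python
def conv_params(c_in, c_out, kernel):
    """Weights plus biases of a Conv2d layer with a square kernel."""
    return c_out * (c_in * kernel * kernel + 1)


def bn_params(channels):
    """Learnable scale and shift of a BatchNorm2d layer."""
    return 2 * channels


def linear_params(n_in, n_out):
    """Weights plus biases of a Linear layer."""
    return n_out * (n_in + 1)


def count_params(num_classes, input_channels=3):
    """Total learnable parameters for the DeepLearningV1 layer layout."""
    channels = [input_channels, 32, 64, 128, 256, 512]
    kernels = [3, 6, 6, 6, 6]
    total = 0
    for c_in, c_out, kernel in zip(channels[:-1], channels[1:], kernels):
        total += conv_params(c_in, c_out, kernel) + bn_params(c_out)
    total += linear_params(512, num_classes)
    return total


total = count_params(num_classes=70)
print(f"{total:,} parameters, about {total * 4 / 1024 / 1024:.1f} MB as float32")
```

Most of the parameters sit in `conv5`, whose 6x6 kernels over 256 input channels dominate the total.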

## Training Setup and Utilities

In [None]:
class TrainingManager:
    """Manages the training process for deep learning models with early stopping."""
    
    def __init__(self, model, device, class_names, patience=5, min_delta=0.001):
        self.model = model
        self.device = device
        self.class_names = class_names
        
        # Early stopping parameters
        self.patience = patience  # Number of epochs to wait for improvement
        self.min_delta = min_delta  # Minimum change to qualify as improvement
        self.wait = 0  # Counter for patience
        self.stopped_epoch = 0  # Track when early stopping occurred
        
        # Training history
        self.train_losses = []
        self.val_losses = []
        self.train_accuracies = []
        self.val_accuracies = []
        
        # Best model tracking
        self.best_val_loss = float('inf')
        self.best_val_accuracy = 0.0
        self.best_model_state = None
        
    def train_epoch(self, train_loader, criterion, optimizer):
        """Train for one epoch."""
        self.model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = data.to(self.device), target.to(self.device)
            
            optimizer.zero_grad()
            output = self.model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            
            running_loss += loss.item()
            _, predicted = torch.max(output.data, 1)
            total += target.size(0)
            correct += (predicted == target).sum().item()
            
            if batch_idx % 50 == 0:  # Reduce print frequency for many classes
                print(f'Batch {batch_idx}/{len(train_loader)}, Loss: {loss.item():.4f}')
        
        epoch_loss = running_loss / len(train_loader)
        epoch_accuracy = 100. * correct / total
        
        self.train_losses.append(epoch_loss)
        self.train_accuracies.append(epoch_accuracy)
        
        return epoch_loss, epoch_accuracy
    
    def validate_epoch(self, val_loader, criterion):
        """Validate for one epoch."""
        self.model.eval()
        running_loss = 0.0
        correct = 0
        total = 0
        
        with torch.no_grad():
            for data, target in val_loader:
                data, target = data.to(self.device), target.to(self.device)
                output = self.model(data)
                loss = criterion(output, target)
                
                running_loss += loss.item()
                _, predicted = torch.max(output.data, 1)
                total += target.size(0)
                correct += (predicted == target).sum().item()
        
        epoch_loss = running_loss / len(val_loader)
        epoch_accuracy = 100. * correct / total
        
        self.val_losses.append(epoch_loss)
        self.val_accuracies.append(epoch_accuracy)
        
        return epoch_loss, epoch_accuracy
    
    def check_early_stopping(self, val_loss, epoch):
        """Check if training should stop early based on validation loss."""
        # Check if validation loss improved
        if val_loss < self.best_val_loss - self.min_delta:
            self.best_val_loss = val_loss
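            # Clone the tensors: state_dict().copy() is shallow, so the saved
            # "best" weights would keep changing as training continues
            self.best_model_state = {k: v.detach().clone() for k, v in self.model.state_dict().items()}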
            self.wait = 0
            print(f"💚 Validation loss improved to {val_loss:.4f} - saving best model")
        else:
            self.wait += 1
            print(f"⚠️  Validation loss did not improve ({self.wait}/{self.patience})")
            
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                print(f"🛑 Early stopping triggered at epoch {epoch + 1}")
                print(f"   Best validation loss: {self.best_val_loss:.4f}")
                return True
        
        return False
    
    def train(self, train_loader, val_loader, criterion, optimizer, scheduler, num_epochs):
        """Full training loop with early stopping."""
        print(f"Starting training for up to {num_epochs} epochs...")
        print(f"Training on {len(self.class_names)} classes")
        print(f"Early stopping: patience={self.patience}, min_delta={self.min_delta}")
        start_time = time.time()
        
        for epoch in range(num_epochs):
            print(f"\nEpoch {epoch+1}/{num_epochs}")
            print("-" * 20)
            
            # Training
            train_loss, train_acc = self.train_epoch(train_loader, criterion, optimizer)
            
            # Validation
            val_loss, val_acc = self.validate_epoch(val_loader, criterion)
            
            # Update learning rate
            if scheduler:
                scheduler.step()
            
            print(f"Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}%")
            print(f"Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.2f}%")
            
            # Update best validation accuracy for tracking
            if val_acc > self.best_val_accuracy:
                self.best_val_accuracy = val_acc
            
            print(f"Best Val Acc: {self.best_val_accuracy:.2f}%")
            
            # Check for early stopping
            if self.check_early_stopping(val_loss, epoch):
                break
            
            # Plot progress every 5 epochs or if we're past epoch 10
            if (epoch + 1) % 5 == 0 or epoch >= 10:
                self.plot_training_progress()
        
        training_time = time.time() - start_time
        
        if self.stopped_epoch > 0:
            print(f"\n🏁 Training stopped early at epoch {self.stopped_epoch + 1}")
        else:
            print(f"\n✅ Training completed all {num_epochs} epochs")
            
        print(f"Training time: {training_time:.2f} seconds")
        print(f"Best validation loss: {self.best_val_loss:.4f}")
        print(f"Best validation accuracy: {self.best_val_accuracy:.2f}%")
        
        # Load best model
        if self.best_model_state is not None:
            self.model.load_state_dict(self.best_model_state)
            print("✅ Loaded best model state")
        
        return self.best_val_accuracy
    
    def plot_training_progress(self):
        """Plot training progress with overfitting indicators."""
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))
        
        epochs = range(1, len(self.train_losses) + 1)
        
        # Loss plot
        ax1.plot(epochs, self.train_losses, 'b-', label='Training Loss', linewidth=2)
        ax1.plot(epochs, self.val_losses, 'r-', label='Validation Loss', linewidth=2)
        
        # Highlight potential overfitting
        if len(self.train_losses) > 5:
            # Check if validation loss is increasing while training loss decreases
            recent_train = self.train_losses[-3:]
            recent_val = self.val_losses[-3:]
            
            if len(recent_train) >= 3 and len(recent_val) >= 3:
                train_trend = recent_train[-1] - recent_train[0]
                val_trend = recent_val[-1] - recent_val[0]
                
                if train_trend < 0 and val_trend > 0:  # Train decreasing, val increasing
                    ax1.axvspan(len(epochs)-2, len(epochs), alpha=0.3, color='orange', 
                               label='Potential Overfitting')
        
        # Mark early stopping point if it occurred (drawn before legend() so its label is listed)
        if self.stopped_epoch > 0:
            ax1.axvline(x=self.stopped_epoch + 1, color='red', linestyle='--', 
                       label=f'Early Stop (Epoch {self.stopped_epoch + 1})')
        
        ax1.set_title('Training and Validation Loss')
        ax1.set_xlabel('Epoch')
        ax1.set_ylabel('Loss')
        ax1.legend()
        ax1.grid(True, alpha=0.3)
        
        # Accuracy plot
        ax2.plot(epochs, self.train_accuracies, 'b-', label='Training Accuracy', linewidth=2)
        ax2.plot(epochs, self.val_accuracies, 'r-', label='Validation Accuracy', linewidth=2)
        # Mark early stopping point if it occurred (drawn before legend() so its label is listed)
        if self.stopped_epoch > 0:
            ax2.axvline(x=self.stopped_epoch + 1, color='red', linestyle='--', 
                       label=f'Early Stop (Epoch {self.stopped_epoch + 1})')
        
        ax2.set_title('Training and Validation Accuracy')
        ax2.set_xlabel('Epoch')
        ax2.set_ylabel('Accuracy (%)')
        ax2.legend()
        ax2.grid(True, alpha=0.3)
        
        plt.tight_layout()
        plt.show()
        
        # Print overfitting analysis
        if len(self.train_losses) > 5:
            print("\n📊 Overfitting Analysis:")
            print(f"   Current train loss: {self.train_losses[-1]:.4f}")
            print(f"   Current val loss: {self.val_losses[-1]:.4f}")
            print(f"   Loss gap: {self.val_losses[-1] - self.train_losses[-1]:.4f}")
            
            if self.val_losses[-1] > self.train_losses[-1] + 0.1:
                print("   ⚠️  Large gap suggests possible overfitting")
            else:
                print("   ✅ Loss gap looks healthy")
    
    def evaluate_model(self, test_loader):
        """Evaluate model on test set."""
        self.model.eval()
        correct = 0
        total = 0
        all_predictions = []
        all_targets = []
        
        with torch.no_grad():
            for data, target in test_loader:
                data, target = data.to(self.device), target.to(self.device)
                output = self.model(data)
                _, predicted = torch.max(output.data, 1)
                
                total += target.size(0)
                correct += (predicted == target).sum().item()
                
                all_predictions.extend(predicted.cpu().numpy())
                all_targets.extend(target.cpu().numpy())
        
        test_accuracy = 100. * correct / total
        
        print(f"\n🎯 Test Accuracy: {test_accuracy:.2f}%")
        
        # Classification report (truncated for many classes)
        print("\nClassification Report (first 10 classes):")
        unique_classes = sorted(list(set(all_targets)))
        display_classes = unique_classes[:10]
        
        if len(display_classes) < len(unique_classes):
            print(f"Note: Showing first 10 of {len(unique_classes)} classes")
        
        print(classification_report(all_targets, all_predictions, 
                                  target_names=[self.class_names[i] for i in display_classes],
                                  labels=display_classes))
        
        # Confusion matrix (for manageable number of classes)
        if len(self.class_names) <= 15:
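            # Pass labels explicitly so the matrix matches the tick labels even if
            # some class is missing from the test split
            cm = confusion_matrix(all_targets, all_predictions,
                                  labels=list(range(len(self.class_names))))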
            
            plt.figure(figsize=(12, 10))
            sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                       xticklabels=self.class_names, yticklabels=self.class_names)
            plt.title('Confusion Matrix')
            plt.xlabel('Predicted')
            plt.ylabel('Actual')
            plt.xticks(rotation=45, ha='right')
            plt.yticks(rotation=0)
            plt.tight_layout()
            plt.show()
        else:
            print(f"Confusion matrix skipped (too many classes: {len(self.class_names)})")
        
        return test_accuracy, all_predictions, all_targets
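
The early stopping rule in `check_early_stopping` is independent of PyTorch: a validation loss counts as an improvement only if it beats the best loss so far by more than `min_delta`, and training stops after `patience` consecutive epochs without improvement. The sketch below reproduces just that rule on a made-up loss sequence. `EarlyStopper` is a hypothetical stand-in, not the `TrainingManager` class itself, and it leaves out model checkpointing.

```python
class EarlyStopper:
    """Tracks validation loss and signals when patience is exhausted."""

    def __init__(self, patience=5, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


# Made-up validation losses: the 0.7995 epoch improves by less than min_delta
losses = [1.00, 0.80, 0.7995, 0.81, 0.82, 0.70]
stopper = EarlyStopper(patience=3, min_delta=0.001)

stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(f"Stopped at epoch index {stopped_at}, best loss {stopper.best}")
```

The 0.7995 epoch shows the effect of `min_delta`: it is lower than 0.80, but not by enough to reset the patience counter, so training stops before the final 0.70 is ever seen.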

## Model Training

In [None]:
# Training setup
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

# Create training manager with early stopping
# patience=5 means stop if validation loss doesn't improve for 5 epochs
# min_delta=0.001 means improvement must be at least 0.001 to count
trainer = TrainingManager(model, device, class_names, patience=5, min_delta=0.001)

# Train the model with early stopping
num_epochs = 30  # Set higher limit since early stopping will prevent overfitting
print(f"🚀 Starting training with early stopping monitoring...")
print(f"   - Will stop if validation loss doesn't improve for {trainer.patience} epochs")
print(f"   - Minimum improvement threshold: {trainer.min_delta}")

best_val_accuracy = trainer.train(
    train_loader, val_loader, criterion, optimizer, scheduler, num_epochs
)

# Final training progress plot
print(f"\n📈 Final Training Results:")
trainer.plot_training_progress()
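
With `StepLR(optimizer, step_size=10, gamma=0.1)` and one `scheduler.step()` per epoch, the learning rate follows the closed form `base_lr * gamma ** (epoch // step_size)`. As a quick sketch of the schedule over the 30-epoch budget (the function name `step_lr` is illustrative, and `epoch` counts completed scheduler steps):

```python
def step_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Learning rate after `epoch` scheduler steps under a step-decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


schedule = [step_lr(0.001, epoch) for epoch in range(30)]
for epoch in (0, 9, 10, 19, 20, 29):
    print(f"epoch {epoch + 1:2d}: lr = {schedule[epoch]:.0e}")
```

Epochs 1 to 10 therefore train at 1e-3, epochs 11 to 20 at 1e-4, and epochs 21 to 30 at 1e-5, unless early stopping ends the run first.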

## Model Evaluation

In [None]:
# Evaluate on test set
test_accuracy, predictions, targets = trainer.evaluate_model(test_loader)

In [None]:
# Visualize some test predictions
def visualize_predictions(model, test_loader, class_names, device, num_samples=8):
    """Visualize model predictions on test samples."""
    model.eval()
    
    # Get a batch of test data
    data_iter = iter(test_loader)
    images, labels = next(data_iter)
    
    # Make predictions
    with torch.no_grad():
        images_gpu = images.to(device)
        outputs = model(images_gpu)
        probabilities = F.softmax(outputs, dim=1).cpu()  # back on the CPU, like `labels`
        predicted = probabilities.argmax(dim=1)
    
    # Denormalize images for visualization
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    images_denorm = images * std + mean
    images_denorm = torch.clamp(images_denorm, 0, 1)
    
    # Plot predictions
    fig, axes = plt.subplots(2, 4, figsize=(16, 8))
    axes = axes.ravel()
    
    for i in range(min(num_samples, len(images))):
        img = images_denorm[i].permute(1, 2, 0)
        true_label = class_names[labels[i]]
        pred_label = class_names[predicted[i]]
        confidence = probabilities[i][predicted[i]].item()
        
        axes[i].imshow(img)
        
        # Color based on correctness
        color = 'green' if labels[i] == predicted[i] else 'red'
        
        axes[i].set_title(
            f'True: {true_label}\nPred: {pred_label}\nConf: {confidence:.2f}',
            color=color
        )
        axes[i].axis('off')
    
    plt.suptitle('Test Predictions (Green=Correct, Red=Incorrect)', fontsize=16)
    plt.tight_layout()
    plt.show()

visualize_predictions(model, test_loader, full_dataset.class_names, device)
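
The "Conf" value shown in each plot title is the softmax probability of the predicted class. Softmax maps the raw logits to positive values that sum to 1, and the prediction is the index of the largest one. A minimal pure-Python sketch with made-up logits for a three-class problem:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.1]  # made-up model outputs
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
confidence = probs[predicted]

print(f"predicted class index: {predicted}, confidence: {confidence:.2f}")
```

A high softmax value only says that one logit dominates the others; it is not a calibrated probability of being correct.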

## Model Analysis and Insights

In [None]:
# Analyze model performance by class
def analyze_per_class_performance(targets, predictions, class_names):
    """Analyze performance for each class."""
    from sklearn.metrics import precision_recall_fscore_support
    
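    # zero_division=0 reports 0 (without a warning) for classes that were never predicted
    precision, recall, f1, support = precision_recall_fscore_support(
        targets, predictions, average=None,
        labels=list(range(len(class_names))), zero_division=0
    )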
    
    # Create DataFrame for easy visualization
    import pandas as pd
    
    performance_df = pd.DataFrame({
        'Class': class_names,
        'Precision': precision,
        'Recall': recall,
        'F1-Score': f1,
        'Support': support
    })
    
    print("Per-Class Performance:")
    print(performance_df.round(3))
    
    # Plot metrics
    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
    
    metrics = ['Precision', 'Recall', 'F1-Score']
    for i, metric in enumerate(metrics):
        axes[i].bar(class_names, performance_df[metric])
        axes[i].set_title(f'{metric} by Class')
        axes[i].set_ylabel(metric)
        axes[i].set_ylim(0, 1.1)
        
        # Add value labels on bars
        for j, v in enumerate(performance_df[metric]):
            axes[i].text(j, v + 0.02, f'{v:.3f}', ha='center')
    
    plt.tight_layout()
    plt.show()
    
    return performance_df

performance_df = analyze_per_class_performance(targets, predictions, full_dataset.class_names)
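
The per-class numbers come from simple counts: for a class `c`, precision is `TP / (TP + FP)`, recall is `TP / (TP + FN)`, and F1 is their harmonic mean. The sketch below computes them by hand on a tiny made-up example, as a way to sanity-check the table. `per_class_metrics` is an illustrative helper rather than the scikit-learn implementation, and it returns 0.0 where a denominator is zero.

```python
def per_class_metrics(targets, predictions, num_classes):
    """Return a list of (precision, recall, f1, support) tuples, one per class."""
    rows = []
    for cls in range(num_classes):
        pairs = list(zip(targets, predictions))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((precision, recall, f1, tp + fn))
    return rows


# Made-up labels for three classes
toy_targets = [0, 0, 1, 1, 2, 2]
toy_predictions = [0, 1, 1, 1, 2, 0]

metrics = per_class_metrics(toy_targets, toy_predictions, num_classes=3)
for cls, (precision, recall, f1, support) in enumerate(metrics):
    print(f"class {cls}: P={precision:.3f} R={recall:.3f} F1={f1:.3f} n={support}")
```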

In [None]:
# Feature visualization - show what the model learned
def visualize_conv_filters(model, layer_name='conv1'):
    """Visualize convolutional filters."""
    # Get the layer
    layer = getattr(model, layer_name)
    filters = layer.weight.data.cpu()
    
    # Normalize filters for visualization
    filters = (filters - filters.min()) / (filters.max() - filters.min())
    
    # Plot first 16 filters
    fig, axes = plt.subplots(4, 4, figsize=(12, 12))
    axes = axes.ravel()
    
    for i in range(min(16, filters.shape[0])):
        # Convert filter to displayable format
        filter_img = filters[i].permute(1, 2, 0)
        
        if filter_img.shape[2] == 3:  # RGB filter
            axes[i].imshow(filter_img)
        else:  # Single channel
            axes[i].imshow(filter_img[:, :, 0], cmap='gray')
        
        axes[i].set_title(f'Filter {i+1}')
        axes[i].axis('off')
    
    plt.suptitle(f'Learned Filters in {layer_name}', fontsize=16)
    plt.tight_layout()
    plt.show()

# Visualize first layer filters
visualize_conv_filters(model, 'conv1')

## Model Integration with Core Framework

In [None]:
class DeepLearningV1Classifier(BaseImageClassifier):
    """Deep Learning v1 classifier implementing the base interface."""
    
    def __init__(self, model_name="deep-learning-v1", version="1.0.0"):
        super().__init__(model_name, version)
        self.model = None
        self.class_names = None
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.transform = transforms.Compose([
            transforms.Resize((128, 128)),  # Updated to 128x128 sweet spot
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        ])
    
    def load_model(self, model_path: str) -> None:
        """Load the trained model."""
        checkpoint = torch.load(model_path, map_location=self.device)
        
        # Recreate model architecture
        num_classes = len(checkpoint['class_names'])
        self.model = DeepLearningV1(num_classes=num_classes)
        self.model.load_state_dict(checkpoint['model_state_dict'])
        self.model.to(self.device)
        self.model.eval()
        
        self.class_names = checkpoint['class_names']
        self._is_loaded = True
    
    def preprocess(self, image: np.ndarray) -> torch.Tensor:
        """Convert an HxWxC numpy image into a normalized CxHxW tensor."""
        # Convert numpy array to PIL Image; float inputs are assumed to lie in [0, 1]
        if image.dtype != np.uint8:
            image = (np.clip(image, 0.0, 1.0) * 255).astype(np.uint8)
        
        pil_image = Image.fromarray(image)
        if pil_image.mode != 'RGB':
            pil_image = pil_image.convert('RGB')
        
        # Apply transforms
        tensor_image = self.transform(pil_image)
        
        return tensor_image
    
    def predict(self, image: np.ndarray) -> Dict[str, float]:
        """Make predictions on input image."""
        if not self.is_loaded:
            raise ValueError("Model not loaded. Call load_model() first.")
        
        # Preprocess image
        tensor_image = self.preprocess(image)
        tensor_image = tensor_image.unsqueeze(0).to(self.device)  # Add batch dimension
        
        # Make prediction
        with torch.no_grad():
            outputs = self.model(tensor_image)
            probabilities = F.softmax(outputs, dim=1)
        
        # Convert to class name mapping
        predictions = {}
        for i, prob in enumerate(probabilities[0]):
            predictions[self.class_names[i]] = float(prob.cpu())
        
        return predictions
    
    def get_metadata(self) -> Dict[str, Any]:
        """Get model metadata."""
        return {
            "model_type": "deep_learning_v1",
            "architecture": "Custom CNN with 5 conv layers",
            "input_size": "128x128x3",  # Updated input size
            "classes": self.class_names,
            "parameters": sum(p.numel() for p in self.model.parameters()) if self.model else 0,
            "device": str(self.device),
            "version": self.version
        }
    
    def save_model(self, model_path: str, model, class_names, accuracy, training_history):
        """Save the trained model."""
        checkpoint = {
            'model_state_dict': model.state_dict(),
            'class_names': class_names,
            'accuracy': accuracy,
            'training_history': training_history,
            'model_config': {
                'num_classes': len(class_names),
                'input_size': (128, 128),  # Updated input size
                'architecture': 'DeepLearningV1'
            }
        }
        
        torch.save(checkpoint, model_path)
        print(f"Model saved to {model_path}")

In [None]:
# Save the trained model
deep_v1_classifier = DeepLearningV1Classifier()

# Prepare training history
training_history = {
    'train_losses': trainer.train_losses,
    'val_losses': trainer.val_losses,
    'train_accuracies': trainer.train_accuracies,
    'val_accuracies': trainer.val_accuracies
}

# Save model
model_path = "../models/deep_v1_classifier.pth"
os.makedirs("../models", exist_ok=True)
deep_v1_classifier.save_model(
    model_path, model, class_names, test_accuracy, training_history
)

# Test the saved model
test_classifier = DeepLearningV1Classifier()
test_classifier.load_model(model_path)

# Test prediction on a sample image
sample_batch = next(iter(test_loader))
sample_image = sample_batch[0][0]  # Get first image from batch
sample_label = sample_batch[1][0]  # Get corresponding label

# Convert tensor back to numpy for prediction
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
sample_image_denorm = sample_image * std + mean
sample_image_denorm = torch.clamp(sample_image_denorm, 0, 1)
sample_image_np = (sample_image_denorm.permute(1, 2, 0).numpy() * 255).astype(np.uint8)

predictions = test_classifier.predict(sample_image_np)
print(f"\nSample prediction: {predictions}")
print(f"Actual class: {class_names[sample_label]}")

# Register model in registry
registry = ModelRegistry()
metadata = ModelMetadata(
    name="deep-learning-v1",
    version="1.0.0",
    model_type="deep_v1",
    accuracy=test_accuracy / 100.0,  # Convert percentage to decimal
    training_date=time.strftime("%Y-%m-%d"),  # Record the actual registration date
    model_path=model_path,
    config={
        "architecture": "Custom CNN",
        "num_classes": len(class_names),
        "input_size": "128x128x3",
        "epochs_trained": num_epochs,
        "optimizer": "Adam",
        "learning_rate": 0.001
    },
    performance_metrics={
        "test_accuracy": test_accuracy / 100.0,
        "best_val_accuracy": best_val_accuracy / 100.0,
        "final_train_loss": trainer.train_losses[-1],
        "final_val_loss": trainer.val_losses[-1]
    }
)

registry.register_model(metadata)
print(f"\nModel registered with test accuracy: {test_accuracy:.2f}%")
print(f"Total classes trained on: {len(class_names)}")

## Model Comparison and Analysis

In [None]:
# Compare with shallow learning if available
def compare_with_shallow_learning():
    """Compare performance with shallow learning baseline."""
    try:
        # Try to load shallow learning results for comparison
        shallow_registry = registry.get_model("shallow-classifier")
        
        if shallow_registry:
            shallow_accuracy = shallow_registry.accuracy * 100
            deep_accuracy = test_accuracy
            
            print("\nModel Comparison:")
            print(f"Shallow Learning Accuracy: {shallow_accuracy:.2f}%")
            print(f"Deep Learning v1 Accuracy: {deep_accuracy:.2f}%")
            print(f"Improvement: {deep_accuracy - shallow_accuracy:.2f} percentage points")
            
            # Plot comparison
            models = ['Shallow Learning', 'Deep Learning v1']
            accuracies = [shallow_accuracy, deep_accuracy]
            
            plt.figure(figsize=(8, 6))
            bars = plt.bar(models, accuracies, color=['skyblue', 'lightcoral'])
            plt.title('Model Performance Comparison')
            plt.ylabel('Accuracy (%)')
            plt.ylim(0, 100)
            
            # Add value labels on bars
            for bar, acc in zip(bars, accuracies):
                plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1,
                        f'{acc:.1f}%', ha='center', va='bottom')
            
            plt.tight_layout()
            plt.show()
        else:
            print("Shallow learning model not found for comparison.")
            
    except Exception as e:
        print(f"Could not compare with shallow learning: {e}")

compare_with_shallow_learning()

## Summary and Insights

### Model Architecture:
- **Custom CNN**: 5 convolutional layers with batch normalization (optimized for 128x128 input)
- **Feature Progression**: 3 → 32 → 64 → 128 → 256 → 512 channels
- **Input Resolution**: 128x128x3 (sweet spot between detail and model capacity)
- **Regularization**: Dropout, batch normalization, data augmentation
- **Global Average Pooling**: Reduces overfitting compared to fully connected layers
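
The channel progression above can be traced with a small shape calculation. This is only a sketch: it assumes each of the five conv blocks keeps the spatial size in its convolution and then halves it with 2x2 pooling, which the summary does not state explicitly. The helper name `feature_map_shapes` is hypothetical.

```python
# Channel progression taken from the architecture summary.
CHANNELS = [3, 32, 64, 128, 256, 512]
INPUT_SIZE = 128


def feature_map_shapes(input_size, channels, pool=2):
    """Return (channels, height, width) for the input and after each conv block.

    Assumption: every block halves the spatial size via pool x pool pooling.
    """
    shapes = [(channels[0], input_size, input_size)]
    size = input_size
    for out_channels in channels[1:]:
        size //= pool
        shapes.append((out_channels, size, size))
    return shapes


shapes = feature_map_shapes(INPUT_SIZE, CHANNELS)
for c, h, w in shapes:
    print(f"{c:>3} x {h:>3} x {w:>3}")

# Global average pooling collapses the final H x W grid to one value per channel,
# so the classifier head receives one value per output channel.
gap_vector_length = shapes[-1][0]
```

Under these assumptions the final feature map is 512 x 4 x 4, and global average pooling reduces it to a 512-element vector, far fewer inputs than flattening the full map into a fully connected layer.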

### Training Strategy:
- **Optimal Resolution**: 128x128 balances detail capture with computational efficiency
- **Memory-Efficient Loading**: On-demand image loading to handle large datasets
- **Data Augmentation**: Random flips, rotations, color jittering
- **Optimization**: Adam optimizer with learning rate scheduling
- **Early Stopping**: Based on validation accuracy
- **Monitoring**: Real-time loss and accuracy tracking
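
Early stopping on validation accuracy can be sketched as follows. The `EarlyStopping` class, its `patience` and `min_delta` parameters, and the accuracy values are illustrative assumptions, not the trainer used in this notebook.

```python
class EarlyStopping:
    """Signal a stop when validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_accuracy = float("-inf")
        self.epochs_without_improvement = 0

    def step(self, val_accuracy):
        """Record one epoch and return True if training should stop."""
        if val_accuracy > self.best_accuracy + self.min_delta:
            self.best_accuracy = val_accuracy
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation accuracies (percent), for illustration only.
history = [52.0, 58.5, 61.0, 60.8, 60.9, 60.5]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, acc in enumerate(history, start=1):
    if stopper.step(acc):
        stopped_at = epoch
        break

print(f"Stopped at epoch {stopped_at}, best accuracy {stopper.best_accuracy:.1f}%")
```

In practice the checkpoint saved at the best epoch is the one to keep, since the final epochs before stopping are by definition no better.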

### Resolution Analysis Results:
1. **64x64**: Performed best in initial runs, likely because the lower resolution makes the learning task simpler
2. **256x256**: More detail than the current model capacity can exploit; prone to overfitting
3. **128x128**: Expected sweet spot, with enough detail to separate classes without overwhelming the model

### Key Benefits of 128x128:
- **Balanced Complexity**: 4x as many pixels as 64x64, yet still manageable to train
- **Better Memory Usage**: A quarter of the pixels of 256x256, which leaves room for larger batch sizes
- **Optimal Learning**: Sufficient detail for discrimination without overfitting
- **Faster Training**: Quicker than 256x256 while maintaining good accuracy

### Memory Considerations:
- **Image Size**: 128x128 = 16,384 pixels, 4x as many as 64x64 (4,096)
- **Batch Size**: Can maintain batch_size=8 with good GPU memory usage
- **Model Complexity**: 5 layers handle 128x128 efficiently
- **Training Speed**: Good balance between speed and accuracy
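
The pixel and memory figures above follow from simple arithmetic. The sketch below counts only the float32 input batch; intermediate activations and gradients usually dominate GPU memory and are not modeled here. The helper name `input_batch_bytes` is hypothetical.

```python
BYTES_PER_FLOAT32 = 4


def input_batch_bytes(batch_size, side, channels=3):
    """Bytes needed to hold one float32 input batch of square images."""
    return batch_size * channels * side * side * BYTES_PER_FLOAT32


for side in (64, 128, 256):
    pixels = side * side
    mib = input_batch_bytes(8, side) / 2**20
    print(f"{side}x{side}: {pixels:>6} pixels, batch of 8 = {mib:.3f} MiB")
```

Each doubling of the side length quadruples both the pixel count and the input memory, which is why 256x256 costs 16x as much as 64x64 at the same batch size.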

### Expected Performance:
- **Better than 64x64**: More detail for fine-grained classification
- **Better than 256x256**: Avoids overfitting and excessive computational load
- **Optimal Training**: Model capacity matches input complexity
- **Good Convergence**: Expected stable training with good final accuracy

### Production Readiness:
- Model integrated with core framework
- Saved in portable format for deployment
- Compatible with ensemble classifier
- Ready for API integration with 128x128 input standardization
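
Input standardization for an API amounts to resizing to 128x128 and applying the same per-channel normalization used in training. The sketch below shows only the normalization arithmetic on a single pixel, using the mean and standard deviation values from the classifier's transform; the function names are hypothetical.

```python
# Per-channel statistics used by the classifier's Normalize transform.
CHANNEL_MEAN = (0.485, 0.456, 0.406)
CHANNEL_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Map an 8-bit RGB pixel to normalized model input values."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, CHANNEL_MEAN, CHANNEL_STD)
    )


def denormalize_pixel(values):
    """Invert normalize_pixel, rounding back to 8-bit values."""
    return tuple(
        round((value * std + mean) * 255.0)
        for value, mean, std in zip(values, CHANNEL_MEAN, CHANNEL_STD)
    )


pixel = (128, 64, 255)
normalized = normalize_pixel(pixel)
restored = denormalize_pixel(normalized)
print(normalized, restored)
```

A client that sends raw 8-bit images can rely on the classifier's `preprocess` step for this; the round trip here mirrors the denormalization used when testing the saved model.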

### Next Steps:
1. **Performance Validation**: Confirm 128x128 outperforms both 64x64 and 256x256
2. **Hyperparameter Tuning**: Optimize learning rate and batch size for 128x128
3. **Architecture Refinement**: Consider adding skip connections if needed
4. **Ensemble Integration**: Combine with other models for maximum accuracy
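
One simple way to approach the ensemble step is soft voting: averaging the per-class probability dictionaries that each model's `predict` returns. This is a minimal sketch, not the project's ensemble classifier; `average_predictions` is a hypothetical helper and the probabilities are made up.

```python
def average_predictions(prediction_dicts):
    """Average per-class probabilities across models (soft voting).

    Assumes every dict uses the same class names as keys.
    """
    if not prediction_dicts:
        raise ValueError("Need at least one prediction dict.")
    classes = prediction_dicts[0].keys()
    count = len(prediction_dicts)
    return {name: sum(p[name] for p in prediction_dicts) / count for name in classes}


# Made-up outputs from two models, for illustration only.
deep_v1_output = {"cat": 0.7, "dog": 0.2, "bird": 0.1}
shallow_output = {"cat": 0.4, "dog": 0.5, "bird": 0.1}

combined = average_predictions([deep_v1_output, shallow_output])
top_class = max(combined, key=combined.get)
print(combined, top_class)
```

Weighted averaging, for example by each model's validation accuracy, is a common refinement once the individual models have been validated.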