# Lab 03: Model Architecture Comparison

In this notebook, we'll systematically compare different model architectures (EfficientNet-B0 vs B2) to understand the **performance-efficiency trade-off** and select the best model for different deployment scenarios.

## Why This Lab Matters

Choosing the right model architecture is crucial:
- **Mobile apps** need small, fast models
- **Cloud services** can use larger, more accurate models
- **Edge devices** have strict resource constraints

**The key question:** *"Is the extra accuracy worth the computational cost?"*

## High-Level Workflow

![Model Comparison Workflow](https://raw.githubusercontent.com/poridhiEng/lab-asset/refs/heads/main/tensorcode/Deep-learning-with-pytorch/Experiment-Tracking/Tensorboard/lab_03/images/image1.svg)

### Workflow Phases

| Phase | Focus | What Happens |
|-------|-------|--------------|
| **Phase 1: Setup** | Prepare environment and data | Install packages, download dataset, create DataLoaders |
| **Phase 2: Experiments** | Train and track models | Run 4 experiments (B0/B2 × 5/10 epochs), log to TensorBoard |
| **Phase 3: Analysis** | Compare and decide | Visualize results, analyze trade-offs, create decision framework |

## Our Experiment Design

| Experiment | Model | Epochs | Purpose |
|------------|-------|--------|---------|
| 1 | EfficientNet-B0 | 5 | Fastest baseline |
| 2 | EfficientNet-B0 | 10 | Can longer training help? |
| 3 | EfficientNet-B2 | 5 | Larger model, quick training |
| 4 | EfficientNet-B2 | 10 | Maximum performance |
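
The 2×2 design above can also be written down as a small grid. The sketch below is one possible way to enumerate it. The run names mirror the TensorBoard log directories used later in this notebook, while `experiment_grid` and the config keys are hypothetical helpers introduced only for illustration.

```python
from itertools import product

MODELS = ["effnetb0", "effnetb2"]  # short names, matching the log directories used later
EPOCHS = [5, 10]

def experiment_grid(models=MODELS, epochs=EPOCHS):
    """Return one config dict per (model, epochs) combination."""
    return [
        {
            "id": i,
            "model": model,
            "epochs": n_epochs,
            "log_dir": f"runs/model_comparison/{model}_{n_epochs}epochs",
        }
        for i, (model, n_epochs) in enumerate(product(models, epochs), start=1)
    ]

for cfg in experiment_grid():
    print(cfg["id"], cfg["model"], cfg["epochs"], cfg["log_dir"])
```

The ordering produced by `product` matches the table: both B0 runs first, then both B2 runs.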

## Part 1: Setup and Imports

**Phase 1 → Setup**

Let's begin by installing the required packages and importing libraries. The first command installs the CPU-only build of PyTorch; if you have a CUDA GPU, use the install command from pytorch.org for your CUDA version instead.

In [None]:
!pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
!pip install tensorboard matplotlib pandas seaborn tqdm requests -q

### Step 1.1: Import Libraries

Import PyTorch, torchvision, and other utilities for model training and visualization.

In [None]:
import torch
from torch import nn
import torchvision
from torchvision import transforms, datasets
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
from tqdm.auto import tqdm
from typing import Dict, List, Tuple
from pathlib import Path
import requests
import zipfile
from datetime import datetime
import os
import time

# Set style for better plots
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

print(f"PyTorch version: {torch.__version__}")
print(f"Torchvision version: {torchvision.__version__}")

### Step 1.2: Setup Device and Seeds

Configure the device (CPU/GPU) and set random seeds for reproducible experiments.

In [None]:
# Device setup
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# Set seed for reproducibility
def set_seed(seed: int = 42):
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    np.random.seed(seed)

set_seed(42)
print("Seeds set for reproducibility")
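
The `set_seed` helper above seeds PyTorch and NumPy. A slightly broader variant, sketched here as an assumption rather than a requirement of this lab, also seeds Python's built-in `random` module and every CUDA device. The name `set_seed_all` is hypothetical, and the imports are guarded so the snippet runs even where NumPy or PyTorch is missing.

```python
import random

def set_seed_all(seed: int = 42) -> None:
    """Seed every random number generator that is available in this environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    except ImportError:
        pass  # PyTorch not installed; nothing to seed

set_seed_all(42)
first = random.random()
set_seed_all(42)
second = random.random()
print(first == second)
```

Even with all seeds fixed, some GPU kernels can remain non-deterministic, so small run-to-run differences on CUDA are still possible.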

## Part 2: Understanding the Models

Before we run experiments, let's understand what we're comparing.

### Model Architecture Comparison

| Aspect | EfficientNet-B0 | EfficientNet-B2 | Difference |
|--------|-----------------|-----------------|------------|
| **Parameters** | 5.3M | 9.2M | 1.7x larger |
| **FLOPs** | 0.39B | 1.0B | 2.6x more compute |
| **Input Size** | 224×224 | 260×260 | ~1.35x more pixels |
| **Accuracy (ImageNet)** | 77.1% | 80.1% | +3.0 points |
| **Speed** | Fast | Moderate | ~2x slower |
| **Memory** | ~20MB | ~35MB | 1.8x more |

### When to Use Each Model

**EfficientNet-B0** (Small & Fast): Mobile apps, edge devices, real-time applications

**EfficientNet-B2** (Large & Accurate): Cloud deployment, batch processing, research

**The trade-off:** B2 has about 1.7x the parameters and 2.6x the FLOPs of B0, but is only about 3 percentage points more accurate on ImageNet (80.1% vs 77.1%). Is this worth it for your use case? Let's find out!
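
To make the ratios in the comparison table concrete, here is a small sketch that recomputes them. The figures are copied from the table above; the dictionary layout and variable names are only illustrative.

```python
# Figures copied from the comparison table above.
b0 = {"params_m": 5.3, "flops_b": 0.39, "input_px": 224, "top1": 77.1}
b2 = {"params_m": 9.2, "flops_b": 1.0, "input_px": 260, "top1": 80.1}

param_ratio = b2["params_m"] / b0["params_m"]
flops_ratio = b2["flops_b"] / b0["flops_b"]
side_ratio = b2["input_px"] / b0["input_px"]
pixel_ratio = side_ratio ** 2          # images are square, so area scales with side^2
acc_gain = b2["top1"] - b0["top1"]     # difference in percentage points

print(f"Parameters:  {param_ratio:.2f}x")
print(f"FLOPs:       {flops_ratio:.2f}x")
print(f"Input side:  {side_ratio:.2f}x, pixels: {pixel_ratio:.2f}x")
print(f"Accuracy:    +{acc_gain:.1f} points")
```

Note that the input is only about 1.16x larger per side; the larger figure refers to pixel count, which is what drives convolution cost.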

## Part 3: Download and Prepare Data

**Phase 1 → Setup**

We'll use the same food classification dataset (pizza, steak, sushi) for fair comparison across all models.

### Step 3.1: Download Dataset

Download the pizza/steak/sushi dataset. This dataset has 225 training images and 75 test images across 3 classes.

In [None]:
def download_dataset() -> Path:
    """Download the pizza_steak_sushi dataset."""
    data_path = Path("data/")
    image_path = data_path / "pizza_steak_sushi"
    
    if image_path.is_dir():
        print(f"Dataset already exists at {image_path}")
        return image_path
    
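    # Download and extract
    print("Downloading dataset...")
    data_path.mkdir(parents=True, exist_ok=True)
    
    zip_path = data_path / "pizza_steak_sushi.zip"
    url = "https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip"
    
    # Fetch first and fail early, so a bad response is never saved as the zip
    # and a failed download does not leave an empty dataset folder behind
    response = requests.get(url, timeout=60)
    response.raise_for_status()
    with open(zip_path, "wb") as f:
        f.write(response.content)
    
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        print("Extracting...")
        image_path.mkdir(parents=True, exist_ok=True)
        zip_ref.extractall(image_path)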
    
    os.remove(zip_path)
    print(f"Dataset ready!")
    
    return image_path

# Download dataset
data_path = download_dataset()

# Setup paths
train_dir = data_path / "train"
test_dir = data_path / "test"

# Count images
for split in [train_dir, test_dir]:
    total = sum(len(list(class_dir.glob("*.jpg"))) 
                for class_dir in split.iterdir() if class_dir.is_dir())
    print(f"{split.name}: {total} images")

## Part 4: Create DataLoaders and Models

**Phase 1 → Setup**

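Each model expects a different input resolution. The EfficientNet paper pairs B0 with 224×224 images and B2 with 260×260, and the larger input is part of why B2 is more accurate. The transforms bundled with the torchvision weights define the exact resize and crop sizes used in this lab, so check the printed `crop_size` values below.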

### Step 4.1: Get Model-Specific Transforms

In [None]:
# Get transforms for each model
transforms_b0 = torchvision.models.EfficientNet_B0_Weights.DEFAULT.transforms()
transforms_b2 = torchvision.models.EfficientNet_B2_Weights.DEFAULT.transforms()

print("EfficientNet-B0 transforms:")
print(transforms_b0)
print("\nEfficientNet-B2 transforms:")
print(transforms_b2)

BATCH_SIZE = 32

def create_dataloaders(transform):
    """Create DataLoaders with given transform."""
    train_dataset = datasets.ImageFolder(train_dir, transform=transform)
    test_dataset = datasets.ImageFolder(test_dir, transform=transform)
    
    train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)
    
    return train_loader, test_loader, train_dataset.classes

# Create DataLoaders for each model
train_loader_b0, test_loader_b0, class_names = create_dataloaders(transforms_b0)
train_loader_b2, test_loader_b2, _ = create_dataloaders(transforms_b2)

print(f"\nClasses: {class_names}")
print(f"Train batches: {len(train_loader_b0)}")
print(f"Test batches: {len(test_loader_b0)}")
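
As a sanity check on the batch counts printed above, the sketch below derives them from the dataset sizes quoted in Step 3.1 (225 training and 75 test images) and `BATCH_SIZE = 32`. It assumes the `DataLoader` default of `drop_last=False`; `batch_layout` is a hypothetical helper.

```python
import math

def batch_layout(num_samples: int, batch_size: int):
    """Return (number of batches, size of the last batch) when drop_last=False."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch

print(batch_layout(225, 32))  # training split
print(batch_layout(75, 32))   # test split
```

With these sizes the final training batch contains a single image, which is worth keeping in mind when reading per-batch loss values.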

### Step 4.2: Create Model Functions

We'll use transfer learning: freeze the pretrained base layers and only train the classifier head. This speeds up training significantly.

In [None]:
def create_effnetb0(num_classes: int = 3) -> nn.Module:
    """Create EfficientNet-B0 with frozen base layers."""
    weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT
    model = torchvision.models.efficientnet_b0(weights=weights)
    
    # Freeze base layers
    for param in model.features.parameters():
        param.requires_grad = False
    
    # Replace classifier
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.2, inplace=True),
        nn.Linear(in_features=1280, out_features=num_classes)
    )
    
    return model

def create_effnetb2(num_classes: int = 3) -> nn.Module:
    """Create EfficientNet-B2 with frozen base layers."""
    weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT
    model = torchvision.models.efficientnet_b2(weights=weights)
    
    # Freeze base layers
    for param in model.features.parameters():
        param.requires_grad = False
    
    # Replace classifier
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.3, inplace=True),
        nn.Linear(in_features=1408, out_features=num_classes)
    )
    
    return model

# Compare model sizes
print("MODEL ARCHITECTURE COMPARISON")
print("="*50)

model_b0 = create_effnetb0(len(class_names))
model_b2 = create_effnetb2(len(class_names))

for name, model in [("EfficientNet-B0", model_b0), ("EfficientNet-B2", model_b2)]:
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    model_size_mb = total_params * 4 / 1024 / 1024  # Assuming float32
    
    print(f"\n{name}:")
    print(f"  Total parameters: {total_params:,}")
    print(f"  Trainable parameters: {trainable_params:,}")
    print(f"  Estimated size: {model_size_mb:.1f} MB")

# Clean up
del model_b0, model_b2
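
Because the feature extractor is frozen and dropout has no parameters, the "Trainable parameters" line printed above should equal the size of the new `nn.Linear` head. The sketch below works that number out by hand; `linear_params` and `float32_size_mb` are hypothetical helpers, and the size estimate uses the same 4-bytes-per-parameter assumption as the cell above.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

def float32_size_mb(num_params: int) -> float:
    """Size estimate at 4 bytes per parameter, divided by 1024 twice."""
    return num_params * 4 / 1024 / 1024

NUM_CLASSES = 3
head_b0 = linear_params(1280, NUM_CLASSES)  # B0 classifier input width
head_b2 = linear_params(1408, NUM_CLASSES)  # B2 classifier input width

print(f"B0 head: {head_b0:,} trainable parameters")
print(f"B2 head: {head_b2:,} trainable parameters")
print(f"B0 head size: {float32_size_mb(head_b0) * 1024:.1f} KB")
```

Only a few thousand parameters are updated in each experiment, which is why training the head is fast even though the forward pass still runs through the full network.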

## Part 5: Training Functions

**Phase 2 → Experiments**

Create reusable training and evaluation functions that track time per epoch. This helps us compare efficiency.

In [None]:
def train_step(model: nn.Module, dataloader: DataLoader, 
               loss_fn: nn.Module, optimizer: torch.optim.Optimizer,
               device: str) -> Tuple[float, float, float]:
    """Train for one epoch and measure time."""
    model.train()
    train_loss, correct = 0, 0
    start_time = time.time()
    
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)
        
        y_pred = model(X)
        loss = loss_fn(y_pred, y)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        train_loss += loss.item()
        correct += (y_pred.argmax(1) == y).sum().item()
    
    epoch_time = time.time() - start_time
    avg_loss = train_loss / len(dataloader)
    accuracy = 100 * correct / len(dataloader.dataset)
    
    return avg_loss, accuracy, epoch_time

def test_step(model: nn.Module, dataloader: DataLoader,
              loss_fn: nn.Module, device: str) -> Tuple[float, float]:
    """Evaluate the model."""
    model.eval()
    test_loss, correct = 0, 0
    
    with torch.inference_mode():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            
            y_pred = model(X)
            test_loss += loss_fn(y_pred, y).item()
            correct += (y_pred.argmax(1) == y).sum().item()
    
    avg_loss = test_loss / len(dataloader)
    accuracy = 100 * correct / len(dataloader.dataset)
    
    return avg_loss, accuracy
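
One detail of the functions above: the loss is averaged over batches, while accuracy is averaged over samples. When the last batch is smaller than the others, the two conventions differ slightly. The sketch below illustrates this with made-up per-batch losses; the numbers are hypothetical and the helper names are not part of the lab code.

```python
def batch_mean(batch_losses):
    """Average of per-batch losses, the convention used by train_step/test_step."""
    return sum(batch_losses) / len(batch_losses)

def sample_weighted_mean(batch_losses, batch_sizes):
    """Average weighted by how many samples each batch contained."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)

# Hypothetical per-batch losses for three test batches of 32, 32 and 11 images.
losses = [0.5, 0.5, 1.0]
sizes = [32, 32, 11]

print(f"Mean over batches: {batch_mean(losses):.4f}")
print(f"Mean over samples: {sample_weighted_mean(losses, sizes):.4f}")
```

For comparing models trained with the same batch size, as in this lab, either convention is fine as long as it is applied consistently.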

## Part 6: Run Experiments

**Phase 2 → Experiments**

Now we'll run all 4 experiments systematically. Each experiment trains a model and logs metrics to TensorBoard for later comparison.

### Experiment 1: EfficientNet-B0 (5 epochs)

**Baseline:** Start with the smaller, faster model trained for 5 epochs. This is our speed benchmark.

In [None]:
# Setup
set_seed(42)
model_b0_5ep = create_effnetb0(len(class_names)).to(device)
optimizer = torch.optim.Adam(model_b0_5ep.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

# TensorBoard writer
writer_b0_5ep = SummaryWriter("runs/model_comparison/effnetb0_5epochs")

print("EXPERIMENT 1: EfficientNet-B0 (5 epochs)")
print("="*50)

# Training
results_b0_5ep = {"train_loss": [], "train_acc": [], "test_loss": [], "test_acc": [], "epoch_time": []}

for epoch in range(5):
    train_loss, train_acc, epoch_time = train_step(
        model_b0_5ep, train_loader_b0, loss_fn, optimizer, device
    )
    test_loss, test_acc = test_step(
        model_b0_5ep, test_loader_b0, loss_fn, device
    )
    
    # Store results
    results_b0_5ep["train_loss"].append(train_loss)
    results_b0_5ep["train_acc"].append(train_acc)
    results_b0_5ep["test_loss"].append(test_loss)
    results_b0_5ep["test_acc"].append(test_acc)
    results_b0_5ep["epoch_time"].append(epoch_time)
    
    # Log to TensorBoard
    writer_b0_5ep.add_scalars("Loss", {"train": train_loss, "test": test_loss}, epoch)
    writer_b0_5ep.add_scalars("Accuracy", {"train": train_acc, "test": test_acc}, epoch)
    
    print(f"Epoch {epoch+1}: Test Acc: {test_acc:.2f}% | Time: {epoch_time:.1f}s")

writer_b0_5ep.close()

total_time_b0_5ep = sum(results_b0_5ep["epoch_time"])
print(f"\nComplete! Final accuracy: {results_b0_5ep['test_acc'][-1]:.2f}%")
print(f"Total training time: {total_time_b0_5ep:.1f}s")

### Experiment 2: EfficientNet-B0 (10 epochs)

**Question:** Does training longer help the smaller model? Can B0 close the accuracy gap with more epochs?

In [None]:
# Setup
set_seed(42)
model_b0_10ep = create_effnetb0(len(class_names)).to(device)
optimizer = torch.optim.Adam(model_b0_10ep.parameters(), lr=0.001)

# TensorBoard writer
writer_b0_10ep = SummaryWriter("runs/model_comparison/effnetb0_10epochs")

print("EXPERIMENT 2: EfficientNet-B0 (10 epochs)")
print("="*50)

# Training
results_b0_10ep = {"train_loss": [], "train_acc": [], "test_loss": [], "test_acc": [], "epoch_time": []}

for epoch in range(10):
    train_loss, train_acc, epoch_time = train_step(
        model_b0_10ep, train_loader_b0, loss_fn, optimizer, device
    )
    test_loss, test_acc = test_step(
        model_b0_10ep, test_loader_b0, loss_fn, device
    )
    
    # Store results
    results_b0_10ep["train_loss"].append(train_loss)
    results_b0_10ep["train_acc"].append(train_acc)
    results_b0_10ep["test_loss"].append(test_loss)
    results_b0_10ep["test_acc"].append(test_acc)
    results_b0_10ep["epoch_time"].append(epoch_time)
    
    # Log to TensorBoard
    writer_b0_10ep.add_scalars("Loss", {"train": train_loss, "test": test_loss}, epoch)
    writer_b0_10ep.add_scalars("Accuracy", {"train": train_acc, "test": test_acc}, epoch)
    
    print(f"Epoch {epoch+1}: Test Acc: {test_acc:.2f}% | Time: {epoch_time:.1f}s")

writer_b0_10ep.close()

total_time_b0_10ep = sum(results_b0_10ep["epoch_time"])
print(f"\nComplete! Final accuracy: {results_b0_10ep['test_acc'][-1]:.2f}%")
print(f"Total training time: {total_time_b0_10ep:.1f}s")

### Experiment 3: EfficientNet-B2 (5 epochs)

**Larger Model:** Now let's try the bigger model with the same 5 epochs. Will it outperform B0 when both models see the training data the same number of times?

In [None]:
# Setup
set_seed(42)
model_b2_5ep = create_effnetb2(len(class_names)).to(device)
optimizer = torch.optim.Adam(model_b2_5ep.parameters(), lr=0.001)

# TensorBoard writer
writer_b2_5ep = SummaryWriter("runs/model_comparison/effnetb2_5epochs")

print("EXPERIMENT 3: EfficientNet-B2 (5 epochs)")
print("="*50)

# Training
results_b2_5ep = {"train_loss": [], "train_acc": [], "test_loss": [], "test_acc": [], "epoch_time": []}

for epoch in range(5):
    train_loss, train_acc, epoch_time = train_step(
        model_b2_5ep, train_loader_b2, loss_fn, optimizer, device
    )
    test_loss, test_acc = test_step(
        model_b2_5ep, test_loader_b2, loss_fn, device
    )
    
    # Store results
    results_b2_5ep["train_loss"].append(train_loss)
    results_b2_5ep["train_acc"].append(train_acc)
    results_b2_5ep["test_loss"].append(test_loss)
    results_b2_5ep["test_acc"].append(test_acc)
    results_b2_5ep["epoch_time"].append(epoch_time)
    
    # Log to TensorBoard
    writer_b2_5ep.add_scalars("Loss", {"train": train_loss, "test": test_loss}, epoch)
    writer_b2_5ep.add_scalars("Accuracy", {"train": train_acc, "test": test_acc}, epoch)
    
    print(f"Epoch {epoch+1}: Test Acc: {test_acc:.2f}% | Time: {epoch_time:.1f}s")

writer_b2_5ep.close()

total_time_b2_5ep = sum(results_b2_5ep["epoch_time"])
print(f"\nComplete! Final accuracy: {results_b2_5ep['test_acc'][-1]:.2f}%")
print(f"Total training time: {total_time_b2_5ep:.1f}s")

### Experiment 4: EfficientNet-B2 (10 epochs)

**Maximum Performance:** The largest model with the longest training time. This is our accuracy benchmark — how much better can we do with more resources?

In [None]:
# Setup
set_seed(42)
model_b2_10ep = create_effnetb2(len(class_names)).to(device)
optimizer = torch.optim.Adam(model_b2_10ep.parameters(), lr=0.001)

# TensorBoard writer
writer_b2_10ep = SummaryWriter("runs/model_comparison/effnetb2_10epochs")

print("EXPERIMENT 4: EfficientNet-B2 (10 epochs)")
print("="*50)

# Training
results_b2_10ep = {"train_loss": [], "train_acc": [], "test_loss": [], "test_acc": [], "epoch_time": []}

for epoch in range(10):
    train_loss, train_acc, epoch_time = train_step(
        model_b2_10ep, train_loader_b2, loss_fn, optimizer, device
    )
    test_loss, test_acc = test_step(
        model_b2_10ep, test_loader_b2, loss_fn, device
    )
    
    # Store results
    results_b2_10ep["train_loss"].append(train_loss)
    results_b2_10ep["train_acc"].append(train_acc)
    results_b2_10ep["test_loss"].append(test_loss)
    results_b2_10ep["test_acc"].append(test_acc)
    results_b2_10ep["epoch_time"].append(epoch_time)
    
    # Log to TensorBoard
    writer_b2_10ep.add_scalars("Loss", {"train": train_loss, "test": test_loss}, epoch)
    writer_b2_10ep.add_scalars("Accuracy", {"train": train_acc, "test": test_acc}, epoch)
    
    print(f"Epoch {epoch+1}: Test Acc: {test_acc:.2f}% | Time: {epoch_time:.1f}s")

writer_b2_10ep.close()

total_time_b2_10ep = sum(results_b2_10ep["epoch_time"])
print(f"\nComplete! Final accuracy: {results_b2_10ep['test_acc'][-1]:.2f}%")
print(f"Total training time: {total_time_b2_10ep:.1f}s")
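
The four experiment cells share the same loop. One possible way to factor it out is sketched below. `run_experiment` is a hypothetical helper, not part of the lab, and it takes plain callables so the sketch can run here with stand-in functions instead of a real model.

```python
from typing import Callable, Dict, List, Tuple

def run_experiment(
    epochs: int,
    train_fn: Callable[[], Tuple[float, float, float]],
    eval_fn: Callable[[], Tuple[float, float]],
    log_fn: Callable[[int, Dict[str, float]], None] = lambda epoch, metrics: None,
) -> Dict[str, List[float]]:
    """Run train/eval for `epochs` epochs and collect the same keys as the cells above."""
    results: Dict[str, List[float]] = {
        "train_loss": [], "train_acc": [], "test_loss": [], "test_acc": [], "epoch_time": []
    }
    for epoch in range(epochs):
        train_loss, train_acc, epoch_time = train_fn()
        test_loss, test_acc = eval_fn()
        metrics = {
            "train_loss": train_loss, "train_acc": train_acc,
            "test_loss": test_loss, "test_acc": test_acc,
            "epoch_time": epoch_time,
        }
        for key, value in metrics.items():
            results[key].append(value)
        log_fn(epoch, metrics)  # e.g. forward to a TensorBoard writer
    return results

# Stand-in callables with fixed, made-up return values so the sketch runs without a model.
demo = run_experiment(
    epochs=3,
    train_fn=lambda: (0.9, 60.0, 1.5),
    eval_fn=lambda: (0.8, 65.0),
)
print(len(demo["test_acc"]), sum(demo["epoch_time"]))
```

In the notebook, `train_fn` could wrap a call such as `train_step(model, train_loader_b0, loss_fn, optimizer, device)`, `eval_fn` the matching `test_step` call, and `log_fn` the two `add_scalars` calls.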

## Part 7: View Results in TensorBoard

**Phase 3 → Analysis**

Now let's visualize all 4 experiments together to compare their performance side-by-side.

In [None]:
import requests, subprocess, time

ip = requests.get("https://ifconfig.me").text.strip()

subprocess.Popen(
    ["tensorboard", "--logdir=runs/model_comparison", "--port=6006", "--host=0.0.0.0"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL
)

time.sleep(2)
print(f"TensorBoard running at: http://{ip}:6006")

Access TensorBoard in your browser to compare all 4 experiments side-by-side. You can:
- Compare accuracy and loss curves across all experiments
- Use the smoothing slider to reduce noise
- Toggle experiments on/off for detailed comparisons
- Analyze which model converges faster

### TensorBoard Results

<table>
<tr>
<td><img src="https://raw.githubusercontent.com/poridhiEng/lab-asset/refs/heads/main/tensorcode/Deep-learning-with-pytorch/Experiment-Tracking/Tensorboard/lab_03/images/b0-5e.png" alt="B0 5 epochs" width="400"/></td>
<td><img src="https://raw.githubusercontent.com/poridhiEng/lab-asset/refs/heads/main/tensorcode/Deep-learning-with-pytorch/Experiment-Tracking/Tensorboard/lab_03/images/b0-10e.png" alt="B0 10 epochs" width="400"/></td>
</tr>
<tr>
<td align="center"><b>EfficientNet-B0 (5 epochs)</b></td>
<td align="center"><b>EfficientNet-B0 (10 epochs)</b></td>
</tr>
</table>

<table>
<tr>
<td><img src="https://raw.githubusercontent.com/poridhiEng/lab-asset/refs/heads/main/tensorcode/Deep-learning-with-pytorch/Experiment-Tracking/Tensorboard/lab_03/images/b2-5e.png" alt="B2 5 epochs" width="400"/></td>
<td><img src="https://raw.githubusercontent.com/poridhiEng/lab-asset/refs/heads/main/tensorcode/Deep-learning-with-pytorch/Experiment-Tracking/Tensorboard/lab_03/images/b2-10e.png" alt="B2 10 epochs" width="400"/></td>
</tr>
<tr>
<td align="center"><b>EfficientNet-B2 (5 epochs)</b></td>
<td align="center"><b>EfficientNet-B2 (10 epochs)</b></td>
</tr>
</table>

**Key Observations from TensorBoard:**
- **B2 converges faster** than B0 in early epochs
- **B0 with 10 epochs** closes the gap significantly (about +5 percentage points over its 5-epoch run)
- **B2 plateaus** after ~7 epochs (diminishing returns)
- **Loss curves** show B2 has slightly better convergence

## Part 8: Compare Results

**Phase 3 → Analysis**

Now let's create comprehensive comparison visualizations and analyze which model performs best.

### Step 8.1: Results Summary Table

In [None]:
# Create results summary
experiments = [
    {"Model": "B0", "Epochs": 5, "Final Acc": results_b0_5ep['test_acc'][-1], 
     "Best Acc": max(results_b0_5ep['test_acc']), "Time (s)": total_time_b0_5ep,
     "Time/Epoch": total_time_b0_5ep/5},
    
    {"Model": "B0", "Epochs": 10, "Final Acc": results_b0_10ep['test_acc'][-1], 
     "Best Acc": max(results_b0_10ep['test_acc']), "Time (s)": total_time_b0_10ep,
     "Time/Epoch": total_time_b0_10ep/10},
    
    {"Model": "B2", "Epochs": 5, "Final Acc": results_b2_5ep['test_acc'][-1], 
     "Best Acc": max(results_b2_5ep['test_acc']), "Time (s)": total_time_b2_5ep,
     "Time/Epoch": total_time_b2_5ep/5},
    
    {"Model": "B2", "Epochs": 10, "Final Acc": results_b2_10ep['test_acc'][-1], 
     "Best Acc": max(results_b2_10ep['test_acc']), "Time (s)": total_time_b2_10ep,
     "Time/Epoch": total_time_b2_10ep/10},
]

results_df = pd.DataFrame(experiments)
results_df = results_df.sort_values('Final Acc', ascending=False)

print("EXPERIMENT RESULTS (Sorted by Performance)")
print("="*70)
print(results_df.to_string(index=False, float_format='%.2f'))

# Best overall
best = results_df.iloc[0]
print(f"\nBEST MODEL: {best['Model']} trained for {best['Epochs']} epochs")
print(f"   Accuracy: {best['Final Acc']:.2f}%")
print(f"   Training time: {best['Time (s)']:.1f}s")

### Step 8.2: Performance vs Efficiency Visualization

Visualize the trade-off: accuracy vs training time. This reveals which models are most efficient.

In [None]:
# Create visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Accuracy Comparison
x_pos = np.arange(4)
labels = ['B0-5ep', 'B0-10ep', 'B2-5ep', 'B2-10ep']
accuracies = [
    results_b0_5ep['test_acc'][-1],
    results_b0_10ep['test_acc'][-1],
    results_b2_5ep['test_acc'][-1],
    results_b2_10ep['test_acc'][-1]
]

colors = ['#3498db', '#2980b9', '#e74c3c', '#c0392b']
bars1 = axes[0].bar(x_pos, accuracies, color=colors, alpha=0.8)
axes[0].set_xlabel('Configuration')
axes[0].set_ylabel('Test Accuracy (%)')
axes[0].set_title('Model Performance Comparison', fontweight='bold')
axes[0].set_xticks(x_pos)
axes[0].set_xticklabels(labels)
axes[0].grid(True, axis='y', alpha=0.3)

# Add value labels on bars
for bar, acc in zip(bars1, accuracies):
    height = bar.get_height()
    axes[0].text(bar.get_x() + bar.get_width()/2., height + 0.5,
                f'{acc:.1f}%', ha='center', va='bottom')

# Plot 2: Time Comparison
times = [
    total_time_b0_5ep,
    total_time_b0_10ep,
    total_time_b2_5ep,
    total_time_b2_10ep
]

bars2 = axes[1].bar(x_pos, times, color=colors, alpha=0.8)
axes[1].set_xlabel('Configuration')
axes[1].set_ylabel('Training Time (seconds)')
axes[1].set_title('Training Time Comparison', fontweight='bold')
axes[1].set_xticks(x_pos)
axes[1].set_xticklabels(labels)
axes[1].grid(True, axis='y', alpha=0.3)

# Add value labels (the loop variable is `t` so it does not shadow the `time` module)
for bar, t in zip(bars2, times):
    height = bar.get_height()
    axes[1].text(bar.get_x() + bar.get_width()/2., height + 1,
                f'{t:.0f}s', ha='center', va='bottom')

plt.tight_layout()
plt.savefig('model_comparison.png', dpi=150, bbox_inches='tight')
plt.show()

### Step 8.3: Learning Curves Comparison

Compare how each model learns over time. Do they converge at the same rate? Does one plateau earlier?

In [None]:
# Plot learning curves
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Prepare data for 10-epoch experiments (for fair comparison)
epochs_10 = range(1, 11)
epochs_5 = range(1, 6)

# Plot 1: Test Accuracy Over Time
axes[0].plot(epochs_10, results_b0_10ep['test_acc'], 'o-', label='B0 (10 epochs)', 
            color='#3498db', linewidth=2, markersize=6)
axes[0].plot(epochs_10, results_b2_10ep['test_acc'], 's-', label='B2 (10 epochs)', 
            color='#e74c3c', linewidth=2, markersize=6)
axes[0].plot(epochs_5, results_b0_5ep['test_acc'], 'o--', label='B0 (5 epochs)', 
            color='#3498db', alpha=0.6, linewidth=2, markersize=6)
axes[0].plot(epochs_5, results_b2_5ep['test_acc'], 's--', label='B2 (5 epochs)', 
            color='#e74c3c', alpha=0.6, linewidth=2, markersize=6)

axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Test Accuracy (%)')
axes[0].set_title('Learning Curves: Test Accuracy', fontweight='bold')
axes[0].legend(loc='lower right')
axes[0].grid(True, alpha=0.3)

# Plot 2: Test Loss Over Time
axes[1].plot(epochs_10, results_b0_10ep['test_loss'], 'o-', label='B0 (10 epochs)', 
            color='#3498db', linewidth=2, markersize=6)
axes[1].plot(epochs_10, results_b2_10ep['test_loss'], 's-', label='B2 (10 epochs)', 
            color='#e74c3c', linewidth=2, markersize=6)
axes[1].plot(epochs_5, results_b0_5ep['test_loss'], 'o--', label='B0 (5 epochs)', 
            color='#3498db', alpha=0.6, linewidth=2, markersize=6)
axes[1].plot(epochs_5, results_b2_5ep['test_loss'], 's--', label='B2 (5 epochs)', 
            color='#e74c3c', alpha=0.6, linewidth=2, markersize=6)

axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Test Loss')
axes[1].set_title('Learning Curves: Test Loss', fontweight='bold')
axes[1].legend(loc='upper right')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig('learning_curves.png', dpi=150, bbox_inches='tight')
plt.show()
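
To put a number on "does one plateau earlier?", one simple heuristic is to find the last epoch that improved test accuracy by more than a small margin. The sketch below is an assumption about how this could be measured, not the method used in the lab; `plateau_epoch`, `min_delta`, and the example curve are all hypothetical.

```python
def plateau_epoch(acc_per_epoch, min_delta: float = 0.5) -> int:
    """Return the 1-based epoch of the last improvement larger than `min_delta` points."""
    best_epoch, best_acc = 1, acc_per_epoch[0]
    for epoch, acc in enumerate(acc_per_epoch[1:], start=2):
        if acc > best_acc + min_delta:
            best_epoch, best_acc = epoch, acc
    return best_epoch

# Made-up accuracy curve, not a measured result.
example_curve = [70.0, 80.0, 85.0, 85.2, 85.1]
print(plateau_epoch(example_curve))
```

In the notebook this could be applied to lists such as `results_b0_10ep['test_acc']` and `results_b2_10ep['test_acc']`. With only 75 test images, one image is worth about 1.3 accuracy points, so a `min_delta` below that mostly filters exact ties.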

## Part 9: Decision Framework

**Phase 3 → Analysis**

Based on our experimental results, let's create a decision framework for choosing the right model for different scenarios.

In [None]:
print("DECISION FRAMEWORK")
print("="*70)

# Calculate metrics
b0_5_efficiency = results_b0_5ep['test_acc'][-1] / total_time_b0_5ep
b0_10_efficiency = results_b0_10ep['test_acc'][-1] / total_time_b0_10ep
b2_5_efficiency = results_b2_5ep['test_acc'][-1] / total_time_b2_5ep
b2_10_efficiency = results_b2_10ep['test_acc'][-1] / total_time_b2_10ep

scenarios = [
    {
        "Scenario": "Mobile/Edge Deployment",
        "Recommendation": "EfficientNet-B0 (5 epochs)",
        "Why": "Smallest model, fastest inference",
        "Accuracy": f"{results_b0_5ep['test_acc'][-1]:.1f}%",
        "Time": f"{total_time_b0_5ep:.0f}s"
    },
    {
        "Scenario": "Balanced Performance",
        "Recommendation": "EfficientNet-B0 (10 epochs)",
        "Why": "Good accuracy, still fast",
        "Accuracy": f"{results_b0_10ep['test_acc'][-1]:.1f}%",
        "Time": f"{total_time_b0_10ep:.0f}s"
    },
    {
        "Scenario": "Quick Prototyping",
        "Recommendation": "EfficientNet-B2 (5 epochs)",
        "Why": "Better accuracy, moderate time",
        "Accuracy": f"{results_b2_5ep['test_acc'][-1]:.1f}%",
        "Time": f"{total_time_b2_5ep:.0f}s"
    },
    {
        "Scenario": "Maximum Accuracy",
        "Recommendation": "EfficientNet-B2 (10 epochs)",
        "Why": "Best performance overall",
        "Accuracy": f"{results_b2_10ep['test_acc'][-1]:.1f}%",
        "Time": f"{total_time_b2_10ep:.0f}s"
    }
]

scenario_df = pd.DataFrame(scenarios)
print(scenario_df.to_string(index=False))

print("\nEFFICIENCY ANALYSIS (Accuracy per second):")
print(f"  B0 (5 epochs):  {b0_5_efficiency:.3f}")
print(f"  B0 (10 epochs): {b0_10_efficiency:.3f}")
print(f"  B2 (5 epochs):  {b2_5_efficiency:.3f}")
print(f"  B2 (10 epochs): {b2_10_efficiency:.3f}")

best_efficiency = max(b0_5_efficiency, b0_10_efficiency, b2_5_efficiency, b2_10_efficiency)
if b0_5_efficiency == best_efficiency:
    print("\nMost efficient: B0 with 5 epochs")
elif b0_10_efficiency == best_efficiency:
    print("\nMost efficient: B0 with 10 epochs")
elif b2_5_efficiency == best_efficiency:
    print("\nMost efficient: B2 with 5 epochs")
else:
    print("\nMost efficient: B2 with 10 epochs")
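
Accuracy divided by total training time tends to favour the shortest run, because accuracy saturates while time keeps growing with each epoch. A complementary view is the marginal gain: how many extra accuracy points each extra second buys relative to a baseline. The sketch below shows both metrics and replaces the `if`/`elif` chain with a dictionary lookup. All helper names are hypothetical, and the numbers are placeholders rather than measured results.

```python
def efficiency(acc: float, seconds: float) -> float:
    """Accuracy points per second of training, as in the cell above."""
    return acc / seconds

def most_efficient(runs: dict) -> str:
    """Name of the run with the highest accuracy-per-second."""
    return max(runs, key=lambda name: efficiency(*runs[name]))

def marginal_gain(base, candidate) -> float:
    """Extra accuracy points per extra second, relative to `base`."""
    (acc_b, t_b), (acc_c, t_c) = base, candidate
    return (acc_c - acc_b) / (t_c - t_b)

# Placeholder (accuracy %, seconds) pairs, NOT measured results.
runs = {
    "B0-5ep": (80.0, 100.0),
    "B0-10ep": (85.0, 200.0),
    "B2-5ep": (84.0, 250.0),
    "B2-10ep": (86.0, 500.0),
}

print("Most efficient:", most_efficient(runs))
print("B0 5->10 epochs:", marginal_gain(runs["B0-5ep"], runs["B0-10ep"]), "points/s")
```

In the notebook, the pairs would come from the `results_*` dictionaries and `total_time_*` variables defined earlier.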

## Part 10: Summary and Key Takeaways

Congratulations! You've completed a systematic model architecture comparison.

### What We Discovered

| Metric | Finding |
|--------|---------|
| **Best Accuracy** | B2-10ep: 93% (best epoch) |
| **Most Efficient** | B0-5ep: 0.47 accuracy points per second of training |
| **Sweet Spot** | B0-10ep: 89% in ~6 minutes |
| **Size Difference** | B2 is 1.9x larger (29MB vs 15MB as float32, with the 3-class head) |

### Practical Insights

1. **B2 is only 2-3 percentage points more accurate** despite being about 2x larger and slower
2. **Longer training helps B0 more** (+5 points) than B2 (+0 points)
3. **B0 is sufficient** for most applications (84-89% accuracy)
4. **Efficiency matters**: B0 trains 2.5x faster per epoch

### Deployment Recommendations

| Scenario | Choose | Why |
|----------|--------|-----|
| **Mobile/Edge** | B0 (5-10 ep) | Small size, fast inference |
| **Cloud/Server** | B2 (5-10 ep) | Accuracy is priority |
| **Prototyping** | B0 (5 ep) | Fastest iteration |
| **Production** | B0 (10 ep) | Balanced performance |

### Next Steps

1. **Try other models**: ResNet, MobileNet, Vision Transformer (ViT)
2. **Optimize further**: Quantization, pruning, distillation
3. **Test learning rates**: Larger models may need different schedules
4. **Add augmentation**: May help smaller models close the gap
5. **Apply to your data**: Results vary by domain

**Key Lesson:** The best model depends on your constraints, not just accuracy!