# Segmentation Model Benchmarking & Comparative Analysis

This notebook provides a professional workflow for benchmarking and comparing four segmentation models (DeepLabV3+, GhanaSegNet, SegFormer, and UNet) on Ghanaian food image datasets. It covers uploading result files, visualizing metrics, and producing a thesis-ready summary.

## Upload Your Results Files
Before running the analysis, use the cell below to upload your JSON result files (`deeplabv3plus_results.json`, `ghanasegnet_results.json`, `segformer_results.json`, `unet_results.json`).
Click the "Choose Files" button when you run the upload cell and select all four files.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/EricBaidoo/GhanaSegNet/blob/main/analysis/Segmentation_Model_Benchmarking_Analysis.ipynb)

---

In [None]:
# Install required libraries (pandas and numpy are used by later cells)
!pip install matplotlib pandas numpy

In [None]:
# Upload results files (Colab only)
from google.colab import files
uploaded = files.upload()

In [None]:
# Load results
import json
import matplotlib.pyplot as plt

FILES = [
    'deeplabv3plus_results.json',
    'ghanasegnet_results.json',
    'segformer_results.json',
    'unet_results.json'
]
models = []
for fname in FILES:
    with open(fname, 'r') as f:
        models.append(json.load(f))
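
The later cells assume each results file has a particular shape. The following is a minimal sketch of a sanity check, assuming only the keys that the cells in this notebook read (`model_name`, `best_iou`, `final_epoch`, `trainable_parameters`, and a `training_history` list whose entries carry `epoch`, `val_iou`, `val_loss`, and `val_accuracy`). The helper name `check_results` is hypothetical and not part of the project, and the example dict holds made-up values.

```python
# Hypothetical helper: list the keys that the notebook cells expect
# but a loaded results dict does not contain.
REQUIRED_TOP_LEVEL = ("model_name", "best_iou", "final_epoch",
                      "trainable_parameters", "training_history")
REQUIRED_PER_EPOCH = ("epoch", "val_iou", "val_loss", "val_accuracy")


def check_results(result):
    """Return a list of problems; an empty list means the dict looks usable."""
    problems = [f"missing top-level key: {key}"
                for key in REQUIRED_TOP_LEVEL if key not in result]
    for i, entry in enumerate(result.get("training_history", [])):
        for key in REQUIRED_PER_EPOCH:
            if key not in entry:
                problems.append(f"training_history[{i}] missing key: {key}")
    return problems


# Illustrative dict only (values are made up, not real results).
example = {
    "model_name": "demo",
    "best_iou": 0.5,
    "final_epoch": 1,
    "trainable_parameters": 1000,
    "training_history": [{"epoch": 1, "val_iou": 0.5, "val_loss": 1.0}],
}
for problem in check_results(example):
    print(problem)
```

In the notebook, the same check could be run over `models` right after loading, before any plotting cell fails with a `KeyError`.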

In [None]:
# Plot training metrics
metrics = ['val_iou', 'val_loss', 'val_accuracy']
for metric in metrics:
    plt.figure(figsize=(8, 5))
    for m in models:
        epochs = [h['epoch'] for h in m['training_history']]
        values = [h[metric] for h in m['training_history']]
        plt.plot(epochs, values, label=m['model_name'])
    plt.xlabel('Epoch')
    # Build the label once; nesting the same quote type inside an f-string
    # is a syntax error before Python 3.12
    label = metric.replace('_', ' ').title()
    plt.ylabel(label)
    plt.title(f'{label} Over Epochs')
    plt.legend()
    plt.tight_layout()
    plt.show()

In [None]:
# Model comparison summary
for m in models:
    print(f"Model: {m['model_name']}")
    print(f"  Best IoU: {m['best_iou']:.4f}")
    print(f"  Final Epoch: {m['final_epoch']}")
    print(f"  Parameters: {m['trainable_parameters']:,}")
    print()

### Thesis-Ready Comparative Summary
The comparative analysis of the four segmentation models (DeepLabV3+, GhanaSegNet, SegFormer, and UNet) reveals distinct performance characteristics. DeepLabV3+ achieved the highest IoU (0.2544), the best segmentation accuracy of the four, but it is also the most parameter-heavy. GhanaSegNet trails by about 0.01 IoU (0.2447) while being significantly more parameter-efficient, which makes it attractive for resource-constrained applications. SegFormer and UNet plateaued early, suggesting limited learning capacity or data/model constraints under the current settings. All models were trained under identical conditions so that the comparison is fair. The results highlight GhanaSegNet’s competitive performance and efficiency; its accuracy could be improved further through advanced augmentation, alternative loss functions, or architectural refinements. Future work may explore deeper architectures, alternative backbones, or larger datasets to improve segmentation outcomes.
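
One way to make the accuracy/size trade-off concrete is to report IoU per million trainable parameters. The sketch below is an illustration, not the analysis behind the summary above. The function name `iou_per_million_params` is hypothetical; it takes the `best_iou` and `trainable_parameters` fields that the results files in this notebook provide. The dicts in the usage lines are placeholders, not measured values.

```python
def iou_per_million_params(best_iou, trainable_parameters):
    """IoU achieved per one million trainable parameters (higher = more efficient)."""
    if trainable_parameters <= 0:
        raise ValueError("trainable_parameters must be positive")
    return best_iou / (trainable_parameters / 1_000_000)


# Placeholder entries (made-up numbers) in the same shape as the loaded results.
demo_models = [
    {"model_name": "model_a", "best_iou": 0.30, "trainable_parameters": 40_000_000},
    {"model_name": "model_b", "best_iou": 0.28, "trainable_parameters": 8_000_000},
]

# Rank from most to least parameter-efficient.
ranked = sorted(
    demo_models,
    key=lambda m: iou_per_million_params(m["best_iou"], m["trainable_parameters"]),
    reverse=True,
)
for m in ranked:
    score = iou_per_million_params(m["best_iou"], m["trainable_parameters"])
    print(f"{m['model_name']}: {score:.4f} IoU per million parameters")
```

This ratio rewards small models heavily, so it is best read alongside the absolute IoU rather than in place of it.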

## Comprehensive Model Analysis for Master’s Thesis
This section expands the analysis to meet the standards of a master’s thesis, including quantitative tables, statistical analysis, qualitative results, resource usage, and recommendations.

### 1. Quantitative Metrics Table
Below, we summarize the key metrics for each model: best IoU, final accuracy, final loss, parameter count, training time, and inference speed. Metrics that a results file does not record are left empty in the table.

In [None]:
# Create a summary table of key metrics for each model
import pandas as pd
metrics_data = []
for m in models:
    metrics_data.append({
        'Model': m['model_name'],
        'Best IoU': m['best_iou'],
        'Final Accuracy': m.get('final_accuracy', None),
        'Final Loss': m.get('final_loss', None),
        'Parameters': m['trainable_parameters'],
        'Training Time (s)': m.get('training_time', None),
        'Inference Speed (img/s)': m.get('inference_speed', None)
    })
df_metrics = pd.DataFrame(metrics_data)
df_metrics

### 2. Statistical Analysis of Metrics
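When a results file contains several runs per model, we compute the mean, standard deviation, and 95% confidence interval of the best IoU. With a single run per model, these statistics cannot be estimated and the cell reports that instead.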

In [None]:
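# Compute IoU statistics (only if results from multiple runs are available)
import numpy as np
import pandas as pd

if all('runs' in m for m in models):
    iou_stats = {}
    for m in models:
        ious = np.array([run['best_iou'] for run in m['runs']], dtype=float)
        # Sample standard deviation (ddof=1); undefined for fewer than two runs
        std = ious.std(ddof=1) if len(ious) > 1 else float('nan')
        iou_stats[m['model_name']] = {
            'mean': ious.mean(),
            'std': std,
            # Half-width of a normal-approximation 95% confidence interval
            '95% CI': 1.96 * std / np.sqrt(len(ious))
        }
    # A bare expression inside an if-block is not rendered, so display explicitly
    display(pd.DataFrame(iou_stats).T)
else:
    print('Multiple runs not available for statistical analysis.')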
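
The confidence-interval arithmetic can be checked on its own with the standard library. This is a minimal sketch using the normal approximation (the same 1.96 factor as in the cell above); with only a handful of runs, a Student-t interval would be wider. `mean_ci` is a hypothetical helper name, and the IoU values in the usage lines are made up.

```python
import math
import statistics


def mean_ci(values, confidence=0.95):
    """Return (mean, half_width) of a normal-approximation confidence interval."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    # Two-sided critical value of the standard normal, about 1.96 for 95%
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, z * std / math.sqrt(len(values))


# Made-up best-IoU values from three hypothetical runs of one model.
mean, half_width = mean_ci([0.24, 0.25, 0.26])
print(f"mean IoU = {mean:.4f} +/- {half_width:.4f}")
```

If the intervals of two models overlap substantially, a difference in their mean IoU should be reported cautiously.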

### 3. Qualitative Results: Visual Comparison
Below, we visualize sample predictions from each model for qualitative comparison.

In [None]:
# Visualize sample predictions (requires prediction images or arrays in your results)
import matplotlib.pyplot as plt
import numpy as np
def plot_predictions(model, sample_idx=0):
    if 'sample_predictions' in model:
        preds = model['sample_predictions'][sample_idx]
        # squeeze=False keeps axs two-dimensional, so this also works for a single panel
        fig, axs = plt.subplots(1, len(preds), figsize=(15, 5), squeeze=False)
        for ax, (title, img) in zip(axs[0], preds.items()):
            ax.imshow(img, cmap='gray')
            ax.set_title(title)
            ax.axis('off')
        plt.suptitle(f"{model['model_name']} - Sample {sample_idx}")
        plt.show()
    else:
        print(f"No sample predictions found for {model['model_name']}")

# Example: plot_predictions(models[0], sample_idx=0)

### 4. Learning Curves and Overfitting Analysis
We plot training and validation curves to assess convergence and overfitting.

In [None]:
# Plot training and validation curves for each model
for m in models:
    epochs = [h['epoch'] for h in m['training_history']]
    val_iou = [h['val_iou'] for h in m['training_history']]
    train_iou = [h.get('train_iou', None) for h in m['training_history']]
    val_loss = [h['val_loss'] for h in m['training_history']]
    train_loss = [h.get('train_loss', None) for h in m['training_history']]
    plt.figure(figsize=(12, 4))
    plt.subplot(1, 2, 1)
    plt.plot(epochs, val_iou, label='Val IoU')
    if all(v is not None for v in train_iou):
        plt.plot(epochs, train_iou, label='Train IoU')
    plt.xlabel('Epoch')
    plt.ylabel('IoU')
    plt.title(f'{m["model_name"]} IoU Curves')
    plt.legend()
    plt.subplot(1, 2, 2)
    plt.plot(epochs, val_loss, label='Val Loss')
    if all(v is not None for v in train_loss):
        plt.plot(epochs, train_loss, label='Train Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title(f'{m["model_name"]} Loss Curves')
    plt.legend()
    plt.tight_layout()
    plt.show()
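
A simple numeric companion to the curves is the gap between validation and training loss at each epoch, together with the epoch at which validation loss was lowest. The sketch below assumes the `training_history` layout used above and that `train_loss` is present (the plotting cell treats it as optional). `generalization_gap` is a hypothetical helper name and the history values are made up.

```python
def generalization_gap(history):
    """Return (best_epoch, gaps).

    best_epoch: epoch with the lowest validation loss.
    gaps: dict mapping epoch -> val_loss - train_loss.
    """
    best = min(history, key=lambda h: h["val_loss"])
    gaps = {h["epoch"]: h["val_loss"] - h["train_loss"] for h in history}
    return best["epoch"], gaps


# Made-up history in the same shape as m['training_history'].
demo_history = [
    {"epoch": 1, "train_loss": 1.00, "val_loss": 1.10},
    {"epoch": 2, "train_loss": 0.70, "val_loss": 0.90},
    {"epoch": 3, "train_loss": 0.50, "val_loss": 0.95},
]
best_epoch, gaps = generalization_gap(demo_history)
print(f"lowest validation loss at epoch {best_epoch}")
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: gap = {gap:.2f}")
```

A gap that keeps widening after the best epoch, while training loss continues to fall, is a common sign of overfitting.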

### 5. Resource Usage Analysis
We compare models in terms of parameter count, memory usage, and inference speed.

In [None]:
# Display resource usage table (parameters, memory, speed)
resource_data = []
for m in models:
    resource_data.append({
        'Model': m['model_name'],
        'Parameters': m['trainable_parameters'],
        'Memory (MB)': m.get('memory_usage', None),
        'Inference Speed (img/s)': m.get('inference_speed', None)
    })
df_resource = pd.DataFrame(resource_data)
df_resource
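
If a results file lacks `inference_speed` or `memory_usage`, rough values can be estimated. The sketch below is an illustration under stated assumptions: `measure_throughput` is a hypothetical helper that times any callable `predict_fn` (a stand-in for a model's forward pass), and `weight_memory_mb` counts only the weights at 4 bytes per float32 parameter, ignoring activations, optimizer state, and framework overhead.

```python
import time


def measure_throughput(predict_fn, batch, n_images, repeats=10, warmup=2):
    """Return images per second for predict_fn(batch), averaged over `repeats` calls."""
    for _ in range(warmup):  # warm-up calls are excluded from the timing
        predict_fn(batch)
    start = time.perf_counter()
    for _ in range(repeats):
        predict_fn(batch)
    elapsed = max(time.perf_counter() - start, 1e-12)  # guard against a zero reading
    return repeats * n_images / elapsed


def weight_memory_mb(trainable_parameters, bytes_per_param=4):
    """Memory taken by the weights alone, in MiB (float32 by default)."""
    return trainable_parameters * bytes_per_param / (1024 ** 2)


# Stand-in for a model: sums each "image" (a list of floats).
def dummy_predict(batch):
    return [sum(image) for image in batch]


batch = [[0.0] * 1000 for _ in range(8)]
speed = measure_throughput(dummy_predict, batch, n_images=len(batch))
print(f"throughput: {speed:.0f} img/s")
print(f"weights of a 1,048,576-parameter model: {weight_memory_mb(1_048_576):.1f} MiB")
```

When timing a model on a GPU, the device usually has to be synchronized before reading the clock, because kernels are launched asynchronously. Otherwise the measured time can be far too short.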

### 6. Discussion and Recommendations
We interpret the results, discuss limitations, and provide recommendations for future work and deployment.