# Sentinel Phase 2: ImageNet-C Benchmark

This notebook benchmarks Sentinel against OpenCV baselines on ImageNet-C corruptions.

**Hardware Requirements:** GPU (T4 or better)

**Runtime:** ~30-45 minutes for 3 corruption types

## Setup: Install Sentinel

In [None]:
# Clone Sentinel repo
!git clone https://github.com/kelsierlol/sentinel.git
%cd sentinel

# Install dependencies
!pip install -e . -q
!pip install opencv-python matplotlib -q

print("\n✅ Sentinel installed!")

# Verify installation
import sentinel
print(f"Sentinel version: {sentinel.__version__}")
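
Since the benchmark expects a GPU runtime (T4 or better), it may be worth confirming that one is visible before going further. The following is a minimal sketch that shells out to `nvidia-smi`. The helper name `gpu_names` is hypothetical and not part of Sentinel.

```python
import shutil
import subprocess

def gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=30,
        ).stdout
    except (subprocess.SubprocessError, OSError):
        # nvidia-smi exists but failed or timed out (e.g. no driver loaded)
        return []
    return [line.strip() for line in out.splitlines() if line.strip()]

names = gpu_names()
print("GPUs:", names if names else "none detected; switch the runtime to a GPU")
```

In Colab, the runtime type can be changed under Runtime > Change runtime type if no GPU is listed.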

## Download ImageNet-C (Subset)

We'll download just a few corruption types for quick testing.

- **Option 1:** Download a mini subset (recommended for testing)
- **Option 2:** Download the full ImageNet-C (~6GB) using the Kaggle dataset

In [None]:
# Option 1: Download mini subset from Kaggle (faster)
# You'll need to upload your kaggle.json to Colab

from google.colab import files
import os

# Upload kaggle.json
print("Upload your kaggle.json file:")
uploaded = files.upload()

# Setup Kaggle
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

# Download ImageNet-C
!kaggle datasets download -d sayakpaul/imagenet-c
!unzip -q imagenet-c.zip -d imagenet_c

print("\n✅ ImageNet-C downloaded!")
!ls imagenet_c/

### Alternative: Use Google Drive

If you already have ImageNet-C in your Drive:

In [None]:
# Uncomment if using Google Drive
# from google.colab import drive
# drive.mount('/content/drive')
# 
# imagenet_c_path = "/content/drive/MyDrive/imagenet-c"
# imagenet_val_path = "/content/drive/MyDrive/imagenet/val"

## Quick Test: 3 Corruption Types

Let's start with 3 corruption types for a quick benchmark (~15 mins).

In [None]:
# Set paths
imagenet_c_path = "./imagenet_c"  # Adjust if needed
imagenet_val_path = "./imagenet_val"  # You'll need clean ImageNet val set

# For quick testing, we'll use synthetic clean images
# In production, use real ImageNet validation set

corruption_types = [
    "gaussian_noise",
    "defocus_blur",
    "motion_blur",
]

print("Testing on 3 corruption types:")
for c in corruption_types:
    print(f"  - {c}")
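
Before launching the benchmark, a quick check that each selected corruption is actually present on disk can save a failed run. The sketch below assumes the standard ImageNet-C layout of `<root>/<corruption>/<severity>/...`. The Kaggle archive may unpack with an extra directory level, so adjust the root if everything is reported missing. `find_missing_corruptions` is a hypothetical helper, and the demo runs against a throwaway directory rather than the real dataset.

```python
import tempfile
from pathlib import Path

def find_missing_corruptions(root, corruption_types, severity):
    """Return the corruption names with no <root>/<name>/<severity> directory."""
    root = Path(root)
    return [
        name for name in corruption_types
        if not (root / name / str(severity)).is_dir()
    ]

# Demo on a temporary directory that contains only gaussian_noise/3
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "gaussian_noise" / "3").mkdir(parents=True)
    demo_missing = find_missing_corruptions(
        tmp, ["gaussian_noise", "motion_blur"], severity=3
    )
print(demo_missing)  # ['motion_blur']
```

In the notebook, the call would be `find_missing_corruptions(imagenet_c_path, corruption_types, severity=3)`. An empty list means every selected corruption was found.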

## Run Benchmark

In [None]:
import sys
sys.path.append('./colab')

from phase2_benchmark import run_benchmark

# Run benchmark
results = run_benchmark(
    imagenet_c_path=imagenet_c_path,
    imagenet_val_path=imagenet_val_path,
    corruption_types=corruption_types,
    severity=3,  # Medium severity
    max_samples_per_corruption=500,  # 500 samples per corruption
    max_clean_samples=500,  # 500 clean samples
    output_path="benchmark_results.json",
)

print("\n✅ Benchmark complete!")
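
The benchmark reports AUROC for each detector. The internals of `run_benchmark` are not shown in this notebook, so the following is only a sketch of the metric itself, not of Sentinel's implementation. It assumes each detector emits one score per image, that higher scores mean "more likely corrupted", and that clean images are the negative class. The function name `auroc` is hypothetical.

```python
def auroc(clean_scores, corrupted_scores):
    """AUROC as the probability that a corrupted image outscores a clean one.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    if not clean_scores or not corrupted_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for pos in corrupted_scores:
        for neg in clean_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(corrupted_scores) * len(clean_scores))

# Perfect separation: every corrupted score exceeds every clean score
print(auroc([0.1, 0.2], [0.8, 0.9]))  # 1.0
# No separation: corrupted scores sit in the middle of the clean ones
print(auroc([0.1, 0.9], [0.5, 0.5]))  # 0.5
```

An AUROC of 0.5 corresponds to chance-level detection, which is why the comparison plot is easiest to read with that baseline in mind. The pairwise loop is quadratic, which is fine at 500 samples per class; for larger runs, `sklearn.metrics.roc_auc_score` computes the same quantity more efficiently.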

## Visualize Results

In [None]:
import json
import matplotlib.pyplot as plt
import numpy as np

# Load results
with open("benchmark_results.json", "r") as f:
    results = json.load(f)

# Extract data
corruptions = [r["corruption"] for r in results["corruption_results"]]
sentinel_auroc = [r["sentinel_auroc"] for r in results["corruption_results"]]
opencv_blur_auroc = [r["opencv_blur_auroc"] for r in results["corruption_results"]]
opencv_hist_auroc = [r["opencv_histogram_auroc"] for r in results["corruption_results"]]

# Plot comparison
fig, ax = plt.subplots(1, 1, figsize=(10, 6))

x = np.arange(len(corruptions))
width = 0.25

ax.bar(x - width, sentinel_auroc, width, label='Sentinel', color='#2ecc71')
ax.bar(x, opencv_blur_auroc, width, label='OpenCV Blur', color='#e74c3c')
ax.bar(x + width, opencv_hist_auroc, width, label='OpenCV Histogram', color='#95a5a6')

ax.set_xlabel('Corruption Type', fontsize=12)
ax.set_ylabel('AUROC', fontsize=12)
ax.set_title('Sentinel vs OpenCV Baselines on ImageNet-C', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(corruptions, rotation=45, ha='right')
ax.legend()
ax.grid(axis='y', alpha=0.3)
ax.set_ylim([0, 1.0])

plt.tight_layout()
plt.savefig('benchmark_comparison.png', dpi=150, bbox_inches='tight')
plt.show()

print("\n📊 Results visualized!")

## Summary Statistics

In [None]:
print("=" * 80)
print("BENCHMARK SUMMARY")
print("=" * 80)

summary = results["summary"]

print(f"\nMean AUROC:")
print(f"  Sentinel:        {summary['sentinel_mean_auroc']:.4f}")
print(f"  OpenCV Blur:     {summary['opencv_blur_mean_auroc']:.4f}")
print(f"  OpenCV Histogram: {summary['opencv_hist_mean_auroc']:.4f}")

# Absolute AUROC gain in percentage points (AUROC difference x 100)
improvement_blur = (summary['sentinel_mean_auroc'] - summary['opencv_blur_mean_auroc']) * 100
improvement_hist = (summary['sentinel_mean_auroc'] - summary['opencv_hist_mean_auroc']) * 100

print("\nImprovement (AUROC percentage points):")
print(f"  vs OpenCV Blur:  {improvement_blur:+.1f} pp")
print(f"  vs OpenCV Hist:  {improvement_hist:+.1f} pp")

# Check if target met
target_improvement = 10.0  # Target: 10 percentage points over OpenCV Blur

if improvement_blur >= target_improvement:
    print(f"\n✅ SUCCESS: Beat OpenCV Blur by {improvement_blur:.1f} pp (target: {target_improvement} pp)")
else:
    print(f"\n⚠️  Improvement: {improvement_blur:.1f} pp (target: {target_improvement} pp)")

print("=" * 80)
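
The improvement figures in the summary are differences in mean AUROC scaled by 100, which makes them percentage points rather than relative percentages. The two can differ noticeably, so here is a small sketch contrasting them. The function names are hypothetical and the numbers are illustrative only, not benchmark results.

```python
def absolute_gain_pp(model_auroc, baseline_auroc):
    """Gain in percentage points: the raw AUROC difference times 100."""
    return (model_auroc - baseline_auroc) * 100.0

def relative_gain_pct(model_auroc, baseline_auroc):
    """Gain relative to the baseline's own AUROC, in percent."""
    return (model_auroc - baseline_auroc) / baseline_auroc * 100.0

# Illustrative values, not measured results
print(f"{absolute_gain_pp(0.90, 0.75):+.1f} pp")  # +15.0 pp
print(f"{relative_gain_pct(0.90, 0.75):+.1f} %")  # +20.0 %
```

When quoting results, it helps to state which of the two is meant, since a 10-point target is stricter than a 10% relative target whenever the baseline AUROC is below 1.0.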

## Per-Corruption Breakdown

In [None]:
import pandas as pd

# Create DataFrame for nice display
df_results = []
for r in results["corruption_results"]:
    df_results.append({
        "Corruption": r["corruption"],
        "Sentinel AUROC": f"{r['sentinel_auroc']:.4f}",
        "Sentinel F1": f"{r['sentinel_f1']:.4f}",
        "OpenCV Blur AUROC": f"{r['opencv_blur_auroc']:.4f}",
        "OpenCV Hist AUROC": f"{r['opencv_histogram_auroc']:.4f}",
        "Samples": r["num_samples"],
    })

df = pd.DataFrame(df_results)
print("\nDetailed Results:")
print(df.to_string(index=False))

## Download Results

Download the results JSON and comparison plot.

In [None]:
from google.colab import files

# Download results
files.download('benchmark_results.json')
files.download('benchmark_comparison.png')

print("\n✅ Results downloaded!")

---

## Full Benchmark (Optional)

Run on all 15 ImageNet-C corruption types (takes ~2 hours).

In [None]:
# Uncomment to run full benchmark
# all_corruptions = [
#     "gaussian_noise", "shot_noise", "impulse_noise",
#     "defocus_blur", "glass_blur", "motion_blur", "zoom_blur",
#     "snow", "frost", "fog", "brightness",
#     "contrast", "elastic_transform", "pixelate", "jpeg_compression"
# ]
# 
# full_results = run_benchmark(
#     imagenet_c_path=imagenet_c_path,
#     imagenet_val_path=imagenet_val_path,
#     corruption_types=all_corruptions,
#     severity=3,
#     max_samples_per_corruption=1000,
#     max_clean_samples=1000,
#     output_path="full_benchmark_results.json",
# )
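
With all 15 corruption types, per-corruption numbers can be easier to read when grouped into the four ImageNet-C families (noise, blur, weather, digital). The sketch below assumes the full results use the same `corruption_results` entries as the quick run, each with `corruption` and `sentinel_auroc` keys. `mean_auroc_by_category` is a hypothetical helper, and the demo values are placeholders rather than real results.

```python
from collections import defaultdict

# ImageNet-C corruption families
CATEGORY_OF = {
    "gaussian_noise": "noise", "shot_noise": "noise", "impulse_noise": "noise",
    "defocus_blur": "blur", "glass_blur": "blur",
    "motion_blur": "blur", "zoom_blur": "blur",
    "snow": "weather", "frost": "weather", "fog": "weather", "brightness": "weather",
    "contrast": "digital", "elastic_transform": "digital",
    "pixelate": "digital", "jpeg_compression": "digital",
}

def mean_auroc_by_category(corruption_results, key="sentinel_auroc"):
    """Average the chosen AUROC field over each corruption family."""
    buckets = defaultdict(list)
    for entry in corruption_results:
        category = CATEGORY_OF.get(entry["corruption"], "other")
        buckets[category].append(entry[key])
    return {cat: sum(vals) / len(vals) for cat, vals in buckets.items()}

# Placeholder entries for illustration only
demo_results = [
    {"corruption": "gaussian_noise", "sentinel_auroc": 0.5},
    {"corruption": "shot_noise", "sentinel_auroc": 1.0},
    {"corruption": "fog", "sentinel_auroc": 0.25},
]
print(mean_auroc_by_category(demo_results))  # {'noise': 0.75, 'weather': 0.25}
```

Passing `key="opencv_blur_auroc"` would produce the same grouping for the blur baseline, assuming that field is present in each entry.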