# TTA Comparison Suite
## ConvNeXt-Tiny vs. Swin-Tiny

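This notebook loads the experiment registry at `results/experiment_logs/experiment_registry.csv` and compares test-time adaptation (TTA) strategies along three dimensions: direct-transfer F1, F1 under each adaptation method, and the trade-off between overall F1 and Rust recall.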

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os

CSV_PATH = "../results/experiment_logs/experiment_registry.csv"

if not os.path.isfile(CSV_PATH):
    print(f"Error: {CSV_PATH} not found. Please run Notebook 01 and 02 first.")
else:
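    df = pd.read_csv(CSV_PATH)
    # pandas 2.x parses the literal string "None" as NaN by default, so restore
    # the baseline label that the later cells filter and group on.
    df['Adaptation'] = df['Adaptation'].fillna('None')
    display(df.tail(10))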
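
The cells that follow rely on four registry columns: `Model`, `Adaptation`, `F1`, and `Rust_Recall`. As a minimal sketch, a helper along these lines could report which of them are absent before any plot is attempted. The names `missing_registry_columns` and `REQUIRED_COLUMNS` are hypothetical and not part of the original notebook.

```python
import pandas as pd

# Columns referenced by the analysis cells in this notebook.
REQUIRED_COLUMNS = ("Model", "Adaptation", "F1", "Rust_Recall")


def missing_registry_columns(registry: pd.DataFrame, required=REQUIRED_COLUMNS) -> list:
    """Return the required column names that the registry lacks, in order."""
    return [name for name in required if name not in registry.columns]
```

Calling `missing_registry_columns(df)` right after loading should return an empty list when the registry is complete.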

## Analysis 1: The Domain Gap

This cell compares direct-transfer F1 (no adaptation applied) across the two architectures.

In [None]:
if os.path.isfile(CSV_PATH):
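    # Direct-transfer rows carry the label 'None'; pandas 2.x may have read that
    # label as NaN, so missing values are treated as 'None' here.
    baseline_df = df[df['Adaptation'].fillna('None') == 'None']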
    
    plt.figure(figsize=(10, 6))
    sns.barplot(data=baseline_df, x='Model', y='F1', palette='viridis')
    plt.title("The Domain Gap: Direct Transfer F1 (CNN vs ViT)")
    plt.ylim(0, 1)
    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.show()
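
By default `sns.barplot` draws the mean of `F1` per model and adds an error bar when a model has several baseline rows. A small table makes the underlying numbers and run counts explicit. The sketch below assumes the baseline is labelled `'None'` (or is missing) in `Adaptation`; the function name `baseline_f1_summary` is hypothetical.

```python
import pandas as pd


def baseline_f1_summary(registry: pd.DataFrame, baseline_label: str = "None") -> pd.DataFrame:
    """Mean, standard deviation, and run count of baseline F1 per model."""
    # Assumption: a missing Adaptation value means "no adaptation".
    is_baseline = registry["Adaptation"].fillna(baseline_label) == baseline_label
    return (
        registry.loc[is_baseline]
        .groupby("Model")["F1"]
        .agg(["mean", "std", "count"])
    )
```

In the notebook this could be shown with `display(baseline_f1_summary(df))`. A model with a single baseline run gets `NaN` in the `std` column.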

## Analysis 2: Adaptation Performance

This cell compares F1 for the unadapted baseline, TENT, and EATA on each model.

In [None]:
if os.path.isfile(CSV_PATH):
    plt.figure(figsize=(12, 7))
    sns.barplot(data=df, x='Model', y='F1', hue='Adaptation', palette='magma')
    plt.title("Adaptation Strategies: F1 Score Comparison")
    plt.ylim(0, 1)
    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.legend(title="Method", loc='upper right')
    plt.show()
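
The grouped bars show absolute F1, which can make small gains or losses hard to read. One possible complement is a table of each method's mean F1 minus the baseline of the same model. This is a sketch under the assumption that the baseline is labelled `'None'` (or is missing) in `Adaptation`; `f1_gain_over_baseline` is a hypothetical name.

```python
import pandas as pd


def f1_gain_over_baseline(registry: pd.DataFrame, baseline_label: str = "None") -> pd.DataFrame:
    """Mean F1 of each adaptation method minus the same model's baseline F1."""
    # Assumption: a missing Adaptation value means "no adaptation".
    methods = registry["Adaptation"].fillna(baseline_label)
    table = registry.assign(Adaptation=methods).pivot_table(
        index="Model", columns="Adaptation", values="F1", aggfunc="mean"
    )
    # Subtract the baseline column row by row, then drop it (it would be all zeros).
    return table.sub(table[baseline_label], axis=0).drop(columns=baseline_label)
```

Positive entries mean the method improved on direct transfer for that model; negative entries mean it made F1 worse.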

## Analysis 3: Safety vs. Accuracy

The goal is the top-right corner of the plot: high overall F1 together with high Rust recall. TENT often raises overall F1 while its recall on the Rust class collapses, which makes it a safety hazard.

In [None]:
if os.path.isfile(CSV_PATH):
    plt.figure(figsize=(10, 8))
    # Note: uses the Rust_Recall column that was added to the registry
    sns.scatterplot(
        data=df, x='F1', y='Rust_Recall',
        hue='Adaptation', style='Model', s=200, palette='Set1',
    )
    
    plt.title("Safety-Accuracy Trade-off")
    plt.xlabel("Overall F1 Score (Accuracy)")
    plt.ylabel("Rust Recall (Safety)")
    plt.xlim(0.3, 1.0)
    plt.ylim(0.0, 1.05)
    
    plt.axhline(0.8, color='red', linestyle='--', alpha=0.5, label="Safety Threshold")
    plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.)
    plt.grid(True, alpha=0.3)
    plt.show()
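
The dashed line in the plot marks a Rust recall of 0.8. To list the runs that fall below it rather than reading them off the scatter, a filter like the following could be used. It is a minimal sketch: the function name `flag_unsafe_runs` and the constant `SAFETY_THRESHOLD` are hypothetical, and the threshold simply reuses the value passed to `plt.axhline` above.

```python
import pandas as pd

SAFETY_THRESHOLD = 0.8  # same value as the dashed line in the scatter plot


def flag_unsafe_runs(registry: pd.DataFrame, threshold: float = SAFETY_THRESHOLD) -> pd.DataFrame:
    """Return the runs whose Rust recall is below the threshold, lowest first."""
    columns = ["Model", "Adaptation", "F1", "Rust_Recall"]
    unsafe = registry.loc[registry["Rust_Recall"] < threshold, columns]
    return unsafe.sort_values("Rust_Recall").reset_index(drop=True)
```

In the notebook, `display(flag_unsafe_runs(df))` would show the offending model and method combinations next to their overall F1.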