# Extended Analysis: Filling the Gaps

This notebook addresses remaining questions from experiments 00-07:

## Key Questions

1. **Temperature at Regional Scale**: We showed L=10 wins temperature globally (R²=0.88 vs 0.52).
   Does L=40 win temperature *within regions* like it does for population?

2. **Raw Spherical Harmonics vs SIREN**: What if we bypass SIREN and use raw SH features?
   This isolates the contribution of the neural network vs the positional encoding.

3. **Sub-Regional Tests**: Push the regional effect further by testing within US states
   and European countries. Does the L=40 advantage increase with smaller regions?

4. **Cross-Region Transfer**: Train on one region, test on another.
   Does L=40's regional advantage transfer? This tests whether L=40 learns general
   patterns or region-specific features.

## Expected Runtime
~30-45 minutes on Colab T4 GPU

In [1]:
# Setup
import os
import sys

if 'COLAB_GPU' in os.environ:
    !rm -rf sample_data .config satclip 2>/dev/null
    !git clone https://github.com/1hamzaiqbal/satclip.git
    !pip install lightning torchgeo huggingface_hub geopandas shapely requests rasterio --quiet

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPRegressor, MLPClassifier
from sklearn.linear_model import Ridge, LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, accuracy_score
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import warnings
warnings.filterwarnings('ignore')

if 'COLAB_GPU' in os.environ:
    sys.path.append('./satclip/satclip')
else:
    sys.path.append(os.path.join(os.path.dirname(os.getcwd()), 'satclip'))

import torch
import torch.nn.functional as F
from huggingface_hub import hf_hub_download
from load import get_satclip
import positional_encoding as PE

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Load models
print("Loading SatCLIP models...")
model_l10 = get_satclip(hf_hub_download("microsoft/SatCLIP-ViT16-L10", "satclip-vit16-l10.ckpt"), device=device)
model_l40 = get_satclip(hf_hub_download("microsoft/SatCLIP-ViT16-L40", "satclip-vit16-l40.ckpt"), device=device)
model_l10.eval()
model_l40.eval()

# Also get the full geo_model to access raw spherical harmonics
full_model_l10 = get_satclip(hf_hub_download("microsoft/SatCLIP-ViT16-L10", "satclip-vit16-l10.ckpt"), device=device, return_all=True)
full_model_l40 = get_satclip(hf_hub_download("microsoft/SatCLIP-ViT16-L40", "satclip-vit16-l40.ckpt"), device=device, return_all=True)

print("Models loaded!")

def get_embeddings(model, coords, batch_size=1000):
    """Get SIREN embeddings."""
    all_emb = []
    for i in range(0, len(coords), batch_size):
        batch = coords[i:i+batch_size]
        coords_tensor = torch.tensor(batch).double()
        with torch.no_grad():
            emb = model(coords_tensor.to(device)).cpu().numpy()
        all_emb.append(emb)
    return np.vstack(all_emb)

def get_raw_sh_features(coords, L=10):
    """Get RAW spherical harmonic features (before SIREN)."""
    sh_encoder = PE.SphericalHarmonics(legendre_polys=L)
    coords_tensor = torch.tensor(coords).double()
    with torch.no_grad():
        features = sh_encoder(coords_tensor).cpu().numpy()
    return features

print(f"\nRaw SH feature dimensions:")
print(f"  L=10: 10*10 = {10*10} features")
print(f"  L=40: 40*40 = {40*40} features")

using pretrained moco vit16
Models loaded!

Raw SH feature dimensions:
  L=10: 10*10 = 100 features
  L=40: 40*40 = 1600 features


---
## 1. Temperature Prediction: Global vs Regional

**Hypothesis**: L=40 will win temperature prediction within regions, just like population.

We previously found:
- Global temperature: L=10 R²=0.88, L=40 R²=0.52 → L=10 wins big
- Global population: L=10 ≈ L=40 (~0.78)
- Regional population: L=40 wins by +0.03 to +0.08 in R²

**Question**: Does L=40 also win regional temperature?
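
For reference, the R² scores and the "winner" labels in this comparison can be reproduced by hand. The following is a minimal NumPy sketch rather than the notebook's own code: `r2` and `pick_winner` are hypothetical helper names, and the ±0.02 margin is assumed to match the threshold used in the comparison cells.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def pick_winner(r2_l10, r2_l40, margin=0.02):
    """Label the comparison; differences inside the margin count as a tie."""
    diff = r2_l40 - r2_l10
    if diff > margin:
        return "L=40"
    if diff < -margin:
        return "L=10"
    return "~Same"

# Global temperature figures quoted above
print(pick_winner(0.88, 0.52))  # L=10
```

Because R² compares the model against always predicting the test-set mean, a score of 0 means "no better than the mean" and negative scores are possible.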

In [None]:
# Download temperature data using official SatCLIP method
# Source: Global Air Temperature dataset (https://www.nature.com/articles/sdata2018246)
import requests
import io
import time

print("Downloading temperature data...")

# Official URL from SatCLIP B01 notebook
url = 'https://springernature.figshare.com/ndownloader/files/12609182'

# Use requests with headers (more robust than urllib)
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept': 'text/csv,application/csv,*/*'
}

# Retry logic
max_retries = 3
for attempt in range(max_retries):
    try:
        response = requests.get(url, headers=headers, timeout=30, allow_redirects=True)
        response.raise_for_status()
        content = response.text
        
        # Verify we got CSV data
        if not content.strip() or content.strip().startswith('<!DOCTYPE') or content.strip().startswith('<html'):
            raise ValueError("Received HTML instead of CSV")
        
        # Parse CSV
        inc = np.array(pd.read_csv(io.StringIO(content)))
        print(f"✅ Downloaded successfully on attempt {attempt + 1}")
        break
    except Exception as e:
        print(f"Attempt {attempt + 1} failed: {e}")
        if attempt < max_retries - 1:
            time.sleep(2)
        else:
            # Fallback: try alternative URL
            print("Trying alternative download method...")
            try:
                alt_url = 'https://ndownloader.figshare.com/files/12609182'
                response = requests.get(alt_url, headers=headers, timeout=30)
                response.raise_for_status()  # surface HTTP errors instead of parsing an error page
                content = response.text
                inc = np.array(pd.read_csv(io.StringIO(content)))
                print("✅ Downloaded using alternative URL")
            except Exception:
                # Last resort: generate synthetic temperature data
                print("⚠️  Could not download temperature data. Using synthetic data for testing.")
                np.random.seed(42)
                n_samples = 10000
                lons = np.random.uniform(-180, 180, n_samples)
                lats = np.random.uniform(-60, 70, n_samples)
                # Realistic temperature: cold at poles, warm at equator
                temps = 25 - 0.5 * np.abs(lats) + np.random.normal(0, 3, n_samples)
                inc = np.column_stack([lons, lats, np.zeros(n_samples), np.zeros(n_samples), temps])
                print("Generated synthetic temperature data with realistic lat/temp relationship")

# Parse the data
# Columns: lon, lat, ?, ?, temp, precip
coords_all = inc[:, :2]  # lon, lat
temp_values = inc[:, 4]  # temperature

# Create dataframe
temp_all = pd.DataFrame({
    'lon': coords_all[:, 0],
    'lat': coords_all[:, 1],
    't2m': temp_values
})

print(f"\nTemperature data: {len(temp_all)} samples")
print(f"Lat range: [{temp_all['lat'].min():.1f}, {temp_all['lat'].max():.1f}]")
print(f"Lon range: [{temp_all['lon'].min():.1f}, {temp_all['lon'].max():.1f}]")
print(f"Temp range: [{temp_all['t2m'].min():.1f}, {temp_all['t2m'].max():.1f}]")
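
The download cell retries a fixed number of times with a constant 2-second pause. One possible way to factor that pattern out is sketched below. `retry` is a hypothetical helper (not part of SatCLIP or `requests`), and the exponential backoff is an assumption rather than what the cell above does.

```python
import time

def retry(fn, max_retries=3, base_delay=2.0, sleep=time.sleep):
    """Call fn() until it succeeds or max_retries attempts have failed.

    Waits base_delay * 2**attempt seconds between attempts and re-raises
    the last exception if every attempt fails.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as e:  # broad on purpose: network and parsing errors alike
            last_error = e
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt < max_retries - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

The `sleep` argument exists only so the waiting can be swapped out when testing; in normal use the default `time.sleep` applies.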

In [None]:
# Define regions for temperature test
REGIONS = {
    'Global': None,
    'North America': (-130, 25, -60, 55),
    'Europe': (-10, 35, 40, 70),
    'East Asia': (100, 20, 145, 55),
    'South America': (-80, -55, -35, 10),
    'Africa': (-20, -35, 55, 35),
    'Australia': (110, -45, 155, -10),
}

def filter_by_bounds(df, bounds):
    """Filter dataframe by geographic bounds."""
    if bounds is None:
        return df
    lon_min, lat_min, lon_max, lat_max = bounds
    return df[(df['lon'] >= lon_min) & (df['lon'] <= lon_max) &
              (df['lat'] >= lat_min) & (df['lat'] <= lat_max)]

def run_regression(coords, values, model_l10, model_l40, test_size=0.5, seed=42):
    """Run regression for both models."""
    emb_l10 = get_embeddings(model_l10, coords)
    emb_l40 = get_embeddings(model_l40, coords)

    # Split both embedding sets in one call so train/test rows are guaranteed to match
    X_train_l10, X_test_l10, X_train_l40, X_test_l40, y_train, y_test = train_test_split(
        emb_l10, emb_l40, values, test_size=test_size, random_state=seed
    )

    # Scale
    scaler_l10, scaler_l40 = StandardScaler(), StandardScaler()
    X_train_l10 = scaler_l10.fit_transform(X_train_l10)
    X_test_l10 = scaler_l10.transform(X_test_l10)
    X_train_l40 = scaler_l40.fit_transform(X_train_l40)
    X_test_l40 = scaler_l40.transform(X_test_l40)

    # MLP regression
    mlp_l10 = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=500, random_state=seed, early_stopping=True)
    mlp_l40 = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=500, random_state=seed, early_stopping=True)

    mlp_l10.fit(X_train_l10, y_train)
    mlp_l40.fit(X_train_l40, y_train)

    r2_l10 = r2_score(y_test, mlp_l10.predict(X_test_l10))
    r2_l40 = r2_score(y_test, mlp_l40.predict(X_test_l40))

    return {'r2_l10': r2_l10, 'r2_l40': r2_l40, 'diff': r2_l40 - r2_l10, 'n': len(coords)}

print("="*70)
print("TEMPERATURE PREDICTION: GLOBAL vs REGIONAL")
print("="*70)

temp_results = []

print(f"\n{'Region':<20} | {'N':>7} | {'L=10 R²':>8} | {'L=40 R²':>8} | {'Δ':>8} | Winner")
print("-" * 75)

for region_name, bounds in REGIONS.items():
    df = filter_by_bounds(temp_all, bounds)

    if len(df) < 500:
        print(f"{region_name:<20} | Too few samples ({len(df)})")
        continue

    # Sample if too many
    if len(df) > 15000:
        df = df.sample(n=15000, random_state=42)

    coords = df[['lon', 'lat']].values
    values = df['t2m'].values  # Temperature column

    results = run_regression(coords, values, model_l10, model_l40)
    winner = "L=40" if results['diff'] > 0.02 else ("L=10" if results['diff'] < -0.02 else "~Same")

    print(f"{region_name:<20} | {results['n']:>7} | {results['r2_l10']:>8.3f} | {results['r2_l40']:>8.3f} | {results['diff']:>+7.3f} | {winner}")

    temp_results.append({
        'region': region_name,
        'n_samples': results['n'],
        'r2_l10': results['r2_l10'],
        'r2_l40': results['r2_l40'],
        'diff': results['diff']
    })

temp_df = pd.DataFrame(temp_results)

In [None]:
# Compare Temperature vs Population regional effects
print("\n" + "="*70)
print("COMPARISON: TEMPERATURE vs POPULATION REGIONAL EFFECTS")
print("="*70)

# From Experiment 07
pop_regional = {
    'Global': -0.002,
    'USA': 0.077,
    'Europe': 0.045,
    'China': 0.028,
    'Africa': 0.037,
    'Brazil': 0.035,
}

print(f"\n{'Metric':<25} | {'Population':>12} | {'Temperature':>12}")
print("-" * 55)

# Global
global_temp = temp_df[temp_df['region'] == 'Global']['diff'].values[0] if len(temp_df[temp_df['region'] == 'Global']) > 0 else np.nan
print(f"{'Global L=40 advantage':<25} | {pop_regional['Global']:>+11.3f} | {global_temp:>+11.3f}")

# Regional average (excluding global)
regional_temp = temp_df[temp_df['region'] != 'Global']['diff'].mean()
regional_pop = np.mean([v for k, v in pop_regional.items() if k != 'Global'])
print(f"{'Avg Regional L=40 advantage':<25} | {regional_pop:>+11.3f} | {regional_temp:>+11.3f}")

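print("\n" + "-"*55)
# Use the same ±0.02 margin as the per-region "Winner" column
if regional_temp > 0.02:
    print("✅ Temperature shows the SAME PATTERN as population:")
    print("   on average, L=40 wins regionally for temperature too.")
elif regional_temp < -0.02:
    print("❌ Temperature does NOT show the same pattern:")
    print("   on average, L=10 also wins temperature within regions.")
else:
    print("➖ No clear regional difference between L=10 and L=40 for temperature.")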

---
## 2. Raw Spherical Harmonics vs SIREN Embeddings

**Question**: Is the SIREN network helping or hurting L=40?

**Test**: Train MLP directly on raw spherical harmonic features, bypassing SIREN.
- L=10: 100 raw features → MLP → prediction
- L=40: 1600 raw features → MLP → prediction

**Predictions**:
- If raw L=40 beats raw L=10: The extra frequencies help
- If SIREN L=40 << raw L=40: SIREN is hurting L=40
- If SIREN L=40 >> raw L=40: SIREN is helping L=40
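
The feature counts above follow from how spherical harmonics are indexed: each degree l contributes 2l + 1 orders, so degrees 0 through L − 1 give L² basis functions in total. The sketch below checks that arithmetic. `num_sh_features` is a hypothetical helper, and it assumes the encoder uses degrees 0 to L − 1, which is consistent with the 100 and 1600 dimensions quoted above.

```python
def num_sh_features(L):
    """Number of real spherical-harmonic basis functions for degrees 0..L-1."""
    return sum(2 * l + 1 for l in range(L))

for L in (10, 40):
    print(f"L={L}: {num_sh_features(L)} raw features")
# L=10: 100 raw features
# L=40: 1600 raw features
```

With 1600 inputs, the first MLP layer for L=40 has 16 times as many input weights as the L=10 one, which is worth keeping in mind when reading the raw-feature results.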

In [None]:
print("="*70)
print("RAW SPHERICAL HARMONICS vs SIREN EMBEDDINGS")
print("="*70)

# Use temperature data for this test (sample up to available data)
n_samples = min(10000, len(temp_all))
test_df = temp_all.sample(n=n_samples, random_state=42)
coords = test_df[['lon', 'lat']].values
values = test_df['t2m'].values

print(f"\nUsing {len(test_df)} samples for Raw SH vs SIREN comparison")
print("Getting embeddings and raw features...")

# Get SIREN embeddings
emb_siren_l10 = get_embeddings(model_l10, coords)
emb_siren_l40 = get_embeddings(model_l40, coords)

# Get raw spherical harmonic features
sh_raw_l10 = get_raw_sh_features(coords, L=10)
sh_raw_l40 = get_raw_sh_features(coords, L=40)

print(f"\nFeature dimensions:")
print(f"  SIREN L=10: {emb_siren_l10.shape}")
print(f"  SIREN L=40: {emb_siren_l40.shape}")
print(f"  Raw SH L=10: {sh_raw_l10.shape}")
print(f"  Raw SH L=40: {sh_raw_l40.shape}")

In [None]:
def run_regression_with_features(X, y, test_size=0.5, seed=42):
    """Train MLP regressor on given features."""
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=seed)

    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    mlp = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=500, random_state=seed, early_stopping=True)
    mlp.fit(X_train, y_train)

    return r2_score(y_test, mlp.predict(X_test))

print("\nTraining regressors on different feature sets...")

# Global temperature regression
results_global = {
    'SIREN L=10': run_regression_with_features(emb_siren_l10, values),
    'SIREN L=40': run_regression_with_features(emb_siren_l40, values),
    'Raw SH L=10': run_regression_with_features(sh_raw_l10, values),
    'Raw SH L=40': run_regression_with_features(sh_raw_l40, values),
}

print("\n" + "-"*50)
print("GLOBAL Temperature Regression R²")
print("-"*50)
for name, r2 in results_global.items():
    print(f"  {name:<15}: {r2:.3f}")

print("\nKey comparisons:")
print(f"  SIREN helps L=10? {results_global['SIREN L=10'] - results_global['Raw SH L=10']:+.3f}")
print(f"  SIREN helps L=40? {results_global['SIREN L=40'] - results_global['Raw SH L=40']:+.3f}")
print(f"  Raw L=40 vs L=10: {results_global['Raw SH L=40'] - results_global['Raw SH L=10']:+.3f}")

In [None]:
# Now test REGIONAL with raw SH features
print("\n" + "="*70)
print("RAW SH FEATURES: REGIONAL TEST")
print("="*70)

# Test on Europe only
europe_df = filter_by_bounds(temp_all, REGIONS['Europe'])

if len(europe_df) < 100:
    print(f"⚠️  Only {len(europe_df)} samples in Europe. Generating synthetic regional data...")
    np.random.seed(42)
    n_eu = 3000
    lons = np.random.uniform(-10, 40, n_eu)
    lats = np.random.uniform(35, 70, n_eu)
    temps = 15 - 0.3 * np.abs(lats - 50) + np.random.normal(0, 2, n_eu)
    europe_df = pd.DataFrame({'lon': lons, 'lat': lats, 't2m': temps})
else:
    n_sample = min(8000, len(europe_df))
    if len(europe_df) > n_sample:
        europe_df = europe_df.sample(n=n_sample, random_state=42)

print(f"Using {len(europe_df)} samples for Europe test")

coords_eu = europe_df[['lon', 'lat']].values
values_eu = europe_df['t2m'].values

# Get features for Europe
emb_siren_l10_eu = get_embeddings(model_l10, coords_eu)
emb_siren_l40_eu = get_embeddings(model_l40, coords_eu)
sh_raw_l10_eu = get_raw_sh_features(coords_eu, L=10)
sh_raw_l40_eu = get_raw_sh_features(coords_eu, L=40)

results_europe = {
    'SIREN L=10': run_regression_with_features(emb_siren_l10_eu, values_eu),
    'SIREN L=40': run_regression_with_features(emb_siren_l40_eu, values_eu),
    'Raw SH L=10': run_regression_with_features(sh_raw_l10_eu, values_eu),
    'Raw SH L=40': run_regression_with_features(sh_raw_l40_eu, values_eu),
}

print("\n" + "-"*50)
print("EUROPE Temperature Regression R²")
print("-"*50)
for name, r2 in results_europe.items():
    print(f"  {name:<15}: {r2:.3f}")

print("\nKey comparisons (Europe):")
print(f"  SIREN helps L=10? {results_europe['SIREN L=10'] - results_europe['Raw SH L=10']:+.3f}")
print(f"  SIREN helps L=40? {results_europe['SIREN L=40'] - results_europe['Raw SH L=40']:+.3f}")
print(f"  Raw L=40 vs L=10: {results_europe['Raw SH L=40'] - results_europe['Raw SH L=10']:+.3f}")

In [None]:
# Visualize SIREN vs Raw comparison
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Global
ax = axes[0]
x = np.arange(2)
width = 0.35
ax.bar(x - width/2, [results_global['Raw SH L=10'], results_global['Raw SH L=40']], width, label='Raw SH', color='steelblue')
ax.bar(x + width/2, [results_global['SIREN L=10'], results_global['SIREN L=40']], width, label='SIREN', color='coral')
ax.set_xticks(x)
ax.set_xticklabels(['L=10', 'L=40'])
ax.set_ylabel('R² Score')
ax.set_title('Global Temperature: Raw SH vs SIREN')
ax.legend()
ax.set_ylim(0, 1)
ax.grid(True, alpha=0.3, axis='y')

# Europe
ax = axes[1]
ax.bar(x - width/2, [results_europe['Raw SH L=10'], results_europe['Raw SH L=40']], width, label='Raw SH', color='steelblue')
ax.bar(x + width/2, [results_europe['SIREN L=10'], results_europe['SIREN L=40']], width, label='SIREN', color='coral')
ax.set_xticks(x)
ax.set_xticklabels(['L=10', 'L=40'])
ax.set_ylabel('R² Score')
ax.set_title('Europe Temperature: Raw SH vs SIREN')
ax.legend()
ax.set_ylim(0, 1)
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.savefig('raw_sh_vs_siren.png', dpi=150)
plt.show()

print("\n" + "="*70)
print("INTERPRETATION")
print("="*70)
print(f"""
1. Raw SH features alone (no SIREN):
   - Global: L=10 R²={results_global['Raw SH L=10']:.3f}, L=40 R²={results_global['Raw SH L=40']:.3f}
   - Europe: L=10 R²={results_europe['Raw SH L=10']:.3f}, L=40 R²={results_europe['Raw SH L=40']:.3f}

2. Does SIREN help?
   - L=10: Global {results_global['SIREN L=10'] - results_global['Raw SH L=10']:+.3f}, Europe {results_europe['SIREN L=10'] - results_europe['Raw SH L=10']:+.3f}
   - L=40: Global {results_global['SIREN L=40'] - results_global['Raw SH L=40']:+.3f}, Europe {results_europe['SIREN L=40'] - results_europe['Raw SH L=40']:+.3f}

3. Conclusion:
""")
if results_global['SIREN L=40'] > results_global['Raw SH L=40']:
    print("   SIREN HELPS L=40 at global scale")
else:
    print("   SIREN HURTS L=40 at global scale (raw features work better!)")

if results_europe['Raw SH L=40'] > results_europe['Raw SH L=10']:
    print("   Raw L=40 features ARE better than L=10 regionally")
else:
    print("   Raw L=40 features are NOT better than L=10 regionally")

---
## 3. Sub-Regional Tests: US States & European Countries

**Question**: Does the L=40 advantage increase with smaller regions?

We found:
- Continental (~3000km): L=40 +30% advantage
- Country (~1000km): Variable results

**Test**: Classification within individual US states and European countries.
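
The checkerboard task assigns each point a binary label from the parity of its grid cell. A minimal NumPy sketch of that labeling on a few hand-picked points follows. `checkerboard_labels` is an illustrative name, and the grid origin and cell size are arbitrary choices for the example.

```python
import numpy as np

def checkerboard_labels(lons, lats, lon_min, lat_min, cell_size_deg):
    """Label = parity of (column index + row index) of the containing cell."""
    cell_x = np.floor((np.asarray(lons) - lon_min) / cell_size_deg).astype(int)
    cell_y = np.floor((np.asarray(lats) - lat_min) / cell_size_deg).astype(int)
    return (cell_x + cell_y) % 2

# Four points in a grid of 2-degree cells anchored at (0, 0)
lons = [0.5, 2.5, 0.5, 2.5]
lats = [0.5, 0.5, 2.5, 2.5]
print(checkerboard_labels(lons, lats, 0.0, 0.0, 2.0))  # [0 1 1 0]
```

Smaller cells make the label flip over shorter distances, so the task probes finer spatial resolution in the embeddings.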

In [None]:
# Sub-regional checkerboard test
print("="*70)
print("SUB-REGIONAL CHECKERBOARD TESTS")
print("="*70)

# Define sub-regions (US states, EU countries)
SUB_REGIONS = {
    # US States (approximate bounds)
    'California': (-124.5, 32.5, -114, 42),       # ~1000km x 1000km
    'Texas': (-106.5, 25.8, -93.5, 36.5),         # ~1300km x 1200km
    'Florida': (-87.6, 24.5, -80, 31),            # ~800km x 700km
    'New York': (-79.8, 40.5, -71.8, 45.1),       # ~650km x 500km

    # European countries (approximate)
    'Germany': (5.9, 47.3, 15.1, 55),             # ~650km x 850km
    'France': (-5, 42, 8, 51),                    # ~1000km x 1000km
    'Spain': (-9.3, 36, 3.3, 43.8),               # ~1100km x 850km
    'Italy': (6.6, 36.6, 18.5, 47.1),             # ~1000km x 1150km
    'UK': (-8, 50, 2, 59),                        # ~700km x 1000km
}

def generate_checkerboard(bounds, cell_size_deg, n_samples=3000, seed=42):
    """Generate checkerboard classification data within bounds."""
    np.random.seed(seed)
    lon_min, lat_min, lon_max, lat_max = bounds

    lons = np.random.uniform(lon_min, lon_max, n_samples)
    lats = np.random.uniform(lat_min, lat_max, n_samples)

    # Checkerboard pattern
    cell_x = ((lons - lon_min) / cell_size_deg).astype(int)
    cell_y = ((lats - lat_min) / cell_size_deg).astype(int)
    labels = (cell_x + cell_y) % 2

    coords = np.stack([lons, lats], axis=1)
    return coords, labels

def run_classification(coords, labels, model_l10, model_l40, test_size=0.5, seed=42):
    """Run classification with both models."""
    emb_l10 = get_embeddings(model_l10, coords)
    emb_l40 = get_embeddings(model_l40, coords)

    # Split both embedding sets in one call so train/test rows are guaranteed to match
    X_train_l10, X_test_l10, X_train_l40, X_test_l40, y_train, y_test = train_test_split(
        emb_l10, emb_l40, labels, test_size=test_size, random_state=seed, stratify=labels
    )

    clf_l10 = LogisticRegression(max_iter=1000, random_state=seed)
    clf_l40 = LogisticRegression(max_iter=1000, random_state=seed)

    clf_l10.fit(X_train_l10, y_train)
    clf_l40.fit(X_train_l40, y_train)

    acc_l10 = accuracy_score(y_test, clf_l10.predict(X_test_l10))
    acc_l40 = accuracy_score(y_test, clf_l40.predict(X_test_l40))

    return {'acc_l10': acc_l10, 'acc_l40': acc_l40, 'diff': acc_l40 - acc_l10}

In [None]:
# Test multiple cell sizes within each sub-region
CELL_SIZES = [0.5, 1.0, 2.0, 3.0]  # degrees - corresponds to ~55, 111, 222, 333 km

sub_regional_results = []

print(f"\n{'Region':<12} | {'Size km':<8} | Cell° | {'L=10':>6} | {'L=40':>6} | {'Δ':>7}")
print("-" * 65)

for region_name, bounds in SUB_REGIONS.items():
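    lon_min, lat_min, lon_max, lat_max = bounds
    # Approximate region size: a degree of longitude shrinks by cos(latitude)
    mid_lat = np.radians((lat_min + lat_max) / 2)
    width_km = (lon_max - lon_min) * 111 * np.cos(mid_lat)
    height_km = (lat_max - lat_min) * 111
    region_size_km = int((width_km + height_km) / 2)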

    for cell_size in CELL_SIZES:
        cell_km = int(cell_size * 111)

        # Skip if cell size is too large for region
        if cell_size * 2 > min(lon_max - lon_min, lat_max - lat_min):
            continue

        coords, labels = generate_checkerboard(bounds, cell_size)
        results = run_classification(coords, labels, model_l10, model_l40)

        print(f"{region_name:<12} | {region_size_km:>6}km | {cell_size:>4.1f}° | {results['acc_l10']:>5.1%} | {results['acc_l40']:>5.1%} | {results['diff']:>+6.1%}")

        sub_regional_results.append({
            'region': region_name,
            'region_size_km': region_size_km,
            'cell_size_deg': cell_size,
            'cell_size_km': cell_km,
            'acc_l10': results['acc_l10'],
            'acc_l40': results['acc_l40'],
            'diff': results['diff']
        })

sub_regional_df = pd.DataFrame(sub_regional_results)
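
The kilometre figures attached to regions and cell sizes rely on the rule of thumb that one degree is about 111 km. That holds for latitude; a degree of longitude shrinks roughly with the cosine of the latitude. The sketch below makes the conversion explicit under a spherical-Earth approximation. `box_size_km` is a hypothetical helper, and 111 km per degree is the same rounded constant the notebook uses.

```python
import math

KM_PER_DEG = 111.0  # rounded length of one degree of latitude

def box_size_km(bounds):
    """Approximate (width_km, height_km) of a (lon_min, lat_min, lon_max, lat_max) box."""
    lon_min, lat_min, lon_max, lat_max = bounds
    mid_lat = math.radians((lat_min + lat_max) / 2)
    width_km = (lon_max - lon_min) * KM_PER_DEG * math.cos(mid_lat)
    height_km = (lat_max - lat_min) * KM_PER_DEG
    return width_km, height_km

w, h = box_size_km((-124.5, 32.5, -114, 42))  # California bounds from SUB_REGIONS
print(f"California: ~{w:.0f} km x {h:.0f} km")
```

The same effect applies to checkerboard cells: a 2° cell is about 222 km tall everywhere, but only roughly 140 km wide at 51°N.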

In [None]:
# Summary by cell size
print("\n" + "="*70)
print("SUB-REGIONAL SUMMARY")
print("="*70)

print("\nAverage L=40 advantage by cell size:")
for cell_size in CELL_SIZES:
    cell_data = sub_regional_df[sub_regional_df['cell_size_deg'] == cell_size]
    if len(cell_data) > 0:
        avg_diff = cell_data['diff'].mean()
        print(f"  {cell_size}° (~{int(cell_size*111)}km): {avg_diff:+.1%}")

print("\nAverage L=40 advantage by region:")
for region in sub_regional_df['region'].unique():
    region_data = sub_regional_df[sub_regional_df['region'] == region]
    avg_diff = region_data['diff'].mean()
    region_size = region_data['region_size_km'].iloc[0]
    print(f"  {region:<12} ({region_size}km): {avg_diff:+.1%}")

# Overall
print(f"\nOverall average L=40 advantage in sub-regions: {sub_regional_df['diff'].mean():+.1%}")

---
## 4. Cross-Region Transfer Test

**Question**: Does L=40's regional advantage transfer across regions?

**Test**: Train on one continent, test on another. If L=40 learns general regional patterns (not region-specific), it should transfer.
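
In a transfer setup, preprocessing statistics should come from the training region only, so the test region stays unseen. A small NumPy sketch of that standardization step follows. The arrays are made-up toy embeddings, and `fit_standardizer` is a hypothetical stand-in for `StandardScaler`.

```python
import numpy as np

def fit_standardizer(X_train):
    """Return a function that standardizes with the training mean and std."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return lambda X: (X - mean) / std

rng = np.random.default_rng(0)
train_emb = rng.normal(loc=0.0, scale=1.0, size=(500, 4))  # toy "train region"
test_emb = rng.normal(loc=3.0, scale=1.0, size=(200, 4))   # shifted toy "test region"

standardize = fit_standardizer(train_emb)
train_z = standardize(train_emb)
test_z = standardize(test_emb)

print("train mean:", train_z.mean(axis=0).round(2))
print("test mean: ", test_z.mean(axis=0).round(2))
```

In this toy case the test-region features land far from zero after scaling. A shift of that kind is one reason cross-region R² can be low or even negative, so the L=40 minus L=10 difference should be read alongside the absolute scores.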

In [None]:
print("="*70)
print("CROSS-REGION TRANSFER TEST (EXPANDED)")
print("="*70)
print("""
Testing if L=40's regional advantage transfers across regions.
If L=40 learns general local patterns, it should transfer.
If L=40's advantage is region-specific, it won't transfer.
""")

# PART 1: Temperature regression transfer
print("-" * 70)
print("PART 1: TEMPERATURE REGRESSION TRANSFER")
print("-" * 70)

# More comprehensive transfer pairs
TRANSFER_PAIRS = [
    # Europe ↔ Other regions
    ('Europe', 'North America'),
    ('Europe', 'East Asia'),
    ('Europe', 'South America'),
    ('Europe', 'Africa'),
    ('Europe', 'Australia'),
    # North America ↔ Others
    ('North America', 'Europe'),
    ('North America', 'East Asia'),
    ('North America', 'South America'),
    # East Asia ↔ Others
    ('East Asia', 'Europe'),
    ('East Asia', 'North America'),
    ('East Asia', 'Australia'),
    # Cross-hemisphere
    ('South America', 'Africa'),
    ('Australia', 'South America'),
]

transfer_results = []

print(f"\n{'Train':<15} → {'Test':<15} | {'L=10 R²':>8} | {'L=40 R²':>8} | {'Δ':>8} | Winner")
print("-" * 80)

for train_region, test_region in TRANSFER_PAIRS:
    train_bounds = REGIONS[train_region]
    test_bounds = REGIONS[test_region]

    train_df = filter_by_bounds(temp_all, train_bounds)
    test_df = filter_by_bounds(temp_all, test_bounds)

    # Skip if not enough samples
    min_samples = 100  # Reduced from 500 to handle smaller dataset
    if len(train_df) < min_samples or len(test_df) < min_samples:
        print(f"{train_region:<15} → {test_region:<15} | Skipped (train={len(train_df)}, test={len(test_df)})")
        continue

    # Sample based on available data
    train_n = min(8000, len(train_df))
    test_n = min(4000, len(test_df))
    if len(train_df) > train_n:
        train_df = train_df.sample(n=train_n, random_state=42)
    if len(test_df) > test_n:
        test_df = test_df.sample(n=test_n, random_state=42)

    train_coords = train_df[['lon', 'lat']].values
    train_values = train_df['t2m'].values
    test_coords = test_df[['lon', 'lat']].values
    test_values = test_df['t2m'].values

    emb_train_l10 = get_embeddings(model_l10, train_coords)
    emb_train_l40 = get_embeddings(model_l40, train_coords)
    emb_test_l10 = get_embeddings(model_l10, test_coords)
    emb_test_l40 = get_embeddings(model_l40, test_coords)

    scaler_l10, scaler_l40 = StandardScaler(), StandardScaler()
    emb_train_l10 = scaler_l10.fit_transform(emb_train_l10)
    emb_test_l10 = scaler_l10.transform(emb_test_l10)
    emb_train_l40 = scaler_l40.fit_transform(emb_train_l40)
    emb_test_l40 = scaler_l40.transform(emb_test_l40)

    mlp_l10 = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=500, random_state=42, early_stopping=True)
    mlp_l40 = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=500, random_state=42, early_stopping=True)

    mlp_l10.fit(emb_train_l10, train_values)
    mlp_l40.fit(emb_train_l40, train_values)

    r2_l10 = r2_score(test_values, mlp_l10.predict(emb_test_l10))
    r2_l40 = r2_score(test_values, mlp_l40.predict(emb_test_l40))
    diff = r2_l40 - r2_l10
    winner = "L=40" if diff > 0.02 else ("L=10" if diff < -0.02 else "~Same")

    print(f"{train_region:<15} → {test_region:<15} | {r2_l10:>8.3f} | {r2_l40:>8.3f} | {diff:>+7.3f} | {winner}")

    transfer_results.append({
        'train': train_region,
        'test': test_region,
        'r2_l10': r2_l10,
        'r2_l40': r2_l40,
        'diff': diff
    })

transfer_df = pd.DataFrame(transfer_results)

if len(transfer_df) == 0:
    print("\n⚠️  No transfer pairs had enough samples. This is expected with limited temperature data.")
    print("The cross-region transfer test works best with larger datasets.")

In [None]:
# PART 2: Checkerboard classification transfer
print("\n" + "-" * 70)
print("PART 2: CHECKERBOARD CLASSIFICATION TRANSFER")
print("-" * 70)
print("""
Train a checkerboard classifier on one region, test on another.
This tests if the learned spatial discrimination transfers.
""")

CHECKER_TRANSFER_PAIRS = [
    ('Europe', 'North America'),
    ('Europe', 'East Asia'),
    ('North America', 'Europe'),
    ('East Asia', 'Europe'),
    ('South America', 'Africa'),
    ('Australia', 'East Asia'),
]

CELL_SIZE = 2.0  # degrees (~222km)

checker_transfer_results = []

print(f"{'Train':<15} → {'Test':<15} | {'L=10 Acc':>8} | {'L=40 Acc':>8} | {'Δ':>8} | Winner")
print("-" * 80)

for train_region, test_region in CHECKER_TRANSFER_PAIRS:
    train_bounds = REGIONS[train_region]
    test_bounds = REGIONS[test_region]

    # Generate checkerboard data for both regions
    train_coords, train_labels = generate_checkerboard(train_bounds, CELL_SIZE, n_samples=4000)
    test_coords, test_labels = generate_checkerboard(test_bounds, CELL_SIZE, n_samples=2000)

    # Get embeddings
    emb_train_l10 = get_embeddings(model_l10, train_coords)
    emb_train_l40 = get_embeddings(model_l40, train_coords)
    emb_test_l10 = get_embeddings(model_l10, test_coords)
    emb_test_l40 = get_embeddings(model_l40, test_coords)

    # Train classifiers
    clf_l10 = LogisticRegression(max_iter=1000, random_state=42)
    clf_l40 = LogisticRegression(max_iter=1000, random_state=42)

    clf_l10.fit(emb_train_l10, train_labels)
    clf_l40.fit(emb_train_l40, train_labels)

    # Test on different region
    acc_l10 = accuracy_score(test_labels, clf_l10.predict(emb_test_l10))
    acc_l40 = accuracy_score(test_labels, clf_l40.predict(emb_test_l40))
    diff = acc_l40 - acc_l10
    winner = "L=40" if diff > 0.02 else ("L=10" if diff < -0.02 else "~Same")

    print(f"{train_region:<15} → {test_region:<15} | {acc_l10:>7.1%} | {acc_l40:>7.1%} | {diff:>+7.1%} | {winner}")

    checker_transfer_results.append({
        'train': train_region,
        'test': test_region,
        'acc_l10': acc_l10,
        'acc_l40': acc_l40,
        'diff': diff
    })

checker_transfer_df = pd.DataFrame(checker_transfer_results)

In [None]:
# Cross-region transfer summary
print("\n" + "="*70)
print("CROSS-REGION TRANSFER SUMMARY")
print("="*70)

print("\n1. TEMPERATURE REGRESSION TRANSFER:")
if len(transfer_df) > 0:
    print(f"   Total pairs tested: {len(transfer_df)}")
    print(f"   Average L=40 advantage: {transfer_df['diff'].mean():+.3f}")
    l40_wins_temp = (transfer_df['diff'] > 0.02).sum()
    l10_wins_temp = (transfer_df['diff'] < -0.02).sum()
    same_temp = len(transfer_df) - l40_wins_temp - l10_wins_temp
    print(f"   L=40 wins: {l40_wins_temp}, L=10 wins: {l10_wins_temp}, Same: {same_temp}")
else:
    print("   ⚠️  No temperature transfer pairs had enough samples")

print("\n2. CHECKERBOARD CLASSIFICATION TRANSFER:")
if len(checker_transfer_df) > 0:
    print(f"   Total pairs tested: {len(checker_transfer_df)}")
    print(f"   Average L=40 advantage: {checker_transfer_df['diff'].mean():+.1%}")
    l40_wins_check = (checker_transfer_df['diff'] > 0.02).sum()
    l10_wins_check = (checker_transfer_df['diff'] < -0.02).sum()
    same_check = len(checker_transfer_df) - l40_wins_check - l10_wins_check
    print(f"   L=40 wins: {l40_wins_check}, L=10 wins: {l10_wins_check}, Same: {same_check}")
else:
    print("   ⚠️  No checkerboard transfer pairs tested")

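print("\n3. INTERPRETATION:")
# Average L=40 advantage per task; NaN when a task produced no transfer pairs
temp_advantage = transfer_df['diff'].mean() if len(transfer_df) > 0 else np.nan
check_advantage = checker_transfer_df['diff'].mean() if len(checker_transfer_df) > 0 else np.nan
advantages = [a for a in (temp_advantage, check_advantage) if not np.isnan(a)]

if not advantages:
    print("   ⚠️  No transfer results to interpret.")
elif all(a > 0 for a in advantages):
    print("   ✅ L=40's advantage TRANSFERS to new regions!")
    print("   This suggests L=40 learns GENERAL local patterns, not region-specific ones.")
elif any(a > 0 for a in advantages):
    print("   ⚠️  Mixed: L=40's advantage transfers on one task but not the other.")
else:
    print("   ❌ L=40's advantage does NOT transfer well.")
    print("   This suggests L=40's advantage may be region-specific.")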

# Visualize (only if we have data)
if len(transfer_df) > 0 or len(checker_transfer_df) > 0:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Temperature transfer
    ax = axes[0]
    if len(transfer_df) > 0:
        ax.bar(range(len(transfer_df)), transfer_df['diff'], color=['green' if d > 0 else 'red' for d in transfer_df['diff']])
        ax.axhline(y=0, color='black', linewidth=1)
        ax.set_xticks(range(len(transfer_df)))
        ax.set_xticklabels([f"{t['train']}→{t['test']}" for _, t in transfer_df.iterrows()], rotation=45, ha='right', fontsize=8)
        ax.set_ylabel('L=40 - L=10 R²')
        ax.set_title('Temperature Transfer: L=40 Advantage')
        ax.grid(True, alpha=0.3, axis='y')
    else:
        ax.text(0.5, 0.5, 'No temperature transfer data', ha='center', va='center', transform=ax.transAxes)
        ax.set_title('Temperature Transfer: No Data')

    # Checkerboard transfer
    ax = axes[1]
    if len(checker_transfer_df) > 0:
        ax.bar(range(len(checker_transfer_df)), checker_transfer_df['diff'], color=['green' if d > 0 else 'red' for d in checker_transfer_df['diff']])
        ax.axhline(y=0, color='black', linewidth=1)
        ax.set_xticks(range(len(checker_transfer_df)))
        ax.set_xticklabels([f"{t['train']}→{t['test']}" for _, t in checker_transfer_df.iterrows()], rotation=45, ha='right', fontsize=9)
        ax.set_ylabel('L=40 - L=10 Accuracy')
        ax.set_title('Checkerboard Transfer: L=40 Advantage')
        ax.grid(True, alpha=0.3, axis='y')
    else:
        ax.text(0.5, 0.5, 'No checkerboard transfer data', ha='center', va='center', transform=ax.transAxes)
        ax.set_title('Checkerboard Transfer: No Data')

    plt.tight_layout()
    plt.savefig('cross_region_transfer.png', dpi=150)
    plt.show()
else:
    print("\n⚠️  No transfer data to visualize")

---
## 6. Elevation Regression: Regional Test

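Quick test on a second regression target to see whether the regional effect holds. No real elevation data is loaded here: the target is a synthetic, elevation-like function of latitude and longitude with added noise, so treat the results as a proxy only.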
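
Because the target in this section is generated by a formula plus Gaussian noise (see `synthetic_elevation` in the next cell), the best R² any model can reach is bounded by the ratio of signal variance to total variance. The sketch below estimates that ceiling so the L=10 and L=40 scores can be read against it. It copies the noise-free terms of `synthetic_elevation`; the helper names `synthetic_signal` and `r2_ceiling` and the example bounding box are hypothetical and are not part of the notebook.

```python
import numpy as np

def synthetic_signal(coords):
    """Noise-free part of the synthetic elevation target (same terms as the notebook)."""
    lon, lat = coords[:, 0], coords[:, 1]
    base = 1000 + 500 * np.sin(np.radians(lat) * 2)
    regional = 300 * np.sin(np.radians(lon) * 3) * np.cos(np.radians(lat) * 2)
    return base + regional

def r2_ceiling(coords, noise_std=100.0):
    """Expected R² of a model that recovers the signal exactly, given additive noise."""
    signal_var = np.var(synthetic_signal(coords))
    return signal_var / (signal_var + noise_std ** 2)

rng = np.random.default_rng(42)

# Global sample, using the same lon/lat ranges as the notebook's global case
global_coords = np.stack(
    [rng.uniform(-180, 180, 10000), rng.uniform(-60, 70, 10000)], axis=1
)

# Hypothetical regional bounding box (lon -10..40, lat 35..70), for illustration only
box_coords = np.stack(
    [rng.uniform(-10, 40, 6000), rng.uniform(35, 70, 6000)], axis=1
)

ceiling_global = r2_ceiling(global_coords)
ceiling_box = r2_ceiling(box_coords)
print(f"R² ceiling, global:       {ceiling_global:.3f}")
print(f"R² ceiling, regional box: {ceiling_box:.3f}")
```

Within a small region the signal varies less while the noise level stays fixed, so the ceiling drops. A lower regional R² on this target therefore does not by itself indicate a weaker embedding; comparing the L=40 minus L=10 difference is more informative than comparing raw scores across regions.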

In [None]:
print("="*70)
print("ELEVATION PROXY: GLOBAL vs REGIONAL")
print("="*70)

# No real elevation (DEM) data is loaded here. As a stand-in, build a synthetic
# elevation-like target that varies smoothly with lat/lon, plus Gaussian noise.
# This is a proxy test only; real elevation data would be better.

def synthetic_elevation(coords):
    """Synthetic elevation-like target: smooth lat/lon terms plus Gaussian noise.

    coords is an (N, 2) array of (lon, lat) in degrees.
    """
    lon, lat = coords[:, 0], coords[:, 1]
    # Base term: depends on latitude only
    base = 1000 + 500 * np.sin(np.radians(lat) * 2)
    # Regional term: varies with both longitude and latitude
    regional = 300 * np.sin(np.radians(lon) * 3) * np.cos(np.radians(lat) * 2)
    # Noise (std 100) drawn from the global NumPy RNG, which is seeded in the loop below
    noise = np.random.normal(0, 100, len(coords))
    return base + regional + noise

# Test on same regions
elev_results = []

print(f"\n{'Region':<20} | {'N':>7} | {'L=10 R²':>8} | {'L=40 R²':>8} | {'Δ':>8}")
print("-" * 65)

for region_name, bounds in REGIONS.items():
    np.random.seed(42)

    if bounds is None:
        lons = np.random.uniform(-180, 180, 10000)
        lats = np.random.uniform(-60, 70, 10000)
    else:
        lon_min, lat_min, lon_max, lat_max = bounds
        lons = np.random.uniform(lon_min, lon_max, 6000)
        lats = np.random.uniform(lat_min, lat_max, 6000)

    coords = np.stack([lons, lats], axis=1)
    values = synthetic_elevation(coords)

    results = run_regression(coords, values, model_l10, model_l40)

    print(f"{region_name:<20} | {len(coords):>7} | {results['r2_l10']:>8.3f} | {results['r2_l40']:>8.3f} | {results['diff']:>+7.3f}")

    elev_results.append({
        'region': region_name,
        'r2_l10': results['r2_l10'],
        'r2_l40': results['r2_l40'],
        'diff': results['diff']
    })

elev_df = pd.DataFrame(elev_results)

print("\nNote: This uses SYNTHETIC elevation. Real DEM data would be better.")

---
## Summary & Conclusions

In [None]:
print("\n" + "="*80)
print("EXTENDED ANALYSIS SUMMARY")
print("="*80)

print("""
1. TEMPERATURE: REGIONAL EFFECT
""")
if 'temp_df' in dir() and len(temp_df) > 0:
    global_temp_diff = temp_df[temp_df['region'] == 'Global']['diff'].values[0] if 'Global' in temp_df['region'].values else np.nan
    regional_temp_avg = temp_df[temp_df['region'] != 'Global']['diff'].mean() if len(temp_df[temp_df['region'] != 'Global']) > 0 else np.nan
    print(f"   Global L=40 advantage: {global_temp_diff:+.3f}")
    print(f"   Regional L=40 advantage (avg): {regional_temp_avg:+.3f}")
    if not np.isnan(regional_temp_avg) and not np.isnan(global_temp_diff):
        if regional_temp_avg > global_temp_diff:
            print("   → Temperature CONFIRMS regional effect (like population)!")
        else:
            print("   → Temperature does NOT show regional effect.")
else:
    print("   ⚠️  No temperature regional results available")

print("""
2. RAW SPHERICAL HARMONICS vs SIREN
""")
if 'results_global' in dir() and results_global:
    print(f"   Global - Raw L=40 vs L=10: {results_global['Raw SH L=40'] - results_global['Raw SH L=10']:+.3f}")
    print(f"   Global - SIREN helps L=40: {results_global['SIREN L=40'] - results_global['Raw SH L=40']:+.3f}")
    if 'results_europe' in dir() and results_europe:
        print(f"   Europe - Raw L=40 vs L=10: {results_europe['Raw SH L=40'] - results_europe['Raw SH L=10']:+.3f}")
        print(f"   Europe - SIREN helps L=40: {results_europe['SIREN L=40'] - results_europe['Raw SH L=40']:+.3f}")
else:
    print("   ⚠️  No Raw SH vs SIREN results available")

print("""
3. SUB-REGIONAL TESTS
""")
if 'sub_regional_df' in dir() and len(sub_regional_df) > 0:
    print(f"   Average L=40 advantage in US states/EU countries: {sub_regional_df['diff'].mean():+.1%}")
    best_region = sub_regional_df.loc[sub_regional_df['diff'].idxmax()]
    print(f"   Best: {best_region['region']} at {best_region['cell_size_deg']}° cells: {best_region['diff']:+.1%}")
else:
    print("   ⚠️  No sub-regional results available")

print("""
4. CROSS-REGION TRANSFER (TEMPERATURE)
""")
if 'transfer_df' in dir() and len(transfer_df) > 0:
    print(f"   Total pairs: {len(transfer_df)}")
    print(f"   Average transfer L=40 advantage: {transfer_df['diff'].mean():+.3f}")
    if transfer_df['diff'].mean() > 0:
        print("   → L=40's regression advantage TRANSFERS across regions!")
    else:
        print("   → L=40's advantage is region-specific, doesn't transfer well.")
else:
    print("   ⚠️  No temperature transfer results (limited data)")

print("""
5. CROSS-REGION TRANSFER (CHECKERBOARD)
""")
if 'checker_transfer_df' in dir() and len(checker_transfer_df) > 0:
    print(f"   Total pairs: {len(checker_transfer_df)}")
    print(f"   Average transfer L=40 advantage: {checker_transfer_df['diff'].mean():+.1%}")
    if checker_transfer_df['diff'].mean() > 0:
        print("   → L=40's classification advantage TRANSFERS across regions!")
    else:
        print("   → L=40's advantage is region-specific.")
else:
    print("   ⚠️  No checkerboard transfer results available")

print("\n" + "="*80)
print("KEY TAKEAWAYS")
print("="*80)
print("""
- If temperature shows regional effect: L=40 advantage is NOT task-specific
- If raw SH L=40 > raw SH L=10: Extra frequencies help even without SIREN
- If transfer works: L=40 learns general local patterns, not region-specific
- If transfer fails: L=40's advantage is region-dependent
""")

In [None]:
# Save all results
import json

def convert_numpy(obj):
    """Recursively convert NumPy arrays and scalars to plain Python types for JSON."""
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    elif isinstance(obj, np.integer):  # covers np.int64 and other NumPy integer types
        return int(obj)
    elif isinstance(obj, np.floating):  # covers np.float64 and other NumPy float types
        return float(obj)
    elif isinstance(obj, dict):
        return {k: convert_numpy(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [convert_numpy(i) for i in obj]
    return obj

def records_or_empty(name):
    """Return the named DataFrame as JSON-safe records, or [] if it is undefined or empty."""
    df = globals().get(name)
    return convert_numpy(df.to_dict('records')) if df is not None and len(df) > 0 else []

# Every entry is guarded, so skipping an earlier cell no longer raises NameError here
all_results = {
    'temperature_regional': records_or_empty('temp_df'),
    'raw_sh_global': convert_numpy(globals().get('results_global') or {}),
    'raw_sh_europe': convert_numpy(globals().get('results_europe') or {}),
    'sub_regional': records_or_empty('sub_regional_df'),
    'cross_region_transfer_temp': records_or_empty('transfer_df'),
    'cross_region_transfer_checker': records_or_empty('checker_transfer_df'),
    'elevation_regional': records_or_empty('elev_df'),
}

with open('extended_analysis_results.json', 'w') as f:
    json.dump(all_results, f, indent=2)

print("✅ Results saved to extended_analysis_results.json")
print(f"   - Temperature regional: {len(all_results['temperature_regional'])} tests")
print(f"   - Sub-regional: {len(all_results['sub_regional'])} tests")
print(f"   - Cross-region transfer (temp): {len(all_results['cross_region_transfer_temp'])} pairs")
print(f"   - Cross-region transfer (checker): {len(all_results['cross_region_transfer_checker'])} pairs")