# Loop 13 LB Feedback Analysis

**Submission:** exp_012 (Numba SA)
**CV Score:** 70.6151
**LB Score:** 70.6151
**Gap:** 0.0000 (perfect match)

## Key Findings

1. **CV-LB gap is zero** - The local scorer reproduces the leaderboard score to four decimal places, so local scores can be trusted
2. **Numba SA found zero improvements** - The baseline sits at a strong local optimum
3. **All 13 experiments are stuck at ~70.6** - Local optimization approaches are exhausted

## The Real Problem

The target is 68.884125 (lower is better). From 70.6151, we need to reduce the score by 1.73 points.

Top kernels achieve ~68.5 by ensembling from 15+ external Kaggle datasets.
We don't have access to those datasets.

**Our only path forward: Generate diverse solutions ourselves.**
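A minimal sketch of one way to produce diverse, guaranteed-valid starting layouts, assuming a coarse jittered grid is acceptable as a seed for later optimization. Every tree vertex lies within 0.8 of the tree origin (the apex at (0, 0.8) is the farthest), so centers at least 1.6 apart cannot overlap at any rotation. The names `random_start`, `to_rows`, `spacing`, and `jitter` are hypothetical, and the `{n:03d}_{i}` id suffix and the `s` value prefix are assumptions inferred from how the scoring code below parses submissions.

```python
import math
import random

TREE_RADIUS = 0.8  # distance from the tree origin to its farthest vertex (the apex)

def random_start(n, seed=0, spacing=1.8, jitter=0.05):
    """Return n (x, y, deg) tuples on a jittered grid with random rotations.

    Centers stay at least spacing - 2*jitter apart, so the bounding
    circles of radius 0.8 never overlap, whatever the rotations are.
    """
    if spacing - 2 * jitter < 2 * TREE_RADIUS:
        raise ValueError("spacing too small to guarantee a valid layout")
    rng = random.Random(seed)
    cols = math.ceil(math.sqrt(n))
    layout = []
    for i in range(n):
        row, col = divmod(i, cols)
        x = col * spacing + rng.uniform(-jitter, jitter)
        y = row * spacing + rng.uniform(-jitter, jitter)
        layout.append((x, y, rng.uniform(0.0, 360.0)))
    return layout

def to_rows(n, layout):
    """Format a layout as submission rows (assumed id and value format)."""
    return [
        {"id": f"{n:03d}_{i}", "x": f"s{x:.6f}", "y": f"s{y:.6f}", "deg": f"s{deg:.6f}"}
        for i, (x, y, deg) in enumerate(layout)
    ]

rows = to_rows(5, random_start(5, seed=42))
print(rows[0])
```

Layouts like this are valid but very loose, so they only make sense as seeds: a compaction or SA pass would still have to shrink them, and different seeds give different basins to start from.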

In [None]:
import pandas as pd
import numpy as np
import math
import glob

# Tree shape
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def get_polygon_bounds(cx, cy, deg):
    rad = deg * math.pi / 180.0
    s = math.sin(rad)
    c = math.cos(rad)
    x_coords = [TX[i] * c - TY[i] * s + cx for i in range(len(TX))]
    y_coords = [TX[i] * s + TY[i] * c + cy for i in range(len(TY))]
    return min(x_coords), max(x_coords), min(y_coords), max(y_coords)

def calculate_per_n_scores(csv_path):
    try:
        df = pd.read_csv(csv_path)
        scores = {}
        for n in range(1, 201):
            n_df = df[df['id'].str.startswith(f'{n:03d}_')]
            if len(n_df) != n:
                continue
            global_x_min = float('inf')
            global_x_max = float('-inf')
            global_y_min = float('inf')
            global_y_max = float('-inf')
            for _, row in n_df.iterrows():
                x = float(str(row['x']).replace('s', ''))
                y = float(str(row['y']).replace('s', ''))
                deg = float(str(row['deg']).replace('s', ''))
                x_min, x_max, y_min, y_max = get_polygon_bounds(x, y, deg)
                global_x_min = min(global_x_min, x_min)
                global_x_max = max(global_x_max, x_max)
                global_y_min = min(global_y_min, y_min)
                global_y_max = max(global_y_max, y_max)
            width = global_x_max - global_x_min
            height = global_y_max - global_y_min
            side = max(width, height)
            score = side * side / n
            scores[n] = score
        return scores
    except Exception as e:
        # Report the failure instead of silently swallowing it
        print(f"Failed to score {csv_path}: {e}")
        return None

print("Analysis functions loaded.")

In [None]:
# Load baseline scores
baseline_path = "/home/nonroot/snapshots/santa-2025/21337353543/submission/submission.csv"
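baseline_scores = calculate_per_n_scores(baseline_path)
if baseline_scores is None:
    raise RuntimeError(f"Could not score baseline: {baseline_path}")

TARGET = 68.884125
baseline_total = sum(baseline_scores.values())
gap = baseline_total - TARGET

print(f"Baseline total score: {baseline_total:.6f}")
print(f"Target score: {TARGET}")
print(f"Gap to target: {gap:.6f}")
# Average over the N values actually scored (200 for a complete submission)
print(f"\nPer-N improvement needed: {gap / len(baseline_scores):.6f} on average")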

In [None]:
# Analyze where improvements are possible
print("\nPer-N score analysis:")
print("-" * 60)
print(f"{'N':>5} {'Score':>12} {'Contribution %':>15}")
print("-" * 60)

total = sum(baseline_scores.values())
for n in [1, 2, 3, 4, 5, 10, 20, 50, 100, 150, 200]:
    if n in baseline_scores:
        score = baseline_scores[n]
        pct = (score / total) * 100
        print(f"{n:5d} {score:12.6f} {pct:15.2f}%")

print("-" * 60)
print(f"\nN=1 contributes {baseline_scores[1]/total*100:.2f}% of total score")
print(f"N=1-10 contribute {sum(baseline_scores[n] for n in range(1,11))/total*100:.2f}% of total score")

In [None]:
# Check all snapshots for best per-N
snapshot_csvs = glob.glob("/home/nonroot/snapshots/santa-2025/*/submission/submission.csv")
print(f"Found {len(snapshot_csvs)} snapshot submissions")

# Track best per-N
best_per_n = {n: (baseline_scores[n], "baseline") for n in baseline_scores}
improvements = []

for csv_path in snapshot_csvs:
    scores = calculate_per_n_scores(csv_path)
    if scores is None:
        continue
    snapshot_id = csv_path.split("/")[-3]
    for n, score in scores.items():
        if n in best_per_n and score < best_per_n[n][0]:
            improvement = best_per_n[n][0] - score
            improvements.append((n, improvement, snapshot_id))
            best_per_n[n] = (score, snapshot_id)

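# `improvements` can record the same N several times, so also count distinct N values
improved_ns = sorted(n for n, (_, src) in best_per_n.items() if src != "baseline")
print(f"\nFound {len(improved_ns)} N values improved over baseline ({len(improvements)} successive improvements)")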
total_best = sum(score for score, _ in best_per_n.values())
print(f"Best possible from snapshots: {total_best:.6f}")
print(f"Improvement over baseline: {sum(baseline_scores.values()) - total_best:.6f}")

## Key Insight

Taking the best per-N layout across all existing snapshots gives ~70.52. That is only about 0.1 below the baseline and still 1.64 points above the target.
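The per-N selection behind that number can be sketched as follows, assuming each candidate submission has already been scored into an `{n: score}` dict of the kind `calculate_per_n_scores` returns. The function name `pick_best_per_n`, the snapshot names, and the scores are hypothetical and for illustration only.

```python
def pick_best_per_n(scores_by_source):
    """Map each n to (lowest score, source that achieved it).

    On ties the first source seen is kept.
    """
    best = {}
    for source, scores in scores_by_source.items():
        for n, score in scores.items():
            if n not in best or score < best[n][0]:
                best[n] = (score, source)
    return best

# Toy example with made-up scores for two hypothetical snapshots
toy = {
    "snap_a": {1: 0.70, 2: 0.50, 3: 0.45},
    "snap_b": {1: 0.66, 2: 0.52, 3: 0.45},
}
best = pick_best_per_n(toy)
ensemble_total = sum(score for score, _ in best.values())
print(best)
print(f"Ensemble total: {ensemble_total:.2f}")
```

Because each N is scored independently and the total is a plain sum, the ensemble submission is simply the rows for each N copied from its winning source.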

**The top kernels achieve ~68.5 by:**
1. Running C++ SA with millions of iterations (1.6M per N)
2. Ensembling from 15+ external Kaggle datasets
3. Accumulating improvements over 900+ submissions

**Our constraints:**
- No access to external Kaggle datasets
- Python-based optimization is too slow to reach that iteration count
- All local snapshots converge to nearly the same optimum
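For reference, the SA loop that these approaches rely on reduces to a Metropolis acceptance rule plus a cooling schedule. The sketch below is an illustration on the overlap-free N=1 case, where the only variable is the rotation angle. It is not the C++ SA used by the top kernels, nor the Numba SA from exp_012. `bounding_side`, `anneal_angle`, and all parameter values are assumptions chosen for the example.

```python
import math
import random

TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def bounding_side(deg):
    """Side of the axis-aligned bounding square of one tree rotated by deg."""
    rad = math.radians(deg)
    s, c = math.sin(rad), math.cos(rad)
    xs = [x * c - y * s for x, y in zip(TX, TY)]
    ys = [x * s + y * c for x, y in zip(TX, TY)]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def anneal_angle(iters=5000, t_start=0.05, t_end=1e-4, step=15.0, seed=0):
    """Minimize bounding_side over the angle with simulated annealing."""
    rng = random.Random(seed)
    cur_deg, cur = 0.0, bounding_side(0.0)
    best_deg, best = cur_deg, cur
    for k in range(iters):
        # Geometric cooling from t_start down to t_end
        t = t_start * (t_end / t_start) ** (k / iters)
        cand_deg = (cur_deg + rng.uniform(-step, step)) % 360.0
        cand = bounding_side(cand_deg)
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if cand <= cur or rng.random() < math.exp((cur - cand) / t):
            cur_deg, cur = cand_deg, cand
            if cur < best:
                best_deg, best = cur_deg, cur
    return best_deg, best

best_deg, best_side = anneal_angle()
print(f"deg={best_deg:.2f}  side={best_side:.4f}  score={best_side ** 2:.4f}")
```

For N > 1 the same loop needs an overlap check on every candidate move, which is where most of the per-iteration cost comes from.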

## Possible Paths Forward

1. **Genetic Algorithm** - Can explore multiple basins simultaneously
2. **Random restart from scratch** - Generate diverse starting points
3. **Asymmetric solutions** - Competition discussion suggests asymmetric layouts beat symmetric ones
4. **Focus on small N** - N=1-20 appear to have the most improvement potential
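One way to make "improvement potential" in item 4 measurable, sketched under the assumption that distance from a simple area bound is a useful proxy. Since n non-overlapping trees of area A must fit inside a square of side s, s² ≥ n·A, so every per-N score s²/n is at least A. The bound is loose (the tree cannot tile the plane perfectly), so it ranks N values rather than predicting achievable scores. `polygon_area`, `rank_by_slack`, and the toy scores are hypothetical.

```python
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def polygon_area(xs, ys):
    """Shoelace formula; abs() makes the result orientation-independent."""
    total = 0.0
    for i in range(len(xs)):
        j = (i + 1) % len(xs)
        total += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(total) / 2.0

TREE_AREA = polygon_area(TX, TY)  # lower bound on every per-N score

def rank_by_slack(per_n_scores, top=5):
    """Return the `top` (n, slack) pairs with the largest score above the area bound."""
    slack = {n: score - TREE_AREA for n, score in per_n_scores.items()}
    return sorted(slack.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Made-up per-N scores for illustration only
toy_scores = {1: 0.66, 2: 0.45, 50: 0.36, 200: 0.34}
print(f"Tree area: {TREE_AREA:.6f}")
print(rank_by_slack(toy_scores, top=2))
```

Passing `baseline_scores` to `rank_by_slack` would show whether the small-N cases really carry the largest slack in our baseline.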