# Evolver Loop 2 - LB Feedback Analysis

## Submission Results
- **exp_001 (002_valid_baseline)**: CV 70.6151 → LB 70.6151 (gap: +0.0000)

## Key Insights
1. The CV-LB gap is zero to four decimal places, so our local validation matches the leaderboard scorer
2. The target is 68.881647; we need to lower the score by 1.733 points (about 2.5%)
3. N=1 is already optimal at 45° rotation
4. Top kernels rely on C++ binaries, which are forbidden for us, so we must implement the optimization in Python
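
To put the 1.733-point gap in geometric terms, the sketch below converts it into a uniform shrink of the bounding-box sides. It assumes the total score is the sum over N of side² / N, which is the form the N=1 check in this notebook uses (`side * side / 1`); under that assumption, scaling every side by a factor k scales the score by k². The name `required_side_shrink` is illustrative only.

```python
import math

CURRENT_SCORE = 70.6151    # exp_001 CV and LB score
TARGET_SCORE = 68.881647   # target score

# Assumption: total score = sum over N of side_N**2 / N, so scaling every
# side by a factor k scales the total score by k**2.
scale = math.sqrt(TARGET_SCORE / CURRENT_SCORE)
required_side_shrink = 1.0 - scale

print(f"Score gap: {CURRENT_SCORE - TARGET_SCORE:.4f}")
print(f"Uniform side shrink needed: {required_side_shrink:.2%}")
```

Under that assumption, the gap corresponds to shrinking every side by roughly 1.2%. In practice the gains will not be uniform across N, so this is only a sense of scale.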

In [None]:
import pandas as pd
import numpy as np
import json

# Load the valid baseline metrics
with open('/home/code/experiments/002_valid_baseline/metrics.json', 'r') as f:
    metrics = json.load(f)

TARGET_SCORE = 68.881647  # score we need to reach

print(f"Current Score: {metrics['cv_score']:.6f}")
print(f"Target Score: {TARGET_SCORE:.6f}")
print(f"Gap to Target: {metrics['cv_score'] - TARGET_SCORE:.6f}")
print(f"\nN=1 optimal angle: {metrics['n1_optimal_angle']}°")
print(f"N=1 optimal side: {metrics['n1_optimal_side']:.6f}")
print(f"N=1 score contribution: {metrics['per_n_scores']['1']['score']:.6f}")

In [None]:
# Analyze score contribution by N range
per_n = metrics['per_n_scores']

ranges = [
    ('N=1-10', 1, 10),
    ('N=11-50', 11, 50),
    ('N=51-100', 51, 100),
    ('N=101-150', 101, 150),
    ('N=151-200', 151, 200)
]

print("Score Contribution by N Range:")
print("="*50)
total = 0
for name, start, end in ranges:
    range_score = sum(per_n[str(n)]['score'] for n in range(start, end+1))
    total += range_score
    pct = range_score / metrics['cv_score'] * 100
    print(f"{name}: {range_score:.4f} ({pct:.1f}%)")
print(f"\nTotal: {total:.4f}")

In [None]:
# Rank N values by score contribution.
# A large contribution is only a heuristic for improvement potential: it shows
# where the score is concentrated, not how far each N is from its optimum.
print("Top 20 N values by score contribution (highest potential):")
print("="*60)

n_scores = [(int(n), data['score'], data['side']) for n, data in per_n.items()]
n_scores.sort(key=lambda x: x[1], reverse=True)

for n, score, side in n_scores[:20]:
    print(f"N={n:3d}: score={score:.6f}, side={side:.6f}")

In [None]:
# Check the N=1 bounding box at a few rotation angles.
# For a single tree, the enclosing square should be smallest at 45°.
import math

# Vertices of the tree polygon in its unrotated frame (x and y coordinates)
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def get_bbox_side(angle_deg):
    """Get bounding box side for a tree at given angle."""
    angle_rad = math.radians(angle_deg)
    cos_a = math.cos(angle_rad)
    sin_a = math.sin(angle_rad)
    
    xs = [cos_a * tx - sin_a * ty for tx, ty in zip(TX, TY)]
    ys = [sin_a * tx + cos_a * ty for tx, ty in zip(TX, TY)]
    
    return max(max(xs) - min(xs), max(ys) - min(ys))

# Verify N=1 optimal
print("Verifying N=1 optimal angle:")
for angle in [0, 30, 45, 60, 90]:
    side = get_bbox_side(angle)
    score = side * side / 1  # per-N score is side**2 / N; here N = 1
    print(f"Angle {angle:3d}°: side={side:.6f}, score={score:.6f}")
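
The five angles above only spot-check the claim. A denser scan gives more confidence that 45° is the minimizer. The sketch below repeats the vertex lists so that it runs on its own; `scan_angles` and the 0.1° step are illustrative choices, not part of the baseline. Rotating the tree by 90° swaps the width and height of its bounding box, so scanning 0° to 90° covers every distinct case.

```python
import math

# Tree polygon vertices, copied from the cell above
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def bbox_side(angle_deg):
    """Side of the axis-aligned square enclosing one tree rotated by angle_deg."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    xs = [c * tx - s * ty for tx, ty in zip(TX, TY)]
    ys = [s * tx + c * ty for tx, ty in zip(TX, TY)]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def scan_angles(step_deg=0.1, max_deg=90.0):
    """Return the scanned angle in [0, max_deg] with the smallest side."""
    n_steps = int(round(max_deg / step_deg))
    angles = [i * step_deg for i in range(n_steps + 1)]
    return min(angles, key=bbox_side)

best_angle = scan_angles()
best_side = bbox_side(best_angle)
print(f"Best angle: {best_angle:.1f}°, side={best_side:.6f}, score={best_side**2:.6f}")
```

At 45° the extreme vertices are the tip (0, 0.8) and the outer corners (±0.35, 0), which gives width = height = 1.15 · cos 45° and a score of about 0.66125. The scan should land on the same angle.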

In [None]:
# Strategy Analysis
print("\n" + "="*60)
print("STRATEGY ANALYSIS")
print("="*60)

print("""
## Current Situation:
- Valid baseline: 70.6151
- Target: 68.8816
- Gap: 1.733 points (2.5%)

## What Top Kernels Do (FORBIDDEN for us):
1. C++ simulated annealing with fractional translation
2. Per-N ensemble from 15+ sources
3. bbox3 binary optimizer

## What We MUST Do (Python only):
1. Implement fractional translation in Python
2. Implement local search with rotation optimization
3. Create per-N tracking system
4. Accumulate improvements across experiments

## Priority Order:
1. N=2-20: Highest score contribution per N, small enough for Python optimization
2. N=21-50: Medium contribution, still tractable
3. N=51-200: Lower priority, need efficient algorithms

## Key Insight from Kernels:
- Fractional translation uses step sizes: 0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001
- 8 directions: (0,1), (0,-1), (1,0), (-1,0), (1,1), (1,-1), (-1,1), (-1,-1)
- Keep improvements, revert if no improvement
""")

In [None]:
# Check available snapshots for ensemble
import os

snapshot_dir = '/home/nonroot/snapshots/santa-2025'
if os.path.isdir(snapshot_dir):
    # os.listdir returns entries in arbitrary order, so sort for a stable listing
    snapshots = sorted(os.listdir(snapshot_dir))
    print(f"Available snapshots: {len(snapshots)}")
    # Inspect the first five snapshots and print those that contain a submission file
    for snap in snapshots[:5]:
        sub_path = os.path.join(snapshot_dir, snap, 'submission', 'submission.csv')
        if os.path.exists(sub_path):
            print(f"  {snap}: {sub_path}")
else:
    print("No snapshots directory found")

## Next Steps

### Experiment 003: Fractional Translation in Python

**Goal**: Implement fractional translation optimization for N=2-20 in pure Python

**Algorithm**:
1. Load baseline solution
2. For each N from 2 to 20:
   - For each tree in the configuration:
     - Try moving it by each step size (0.001 down to 0.00001) in each of the 8 directions
     - Keep the move if the bounding box shrinks and no trees overlap; otherwise revert it
3. Track per-N improvements
4. Create ensemble with best per-N solutions

**Expected Gain**: 0.1-0.3 points from small N optimization
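
The algorithm above can be sketched in pure Python as follows. This is a minimal sketch under stated assumptions, not the kernels' implementation: trees are `(x, y, angle_deg)` tuples, the starting configuration is overlap-free, and `aabb_overlap` is a conservative stand-in that treats two trees as overlapping whenever their axis-aligned boxes overlap. All function names are hypothetical.

```python
import math

# Tree polygon vertices, copied from the N=1 check in this notebook
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

STEPS = [0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001]
DIRECTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (1, -1), (-1, 1), (-1, -1)]

def tree_vertices(x, y, angle_deg):
    """Vertices of a tree placed at (x, y) and rotated by angle_deg."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(x + c * tx - s * ty, y + s * tx + c * ty) for tx, ty in zip(TX, TY)]

def aabb(tree):
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y) of one tree."""
    xs, ys = zip(*tree_vertices(*tree))
    return min(xs), min(ys), max(xs), max(ys)

def bbox_side(trees):
    """Side of the square that encloses every tree."""
    boxes = [aabb(t) for t in trees]
    width = max(b[2] for b in boxes) - min(b[0] for b in boxes)
    height = max(b[3] for b in boxes) - min(b[1] for b in boxes)
    return max(width, height)

def aabb_overlap(trees, i):
    """Conservative placeholder: True if tree i's box overlaps any other tree's box.

    Assumption: a real run needs an exact polygon intersection test instead.
    """
    ax0, ay0, ax1, ay1 = aabb(trees[i])
    for j, other in enumerate(trees):
        if j == i:
            continue
        bx0, by0, bx1, by1 = aabb(other)
        if ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1:
            return True
    return False

def fractional_translation(trees, has_overlap=aabb_overlap):
    """Nudge trees by shrinking step sizes; keep a move only if the box shrinks."""
    trees = list(trees)
    best = bbox_side(trees)
    for step in STEPS:
        improved = True
        while improved:  # repeat this step size until no move helps
            improved = False
            for i in range(len(trees)):
                for dx, dy in DIRECTIONS:
                    x, y, angle = trees[i]
                    previous = trees[i]
                    trees[i] = (x + dx * step, y + dy * step, angle)
                    side = bbox_side(trees)
                    if side < best - 1e-12 and not has_overlap(trees, i):
                        best = side
                        improved = True
                    else:
                        trees[i] = previous  # revert the move
    return trees, best

# Toy starting layout: two upright trees with a 0.1 gap between their boxes
start = [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0)]
start_side = bbox_side(start)
packed, final_side = fractional_translation(start)
print(f"side: {start_side:.6f} -> {final_side:.6f}")
```

The box test rejects the interlocking placements that good packings depend on, so an exact polygon test would have to replace `aabb_overlap` before this is useful on real solutions. Note also that the strict "box must shrink" rule only ever moves trees that touch the outer boundary; whether to accept neutral moves is a design choice left open here.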

### Experiment 004: Per-N Ensemble

**Goal**: Collect best per-N solutions from multiple valid sources

**Sources**:
1. Current baseline (70.6151)
2. Other valid snapshots
3. Our optimized solutions

**Expected Gain**: 0.2-0.5 points from ensemble
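
The selection step for this experiment can be sketched as below. It assumes each source has already been validated (no overlapping trees) and reduced to a mapping from N to bounding-box side, and that the total score is the sum over N of side² / N. The function names and the numbers in `sources` are illustrative placeholders, not measured results.

```python
def total_score(sides_by_n):
    """Assumed metric: sum over N of side**2 / N."""
    return sum(side * side / n for n, side in sides_by_n.items())

def build_per_n_ensemble(sources):
    """For each N, pick the source with the smallest bounding-box side.

    sources maps a source name to {N: side}. Every entry is assumed to be
    a valid (overlap-free) configuration.
    """
    best = {}
    for name, sides_by_n in sources.items():
        for n, side in sides_by_n.items():
            if n not in best or side < best[n][1]:
                best[n] = (name, side)
    return best

# Toy numbers for illustration only; these are not measured results.
sources = {
    "baseline": {1: 0.90, 2: 1.20, 3: 1.50},
    "snapshot_a": {1: 0.95, 2: 1.10, 3: 1.60},
}

ensemble = build_per_n_ensemble(sources)
ensemble_sides = {n: side for n, (name, side) in ensemble.items()}

for n in sorted(ensemble):
    name, side = ensemble[n]
    print(f"N={n}: {name} (side={side:.4f})")
print(f"Ensemble score: {total_score(ensemble_sides):.4f}")
```

Because each N is chosen independently, the ensemble score can never be worse than the best single source, which is why per-N tracking across experiments accumulates gains.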