# Loop 11 Strategic Analysis

## Key Observations

1. **Target (68.919154) is BELOW the current public LB best (71.19); lower is better**
   - This suggests the target reflects what winning teams achieved
   - Our baseline (70.66) is already competitive with the public LB
   - Closing the gap (1.74 points) likely requires techniques not publicly shared

2. **11 experiments have been run - every valid result is stuck at 70.66** (exp_004's 51.42 came from a layout with overlaps and does not count)
   - Local optimization (SA, bbox3) shows negligible improvement
   - The baseline is at an extremely strong local optimum
   - Overlap repair works but doesn't improve score

3. **What hasn't been tried:**
   - Long optimization runs (3+ hours)
   - Different starting configurations (not just saspav)
   - Fundamentally different algorithms (constraint programming, genetic algorithms)

In [1]:
import pandas as pd
import numpy as np
import json

# Load session state
with open('/home/code/session_state.json', 'r') as f:
    state = json.load(f)

# Analyze experiments
print("=" * 60)
print("EXPERIMENT HISTORY ANALYSIS")
print("=" * 60)

for exp in state['experiments']:
    print(f"{exp['id']}: {exp['name'][:40]:40s} | CV: {exp['cv_score']:.6f}")

print("\n" + "=" * 60)
print("SCORE DISTRIBUTION")
print("=" * 60)
scores = [exp['cv_score'] for exp in state['experiments']]
print(f"Min: {min(scores):.9f}")
print(f"Max: {max(scores):.9f}")
print(f"Range: {max(scores) - min(scores):.9f}")
print(f"Unique scores: {len(set([round(s, 6) for s in scores]))}")

EXPERIMENT HISTORY ANALYSIS
exp_000: 001_baseline                             | CV: 70.659959
exp_001: 002_cpp_optimizer                        | CV: 70.659959
exp_002: 003_lattice_construction                 | CV: 70.659959
exp_003: 004_lattice_plus_sa                      | CV: 70.659959
exp_004: 005_comprehensive_ensemble               | CV: 51.423527
exp_005: Valid Ensemble with Overlap Checking     | CV: 70.659959
exp_006: Eazy Optimizer C++                       | CV: 70.659944
exp_007: Rotation Optimization (fix_direction)    | CV: 70.659959
exp_008: Multi-Seed bbox3 with Overlap Repair     | CV: 70.659959
exp_009: Tree Removal Technique (Chistyakov Appro | CV: 70.659959
exp_010: bbox3 with Overlap Repair (10 min)       | CV: 70.659958

SCORE DISTRIBUTION
Min: 51.423527000
Max: 70.659959000
Range: 19.236432000
Unique scores: 4
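
The minimum of 51.42 comes from exp_004, which the later cells exclude as invalid because of overlaps. One way to catch such a value before it contaminates downstream numbers is a simple sanity filter. The sketch below is only an illustration: the function name `split_suspicious` and the 1.0-point threshold are assumptions, not part of the original pipeline, and the scores are copied from the history printed above.

```python
from statistics import median

# (id, cv_score) pairs copied from the experiment history above.
experiments = [
    ("exp_000", 70.659959), ("exp_001", 70.659959), ("exp_002", 70.659959),
    ("exp_003", 70.659959), ("exp_004", 51.423527), ("exp_005", 70.659959),
    ("exp_006", 70.659944), ("exp_007", 70.659959), ("exp_008", 70.659959),
    ("exp_009", 70.659959), ("exp_010", 70.659958),
]

# Assumption: a score more than 1.0 below the median is "too good to be true"
# and most likely comes from an invalid (overlapping) layout.
SUSPICIOUS_DROP = 1.0


def split_suspicious(exps, max_drop=SUSPICIOUS_DROP):
    """Return (trusted, suspicious) lists of (id, score) pairs."""
    mid = median(score for _, score in exps)
    trusted = [(i, s) for i, s in exps if mid - s <= max_drop]
    suspicious = [(i, s) for i, s in exps if mid - s > max_drop]
    return trusted, suspicious


trusted, suspicious = split_suspicious(experiments)
best_id, best_score = min(trusted, key=lambda pair: pair[1])
print("Suspicious:", suspicious)
print(f"Best trusted score: {best_id} = {best_score:.6f}")
```

A flagged score still needs a real overlap check before it is discarded; the filter only points at where to look.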


In [2]:
# Check what the target actually means
print("\n" + "=" * 60)
print("TARGET ANALYSIS")
print("=" * 60)

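target = 68.919154
# Caveat: min(scores) picks up exp_004 (51.423527), which had overlaps and is
# invalid, so the gap printed below is negative. In [7] recomputes it from valid scores.
best_cv = min(scores)
gap = best_cv - target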

print(f"Target: {target}")
print(f"Best CV: {best_cv:.6f}")
print(f"Gap: {gap:.6f} ({gap/target*100:.2f}%)")

# Per-N analysis
print(f"\nAverage improvement needed per N: {gap/200:.6f}")
print(f"This is {gap/200/best_cv*100:.4f}% of current score per N")


TARGET ANALYSIS
Target: 68.919154
Best CV: 51.423527
Gap: -17.495627 (-25.39%)

Average improvement needed per N: -0.087478
This is -0.1701% of current score per N


In [3]:
# Analyze the baseline to understand where improvements might come from
baseline_path = '/home/code/external_data/saspav/santa-2025.csv'
df = pd.read_csv(baseline_path)

print("\n" + "=" * 60)
print("BASELINE STRUCTURE")
print("=" * 60)
print(f"Total rows: {len(df)}")
print(f"Columns: {list(df.columns)}")
print(f"\nFirst few rows:")
print(df.head())


BASELINE STRUCTURE
Total rows: 20100
Columns: ['id', 'x', 'y', 'deg']

First few rows:
      id                       x                       y  \
0  001_0    s-48.196086194214246     s58.770984615214225   
1  002_0   s0.154097069621355887  s-0.038540742694794648   
2  002_1  s-0.154097069621372845  s-0.561459257305224058   
3  003_0      s1.123655816140301      s0.781101815992563   
4  003_1       s1.23405569584216      s1.275999500663759   

                       deg  
0                    s45.0  
1  s203.629377730656841550  
2   s23.629377730656791812  
3        s111.125132292893  
4         s66.370622269343  


In [4]:
# Calculate per-N scores for baseline
from decimal import Decimal, getcontext
from shapely.geometry import Polygon
from shapely import affinity

getcontext().prec = 30

def get_tree_polygon(x, y, deg):
    """Create tree polygon at given position and rotation."""
    TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
    TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]
    poly = Polygon(zip(TX, TY))
    rotated = affinity.rotate(poly, deg, origin=(0, 0))
    return affinity.translate(rotated, x, y)

def get_score_for_n(df, n):
    """Calculate score for a specific N."""
    group = df[df['id'].str.startswith(f'{n:03d}_')]
    if len(group) == 0:
        return None
    
    # Parse coordinates
    xs = group['x'].astype(str).str.lstrip('s').astype(float).values
    ys = group['y'].astype(str).str.lstrip('s').astype(float).values
    degs = group['deg'].astype(str).str.lstrip('s').astype(float).values
    
    # Get all polygon points
    all_x, all_y = [], []
    for x, y, deg in zip(xs, ys, degs):
        poly = get_tree_polygon(x, y, deg)
        coords = list(poly.exterior.coords)
        all_x.extend([c[0] for c in coords])
        all_y.extend([c[1] for c in coords])
    
    # Calculate bounding box
    side = max(max(all_x) - min(all_x), max(all_y) - min(all_y))
    return side**2 / n

print("\n" + "=" * 60)
print("PER-N SCORE ANALYSIS")
print("=" * 60)

per_n_scores = {}
for n in range(1, 201):
    score = get_score_for_n(df, n)
    if score is not None:
        per_n_scores[n] = score

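# Find worst efficiency N values
# Reference value only: 0.35 * 0.35 = 0.1225 is not the tree's area
# (the polygon in get_tree_polygon has area 0.245625)
theoretical_min = {n: 0.35 * 0.35 for n in range(1, 201)}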
efficiency = {n: per_n_scores[n] / theoretical_min[n] for n in per_n_scores}

worst_n = sorted(efficiency.items(), key=lambda x: x[1], reverse=True)[:10]
print("\nWorst efficiency N values (highest improvement potential):")
for n, eff in worst_n:
    print(f"  N={n:3d}: score={per_n_scores[n]:.6f}, efficiency={eff:.2f}x theoretical")

best_n = sorted(efficiency.items(), key=lambda x: x[1])[:10]
print("\nBest efficiency N values:")
for n, eff in best_n:
    print(f"  N={n:3d}: score={per_n_scores[n]:.6f}, efficiency={eff:.2f}x theoretical")


PER-N SCORE ANALYSIS



Worst efficiency N values (highest improvement potential):
  N=  1: score=0.661250, efficiency=5.40x theoretical
  N=  2: score=0.450779, efficiency=3.68x theoretical
  N=  3: score=0.434745, efficiency=3.55x theoretical
  N=  5: score=0.416850, efficiency=3.40x theoretical
  N=  4: score=0.416545, efficiency=3.40x theoretical
  N=  7: score=0.399897, efficiency=3.26x theoretical
  N=  6: score=0.399610, efficiency=3.26x theoretical
  N=  9: score=0.387415, efficiency=3.16x theoretical
  N=  8: score=0.385407, efficiency=3.15x theoretical
  N= 15: score=0.379203, efficiency=3.10x theoretical

Best efficiency N values:
  N=181: score=0.329946, efficiency=2.69x theoretical
  N=156: score=0.329987, efficiency=2.69x theoretical
  N=182: score=0.329988, efficiency=2.69x theoretical
  N=180: score=0.331001, efficiency=2.70x theoretical
  N=155: score=0.332074, efficiency=2.71x theoretical
  N=168: score=0.332475, efficiency=2.71x theoretical
  N=179: score=0.332595, efficiency=2.72x theoretical
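
The "theoretical" reference of 0.35 * 0.35 = 0.1225 used above is not the area of the tree. Since n non-overlapping trees must fit inside the bounding square, the polygon's own area is a lower bound on `side**2 / n` for every N. The sketch below computes that area with the shoelace formula, using the vertex lists from `get_tree_polygon`; the helper name `shoelace_area` is introduced here for illustration.

```python
# Tree outline, copied from get_tree_polygon above.
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
      -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
      -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]


def shoelace_area(xs, ys):
    """Area of a simple polygon given its vertices in order."""
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2.0


TREE_AREA = shoelace_area(TX, TY)
print(f"Tree area (per-N lower bound): {TREE_AREA:.6f}")
print(f"Lower bound on the 200-puzzle total: {200 * TREE_AREA:.3f}")
```

Dividing each per-N score by `TREE_AREA` instead of 0.1225 gives an efficiency ratio that can never drop below 1.0, which makes the "x theoretical" column easier to interpret. The bound is loose, since the trees cannot tile a square perfectly.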

In [5]:
# Calculate total score and contribution by N ranges
print("\n" + "=" * 60)
print("SCORE CONTRIBUTION BY N RANGE")
print("=" * 60)

total = sum(per_n_scores.values())
print(f"Total score: {total:.6f}")

ranges = [(1, 10), (11, 50), (51, 100), (101, 150), (151, 200)]
for start, end in ranges:
    range_score = sum(per_n_scores[n] for n in range(start, end+1))
    pct = range_score / total * 100
    print(f"  N={start:3d}-{end:3d}: {range_score:.4f} ({pct:.1f}%)")

# How much improvement needed per range to hit target?
print("\nImprovement needed per range to hit target:")
for start, end in ranges:
    range_score = sum(per_n_scores[n] for n in range(start, end+1))
    range_pct = range_score / total
    range_gap = gap * range_pct
    print(f"  N={start:3d}-{end:3d}: {range_gap:.4f} points ({range_gap/range_score*100:.2f}% of current)")


SCORE CONTRIBUTION BY N RANGE
Total score: 70.659959
  N=  1- 10: 4.3291 (6.1%)
  N= 11- 50: 14.7126 (20.8%)
  N= 51-100: 17.6323 (25.0%)
  N=101-150: 17.1408 (24.3%)
  N=151-200: 16.8452 (23.8%)

Improvement needed per range to hit target:
  N=  1- 10: -1.0719 points (-24.76% of current)
  N= 11- 50: -3.6429 points (-24.76% of current)
  N= 51-100: -4.3658 points (-24.76% of current)
  N=101-150: -4.2441 points (-24.76% of current)
  N=151-200: -4.1709 points (-24.76% of current)


In [6]:
# Key insight: What would it take to reach the target?
print("\n" + "=" * 60)
print("PATH TO TARGET ANALYSIS")
print("=" * 60)

print(f"\nCurrent total: {total:.6f}")
print(f"Target: {target}")
print(f"Gap: {gap:.6f}")

# If we improved each N by the same percentage
pct_improvement_needed = gap / total * 100
print(f"\nUniform improvement needed: {pct_improvement_needed:.2f}% per N")

# If we only improved N=1-10 (worst efficiency)
n1_10_score = sum(per_n_scores[n] for n in range(1, 11))
print(f"\nN=1-10 current score: {n1_10_score:.4f}")
print(f"If we improved N=1-10 by 50%: saves {n1_10_score * 0.5:.4f} points")
print(f"  (Still need {gap - n1_10_score * 0.5:.4f} more points)")

# What if we found optimal packings for small N?
print(f"\nN=1 current: {per_n_scores[1]:.6f}")
print(f"N=1 theoretical (single tree): {0.35 * 0.35:.6f}")
print(f"N=1 potential improvement: {per_n_scores[1] - 0.35*0.35:.6f}")


PATH TO TARGET ANALYSIS

Current total: 70.659959
Target: 68.919154
Gap: -17.495627

Uniform improvement needed: -24.76% per N

N=1-10 current score: 4.3291
If we improved N=1-10 by 50%: saves 2.1646 points
  (Still need -19.6602 more points)

N=1 current: 0.661250
N=1 theoretical (single tree): 0.122500
N=1 potential improvement: 0.538750


In [7]:
# Fix: Use only VALID scores (exclude exp_004 which had overlaps)
print("=" * 60)
print("CORRECTED ANALYSIS (excluding invalid scores)")
print("=" * 60)

valid_scores = [exp['cv_score'] for exp in state['experiments'] if exp['id'] != 'exp_004']
best_valid_cv = min(valid_scores)

target = 68.919154
gap = best_valid_cv - target

print(f"Best VALID CV: {best_valid_cv:.6f}")
print(f"Target: {target}")
print(f"Gap: {gap:.6f} ({gap/target*100:.2f}%)")
print(f"\nAverage improvement needed per N: {gap/200:.6f}")

# The real question: can we reach 68.92 from 70.66?
print(f"\n" + "=" * 60)
print("REALISTIC PATH TO TARGET")
print("=" * 60)
print(f"Current: {best_valid_cv:.6f}")
print(f"Target: {target}")
print(f"Gap: {gap:.6f} points")
print(f"Percentage improvement needed: {gap/best_valid_cv*100:.2f}%")
print(f"\nThis means we need to reduce the score by {gap:.2f} points")
print(f"Or improve efficiency by {gap/best_valid_cv*100:.2f}% across all N values")

CORRECTED ANALYSIS (excluding invalid scores)
Best VALID CV: 70.659944
Target: 68.919154
Gap: 1.740790 (2.53%)

Average improvement needed per N: 0.008704

REALISTIC PATH TO TARGET
Current: 70.659944
Target: 68.919154
Gap: 1.740790 points
Percentage improvement needed: 2.46%

This means we need to reduce the score by 1.74 points
Or improve efficiency by 2.46% across all N values
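
The uniform 2.46% figure assumes every N improves equally. A complementary view is how large a reduction a single N range would need if it had to absorb the whole gap alone. The sketch below does that arithmetic with figures copied from the outputs above; the helper name `required_pct` is illustrative.

```python
# Figures copied from the outputs above.
gap = 1.740790  # best valid CV (70.659944) minus target (68.919154)
range_scores = {
    (1, 10): 4.3291,
    (11, 50): 14.7126,
    (51, 100): 17.6323,
    (101, 150): 17.1408,
    (151, 200): 16.8452,
}


def required_pct(range_score, gap):
    """Relative reduction (%) a range needs to absorb the whole gap alone."""
    return gap / range_score * 100


needed = {rng: required_pct(score, gap) for rng, score in range_scores.items()}
for (start, end), pct in needed.items():
    print(f"  N={start:3d}-{end:3d}: {pct:.1f}% reduction needed if improved alone")
```

Ranges with a small share of the total score need a disproportionately large relative improvement, which is worth weighing when deciding where to spend optimization time.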


## Strategic Conclusions

### The Problem
1. We're stuck at a strong local optimum (70.66)
2. Local optimization (SA, bbox3) provides negligible improvement (at most 0.000015 across all valid experiments)
3. The target (68.92) requires a 1.74-point improvement (2.46% of the current score)
4. The target is BELOW the current public leaderboard best (71.19)

### What This Means
- The target represents what winning teams achieved with techniques NOT publicly shared
- Simple optimization of the saspav baseline won't reach the target
- We need fundamentally different approaches

### Possible Paths Forward

1. **Long optimization runs (3+ hours)** - The bbox3-runner runs for 3 hours. We've only tried 10 minutes.

2. **Different starting configurations** - All our experiments start from saspav. What if there's a better basin of attraction?

3. **Constraint programming / SAT solvers** - Exact methods might find provably optimal solutions for small N

4. **Genetic algorithms with diverse population** - Might escape local optima

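5. **Focus on worst-efficiency N values** - N=1-10 have the worst efficiency, but they contribute only 4.33 points (6.1% of the total), so closing the 1.74-point gap there alone would take a reduction of roughly 40%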

6. **Manual optimization for small N** - For N=1-5, we might be able to find optimal configurations analytically
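
For the smallest case the last idea can be checked directly. With a single tree, position is irrelevant and only the rotation angle matters, so a brute-force scan over angles is enough to see how much room is left. The sketch below is a minimal check, assuming the same vertex lists as `get_tree_polygon` and a counter-clockwise rotation about the origin; the 0.5-degree grid and the function name `bounding_square_area` are arbitrary choices made here.

```python
import math

# Tree outline, copied from get_tree_polygon above.
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
      -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
      -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]


def bounding_square_area(deg):
    """Area of the smallest axis-aligned square around the tree rotated by deg."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    xs = [x * c - y * s for x, y in zip(TX, TY)]
    ys = [x * s + y * c for x, y in zip(TX, TY)]
    side = max(max(xs) - min(xs), max(ys) - min(ys))
    return side ** 2


# Coarse grid over one quarter turn (0 to 90 degrees in 0.5-degree steps).
angles = [i * 0.5 for i in range(181)]
best_deg = min(angles, key=bounding_square_area)
best_area = bounding_square_area(best_deg)
print(f"Best angle on grid: {best_deg:.1f} deg, area {best_area:.6f}")
```

On this grid the scan lands on the same 45-degree rotation and 0.661250 score that the baseline already uses for N=1, which suggests there is nothing left to gain there. A grid scan is not a proof, and for N >= 2 the search space also includes relative positions, so the same brute-force approach gets expensive quickly.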