# Loop 33 Strategic Analysis

## Critical Situation
- Current score: 70.316492
- Target: 68.870074
- Gap: 1.446 points (2.10%)
- 14 consecutive experiments with NO improvement
- We are at the PUBLIC SOLUTION CEILING
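
The gap figures above can be reproduced in a few lines. Note that the 2.10% is measured relative to the target, not the current score. The variable names here are illustrative only.

```python
current = 70.316492
target = 68.870074

gap = current - target
gap_pct_of_target = 100 * gap / target  # relative to the target score

print(f'Gap: {gap:.3f} points ({gap_pct_of_target:.2f}% of target)')
```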

In [None]:
# Verify we are at the public ceiling
import glob
import pandas as pd
import numpy as np
import math

# Tree polygon: 15 vertices, listed clockwise starting from the tip at (0, 0.8)
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

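def parse_s(s):
    """Convert a value to float, stripping a leading 's' if present."""
    if isinstance(s, str) and s.startswith('s'):
        return float(s[1:])
    return float(s)

def get_bounds(x, y, deg):
    """Return (min_x, max_x, min_y, max_y) of the tree rotated by deg degrees and moved to (x, y)."""
    rad = math.radians(deg)
    cos_a, sin_a = math.cos(rad), math.sin(rad)
    # Standard 2-D rotation about the tree's origin, followed by translation
    rx = TX * cos_a - TY * sin_a + x
    ry = TX * sin_a + TY * cos_a + y
    return rx.min(), rx.max(), ry.min(), ry.max()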

def compute_total_score(df):
    total = 0
    for n in range(1, 201):
        pattern = f'{n:03d}_'
        cfg = df[df['id'].str.startswith(pattern)]
        if len(cfg) != n:
            return float('inf')
        
        minx = miny = float('inf')
        maxx = maxy = float('-inf')
        
        for _, row in cfg.iterrows():
            x = parse_s(row['x'])
            y = parse_s(row['y'])
            deg = parse_s(row['deg'])
            x0, x1, y0, y1 = get_bounds(x, y, deg)
            minx = min(minx, x0)
            maxx = max(maxx, x1)
            miny = min(miny, y0)
            maxy = max(maxy, y1)
        
        side = max(maxx - minx, maxy - miny)
        total += side**2 / n
    return total

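# Check all CSV files. With recursive=True, '**' also matches zero directories,
# so top-level files are already included and a second glob would add duplicates.
csv_files = sorted(set(glob.glob('/home/code/data/external/**/*.csv', recursive=True)))

scores = []
for f in csv_files:
    try:
        df = pd.read_csv(f)
        if {'id', 'x', 'y', 'deg'}.issubset(df.columns):
            score = compute_total_score(df)
            if score < 100:
                scores.append((f, score))
    except Exception as e:
        # Skip unreadable or malformed files, but report them instead of failing silently
        print(f'Skipping {f}: {e}')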

scores.sort(key=lambda x: x[1])
print(f'Best scores from {len(scores)} external files:')
for f, score in scores[:10]:
    print(f'  {score:.6f}  {f}')

## Key Insight

The best whole-file score in ALL external data is 70.316492, exactly our current score.

This means:
1. We have already extracted the best per-N solutions from all available sources
2. No further ensemble improvements are possible from existing data
3. To improve, we need one of:
   - NEW external data sources with better solutions
   - MUCH longer optimization runs (hours/days)
   - A fundamentally different algorithm
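
Point 1 can be checked directly rather than assumed: a whole-file comparison does not rule out a file that is worse overall but better for one particular N. Because each N is scored independently (`side**2 / n`), the best configuration for each N can be taken from any source. A minimal sketch, assuming per-N scores have already been computed for every source; the `best_per_n` helper and the toy numbers are hypothetical and not part of the notebook:

```python
def best_per_n(per_source):
    """Pick, for each N, the source with the lowest score.

    per_source maps a source name to a dict {n: score}.
    Returns {n: (best_score, source_name)}.
    """
    best = {}
    for source, scores in per_source.items():
        for n, score in scores.items():
            if n not in best or score < best[n][0]:
                best[n] = (score, source)
    return best

# Toy data for illustration only (not real competition scores)
per_source = {
    'current': {1: 0.70, 2: 0.50, 3: 0.45},
    'file_a':  {1: 0.70, 2: 0.48, 3: 0.47},
}
ensemble = best_per_n(per_source)
ensemble_total = sum(score for score, _ in ensemble.values())
current_total = sum(per_source['current'].values())
print(f'ensemble {ensemble_total:.2f} vs current {current_total:.2f}')
```

If the ensemble total over the real files equals the current total, point 2 holds for the data on disk.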

In [None]:
# Analyze per-N scores to find where improvement is possible
df = pd.read_csv('/home/submission/submission.csv')

per_n_scores = {}
for n in range(1, 201):
    pattern = f'{n:03d}_'
    cfg = df[df['id'].str.startswith(pattern)]
    if len(cfg) != n:
        continue
    
    minx = miny = float('inf')
    maxx = maxy = float('-inf')
    
    for _, row in cfg.iterrows():
        x = parse_s(row['x'])
        y = parse_s(row['y'])
        deg = parse_s(row['deg'])
        x0, x1, y0, y1 = get_bounds(x, y, deg)
        minx = min(minx, x0)
        maxx = max(maxx, x1)
        miny = min(miny, y0)
        maxy = max(maxy, y1)
    
    side = max(maxx - minx, maxy - miny)
    per_n_scores[n] = side**2 / n

# Calculate theoretical minimum (perfect packing)
# Tree area from the shoelace formula on the (TX, TY) polygon: 0.245625
tree_area = 0.5 * abs(np.dot(TX, np.roll(TY, -1)) - np.dot(TY, np.roll(TX, -1)))

print('Per-N analysis:')
print('N | Current Score | Theoretical Min | Gap')
print('-' * 50)
for n in [1, 2, 3, 5, 10, 20, 50, 100, 200]:
    current = per_n_scores[n]
    # Area-only lower bound: n trees need side**2 >= n * tree_area,
    # so side**2 / n >= tree_area for every n (assumes 100% packing density)
    theoretical = tree_area
    gap = current - theoretical
    print(f'{n:3d} | {current:.6f} | {theoretical:.6f} | {gap:.6f}')

## Strategy Options

### Option 1: Extended Optimization (BLOCKED - bbox3 incompatible)
The bbox3 binary requires GLIBC_2.34, which is not available on this system.
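
The glibc version can be confirmed from Python before spending more time on the binary. A minimal sketch using the standard library's `platform.libc_ver()`; the helper names `parse_version` and `glibc_at_least` are made up for this example:

```python
import platform

def parse_version(text):
    """Turn a dotted version string such as '2.31' into a tuple of ints."""
    return tuple(int(part) for part in text.split('.') if part.isdigit())

def glibc_at_least(found, required='2.34'):
    """Return True if the found glibc version meets the requirement."""
    return parse_version(found) >= parse_version(required)

# platform.libc_ver() returns (lib, version); both are empty strings if undetected
lib, version = platform.libc_ver()
if lib == 'glibc' and version:
    print(f'glibc {version}: meets GLIBC_2.34 = {glibc_at_least(version)}')
else:
    print('Could not detect glibc on this system')
```

Versions are compared as integer tuples because a plain string comparison would rank '2.4' above '2.34'.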

### Option 2: Compile bbox3.cpp from source
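Need to find the source code and compile it locally. A local build should link against this system's glibc and avoid the GLIBC_2.34 requirement.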

### Option 3: Implement custom C++ optimizer
Write our own SA optimizer in C++ and compile it.
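
Before committing to C++, the loop structure can be pinned down in Python. The sketch below is a generic simulated annealing loop with geometric cooling and the Metropolis acceptance rule, run on a toy 1-D objective. It is not the bbox3 algorithm. The `anneal` function, its parameters, and the toy objective are all assumptions for illustration; a real optimizer would also need tree overlap checks and moves over `(x, y, deg)`.

```python
import math
import random

def anneal(cost, neighbor, x0, t_start=1.0, t_end=1e-3, steps=5000, seed=0):
    """Generic simulated annealing with geometric cooling.

    cost(x) returns the value to minimize; neighbor(x, rng) proposes a move.
    Returns the best state seen and its cost.
    """
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best_x, best_fx = x, fx
    alpha = (t_end / t_start) ** (1.0 / steps)  # per-step cooling factor
    t = t_start
    for _ in range(steps):
        cand = neighbor(x, rng)
        fc = cost(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        t *= alpha
    return best_x, best_fx

# Toy 1-D objective standing in for the bounding-square side
best_x, best_fx = anneal(
    cost=lambda x: (x - 3.0) ** 2,
    neighbor=lambda x, rng: x + rng.uniform(-0.5, 0.5),
    x0=0.0,
)
print(f'best x = {best_x:.3f}, cost = {best_fx:.6f}')
```

Tracking the best state separately from the current state means the result is never worse than the starting point, even if the walk ends on an uphill move.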

### Option 4: Search for NEW external data
Look for recently published kernels or datasets with better solutions.

### Option 5: Focus on specific N values
N=1 contributes 0.661 to the score, nearly twice the per-N average of 0.352 (70.316 / 200). If we can improve N=1, that's significant.
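
Whether N=1 has any headroom can be tested cheaply, because the bounding box of a single tree does not depend on its position and only the rotation angle matters. A minimal sketch that scans the angle in 0.1 degree steps, reusing the polygon from the first cell; `single_tree_score` is a name made up for this example:

```python
import math
import numpy as np

TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

def single_tree_score(deg):
    """Score side**2 / n for n = 1 with the tree rotated by deg degrees."""
    rad = math.radians(deg)
    cos_a, sin_a = math.cos(rad), math.sin(rad)
    rx = TX * cos_a - TY * sin_a
    ry = TX * sin_a + TY * cos_a
    side = max(rx.max() - rx.min(), ry.max() - ry.min())
    return side ** 2

angles = np.linspace(0.0, 180.0, 1801)  # 0.1 degree steps
scores = np.array([single_tree_score(a) for a in angles])
best = scores.argmin()
print(f'best angle ~ {angles[best]:.1f} deg, score ~ {scores[best]:.6f}')
```

If the scan minimum matches the N=1 score in the current submission, rotation alone offers nothing further for N=1 and this option is better spent on other N values.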