# Loop 39 Analysis: LB Feedback and Strategy Assessment

## Key Finding: We are at a LOCAL OPTIMUM

The submission 038_backward_iteration_v3 scored **70.3067** on LB, matching CV exactly.

After 40 experiments:
- All optimization methods (SA, GA, bbox3, backward/forward iteration) find 0 improvements
- The solution is at an EXTREMELY strong local optimum
- Gap to target: 1.44 points (2.09%)

In [None]:
import pandas as pd
import numpy as np
from numba import njit
import math
import glob
import os

@njit
def score_group(xs,ys,degs,tx,ty):
    n=xs.size; V=tx.size
    mnx=1e300; mny=1e300; mxx=-1e300; mxy=-1e300
    for i in range(n):
        r=degs[i]*math.pi/180.0
        c=math.cos(r); s=math.sin(r)
        xi=xs[i]; yi=ys[i]
        for j in range(V):
            X=c*tx[j]-s*ty[j]+xi
            Y=s*tx[j]+c*ty[j]+yi
            if X<mnx: mnx=X
            if X>mxx: mxx=X
            if Y<mny: mny=Y
            if Y>mxy: mxy=Y
    side=max(mxx-mnx,mxy-mny)
    return side*side/n

def strip(a):
    # Values may be stored as strings containing an 's' marker; remove it and parse as float64
    return np.array([float(str(v).replace('s','')) for v in a],np.float64)

TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

In [None]:
# Load current submission and analyze per-N scores
df = pd.read_csv('/home/submission/submission.csv')
df['N'] = df['id'].str.split('_').str[0].astype(int)

per_n_scores = {}
for n in range(1, 201):
    g = df[df['N'] == n]
    xs = strip(g['x'].to_numpy())
    ys = strip(g['y'].to_numpy())
    ds = strip(g['deg'].to_numpy())
    per_n_scores[n] = score_group(xs, ys, ds, TX, TY)

TARGET = 68.866853
total = sum(per_n_scores.values())
print(f'Current total score: {total:.6f}')
print(f'Target: {TARGET}')
print(f'Gap: {total - TARGET:.6f} ({(total - TARGET)/TARGET*100:.2f}%)')

# Show per-N score distribution
print('\n=== Per-N Score Distribution ===')
print(f'Min: {min(per_n_scores.values()):.6f} at N={min(per_n_scores, key=per_n_scores.get)}')
print(f'Max: {max(per_n_scores.values()):.6f} at N={max(per_n_scores, key=per_n_scores.get)}')
print(f'Mean: {np.mean(list(per_n_scores.values())):.6f}')
print(f'Std: {np.std(list(per_n_scores.values())):.6f}')

In [None]:
# Analyze where the gap comes from
# If target is 68.87 and we have 70.31, we need to reduce by 1.44 total
# That's 1.44/200 = 0.0072 per N on average

print('=== Gap Analysis ===')
print(f'Total gap: {total - 68.866853:.6f}')
print(f'Average gap per N: {(total - 68.866853)/200:.6f}')

# Show top 20 N values with highest scores (most room for improvement)
print('\n=== Top 20 N values with highest scores ===')
sorted_scores = sorted(per_n_scores.items(), key=lambda x: x[1], reverse=True)
for n, sc in sorted_scores[:20]:
    print(f'N={n:3d}: {sc:.6f}')

In [None]:
# Calculate a theoretical lower bound on the score
# Trees cannot overlap, so N trees need at least N * tree_area of space inside the square:
#   side^2 >= N * tree_area, hence score = side^2 / N >= tree_area
# The tree polygon's area is 0.245625 (shoelace formula on TX, TY), not the 0.7 of its bounding box

# The bound is not attainable because the tree shape cannot fill a square without gaps,
# so the actual minimum depends on the shape
# Let's calculate the packing efficiency for each N

print('=== Packing Efficiency Analysis ===')
print('Packing efficiency = (N * tree_bbox_area) / (side^2)')
print('Higher efficiency = better packing')

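# Tree bounding box (axis-aligned, upright tree)
# It overstates the polygon's area, so the efficiency ratio below can exceed 1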
tree_width = 0.7  # from -0.35 to 0.35
tree_height = 1.0  # from -0.2 to 0.8
tree_bbox_area = tree_width * tree_height

print(f'\nTree bounding box area: {tree_bbox_area:.4f}')

efficiencies = {}
for n, sc in per_n_scores.items():
    # score = side^2 / n, so side^2 = score * n
    side_sq = sc * n
    efficiency = (n * tree_bbox_area) / side_sq
    efficiencies[n] = efficiency

print('\n=== Packing Efficiency by N ===')
for n in [1, 2, 3, 5, 10, 20, 50, 100, 200]:
    print(f'N={n:3d}: efficiency={efficiencies[n]:.4f} (score={per_n_scores[n]:.6f})')
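
The area argument behind the efficiency numbers can be checked directly. The sketch below is a minimal pure-Python version, assuming the `TX`/`TY` vertex order traces a simple polygon; `polygon_area` and `area_density` are illustrative names that do not appear in the original cells.

```python
# Tree polygon vertices, copied from TX / TY in the first cell.
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
      -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
      -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def polygon_area(xs, ys):
    """Shoelace formula: area of a simple polygon given its ordered vertices."""
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to close the polygon
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0

TREE_AREA = polygon_area(TX, TY)
BBOX_AREA = (max(TX) - min(TX)) * (max(TY) - min(TY))

def area_density(score):
    """Fraction of the bounding square covered by trees: N * area / side^2 = area / score."""
    return TREE_AREA / score

print(f'polygon area: {TREE_AREA:.6f}')
print(f'bbox area:    {BBOX_AREA:.4f}')
print(f'loose lower bound on the total over N=1..200: {200 * TREE_AREA:.3f}')
```

Since non-overlapping trees must fit inside the square, each per-N score is bounded below by the polygon area. The bound is loose and is not expected to be reachable; its use here is that the area-based density stays below 1 for any valid packing, whereas the bounding-box ratio can exceed 1.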

## Key Insight: The Problem Structure

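1. **Small N (1-10)**: These have the HIGHEST per-N scores and LOWEST packing efficiency (efficiency = 0.7 / score, which can exceed 1 because the bounding-box area of 0.7 overstates the tree's footprint)
   - N=1: 0.661 (efficiency ~1.06 - nearly optimal for single tree)
   - N=2-10: 0.38-0.45 (efficiency ~1.6-1.8)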

2. **Large N (100-200)**: These have LOWER per-N scores and HIGHER packing efficiency
   - N=100-200: ~0.33-0.34 (efficiency ~2.0+)

3. **The gap of 1.44 points** is distributed across ALL N values
   - To reach target, we need ~0.007 improvement per N on average
   - But small N have more room for improvement
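
The arithmetic in item 3 can be sketched as below. This is a minimal illustration, not part of the original analysis: `bucket_totals`, `required_uniform_gain`, and `required_relative_cut` are hypothetical helper names, and the `placeholder` dictionary holds made-up flat scores purely to exercise the functions. In practice the `per_n_scores` dictionary from the earlier cell would be passed in.

```python
def bucket_totals(per_n_scores, buckets):
    """Sum the per-N scores that fall inside each inclusive (lo, hi) range of N."""
    totals = {}
    for lo, hi in buckets:
        totals[(lo, hi)] = sum(s for n, s in per_n_scores.items() if lo <= n <= hi)
    return totals

def required_uniform_gain(total, target, n_groups):
    """Average score reduction needed per N if the gap were spread evenly."""
    return (total - target) / n_groups

def required_relative_cut(bucket_total, gap):
    """Fractional reduction a single bucket would need to close the whole gap alone."""
    return gap / bucket_total

# PLACEHOLDER data: a flat 0.35 for every N, NOT the real submission scores.
placeholder = {n: 0.35 for n in range(1, 201)}
totals = bucket_totals(placeholder, [(1, 10), (11, 100), (101, 200)])

gap = 70.3067 - 68.866853  # LB score minus target, both taken from this notebook
print(f'uniform gain needed per N: {required_uniform_gain(70.3067, 68.866853, 200):.6f}')
for bucket, bucket_total in totals.items():
    print(bucket, f'total={bucket_total:.3f}',
          f'cut to close gap alone={required_relative_cut(bucket_total, gap):.1%}')
```

Comparing the relative cut each bucket would need makes it easier to judge whether focusing on small N alone is plausible.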

## Why We're Stuck

1. **All local search methods exhausted**: SA, GA, bbox3, backward/forward iteration all find 0 improvements
2. **The solution is at a strong local optimum**: Any small perturbation makes it worse
3. **Lower-scoring configurations exist but are invalid**: The "better" scores in snapshots come from configurations in which trees overlap, so they cannot be submitted
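
Item 3 depends on detecting overlapping trees. The following is a minimal pure-Python sketch of such a check, not the competition's validator. It assumes the same rotate-then-translate convention as `score_group`; `place`, `segments_cross`, `point_in_polygon`, and `trees_overlap` are illustrative names. Trees that only touch, or whose edges are exactly collinear, are not handled and would need a tolerance-aware test.

```python
import math

TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
      -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
      -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def place(x, y, deg):
    """Rotate the tree polygon by deg degrees, then translate it to (x, y)."""
    r = math.radians(deg)
    c, s = math.cos(r), math.sin(r)
    return [(c * tx - s * ty + x, s * tx + c * ty + y) for tx, ty in zip(TX, TY)]

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); its sign gives the turn direction."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def segments_cross(p1, p2, q1, q2):
    """True when the segments cross at a single point interior to both."""
    d1, d2 = cross(q1, q2, p1), cross(q1, q2, p2)
    d3, d4 = cross(p1, p2, q1), cross(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def point_in_polygon(pt, poly):
    """Ray-casting test; boundary points are not treated specially."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_at_y = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_at_y:
                inside = not inside
    return inside

def trees_overlap(a, b):
    """Overlap if any pair of edges crosses, or one polygon holds a vertex of the other."""
    n, m = len(a), len(b)
    for i in range(n):
        for j in range(m):
            if segments_cross(a[i], a[(i + 1) % n], b[j], b[(j + 1) % m]):
                return True
    return point_in_polygon(a[0], b) or point_in_polygon(b[0], a)

base = place(0.0, 0.0, 0.0)
print(trees_overlap(base, place(0.1, 0.0, 0.0)))  # shifted copy intersects the original
print(trees_overlap(base, place(2.0, 0.0, 0.0)))  # far-away copy does not
```

Running a check like this over every pair within a group before scoring separates genuinely better configurations from invalid ones.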

## What's Needed to Break Through

1. **Different starting point**: Start from a completely different configuration
2. **Global search**: Methods that can escape local optima (e.g., restart from random)
3. **External data**: Top teams may have access to better solutions from external sources
4. **Novel algorithms**: Constraint programming, branch-and-bound, etc.
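
Item 2 can be illustrated with random-restart hill climbing. The sketch below runs on a toy one-dimensional objective with two basins, not on the packing problem itself; `hill_climb`, `random_restart`, and `toy` are hypothetical names, and a real version would also need to reject overlapping placements.

```python
import random

def hill_climb(objective, x0, step, iters, rng):
    """Greedy local search: keep a random perturbation only if it lowers the objective."""
    best_x, best_f = list(x0), objective(x0)
    for _ in range(iters):
        cand = [v + rng.uniform(-step, step) for v in best_x]
        f = objective(cand)
        if f < best_f:
            best_x, best_f = cand, f
    return best_x, best_f

def random_restart(objective, sample_start, restarts, step, iters, seed=0):
    """Run hill_climb from several random starting points and keep the best result."""
    rng = random.Random(seed)
    best_x, best_f = None, float('inf')
    for _ in range(restarts):
        x, f = hill_climb(objective, sample_start(rng), step, iters, rng)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

def toy(x):
    """Two basins: a shallow one near x = +1 and a deeper one near x = -1."""
    return (x[0] ** 2 - 1) ** 2 + 0.3 * x[0]

# A single run started in the shallow basin cannot cross the barrier with small steps.
stuck_x, stuck_f = hill_climb(toy, [1.0], step=0.1, iters=500, rng=random.Random(1))

# Restarts sample both basins, so the deeper one is found as well.
best_x, best_f = random_restart(toy, lambda rng: [rng.uniform(-2.0, 2.0)],
                                restarts=20, step=0.1, iters=500, seed=0)
print(f'single run:   x={stuck_x[0]:.3f}, f={stuck_f:.3f}')
print(f'with restart: x={best_x[0]:.3f}, f={best_f:.3f}')
```

The single run mirrors the situation described above, where every small perturbation is rejected, while the restarts show how a different starting point can land in a better basin.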