# Loop 7 LB Feedback Analysis

**Submission**: exp_006 (011_long_optimization)
**CV Score**: 70.6600
**LB Score**: 70.6600
**Gap**: 0.0000 (CV = LB, as expected for this optimization problem)

## Key Observations
1. CV = LB confirms this is a pure optimization problem with no train/test split
2. 33 minutes of C++ optimization found ZERO improvement
3. The baseline appears to sit at a very tight local optimum: local search finds no slack to exploit
4. Gap to target: 70.66 - 68.92 = 1.74 points (2.46%)

In [1]:
import pandas as pd
import numpy as np
from numba import njit
import math

# Load current best submission
df = pd.read_csv('/home/code/external_data/saspav_latest/santa-2025.csv')
print(f"Total rows: {len(df)}")
print(df.head())

Total rows: 20100
      id                          x                         y  \
0  001_0  s-48.19608619421424577922  s58.77098461521422478882   
1  002_0    s0.15409706962135588659  s-0.03854074269479464826   
2  002_1   s-0.15409706962137284525  s-0.56145925730522405761   
3  003_0    s1.12365581614030096702   s0.78110181599256300888   
4  003_1    s1.23405569584216001644   s1.27599950066375900093   

                         deg  
0   s45.00000000000000000000  
1  s203.62937773065684154972  
2   s23.62937773065679181173  
3  s111.12513229289299943048  
4   s66.37062226934300213088  


In [2]:
# Scoring function
@njit
def make_polygon_template():
    tw=0.15; th=0.2; bw=0.7; mw=0.4; ow=0.25
    tip=0.8; t1=0.5; t2=0.25; base=0.0; tbot=-th
    x=np.array([0,ow/2,ow/4,mw/2,mw/4,bw/2,tw/2,tw/2,-tw/2,-tw/2,-bw/2,-mw/4,-mw/2,-ow/4,-ow/2],np.float64)
    y=np.array([tip,t1,t1,t2,t2,base,base,tbot,tbot,base,base,t2,t2,t1,t1],np.float64)
    return x,y

@njit
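def score_group(xs, ys, degs, tx, ty):
    # Score of one group: side^2 / n, where side is the side length of the
    # axis-aligned bounding square of all n rotated and translated tree polygons.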
    n = xs.size
    V = tx.size
    mnx = 1e300; mny = 1e300; mxx = -1e300; mxy = -1e300
    for i in range(n):
        r = degs[i] * math.pi / 180.0
        c = math.cos(r); s = math.sin(r)
        xi = xs[i]; yi = ys[i]
        for j in range(V):
            X = c * tx[j] - s * ty[j] + xi
            Y = s * tx[j] + c * ty[j] + yi
            if X < mnx: mnx = X
            if X > mxx: mxx = X
            if Y < mny: mny = Y
            if Y > mxy: mxy = Y
    side = max(mxx - mnx, mxy - mny)
    return side * side / n

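# Values are stored as strings with an 's' prefix (e.g. 's0.154...');
# drop the prefix and parse as float64.
def strip(a):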
    return np.array([float(str(v).replace('s','')) for v in a], np.float64)

tx, ty = make_polygon_template()
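
As a sanity check on the scoring function, the sketch below re-implements the same bounding-square score in plain NumPy (no Numba) and evaluates a single tree. The names `tree_template` and `score_group_np` are illustrative and not part of the notebook. An unrotated tree is 0.7 wide and 1.0 tall, so its score should be 1.0. At 45° both extents work out to 1.15/√2, which should reproduce the N=1 score of 0.661250 reported in the per-N results.

```python
import numpy as np

def tree_template():
    """15-vertex tree outline, same coordinates as make_polygon_template()."""
    tw, th, bw, mw, ow = 0.15, 0.2, 0.7, 0.4, 0.25
    tip, t1, t2, base, tbot = 0.8, 0.5, 0.25, 0.0, -th
    x = np.array([0, ow/2, ow/4, mw/2, mw/4, bw/2, tw/2, tw/2,
                  -tw/2, -tw/2, -bw/2, -mw/4, -mw/2, -ow/4, -ow/2], dtype=np.float64)
    y = np.array([tip, t1, t1, t2, t2, base, base, tbot,
                  tbot, base, base, t2, t2, t1, t1], dtype=np.float64)
    return x, y

def score_group_np(xs, ys, degs, tx, ty):
    """Vectorized equivalent of score_group: bounding-square side^2 / n."""
    xs = np.asarray(xs, dtype=np.float64)[:, None]
    ys = np.asarray(ys, dtype=np.float64)[:, None]
    r = np.deg2rad(np.asarray(degs, dtype=np.float64))[:, None]
    c, s = np.cos(r), np.sin(r)
    # Rotate every template vertex by each tree's angle, then translate.
    X = c * tx[None, :] - s * ty[None, :] + xs
    Y = s * tx[None, :] + c * ty[None, :] + ys
    side = max(X.max() - X.min(), Y.max() - Y.min())
    return side * side / xs.shape[0]

tx, ty = tree_template()
print(score_group_np([0.0], [0.0], [0.0], tx, ty))   # unrotated tree
print(score_group_np([0.0], [0.0], [45.0], tx, ty))  # tree rotated by 45 degrees
```

The score depends only on the bounding box, so translating the whole group leaves it unchanged.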

In [3]:
# Calculate per-N scores
df['N'] = df['id'].astype(str).str.split('_').str[0].astype(int)

scores = []
for n, g in df.groupby('N'):
    xs = strip(g['x'].to_numpy())
    ys = strip(g['y'].to_numpy())
    ds = strip(g['deg'].to_numpy())
    sc = score_group(xs, ys, ds, tx, ty)
    scores.append({'N': n, 'score': sc, 'side': np.sqrt(sc * n)})

scores_df = pd.DataFrame(scores)
print(f"Total score: {scores_df['score'].sum():.6f}")
print(f"\nTop 10 worst-packed N (highest score = side^2 / N):")
print(scores_df.nlargest(10, 'score')[['N', 'score', 'side']])

Total score: 70.659958

Top 10 worst-packed N (highest score = side^2 / N):
     N     score      side
0    1  0.661250  0.813173
1    2  0.450779  0.949504
2    3  0.434745  1.142031
4    5  0.416850  1.443692
3    4  0.416545  1.290806
6    7  0.399897  1.673104
5    6  0.399610  1.548438
8    9  0.387415  1.867280
7    8  0.385407  1.755921
14  15  0.379203  2.384962


In [4]:
# Analyze which N values have the most room for improvement
# Theoretical minimum for N trees in a square is when they're perfectly packed
# For this tree shape, the theoretical packing efficiency is unknown

# Efficiency metric: the per-N score (side^2 / N), which is already normalized by N
scores_df['efficiency'] = scores_df['score']
scores_df['side_per_tree'] = scores_df['side'] / np.sqrt(scores_df['N'])

print("\nEfficiency analysis (lower is better):")
print(f"Best efficiency (lowest score): N={scores_df.loc[scores_df['score'].idxmin(), 'N']}, score={scores_df['score'].min():.6f}")
print(f"Worst efficiency (highest score): N={scores_df.loc[scores_df['score'].idxmax(), 'N']}, score={scores_df['score'].max():.6f}")

# Group by N ranges
scores_df['N_range'] = pd.cut(scores_df['N'], bins=[0, 10, 50, 100, 150, 200], labels=['1-10', '11-50', '51-100', '101-150', '151-200'])
print("\nScore contribution by N range:")
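# observed=False keeps the current behaviour for categorical keys and avoids the pandas FutureWarning
print(scores_df.groupby('N_range', observed=False)['score'].agg(['sum', 'mean', 'count']))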


Efficiency analysis (lower is better):
Best efficiency (lowest score): N=181, score=0.329946
Worst efficiency (highest score): N=1, score=0.661250

Score contribution by N range:
               sum      mean  count
N_range                            
1-10      4.329128  0.432913     10
11-50    14.712640  0.367816     40
51-100   17.632268  0.352645     50
101-150  17.140770  0.342815     50
151-200  16.845152  0.336903     50



In [5]:
# The key insight from the super-fast SA kernel:
# 1. Use 2-tree unit cells with translations
# 2. Automatically explore all viable grid sizes
# 3. Apply deletion cascade

# Let's check what grid sizes could work for large N
print("Grid configurations for large N (from egortrushin kernel):")
print("N=72: [4,9] -> 4*9*2 = 72")
print("N=100: [5,10] -> 5*10*2 = 100")
print("N=110: [5,11] -> 5*11*2 = 110")
print("N=144: [6,12] -> 6*12*2 = 144")
print("N=156: [6,13] -> 6*13*2 = 156")
print("N=196: [7,14] -> 7*14*2 = 196")
print("N=200: [7,15] -> 7*15*2 = 210, take first 200")

print("\nThese are the N values where lattice approach is most effective.")
print("The super-fast SA kernel shows ~0.15 improvement in 2 minutes!")

Grid configurations for large N (from egortrushin kernel):
N=72: [4,9] -> 4*9*2 = 72
N=100: [5,10] -> 5*10*2 = 100
N=110: [5,11] -> 5*11*2 = 110
N=144: [6,12] -> 6*12*2 = 144
N=156: [6,13] -> 6*13*2 = 156
N=196: [7,14] -> 7*14*2 = 196
N=200: [7,15] -> 7*15*2 = 210, take first 200

These are the N values where lattice approach is most effective.
The super-fast SA kernel shows ~0.15 improvement in 2 minutes!
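
The hand-picked grids above could also be enumerated automatically. The sketch below lists every `(rows, cols)` pair with `rows <= cols` whose lattice holds at least N trees and at most `max_extra` surplus trees to be trimmed. The function `viable_grids` and its `max_extra` parameter are assumptions made for illustration; the kernel's actual selection rule may differ.

```python
def viable_grids(n, max_extra=10, cell_size=2):
    """Return (rows, cols, total) with n <= total <= n + max_extra, rows <= cols.

    total = cell_size * rows * cols is the number of trees in the lattice.
    """
    grids = []
    rows = 1
    while cell_size * rows * rows <= n + max_extra:
        # Smallest cols >= rows such that the lattice holds at least n trees.
        cols = max(rows, -(-n // (cell_size * rows)))  # ceiling division
        while cell_size * rows * cols <= n + max_extra:
            grids.append((rows, cols, cell_size * rows * cols))
            cols += 1
        rows += 1
    return grids

for n in (72, 144, 200):
    print(n, viable_grids(n))
```

This deliberately includes very elongated grids such as a single row; a real search would presumably also filter on the aspect ratio of the resulting bounding box, which depends on the optimized unit-cell dimensions.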


In [6]:
# What's the gap breakdown?
target = 68.919154
current = scores_df['score'].sum()
gap = current - target

print(f"Current score: {current:.6f}")
print(f"Target score: {target:.6f}")
print(f"Gap: {gap:.6f} ({100*gap/current:.2f}%)")

# If we could improve large N by 5%, how much would that help?
large_n_score = scores_df[scores_df['N'] > 100]['score'].sum()
print(f"\nLarge N (>100) contribution: {large_n_score:.6f} ({100*large_n_score/current:.1f}%)")
print(f"5% improvement on large N: {0.05 * large_n_score:.6f}")
print(f"Would close {100 * 0.05 * large_n_score / gap:.1f}% of the gap")

Current score: 70.659958
Target score: 68.919154
Gap: 1.740804 (2.46%)

Large N (>100) contribution: 33.985922 (48.1%)
5% improvement on large N: 1.699296
Would close 97.6% of the gap
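
The same arithmetic can be turned around: how large a uniform relative improvement is needed, depending on which N ranges the new approach actually helps? The sketch below uses the range sums and target reported above; `required_improvement` is an illustrative helper, not part of the notebook.

```python
# Per-range score sums copied from the "Score contribution by N range" output.
range_sums = {
    "1-10": 4.329128,
    "11-50": 14.712640,
    "51-100": 17.632268,
    "101-150": 17.140770,
    "151-200": 16.845152,
}
target = 68.919154

def required_improvement(ranges, sums=range_sums, target=target):
    """Uniform fractional improvement needed in `ranges` to reach the target."""
    current = sum(sums.values())
    gap = current - target
    return gap / sum(sums[r] for r in ranges)

for subset in (["101-150", "151-200"],
               ["51-100", "101-150", "151-200"],
               list(range_sums)):
    print(subset, f"{100 * required_improvement(subset):.2f}%")
```

Closing the whole gap from N > 100 alone needs roughly a 5.1% improvement there, while spreading it over all N needs about 2.5%, so gains on mid-sized N matter as well.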


## Strategy for Next Experiment

### Key Insight from Research

The **super-fast SA with translations** kernel (jiweiliu) shows a complete workflow that:
1. Uses 2-tree unit cells with grid translations
2. Automatically explores ALL viable grid sizes (not just hand-picked ones)
3. Applies deletion cascade (backward propagation)
4. Reportedly improves the score by ~0.15 in under 2 minutes

### The Problem with Our Current Approach

We've been trying to optimize an ALREADY OPTIMIZED solution. The baseline is at a tight local optimum: 33 minutes of C++ optimization produced no improvement at all.

### The Solution: Generate NEW Configurations from Scratch

The lattice approach generates DIFFERENT configurations that may be in DIFFERENT basins:
1. Start with 2 trees at (0, 0) and (0.5, 0.5), rotated by 0° and 180° respectively
2. Optimize the 2-tree unit cell with SA
3. Translate the unit cell in a grid pattern
4. Run SA on the full configuration
5. Apply deletion cascade to propagate improvements to smaller N
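
Step 3 above (translating the unit cell over a grid) can be sketched as follows. The function `build_lattice` and the spacings `dx`, `dy` are hypothetical: the values used here are arbitrary placeholders, not optimized, and no overlap check is performed. In the real workflow the cell and spacings would come out of the SA in step 2.

```python
import numpy as np

def build_lattice(cell, dx, dy, rows, cols, n):
    """Tile a unit cell over a rows x cols grid and keep the first n trees.

    cell: sequence of (x, y, deg) for the trees of one unit cell.
    dx, dy: translation between neighbouring cells along x and y.
    """
    cell = np.asarray(cell, dtype=np.float64)
    # ii[r, c] = c (column index), jj[r, c] = r (row index)
    ii, jj = np.meshgrid(np.arange(cols), np.arange(rows))
    off_x = ii.ravel() * dx
    off_y = jj.ravel() * dy
    # For each cell offset, emit all trees of the unit cell in order.
    xs = (off_x[:, None] + cell[None, :, 0]).ravel()
    ys = (off_y[:, None] + cell[None, :, 1]).ravel()
    degs = np.tile(cell[:, 2], rows * cols)
    return xs[:n], ys[:n], degs[:n]

# Initial 2-tree cell from step 1; dx = dy = 1.0 is a placeholder spacing.
cell = [(0.0, 0.0, 0.0), (0.5, 0.5, 180.0)]
xs, ys, degs = build_lattice(cell, dx=1.0, dy=1.0, rows=7, cols=15, n=200)
print(len(xs), xs[:4], ys[:4], degs[:4])
```

With a 7 x 15 grid this builds 210 trees and truncates to the first 200, matching the N=200 configuration listed earlier.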

### Next Experiment: Implement Super-Fast SA with Translations

We should implement the jiweiliu kernel approach, which:
- Automatically explores all viable grid sizes
- Uses Numba for acceleration
- Applies deletion cascade
- Gets ~0.15 improvement quickly
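
One plausible reading of the deletion cascade is sketched below: starting from a valid N-tree configuration, repeatedly drop the tree whose removal shrinks the bounding square the most, and record the result whenever it beats the best known score for that smaller N. Removing trees cannot create overlaps, so every intermediate configuration stays valid. The function names and the per-tree bounding-box input format are assumptions for illustration; the kernel's actual procedure may differ.

```python
import numpy as np

def side_of(boxes):
    """Bounding-square side for per-tree boxes given as (minx, maxx, miny, maxy)."""
    width = boxes[:, 1].max() - boxes[:, 0].min()
    height = boxes[:, 3].max() - boxes[:, 2].min()
    return max(width, height)

def deletion_cascade(boxes, best):
    """Greedily delete trees and propagate improvements to smaller N.

    boxes: (N, 4) array of per-tree axis-aligned bounding boxes.
    best:  dict mapping n -> best known score (side^2 / n); updated in place.
    Returns a dict n -> (score, indices of kept trees) for every improved n.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    keep = np.arange(len(boxes))
    improved = {}
    while len(keep) > 1:
        # Evaluate the bounding square after removing each remaining tree.
        sides = [side_of(boxes[np.delete(keep, k)]) for k in range(len(keep))]
        k_best = int(np.argmin(sides))
        keep = np.delete(keep, k_best)
        n = len(keep)
        score = sides[k_best] ** 2 / n
        if score < best.get(n, np.inf):
            best[n] = score
            improved[n] = (score, keep.copy())
    return improved

# Toy example: two adjacent unit boxes plus one far-away outlier.
toy = np.array([[0, 1, 0, 1], [1, 2, 0, 1], [5, 6, 0, 1]], dtype=np.float64)
print(deletion_cascade(toy, best={}))
```

Only trees touching the bounding box can change the side when removed, so a faster version could restrict the candidates to those; the brute-force loop here favours clarity.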

This is fundamentally different from optimizing the current baseline!