# Loop 6 Analysis: Deletion Cascade and Grid-Based Initialization

## Current Status
- Best CV: 70.675510 (exp_005 corner extraction)
- Best LB: 70.676145 (exp_003, exp_004)
- Target: 68.922808
- Gap: 1.75 points (2.5% reduction needed)

## Key Insight from jiweiliu kernel
The jiweiliu kernel claims ~0.15 improvement in under 2 minutes using:
1. Grid-based initialization with SA
2. Deletion cascade to propagate good large configs to smaller sizes
3. Iterative mixing with guided refinement

In [None]:
import pandas as pd
import numpy as np

# Load current submission
df = pd.read_csv('/home/submission/submission.csv')
print(f'Loaded {len(df)} rows')

In [None]:
# Tree polygon template
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

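def strip(val):
    # Submission values carry an 's' marker; drop it and parse the number
    return float(str(val).replace('s', ''))

def score_group(xs, ys, degs):
    """Return side^2 / n for the bounding square of all tree vertices in one group."""
    n = len(xs)
    all_px, all_py = [], []
    for x, y, deg in zip(xs, ys, degs):
        # Rotate the template by deg (counter-clockwise), then translate to (x, y)
        rad = np.radians(deg)
        c, s = np.cos(rad), np.sin(rad)
        px = TX * c - TY * s + x
        py = TX * s + TY * c + y
        all_px.extend(px)
        all_py.extend(py)
    # Side of the smallest axis-aligned square covering every vertex
    side = max(max(all_px) - min(all_px), max(all_py) - min(all_py))
    return side * side / n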

# Calculate per-N scores
scores = {}
for n in range(1, 201):
    group = df[df['id'].str.startswith(f'{n:03d}_')]
    xs = np.array([strip(x) for x in group['x']])
    ys = np.array([strip(y) for y in group['y']])
    degs = np.array([strip(d) for d in group['deg']])
    scores[n] = score_group(xs, ys, degs)

total = sum(scores.values())
print(f'Total score: {total:.6f}')
print(f'Target: 68.922808')
print(f'Gap: {total - 68.922808:.6f}')
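
As a quick sanity check of the scoring formula, here is a minimal sketch that assumes the polygon template above is correct. A single unrotated tree spans 0.7 in x and 1.0 in y, so its bounding square has side 1.0 and its score is 1.0. The helper name `bounding_side` is hypothetical; it is a vectorized equivalent of the side computation in `score_group`.

```python
import numpy as np

# Same tree template as in the notebook cell above
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
               -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
               -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

def bounding_side(xs, ys, degs):
    """Side of the axis-aligned bounding square (hypothetical helper)."""
    rad = np.radians(np.asarray(degs, dtype=float))[:, None]
    c, s = np.cos(rad), np.sin(rad)
    # Broadcasting gives one row of 15 vertices per tree
    px = TX * c - TY * s + np.asarray(xs, dtype=float)[:, None]
    py = TX * s + TY * c + np.asarray(ys, dtype=float)[:, None]
    return max(px.max() - px.min(), py.max() - py.min())

# One unrotated tree: width 0.7, height 1.0, so side 1.0 and score 1.0
side_single = bounding_side([0.0], [0.0], [0.0])
score_single = side_single ** 2 / 1
print(f'Single tree: side={side_single:.4f}, score={score_single:.4f}')

# Two unrotated trees side by side, centers 0.7 apart: width 1.4, height 1.0
side_pair = bounding_side([0.0, 0.7], [0.0, 0.0], [0.0, 0.0])
score_pair = side_pair ** 2 / 2
print(f'Side-by-side pair: side={side_pair:.4f}, score={score_pair:.4f}')
```

The pair scores 0.98, barely better than a single tree, which illustrates why rotation and interlocking matter for small N.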

In [None]:
# Analyze which N values need the most improvement
worst_n = sorted(scores.items(), key=lambda x: x[1], reverse=True)[:30]
print('\nTop 30 worst N values (highest contribution to score):')
print('N\tScore\t\tCumulative')
cumulative = 0
for n, score in worst_n:
    cumulative += score
    print(f'{n}\t{score:.6f}\t{cumulative:.6f}')

In [None]:
# Rough reference levels for the per-N score (side^2 / N)
# Tree bounding box: width 0.7, height 1.0, so area 0.7
# Tiling N axis-aligned bounding boxes needs area ~0.7 * N, i.e. side ~sqrt(0.7 * N),
# which gives a per-N score of ~0.7
# This is NOT a lower bound: trees can interlock (e.g. alternating upright and inverted),
# so real packings score well below 0.7
# The 0.35 used below is a reference level (half the bounding-box area), not a derived bound

print('\nTheoretical analysis:')
print('If all N values achieved score 0.35 (very tight packing):')
print(f'  Total would be: {0.35 * 200:.2f}')
print('\nCurrent average score per N:', total / 200)
print('Target average score per N:', 68.922808 / 200)
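
A hard lower bound follows from area alone: N non-overlapping trees of area A need a square with side² ≥ N·A, so every per-N score is at least A. The sketch below computes A from the template with the shoelace formula. The helper name `polygon_area` is hypothetical, and the bound is loose because the trees cannot tile a square perfectly.

```python
import numpy as np

# Same tree template as in the notebook cell above
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
               -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
               -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

def polygon_area(x, y):
    """Shoelace formula for a simple polygon; abs() makes it orientation-independent."""
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

tree_area = polygon_area(TX, TY)
lower_bound_total = 200 * tree_area

print(f'Tree polygon area (per-N lower bound): {tree_area:.6f}')
print(f'Lower bound on total over N=1..200:    {lower_bound_total:.3f}')
```

The area is 0.245625, so the total can never fall below 49.125. The target average of about 0.345 per N corresponds to filling roughly 71% of the bounding square with tree area.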

In [None]:
# Identify N values where we might find improvements
# The jiweiliu kernel uses grid-based initialization which works well for larger N
# Let's see which N values have the worst efficiency

print('\nN values with worst efficiency (score > 0.4):')
high_score_n = [(n, s) for n, s in scores.items() if s > 0.4]
high_score_n.sort(key=lambda x: x[1], reverse=True)
for n, s in high_score_n[:20]:
    print(f'  N={n}: {s:.6f}')

In [None]:
# The jiweiliu kernel's key techniques:
# 1. Grid-based initialization: ncols x nrows x 2 trees per cell
# 2. SA optimization on grid configurations
# 3. Deletion cascade: starting from an N-tree config, repeatedly remove the tree whose
#    removal gives the smallest bounding box, yielding candidates for N-1, N-2, ...

# Let's understand the grid configurations
print('\nGrid configurations that could be generated:')
for ncols in range(1, 11):
    for nrows in range(1, 11):
        n_base = 2 * ncols * nrows
        if n_base <= 200:
            print(f'  {ncols}x{nrows} grid: {n_base} trees (base)')
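
One plausible reading of "ncols x nrows x 2 trees per cell" is a cell holding one upright and one inverted tree. The sketch below is an illustration under that assumption, not the jiweiliu kernel's code. The function name `grid_init` and the spacings `cell_w` and `cell_h` are hypothetical, chosen conservatively so that the trees' bounding boxes never touch; an SA pass would then be expected to tighten the spacing.

```python
import numpy as np

def grid_init(ncols, nrows, cell_w=1.42, cell_h=1.02):
    """Place 2 * ncols * nrows trees on a grid (hypothetical sketch).

    Each cell holds an upright tree (0 deg) and an inverted tree (180 deg).
    The default spacings keep bounding boxes (0.7 wide, 1.0 tall) disjoint,
    so the layout is overlap-free but far from tight.
    """
    xs, ys, degs = [], [], []
    for row in range(nrows):
        for col in range(ncols):
            cx, cy = col * cell_w, row * cell_h
            # Upright tree: vertices span y in [cy - 0.2, cy + 0.8]
            xs.append(cx)
            ys.append(cy)
            degs.append(0.0)
            # Inverted tree: spans y in [y - 0.8, y + 0.2], so shifting it
            # up by 0.6 aligns its extent with the upright tree's extent
            xs.append(cx + cell_w / 2)
            ys.append(cy + 0.6)
            degs.append(180.0)
    return np.array(xs), np.array(ys), np.array(degs)

# A 3x2 grid yields 2 * 3 * 2 = 12 trees
xs, ys, degs = grid_init(3, 2)
print(f'Generated {len(xs)} trees')
```

Sizes between grid totals (for example N=97 from a 7x7 grid of 98 trees) would come from deleting trees rather than from the grid itself.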

## Key Findings

1. **Current score**: 70.675510, gap to target is 1.75 points
2. **Worst N values**: N=1, N=2, N=3 contribute most to the score
3. **jiweiliu kernel techniques**:
   - Grid-based initialization generates NOVEL configurations
   - Deletion cascade propagates good large configs to smaller sizes
   - Claims ~0.15 improvement in under 2 minutes
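
The deletion cascade can be sketched as below. This is a greedy illustration based on the description above, not the kernel's actual code; the names `deletion_cascade`, `tree_vertices`, `side_of`, and the `best_side` dictionary are hypothetical. Removing a tree from an overlap-free packing always leaves an overlap-free packing, so no collision check is needed.

```python
import numpy as np

TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
               -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
               -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

def tree_vertices(xs, ys, degs):
    """Return (n, 15) arrays of vertex coordinates, one row per tree."""
    rad = np.radians(degs)[:, None]
    c, s = np.cos(rad), np.sin(rad)
    return TX * c - TY * s + xs[:, None], TX * s + TY * c + ys[:, None]

def side_of(px, py):
    return max(px.max() - px.min(), py.max() - py.min())

def deletion_cascade(xs, ys, degs, best_side):
    """Greedily shrink a config one tree at a time.

    best_side maps n -> best known bounding side and is updated in place.
    Returns {n: (xs, ys, degs)} for every n where the cascade improved it.
    """
    xs, ys, degs = (np.asarray(a, dtype=float) for a in (xs, ys, degs))
    px, py = tree_vertices(xs, ys, degs)
    keep = np.arange(len(xs))
    improved = {}
    while len(keep) > 1:
        # Bounding side after deleting each remaining tree in turn
        sides = [side_of(np.delete(px[keep], i, axis=0),
                         np.delete(py[keep], i, axis=0))
                 for i in range(len(keep))]
        i_best = int(np.argmin(sides))
        keep = np.delete(keep, i_best)
        n = len(keep)
        if sides[i_best] < best_side.get(n, np.inf):
            best_side[n] = sides[i_best]
            improved[n] = (xs[keep], ys[keep], degs[keep])
    return improved

# Toy example: four unrotated trees in a row, centers 0.8 apart
best_side = {}
improved = deletion_cascade([0.0, 0.8, 1.6, 2.4], [0.0] * 4, [0.0] * 4, best_side)
for n in sorted(improved, reverse=True):
    print(f'N={n}: side={best_side[n]:.4f}, score={best_side[n] ** 2 / n:.4f}')
```

In practice `best_side` would be seeded from the current submission, so that only genuine improvements are written back. The greedy choice is cheap but not optimal; it tries each single deletion per level and never revisits earlier choices.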

## Recommended Next Steps

1. **IMPLEMENT jiweiliu deletion cascade** - This is the highest priority
2. **IMPLEMENT grid-based initialization with SA** - Generates novel solutions
3. **IMPLEMENT guided refinement from sacuscreed** - Iterative improvement

The target of 68.92 is BELOW the current #1 on the leaderboard (71.19). This means we need to GENERATE fundamentally new solutions, not just optimize existing ones.