# Loop 6 Analysis: Deletion Cascade and Grid-Based Initialization

## Current Status
- Best CV: 70.675510 (exp_005 corner extraction)
- Best LB: 70.676145 (exp_003, exp_004)
- Target: 68.922808
- Gap: 1.75 points (2.5% reduction needed)

## Key Insight from jiweiliu kernel
The jiweiliu kernel claims a score improvement of ~0.15 in under 2 minutes using:
1. Grid-based initialization with SA
2. Deletion cascade to propagate good large configs to smaller sizes
3. Iterative mixing with guided refinement

In [1]:
import pandas as pd
import numpy as np
import json

# Load current submission
df = pd.read_csv('/home/submission/submission.csv')
print(f'Loaded {len(df)} rows')

Loaded 20100 rows


In [2]:
# Tree polygon template
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

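def strip(val):
    """Drop the 's' marker from a submission value and return it as a float."""
    return float(str(val).replace('s', ''))

def score_group(xs, ys, degs):
    """Return side^2 / n for the bounding square of n placed trees."""
    n = len(xs)
    all_px, all_py = [], []
    for x, y, deg in zip(xs, ys, degs):
        # Rotate the template by deg degrees, then translate to (x, y)
        rad = np.radians(deg)
        c, s = np.cos(rad), np.sin(rad)
        px = TX * c - TY * s + x
        py = TX * s + TY * c + y
        all_px.extend(px)
        all_py.extend(py)
    # Side of the smallest axis-aligned square covering every vertex
    side = max(max(all_px) - min(all_px), max(all_py) - min(all_py))
    return side * side / n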

# Calculate per-N scores
scores = {}
for n in range(1, 201):
    group = df[df['id'].str.startswith(f'{n:03d}_')]
    xs = np.array([strip(x) for x in group['x']])
    ys = np.array([strip(y) for y in group['y']])
    degs = np.array([strip(d) for d in group['deg']])
    scores[n] = score_group(xs, ys, degs)

total = sum(scores.values())
print(f'Total score: {total:.6f}')
print(f'Target: 68.922808')
print(f'Gap: {total - 68.922808:.6f}')

Total score: 70.675510
Target: 68.922808
Gap: 1.752702


In [3]:
# Analyze which N values need the most improvement
worst_n = sorted(scores.items(), key=lambda x: x[1], reverse=True)[:30]
print('\nTop 30 worst N values (highest contribution to score):')
print('N\tScore\t\tCumulative')
cumulative = 0
for n, score in worst_n:
    cumulative += score
    print(f'{n}\t{score:.6f}\t{cumulative:.6f}')


Top 30 worst N values (highest contribution to score):
N	Score		Cumulative
1	0.661250	0.661250
2	0.450779	1.112029
3	0.434745	1.546774
5	0.416850	1.963624
4	0.416545	2.380169
7	0.399897	2.780065
6	0.399610	3.179676
9	0.387415	3.567091
8	0.385407	3.952498
15	0.379203	4.331701
10	0.376630	4.708331
21	0.376451	5.084782
20	0.376057	5.460839
11	0.375736	5.836575
22	0.375258	6.211833
16	0.374128	6.585961
26	0.373997	6.959958
12	0.372724	7.332682
13	0.372323	7.705005
25	0.372144	8.077149
14	0.370569	8.447718
31	0.370329	8.818048
17	0.370040	9.188088
43	0.370040	9.558128
37	0.369528	9.927656
33	0.369358	10.297014
34	0.369041	10.666055
18	0.368771	11.034826
23	0.368752	11.403578
19	0.368615	11.772193


In [4]:
# Rough reference values for the per-N score (side^2 / N)
# Tree bounding box at 0 degrees: width 0.7, height 1.0, so area 0.7
# Packing N bounding boxes without interlocking needs area ~0.7 * N,
# i.e. side ~sqrt(0.7 * N) and score = side^2 / N ~0.7
# That is a loose upper reference, not a minimum: interlocking trees
# already average ~0.35 here. The true lower bound per N is the polygon
# area of one tree (0.245625 by the shoelace formula on TX, TY).

print('\nTheoretical analysis:')
print('If all N values achieved score 0.35 (very tight packing):')
print(f'  Total would be: {0.35 * 200:.2f}')
print('\nCurrent average score per N:', total / 200)
print('Target average score per N:', 68.922808 / 200)


Theoretical analysis:
If all N values achieved score 0.35 (very tight packing):
  Total would be: 70.00

Current average score per N: 0.3533775524371157
Target average score per N: 0.34461404


In [5]:
# Identify N values where we might find improvements
# The jiweiliu kernel uses grid-based initialization which works well for larger N
# Let's see which N values have the worst efficiency

print('\nN values with worst efficiency (score > 0.4):')
high_score_n = [(n, s) for n, s in scores.items() if s > 0.4]
high_score_n.sort(key=lambda x: x[1], reverse=True)
for n, s in high_score_n[:20]:
    print(f'  N={n}: {s:.6f}')


N values with worst efficiency (score > 0.4):
  N=1: 0.661250
  N=2: 0.450779
  N=3: 0.434745
  N=5: 0.416850
  N=4: 0.416545


In [6]:
# The jiweiliu kernel's key techniques:
# 1. Grid-based initialization: ncols x nrows x 2 trees per cell
# 2. SA optimization on grid configurations
# 3. Deletion cascade: iteratively remove the tree whose removal leaves the smallest bounding box

# Let's understand the grid configurations
print('\nGrid configurations that could be generated:')
for ncols in range(1, 11):
    for nrows in range(1, 11):
        n_base = 2 * ncols * nrows
        if n_base <= 200:
            print(f'  {ncols}x{nrows} grid: {n_base} trees (base)')


Grid configurations that could be generated:
  1x1 grid: 2 trees (base)
  1x2 grid: 4 trees (base)
  1x3 grid: 6 trees (base)
  1x4 grid: 8 trees (base)
  1x5 grid: 10 trees (base)
  1x6 grid: 12 trees (base)
  1x7 grid: 14 trees (base)
  1x8 grid: 16 trees (base)
  1x9 grid: 18 trees (base)
  1x10 grid: 20 trees (base)
  2x1 grid: 4 trees (base)
  2x2 grid: 8 trees (base)
  2x3 grid: 12 trees (base)
  2x4 grid: 16 trees (base)
  2x5 grid: 20 trees (base)
  2x6 grid: 24 trees (base)
  2x7 grid: 28 trees (base)
  2x8 grid: 32 trees (base)
  2x9 grid: 36 trees (base)
  2x10 grid: 40 trees (base)
  3x1 grid: 6 trees (base)
  3x2 grid: 12 trees (base)
  3x3 grid: 18 trees (base)
  3x4 grid: 24 trees (base)
  3x5 grid: 30 trees (base)
  3x6 grid: 36 trees (base)
  3x7 grid: 42 trees (base)
  3x8 grid: 48 trees (base)
  3x9 grid: 54 trees (base)
  3x10 grid: 60 trees (base)
  4x1 grid: 8 trees (base)
  4x2 grid: 16 trees (base)
  4x3 grid: 24 trees (base)
  4x4 grid: 32 trees (base)
  4x5 grid: 40 trees (base)
  ...
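Every grid holds an even number of trees (2 per cell), so most N values have no grid of exactly that size and would have to be reached by deleting trees from a larger one. As a small sketch of that lookup, the hypothetical helper `candidate_grids` below lists the grids whose base size is at least N, smallest surplus first. The 10x10 limit mirrors the loop above; the ordering rule is an assumption made here, not something taken from the kernel.

```python
def candidate_grids(n, max_dim=10, max_trees=200):
    """List (ncols, nrows, n_base) with n <= n_base <= max_trees.

    Sorted by how many trees would have to be deleted to reach n,
    then by ncols and nrows.
    """
    found = []
    for ncols in range(1, max_dim + 1):
        for nrows in range(1, max_dim + 1):
            n_base = 2 * ncols * nrows  # 2 trees per cell
            if n <= n_base <= max_trees:
                found.append((n_base - n, ncols, nrows))
    found.sort()
    return [(ncols, nrows, 2 * ncols * nrows) for _, ncols, nrows in found]

# Example: N=7 has no exact grid, the closest bases hold 8 trees
for ncols, nrows, n_base in candidate_grids(7)[:3]:
    print(f'{ncols}x{nrows} grid: {n_base} trees, delete {n_base - 7}')
```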

## Key Findings

1. **Current score**: 70.675510, gap to target is 1.75 points
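2. **Worst N values**: N=1 (0.661), N=2 (0.451), and N=3 (0.435) have the highest per-N scores; only N=1 through N=5 exceed 0.4, and the 30 worst N values together account for 11.77 of the 70.68 total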
3. **jiweiliu kernel techniques**:
   - Grid-based initialization generates NOVEL configurations
   - Deletion cascade propagates good large configs to smaller sizes
   - Claims ~0.15 improvement in under 2 minutes
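The deletion cascade described above can be sketched as a greedy loop: starting from a valid layout of n trees, drop the tree whose removal leaves the smallest bounding square, record the result if it beats the best known score for n-1, and repeat. Removing a tree can never create an overlap, so no collision check is needed. This is a minimal sketch under those assumptions, not the kernel's actual code; `tree_bounds`, `square_side`, and `deletion_cascade` are names introduced here.

```python
import numpy as np

# Tree polygon template, copied from cell [2]
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
               -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
               -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

def tree_bounds(x, y, deg):
    """Axis-aligned bounds (min_x, max_x, min_y, max_y) of one placed tree."""
    rad = np.radians(deg)
    c, s = np.cos(rad), np.sin(rad)
    px = TX * c - TY * s + x
    py = TX * s + TY * c + y
    return px.min(), px.max(), py.min(), py.max()

def square_side(bounds):
    """Side of the smallest axis-aligned square covering all given bounds."""
    b = np.asarray(bounds)
    return max(b[:, 1].max() - b[:, 0].min(), b[:, 3].max() - b[:, 2].min())

def deletion_cascade(trees, best_scores):
    """Greedy cascade from len(trees) down to 1.

    trees: list of (x, y, deg) forming a valid layout.
    best_scores: dict mapping n to the best known side^2 / n.
    Returns {n: (score, trees)} for every n where the cascade is better.
    """
    trees = list(trees)
    bounds = [tree_bounds(*t) for t in trees]
    improved = {}
    while len(trees) > 1:
        # Side of the bounding square after removing each tree in turn
        sides = [square_side(bounds[:i] + bounds[i + 1:])
                 for i in range(len(trees))]
        i = int(np.argmin(sides))
        del trees[i]
        del bounds[i]
        n = len(trees)
        score = sides[i] ** 2 / n
        if score < best_scores.get(n, np.inf):
            improved[n] = (score, list(trees))
    return improved

# Toy example: the outlier at x=5 is removed first
demo = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
for n, (score, kept) in sorted(deletion_cascade(demo, {}).items(), reverse=True):
    print(n, round(score, 4), kept)
```

In the notebook, `best_scores` would be the `scores` dict from cell [2]. Each step costs O(n^2) bound comparisons, which stays cheap for n up to 200.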

## Recommended Next Steps

1. **IMPLEMENT jiweiliu deletion cascade** - This is the highest priority
2. **IMPLEMENT grid-based initialization with SA** - Generates novel solutions
3. **IMPLEMENT guided refinement from sacuscreed** - Iterative improvement
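As a starting point for step 2, the sketch below builds a loose, overlap-free grid layout with 2 trees per cell that an SA loop could then tighten. The cell layout (one upright tree next to one rotated by 180 degrees), the spacing, and the `gap` parameter are all assumptions made here for illustration; the kernel's real cell geometry is not shown in this notebook.

```python
import numpy as np

# Tree polygon template, copied from cell [2]
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
               -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
               -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

def grid_init(ncols, nrows, gap=0.01):
    """Return 2 * ncols * nrows trees as (x, y, deg) on a rectangular grid.

    Spacing is derived from the 0.7 x 1.0 bounding box of one tree, so
    bounding boxes never overlap. This is deliberately loose: it only
    provides a valid start for later optimization.
    """
    cell_w = 1.4 + 2 * gap   # two bounding boxes side by side
    cell_h = 1.0 + gap       # one bounding box high
    trees = []
    for col in range(ncols):
        for row in range(nrows):
            x0, y0 = col * cell_w, row * cell_h
            # Upright tree spans y0 - 0.2 .. y0 + 0.8
            trees.append((x0, y0, 0.0))
            # Flipped tree shifted up by 0.6 so it spans the same y range
            trees.append((x0 + 0.7 + gap, y0 + 0.6, 180.0))
    return trees

def layout_score(trees):
    """side^2 / n for the bounding square, same formula as score_group."""
    px, py = [], []
    for x, y, deg in trees:
        rad = np.radians(deg)
        c, s = np.cos(rad), np.sin(rad)
        px.extend(TX * c - TY * s + x)
        py.extend(TX * s + TY * c + y)
    side = max(max(px) - min(px), max(py) - min(py))
    return side * side / len(trees)

start = grid_init(2, 3)
print(f'{len(start)} trees, initial score {layout_score(start):.4f}')
```

The initial score is far above the ~0.35 level of the current submission, which is expected: the value of the grid lies in giving SA a regular structure to compress, not in the starting layout itself.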

The target of 68.92 is BELOW the current #1 on the leaderboard (71.19). This means we need to GENERATE fundamentally new solutions, not just optimize existing ones.