# Loop 29 Strategic Analysis

## Current Situation
- **Current best**: 70.316492
- **Target**: 68.873342 (gap: 1.44 points; 2.10% of the target, 2.05% of the current score)
- **Current LB #1**: 68.894566 (gap: 1.42 points)
- **Experiments done**: 30
- **Submissions used**: 13/100

## Key Insight
Our target (68.873342) is actually LOWER than the current leaderboard leader (68.894566).
This is an aggressive target that would put us at #1 if achieved.

In [1]:
import pandas as pd
import numpy as np
from numba import njit
import math

# Load current best submission
df = pd.read_csv('/home/submission/submission.csv')
df['N'] = df['id'].str.split('_').str[0].astype(int)

@njit
def make_polygon_template():
    tw=0.15; th=0.2; bw=0.7; mw=0.4; ow=0.25
    tip=0.8; t1=0.5; t2=0.25; base=0.0; tbot=-th
    x=np.array([0,ow/2,ow/4,mw/2,mw/4,bw/2,tw/2,tw/2,-tw/2,-tw/2,-bw/2,-mw/4,-mw/2,-ow/4,-ow/2],np.float64)
    y=np.array([tip,t1,t1,t2,t2,base,base,tbot,tbot,base,base,t2,t2,t1,t1],np.float64)
    return x,y

@njit
def score_group(xs,ys,degs,tx,ty):
    n=xs.size; V=tx.size
    mnx=1e300; mny=1e300; mxx=-1e300; mxy=-1e300
    for i in range(n):
        r=degs[i]*math.pi/180.0
        c=math.cos(r); s=math.sin(r)
        xi=xs[i]; yi=ys[i]
        for j in range(V):
            X=c*tx[j]-s*ty[j]+xi
            Y=s*tx[j]+c*ty[j]+yi
            if X<mnx: mnx=X
            if X>mxx: mxx=X
            if Y<mny: mny=Y
            if Y>mxy: mxy=Y
    side=max(mxx-mnx,mxy-mny)
    return side*side/n

def strip(a):
    return np.array([float(str(v).replace('s','')) for v in a],np.float64)

tx, ty = make_polygon_template()

# Calculate per-N scores
scores = []
for n in range(1, 201):
    g = df[df['N'] == n]
    xs = strip(g['x'].to_numpy())
    ys = strip(g['y'].to_numpy())
    ds = strip(g['deg'].to_numpy())
    sc = score_group(xs, ys, ds, tx, ty)
    scores.append((n, sc))

total = sum(s for _, s in scores)
print(f'Current total: {total:.6f}')
print(f'Target: 68.873342')
print(f'Gap: {total - 68.873342:.6f}')
print(f'Gap per N: {(total - 68.873342)/200:.6f}')

Current total: 70.316492
Target: 68.873342
Gap: 1.443150
Gap per N: 0.007216
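
As a cross-check on the scoring above, here is a minimal pure-NumPy sketch of the same bounding-square score, without `numba`. The name `score_group_np` and the hard-coded `TX`/`TY` arrays are illustrative assumptions: the arrays are the values that `make_polygon_template()` evaluates to. A single tree rotated by 45 degrees reproduces the 0.661250 reported for N=1; whether the submission uses exactly that angle is not confirmed here.

```python
import numpy as np

# Template vertices, written out from make_polygon_template() above.
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
               -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125], dtype=np.float64)
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0.0, 0.0, -0.2,
               -0.2, 0.0, 0.0, 0.25, 0.25, 0.5, 0.5], dtype=np.float64)


def score_group_np(xs, ys, degs):
    """Bounding-square side squared, divided by the number of trees."""
    xs = np.asarray(xs, dtype=np.float64)
    ys = np.asarray(ys, dtype=np.float64)
    r = np.deg2rad(np.asarray(degs, dtype=np.float64))[:, None]
    c, s = np.cos(r), np.sin(r)
    # Rotate every template vertex, then translate; result shape is (n_trees, n_vertices).
    X = c * TX - s * TY + xs[:, None]
    Y = s * TX + c * TY + ys[:, None]
    side = max(X.max() - X.min(), Y.max() - Y.min())
    return side * side / xs.size


print(score_group_np([0.0], [0.0], [0.0]))   # upright tree: 0.7 wide, 1.0 tall, so 1.0
print(score_group_np([0.0], [0.0], [45.0]))  # approximately 0.66125
```

Comparing this against `score_group` on a few groups is a cheap way to rule out a scoring bug before spending compute on optimization.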


In [2]:
# Analyze where improvements might come from
import matplotlib.pyplot as plt

# Per-N scores
n_values = [s[0] for s in scores]
sc_values = [s[1] for s in scores]

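# Lower bound on any per-N score: the area of one tree polygon.
# The bounding square must hold n non-overlapping trees, so side^2 / n >= tree_area.
tree_area = 0.245625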

# Calculate efficiency per N
efficiencies = [tree_area / sc for _, sc in scores]

print('=== Per-N Analysis ===')
print(f'Average score per N: {np.mean(sc_values):.6f}')
print(f'Min score (best): {min(sc_values):.6f} at N={n_values[sc_values.index(min(sc_values))]}')
print(f'Max score (worst): {max(sc_values):.6f} at N={n_values[sc_values.index(max(sc_values))]}')
print(f'Average efficiency: {np.mean(efficiencies)*100:.2f}%')
print(f'Min efficiency: {min(efficiencies)*100:.2f}%')
print(f'Max efficiency: {max(efficiencies)*100:.2f}%')

=== Per-N Analysis ===
Average score per N: 0.351582
Min score (best): 0.329259 at N=181
Max score (worst): 0.661250 at N=1
Average efficiency: 70.16%
Min efficiency: 37.15%
Max efficiency: 74.60%
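
The constant `tree_area = 0.245625` can be checked with the shoelace formula applied to the template vertices. The sketch below is an illustration only: `polygon_area` is a hypothetical helper, and the vertex arrays are the values that `make_polygon_template()` evaluates to.

```python
import numpy as np

# Template vertices, written out from make_polygon_template() above.
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
               -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125], dtype=np.float64)
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0.0, 0.0, -0.2,
               -0.2, 0.0, 0.0, 0.25, 0.25, 0.5, 0.5], dtype=np.float64)


def polygon_area(x, y):
    """Shoelace formula: unsigned area of a simple polygon given its vertices in order."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # np.roll(..., -1) pairs each vertex with its successor, wrapping around at the end.
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


area = polygon_area(TX, TY)
print(f"{area:.6f}")  # 0.245625
```

The same value follows by hand from the three tiers plus the trunk: 0.0375 + 0.065625 + 0.1125 + 0.03 = 0.245625.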


In [3]:
# What improvement per N do we need to reach target?
target = 68.873342
current = total
gap = current - target

print(f'\n=== Gap Analysis ===')
print(f'Total gap: {gap:.6f}')
print(f'Average gap per N: {gap/200:.6f}')
print(f'Percentage improvement needed: {gap/current*100:.2f}%')

# If we improved each N by the same percentage
improvement_pct = gap / current
print(f'\nIf we improved each N by {improvement_pct*100:.2f}%:')
for n, sc in scores[:10]:
    new_sc = sc * (1 - improvement_pct)
    print(f'  N={n}: {sc:.6f} -> {new_sc:.6f} (save {sc - new_sc:.6f})')


=== Gap Analysis ===
Total gap: 1.443150
Average gap per N: 0.007216
Percentage improvement needed: 2.05%

If we improved each N by 2.05%:
  N=1: 0.661250 -> 0.647679 (save 0.013571)
  N=2: 0.450779 -> 0.441528 (save 0.009252)
  N=3: 0.434745 -> 0.425823 (save 0.008923)
  N=4: 0.416545 -> 0.407996 (save 0.008549)
  N=5: 0.416850 -> 0.408294 (save 0.008555)
  N=6: 0.399610 -> 0.391409 (save 0.008201)
  N=7: 0.399842 -> 0.391636 (save 0.008206)
  N=8: 0.385407 -> 0.377497 (save 0.007910)
  N=9: 0.383047 -> 0.375185 (save 0.007862)
  N=10: 0.376630 -> 0.368900 (save 0.007730)


In [4]:
# Identify N values with most room for improvement
# Compare to theoretical minimum
print('\n=== N values with most room for improvement ===')
print('(Sorted by gap from theoretical minimum)')

gaps_from_min = [(n, sc, sc - tree_area, (sc - tree_area)/sc * 100) for n, sc in scores]
gaps_from_min.sort(key=lambda x: x[2], reverse=True)

print('\nTop 20 N values with largest gap from theoretical minimum:')
for n, sc, gap, pct in gaps_from_min[:20]:
    # pct is the gap as a share of the score, i.e. the empty fraction of the bounding square
    print(f'  N={n:3d}: score={sc:.6f}, gap={gap:.6f} ({pct:.1f}% of score)')


=== N values with most room for improvement ===
(Sorted by gap from theoretical minimum)

Top 20 N values with largest gap from theoretical minimum:
  N=  1: score=0.661250, gap=0.415625 (62.9% of score)
  N=  2: score=0.450779, gap=0.205154 (45.5% of score)
  N=  3: score=0.434745, gap=0.189120 (43.5% of score)
  N=  5: score=0.416850, gap=0.171225 (41.1% of score)
  N=  4: score=0.416545, gap=0.170920 (41.0% of score)
  N=  7: score=0.399842, gap=0.154217 (38.6% of score)
  N=  6: score=0.399610, gap=0.153985 (38.5% of score)
  N=  8: score=0.385407, gap=0.139782 (36.3% of score)
  N=  9: score=0.383047, gap=0.137422 (35.9% of score)
  N= 10: score=0.376630, gap=0.131005 (34.8% of score)
  N= 11: score=0.374921, gap=0.129296 (34.5% of score)
  N= 15: score=0.374381, gap=0.128756 (34.4% of score)
  N= 12: score=0.372724, gap=0.127099 (34.1% of score)
  N= 13: score=0.372267, gap=0.126642 (34.0% of score)
  N= 20: s

In [5]:
# What would it take to reach target?
print('\n=== Scenarios to reach target ===')

# Recompute the gap to target first: the loop in the previous cell reuses the
# name `gap`, which leaves it holding a single per-N gap from the tree area
# instead of the 1.443150 gap to target.
gap = total - target

# Scenario 1: Improve all N equally
print('\nScenario 1: Improve all N equally')
needed_per_n = gap / 200
print(f'  Need to save {needed_per_n:.6f} per N on average')
print(f'  That\'s {needed_per_n/np.mean(sc_values)*100:.2f}% improvement per N')

# Scenario 2: Focus on worst N values
print('\nScenario 2: Focus on worst 50 N values')
worst_50 = sorted(scores, key=lambda x: x[1], reverse=True)[:50]
worst_50_total = sum(s for _, s in worst_50)
print(f'  Worst 50 N values contribute: {worst_50_total:.6f}')
print(f'  Need to save {gap:.6f} from these 50 N values')
print(f'  That\'s {gap/worst_50_total*100:.2f}% improvement on worst 50')

# Scenario 3: Focus on small N (highest individual scores)
print('\nScenario 3: Focus on small N (N=1-20)')
small_n = [s for s in scores if s[0] <= 20]
small_n_total = sum(s for _, s in small_n)
print(f'  N=1-20 contribute: {small_n_total:.6f} ({small_n_total/total*100:.1f}% of total)')
print(f'  If we could save {gap:.6f} from N=1-20:')
print(f'  That\'s {gap/small_n_total*100:.2f}% improvement needed')


=== Scenarios to reach target ===

Scenario 1: Improve all N equally
  Need to save 0.007216 per N on average
  That's 2.05% improvement per N

Scenario 2: Focus on worst 50 N values
  Worst 50 N values contribute: 18.956859
  Need to save 1.443150 from these 50 N values
  That's 7.61% improvement on worst 50

Scenario 3: Focus on small N (N=1-20)
  N=1-20 contribute: 8.037953 (11.4% of total)
  If we could save 1.443150 from N=1-20:
  That's 17.95% improvement needed


## Key Observations

1. **The gap is 1.44 points** (2.05% of the current score) - closing it requires a significant improvement
2. **Top teams have 100-953 submissions** - they iterate heavily to find improvements
3. **Our current score (70.316) is BETTER than all public kernels** - we've already optimized beyond public solutions
4. **The target (68.873) would be #1 on the leaderboard** - this is an aggressive target

## What Top Teams Likely Do

1. **Extended optimization runs** - Days of compute, not hours
2. **Per-N specialization** - Different algorithms for different N ranges
3. **Private solutions** - Solutions shared in private channels (Telegram, Discord)
4. **Massive iteration** - 953 submissions means constant refinement

## Our Options

1. **Run optimization for MUCH longer** (8-24 hours, not 64 minutes)
2. **Try fundamentally different algorithms** (constraint programming, SAT solvers)
3. **Focus on specific N values** where we might find improvements
4. **Search for external solutions** that might be better than what we have
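
Options 3 and 4 can be combined by merging sources per N: for each N, keep whichever source scores lowest. The sketch below is only an illustration under stated assumptions. `pick_best_per_n` is a hypothetical helper, the per-N scores are assumed to be precomputed (for example with `score_group`), and the toy numbers are made up. Any group taken from another source would still need its own overlap validation before submission.

```python
def pick_best_per_n(candidates):
    """Pick, for every N, the candidate source with the lowest score.

    candidates: dict mapping a source name to a dict {N: score}.
    Returns (choice, total), where choice maps N -> (source name, score).
    """
    all_n = sorted(set().union(*(per_n.keys() for per_n in candidates.values())))
    choice = {}
    for n in all_n:
        # Only sources that actually contain this N are eligible.
        eligible = [name for name, per_n in candidates.items() if n in per_n]
        # On ties, min() keeps the first eligible source in insertion order.
        best = min(eligible, key=lambda name: candidates[name][n])
        choice[n] = (best, candidates[best][n])
    total = sum(score for _, score in choice.values())
    return choice, total


# Toy, made-up scores purely for illustration (not real results).
toy = {
    "current": {1: 0.70, 2: 0.45, 3: 0.43},
    "external": {1: 0.66, 2: 0.47, 3: 0.43},
}
choice, total = pick_best_per_n(toy)
print(choice)
print(f"{total:.2f}")
```

Because the total is a sum of independent per-N terms, the merged total can never be worse than the best single source, which makes this a low-risk first step before committing to longer optimization runs.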