# Loop 20 LB Feedback Analysis

**exp_019 scored LB=70.3434** (perfect CV-LB match)

The evaluator identified that the MIN_IMPROVEMENT=0.001 threshold was too conservative, leaving about 0.027 points on the table (70.343408 vs. 70.316579 for the re-run ensemble).

In [1]:
import pandas as pd
import numpy as np
import json

# Load session state
with open('/home/code/session_state.json', 'r') as f:
    state = json.load(f)

print("=== SUBMISSION HISTORY ===")
for exp in state['experiments'][-10:]:
    lb = exp.get('lb_score', 'pending')
    print(f"{exp['name']}: CV={exp['cv_score']:.6f}, LB={lb}")

TARGET = 68.876781   # target score
BEST_LB = 70.3434    # LB score of exp_019
gap = BEST_LB - TARGET
print(f"\nTarget: {TARGET:.6f}")
print(f"Best LB: {BEST_LB:.4f}")
print(f"Gap: {gap:.4f} points ({gap / TARGET * 100:.2f}%)")
print(f"\nSubmissions used: {100 - state['remaining_submissions']}/100")

=== SUBMISSION HISTORY ===
010_safe_ensemble: CV=70.365091, LB=None
011_small_n_optimization: CV=70.364468, LB=None
012_mega_ensemble: CV=70.365091, LB=None
013_selective_threshold: CV=70.342140, LB=None
014_conservative_ensemble: CV=70.365091, LB=None
015_bbox3_aggressive: CV=70.365047, LB=None
016_mega_ensemble_external: CV=70.353516, LB=None
017_extended_ensemble: CV=70.353516, LB=None
018_genetic_algorithm: CV=70.353516, LB=None
019_comprehensive_external_ensemble: CV=70.343408, LB=None

Target: 68.876781
Best LB: 70.3434
Gap: 1.4666 points (2.13%)

Submissions used: 1/100


In [2]:
# Verify our new ensemble score
from numba import njit
import math

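@njit
def make_polygon_template():
    # Widths: trunk (tw), bottom tier (bw), middle tier (mw), top tier (ow); th is the trunk height
    tw=0.15; th=0.2; bw=0.7; mw=0.4; ow=0.25
    # Heights: tip, the two tier boundaries (t1, t2), tier base, trunk bottom
    tip=0.8; t1=0.5; t2=0.25; base=0.0; tbot=-th
    # 15 vertices: start at the tip, go down the right side, across the trunk, back up the left side
    x=np.array([0,ow/2,ow/4,mw/2,mw/4,bw/2,tw/2,tw/2,-tw/2,-tw/2,-bw/2,-mw/4,-mw/2,-ow/4,-ow/2],np.float64)
    y=np.array([tip,t1,t1,t2,t2,base,base,tbot,tbot,base,base,t2,t2,t1,t1],np.float64)
    return x,y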

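@njit
def score_group(xs,ys,degs,tx,ty):
    # Score of one group: area of the axis-aligned bounding square divided by the tree count
    n=xs.size; V=tx.size
    mnx=1e300; mny=1e300; mxx=-1e300; mxy=-1e300
    for i in range(n):
        # Rotate the template by degs[i] and translate it to (xs[i], ys[i])
        r=degs[i]*math.pi/180.0
        c=math.cos(r); s=math.sin(r)
        xi=xs[i]; yi=ys[i]
        for j in range(V):
            X=c*tx[j]-s*ty[j]+xi
            Y=s*tx[j]+c*ty[j]+yi
            if X<mnx: mnx=X
            if X>mxx: mxx=X
            if Y<mny: mny=Y
            if Y>mxy: mxy=Y
    # Side of the smallest axis-aligned square that contains every vertex
    side=max(mxx-mnx,mxy-mny)
    return side*side/n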

def strip(a):
    # Submission values are strings carrying an 's' marker; drop it and parse as float
    return np.array([float(str(v).replace('s','')) for v in a],np.float64)

tx, ty = make_polygon_template()

# Check current submission
df = pd.read_csv('/home/submission/submission.csv')
df['N'] = df['id'].str.split('_').str[0].astype(int)

total = 0
for n in range(1, 201):
    g = df[df['N'] == n]
    xs = strip(g['x'].to_numpy())
    ys = strip(g['y'].to_numpy())
    ds = strip(g['deg'].to_numpy())
    sc = score_group(xs, ys, ds, tx, ty)
    total += sc

print(f"Current submission score: {total:.6f}")
print(f"Improvement over exp_019: {70.343408 - total:.6f}")
print(f"Gap to target: {total - 68.876781:.6f} ({(total - 68.876781)/68.876781*100:.2f}%)")

Current submission score: 70.316579
Improvement over exp_019: 0.026829
Gap to target: 1.439798 (2.09%)
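
The per-group score computed above is the squared side of the axis-aligned bounding square divided by the number of trees, summed over N = 1..200. As a cross-check that does not depend on numba, the same quantity can be sketched in plain NumPy. The names `tree_template` and `score_group_numpy` are illustrative and not part of the original notebook; the vertex values are copied from `make_polygon_template`.

```python
import numpy as np

def tree_template():
    """Tree outline vertices, copied from make_polygon_template."""
    tw, th, bw, mw, ow = 0.15, 0.2, 0.7, 0.4, 0.25
    tip, t1, t2, base, tbot = 0.8, 0.5, 0.25, 0.0, -th
    x = np.array([0, ow/2, ow/4, mw/2, mw/4, bw/2, tw/2, tw/2,
                  -tw/2, -tw/2, -bw/2, -mw/4, -mw/2, -ow/4, -ow/2], dtype=np.float64)
    y = np.array([tip, t1, t1, t2, t2, base, base, tbot,
                  tbot, base, base, t2, t2, t1, t1], dtype=np.float64)
    return x, y

def score_group_numpy(xs, ys, degs, tx, ty):
    """Bounding-square area divided by tree count, vectorized over all vertices."""
    r = np.deg2rad(np.asarray(degs, dtype=np.float64))[:, None]  # shape (n, 1)
    c, s = np.cos(r), np.sin(r)
    # Broadcasting gives one row of transformed vertices per tree: shape (n, V)
    X = c * tx - s * ty + np.asarray(xs, dtype=np.float64)[:, None]
    Y = s * tx + c * ty + np.asarray(ys, dtype=np.float64)[:, None]
    side = max(X.max() - X.min(), Y.max() - Y.min())
    return float(side * side / len(xs))

tx, ty = tree_template()
# A single unrotated tree is 0.7 wide and 1.0 tall, so its bounding square has side 1.0
print(score_group_numpy([0.0], [0.0], [0.0], tx, ty))
```

Running both implementations on the same group and comparing the results is a cheap way to catch a scoring bug before spending a submission.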


## Key Findings

1. **exp_019 achieved LB=70.3434** - matches the local score of 70.343408 to four decimal places, as expected since the score is a deterministic function of the submission
2. **New ensemble achieves 70.3166** - improvement of 0.027 over exp_019
3. **Gap to target: 1.44 points (2.1%)**

## What Worked
- Using why-not as base (validated, no overlaps)
- Lowering MIN_IMPROVEMENT threshold to 1e-10
- Comprehensive ensemble from all external sources
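
The effect of the threshold can be illustrated with a minimal sketch. The actual ensemble code is not shown in this notebook, so `select_per_group`, its arguments, and the toy scores below are all hypothetical; the sketch also assumes every candidate has already passed overlap validation.

```python
def select_per_group(base_scores, candidate_scores, min_improvement):
    """Per group N, keep the current best unless a candidate beats it by more than min_improvement.

    Hypothetical helper: base_scores maps N -> score, candidate_scores maps
    N -> list of scores from other (already validated) sources.
    Returns the chosen score per group and the number of better-but-rejected candidates.
    """
    chosen = {}
    rejected = 0
    for n, base in base_scores.items():
        best = base
        for cand in candidate_scores.get(n, []):
            if best - cand > min_improvement:
                best = cand          # large enough gain: accept
            elif cand < best:
                rejected += 1        # real gain, but below the threshold
        chosen[n] = best
    return chosen, rejected

# Toy numbers, made up for illustration only
base = {1: 1.0, 2: 0.5}
cands = {1: [0.9996], 2: [0.4999, 0.45]}

for thr in (0.001, 1e-10):
    chosen, rejected = select_per_group(base, cands, thr)
    print(f"threshold={thr:g}: total={sum(chosen.values()):.6f}, rejected={rejected}")
```

With a threshold of 0.001, sub-threshold gains are discarded even though they are valid. Since the total is a sum over 200 groups, many small gains of this kind can add up to a visible difference.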

## What Didn't Work
- MIN_IMPROVEMENT=0.001 was too conservative (rejected 48 valid improvements)
- Many external files have overlapping trees (2517 rejections)
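
The overlap validation itself is not shown in this notebook. One possible pure-Python sketch is given below, assuming two trees overlap when their outlines properly cross or when a vertex of one lies inside the other. The names `place_tree` and `polygons_overlap` are hypothetical, and degenerate cases (trees that only touch, or collinear edges) are not handled robustly; a production check would more likely rely on a geometry library.

```python
import math

# Tree outline copied from make_polygon_template (tip first, then down the right side)
_TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
       -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
_TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0.0, 0.0, -0.2,
       -0.2, 0.0, 0.0, 0.25, 0.25, 0.5, 0.5]

def place_tree(x, y, deg):
    """Vertices of one tree rotated by deg degrees and moved to (x, y)."""
    r = math.radians(deg)
    c, s = math.cos(r), math.sin(r)
    return [(c * a - s * b + x, s * a + c * b + y) for a, b in zip(_TX, _TY)]

def _cross(o, a, b):
    """z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def _segments_cross(p1, p2, q1, q2):
    """True only for a proper crossing; touching endpoints do not count."""
    d1, d2 = _cross(q1, q2, p1), _cross(q1, q2, p2)
    d3, d4 = _cross(p1, p2, q1), _cross(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def _point_inside(pt, poly):
    """Ray-casting point-in-polygon test (boundary points are ambiguous)."""
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > pt[1]) != (y2 > pt[1]):
            x_hit = x1 + (pt[1] - y1) * (x2 - x1) / (y2 - y1)
            if x_hit > pt[0]:
                inside = not inside
    return inside

def polygons_overlap(a, b):
    """Edge-crossing test first, then vertex containment in both directions."""
    n, m = len(a), len(b)
    for i in range(n):
        for j in range(m):
            if _segments_cross(a[i], a[(i + 1) % n], b[j], b[(j + 1) % m]):
                return True
    return any(_point_inside(p, b) for p in a) or any(_point_inside(p, a) for p in b)

print(polygons_overlap(place_tree(0, 0, 0), place_tree(0.1, 0, 0)))
print(polygons_overlap(place_tree(0, 0, 0), place_tree(5, 0, 0)))
```

Checking every pair within a group is quadratic in the group size, which is affordable for N up to 200 but worth pre-filtering with bounding boxes if many external files need to be screened.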

## Path Forward
1. **Submit the new 70.3166 ensemble** - expected LB=70.3166, given that the local and LB scores matched for exp_019
2. **Continue searching for better external sources**
3. **Consider running bbox3 for extended periods** to find new solutions
4. **Gap is still 1.44 points** - a fundamentally different approach is needed