# Loop 1 LB Feedback Analysis

**Baseline submitted:** CV 70.6473 | LB 70.6473 (gap: 0.0000)

CV and LB match exactly. This is expected for an optimization problem: the score is a deterministic function of the submitted solution, so local evaluation and the leaderboard compute the same number.

**Target:** 68.894234
**Gap to beat:** 1.753 points (2.5%)

In [1]:
import pandas as pd
import numpy as np
import json

# Load baseline metrics
with open('/home/code/experiments/001_baseline/metrics.json', 'r') as f:
    baseline_metrics = json.load(f)

print(f"Baseline CV Score: {baseline_metrics['cv_score']:.6f}")
print(f"Target: {baseline_metrics['target']}")
print(f"Gap: {baseline_metrics['gap_to_target']:.6f}")

Baseline CV Score: 70.647327
Target: 68.894234
Gap: 1.753093


In [2]:
# Analyze per-N scores using the snapshot metrics, which hold the detailed per-N breakdown
snapshot_metrics_path = '/home/nonroot/snapshots/santa-2025/21328309254/code/experiments/001_baseline/metrics.json'
with open(snapshot_metrics_path, 'r') as f:
    detailed_metrics = json.load(f)

scores_by_n = detailed_metrics['scores_by_n']

# Convert to DataFrame for analysis
scores_df = pd.DataFrame([
    {'n': int(k), 'score': v} for k, v in scores_by_n.items()
]).sort_values('n')

print(f"Total N values: {len(scores_df)}")
print(f"\nTop 20 worst N values (highest score contribution):")
worst_n = scores_df.nlargest(20, 'score')
for _, row in worst_n.iterrows():
    print(f"  N={int(row['n']):3d}: {row['score']:.6f}")

Total N values: 200

Top 20 worst N values (highest score contribution):
  N=  1: 0.661250
  N=  2: 0.450779
  N=  3: 0.434745
  N=  5: 0.416850
  N=  4: 0.416545
  N=  7: 0.399897
  N=  6: 0.399610
  N=  9: 0.387415
  N=  8: 0.385407
  N= 15: 0.379203
  N= 10: 0.376630
  N= 21: 0.376451
  N= 20: 0.376057
  N= 22: 0.375258
  N= 11: 0.374924
  N= 16: 0.374128
  N= 26: 0.373997
  N= 12: 0.372724
  N= 13: 0.372294
  N= 25: 0.372144


In [3]:
# Calculate cumulative score contribution
scores_df = scores_df.sort_values('n')
scores_df['cumulative'] = scores_df['score'].cumsum()
scores_df['pct_of_total'] = 100 * scores_df['cumulative'] / scores_df['score'].sum()

print("\nCumulative score contribution:")
for n in [10, 20, 30, 50, 100, 150, 200]:
    row = scores_df[scores_df['n'] == n].iloc[0]
    print(f"  N=1-{n}: {row['cumulative']:.4f} ({row['pct_of_total']:.1f}% of total)")


Cumulative score contribution:
  N=1-10: 4.3291 (6.1% of total)
  N=1-20: 8.0554 (11.4% of total)
  N=1-30: 11.7443 (16.6% of total)
  N=1-50: 19.0399 (27.0% of total)
  N=1-100: 36.6678 (51.9% of total)
  N=1-150: 53.8043 (76.2% of total)
  N=1-200: 70.6473 (100.0% of total)
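
The cumulative totals above can also be read as per-range contributions. The following is a minimal sketch that hard-codes the printed totals rather than reloading `scores_df`; the helper name `bucket_contributions` is hypothetical and not part of the original notebook.

```python
# Cumulative totals copied from the output above:
# upper N -> sum of scores for N=1..upper
cumulative = {10: 4.3291, 20: 8.0554, 30: 11.7443, 50: 19.0399,
              100: 36.6678, 150: 53.8043, 200: 70.6473}

def bucket_contributions(cumulative):
    """Return (lo, hi, total, mean_per_n) for each consecutive N range."""
    rows = []
    prev_n, prev_total = 0, 0.0
    for n in sorted(cumulative):
        total = cumulative[n] - prev_total
        rows.append((prev_n + 1, n, total, total / (n - prev_n)))
        prev_n, prev_total = n, cumulative[n]
    return rows

buckets = bucket_contributions(cumulative)
for lo, hi, total, mean in buckets:
    print(f"N={lo:3d}-{hi:3d}: total={total:.4f}, mean per N={mean:.4f}")
```

Read this way, the mean score per N falls only slowly as N grows, so most of the total sits in the many large-N terms even though each small-N term is individually larger.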


In [4]:
# Key insight: How much improvement is needed per N range?
target = 68.894234
current = baseline_metrics['cv_score']
gap = current - target

print(f"\nTotal gap to close: {gap:.6f}")
print(f"\nIf we improve each N by the same percentage:")
required_pct = 100 * (1 - target/current)
print(f"  Required improvement: {required_pct:.2f}%")

print(f"\nIf we focus on small N (1-20):")
small_n_score = scores_df[scores_df['n'] <= 20]['score'].sum()
print(f"  Current small N score: {small_n_score:.4f}")
print(f"  If we reduce small N by 50%: saves {small_n_score * 0.5:.4f}")
print(f"  If we reduce small N by 30%: saves {small_n_score * 0.3:.4f}")


Total gap to close: 1.753093

If we improve each N by the same percentage:
  Required improvement: 2.48%

If we focus on small N (1-20):
  Current small N score: 8.0554
  If we reduce small N by 50%: saves 4.0277
  If we reduce small N by 30%: saves 2.4166
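
The 30% and 50% scenarios above bracket the question; it may be more direct to ask what reduction a given range would need to absorb the whole gap. A rough sketch, assuming the gap and cumulative totals printed earlier (the function `required_reduction_pct` is a hypothetical helper, not from the notebook):

```python
# Values copied from the outputs above
gap = 1.753093
cumulative = {10: 4.3291, 20: 8.0554, 30: 11.7443, 50: 19.0399,
              100: 36.6678, 150: 53.8043, 200: 70.6473}

def required_reduction_pct(gap, range_score):
    """Percentage by which range_score must shrink to absorb the whole gap."""
    return 100 * gap / range_score

required = {n: required_reduction_pct(gap, s) for n, s in cumulative.items()}
for n, pct in required.items():
    print(f"N=1-{n}: {pct:.1f}% reduction needed")
```

Under these numbers, closing the gap from N=1-20 alone would take a reduction of roughly 22% across that range, against about 2.5% if the improvement is spread over all 200 values.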


In [5]:
# Compare with theoretical minimum
# For a single tree (N=1), the score depends only on the side of the smallest
# axis-aligned square that encloses the rotated tree
# Unrotated tree dimensions: width 0.7, height 1.0 (y from -0.2 to 0.8)
# The 1-degree search below finds the smallest square at a 45-degree rotation

import math

# Tree vertices
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

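def get_bbox_at_angle(angle_deg):
    """Return the side of the axis-aligned square enclosing the tree rotated by angle_deg."""
    angle_rad = math.radians(angle_deg)
    cos_a = math.cos(angle_rad)
    sin_a = math.sin(angle_rad)

    # Rotate every vertex counter-clockwise about the origin
    rotated_x = [cos_a * x - sin_a * y for x, y in zip(TX, TY)]
    rotated_y = [sin_a * x + cos_a * y for x, y in zip(TX, TY)]

    width = max(rotated_x) - min(rotated_x)
    height = max(rotated_y) - min(rotated_y)
    # The enclosing square is set by the larger of the two extents
    return max(width, height)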

# Find optimal angle for N=1
best_angle = 0
best_side = float('inf')
for angle in range(0, 360, 1):
    side = get_bbox_at_angle(angle)
    if side < best_side:
        best_side = side
        best_angle = angle

print(f"N=1 optimal angle: {best_angle}° with side={best_side:.6f}")
print(f"N=1 optimal score: {best_side**2:.6f}")
print(f"N=1 current score: {scores_df[scores_df['n'] == 1]['score'].values[0]:.6f}")

N=1 optimal angle: 45° with side=0.813173
N=1 optimal score: 0.661250
N=1 current score: 0.661250
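
The 1-degree grid result can be cross-checked in closed form. At 45 degrees the rotated coordinates are (x - y)/sqrt(2) and (x + y)/sqrt(2), so the side is the larger of the two spans divided by sqrt(2). The sketch below assumes the same vertex lists as the cell above; `bbox_side` is a hypothetical re-implementation of `get_bbox_at_angle`, included only to keep the block self-contained.

```python
import math

# Tree vertices, copied from the cell above
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
      -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
      -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def bbox_side(angle_deg):
    """Side of the axis-aligned square enclosing the tree rotated by angle_deg."""
    a = math.radians(angle_deg)
    xs = [math.cos(a) * x - math.sin(a) * y for x, y in zip(TX, TY)]
    ys = [math.sin(a) * x + math.cos(a) * y for x, y in zip(TX, TY)]
    return max(max(xs) - min(xs), max(ys) - min(ys))

# Closed form at 45 degrees: spans of x - y and x + y over all vertices
diffs = [x - y for x, y in zip(TX, TY)]
sums = [x + y for x, y in zip(TX, TY)]
side_45 = max(max(diffs) - min(diffs), max(sums) - min(sums)) / math.sqrt(2)

print(f"closed-form side at 45 deg: {side_45:.6f}, score: {side_45**2:.6f}")
print(f"numeric side at 44.9 / 45.0 / 45.1 deg: "
      f"{bbox_side(44.9):.6f} / {bbox_side(45.0):.6f} / {bbox_side(45.1):.6f}")
```

Both spans come out to 1.15, giving a side of 1.15/sqrt(2) and a score of 0.66125, which agrees with the grid search. The neighbouring angles are slightly worse, consistent with 45 degrees being a minimum rather than an artifact of the 1-degree step.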


In [6]:
# Summary of findings
print("="*60)
print("KEY FINDINGS FOR STRATEGY")
print("="*60)
print(f"\n1. CV-LB alignment is PERFECT (gap = 0.0000)")
print(f"   This means CV improvements will directly translate to LB improvements.")
print(f"\n2. Gap to target: {gap:.4f} points ({required_pct:.2f}%)")
print(f"\n3. Small N (1-20) contribute {small_n_score:.4f} points ({100*small_n_score/current:.1f}% of total)")
print(f"   These are the highest-leverage targets.")
print(f"\n4. N=1 is already at optimal (45° rotation)")
print(f"   Score: {scores_df[scores_df['n'] == 1]['score'].values[0]:.6f}")
print(f"\n5. Top improvement opportunities:")
for _, row in worst_n.head(10).iterrows():
    n = int(row['n'])
    score = row['score']
    # Rough estimate only: assume a 10% reduction is possible
    # (not attainable for N=1, which is already optimal per finding 4)
    potential = score * 0.1
    print(f"   N={n:3d}: current={score:.4f}, potential savings={potential:.4f}")

KEY FINDINGS FOR STRATEGY

1. CV-LB alignment is PERFECT (gap = 0.0000)
   This means CV improvements will directly translate to LB improvements.

2. Gap to target: 1.7531 points (2.48%)

3. Small N (1-20) contribute 8.0554 points (11.4% of total)
   These are the highest-leverage targets.

4. N=1 is already at optimal (45° rotation)
   Score: 0.661250

5. Top improvement opportunities:
   N=  1: current=0.6613, potential savings=0.0661
   N=  2: current=0.4508, potential savings=0.0451
   N=  3: current=0.4347, potential savings=0.0435
   N=  5: current=0.4168, potential savings=0.0417
   N=  4: current=0.4165, potential savings=0.0417
   N=  7: current=0.3999, potential savings=0.0400
   N=  6: current=0.3996, potential savings=0.0400
   N=  9: current=0.3874, potential savings=0.0387
   N=  8: current=0.3854, potential savings=0.0385
   N= 15: current=0.3792, potential savings=0.0379
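
Findings 4 and 5 interact: N=1 tops the list but is already optimal, so its listed savings cannot be realized. A small sketch of how the list could be re-totalled without it, using the per-N scores printed earlier and the same illustrative 10% assumption (the names `ASSUMED_REDUCTION` and `ALREADY_OPTIMAL` are hypothetical):

```python
gap = 1.753093  # total gap printed earlier

# Ten largest per-N scores, copied from the per-N listing above
top_scores = {1: 0.661250, 2: 0.450779, 3: 0.434745, 5: 0.416850,
              4: 0.416545, 7: 0.399897, 6: 0.399610, 9: 0.387415,
              8: 0.385407, 15: 0.379203}

ASSUMED_REDUCTION = 0.10  # illustrative assumption, as in the cell above
ALREADY_OPTIMAL = {1}     # finding 4: N=1 cannot be improved

# Drop N values that have no room left, then total the assumed savings
improvable = {n: s for n, s in top_scores.items() if n not in ALREADY_OPTIMAL}
total_savings = ASSUMED_REDUCTION * sum(improvable.values())
share_of_gap = total_savings / gap

print(f"Improvable N values: {sorted(improvable)}")
print(f"Total assumed savings: {total_savings:.4f} "
      f"({100 * share_of_gap:.1f}% of the gap)")
```

Even if the 10% assumption held for all nine remaining values, the combined savings would cover only about a fifth of the gap, so improvements at larger N would still be needed to reach the target.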
