# Loop 11 LB Feedback Analysis

## Submission Results
- exp_001: CV 70.7343 | LB 70.7343 (gap: 0.0000)
- exp_010: CV 70.7343 | LB 70.7343 (gap: 0.0000)

## Key Observations
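1. CV equals LB exactly: this is a deterministic optimization problem, not an ML task, so the local score reproduces the leaderboard score
2. After 11 experiments, ALL achieved the SAME score: 70.734327
3. The target is 68.922808, which leaves a gap of 1.81 points (about 2.6%)
4. The overlapping configurations score 67.727 (below target!), but they contain overlapping trees, so they are not valid submissions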
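
The gaps quoted above follow directly from the three scores in this section. A quick arithmetic check, with variable names chosen here purely for illustration:

```python
# Scores quoted in this analysis (lower is better).
baseline = 70.734327   # score reached by all 11 experiments
target = 68.922808     # leaderboard target
overlapping = 67.727   # overlapping configurations (as quoted, 3 decimals)

gap_to_target = baseline - target
gap_pct = 100 * gap_to_target / baseline
overlap_margin = target - overlapping

print(f"Gap to target: {gap_to_target:.6f} ({gap_pct:.2f}% of baseline)")
print(f"Overlapping configs sit below the target by: {overlap_margin:.6f}")
```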

## Critical Question
How do we escape the local optimum at 70.734327?

In [None]:
import pandas as pd
import numpy as np
import os

# Load the baseline to analyze its structure
baseline_path = '/home/code/experiments/011_mip_cpsat/baseline.csv'
df = pd.read_csv(baseline_path)

print(f"Total rows: {len(df)}")
print(f"Columns: {df.columns.tolist()}")
print(df.head(10))

In [None]:
# Analyze the angle patterns in the baseline
import numpy as np

def parse_value(v):
    """Strip the 's' marker from a stored value and convert it to float."""
    return float(str(v).replace('s', ''))

# Extract all angles
angles = []
for _, row in df.iterrows():
    deg = parse_value(row['deg'])
    angles.append(deg % 360)

angles = np.array(angles)
print(f"Total trees: {len(angles)}")
print(f"Unique angles (rounded to 0.1): {len(np.unique(np.round(angles, 1)))}")

# Most common angles
from collections import Counter
angle_counts = Counter(np.round(angles, 1))
print("\nTop 20 most common angles:")
for angle, count in angle_counts.most_common(20):
    print(f"  {angle:.1f}°: {count} trees")

In [None]:
# Analyze per-N scores and identify where improvements are theoretically possible
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

def get_tree_polygon(cx, cy, deg):
    rad = np.radians(deg)
    c, s = np.cos(rad), np.sin(rad)
    x = TX * c - TY * s + cx
    y = TX * s + TY * c + cy
    return x, y

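def score_trees(trees):
    """Return side^2 / n, where side is the larger extent of the
    axis-aligned bounding box around all tree vertices."""
    n = len(trees)
    if n == 0:
        # No rows for this N: treat as infinitely bad
        return float('inf')
    all_x, all_y = [], []
    for x, y, deg in trees:
        px, py = get_tree_polygon(x, y, deg)
        all_x.extend(px)
        all_y.extend(py)
    side = max(max(all_x) - min(all_x), max(all_y) - min(all_y))
    return side * side / n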

def parse_config(df, n):
    rows = df[df['id'].str.startswith(f'{n:03d}_')]
    trees = []
    for _, row in rows.iterrows():
        x = parse_value(row['x'])
        y = parse_value(row['y'])
        deg = parse_value(row['deg'])
        trees.append((x, y, deg))
    return trees

# Calculate per-N scores
per_n_scores = []
for n in range(1, 201):
    trees = parse_config(df, n)
    score = score_trees(trees)
    per_n_scores.append({'n': n, 'score': score, 'contribution': score})

per_n_df = pd.DataFrame(per_n_scores)
print(f"Total score: {per_n_df['score'].sum():.6f}")
print("\nTop 10 N values with highest scores (most room for improvement):")
print(per_n_df.nlargest(10, 'score')[['n', 'score']])

In [None]:
# Load the overlapping CSV to compare
overlap_path = '/home/nonroot/snapshots/santa-2025/21105319338/code/datasets/santa25-public/submission_67_727.csv'
if os.path.exists(overlap_path):
    overlap_df = pd.read_csv(overlap_path)
    
    # Compare per-N scores
    comparison = []
    for n in range(1, 201):
        baseline_trees = parse_config(df, n)
        overlap_trees = parse_config(overlap_df, n)
        
        baseline_score = score_trees(baseline_trees)
        overlap_score = score_trees(overlap_trees)
        
        comparison.append({
            'n': n,
            'baseline': baseline_score,
            'overlap': overlap_score,
            'diff': baseline_score - overlap_score
        })
    
    comp_df = pd.DataFrame(comparison)
    print(f"Baseline total: {comp_df['baseline'].sum():.6f}")
    print(f"Overlap total: {comp_df['overlap'].sum():.6f}")
    print(f"Total potential improvement: {comp_df['diff'].sum():.6f}")
    
    print("\nTop 20 N values with biggest potential improvement:")
    print(comp_df.nlargest(20, 'diff')[['n', 'baseline', 'overlap', 'diff']])
else:
    print(f"Overlap file not found at {overlap_path}")

In [None]:
# Check what datasets are available
import glob

print("Available datasets:")
for path in glob.glob('/home/nonroot/snapshots/santa-2025/*/code/datasets/*'):
    print(f"  {path}")

print("\nCSV files in santa25-public:")
for path in glob.glob('/home/nonroot/snapshots/santa-2025/*/code/datasets/santa25-public/*.csv'):
    print(f"  {os.path.basename(path)}")

In [None]:
# Analyze the structure of the best configurations for small N
print("Analyzing baseline configurations for N=2-10:")
print("="*70)

for n in range(2, 11):
    trees = parse_config(df, n)
    score = score_trees(trees)
    
    print(f"\nN={n}, score: {score:.6f}")
    
    # Analyze angles
    angles = [t[2] % 360 for t in trees]
    unique_angles = sorted(set(np.round(angles, 1)))
    print(f"  Unique angles: {unique_angles}")
    
    # Analyze positions
    xs = [t[0] for t in trees]
    ys = [t[1] for t in trees]
    print(f"  X range: [{min(xs):.4f}, {max(xs):.4f}]")
    print(f"  Y range: [{min(ys):.4f}, {max(ys):.4f}]")

## Key Insights

1. **The baseline is at a strong local optimum** - 11 experiments with diverse approaches all converged to 70.734327

2. **The overlapping configurations score about 3.0 points below the baseline** (67.727 vs 70.734) - but they have overlaps in N=3-32

3. **The angles are NOT simple multiples** - they use specific angles like 23.6°, 66.4°, 113.6° that allow efficient interlocking

4. **The target (68.922808) IS achievable** - it's on the leaderboard, so valid configurations exist
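
Insight 2 suggests a cheap experiment before anything more elaborate: if the overlaps really are confined to N=3-32, the other N values in the overlapping file may already be valid and could be swapped in wherever they score lower. The sketch below shows that per-N merge on toy numbers. `merge_per_n`, its arguments, and the toy scores are hypothetical, and the overlap range is taken from the insight above rather than verified here.

```python
def merge_per_n(baseline_scores, candidate_scores, invalid_ns):
    """Pick, for each N, the lower-scoring source among those that are valid.

    baseline_scores / candidate_scores: dict mapping N -> per-N score.
    invalid_ns: set of N for which the candidate is known to have overlaps.
    Returns (choice, total), where choice maps N -> 'baseline' or 'candidate'.
    """
    choice = {}
    total = 0.0
    for n, base in baseline_scores.items():
        cand = candidate_scores.get(n)
        use_candidate = (
            cand is not None and n not in invalid_ns and cand < base
        )
        choice[n] = 'candidate' if use_candidate else 'baseline'
        total += cand if use_candidate else base
    return choice, total


# Toy numbers for illustration only (not taken from the real files).
baseline_scores = {2: 0.45, 3: 0.43, 40: 0.36}
candidate_scores = {2: 0.45, 3: 0.40, 40: 0.35}
invalid_ns = set(range(3, 33))  # N=3-32, as reported in insight 2

choice, total = merge_per_n(baseline_scores, candidate_scores, invalid_ns)
print(choice)
print(f"Merged total: {total:.4f}")
```

With the comparison cell's output, the two dictionaries could be built as `dict(zip(comp_df['n'], comp_df['baseline']))` and `dict(zip(comp_df['n'], comp_df['overlap']))`. Any configuration swapped in this way would still need an explicit overlap check before submission.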

## Potential Approaches to Try

1. **Constructive heuristics** - Build packings tree-by-tree with intelligent placement
2. **Minkowski sum / No-fit polygon** - Use geometric constraints to find valid placements
3. **Gradual constraint tightening** - Start from overlapping configs and gradually eliminate overlaps
4. **Different angle sets** - Try the specific angles used in the overlapping configs
5. **Pattern extraction** - Analyze the overlapping configs to find valid sub-patterns
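
Approaches 2, 3 and 5 all depend on a test for whether two trees overlap, which this notebook does not yet contain. A minimal pure-Python sketch follows, reusing the `TX`/`TY` outline from the scoring cell. The function names are hypothetical, and the definition of "overlap" used here (edges that properly cross, or one outline lying inside the other) is an assumption; the official validity check may treat borderline cases differently. Trees that merely touch, or that share collinear edges, are not resolved reliably by this sketch and would need a tolerance-aware or exact-arithmetic test.

```python
import math

# Tree outline, copied from the scoring cell of this notebook.
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
      -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
      -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]


def tree_polygon(cx, cy, deg):
    """Return the tree outline as a list of (x, y) vertices."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [(x * c - y * s + cx, x * s + y * c + cy) for x, y in zip(TX, TY)]


def _orient(a, b, c):
    """Signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1, p2, q1, q2):
    """True if the two segments cross at a single interior point."""
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def _point_in_polygon(pt, poly):
    """Ray-casting test; points exactly on the boundary are ambiguous."""
    x, y = pt
    inside = False
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def trees_overlap(tree_a, tree_b):
    """Assumed overlap test for two (x, y, deg) trees."""
    pa, pb = tree_polygon(*tree_a), tree_polygon(*tree_b)
    # 1) Any pair of edges that properly cross means the interiors intersect.
    for i in range(len(pa)):
        a1, a2 = pa[i], pa[(i + 1) % len(pa)]
        for j in range(len(pb)):
            if _segments_cross(a1, a2, pb[j], pb[(j + 1) % len(pb)]):
                return True
    # 2) Without crossings, one outline may still contain the other;
    #    in the non-degenerate case a single vertex decides this.
    return _point_in_polygon(pa[0], pb) or _point_in_polygon(pb[0], pa)


print(trees_overlap((0, 0, 0), (0.1, 0, 0)))    # slightly shifted copy
print(trees_overlap((0, 0, 0), (5, 0, 0)))      # far apart
print(trees_overlap((0, 0, 0), (0, 0.9, 180)))  # inverted tree pushed into the first
```

Running this over all pairs returned by `parse_config(df, n)` costs O(n²) polygon tests per configuration, so a bounding-box pre-filter may be worth adding for larger N.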