# Loop 11 LB Feedback Analysis

## Submission Results
- exp_001: CV 70.7343 | LB 70.7343 (gap: 0.0000)
- exp_010: CV 70.7343 | LB 70.7343 (gap: 0.0000)

## Key Observations
1. CV = LB exactly - the score is a deterministic geometric objective, so this is an optimization problem, not ML
2. All 11 experiments so far achieved the same score: 70.734327
3. Target is 68.922808; the gap is 1.811519 points (about 2.6%)
4. The overlapping configurations score 67.727 (below target!) but are invalid because trees overlap
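
The gap figures above follow from simple arithmetic. A quick check, using only the scores quoted in this section (the variable names are illustrative):

```python
current = 70.734327    # score reached by every experiment so far
target = 68.922808     # leaderboard target
overlapping = 67.727   # score of the overlapping configurations

gap = current - target
print(f"Gap to target: {gap:.6f} ({100 * gap / current:.2f}% of the current score)")
print(f"Overlapping configs vs baseline: {current - overlapping:.3f} points lower")
print(f"Overlapping configs vs target:   {target - overlapping:.3f} points lower")
```

The percentage rounds to 2.6% whether it is taken relative to the current score or to the target.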

## Critical Question
How do we escape the local optimum at 70.734327?

In [1]:
import pandas as pd
import numpy as np
import os

# Load the baseline to analyze its structure
baseline_path = '/home/code/experiments/011_mip_cpsat/baseline.csv'
df = pd.read_csv(baseline_path)

print(f"Total rows: {len(df)}")
print(f"Columns: {df.columns.tolist()}")
print(df.head(10))

Total rows: 20100
Columns: ['id', 'x', 'y', 'deg']
      id                       x                        y  \
0  001_0  s43.591192092102147626  s-31.783267068741778871   
1  002_0   s0.154097069621360605   s-0.038540742694777107   
2  002_1  s-0.154097069621359162   s-0.561459257305227277   
3  003_0   s1.131270585068746337    s0.792202872326948637   
4  003_1   s1.234055695842160016    s1.275999500663759001   
5  003_2   s0.641714640229074984    s1.180458566613381111   
6  004_0   s-0.32474778958576689     s0.13210997810099298   
7  004_1    s0.31535434619379671     s0.13210997807037927   
8  004_2    s0.32474778958571543    s-0.73210997807598177   
9  004_3   s-0.31535434818683689    s-0.73210997810117096   

                       deg  
0   s44.999999999999978684  
1  s203.629377730650162448  
2   s23.629377730649704148  
3  s113.563260441729482864  
4   s66.370622269343002131  
5  s155.134051937100821306  
6   s156.37062214328020104  
7   s156.37062227193740682  
8   s336.3706222

In [2]:
# Analyze the angle patterns in the baseline
from collections import Counter

def parse_value(v):
    """Strip the 's' string marker from a CSV field and return it as a float."""
    return float(str(v).replace('s', ''))

# Extract all angles, normalized to [0, 360)
angles = df['deg'].map(parse_value).to_numpy() % 360
print(f"Total trees: {len(angles)}")
print(f"Unique angles (rounded to 0.1): {len(np.unique(np.round(angles, 1)))}")

# Most common angles (binned to 0.1 degree)
angle_counts = Counter(np.round(angles, 1))
print("\nTop 20 most common angles:")
for angle, count in angle_counts.most_common(20):
    print(f"  {angle:.1f}°: {count} trees")

Total trees: 20100
Unique angles (rounded to 0.1): 1582

Top 20 most common angles:
  68.2°: 768 trees
  248.2°: 746 trees
  68.4°: 238 trees
  248.3°: 233 trees
  248.4°: 231 trees
  68.3°: 218 trees
  248.1°: 203 trees
  259.5°: 199 trees
  79.5°: 194 trees
  68.1°: 190 trees
  158.2°: 167 trees
  338.2°: 159 trees
  248.5°: 140 trees
  68.5°: 123 trees
  68.6°: 122 trees
  248.6°: 115 trees
  79.4°: 113 trees
  259.7°: 108 trees
  79.7°: 107 trees
  259.4°: 106 trees
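
The top-20 list looks like it is made of pairs that differ by 180°. A minimal sketch that checks this by folding each angle modulo 180° and merging the counts; the data is copied from the output above, and the names `top_angles` and `folded` are illustrative rather than part of the notebook:

```python
from collections import Counter

# (angle in degrees, tree count) copied from the top-20 output above.
top_angles = [
    (68.2, 768), (248.2, 746), (68.4, 238), (248.3, 233), (248.4, 231),
    (68.3, 218), (248.1, 203), (259.5, 199), (79.5, 194), (68.1, 190),
    (158.2, 167), (338.2, 159), (248.5, 140), (68.5, 123), (68.6, 122),
    (248.6, 115), (79.4, 113), (259.7, 108), (79.7, 107), (259.4, 106),
]

# Work in integer tenths of a degree to avoid float-key mismatches.
folded = Counter()
for angle, count in top_angles:
    key = round(angle * 10) % 1800
    folded[key] += count

for key, count in folded.most_common():
    print(f"  {key / 10:.1f}° (mod 180): {count} trees")
```

The 20 entries collapse into 10 folded bins, so every angle in the list has its 180° partner in the list as well.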


In [3]:
# Analyze per-N scores and identify where improvements are theoretically possible

# Tree outline: 15 polygon vertices in the tree's local frame
TX = np.array([0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125])
TY = np.array([0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5])

def get_tree_polygon(cx, cy, deg):
    """Rotate the outline by `deg` degrees and translate it to (cx, cy)."""
    rad = np.radians(deg)
    c, s = np.cos(rad), np.sin(rad)
    x = TX * c - TY * s + cx
    y = TX * s + TY * c + cy
    return x, y

def score_trees(trees):
    """Return side^2 / n, where side is the bounding-square side of all trees."""
    n = len(trees)
    if n == 0:
        return float('inf')
    all_x, all_y = [], []
    for x, y, deg in trees:
        px, py = get_tree_polygon(x, y, deg)
        all_x.extend(px)
        all_y.extend(py)
    side = max(max(all_x) - min(all_x), max(all_y) - min(all_y))
    return side * side / n

def parse_config(df, n):
    rows = df[df['id'].str.startswith(f'{n:03d}_')]
    trees = []
    for _, row in rows.iterrows():
        x = parse_value(row['x'])
        y = parse_value(row['y'])
        deg = parse_value(row['deg'])
        trees.append((x, y, deg))
    return trees

# Calculate per-N scores; the total score is the sum of side^2 / N over N = 1..200
per_n_scores = []
for n in range(1, 201):
    trees = parse_config(df, n)
    score = score_trees(trees)
    per_n_scores.append({'n': n, 'score': score})

per_n_df = pd.DataFrame(per_n_scores)
print(f"Total score: {per_n_df['score'].sum():.6f}")
print("\nTop 10 N values with highest scores (most room for improvement):")
print(per_n_df.nlargest(10, 'score')[['n', 'score']])

Total score: 70.734327

Top 10 N values with highest scores (most room for improvement):
     n     score
0    1  0.661250
1    2  0.450779
2    3  0.434745
4    5  0.416850
3    4  0.416545
6    7  0.399897
5    6  0.399610
8    9  0.387415
7    8  0.385407
14  15  0.379203
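
A single tree makes a convenient sanity check for the scoring code, and the tree's own area gives a loose floor on every per-N score: non-overlapping trees cannot cover more area than their bounding square, so side² / N is at least the area of one tree. The sketch below is a pure-Python restatement of the scoring above; the helper names (`tree_vertices`, `score`, `polygon_area`) are illustrative and not part of the notebook.

```python
import math

# Same vertex lists as TX, TY above, as plain Python lists.
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
      -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
      -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def tree_vertices(cx, cy, deg):
    """Rotate the outline by `deg` degrees and translate it to (cx, cy)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [(x * c - y * s + cx, x * s + y * c + cy) for x, y in zip(TX, TY)]

def score(trees):
    """Bounding-square area divided by the number of trees."""
    pts = [p for tree in trees for p in tree_vertices(*tree)]
    xs, ys = zip(*pts)
    side = max(max(xs) - min(xs), max(ys) - min(ys))
    return side * side / len(trees)

def polygon_area(pts):
    """Unsigned polygon area via the shoelace formula."""
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))
    return abs(twice) / 2

upright = score([(0.0, 0.0, 0.0)])      # tree at 0 degrees
diagonal = score([(0.0, 0.0, 45.0)])    # tree at 45 degrees, as in row 001_0
tree_area = polygon_area(tree_vertices(0.0, 0.0, 0.0))

print(f"N=1 upright:  {upright:.6f}")
print(f"N=1 at 45°:   {diagonal:.6f}")
print(f"Tree area:    {tree_area:.6f}")
print(f"Area floor for the total over N=1..200: {200 * tree_area:.3f}")
```

The upright tree scores 1.0 and the 45° tree scores 0.66125, which matches the N=1 row in the table. The tree area is 0.245625, so the total can never drop below 49.125. That floor is far from tight and says nothing about whether the target itself is reachable; it only bounds how much any single N can still improve.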


In [4]:
# Load the overlapping CSV to compare
overlap_path = '/home/nonroot/snapshots/santa-2025/21105319338/code/datasets/santa25-public/submission_67_727.csv'
if os.path.exists(overlap_path):
    overlap_df = pd.read_csv(overlap_path)
    
    # Compare per-N scores
    comparison = []
    for n in range(1, 201):
        baseline_trees = parse_config(df, n)
        overlap_trees = parse_config(overlap_df, n)
        
        baseline_score = score_trees(baseline_trees)
        overlap_score = score_trees(overlap_trees)
        
        comparison.append({
            'n': n,
            'baseline': baseline_score,
            'overlap': overlap_score,
            'diff': baseline_score - overlap_score
        })
    
    comp_df = pd.DataFrame(comparison)
    print(f"Baseline total: {comp_df['baseline'].sum():.6f}")
    print(f"Overlap total: {comp_df['overlap'].sum():.6f}")
    print(f"Total potential improvement: {comp_df['diff'].sum():.6f}")
    
    print("\nTop 20 N values with biggest potential improvement:")
    print(comp_df.nlargest(20, 'diff')[['n', 'baseline', 'overlap', 'diff']])
else:
    print(f"Overlap file not found at {overlap_path}")

Overlap file not found at /home/nonroot/snapshots/santa-2025/21105319338/code/datasets/santa25-public/submission_67_727.csv


In [5]:
# Check what datasets are available
import glob

print("Available datasets:")
for path in glob.glob('/home/nonroot/snapshots/santa-2025/*/code/datasets/*'):
    print(f"  {path}")

print("\nCSV files in santa25-public:")
for path in glob.glob('/home/nonroot/snapshots/santa-2025/*/code/datasets/santa25-public/*.csv'):
    print(f"  {os.path.basename(path)}")

Available datasets:
  /home/nonroot/snapshots/santa-2025/21105319338/code/datasets/smartmanoj_submission.csv
  /home/nonroot/snapshots/santa-2025/21105319338/code/datasets/santa25-public
  /home/nonroot/snapshots/santa-2025/21105319338/code/datasets/santa-2025-csv
  /home/nonroot/snapshots/santa-2025/21105319338/code/datasets/bucket-of-chump
  /home/nonroot/snapshots/santa-2025/21105319338/code/datasets/telegram-public
  /home/nonroot/snapshots/santa-2025/21105319338/code/datasets/santa-2025-try3

CSV files in santa25-public:
  submission_JKoT4.csv
  New_Tree_144_196.csv
  submission_JKoT3.csv
  santa2025_ver2_v61.csv
  submission_JKoT2.csv
  santa2025_ver2_v67.csv
  santa2025_ver2_v76.csv
  submission_70_936673758122.csv
  santa2025_ver2_v65.csv
  submission_70_926149550346.csv
  santa2025_ver2_v66.csv
  santa2025_ver2_v63.csv
  santa2025_ver2_v69.csv
  submission_JKoT1.csv
  submission_opt1.csv
  santa2025_ver2_v68.csv


In [6]:
# Analyze the structure of the best configurations for small N
print("Analyzing baseline configurations for N=2-10:")
print("="*70)

for n in range(2, 11):
    trees = parse_config(df, n)
    score = score_trees(trees)
    
    print(f"\nN={n}, score: {score:.6f}")
    
    # Analyze angles
    angles = [t[2] % 360 for t in trees]
    unique_angles = sorted(set(np.round(angles, 1)))
    print(f"  Unique angles: {unique_angles}")
    
    # Analyze positions
    xs = [t[0] for t in trees]
    ys = [t[1] for t in trees]
    print(f"  X range: [{min(xs):.4f}, {max(xs):.4f}]")
    print(f"  Y range: [{min(ys):.4f}, {max(ys):.4f}]")

Analyzing baseline configurations for N=2-10:

N=2, score: 0.450779
  Unique angles: [np.float64(23.6), np.float64(203.6)]
  X range: [-0.1541, 0.1541]
  Y range: [-0.5615, -0.0385]

N=3, score: 0.434745
  Unique angles: [np.float64(66.4), np.float64(113.6), np.float64(155.1)]
  X range: [0.6417, 1.2341]
  Y range: [0.7922, 1.2760]

N=4, score: 0.416545
  Unique angles: [np.float64(156.4), np.float64(336.4)]
  X range: [-0.3247, 0.3247]
  Y range: [-0.7321, 0.1321]

N=5, score: 0.416850
  Unique angles: [np.float64(23.6), np.float64(66.4), np.float64(112.6), np.float64(207.5), np.float64(293.6)]
  X range: [-0.4606, 0.4606]
  Y range: [-0.7740, 0.1571]

N=6, score: 0.399610
  Unique angles: [np.float64(23.6), np.float64(158.9), np.float64(246.4), np.float64(293.6), np.float64(338.9)]
  X range: [-0.5043, 0.5043]
  Y range: [-0.8071, 0.2071]

N=7, score: 0.399897
  Unique angles: [np.float64(27.8), np.float64(37.6), np.float64(65.2), np.float64(207.8), np.float64(213.5), np.float64(252.

## Key Insights

1. **The baseline is at a strong local optimum** - 11 experiments with diverse approaches all converged to 70.734327

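2. **The overlapping configurations score about 3.0 points lower** (70.734 vs 67.727) - but they have overlaps in N=3-32, so that margin is not directly attainable

3. **The angles are NOT round values** - outside N=1 (45°), they cluster on specific values like 23.6°, 66.4°, 113.6°. Many are 23.6° mirrored or offset by a multiple of 90° (66.4° = 90° - 23.6°, 113.6° = 90° + 23.6°, 203.6° = 180° + 23.6°), and the most common angles pair up 180° apart (68.2°/248.2°, 79.5°/259.5°), which is consistent with trees interlocking in opposite orientations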

4. **The target (68.922808) IS achievable** - it's on the leaderboard, so valid configurations exist

## Potential Approaches to Try

1. **Constructive heuristics** - Build packings tree-by-tree with intelligent placement
2. **Minkowski sum / No-fit polygon** - Use geometric constraints to find valid placements
3. **Gradual constraint tightening** - Start from overlapping configs and gradually eliminate overlaps
4. **Different angle sets** - Try the specific angles used in the overlapping configs
5. **Pattern extraction** - Analyze the overlapping configs to find valid sub-patterns
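
Approaches 2, 3 and 5 all depend on a pairwise overlap test. A minimal pure-Python sketch follows, under these assumptions: two trees count as overlapping only if their edges properly cross or one outline lies inside the other, and mere touching is treated as valid. The competition's exact tolerance is not stated here, so the strictness of the test may need adjusting. The helper names (`tree_vertices`, `trees_overlap`, `overlapping_pairs`) are hypothetical.

```python
import math
from itertools import combinations

TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075,
      -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2,
      -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def tree_vertices(cx, cy, deg):
    """Rotate the outline by `deg` degrees and translate it to (cx, cy)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [(x * c - y * s + cx, x * s + y * c + cy) for x, y in zip(TX, TY)]

def _orient(a, b, c):
    """Signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def _segments_cross(p1, p2, q1, q2):
    """True only for a proper crossing; shared endpoints do not count."""
    return (_orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
            and _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0)

def _point_in_polygon(pt, poly):
    """Ray-casting point-in-polygon test."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def trees_overlap(a, b):
    """a and b are (x, y, deg) tuples, as returned by parse_config."""
    pa, pb = tree_vertices(*a), tree_vertices(*b)
    edges_a = list(zip(pa, pa[1:] + pa[:1]))
    edges_b = list(zip(pb, pb[1:] + pb[:1]))
    for e1, e2 in edges_a:
        for f1, f2 in edges_b:
            if _segments_cross(e1, e2, f1, f2):
                return True
    # No edges cross: the outlines are disjoint or one contains the other.
    return _point_in_polygon(pa[0], pb) or _point_in_polygon(pb[0], pa)

def overlapping_pairs(trees):
    """Index pairs (i, j), i < j, of trees that overlap."""
    return [(i, j) for (i, a), (j, b) in combinations(enumerate(trees), 2)
            if trees_overlap(a, b)]

print(overlapping_pairs([(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

Running `overlapping_pairs` on each N of a candidate file shows which N values are invalid, which is the starting point for repairing overlaps gradually. The check is O(N²) pairs with 15×15 edge tests each; a bounding-box prefilter would likely be needed before using it inside an optimizer loop.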