# Loop 1 LB Feedback Analysis

## Submission Results
- **CV Score**: 70.7438
- **LB Score**: 70.7438 (perfect match!)
- **Target**: 68.922808
- **Gap to target**: 1.82 points (2.6%)

## Key Observations
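1. The local score matches the LB score to four decimal places, so the local scorer reproduces the competition metric
2. The pre-optimized baseline from GitHub is a usable starting point, 1.82 points above the target
3. We need to lower the score by ~2.6% (lower is better) to beat the target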

In [None]:
# Analyze the current submission to understand which N values contribute most to the score
import pandas as pd
import numpy as np
from decimal import Decimal, getcontext
from shapely import affinity
from shapely.geometry import Polygon
from shapely.ops import unary_union

getcontext().prec = 30

# Load submission
df = pd.read_csv('/home/code/experiments/001_baseline/preoptimized_submission.csv')

# Parse values
def parse_value(val):
    """Convert a cell to float, stripping a leading 's' prefix if present."""
    if isinstance(val, str) and val.startswith('s'):
        return float(val[1:])
    return float(val)

df['x_val'] = df['x'].apply(parse_value)
df['y_val'] = df['y'].apply(parse_value)
df['deg_val'] = df['deg'].apply(parse_value)
df['n'] = df['id'].apply(lambda x: int(x.split('_')[0]))

print(f'Total rows: {len(df)}')
print(f'Configurations: n=1 to n={df["n"].max()}')

In [None]:
# Tree geometry
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def get_tree_polygon(x, y, deg):
    """Create a tree polygon at position (x, y) with rotation deg."""
    r = np.radians(deg)
    c, s = np.cos(r), np.sin(r)
    points = []
    for tx, ty in zip(TX, TY):
        px = c * tx - s * ty + x
        py = s * tx + c * ty + y
        points.append((px, py))
    return Polygon(points)

def get_config_score(group):
    """Calculate score contribution for a configuration."""
    n = len(group)
    polygons = [get_tree_polygon(row['x_val'], row['y_val'], row['deg_val']) 
                for _, row in group.iterrows()]
    bounds = unary_union(polygons).bounds
    side = max(bounds[2] - bounds[0], bounds[3] - bounds[1])
    return side * side / n, side

# Calculate score for each N
scores = []
for n in range(1, 201):
    group = df[df['n'] == n]
    if len(group) > 0:
        score_contrib, side = get_config_score(group)
        scores.append({'n': n, 'score_contrib': score_contrib, 'side': side})

scores_df = pd.DataFrame(scores)
print(f'Total score: {scores_df["score_contrib"].sum():.6f}')

In [None]:
# Find which N values contribute most to the score
scores_df = scores_df.sort_values('score_contrib', ascending=False)
print('Top 20 N values by score contribution:')
print(scores_df.head(20).to_string())

print(f'\nTop 20 contribute: {scores_df.head(20)["score_contrib"].sum():.4f} ({100*scores_df.head(20)["score_contrib"].sum()/scores_df["score_contrib"].sum():.1f}%)')
print(f'Top 50 contribute: {scores_df.head(50)["score_contrib"].sum():.4f} ({100*scores_df.head(50)["score_contrib"].sum()/scores_df["score_contrib"].sum():.1f}%)')

In [None]:
# Reference value for n=1 based on the tree's enclosing rectangle
# Tree extents: width=0.7 (x from -0.35 to 0.35), height=1.0 (y from -0.2 to 0.8)

import math

# A w x h rectangle rotated by 45 deg has a square bounding box of side (w + h) / sqrt(2).
# This is the rectangle's worst orientation (unrotated it fits in side max(w, h) = 1.0),
# so the value below is NOT a minimum. It only bounds the tree's bbox side at 45 deg from above;
# the tree is not a rectangle, so its actual bbox at 45 deg is smaller.
w, h = 0.7, 1.0
rect_side_45 = (w + h) / math.sqrt(2)
print(f'Enclosing-rectangle bbox side at 45 deg: {rect_side_45:.6f}')
print(f'Corresponding n=1 score: {rect_side_45**2:.6f}')

# Check what our current n=1 score is
n1_row = scores_df[scores_df['n'] == 1]
print(f'\nCurrent n=1 side: {n1_row["side"].values[0]:.6f}')
print(f'Current n=1 score: {n1_row["score_contrib"].values[0]:.6f}')
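
The rectangle formula in the cell above treats the tree as its 0.7 x 1.0 enclosing rectangle. As a cross-check, the sketch below scans rotation angles using the actual tree outline (the `TX` / `TY` lists from this notebook) and the same rotation convention as `get_tree_polygon`. The helper name `bbox_side` and the 0.25 degree step are assumptions made for illustration; a finer scan or a local optimizer may be needed for a precise optimum.

```python
import math

# Tree outline copied from the TX / TY lists defined earlier in this notebook.
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def bbox_side(deg):
    """Side of the axis-aligned bounding square of one tree rotated by deg (hypothetical helper)."""
    r = math.radians(deg)
    c, s = math.cos(r), math.sin(r)
    xs = [c * tx - s * ty for tx, ty in zip(TX, TY)]
    ys = [s * tx + c * ty for tx, ty in zip(TX, TY)]
    return max(max(xs) - min(xs), max(ys) - min(ys))

# Assumed scan resolution: 0.25 degree steps over [0, 180).
angles = [i * 0.25 for i in range(720)]
best_deg = min(angles, key=bbox_side)
best_side = bbox_side(best_deg)

print(f'Unrotated side:         {bbox_side(0.0):.6f}')
print(f'Side at 45 deg:         {bbox_side(45.0):.6f}')
print(f'Best scanned side:      {best_side:.6f} at {best_deg:.2f} deg')
print(f'Best scanned n=1 score: {best_side**2:.6f}')
```

Comparing the scanned value with the current n=1 side printed above shows whether the n=1 configuration still has room to improve.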

In [None]:
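# Analyze the efficiency of each N relative to an area-based reference side
# For N non-overlapping trees, sqrt(N * tree_area) is a lower bound on the side.
# The 1.5 factor below is a heuristic allowance for packing inefficiency, so
# 'theoretical_min_side' is a reference value for ranking, not a strict bound.
tree_area = 0.7 * 1.0 * 0.5  # Rough estimate (half the enclosing rectangle); the exact polygon area is 0.245625

scores_df_sorted = scores_df.sort_values('n')
scores_df_sorted['theoretical_min_side'] = np.sqrt(scores_df_sorted['n'] * tree_area * 1.5)  # With packing inefficiency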
scores_df_sorted['efficiency'] = scores_df_sorted['theoretical_min_side'] / scores_df_sorted['side']

print('Configurations with lowest efficiency (most room for improvement):')
low_eff = scores_df_sorted.sort_values('efficiency').head(20)
print(low_eff[['n', 'side', 'score_contrib', 'efficiency']].to_string())
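
The `tree_area` used above is a rough estimate. The exact area of the tree outline can be computed with the shoelace formula, which gives a strict area-based bound: n non-overlapping trees inside a square of side s require s² ≥ n · area, so every configuration contributes at least `area` to the score. The sketch below assumes 200 configurations, as in the scoring loop; the function names are hypothetical. The bound is not attainable, since the trees cannot tile a square, but it shows how much of the current score is unavoidable.

```python
import math

# Tree outline copied from the TX / TY lists defined earlier in this notebook.
TX = [0, 0.125, 0.0625, 0.2, 0.1, 0.35, 0.075, 0.075, -0.075, -0.075, -0.35, -0.1, -0.2, -0.0625, -0.125]
TY = [0.8, 0.5, 0.5, 0.25, 0.25, 0, 0, -0.2, -0.2, 0, 0, 0.25, 0.25, 0.5, 0.5]

def shoelace_area(xs, ys):
    """Area of a simple polygon given its vertices in order (shoelace formula)."""
    n = len(xs)
    twice = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n))
    return abs(twice) / 2.0

def min_side_lower_bound(n, area):
    """Area-based lower bound on the bounding-square side for n non-overlapping trees."""
    return math.sqrt(n * area)

exact_tree_area = shoelace_area(TX, TY)

# Each configuration contributes s^2 / n >= exact_tree_area, so the total over
# 200 configurations (assumed, as in the scoring loop) is at least 200 * area.
total_lower_bound = 200 * exact_tree_area

print(f'Exact tree area: {exact_tree_area:.6f}')
print(f'Lower bound on side for n=10: {min_side_lower_bound(10, exact_tree_area):.6f}')
print(f'Lower bound on total score: {total_lower_bound:.6f}')
```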

In [None]:
# Summary statistics
print('\n=== SUMMARY ===')
print(f'Current total score: {scores_df["score_contrib"].sum():.6f}')
print(f'Target score: 68.922808')
print(f'Gap: {scores_df["score_contrib"].sum() - 68.922808:.6f} ({100*(scores_df["score_contrib"].sum() - 68.922808)/68.922808:.2f}%)')

# If we could improve each configuration by 2.6%, what would the new score be?
improved_score = scores_df['score_contrib'].sum() * (1 - 0.026)
print(f'\nIf we improve by 2.6%: {improved_score:.6f}')

# What improvement per configuration is needed?
needed_improvement = (scores_df['score_contrib'].sum() - 68.922808) / 200
print(f'Average improvement needed per N: {needed_improvement:.6f}')
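
Because each configuration contributes s²/n, a relative change in side length has roughly twice the effect on the score. The short calculation below is a sketch that assumes a uniform shrink across all configurations, which will not hold exactly in practice. It converts the score gap from the summary above into the side-length reduction that would close it.

```python
import math

current_score = 70.7438    # CV / LB score reported above
target_score = 68.922808   # target reported above

# The score scales with side^2, so the required side ratio is the square root
# of the required score ratio (assuming every configuration shrinks uniformly).
score_ratio = target_score / current_score
side_ratio = math.sqrt(score_ratio)

print(f'Required score reduction: {100 * (1 - score_ratio):.2f}%')
print(f'Required uniform side reduction: {100 * (1 - side_ratio):.2f}%')
```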

## Next Steps

### Strategy for Improvement

1. **Ensemble approach** (from jonathanchan kernel):
   - Combine best configurations from multiple sources
   - Each N value can come from a different optimized solution

2. **Run bbox3 optimizer**:
   - Extract and compile bbox3.cpp from jazivxt_why-not kernel
   - Run with various parameters (-n iterations, -r radius)
   - Apply fix_direction post-processing

3. **Run tree_packer optimizer**:
   - Extract from smartmanoj_santa-claude kernel
   - Uses Simulated Annealing with swap moves
   - Apply backward propagation (bp.cpp)

4. **Focus on high-contribution N values**:
   - Small N (1-20) contribute significantly due to the s²/n formula
   - Optimize these specifically

5. **Fractional translation** (from jonathanchan kernel):
   - Fine-tune positions with very small movements
   - Can squeeze out extra improvements after SA
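
The ensemble idea in step 1 can be sketched as a per-N selection: for each N, keep the configuration with the smallest bounding-square side among all sources. The code below is a minimal illustration, not the kernel's implementation. The function names, source names, and side values are made up, and a real version would also need to confirm that each chosen configuration has no overlapping trees.

```python
def pick_best_per_n(candidates):
    """Pick, for each n, the source with the smallest side.

    candidates: {source_name: {n: side}} (hypothetical input layout)
    returns:    {n: (source_name, side)}
    """
    best = {}
    for source, sides in candidates.items():
        for n, side in sides.items():
            if n not in best or side < best[n][1]:
                best[n] = (source, side)
    return best

def total_score(best):
    """Sum of side^2 / n over the selected configurations."""
    return sum(side * side / n for n, (_, side) in best.items())

# Made-up side lengths for two hypothetical sources.
candidates = {
    'baseline': {1: 1.00, 2: 1.20, 3: 1.50},
    'optimizer_a': {1: 0.90, 2: 1.25, 3: 1.45},
}

best = pick_best_per_n(candidates)
for n in sorted(best):
    source, side = best[n]
    print(f'n={n}: {source} (side={side:.2f})')
print(f'Ensemble score: {total_score(best):.6f}')
```

In this notebook the per-source sides could come from `get_config_score` applied to each candidate submission; the ensemble score is never worse than the best single source because each N is chosen independently.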