# Loop 15 Strategic Analysis

## Current Situation
- Best valid LB score: 70.365091 (exp_010)
- Target: 68.878195
- Gap: 1.49 points (2.1%)
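
The gap figures can be reproduced from the two scores listed above. The snippet below is only an arithmetic check; the variable names are illustrative, and the count of 200 puzzle sizes is taken from the `range(1, 201)` loop used in the analysis cells of this notebook.

```python
best_lb = 70.365091   # exp_010, best valid LB score
target = 68.878195    # target score
n_values = 200        # N = 1..200 (assumed from the range(1, 201) loop)

gap = best_lb - target
gap_pct = 100 * gap / best_lb      # gap relative to the current score
gap_per_n = gap / n_values         # average improvement needed per N

print(f"gap = {gap:.6f} ({gap_pct:.2f}% of current score)")
print(f"average improvement needed per N = {gap_per_n:.6f}")
```

The per-N average is a budget rather than a prediction, since improvements are unlikely to be spread evenly across N.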

## Critical Findings from Research

### 1. Top Kernels Use bbox3 C++ Optimizer
The "Why Not" kernel (376 votes) uses bbox3.cpp with:
- Complex number vector coordination
- Fluid dynamics simulation
- Hinge pivot mechanics
- Density gradient flow
- Global boundary tension
- Multi-restart with random parameters

### 2. Manual Tree Shifter Shows Key Insight
The manual tree shifter kernel shows that:
- Interactive placement can find improvements
- Simulated annealing with bbox3 is the standard approach
- The key is running bbox3 with MANY iterations and restarts

### 3. Our Problem
We've been trying to ENSEMBLE existing solutions, but:
- All available improvements are < 0.001 each (too small to close a 1.49-point gap, and risky)
- External data is NOT significantly better than our snapshots
- The ensemble approach has hit a ceiling at 70.365

## The Path Forward

**We need to GENERATE better solutions, not just filter existing ones.**

The top kernels achieve sub-69 scores by:
1. Running bbox3 with 50-100 restarts per N
2. Using 10,000-20,000 iterations per restart
3. Keeping the best solution across all restarts
4. Fractional translation refinement after SA
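
The restart-and-keep-best loop described in the steps above can be sketched as follows. This is a minimal illustration and not the bbox3 algorithm: `anneal`, `multi_restart`, and `toy_objective` are hypothetical names, the objective is a 1-D stand-in for a packing score, and the restart and iteration counts are scaled down so the example runs quickly.

```python
import math
import random


def anneal(objective, x0, n_iters, rng, t_start=1.0, t_end=1e-3, step=0.5):
    """Run one simulated-annealing pass; return (best_x, best_value)."""
    x, fx = x0, objective(x0)
    best_x, best_fx = x, fx
    for i in range(n_iters):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / max(1, n_iters - 1))
        cand = x + rng.uniform(-step, step)
        fc = objective(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx


def multi_restart(objective, n_restarts, n_iters, seed=0):
    """Repeat annealing from random starts and keep the best result overall."""
    rng = random.Random(seed)
    best_x, best_fx = None, float("inf")
    for _ in range(n_restarts):
        x0 = rng.uniform(-10.0, 10.0)
        x, fx = anneal(objective, x0, n_iters, rng)
        if fx < best_fx:
            best_x, best_fx = x, fx
    return best_x, best_fx


def toy_objective(x):
    """Hypothetical 1-D stand-in for a packing score: global minimum 0 at x = 2."""
    d = x - 2.0
    return d * d + 1.0 - math.cos(5.0 * d)


best_x, best_fx = multi_restart(toy_objective, n_restarts=30, n_iters=3000)
print(f"best x = {best_x:.4f}, best value = {best_fx:.6f}")
```

For the real problem the state would be a full tree configuration for one N, and the budgets would be the 50-100 restarts and 10,000-20,000 iterations listed above.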

In [None]:
# Let's analyze what we have and what we need
import sys
sys.path.insert(0, '/home/code')
import pandas as pd
import numpy as np
import os
import glob

# Load our best submission
exp010 = pd.read_csv('/home/code/experiments/010_safe_ensemble/submission.csv')
print(f"exp_010 rows: {len(exp010)}")

# Check what optimizers we have available
print("\nAvailable optimizers:")
for f in glob.glob('/home/code/experiments/*.cpp'):
    print(f"  {f}")
for f in glob.glob('/home/code/experiments/sa_*'):
    print(f"  {f}")

In [None]:
# Check the C++ optimizer we compiled
import subprocess

# Check if sa_parallel exists and works
sa_path = '/home/code/experiments/sa_parallel'
if os.path.exists(sa_path):
    print(f"sa_parallel exists at {sa_path}")
    result = subprocess.run([sa_path, '-h'], capture_output=True, text=True)
    print("Help output:")
    print(result.stdout[:500] if result.stdout else result.stderr[:500])
else:
    print("sa_parallel not found")

In [None]:
# Let's look at what bbox3 does differently
# From the kernel, bbox3 uses:
# - Complex number vector coordination
# - Fluid dynamics simulation
# - Hinge pivot mechanics
# - Density gradient flow
# - Global boundary tension

# Our sa_parallel only does basic SA with translation/rotation
# We need to implement more sophisticated moves

# Key insight: The gap is 1.49 points
# With 200 N values, that's ~0.0075 per N on average
# But improvements are NOT uniform - some N values have more room

# Let's analyze which N values have the most room for improvement
from code.tree_geometry import calculate_score
from code.utils import parse_submission

configs = parse_submission(exp010)
scores = {n: calculate_score(configs[n]) for n in range(1, 201)}

# Sort by score contribution, highest first
# (a high score is only a rough proxy for room to improve)
scores_sorted = sorted(scores.items(), key=lambda x: x[1], reverse=True)

print("Top 20 N values by score contribution:")
for n, score in scores_sorted[:20]:
    print(f"  N={n}: {score:.6f}")

In [None]:
# The theoretical minimum score for N=1 is when the tree is rotated
# to minimize its bounding box

# Let's compute the theoretical minimum for N=1
import numpy as np
from code.tree_geometry import TX, TY

def compute_bbox_for_angle(angle_deg):
    """Return the N=1 score (squared side of the bounding square) for one tree rotated by angle_deg."""
    angle_rad = np.radians(angle_deg)
    cos_a = np.cos(angle_rad)
    sin_a = np.sin(angle_rad)
    
    # Rotate all vertices
    rx = TX * cos_a - TY * sin_a
    ry = TX * sin_a + TY * cos_a
    
    # Compute bounding box
    width = rx.max() - rx.min()
    height = ry.max() - ry.min()
    side = max(width, height)
    return side * side  # Score for N=1

# Search for optimal angle
best_angle = 0
best_score = float('inf')
for angle in range(0, 360):
    score = compute_bbox_for_angle(angle)
    if score < best_score:
        best_score = score
        best_angle = angle

print(f"Coarse-grid minimum for N=1 (1° steps): {best_score:.6f} at angle {best_angle}°")
print(f"Current N=1 score: {scores[1]:.6f}")
print(f"Gap: {scores[1] - best_score:.6f}")

In [None]:
# Fine search around the best angle
best_fine_angle = best_angle
best_fine_score = best_score

for angle in np.arange(best_angle - 5, best_angle + 5, 0.01):
    score = compute_bbox_for_angle(angle)
    if score < best_fine_score:
        best_fine_score = score
        best_fine_angle = angle

print(f"Fine-tuned minimum for N=1: {best_fine_score:.6f} at angle {best_fine_angle:.2f}°")
print(f"Current N=1 score: {scores[1]:.6f}")
print(f"Gap: {scores[1] - best_fine_score:.6f}")

# What angle is the current N=1 at?
current_n1 = configs[1][0]
print(f"\nCurrent N=1 config: x={current_n1[0]:.6f}, y={current_n1[1]:.6f}, angle={current_n1[2]:.2f}°")

In [None]:
# Let's compute the theoretical lower bound for the entire problem
# This is the sum of minimum scores for each N

# For N=1, we know the minimum
# For N>1, the minimum is harder to compute, but we can estimate

# A rough lower bound: if trees could be packed perfectly,
# the area would be N * (area of one tree)
# The tree area is approximately 0.35 * 1.0 = 0.35 (rough estimate)

# Actually, let's compute the exact tree area
from shapely.geometry import Polygon
tree_poly = Polygon(zip(TX, TY))
tree_area = tree_poly.area
print(f"Tree area: {tree_area:.6f}")

# For N trees, minimum bounding box area would be at least N * tree_area
# But trees can't pack perfectly, so there's always wasted space

# The score is S^2/N where S is the side length
# If we could pack N trees in area A, then S = sqrt(A), score = A/N
# Minimum A = N * tree_area (perfect packing)
# So minimum score = tree_area (constant!)

print(f"\nTheoretical lower bound per N (perfect packing): {tree_area:.6f}")
print(f"Sum over N=1 to 200: {tree_area * 200:.6f}")

# But this is unrealistic - real packing has gaps
# Let's see what the actual scores look like
print(f"\nActual total score: {sum(scores.values()):.6f}")
print(f"Average score per N: {sum(scores.values())/200:.6f}")

In [None]:
# The key insight: we need to run bbox3 or similar optimizer AGGRESSIVELY
# Let's check if we have bbox3 available

bbox3_paths = [
    '/home/code/experiments/bbox3',
    '/home/code/bbox3',
    '/home/nonroot/snapshots/santa-2025/*/bbox3',
]

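found_any = False
for pattern in bbox3_paths:
    matches = glob.glob(pattern)
    if matches:
        found_any = True
        print(f"Found bbox3: {matches}")

# Check research/kernels for bbox3 (source or binary)
for f in glob.glob('/home/code/research/kernels/**/bbox3*', recursive=True):
    found_any = True
    print(f"Found in kernels: {f}")

# Report explicitly when nothing matched, so an empty output is not ambiguous
if not found_any:
    print("bbox3 not found in any of the checked locations")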

## Strategy for Next Experiment

### Option 1: Compile and Run bbox3 Aggressively
The bbox3.cpp from the "Why Not" kernel has sophisticated optimization:
- Complex number vector coordination
- Fluid dynamics simulation
- Hinge pivot mechanics
- Density gradient flow
- Global boundary tension

We should:
1. Compile bbox3.cpp
2. Run it with 50-100 restarts per N
3. Use 10,000-20,000 iterations per restart
4. Keep best solution across all restarts
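
Step 4 can be applied per N across the outputs of separate runs. The sketch below assumes each run yields a dict mapping N to a configuration and that a scoring function is available (in this notebook that role is played by `calculate_score`). `merge_best` and `min_gain` are hypothetical names, and the toy configurations are plain number lists scored with `sum`. It does not check for overlaps, so validity has to be verified separately.

```python
def merge_best(current, candidate, score_fn, min_gain=0.0):
    """Per-N merge: take candidate[n] only if it beats current[n] by more than min_gain."""
    merged = dict(current)
    improved = []
    for n, cfg in candidate.items():
        if n not in merged or score_fn(cfg) < score_fn(merged[n]) - min_gain:
            merged[n] = cfg
            improved.append(n)
    return merged, improved


# Toy data: each "configuration" is a list of numbers and the score is their sum.
current = {1: [0.70], 2: [0.50, 0.02]}
run_a = {1: [0.66], 2: [0.60]}

merged, improved = merge_best(current, run_a, score_fn=sum)
print("improved N values:", improved)
print("merged:", merged)
```

Setting `min_gain` above zero is one way to ignore gains that are too small to be worth the risk.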

### Option 2: Implement Novel Algorithm
If only a precompiled bbox3 binary is available (no usable bbox3.cpp source), we could implement our own version:
- No-Fit Polygon (NFP) for collision detection
- Simulated annealing with sophisticated moves
- Genetic algorithm with crossover operators
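
Any custom optimizer needs a collision check before it can accept a move. The sketch below is not a No-Fit Polygon. It is a simpler pairwise overlap test for simple (possibly non-convex) polygons given as vertex lists, combining an edge-crossing check with a containment check. All function names are hypothetical, and polygons that merely touch along an edge or at a vertex are not handled specially.

```python
def _orient(p, q, r):
    """Twice the signed area of triangle pqr (positive if counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _edges(poly):
    """Consecutive vertex pairs, closing the polygon back to its first vertex."""
    return [(poly[i], poly[(i + 1) % len(poly)]) for i in range(len(poly))]


def _segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    d1, d2 = _orient(a, b, c), _orient(a, b, d)
    d3, d4 = _orient(c, d, a), _orient(c, d, b)
    return d1 * d2 < 0 and d3 * d4 < 0


def _point_inside(pt, poly):
    """Ray-casting point-in-polygon test."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in _edges(poly):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygons_overlap(a, b):
    """True if two simple polygons share interior area."""
    for p, q in _edges(a):
        for r, s in _edges(b):
            if _segments_cross(p, q, r, s):
                return True
    # No edges cross: the polygons overlap only if one contains the other.
    return _point_inside(a[0], b) or _point_inside(b[0], a)


def square(x, y, size):
    """Axis-aligned square with lower-left corner (x, y), counter-clockwise."""
    return [(x, y), (x + size, y), (x + size, y + size), (x, y + size)]


print(polygons_overlap(square(0, 0, 1), square(0.5, 0.5, 1)))  # partial overlap
print(polygons_overlap(square(0, 0, 1), square(3, 3, 1)))      # far apart
print(polygons_overlap(square(0, 0, 4), square(1, 1, 1)))      # containment
```

The pairwise test costs O(n·m) in the edge counts, which is fine for a sketch; an NFP or a spatial index would be the natural next step if collision checks dominate the runtime.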

### Option 3: Focus on High-Impact N Values
From the analysis:
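- Large N values (150-200) have the highest scores
- These are the most likely to have room for improvement, though a high score is only a proxy; the gap to the perfect-packing bound (the tree area) is the better measure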
- Focus optimization effort on these N values
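
One way to turn this into a concrete plan is to rank N values by headroom (score minus the perfect-packing bound) and split a restart budget in proportion. The sketch below uses made-up scores and a made-up lower bound purely for illustration; `rank_by_headroom` and `allocate_restarts` are hypothetical helpers, not part of the existing code.

```python
def rank_by_headroom(scores, lower_bound):
    """Return (n, headroom) pairs sorted by headroom, largest first."""
    headroom = {n: max(0.0, s - lower_bound) for n, s in scores.items()}
    return sorted(headroom.items(), key=lambda kv: kv[1], reverse=True)


def allocate_restarts(scores, lower_bound, total_restarts):
    """Split a restart budget across N values in proportion to headroom."""
    ranked = rank_by_headroom(scores, lower_bound)
    total = sum(h for _, h in ranked)
    if total == 0:
        # No measurable headroom anywhere: fall back to an even split.
        return {n: total_restarts // len(ranked) for n, _ in ranked}
    return {n: round(total_restarts * h / total) for n, h in ranked}


# Illustrative numbers only; real values come from `scores` and `tree_area`.
toy_scores = {10: 0.40, 50: 0.36, 150: 0.38}
toy_lower_bound = 0.30

ranked = rank_by_headroom(toy_scores, toy_lower_bound)
budget = allocate_restarts(toy_scores, toy_lower_bound, total_restarts=120)
print("ranked:", ranked)
print("restart budget:", budget)
```

Because each share is rounded independently, the allocated restarts can differ from the total by a few units.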

### Recommendation
**Option 1 is the fastest path to improvement.**
Compile bbox3.cpp and run it aggressively. This is what top kernels do.