# Loop 3 Analysis: Understanding the Gap to Target

## Key Findings from Web Search:
- **Current LB Leader**: 71.19 (terry_u16)
- **Our Score**: 70.676 (saspav baseline)
- **Target**: 68.922808

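**CRITICAL INSIGHT**: Lower scores are better, so the target (68.92) is BETTER than the current leaderboard leader (71.19)!
This means the target may be the theoretical optimum or a very aggressive goal. Our own baseline (70.676) also beats the quoted leader, so the 71.19 figure from the web search is probably stale.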

In [1]:
import pandas as pd
import numpy as np
import json

# Load session state to understand what we've tried
with open('/home/code/session_state.json', 'r') as f:
    state = json.load(f)

print("=== EXPERIMENTS SUMMARY ===")
for exp in state['experiments']:
    print(f"{exp['id']}: {exp['name']} | CV: {exp['cv_score']} | Notes: {exp['notes'][:100]}...")

print("\n=== SUBMISSIONS ===")
for sub in state['submissions']:
    print(f"{sub['model_name']}: CV={sub['cv_score']}, LB={sub['lb_score']}, Error={sub.get('error', 'None')}")

=== EXPERIMENTS SUMMARY ===
exp_000: 001_baseline_preoptimized | CV: 70.676102 | Notes: Baseline using pre-optimized saspav dataset (santa-2025-csv). Downloaded pre-optimized solutions fro...
exp_001: 002_ensemble_all_sources | CV: 70.676102 | Notes: Attempted to ensemble 20 different pre-optimized solutions from multiple sources (saspav, bucket-of-...
exp_002: 003_lattice_backward_prop | CV: 70.676102 | Notes: Attempted two approaches: (1) Backward propagation from N=200 to N=2 - showed 0 improvements, confir...

=== SUBMISSIONS ===
001_baseline_preoptimized: CV=70.676102, LB=70.676102398091, Error=None
002_ensemble_all_sources: CV=70.676102, LB=, Error=Overlapping trees in group 040


In [2]:
# Analyze the score breakdown
# From previous analysis: Small N (1-50): 19.04, Medium N (51-100): 17.64, Large N (101-200): 33.99

print("=== SCORE BREAKDOWN ===")
print("Small N (1-50):    19.04 points (26.9%)")
print("Medium N (51-100): 17.64 points (25.0%)")
print("Large N (101-200): 33.99 points (48.1%)")
print("Total:             70.676 points")
print()
print("=== GAP ANALYSIS ===")
print(f"Current score: 70.676")
print(f"Target score:  68.923")
print(f"Gap:           {70.676 - 68.923:.3f} points")
print()
print("To reach the target, we need to reduce the score by 1.75 points (2.5%)")
print()
print("=== LEADERBOARD CONTEXT ===")
print("LB Leader (terry_u16): 71.19")
print("Our score:             70.68")
print("Target:                68.92")
print()
print("We are ALREADY BETTER than the LB leader!")
print("The target is 2.27 points BETTER than the LB leader.")
print("This suggests the target may be theoretical optimum or very aggressive.")

=== SCORE BREAKDOWN ===
Small N (1-50):    19.04 points (26.9%)
Medium N (51-100): 17.64 points (25.0%)
Large N (101-200): 33.99 points (48.1%)
Total:             70.676 points

=== GAP ANALYSIS ===
Current score: 70.676
Target score:  68.923
Gap:           1.753 points

To reach the target, we need to reduce the score by 1.75 points (2.5%)

=== LEADERBOARD CONTEXT ===
LB Leader (terry_u16): 71.19
Our score:             70.68
Target:                68.92

We are ALREADY BETTER than the LB leader!
The target is 2.27 points BETTER than the LB leader.
This suggests the target may be theoretical optimum or very aggressive.
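
Each group's score is side² / N (see `calculate_score` later in this notebook), so a relative score reduction corresponds to a smaller relative reduction in side length. The sketch below works through that arithmetic under the simplifying assumption, made purely for illustration, that every N shrinks by the same factor:

```python
import math

current_score = 70.676102   # baseline score from this notebook
target_score = 68.922808    # stated target

# Score per group is side**2 / N, so if every side shrinks by a common
# factor k, the total score shrinks by k**2 (uniform-shrink assumption).
score_ratio = target_score / current_score
side_ratio = math.sqrt(score_ratio)

print(f"Required score reduction: {(1 - score_ratio) * 100:.2f}%")
print(f"Equivalent uniform side-length reduction: {(1 - side_ratio) * 100:.2f}%")
```

Under that assumption the 2.5% score gap amounts to roughly a 1.25% reduction in every bounding-square side. Real gains are unlikely to be uniform across N, so this is only a scale estimate.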


In [3]:
# What techniques have we tried?
print("=== TECHNIQUES TRIED ===")
print("1. Pre-optimized saspav solution (baseline) - Score: 70.676")
print("2. Ensemble of 20 solutions - No improvement (saspav best for all N)")
print("3. Backward propagation - 0 improvements")
print("4. Lattice construction - Worse than baseline (0.385 vs 0.349 for N=72)")
print()
print("=== TECHNIQUES NOT YET TRIED ===")
print("1. Full Simulated Annealing with proper parameters (egortrushin approach)")
print("2. Rebuild from corners (chistyakov approach)")
print("3. Fractional translation optimization (jonathanchan approach)")
print("4. C++ bbox3 optimizer (need to compile from source)")
print("5. Multi-start random initialization + SA")
print("6. Genetic algorithm crossover")

=== TECHNIQUES TRIED ===
1. Pre-optimized saspav solution (baseline) - Score: 70.676
2. Ensemble of 20 solutions - No improvement (saspav best for all N)
3. Backward propagation - 0 improvements
4. Lattice construction - Worse than baseline (0.385 vs 0.349 for N=72)

=== TECHNIQUES NOT YET TRIED ===
1. Full Simulated Annealing with proper parameters (egortrushin approach)
2. Rebuild from corners (chistyakov approach)
3. Fractional translation optimization (jonathanchan approach)
4. C++ bbox3 optimizer (need to compile from source)
5. Multi-start random initialization + SA
6. Genetic algorithm crossover


## Key Insight: We Need to Try the Full SA Optimization

The evaluator correctly identified that our lattice construction was INCOMPLETE:
- We used binary search for translations (too simplistic)
- We did NOT apply SA to optimize the lattice
- The egortrushin kernel runs SA for 10,000+ iterations
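
For reference, a minimal generic SA loop might look like the sketch below. It is not the egortrushin kernel: the cooling schedule, the parameter defaults, and the toy 1-D objective are all assumptions chosen for illustration, and a real run would swap in a packing-specific cost and move generator.

```python
import math
import random

def simulated_annealing(cost, neighbor, x0, n_iters=10_000,
                        t_start=1.0, t_end=1e-3, seed=0):
    """Generic SA loop with geometric cooling and Metropolis acceptance."""
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best_x, best_fx = x, fx
    for i in range(n_iters):
        # Geometric cooling from t_start down to t_end
        t = t_start * (t_end / t_start) ** (i / max(n_iters - 1, 1))
        cand = neighbor(x, rng)
        f_cand = cost(cand)
        delta = f_cand - fx
        # Always accept improvements; accept worse moves with prob exp(-delta / t)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = cand, f_cand
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx

# Toy stand-in objective (NOT the packing score): a bumpy 1-D function
def toy_cost(x):
    return x * x + 2.0 * math.sin(5.0 * x)

# Hypothetical move generator: a small random step
def toy_neighbor(x, rng):
    return x + rng.uniform(-0.5, 0.5)

start = 4.0
best_x, best_cost = simulated_annealing(toy_cost, toy_neighbor, start)
print(f"start cost={toy_cost(start):.4f}, best cost={best_cost:.4f} at x={best_x:.4f}")
```

Tracking the best state separately from the current state means the returned cost can never be worse than the starting cost, even if late high-temperature moves wander away from a good solution.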

## New Technique: Rebuild from Corners (chistyakov)

This technique:
1. Takes a large N configuration
2. For each corner, sorts trees by distance from that corner
3. Reconstructs smaller N configurations from the closest trees
4. If the reconstructed configuration beats the existing solution for that N, keep it

This is DIFFERENT from backward propagation because:
- Backward prop removes trees from the boundary
- Rebuild from corners keeps the trees closest to a specific corner
- This can find better subsets that weren't discovered by backward prop
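
The selection step can be sketched on plain bounding boxes, without Shapely. The helper names `order_by_corner` and `prefix_sides` are hypothetical, and the 2x2 grid of unit boxes is a made-up stand-in for a large-N layout:

```python
def order_by_corner(boxes, corner):
    """Sort boxes (minx, miny, maxx, maxy) by the Chebyshev distance of
    their farthest edge from the given corner."""
    cx, cy = corner

    def dist(b):
        return max(abs(b[0] - cx), abs(b[2] - cx), abs(b[1] - cy), abs(b[3] - cy))

    # A list sort keeps boxes that tie on distance
    return sorted(boxes, key=dist)

def prefix_sides(ordered):
    """Bounding-square side of the first k boxes, for k = 1..len(ordered)."""
    sides = []
    minx = miny = float("inf")
    maxx = maxy = float("-inf")
    for b in ordered:
        minx, miny = min(minx, b[0]), min(miny, b[1])
        maxx, maxy = max(maxx, b[2]), max(maxy, b[3])
        sides.append(max(maxx - minx, maxy - miny))
    return sides

# Hypothetical 2x2 grid of unit boxes standing in for a large-N layout
boxes = [(0, 0, 1, 1), (1, 0, 2, 1), (0, 1, 1, 2), (1, 1, 2, 2)]
ordered = order_by_corner(boxes, corner=(0, 0))
sides = prefix_sides(ordered)
print(ordered[0], sides)
```

Each prefix of the ordering is a candidate layout for a smaller N, and its side length is what gets compared against the existing solution. Three of the four boxes here tie on distance, which is why the ordering is built with a list sort rather than a mapping keyed by distance.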

In [4]:
# Let's implement the rebuild from corners technique
from decimal import Decimal, getcontext
from shapely import affinity
from shapely.geometry import Polygon
from shapely.ops import unary_union
import warnings
warnings.filterwarnings('ignore')

getcontext().prec = 25
scale_factor = Decimal("1")

class ChristmasTree:
    def __init__(self, center_x='0', center_y='0', angle='0'):
        self.center_x = Decimal(str(center_x))
        self.center_y = Decimal(str(center_y))
        self.angle = Decimal(str(angle))

        trunk_w = Decimal('0.15')
        trunk_h = Decimal('0.2')
        base_w = Decimal('0.7')
        mid_w = Decimal('0.4')
        top_w = Decimal('0.25')
        tip_y = Decimal('0.8')
        tier_1_y = Decimal('0.5')
        tier_2_y = Decimal('0.25')
        base_y = Decimal('0.0')
        trunk_bottom_y = -trunk_h

        initial_polygon = Polygon([
            (Decimal('0.0') * scale_factor, tip_y * scale_factor),
            (top_w / Decimal('2') * scale_factor, tier_1_y * scale_factor),
            (top_w / Decimal('4') * scale_factor, tier_1_y * scale_factor),
            (mid_w / Decimal('2') * scale_factor, tier_2_y * scale_factor),
            (mid_w / Decimal('4') * scale_factor, tier_2_y * scale_factor),
            (base_w / Decimal('2') * scale_factor, base_y * scale_factor),
            (trunk_w / Decimal('2') * scale_factor, base_y * scale_factor),
            (trunk_w / Decimal('2') * scale_factor, trunk_bottom_y * scale_factor),
            (-(trunk_w / Decimal('2')) * scale_factor, trunk_bottom_y * scale_factor),
            (-(trunk_w / Decimal('2')) * scale_factor, base_y * scale_factor),
            (-(base_w / Decimal('2')) * scale_factor, base_y * scale_factor),
            (-(mid_w / Decimal('4')) * scale_factor, tier_2_y * scale_factor),
            (-(mid_w / Decimal('2')) * scale_factor, tier_2_y * scale_factor),
            (-(top_w / Decimal('4')) * scale_factor, tier_1_y * scale_factor),
            (-(top_w / Decimal('2')) * scale_factor, tier_1_y * scale_factor),
        ])
        rotated = affinity.rotate(initial_polygon, float(self.angle), origin=(0, 0))
        self.polygon = affinity.translate(rotated,
                                          xoff=float(self.center_x * scale_factor),
                                          yoff=float(self.center_y * scale_factor))

    def clone(self):
        return ChristmasTree(str(self.center_x), str(self.center_y), str(self.angle))

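def get_side_length(tree_list):
    # Side of the axis-aligned bounding square. Combining per-polygon bounds
    # gives the same result as unary_union(...).bounds without building the union.
    boxes = [t.polygon.bounds for t in tree_list]
    width = max(b[2] for b in boxes) - min(b[0] for b in boxes)
    height = max(b[3] for b in boxes) - min(b[1] for b in boxes)
    return Decimal(max(width, height)) / scale_factor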

def calculate_score(trees):
    side = get_side_length(trees)
    return float(side ** 2 / len(trees))

print("Classes defined successfully")

Classes defined successfully


In [5]:
# Load the saspav solution
def load_submission(filepath):
    df = pd.read_csv(filepath)
    solutions = {}
    for n in range(1, 201):
        group_data = df[df['id'].str.startswith(f"{n:03d}_")]
        trees = []
        for _, row in group_data.iterrows():
            # Strip the 's' marker from each value before parsing it as a Decimal
            x = str(row['x']).replace('s', '')
            y = str(row['y']).replace('s', '')
            deg = str(row['deg']).replace('s', '')
            trees.append(ChristmasTree(x, y, deg))
        solutions[n] = trees
    return solutions

print("Loading saspav solution...")
solutions = load_submission('/home/code/santa-2025-csv/santa-2025.csv')
print(f"Loaded {len(solutions)} configurations")

# Calculate baseline score
baseline_score = sum(calculate_score(solutions[n]) for n in range(1, 201))
print(f"Baseline score: {baseline_score:.6f}")

Loading saspav solution...


Loaded 200 configurations


Baseline score: 70.676102
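
The small/medium/large breakdown quoted earlier was copied from a previous analysis. A sketch of how it could be recomputed from per-N scores follows; `bucket_breakdown` is a hypothetical helper and the flat demo scores are made up so the block runs on its own:

```python
def bucket_breakdown(per_n_scores, buckets=((1, 50), (51, 100), (101, 200))):
    """Sum per-N scores over inclusive N ranges.

    Returns {(lo, hi): (bucket_total, share_of_grand_total)}.
    """
    grand_total = sum(per_n_scores.values())
    result = {}
    for lo, hi in buckets:
        total = sum(s for n, s in per_n_scores.items() if lo <= n <= hi)
        result[(lo, hi)] = (total, total / grand_total)
    return result

# Hypothetical per-N scores (a flat 0.35 each) just to exercise the function.
# In this notebook the input would be:
#   {n: calculate_score(solutions[n]) for n in range(1, 201)}
demo_scores = {n: 0.35 for n in range(1, 201)}
breakdown = bucket_breakdown(demo_scores)
for (lo, hi), (total, share) in breakdown.items():
    print(f"N {lo}-{hi}: {total:.2f} points ({share:.1%})")
```

Computing the buckets from the loaded solutions keeps the breakdown in sync with whatever layouts are currently in `solutions`, rather than relying on numbers carried over from an earlier loop.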


In [None]:
# Implement rebuild from corners
def rebuild_from_corners(solutions, verbose=True):
    """Try to find better small N configurations from large N layouts."""
    improvements = 0
    
    # For each large N configuration
    for large_n in range(50, 201):  # Start from N=50 to find improvements for smaller N
        layout = solutions[large_n]
        all_polygons = [t.polygon for t in layout]
        bounds = unary_union(all_polygons).bounds
        
        # Try each corner
        for corner_x, corner_y in [(bounds[0], bounds[1]), (bounds[0], bounds[3]), 
                                    (bounds[2], bounds[1]), (bounds[2], bounds[3])]:
            # Rank trees by Chebyshev distance from the corner to the farthest
            # edge of each tree's bounding box. Sorting a list (rather than a
            # dict keyed by distance) keeps trees that tie on distance.
            def corner_distance(tree):
                minx, miny, maxx, maxy = tree.polygon.bounds
                return max(abs(minx - corner_x), abs(maxx - corner_x),
                           abs(miny - corner_y), abs(maxy - corner_y))

            ordered = sorted(layout, key=corner_distance)

            # Reconstruct smaller layouts from the closest trees outward
            layout_new = []
            for tree in ordered:
                layout_new.append(tree.clone())
                small_n = len(layout_new)
                
                if small_n >= large_n:
                    break
                
                # Compare with existing solution
                current_side = get_side_length(solutions[small_n])
                candidate_side = get_side_length(layout_new)
                
                if candidate_side < current_side - Decimal('1e-8'):
                    if verbose:
                        print(f"IMPROVEMENT! N={small_n} from N={large_n}: {float(current_side):.8f} -> {float(candidate_side):.8f}")
                    solutions[small_n] = [t.clone() for t in layout_new]
                    improvements += 1
    
    return improvements

print("\nApplying rebuild from corners...")
improvements = rebuild_from_corners(solutions, verbose=True)
print(f"\nTotal improvements: {improvements}")

# Calculate new score
new_score = sum(calculate_score(solutions[n]) for n in range(1, 201))
print(f"\nBaseline score: {baseline_score:.6f}")
print(f"After rebuild:  {new_score:.6f}")
print(f"Improvement:    {baseline_score - new_score:.6f}")