# Evolver Loop 13 Analysis

## Key Insights from Research

1. **Chistyakov's "Tree Removal" Technique**: For each large-N layout, remove one tree at a time (only trees touching the bounding box) and check whether the remaining N-1 trees beat the current best solution for N-1.

2. **BackPacking (crodoc)**: A similar approach that propagates good configurations backward from larger N to smaller N.

3. **Current Status**:
   - Best score: 70.630478 (from saspav_best.csv ensemble)
   - Target: 68.919154
   - Gap: 1.711 points (2.42% of the current best score)

4. **What's been tried (14 experiments)**:
   - Ensemble from multiple sources ✓
   - bbox3 optimization (produces overlaps)
   - sa_v1_parallel (produces overlaps)
   - Grid-based approaches (zaburo, tessellation) - worse
   - Constructive heuristics (scanline, lattice, chebyshev, BL) - worse
   - Random restart SA - no improvement
   - Long-running SA (15 generations) - no improvement
   - Basin hopping - no improvement
   - GA with crossover - no improvement

5. **What HASN'T been tried**:
   - Chistyakov's tree removal technique
   - Per-N focused analysis (which N values have most room for improvement?)
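
Both the tree-removal technique and BackPacking come down to the same loop: take a layout for N, drop one boundary tree, and keep the result if it beats the stored layout for N-1. The sketch below is a minimal illustration of that loop, not Chistyakov's or crodoc's actual code. To stay independent of shapely, it assumes each tree is represented only by its axis-aligned bounds `(minx, miny, maxx, maxy)`; the names `overall_bounds`, `layout_score`, `boundary_indices`, `best_removal`, and `propagate_backward` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Bounds = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy) of one tree


def overall_bounds(boxes: List[Bounds]) -> Bounds:
    """Axis-aligned bounding box of all trees."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def layout_score(boxes: List[Bounds]) -> float:
    """Per-N score: squared side of the enclosing square divided by N."""
    minx, miny, maxx, maxy = overall_bounds(boxes)
    side = max(maxx - minx, maxy - miny)
    return side ** 2 / len(boxes)


def boundary_indices(boxes: List[Bounds], tol: float = 1e-9) -> List[int]:
    """Indices of trees whose bounds touch the overall bounding box."""
    outer = overall_bounds(boxes)
    return [
        i for i, b in enumerate(boxes)
        if any(abs(b[k] - outer[k]) <= tol for k in range(4))
    ]


def best_removal(boxes: List[Bounds]) -> Optional[Tuple[int, float]]:
    """(index, score) of the boundary tree whose removal scores best for N-1."""
    if len(boxes) < 2:
        return None
    best = None
    for i in boundary_indices(boxes):
        score = layout_score(boxes[:i] + boxes[i + 1:])
        if best is None or score < best[1]:
            best = (i, score)
    return best


def propagate_backward(layouts: Dict[int, List[Bounds]]) -> Dict[int, List[Bounds]]:
    """Walk N downward; replace layout N-1 when a removal from layout N beats it."""
    result = dict(layouts)
    for n in sorted(result, reverse=True):
        if n - 1 not in result:
            continue
        candidate = best_removal(result[n])
        if candidate is None:
            continue
        idx, score = candidate
        if score < layout_score(result[n - 1]):
            result[n - 1] = result[n][:idx] + result[n][idx + 1:]
    return result
```

Only boundary trees are tried because removing an interior tree leaves the bounding box unchanged. Removing a tree from an overlap-free layout cannot create an overlap, so the sketch needs no collision check. A real implementation would use the actual polygons rather than per-tree bounds.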

In [None]:
import pandas as pd
import numpy as np
from decimal import Decimal, getcontext
from shapely import affinity
from shapely.geometry import Polygon
from shapely.ops import unary_union

getcontext().prec = 25
scale_factor = Decimal('1e15')

print('Libraries loaded')

In [None]:
class ChristmasTree:
    def __init__(self, center_x='0', center_y='0', angle='0'):
        self.center_x = Decimal(center_x)
        self.center_y = Decimal(center_y)
        self.angle = Decimal(angle)

        trunk_w = Decimal('0.15')
        trunk_h = Decimal('0.2')
        base_w = Decimal('0.7')
        mid_w = Decimal('0.4')
        top_w = Decimal('0.25')
        tip_y = Decimal('0.8')
        tier_1_y = Decimal('0.5')
        tier_2_y = Decimal('0.25')
        base_y = Decimal('0.0')
        trunk_bottom_y = -trunk_h

        initial_polygon = Polygon([
            (Decimal('0.0') * scale_factor, tip_y * scale_factor),
            (top_w / Decimal('2') * scale_factor, tier_1_y * scale_factor),
            (top_w / Decimal('4') * scale_factor, tier_1_y * scale_factor),
            (mid_w / Decimal('2') * scale_factor, tier_2_y * scale_factor),
            (mid_w / Decimal('4') * scale_factor, tier_2_y * scale_factor),
            (base_w / Decimal('2') * scale_factor, base_y * scale_factor),
            (trunk_w / Decimal('2') * scale_factor, base_y * scale_factor),
            (trunk_w / Decimal('2') * scale_factor, trunk_bottom_y * scale_factor),
            (-(trunk_w / Decimal('2')) * scale_factor, trunk_bottom_y * scale_factor),
            (-(trunk_w / Decimal('2')) * scale_factor, base_y * scale_factor),
            (-(base_w / Decimal('2')) * scale_factor, base_y * scale_factor),
            (-(mid_w / Decimal('4')) * scale_factor, tier_2_y * scale_factor),
            (-(mid_w / Decimal('2')) * scale_factor, tier_2_y * scale_factor),
            (-(top_w / Decimal('4')) * scale_factor, tier_1_y * scale_factor),
            (-(top_w / Decimal('2')) * scale_factor, tier_1_y * scale_factor),
        ])
        rotated = affinity.rotate(initial_polygon, float(self.angle), origin=(0, 0))
        self.polygon = affinity.translate(rotated,
                                          xoff=float(self.center_x * scale_factor),
                                          yoff=float(self.center_y * scale_factor))

    def clone(self):
        return ChristmasTree(str(self.center_x), str(self.center_y), str(self.angle))

print('ChristmasTree class defined')
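
As a sanity check on the geometry, the area of the unrotated, unscaled polygon can be computed with the shoelace formula. The sketch below copies the vertex coordinates implied by the constants in `ChristmasTree` and uses `fractions.Fraction` so the result is exact. `TREE_VERTICES` and `shoelace_area` are hypothetical names introduced here.

```python
from fractions import Fraction as F

# Right half of the outline, from the tip down to the trunk's bottom-right corner.
TREE_VERTICES = [
    (F('0'), F('0.8')),        # tip
    (F('0.125'), F('0.5')),    # top_w / 2 at tier_1_y
    (F('0.0625'), F('0.5')),   # top_w / 4 at tier_1_y
    (F('0.2'), F('0.25')),     # mid_w / 2 at tier_2_y
    (F('0.1'), F('0.25')),     # mid_w / 4 at tier_2_y
    (F('0.35'), F('0')),       # base_w / 2 at base_y
    (F('0.075'), F('0')),      # trunk_w / 2 at base_y
    (F('0.075'), F('-0.2')),   # trunk_w / 2 at trunk_bottom_y
]
# The left half mirrors the right half (tip excluded), in reverse order.
TREE_VERTICES += [(-x, y) for x, y in reversed(TREE_VERTICES[1:])]


def shoelace_area(vertices):
    """Area of a simple polygon from its vertices, independent of orientation."""
    total = F(0)
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


tree_area = shoelace_area(TREE_VERTICES)
print(tree_area, float(tree_area))
```

Under these assumptions the area is 393/1600 = 0.245625. Since N non-overlapping trees must fit inside the enclosing square, no layout can score below this value for any N.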

In [None]:
def get_tree_list_side_length(tree_list):
    """Side of the smallest axis-aligned square enclosing all trees (unscaled units)."""
    all_polygons = [t.polygon for t in tree_list]
    bounds = unary_union(all_polygons).bounds
    return Decimal(max(bounds[2] - bounds[0], bounds[3] - bounds[1])) / scale_factor

def calculate_score(trees):
    """Per-N score: side**2 / N. Returns inf for an empty list."""
    if not trees:
        return float('inf')
    side = get_tree_list_side_length(trees)
    return float(side ** 2 / len(trees))

def load_trees(n, df):
    """Build ChristmasTree objects from the rows whose id starts with the zero-padded N."""
    group_data = df[df['id'].str.startswith(f'{n:03d}_')]
    trees = []
    for _, row in group_data.iterrows():
        # Values may carry a leading 's'; strip it before parsing as Decimal.
        x = str(row['x']).lstrip('s')
        y = str(row['y']).lstrip('s')
        deg = str(row['deg']).lstrip('s')
        trees.append(ChristmasTree(x, y, deg))
    return trees

print('Helper functions defined')
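
`load_trees` relies on two conventions visible in the code above: ids start with a zero-padded three-digit N followed by an underscore, and values may carry a leading `s` that has to be stripped before numeric parsing. The sketch below illustrates both with the standard library only. The sample rows are made up for illustration and do not come from `saspav_best.csv`, and `load_rows` is a hypothetical stand-in for `load_trees`.

```python
import csv
import io
from decimal import Decimal

# Hypothetical sample in the same column layout that load_trees expects.
SAMPLE_CSV = """id,x,y,deg
001_0,s0.0,s0.0,s45.0
002_0,s-0.25,s0.1,s90.0
002_1,s0.25,s-0.1,s270.0
"""


def load_rows(n, text):
    """Return (x, y, deg) Decimal tuples for the rows that belong to group n."""
    prefix = f'{n:03d}_'
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if row['id'].startswith(prefix):
            # Strip the leading 's' and parse the remaining text exactly.
            rows.append(tuple(Decimal(row[k].lstrip('s')) for k in ('x', 'y', 'deg')))
    return rows


print(load_rows(2, SAMPLE_CSV))
```

Parsing straight from the string to `Decimal` avoids the rounding that an intermediate `float` would introduce.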

In [None]:
# Load current best solution
current_best_df = pd.read_csv('/home/code/exploration/datasets/saspav_best.csv')

# Get current best scores for each N
current_scores = {}
for n in range(1, 201):
    trees = load_trees(n, current_best_df)
    current_scores[n] = calculate_score(trees)

print('Loaded current best scores for N=1-200')
print(f'Total current best: {sum(current_scores.values()):.6f}')

In [None]:
# Per-N analysis: Which N values have the most room for improvement?
# Calculate score contribution and efficiency for each N

analysis = []
for n in range(1, 201):
    score = current_scores[n]
    # Area lower bound: N non-overlapping trees inside a square of side L
    # need L**2 >= N * tree_area, so score = L**2 / N >= tree_area for every N.
    # tree_area follows from the ChristmasTree vertices:
    #   trunk        0.15 * 0.2               = 0.03
    #   bottom tier  (0.7 + 0.2) / 2 * 0.25   = 0.1125
    #   middle tier  (0.4 + 0.125) / 2 * 0.25 = 0.065625
    #   top tier     0.25 * 0.3 / 2           = 0.0375
    tree_area = 0.245625
    theoretical_min_side = np.sqrt(n * tree_area)
    theoretical_min_score = theoretical_min_side ** 2 / n
    
    efficiency = theoretical_min_score / score if score > 0 else 0
    
    analysis.append({
        'n': n,
        'score': score,
        'contribution': score,  # Each N contributes its score to total
        'theoretical_min': theoretical_min_score,
        'efficiency': efficiency,
        'gap': score - theoretical_min_score
    })

analysis_df = pd.DataFrame(analysis)
print('Per-N analysis:')
print(analysis_df.sort_values('gap', ascending=False).head(20))

In [None]:
# Which N values contribute most to the total score?
print('\nTop 20 N values by score contribution:')
print(analysis_df.sort_values('contribution', ascending=False).head(20)[['n', 'score', 'contribution']])

In [None]:
# Which N values have the worst efficiency (most room for improvement)?
print('\nTop 20 N values by worst efficiency (most room for improvement):')
print(analysis_df.sort_values('efficiency').head(20)[['n', 'score', 'efficiency', 'gap']])
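
Because the zero-waste bound sets side² = N × tree area, `theoretical_min_score` is the same constant for every N. Sorting by `gap`, by `contribution`, or by ascending `efficiency` therefore gives the same order. The sketch below shows this with made-up scores and a hypothetical constant `bound`; the numbers are illustrative, not results from the dataset.

```python
def rank(scores, key):
    """N values ordered from the largest to the smallest key(score)."""
    return sorted(scores, key=lambda n: key(scores[n]), reverse=True)


scores = {1: 0.66, 2: 0.45, 50: 0.36, 200: 0.34}  # hypothetical per-N scores
bound = 0.25                                       # hypothetical constant lower bound

by_contribution = rank(scores, lambda s: s)
by_gap = rank(scores, lambda s: s - bound)
# Worst efficiency first: efficiency = bound / s, so negate it to reuse rank().
by_worst_efficiency = rank(scores, lambda s: -bound / s)

print(by_contribution == by_gap == by_worst_efficiency)
```

In effect, the three tables above all identify the N values with the largest per-N score.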