# 🔄 Backward Iteration Approach (BackPacking) 🔄

This notebook implements a **backward iteration strategy** for optimizing the packing of Christmas trees into minimal bounding squares across all tree counts (1-200).

### 📋 Ensemble Creation

**Before optimization begins**, we:
1. 🔍 **Load ALL CSV files** in the workspace (submission.csv, previous runs, etc.)
2. 📊 **Compare solutions** for each tree count (n=1 to 200)
3. 🏆 **Select the best configuration** based on bounding square side length
4. 📦 **Create ensemble baseline** - the optimal starting point for optimization

This ensures we start from the best available solutions across all your previous work!
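
The selection step above can be sketched in a few lines. This is a minimal illustration, not the notebook's loader: the file names and side lengths are hypothetical, and each source is assumed to be already reduced to a `{n: side_length}` mapping. Because the score is side² / n and n is fixed within each comparison, picking the smallest side is the same as picking the best score.

```python
from decimal import Decimal

def build_ensemble(sources):
    """For each tree count n, keep the source with the smallest side length.

    `sources` maps a (hypothetical) file name to {n: side_length}.
    Returns {n: (best_file, best_side)}.
    """
    best = {}
    for name, sides in sources.items():
        for n, side in sides.items():
            # First candidate for this n, or strictly better than the current one
            if n not in best or side < best[n][1]:
                best[n] = (name, side)
    return best

# Hypothetical side lengths, for illustration only
sources = {
    "run_a.csv": {1: Decimal("0.90"), 2: Decimal("1.10"), 3: Decimal("1.40")},
    "run_b.csv": {1: Decimal("0.85"), 2: Decimal("1.20")},
}
ensemble = build_ensemble(sources)
for n in sorted(ensemble):
    print(n, ensemble[n])
```

A source that lacks a given n is simply never considered for it, which mirrors how the loader skips incomplete configurations.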

### 💡 Key Concept

Instead of optimizing each tree count independently, we:

1. **Start from n=200** (largest configuration) and iterate backward to n=1
2. **Track the best-performing configuration** based on bounding square side length
3. **Propagate successful patterns** by adapting high-performing configurations to smaller tree counts
4. When a configuration at n trees performs poorly, we **copy the best configuration found so far** and simply drop the extra trees
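
The four steps above can be sketched on toy data. This is a simplified illustration under stated assumptions: trees are reduced to bare points (so there is no polygon geometry or rotation), and `bounding_side` and `backward_pass` are hypothetical helper names, not functions from this notebook.

```python
def bounding_side(points):
    """Side of the smallest axis-aligned square that covers all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def backward_pass(configs):
    """Walk from the largest n down to the smallest.

    `configs` maps n -> list of n points. Returns (result, replaced), where
    `replaced` lists the n whose layout was taken from a larger configuration.
    """
    best_side = float("inf")
    best_layout = None
    result, replaced = {}, []
    for n in sorted(configs, reverse=True):
        layout = configs[n]
        side = bounding_side(layout)
        if side < best_side:
            # Strictly smaller square than anything seen so far: keep it
            best_side, best_layout = side, layout
            result[n] = layout
        else:
            # Not better than a larger configuration: reuse its first n items
            result[n] = best_layout[:n]
            replaced.append(n)
    return result, replaced

# Toy layouts: n=2 is deliberately worse than n=3
configs = {
    3: [(0, 0), (1, 0), (0, 1)],
    2: [(0, 0), (3, 0)],
    1: [(0, 0)],
}
result, replaced = backward_pass(configs)
print(replaced, result[2])
```

Dropping items from a layout can never enlarge its bounding box, so the reused layout is never worse than the best side found so far.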

### ✨ Why This Works

- 🎯 **Leverage optimal patterns**: Good packing arrangements at larger counts often remain efficient when trees are removed
- 🚀 **Avoid local minima**: Instead of getting stuck with a poor configuration, we adapt from proven successful layouts
- ⚡ **Computational efficiency**: Reusing configurations is faster than optimizing each count from scratch
- 🔗 **Consistency**: Maintains similar packing strategies across different tree counts

### 🎯 Expected Outcome

This approach can improve the score for any tree count whose original bounding square is at least as large as one already found for a larger tree count. I noticed this pattern while going through my tree visualizer notebook.

## 📊 Check out the outcome and visualizations at the end of the notebook!

## 📦 Setup and Imports

In [None]:
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from decimal import Decimal, getcontext
from shapely import affinity
from shapely.geometry import Polygon
from shapely.ops import unary_union
import glob
import os

pd.set_option('display.float_format', '{:.12f}'.format)
getcontext().prec = 25
scale_factor = Decimal('1e15')

## 🌲 ChristmasTree Class

Represents a single rotatable Christmas tree with trunk and three-tier design.
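
To make "rotatable" concrete, here is a small standard-library sketch of what happens to one vertex when a tree is placed: it is rotated about the origin and then translated to the tree's center. It assumes the same convention as `shapely.affinity.rotate` with default arguments (degrees, counter-clockwise positive); `place_vertex` is a hypothetical helper, not part of the class below.

```python
import math

def place_vertex(x, y, angle_deg, center_x, center_y):
    """Rotate (x, y) counter-clockwise about the origin, then translate."""
    t = math.radians(angle_deg)
    x_rot = x * math.cos(t) - y * math.sin(t)
    y_rot = x * math.sin(t) + y * math.cos(t)
    return x_rot + center_x, y_rot + center_y

# The tree tip sits at (0, 0.8) in the unrotated shape.
# Rotating by 90 degrees and moving the tree to (1, 2):
tip = place_vertex(0.0, 0.8, 90, 1.0, 2.0)
print(tip)
```

The class itself works on coordinates scaled by `scale_factor` and delegates the rotation and translation to shapely; the arithmetic per vertex is the same idea.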

In [None]:
class ChristmasTree:
    """Represents a single, rotatable Christmas tree of a fixed size."""

    def __init__(self, center_x='0', center_y='0', angle='0'):
        """Initializes the Christmas tree with a specific position and rotation."""
        self.center_x = Decimal(center_x)
        self.center_y = Decimal(center_y)
        self.angle = Decimal(angle)

        trunk_w = Decimal('0.15')
        trunk_h = Decimal('0.2')
        base_w = Decimal('0.7')
        mid_w = Decimal('0.4')
        top_w = Decimal('0.25')
        tip_y = Decimal('0.8')
        tier_1_y = Decimal('0.5')
        tier_2_y = Decimal('0.25')
        base_y = Decimal('0.0')
        trunk_bottom_y = -trunk_h

        initial_polygon = Polygon([
            (Decimal('0.0') * scale_factor, tip_y * scale_factor),
            (top_w / Decimal('2') * scale_factor, tier_1_y * scale_factor),
            (top_w / Decimal('4') * scale_factor, tier_1_y * scale_factor),
            (mid_w / Decimal('2') * scale_factor, tier_2_y * scale_factor),
            (mid_w / Decimal('4') * scale_factor, tier_2_y * scale_factor),
            (base_w / Decimal('2') * scale_factor, base_y * scale_factor),
            (trunk_w / Decimal('2') * scale_factor, base_y * scale_factor),
            (trunk_w / Decimal('2') * scale_factor, trunk_bottom_y * scale_factor),
            (-(trunk_w / Decimal('2')) * scale_factor, trunk_bottom_y * scale_factor),
            (-(trunk_w / Decimal('2')) * scale_factor, base_y * scale_factor),
            (-(base_w / Decimal('2')) * scale_factor, base_y * scale_factor),
            (-(mid_w / Decimal('4')) * scale_factor, tier_2_y * scale_factor),
            (-(mid_w / Decimal('2')) * scale_factor, tier_2_y * scale_factor),
            (-(top_w / Decimal('4')) * scale_factor, tier_1_y * scale_factor),
            (-(top_w / Decimal('2')) * scale_factor, tier_1_y * scale_factor),
        ])
        rotated = affinity.rotate(initial_polygon, float(self.angle), origin=(0, 0))
        self.polygon = affinity.translate(rotated,
                                          xoff=float(self.center_x * scale_factor),
                                          yoff=float(self.center_y * scale_factor))

## 📐 Metric Calculation & Ensemble Loading

Calculate the score for a given configuration: **(side_length)² / n_trees**

Also includes ensemble loading to find the best solution across all CSV files.
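
As a quick worked example of the formula, the snippet below computes per-configuration scores and their sum from a few made-up side lengths. The numbers are illustrative only, and `score_from_side` is a hypothetical helper that takes the side length directly instead of a list of trees.

```python
from decimal import Decimal

def score_from_side(side, n_trees):
    """Normalized score for one configuration: side^2 / n_trees."""
    return side * side / n_trees

# Made-up side lengths for three tree counts
sides = {1: Decimal("1"), 2: Decimal("1.5"), 4: Decimal("2")}
scores = {n: score_from_side(side, n) for n, side in sides.items()}
total = sum(scores.values())

for n, s in scores.items():
    print(f"n={n}: score={s}")
print(f"total={total}")
```

A square of side 2 holding 4 trees scores the same as a square of side 1 holding 1 tree, which is why the metric is comparable across tree counts.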

In [None]:
def calculate_side_length(trees):
    """Calculate the bounding square side length for a list of trees."""
    if not trees:
        return Decimal('0')
    
    all_polygons = [t.polygon for t in trees]
    bounds = unary_union(all_polygons).bounds
    
    minx = Decimal(bounds[0]) / scale_factor
    miny = Decimal(bounds[1]) / scale_factor
    maxx = Decimal(bounds[2]) / scale_factor
    maxy = Decimal(bounds[3]) / scale_factor
    
    width = maxx - minx
    height = maxy - miny
    side_length = max(width, height)
    
    return side_length

def calculate_score(trees, n_trees):
    """Calculate normalized score: side_length^2 / n_trees."""
    side = calculate_side_length(trees)
    return float(side * side / n_trees)

def load_all_csv_solutions():
    """
    Load all CSV files in the workspace and create an ensemble solution.
    For each configuration (n), picks the best solution across all CSV files.
    
    Returns: DataFrame with best configurations for each n
    """
    print("\n📂 Loading ensemble from CSV files...")
    
    # Find all CSV files
    csv_files = glob.glob('/kaggle/input/*/*.csv')
    csv_files = [f for f in csv_files if os.path.isfile(f)]
    
    if not csv_files:
        print("⚠️  No CSV files found!")
        return None
    
    print(f"Found {len(csv_files)} CSV files")
    
    # Load all solutions
    all_solutions = {}  # file -> DataFrame
    
    for csv_path in csv_files:
        try:
            # Keep x, y, deg as strings to preserve precision
            df_temp = pd.read_csv(csv_path, index_col=0, dtype={'x': str, 'y': str, 'deg': str})
            
            # Remove 's' prefix from values if present, but keep as strings
            for col in ['x', 'y', 'deg']:
                df_temp[col] = df_temp[col].astype(str).str.lstrip('s')
            
            all_solutions[csv_path] = df_temp
            
        except Exception as e:
            print(f"  ⚠️  {csv_path}: {e}")
    
    if not all_solutions:
        print("⚠️  No valid solutions loaded!")
        return None
    
    print(f"✅ Loaded {len(all_solutions)} valid CSV files")
    
    # Create ensemble: for each n, pick the best solution across all files
    print("🔍 Building ensemble...")
    
    # Get all unique tree counts
    all_tree_counts = set()
    for df_temp in all_solutions.values():
        tree_counts = df_temp.index.str.split('_').str[0].astype(int).unique()
        all_tree_counts.update(tree_counts)
    
    ensemble_rows = []
    
    for n in sorted(all_tree_counts):
        best_score = float('inf')
        best_config = None
        best_source = None
        
        # Compare all solutions for this n
        for csv_path, df_temp in all_solutions.items():
            indices = [f'{n:03d}_{t}' for t in range(n)]
            
            try:
                config = df_temp.loc[indices]
                
                if len(config) != n:
                    continue
                
                # Create trees and calculate score - keep values as strings
                trees = [
                    ChristmasTree(
                        center_x=row['x'],
                        center_y=row['y'],
                        angle=row['deg']
                    )
                    for _, row in config.iterrows()
                ]
                
                side = calculate_side_length(trees)
                score = float(side * side / n)
                
                if score < best_score:
                    best_score = score
                    best_config = config
                    best_source = csv_path
            except Exception:
                # Skip files that lack this n or contain unparsable values
                continue
        
        if best_config is not None:
            ensemble_rows.append(best_config)
    
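    # Combine all best configurations (pd.concat raises on an empty list)
    if not ensemble_rows:
        print("⚠️  No complete configurations found!")
        return None
    ensemble_df = pd.concat(ensemble_rows)
    
    print(f"✅ Ensemble created with {len(ensemble_df)} tree configurations")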
    
    # Calculate ensemble score
    total_score = 0.0
    for n in sorted(all_tree_counts):
        indices = [f'{n:03d}_{t}' for t in range(n)]
        try:
            config = ensemble_df.loc[indices]
            trees = [
                ChristmasTree(
                    center_x=row['x'],
                    center_y=row['y'],
                    angle=row['deg']
                )
                for _, row in config.iterrows()
            ]
            score = calculate_score(trees, n)
            total_score += score
        except Exception:
            pass
    
    print(f"📊 Ensemble total score: {total_score:.6f}\n")
    
    return ensemble_df

## 📂 Load Existing Solution

Build the ensemble baseline from all available CSV files and prepare for backward optimization.

In [None]:
# Load ensemble from all CSV files
df = load_all_csv_solutions()

if df is None:
    print("⚠️  Failed to load ensemble. Creating empty baseline.")
    df = pd.DataFrame(columns=['x', 'y', 'deg'])
else:
    print(f"\n✅ Loaded ensemble with {len(df)} tree configurations")

## 🔄 Backward Iteration Optimizer

Iterate from 200 trees down to 1, keeping the best configurations and adapting when needed.

In [None]:
# Initialize storage for results
optimized_data = []
best_side = float('inf')
best_n = None
best_config = None
improvements = []  # Track improvements for visualization

# Iterate from 200 down to 1
for n in range(200, 0, -1):
    # Get current configuration for n trees
    indices = [f'{n:03d}_{t}' for t in range(n)]
    current_config = df.loc[indices]
    
    # Create tree objects - keep values as strings
    trees = [
        ChristmasTree(
            center_x=row['x'],
            center_y=row['y'],
            angle=row['deg']
        )
        for _, row in current_config.iterrows()
    ]
    
    # Calculate current score
    current_score = calculate_score(trees, n)
    current_side = float(calculate_side_length(trees))
    
    # Check if we should use current or adapt from best
    if current_side < best_side:
        # Current has better (lower) side - use it
        print(f"⭐ NEW BEST at n={n}: side={current_side:.6f}")
        best_side = current_side
        best_n = n
        best_config = current_config.copy()
        
        # Store current configuration - keep as strings
        for idx, row in current_config.iterrows():
            optimized_data.append({
                'id': idx,
                'x': row['x'],
                'y': row['y'],
                'deg': row['deg']
            })
    else:
        # Current side is worse - adapt from best by dropping extra trees
        if best_config is not None and len(best_config) >= n:
            adapted_config = best_config.iloc[:n].copy()
            adapted_config.index = indices
            
            # Calculate improvement
            adapted_trees = [
                ChristmasTree(
                    center_x=row['x'],
                    center_y=row['y'],
                    angle=row['deg']
                )
                for _, row in adapted_config.iterrows()
            ]
            adapted_side = float(calculate_side_length(adapted_trees))
            
            if adapted_side < current_side:
                improvement_pct = ((current_side - adapted_side) / current_side) * 100
                print(f"✓ IMPROVED n={n}: {current_side:.6f} → {adapted_side:.6f} ({improvement_pct:.2f}% reduction)")
                improvements.append({
                    'n': n,
                    'original_side': current_side,
                    'optimized_side': adapted_side,
                    'improvement_pct': improvement_pct
                })
            
            # Store adapted configuration - keep as strings
            for idx, row in adapted_config.iterrows():
                optimized_data.append({
                    'id': idx,
                    'x': row['x'],
                    'y': row['y'],
                    'deg': row['deg']
                })
        else:
            # Fallback - use current config, keep as strings
            for idx, row in current_config.iterrows():
                optimized_data.append({
                    'id': idx,
                    'x': row['x'],
                    'y': row['y'],
                    'deg': row['deg']
                })

print(f"\nOptimization complete! Total improvements tracked: {len(improvements)}")

# Calculate total score across all tree counts
total_score = 0.0
for n in range(1, 201):
    indices = [f'{n:03d}_{t}' for t in range(n)]
    config = pd.DataFrame([d for d in optimized_data if d['id'] in indices])
    
    trees = [
        ChristmasTree(
            center_x=row['x'],
            center_y=row['y'],
            angle=row['deg']
        )
        for _, row in config.iterrows()
    ]
    
    score = calculate_score(trees, n)
    total_score += score

print(f"\n{'='*50}")
print(f"TOTAL SCORE (sum of all n=1 to n=200): {total_score:.6f}")
print(f"{'='*50}")

## 💾 Save Optimized Submission

Format and save the optimized solution with the required 's' prefix.
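
The reason coordinates stay as strings from load to save can be shown with a short round trip. This is a sketch with hypothetical helper names (`add_prefix`, `strip_prefix`) and a made-up value; the notebook itself strips the prefix with pandas string methods.

```python
def add_prefix(value: str) -> str:
    """Prepend the 's' marker used in the submission file."""
    return "s" + value

def strip_prefix(value: str) -> str:
    """Remove a single leading 's' if present."""
    return value[1:] if value.startswith("s") else value

original = "-0.1234567890123456789"

# String round trip: every digit survives
round_trip = strip_prefix(add_prefix(original))

# Going through float instead: digits beyond double precision are lost
via_float = str(float(original))

print(round_trip == original)
print(via_float == original)
```

Keeping the values as strings means the saved file contains exactly the digits that were loaded, so a configuration copied from the ensemble is not perturbed by formatting.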

In [None]:
# Create submission dataframe - keep x, y, deg as strings
submission = pd.DataFrame(optimized_data)

# Sort by tree count (extract from id: '001_0' -> 1, '200_0' -> 200)
# A stable sort keeps the trees of each configuration in their original order
submission['tree_count'] = submission['id'].str.split('_').str[0].astype(int)
submission = submission.sort_values('tree_count', kind='mergesort').drop('tree_count', axis=1)

submission = submission.set_index('id')

# Add 's' prefix to string values (no rounding needed, already precise)
for col in ['x', 'y', 'deg']:
    submission[col] = 's' + submission[col].astype(str)

# Save
submission.to_csv('submission.csv')
print(f"Saved optimized submission with {len(submission)} configurations")
submission.head(10)

## 🎨 Visualization Function

Plot tree arrangements with bounding squares to visualize packing efficiency.
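
The plotting code draws the bounding square centered on the shorter dimension of the bounding box. A minimal sketch of that placement, using a hypothetical helper `centered_square` and plain numbers instead of scaled `Decimal` values:

```python
def centered_square(minx, miny, maxx, maxy):
    """Return (x0, y0, side) of the square drawn around a bounding box.

    The side equals the longer box dimension; the square is shifted so the
    box is centered along its shorter dimension.
    """
    width = maxx - minx
    height = maxy - miny
    side = max(width, height)
    x0 = minx - (side - width) / 2   # no shift when width is the longer side
    y0 = miny - (side - height) / 2  # no shift when height is the longer side
    return x0, y0, side

# A 4 x 2 box: the square has side 4 and extends 1 unit below and above the box
print(centered_square(0, 0, 4, 2))
```

This placement is only cosmetic: the score depends on the side length alone, not on where the square is drawn.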

In [None]:
def plot_results(trees, num_trees, title_suffix=""):
    """Plots the arrangement of trees and the bounding square."""
    side_length = calculate_side_length(trees)
    
    fig, ax = plt.subplots(figsize=(8, 8))
    colors = plt.cm.viridis([i / max(num_trees, 1) for i in range(num_trees)])
    
    all_polygons = [t.polygon for t in trees]
    bounds = unary_union(all_polygons).bounds
    
    for i, tree in enumerate(trees):
        x_scaled, y_scaled = tree.polygon.exterior.xy
        x = [Decimal(val) / scale_factor for val in x_scaled]
        y = [Decimal(val) / scale_factor for val in y_scaled]
        ax.plot(x, y, color=colors[i], linewidth=0.5)
        ax.fill(x, y, alpha=0.5, color=colors[i])
    
    minx = Decimal(bounds[0]) / scale_factor
    miny = Decimal(bounds[1]) / scale_factor
    maxx = Decimal(bounds[2]) / scale_factor
    maxy = Decimal(bounds[3]) / scale_factor
    
    width = maxx - minx
    height = maxy - miny
    
    square_x = minx if width >= height else minx - (side_length - width) / 2
    square_y = miny if height >= width else miny - (side_length - height) / 2
    bounding_square = Rectangle(
        (float(square_x), float(square_y)),
        float(side_length),
        float(side_length),
        fill=False,
        edgecolor='red',
        linewidth=2,
        linestyle='--',
    )
    ax.add_patch(bounding_square)
    
    padding = 0.5
    ax.set_xlim(
        float(square_x - Decimal(str(padding))),
        float(square_x + side_length + Decimal(str(padding))))
    ax.set_ylim(float(square_y - Decimal(str(padding))),
                float(square_y + side_length + Decimal(str(padding))))
    ax.set_aspect('equal', adjustable='box')
    ax.axis('off')
    
    score = calculate_score(trees, num_trees)
    plt.title(f'{num_trees} Trees{title_suffix}\nSide: {float(side_length):.6f}, Score: {score:.6f}', 
              fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()

def plot_side_by_side(original_trees, optimized_trees, num_trees):
    """Plot before and after configurations side by side."""
    original_side = calculate_side_length(original_trees)
    optimized_side = calculate_side_length(optimized_trees)
    improvement_pct = ((float(original_side) - float(optimized_side)) / float(original_side)) * 100
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))
    colors = plt.cm.viridis([i / max(num_trees, 1) for i in range(num_trees)])
    
    # Plot original (left)
    all_polygons_orig = [t.polygon for t in original_trees]
    bounds_orig = unary_union(all_polygons_orig).bounds
    
    for i, tree in enumerate(original_trees):
        x_scaled, y_scaled = tree.polygon.exterior.xy
        x = [Decimal(val) / scale_factor for val in x_scaled]
        y = [Decimal(val) / scale_factor for val in y_scaled]
        ax1.plot(x, y, color=colors[i], linewidth=0.5)
        ax1.fill(x, y, alpha=0.5, color=colors[i])
    
    minx_orig = Decimal(bounds_orig[0]) / scale_factor
    miny_orig = Decimal(bounds_orig[1]) / scale_factor
    maxx_orig = Decimal(bounds_orig[2]) / scale_factor
    maxy_orig = Decimal(bounds_orig[3]) / scale_factor
    width_orig = maxx_orig - minx_orig
    height_orig = maxy_orig - miny_orig
    
    square_x_orig = minx_orig if width_orig >= height_orig else minx_orig - (original_side - width_orig) / 2
    square_y_orig = miny_orig if height_orig >= width_orig else miny_orig - (original_side - height_orig) / 2
    bounding_square_orig = Rectangle(
        (float(square_x_orig), float(square_y_orig)),
        float(original_side),
        float(original_side),
        fill=False,
        edgecolor='red',
        linewidth=2,
        linestyle='--',
    )
    ax1.add_patch(bounding_square_orig)
    
    padding = 0.5
    ax1.set_xlim(float(square_x_orig - Decimal(str(padding))),
                 float(square_x_orig + original_side + Decimal(str(padding))))
    ax1.set_ylim(float(square_y_orig - Decimal(str(padding))),
                 float(square_y_orig + original_side + Decimal(str(padding))))
    ax1.set_aspect('equal', adjustable='box')
    ax1.axis('off')
    ax1.set_title(f'BEFORE: {num_trees} Trees\nSide: {float(original_side):.6f}', 
                  fontsize=12, fontweight='bold')
    
    # Plot optimized (right)
    all_polygons_opt = [t.polygon for t in optimized_trees]
    bounds_opt = unary_union(all_polygons_opt).bounds
    
    for i, tree in enumerate(optimized_trees):
        x_scaled, y_scaled = tree.polygon.exterior.xy
        x = [Decimal(val) / scale_factor for val in x_scaled]
        y = [Decimal(val) / scale_factor for val in y_scaled]
        ax2.plot(x, y, color=colors[i], linewidth=0.5)
        ax2.fill(x, y, alpha=0.5, color=colors[i])
    
    minx_opt = Decimal(bounds_opt[0]) / scale_factor
    miny_opt = Decimal(bounds_opt[1]) / scale_factor
    maxx_opt = Decimal(bounds_opt[2]) / scale_factor
    maxy_opt = Decimal(bounds_opt[3]) / scale_factor
    width_opt = maxx_opt - minx_opt
    height_opt = maxy_opt - miny_opt
    
    square_x_opt = minx_opt if width_opt >= height_opt else minx_opt - (optimized_side - width_opt) / 2
    square_y_opt = miny_opt if height_opt >= width_opt else miny_opt - (optimized_side - height_opt) / 2
    bounding_square_opt = Rectangle(
        (float(square_x_opt), float(square_y_opt)),
        float(optimized_side),
        float(optimized_side),
        fill=False,
        edgecolor='green',
        linewidth=2,
        linestyle='--',
    )
    ax2.add_patch(bounding_square_opt)
    
    ax2.set_xlim(float(square_x_opt - Decimal(str(padding))),
                 float(square_x_opt + optimized_side + Decimal(str(padding))))
    ax2.set_ylim(float(square_y_opt - Decimal(str(padding))),
                 float(square_y_opt + optimized_side + Decimal(str(padding))))
    ax2.set_aspect('equal', adjustable='box')
    ax2.axis('off')
    ax2.set_title(f'AFTER: {num_trees} Trees\nSide: {float(optimized_side):.6f}', 
                  fontsize=12, fontweight='bold', color='green')
    
    fig.suptitle(f'Improvement: {improvement_pct:.2f}% reduction in side length', 
                 fontsize=16, fontweight='bold', color='darkgreen')
    plt.tight_layout()
    plt.show()

## 📈 Analyze Improvements

Review which configurations were improved and by how much.

In [None]:
# Create dataframe of improvements
if improvements:
    improvements_df = pd.DataFrame(improvements)
    improvements_df = improvements_df.sort_values('improvement_pct', ascending=False)
    
    print(f"Found {len(improvements_df)} configurations with improvements\n")
    print("Top 10 improvements:")
    print(improvements_df.head(10).to_string(index=False))
    print(f"\nMean improvement: {improvements_df['improvement_pct'].mean():.2f}%")
    print(f"Median improvement: {improvements_df['improvement_pct'].median():.2f}%")
else:
    print("No improvements tracked (all configurations kept their original layout)")
    improvements_df = pd.DataFrame()

## 🔍 Visualize Before/After Comparisons

Display side-by-side comparisons showing the most significant improvements.

In [None]:
# Select configurations to visualize (top improvements or sample)
if not improvements_df.empty:
    # Get the top 20 improvements
    top_improvements = improvements_df.head(20)['n'].tolist()
    visualize_counts = top_improvements
else:
    # Fallback to sample counts
    visualize_counts = [10, 20, 50, 100, 150]

print(f"Visualizing {len(visualize_counts)} configurations with improvements:\n")

for n in visualize_counts:
    # Get original configuration
    indices = [f'{n:03d}_{t}' for t in range(n)]
    original_config = df.loc[indices]
    
    # Keep values as strings
    original_trees = [
        ChristmasTree(
            center_x=row['x'],
            center_y=row['y'],
            angle=row['deg']
        )
        for _, row in original_config.iterrows()
    ]
    
    # Get optimized configuration
    optimized_config = submission.loc[indices]
    
    # Remove 's' prefix but keep as strings
    optimized_trees = [
        ChristmasTree(
            center_x=row['x'][1:],  # Remove 's' prefix
            center_y=row['y'][1:],
            angle=row['deg'][1:]
        )
        for _, row in optimized_config.iterrows()
    ]
    
    # Plot side by side
    plot_side_by_side(original_trees, optimized_trees, n)

## 📝 Summary

This notebook demonstrates a **backward iteration strategy** for optimizing tree packing:

### 🔧 The Process

1. ⬇️ **Start from the largest configuration** (200 trees) and work backward
2. 📊 **Track the smallest bounding square side length** seen so far
3. 🔄 **Adapt solutions** when performance degrades by copying the best configuration and dropping extra trees
4. 📸 **Visualize improvements** with side-by-side comparisons showing before/after results

### 🎯 Results

The approach ensures we maintain good packing efficiency across all tree counts while leveraging successful configurations from larger sets.

### 💎 Key Insights from Improvements

- ✅ Configurations adapted from better-performing larger sets end up with smaller bounding squares than their originals
- 🔄 The backward iteration propagates the best known packing patterns to smaller tree counts
- 👀 Side-by-side visualizations show the reduced bounding square for each improved tree count

---

### 🎄 Happy optimizing and good luck with the competition! 🎅✨