# Loop 3 LB Feedback Analysis

**Submission Result:** exp_002 (003_original_baseline) scored 70.676102 on LB

**Key Insight:** The original santa-2025.csv passes Kaggle validation with no overlaps. This tells us:
1. The issue in exp_001 was the precision loss introduced by fix_direction
2. We have a valid baseline at 70.676102
3. The gap to the target of 68.922808 is 1.753294 points (2.54% of the target)

In [1]:
# Analyze the gap and what's needed
import pandas as pd
import numpy as np

# Current state
current_score = 70.676102
target_score = 68.922808
gap = current_score - target_score

print(f"Current LB Score: {current_score:.6f}")
print(f"Target Score: {target_score:.6f}")
print(f"Gap: {gap:.6f} ({gap/target_score*100:.2f}%)")
print()
print("To close this gap, we need to reduce the average side length.")
print(f"Average reduction needed per N: {gap/200:.6f} score units")
# NOTE: the ~0.053 figure below is a hard-coded estimate, not computed from the values above
print("This translates to reducing side by ~0.053 units on average across all N values")

Current LB Score: 70.676102
Target Score: 68.922808
Gap: 1.753294 (2.54%)

To close this gap, we need to reduce the average side length.
Average reduction needed per N: 0.008766 score units
This translates to reducing side by ~0.053 units on average across all N values
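
As a cross-check on the gap, the required improvement can also be expressed as a uniform relative shrink of every side length. This is a minimal sketch that assumes each N contributes `side**2 / N` to the score, so scaling all sides by a factor `k` scales the total by `k**2`. That scoring form is an assumption and is not stated in this notebook.

```python
import math

current_score = 70.676102
target_score = 68.922808

# ASSUMPTION: score = sum over N of side_N**2 / N, so the score is quadratic
# in side length and a uniform scale factor k multiplies the score by k**2.
scale = math.sqrt(target_score / current_score)
shrink_pct = (1.0 - scale) * 100

print(f"Uniform side scale factor needed: {scale:.6f}")
print(f"Equivalent side reduction: {shrink_pct:.2f}%")
```

Under that assumption the relative side reduction is roughly half the relative score gap, because the score depends on the square of the side.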


In [2]:
# Key techniques from top kernels
print("="*60)
print("KEY TECHNIQUES FROM TOP KERNELS")
print("="*60)
print()
print("1. FRACTIONAL TRANSLATION (jonathanchan kernel)")
print("   - Micro-adjustments at 0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001")
print("   - 8 directions (N, S, E, W, NE, NW, SE, SW)")
print("   - Iterates until no improvement")
print()
print("2. SIMULATED ANNEALING (sa_v3 in jonathanchan)")
print("   - Temperature: 1.0 -> 0.000005")
print("   - Cooling factor: 0.25")
print("   - Population-based (keeps top 3 solutions)")
print()
print("3. ENSEMBLE FROM 19+ SOURCES (jonathanchan)")
print("   - Best config for each N from multiple optimizers")
print("   - Sources include: bucket-of-chump, telegram, why-not, santa-claude, etc.")
print()
print("4. BACKWARD PROPAGATION (egortrushin)")
print("   - For N=200 down to N=2")
print("   - Try removing each tree, keep if (N-1) config is better")
print()
print("5. GRID/LATTICE PACKING (egortrushin)")
print("   - For large N (>=58), use periodic arrangements")
print("   - nt = [rows, cols] for grid-based placement")

KEY TECHNIQUES FROM TOP KERNELS

1. FRACTIONAL TRANSLATION (jonathanchan kernel)
   - Micro-adjustments at 0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001
   - 8 directions (N, S, E, W, NE, NW, SE, SW)
   - Iterates until no improvement

2. SIMULATED ANNEALING (sa_v3 in jonathanchan)
   - Temperature: 1.0 -> 0.000005
   - Cooling factor: 0.25
   - Population-based (keeps top 3 solutions)

3. ENSEMBLE FROM 19+ SOURCES (jonathanchan)
   - Best config for each N from multiple optimizers
   - Sources include: bucket-of-chump, telegram, why-not, santa-claude, etc.

4. BACKWARD PROPAGATION (egortrushin)
   - For N=200 down to N=2
   - Try removing each tree, keep if (N-1) config is better

5. GRID/LATTICE PACKING (egortrushin)
   - For large N (>=58), use periodic arrangements
   - nt = [rows, cols] for grid-based placement
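
The backward propagation idea (technique 4) could be sketched roughly as below. This is an illustration, not the egortrushin implementation: configurations are reduced to lists of `(x, y)` points, and `side` is a hypothetical stand-in for the real bounding-square computation over tree polygons. Removing an item from a valid packing cannot create an overlap, so the sketch performs no overlap check.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def side(points: List[Point]) -> float:
    """Bounding-square side of a point set (stand-in for real tree polygons)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def backward_propagate(configs: Dict[int, List[Point]]) -> Dict[int, List[Point]]:
    """Walk N from largest to 2; drop each item in turn and keep the
    resulting (N-1) configuration whenever it beats the current (N-1) best."""
    best = {n: list(pts) for n, pts in configs.items()}  # do not mutate input
    for n in range(max(best), 1, -1):
        if n not in best or n - 1 not in best:
            continue
        for i in range(len(best[n])):
            candidate = best[n][:i] + best[n][i + 1:]
            if side(candidate) < side(best[n - 1]):
                best[n - 1] = candidate
    return best
```

Because the loop runs from large N downward, an improvement found for N-1 is itself used as the starting point when N-2 is processed.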


In [3]:
# What we need to implement
print("="*60)
print("IMPLEMENTATION PRIORITY")
print("="*60)
print()
print("HIGHEST PRIORITY: Extract and run sa_v1_parallel.cpp")
print("  - Located in: research/kernels/jonathanchan_santa25-ensemble-sa-fractional-translation/")
print("  - Contains: SA + fractional translation + population optimization")
print("  - Expected improvement: 1-2 points")
print()
print("SECOND PRIORITY: Ensemble from more sources")
print("  - Download more pre-optimized submissions")
print("  - For each N, pick best config across all sources")
print()
print("THIRD PRIORITY: Backward propagation")
print("  - After ensemble, run backward propagation")
print("  - Can improve smaller N values using larger N configs")

IMPLEMENTATION PRIORITY

HIGHEST PRIORITY: Extract and run sa_v1_parallel.cpp
  - Located in: research/kernels/jonathanchan_santa25-ensemble-sa-fractional-translation/
  - Contains: SA + fractional translation + population optimization
  - Expected improvement: 1-2 points

SECOND PRIORITY: Ensemble from more sources
  - Download more pre-optimized submissions
  - For each N, pick best config across all sources

THIRD PRIORITY: Backward propagation
  - After ensemble, run backward propagation
  - Can improve smaller N values using larger N configs
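
The second priority, picking the best configuration for each N across sources, might look like the sketch below. The data layout (a dict of source name to `{N: configuration}`) and the `bbox_side` scoring helper are hypothetical. A real version would score tree polygons and should validate each configuration for overlaps before accepting it.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Solution = Dict[int, List[Point]]  # N -> configuration for that N

def bbox_side(points: List[Point]) -> float:
    """Toy score: bounding-square side of a point set (lower is better)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def ensemble_best(
    sources: Dict[str, Solution],
    score: Callable[[List[Point]], float],
) -> Tuple[Solution, Dict[int, str]]:
    """For each N, keep the lowest-scoring configuration and remember its source."""
    best: Solution = {}
    origin: Dict[int, str] = {}
    for name, solution in sources.items():
        for n, config in solution.items():
            if n not in best or score(config) < score(best[n]):
                best[n] = config
                origin[n] = name
    return best, origin
```

Tracking `origin` makes it easy to see which sources actually contribute to the ensemble and which can be dropped.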


In [4]:
# Check what pre-optimized files we have
import os
import glob

preopt_dir = '/home/code/preoptimized'
print("Pre-optimized files available:")
for item in os.listdir(preopt_dir):
    full_path = os.path.join(preopt_dir, item)
    if os.path.isfile(full_path):
        print(f"  {item}")
    elif os.path.isdir(full_path):
        print(f"  {item}/ (directory)")
        subitems = os.listdir(full_path)  # list once and reuse below
        for subitem in subitems[:5]:
            print(f"    - {subitem}")
        if len(subitems) > 5:
            # Entries may be files or directories (e.g. telegram_extracted)
            print(f"    ... and {len(subitems) - 5} more entries")

Pre-optimized files available:
  bbox3
  ensemble.csv
  telegram/ (directory)
    - telegram_extracted
    - telegram-public-shared-solution-for-santa-2025.zip
    - 72.49.csv
    - 71.97.csv
  santa-2025-csv/ (directory)
    - bbox3
    - santa-2025.csv
  bucket-of-chump/ (directory)
    - bbox3
    - submission.csv
    - submission visualization.pdf
  submission.csv
  santa-2025.csv
  submission visualization.pdf
  bucket-of-chump.zip
  best_ensemble.csv
  chistyakov/ (directory)
    - submission_best.csv
    - santa2025-packed-version-of-current-best-public.zip
  santa-2025-csv.zip
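
The listing above only descends one level, so CSVs inside nested folders such as `telegram_extracted` are not shown. To collect every candidate CSV for an ensemble, a recursive glob is one option. The helper name `find_csv_candidates` is hypothetical.

```python
import glob
import os

def find_csv_candidates(root: str) -> list:
    """Return all .csv paths under root (any depth), sorted for stable output."""
    # With recursive=True, "**" matches zero or more directory levels,
    # so files directly inside root are included as well.
    pattern = os.path.join(root, "**", "*.csv")
    return sorted(glob.glob(pattern, recursive=True))
```

For example, `find_csv_candidates('/home/code/preoptimized')` would return the CSV paths under the directory listed above.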


In [5]:
# The key insight: sa_v1_parallel.cpp is the gold standard
print("="*60)
print("sa_v1_parallel.cpp KEY FEATURES")
print("="*60)
print()
print("From jonathanchan kernel analysis:")
print()
print("1. opt_v3() function:")
print("   - Runs SA with fractional translation")
print("   - Population-based (keeps top 3 solutions)")
print("   - Adaptive iterations based on N:")
print("     * N <= 20: 1.5x iterations, 6+ rounds")
print("     * N <= 50: 1.3x iterations, 5+ rounds")
print("     * N > 150: 0.8x iterations, 4+ rounds")
print()
print("2. fractional_translation() function:")
print("   - Step sizes: 0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001")
print("   - 8 directions per step")
print("   - Max 200 iterations")
print("   - Improvement threshold: 1e-12")
print()
print("3. Main loop:")
print("   - Endless mode with generation tracking")
print("   - Saves best solution to solutions/ directory")
print("   - Stops after max_retries generations without improvement")

sa_v1_parallel.cpp KEY FEATURES

From jonathanchan kernel analysis:

1. opt_v3() function:
   - Runs SA with fractional translation
   - Population-based (keeps top 3 solutions)
   - Adaptive iterations based on N:
     * N <= 20: 1.5x iterations, 6+ rounds
     * N <= 50: 1.3x iterations, 5+ rounds
     * N > 150: 0.8x iterations, 4+ rounds

2. fractional_translation() function:
   - Step sizes: 0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001
   - 8 directions per step
   - Max 200 iterations
   - Improvement threshold: 1e-12

3. Main loop:
   - Endless mode with generation tracking
   - Saves best solution to solutions/ directory
   - Stops after max_retries generations without improvement
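
The fractional translation step described above can be illustrated in Python. This is a sketch under stated assumptions, not a port of the C++ code: it uses the step sizes, 8 directions, 200-iteration cap, and 1e-12 threshold listed above, but takes generic `objective` and `is_valid` callables, and the demo uses unit-distance points as a stand-in for tree polygons.

```python
import math
from itertools import combinations
from typing import Callable, List, Tuple

Point = Tuple[float, float]

STEPS = [0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001]
# N, S, E, W, NE, NW, SE, SW (diagonals are not normalised in this sketch)
DIRECTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, 1), (1, -1), (-1, -1)]

def fractional_translation(
    points: List[Point],
    objective: Callable[[List[Point]], float],
    is_valid: Callable[[List[Point]], bool],
    max_iter: int = 200,
    tol: float = 1e-12,
) -> Tuple[List[Point], float]:
    """Nudge one point at a time; accept a move only if the layout stays
    valid and the objective drops by more than tol. Stop when a full pass
    yields no improvement or max_iter passes have run."""
    pts = list(points)
    best = objective(pts)
    for _ in range(max_iter):
        improved = False
        for i in range(len(pts)):
            for step in STEPS:
                for dx, dy in DIRECTIONS:
                    x, y = pts[i]  # re-read: pts may have changed
                    trial = pts[:i] + [(x + dx * step, y + dy * step)] + pts[i + 1:]
                    if not is_valid(trial):
                        continue
                    value = objective(trial)
                    if value < best - tol:
                        pts, best = trial, value
                        improved = True
        if not improved:
            break
    return pts, best

# Toy demo (hypothetical stand-ins for the real geometry)
def bbox_side(pts: List[Point]) -> float:
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def min_gap_ok(pts: List[Point]) -> bool:
    return all(math.dist(a, b) >= 1.0 for a, b in combinations(pts, 2))

start = [(0.0, 0.0), (1.01, 0.0)]
refined, refined_side = fractional_translation(start, bbox_side, min_gap_ok)
print(f"side before: {bbox_side(start):.6f}, after: {refined_side:.6f}")
```

Each accepted move is strictly improving, so this is a greedy local refinement. It is intended to run after a global search such as SA rather than replace it.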
