# Loop 3 Strategic Analysis

## Key Questions:
1. What is the gap between our best score and target?
2. What techniques have NOT been tried yet?
3. What does the fractional translation kernel actually achieve?

In [1]:
import pandas as pd
import numpy as np
import json
import os
import glob

# Load session state
with open('/home/code/session_state.json', 'r') as f:
    state = json.load(f)

print("=" * 60)
print("CURRENT STATUS")
print("=" * 60)
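TARGET_SCORE = 68.919154
# Lower is better. Rejected submissions have no lb_score, so skip them
# rather than assuming the first submission is the best one.
best_lb = min(s['lb_score'] for s in state['submissions'] if s['lb_score'])
print(f"Target Score: {TARGET_SCORE:.6f}")
print(f"Best CV Score: {min(e['cv_score'] for e in state['experiments']):.6f}")
print(f"Best LB Score: {best_lb:.6f}")
print(f"Gap to Target: {best_lb - TARGET_SCORE:.6f}")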
print(f"\nSubmissions Used: {len(state['submissions'])}/100")
print(f"Experiments Run: {len(state['experiments'])}")

print("\n" + "=" * 60)
print("EXPERIMENT HISTORY")
print("=" * 60)
for exp in state['experiments']:
    print(f"  {exp['id']}: {exp['name']} | CV: {exp['cv_score']:.6f}")

print("\n" + "=" * 60)
print("SUBMISSION HISTORY")
print("=" * 60)
for sub in state['submissions']:
    lb = sub['lb_score'] if sub['lb_score'] else 'REJECTED'
    print(f"  {sub['model_name']}: CV={sub['cv_score']:.6f} | LB={lb}")
    if sub.get('error'):
        print(f"    ERROR: {sub['error']}")

CURRENT STATUS
Target Score: 68.919154
Best CV Score: 70.659944
Best LB Score: 70.676102
Gap to Target: 1.756948

Submissions Used: 2/100
Experiments Run: 3

EXPERIMENT HISTORY
  exp_000: 001_baseline | CV: 70.676102
  exp_001: 002_extended_optimization | CV: 70.659944
  exp_002: 003_strict_validation | CV: 70.676102

SUBMISSION HISTORY
  001_baseline: CV=70.676102 | LB=70.676102398091
  002_extended_optimization: CV=70.659944 | LB=REJECTED
    ERROR: Overlapping trees in group 069


In [2]:
# Analyze what techniques have been tried vs not tried
print("=" * 60)
print("TECHNIQUES ANALYSIS")
print("=" * 60)

tried = [
    "Pre-optimized baseline from snapshots",
    "bbox3 optimizer (short runs)",
    "tree_packer_v21 optimizer",
    "Eazy optimizer",
    "Ensemble from multiple CSVs",
    "fix_direction rotation",
    "Symmetric packing for small N (limited search)",
    "Perturbed restarts with small noise (0.05-0.2)",
    "Strict overlap detection"
]

not_tried = [
    "FRACTIONAL TRANSLATION (tiny steps 0.001-0.00001 in 8 directions)",
    "PROPER ENSEMBLE from 15+ diverse sources",
    "BACKWARD PROPAGATION (bp.cpp) - systematic application",
    "LATTICE-BASED PACKING for large N (grid placements)",
    "MUCH LONGER optimization runs (hours, not minutes)",
    "Large perturbations (0.5-1.0 units) to escape local optima",
    "Symmetric solutions with finer search grids"
]

print("\n✓ TRIED:")
for t in tried:
    print(f"  - {t}")

print("\n✗ NOT TRIED (HIGH PRIORITY):")
for t in not_tried:
    print(f"  - {t}")

TECHNIQUES ANALYSIS

✓ TRIED:
  - Pre-optimized baseline from snapshots
  - bbox3 optimizer (short runs)
  - tree_packer_v21 optimizer
  - Eazy optimizer
  - Ensemble from multiple CSVs
  - fix_direction rotation
  - Symmetric packing for small N (limited search)
  - Perturbed restarts with small noise (0.05-0.2)
  - Strict overlap detection

✗ NOT TRIED (HIGH PRIORITY):
  - FRACTIONAL TRANSLATION (tiny steps 0.001-0.00001 in 8 directions)
  - PROPER ENSEMBLE from 15+ diverse sources
  - BACKWARD PROPAGATION (bp.cpp) - systematic application
  - LATTICE-BASED PACKING for large N (grid placements)
  - MUCH LONGER optimization runs (hours, not minutes)
  - Large perturbations (0.5-1.0 units) to escape local optima
  - Symmetric solutions with finer search grids


In [3]:
# Check what the fractional translation kernel achieves
print("=" * 60)
print("FRACTIONAL TRANSLATION ANALYSIS")
print("=" * 60)

print("""
From jonathanchan's kernel:

1. ENSEMBLE from 15+ sources:
   - bucket-of-chump
   - SmartManoj/Santa-Scoreboard
   - santa-2025-try3
   - santa25-public
   - telegram-public-shared-solution
   - Multiple notebooks

2. FRACTIONAL TRANSLATION post-processing:
   - Steps: [0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001]
   - Directions: 8 (up, down, left, right, 4 diagonals)
   - For each tree, try tiny movements in all directions
   - Keep if bounding box improves without overlap

3. SA optimization with:
   - 15000-20000 iterations
   - 80+ rounds
   - Population-based (keep top 3 candidates)

The kernel achieves ~70.5x scores, which is BETTER than our 70.676.
The fractional translation is a CRITICAL technique we haven't tried!
""")

FRACTIONAL TRANSLATION ANALYSIS

From jonathanchan's kernel:

1. ENSEMBLE from 15+ sources:
   - bucket-of-chump
   - SmartManoj/Santa-Scoreboard
   - santa-2025-try3
   - santa25-public
   - telegram-public-shared-solution
   - Multiple notebooks

2. FRACTIONAL TRANSLATION post-processing:
   - Steps: [0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001]
   - Directions: 8 (up, down, left, right, 4 diagonals)
   - For each tree, try tiny movements in all directions
   - Keep if bounding box improves without overlap

3. SA optimization with:
   - 15000-20000 iterations
   - 80+ rounds
   - Population-based (keep top 3 candidates)

The kernel achieves ~70.5x scores, which is BETTER than our 70.676.
The fractional translation is a CRITICAL technique we haven't tried!
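
The post-processing loop in item 2 can be sketched as follows. This is a minimal illustration, not the kernel's actual code: discs stand in for the tree polygons so that the overlap test stays short, and the names `bbox_side`, `overlaps`, `fractional_translation` and the `max_passes` cap are assumptions introduced here. A real run would swap in a polygon overlap check and the competition's scoring.

```python
import math

# Step sizes and the 8 directions listed in the summary above.
STEPS = [0.001, 0.0005, 0.0002, 0.0001, 0.00005, 0.00002, 0.00001]
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1),
              (1, 1), (1, -1), (-1, 1), (-1, -1)]


def bbox_side(discs):
    """Side of the smallest axis-aligned square enclosing all discs."""
    min_x = min(x - r for x, y, r in discs)
    max_x = max(x + r for x, y, r in discs)
    min_y = min(y - r for x, y, r in discs)
    max_y = max(y + r for x, y, r in discs)
    return max(max_x - min_x, max_y - min_y)


def overlaps(discs, i):
    """True if disc i intersects any other disc (stand-in for a polygon test)."""
    xi, yi, ri = discs[i]
    return any(
        math.hypot(xi - x, yi - y) < ri + r
        for j, (x, y, r) in enumerate(discs) if j != i
    )


def fractional_translation(discs, steps=STEPS, max_passes=50):
    """Greedily accept tiny moves that shrink the bounding square without overlap."""
    discs = list(discs)
    best = bbox_side(discs)
    for step in steps:
        for _ in range(max_passes):  # assumed cap so each step size terminates
            improved = False
            for i in range(len(discs)):
                x, y, r = discs[i]
                for dx, dy in DIRECTIONS:
                    discs[i] = (x + dx * step, y + dy * step, r)
                    side = bbox_side(discs)
                    if side < best and not overlaps(discs, i):
                        best = side                # keep the move
                        x, y = discs[i][0], discs[i][1]
                        improved = True
                    else:
                        discs[i] = (x, y, r)       # revert the move
            if not improved:
                break
    return discs, best
```

Calling `fractional_translation([(0.0, 0.0, 1.0), (3.0, 0.0, 1.0)])` nudges the two discs toward each other and returns a slightly smaller bounding side. Only items that touch the bounding box can improve it under this acceptance rule, which is why the technique works as a final polish rather than a search.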



In [4]:
# Check available snapshots for ensemble
print("=" * 60)
print("AVAILABLE SNAPSHOTS FOR ENSEMBLE")
print("=" * 60)

csv_files = glob.glob('/home/nonroot/snapshots/santa-2025/*/code/**/*.csv', recursive=True)
print(f"\nFound {len(csv_files)} CSV files in snapshots")

# Group by snapshot
snapshots = {}
for f in csv_files:
    parts = f.split('/')
    snapshot_id = parts[4]  # NOTE: index 4 is the 'santa-2025' directory, so every file groups under one key; the snapshot ID is parts[5]
    if snapshot_id not in snapshots:
        snapshots[snapshot_id] = []
    snapshots[snapshot_id].append(f.split('/')[-1])

print(f"\nSnapshots with CSV files: {len(snapshots)}")
for sid, files in list(snapshots.items())[:10]:
    print(f"  {sid}: {files[:3]}..." if len(files) > 3 else f"  {sid}: {files}")

AVAILABLE SNAPSHOTS FOR ENSEMBLE

Found 728 CSV files in snapshots

Snapshots with CSV files: 1
  santa-2025: ['submission.csv', 'submission_best.csv', 'submission_v18.csv']...


In [5]:
# Key insight: The gap is 1.76 points
# Top kernels achieve ~70.5 with fractional translation
# Target is 68.919 - that's 1.58 points BELOW the best public kernel!

print("=" * 60)
print("GAP ANALYSIS")
print("=" * 60)

print("""
Current best: 70.676102
Target:       68.919154
Gap:          1.756948 (2.5%)

Best public kernel (fractional translation): ~70.5
Gap from public kernel to target: ~1.58 points

This means:
1. We need to FIRST match the best public kernel (~70.5)
2. Then find 1.58 MORE points of improvement

The target of 68.919 is VERY aggressive - it's better than
any publicly shared solution!

Possible paths:
1. Implement fractional translation to get to ~70.5
2. Run MUCH longer optimization (hours)
3. Try fundamentally different approaches (lattice packing)
4. Focus on small N values (highest score contribution)
""")

GAP ANALYSIS

Current best: 70.676102
Target:       68.919154
Gap:          1.756948 (2.5%)

Best public kernel (fractional translation): ~70.5
Gap from public kernel to target: ~1.58 points

This means:
1. We need to FIRST match the best public kernel (~70.5)
2. Then find 1.58 MORE points of improvement

The target of 68.919 is VERY aggressive - it's better than
any publicly shared solution!

Possible paths:
1. Implement fractional translation to get to ~70.5
2. Run MUCH longer optimization (hours)
3. Try fundamentally different approaches (lattice packing)
4. Focus on small N values (highest score contribution)
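
The figures above can be re-derived with a few lines of arithmetic. This sketch takes the public kernel score as exactly 70.5, although the notebook only quotes it as approximate, and the variable names are illustrative.

```python
current_best = 70.676102   # best accepted leaderboard score
target = 68.919154
public_kernel = 70.5       # approximate score quoted for the public kernel

gap = current_best - target
gap_pct = 100 * gap / current_best
kernel_to_target = public_kernel - target
# Share of the total gap closed by merely matching the public kernel.
kernel_share = 100 * (current_best - public_kernel) / gap

print(f"Gap to target:           {gap:.6f} ({gap_pct:.1f}%)")
print(f"Public kernel to target: {kernel_to_target:.2f}")
print(f"Gap closed by matching the kernel: {kernel_share:.0f}%")
```

Under that assumption, matching the public kernel closes only about a tenth of the gap, so most of the improvement has to come from paths 2 to 4.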



In [6]:
# Check the C++ optimizer files available
print("=" * 60)
print("AVAILABLE OPTIMIZERS")
print("=" * 60)

# os and glob are already imported in the first cell

optimizers = [
    '/home/code/bbox3_compiled',
    '/home/code/tree_packer_v21',
    '/home/code/eazy_compiled'
]

for opt in optimizers:
    if os.path.exists(opt):
        print(f"  ✓ {opt}")
    else:
        print(f"  ✗ {opt} (NOT FOUND)")

# Check for source files
print("\nSource files:")
for f in glob.glob('/home/code/*.cpp'):
    print(f"  {f}")

AVAILABLE OPTIMIZERS
  ✓ /home/code/bbox3_compiled
  ✓ /home/code/tree_packer_v21
  ✗ /home/code/eazy_compiled (NOT FOUND)

Source files:
  /home/code/bbox3.cpp
  /home/code/eazy.cpp
  /home/code/tree_packer_v21.cpp


## STRATEGIC RECOMMENDATIONS

### IMMEDIATE PRIORITY: Implement Fractional Translation

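The fractional translation technique from jonathanchan's kernel is:
1. Low-risk (post-processing; a move is kept only if it shrinks the bounding box without creating an overlap)
2. Associated with a better score (the kernel, which also uses an ensemble and SA, reaches ~70.5 vs. our 70.676)
3. Not yet tried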

### NEXT: Build Proper Ensemble

Combine the best configuration for each N from multiple sources:
1. Scan all snapshots for best configuration per N
2. Validate no overlaps
3. Apply fractional translation to ensemble
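
Steps 1 and 2 could look roughly like the sketch below. It assumes each source has already been parsed into a `{n: configuration}` mapping. `build_ensemble`, `score_fn` and `is_valid` are hypothetical names, and the toy scoring and validity functions are placeholders for the competition metric and a strict overlap check.

```python
def build_ensemble(sources, score_fn, is_valid):
    """Pick, for each N, the lowest-scoring valid configuration across sources.

    sources:  {source_name: {n: configuration}}
    score_fn: configuration -> float (lower is better)
    is_valid: configuration -> bool (e.g. a strict overlap check)
    """
    best = {}  # n -> (score, source_name, configuration)
    for name, groups in sources.items():
        for n, config in groups.items():
            if not is_valid(config):
                continue  # skip groups that would be rejected on submission
            score = score_fn(config)
            if n not in best or score < best[n][0]:
                best[n] = (score, name, config)
    return best


# Toy stand-ins: a configuration is a list of (x, y) points.
def toy_score(config):
    xs = [x for x, _ in config]
    ys = [y for _, y in config]
    return max(max(xs) - min(xs), max(ys) - min(ys))


def toy_valid(config):
    return len(set(config)) == len(config)  # duplicate points count as "overlap"


sources = {
    "a": {1: [(0, 0)], 2: [(0, 0), (2, 0)]},
    "b": {2: [(0, 0), (1, 0)], 3: [(0, 0), (1, 0), (1, 0)]},
}
ensemble = build_ensemble(sources, toy_score, toy_valid)
```

Validating before scoring matters here: submission `002_extended_optimization` had the better CV score but was rejected for overlapping trees, so an ensemble that selects on score alone can inherit invalid groups.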

### IF STILL STUCK: Lattice Approach for Large N

For N > 100, try grid-based placements (nx × ny grids).
This is fundamentally different from random optimization.
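
As a rough starting point, the grid dimensions can be chosen by brute force. In this sketch `cell_w` and `cell_h` are placeholders for the footprint of one tree (or one interlocked pair of trees); the actual footprint is not derived here, and `best_grid` and `grid_positions` are names introduced for illustration.

```python
import math


def best_grid(n, cell_w, cell_h):
    """Return (side, nx, ny) for the nx-by-ny grid that holds n cells
    with the smallest enclosing square side."""
    best = None
    for nx in range(1, n + 1):
        ny = math.ceil(n / nx)                 # rows needed for this column count
        side = max(nx * cell_w, ny * cell_h)   # enclosing square side
        if best is None or side < best[0]:
            best = (side, nx, ny)
    return best


def grid_positions(n, nx, cell_w, cell_h):
    """Centre coordinates of the first n cells, filled row by row."""
    return [((i % nx + 0.5) * cell_w, (i // nx + 0.5) * cell_h)
            for i in range(n)]


side, nx, ny = best_grid(120, 1.0, 1.0)
positions = grid_positions(120, nx, 1.0, 1.0)
```

A layout produced this way is only an initial placement. It would still need rotation choices, the strict overlap check, and a local optimizer pass before it could be compared against the existing solutions.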