# ARC Prize 2025 - Clean Solver Submission

**Approach:** Practical ensemble solver with task classification  
**Target:** 15-25% accuracy (realistic baseline)  
**Author:** Ryan Cardwell & Claude  
**Date:** November 2025

## Strategy:
1. **Task Classification** - Route tasks to specialized solvers
2. **Ensemble Voting** - Multiple solvers vote, agreement = confidence  
3. **Dual Attempts** - Submit 2 solutions per task
4. **Time Budget Management** - Allocate time efficiently
5. **Robust Fallbacks** - Always submit something
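
The voting and dual-attempt steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `arc_clean_solver.py`: the `vote` function, its return shape, and the "share of agreeing solvers" confidence measure are all assumptions made for the example.

```python
from collections import Counter

Grid = list[list[int]]


def to_grid(key) -> Grid:
    """Convert a tuple-of-tuples key back into a nested list."""
    return [list(row) for row in key]


def vote(candidates: list[Grid]) -> tuple[Grid, Grid, float]:
    """Return (attempt_1, attempt_2, confidence) from solver candidates.

    Hypothetical helper: the real solver may rank candidates differently.
    """
    if not candidates:
        raise ValueError("need at least one candidate grid")
    # Nested lists are not hashable, so count tuple-of-tuples keys instead.
    keys = [tuple(map(tuple, grid)) for grid in candidates]
    ranked = Counter(keys).most_common()
    top_key, top_votes = ranked[0]
    # Confidence here is simply the share of solvers that agree with the winner.
    confidence = top_votes / len(candidates)
    # attempt_2 is the runner-up if one exists, otherwise a copy of the winner.
    second_key = ranked[1][0] if len(ranked) > 1 else top_key
    return to_grid(top_key), to_grid(second_key), confidence


# Three hypothetical solver outputs: two agree, one differs.
attempt_1, attempt_2, confidence = vote([[[1, 2]], [[1, 2]], [[2, 1]]])
print(attempt_1, attempt_2, round(confidence, 2))
```

Using the runner-up as `attempt_2` keeps the two attempts different whenever the solvers disagree, which is the point of submitting two.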

## Setup

In [None]:
import numpy as np
import json
import time
from pathlib import Path
import sys

print("Environment check:")
print(f"  Python: {sys.version.split()[0]}")
print(f"  NumPy: {np.__version__}")
print(f"  Working dir: {Path.cwd()}")

## Copy Solver File

If the solver is attached as a Kaggle dataset, copy it into the working directory; otherwise it should already be there.

In [None]:
import shutil

# Try to copy from Kaggle dataset
solver_file = 'arc_clean_solver.py'
kaggle_path = Path(f'/kaggle/input/arc-solver-clean/{solver_file}')
working_path = Path(f'/kaggle/working/{solver_file}')

if kaggle_path.exists():
    shutil.copy(kaggle_path, working_path)
    print(f"Copied {solver_file} from dataset")
elif Path(solver_file).exists():
    print(f"{solver_file} already in working directory")
else:
    print(f"WARNING: {solver_file} not found!")
    print("Will try to import anyway (may be inline)")

## Import Solver

In [None]:
# Add working directory to path
sys.path.insert(0, '/kaggle/working')
sys.path.insert(0, str(Path.cwd()))

try:
    from arc_clean_solver import ARCCleanSolver, SolverConfig, save_submission
    print("Successfully imported ARC Clean Solver")
except ImportError as e:
    print(f"ERROR: Could not import solver: {e}")
    print("\nYou need to either:")
    print("  1. Upload arc_clean_solver.py as a Kaggle dataset")
    print("  2. Include it in this notebook's files")
    raise

## Load Test Data

In [None]:
print("="*70)
print("ARC PRIZE 2025 - CLEAN SOLVER")
print("="*70)

# Determine data path
data_paths = [
    Path('/kaggle/input/arc-prize-2025'),
    Path('.'),
]

data_path = None
for p in data_paths:
    if (p / 'arc-agi_test_challenges.json').exists():
        data_path = p
        break

if data_path is None:
    raise FileNotFoundError("Could not find ARC test data!")

# Load test tasks
test_path = data_path / 'arc-agi_test_challenges.json'
print(f"\nLoading: {test_path}")

with open(test_path, 'r') as f:
    test_tasks = json.load(f)

print(f"Loaded {len(test_tasks)} test tasks")
print(f"\nSample tasks:")
for i, (task_id, task) in enumerate(list(test_tasks.items())[:3]):
    print(f"  {i+1}. {task_id}: {len(task['train'])} train, {len(task['test'])} test")

## Initialize Solver

In [None]:
# Configure solver
config = SolverConfig(
    total_time_budget=6 * 3600,  # 6 hours (leave buffer for 9hr limit)
    min_time_per_task=0.5,
    max_time_per_task=30.0,
    enable_task_classification=True,
    enable_ensemble_voting=True,
    output_path='/kaggle/working'
)

print("\nInitializing solver...")
solver = ARCCleanSolver(config)
print("Solver ready!")
print(f"\nConfiguration:")
print(f"  Time budget: {config.total_time_budget/3600:.1f} hours")
print(f"  Time per task: {config.min_time_per_task}s - {config.max_time_per_task}s")
print(f"  Task classification: {config.enable_task_classification}")
print(f"  Ensemble voting: {config.enable_ensemble_voting}")

## Solve Test Set

Depending on task complexity, this step is expected to take roughly 3-6 hours; the upper end matches the 6-hour `total_time_budget` configured above.
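
As a rough illustration of how a total budget and per-task limits can interact, the sketch below splits the remaining budget evenly across the remaining tasks and clamps the result. The even-split policy and the `time_for_next_task` name are assumptions for the example; the solver's actual allocation logic is not shown in this notebook.

```python
def time_for_next_task(remaining_budget: float, remaining_tasks: int,
                       min_time: float = 0.5, max_time: float = 30.0) -> float:
    """Seconds to spend on the next task under an even-split policy.

    Hypothetical helper; the defaults mirror min_time_per_task and
    max_time_per_task from the configuration cell.
    """
    if remaining_tasks <= 0:
        raise ValueError("remaining_tasks must be positive")
    # Even share of whatever budget is left (never negative).
    fair_share = max(remaining_budget, 0.0) / remaining_tasks
    # Clamp the share into the [min_time, max_time] window.
    return min(max(fair_share, min_time), max_time)


# The task count of 240 is illustrative only.
print(time_for_next_task(6 * 3600, 240))  # generous budget: capped at max_time
print(time_for_next_task(60, 240))        # tight budget: raised to min_time
```

Under this policy the total solve time cannot exceed `max_time` multiplied by the number of tasks, so the per-task cap, not the overall budget, may be the binding limit.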

In [None]:
print("\n" + "="*70)
print("STARTING SOLVE PROCESS")
print("="*70)

start_time = time.time()

try:
    submission = solver.solve_test_set(test_tasks)
    solve_success = True
except Exception as e:
    print(f"\nERROR during solving: {e}")
    print("Creating fallback submission...")
    
    # Emergency fallback: echo each test input, with a rotated copy as attempt_2
    submission = {}
    for task_id, task in test_tasks.items():
        outputs = []
        for test_case in task['test']:
            # Use this test case's own input, not the first one
            test_input = np.array(test_case['input'])
            outputs.append({
                'attempt_1': test_input.tolist(),
                'attempt_2': np.rot90(test_input).tolist()
            })
        submission[task_id] = outputs
    
    solve_success = False

solve_time = time.time() - start_time

print(f"\nSolve time: {solve_time:.0f}s ({solve_time/60:.1f} min)")
print(f"Tasks in submission: {len(submission)}")

## Validate Submission
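
The check in this section covers structure only (task IDs, list lengths, and the two attempt keys). A content check on each predicted grid could look like the sketch below. The `grid_problems` helper is hypothetical, and the limits it enforces (rectangular grids, at most 30x30, integer cells from 0 to 9) are stated here as assumptions about the ARC grid format.

```python
def grid_problems(grid, max_side: int = 30) -> list[str]:
    """Return a list of problems found in one predicted grid (empty if none).

    Expects plain Python lists of ints, i.e. after calling .tolist() on arrays.
    """
    if not isinstance(grid, list) or not grid:
        return ["grid is not a non-empty list"]
    if not all(isinstance(row, list) and row for row in grid):
        return ["every row must be a non-empty list"]
    problems = []
    width = len(grid[0])
    if any(len(row) != width for row in grid):
        problems.append("rows have different lengths")
    if len(grid) > max_side or width > max_side:
        problems.append(f"grid exceeds {max_side}x{max_side}")
    for row in grid:
        for cell in row:
            # bool is a subclass of int, so reject it explicitly.
            valid = isinstance(cell, int) and not isinstance(cell, bool) and 0 <= cell <= 9
            if not valid:
                problems.append(f"invalid cell value: {cell!r}")
                return problems  # one example is enough
    return problems


print(grid_problems([[0, 1], [2, 3]]))
print(grid_problems([[0, 1], [2]]))
```

A helper like this could be called on `attempt_1` and `attempt_2` inside the per-output loop, appending any returned messages to `errors`.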

In [None]:
def validate_submission(submission: dict, expected_tasks: dict) -> bool:
    """Validate submission format"""
    print("\n" + "="*70)
    print("VALIDATING SUBMISSION")
    print("="*70)
    
    errors = []
    
    # Check task count
    if len(submission) != len(expected_tasks):
        errors.append(f"Task count mismatch: {len(submission)} vs {len(expected_tasks)}")
    
    # Check each task
    for task_id, task in expected_tasks.items():
        if task_id not in submission:
            errors.append(f"Missing task: {task_id}")
            continue
        
        task_solution = submission[task_id]
        
        # Check format
        if not isinstance(task_solution, list):
            errors.append(f"{task_id}: Not a list")
            continue
        
        # Check number of test outputs
        expected_outputs = len(task['test'])
        if len(task_solution) != expected_outputs:
            errors.append(f"{task_id}: Wrong number of outputs ({len(task_solution)} vs {expected_outputs})")
        
        # Check each output has attempt_1 and attempt_2
        for i, output in enumerate(task_solution):
            if not isinstance(output, dict):
                errors.append(f"{task_id}[{i}]: Not a dict")
                continue
            
            if 'attempt_1' not in output or 'attempt_2' not in output:
                errors.append(f"{task_id}[{i}]: Missing attempt_1 or attempt_2")
    
    # Print results
    if errors:
        print(f"\nFOUND {len(errors)} ERRORS:")
        for error in errors[:10]:  # Show first 10
            print(f"  - {error}")
        if len(errors) > 10:
            print(f"  ... and {len(errors)-10} more")
        print("\nVALIDATION FAILED")
        return False
    else:
        print(f"\nAll checks passed!")
        print(f"  Tasks: {len(submission)}")
        print(f"  Format: Valid")
        print("\nVALIDATION PASSED")
        return True

# Validate
is_valid = validate_submission(submission, test_tasks)

if not is_valid:
    raise ValueError("Submission validation failed!")

## Save Submission

In [None]:
# Save using the built-in save function
print("\n" + "="*70)
print("SAVING SUBMISSION")
print("="*70)

clean_submission = save_submission(submission, config)

# Verify file exists
submission_path = Path('/kaggle/working/submission.json')
if submission_path.exists():
    file_size = submission_path.stat().st_size
    print(f"\nSubmission file created successfully!")
    print(f"  Path: {submission_path}")
    print(f"  Size: {file_size/1024:.1f} KB")
    print(f"  Tasks: {len(clean_submission)}")
else:
    print(f"\nWARNING: Could not verify submission file at {submission_path}")

## Final Summary

In [None]:
total_time = time.time() - start_time

print("\n" + "="*70)
print("SUBMISSION COMPLETE")
print("="*70)
print(f"\nStatistics:")
print(f"  Total time: {total_time:.0f}s ({total_time/60:.1f} min)")
print(f"  Tasks in submission: {len(submission)}")
print(f"  Average time per task: {total_time/max(len(submission), 1):.2f}s")
print(f"  Solve success: {'Yes' if solve_success else 'No (used fallback)'}")

if solver.stats:
    print(f"\nSolver Statistics:")
    print(f"  High confidence: {solver.stats.get('high_confidence', 0)}")
    print(f"  Fallbacks: {solver.stats.get('fallbacks', 0)}")
    
    by_category = solver.stats.get('by_category', {})
    if by_category:
        print(f"\n  By category:")
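        # Sort by count so this works for a plain dict as well as a Counter
        for cat, count in sorted(by_category.items(), key=lambda kv: kv[1], reverse=True):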
            print(f"    {cat}: {count}")

print("\n" + "="*70)
print("Ready for Kaggle submission!")
print("Expected accuracy: 15-25% (realistic baseline)")
print("="*70)

## Notes

### What This Solver Does:
1. **Classifies tasks** into categories (geometric, color, spatial, pattern, complex)
2. **Routes to specialists** - Different solvers for different task types
3. **Ensemble voting** - Agreement among multiple solvers signals higher confidence
4. **Dual attempts** - Submits 2 different solutions per task
5. **Time management** - Allocates time efficiently across all tasks
6. **Robust fallbacks** - Always submits something, even if a solver raises an error
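
To make the classification step concrete, here is a minimal sketch of how training pairs might be sorted into the categories named above. The heuristics (`is_geometric`, `is_color_map`, and the shape test) are illustrative assumptions, not the rules used by `ARCCleanSolver`.

```python
def rotate(grid):
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def flip(grid):
    """Mirror a grid left to right."""
    return [row[::-1] for row in grid]


def is_geometric(inp, out) -> bool:
    """True if out is a rotation or reflection of inp (identity included)."""
    g = inp
    for _ in range(4):
        if g == out or flip(g) == out:
            return True
        g = rotate(g)
    return False


def is_color_map(inp, out) -> bool:
    """True if one consistent color substitution turns inp into out."""
    if len(inp) != len(out) or any(len(a) != len(b) for a, b in zip(inp, out)):
        return False
    mapping = {}
    for row_in, row_out in zip(inp, out):
        for a, b in zip(row_in, row_out):
            # setdefault records the first mapping and exposes any conflict.
            if mapping.setdefault(a, b) != b:
                return False
    return True


def classify(task) -> str:
    """Assign a task to one category based on its training pairs."""
    pairs = [(p["input"], p["output"]) for p in task["train"]]
    if not pairs:
        return "complex"  # nothing to learn from
    if all(is_geometric(i, o) for i, o in pairs):
        return "geometric"
    if all(is_color_map(i, o) for i, o in pairs):
        return "color"
    same_shape = all(len(i) == len(o) and len(i[0]) == len(o[0]) for i, o in pairs)
    return "pattern" if same_shape else "spatial"


task = {"train": [{"input": [[1, 2], [3, 4]], "output": [[3, 1], [4, 2]]}]}
print(classify(task))
```

The returned label could then index a dictionary of specialist solvers, with a general-purpose solver as the default for unknown labels.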

### Why 15-25% is Realistic:
The figures below are approximate and are not measured in this notebook:
- Baseline (random): ~4%
- Simple pattern matching: ~10-15%
- This solver (multi-strategy): ~15-25%
- SOTA (complex neural): ~35-45%
- Human performance: ~80-90%
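
To check an accuracy figure locally, a submission can be scored against a solutions file. The sketch below assumes that a test output counts as correct when either attempt matches the expected grid exactly, and that `solutions` maps each task ID to a list of expected output grids. Both the scoring rule as written here and the `score_submission` name are assumptions to verify against the competition's evaluation page.

```python
def score_submission(submission: dict, solutions: dict) -> float:
    """Mean per-task score, where a test output scores 1 if either attempt matches."""
    scored_tasks = 0
    total = 0.0
    for task_id, expected_outputs in solutions.items():
        if not expected_outputs:
            continue  # skip tasks with no expected outputs
        predicted = submission.get(task_id, [])
        hits = 0
        for i, expected in enumerate(expected_outputs):
            attempts = predicted[i] if i < len(predicted) else {}
            # Exact match on the full grid; partial credit is not given.
            if expected in (attempts.get("attempt_1"), attempts.get("attempt_2")):
                hits += 1
        total += hits / len(expected_outputs)
        scored_tasks += 1
    return total / scored_tasks if scored_tasks else 0.0


# Toy data: t1 is matched by attempt_2, t2 is missed.
submission = {
    "t1": [{"attempt_1": [[1]], "attempt_2": [[2]]}],
    "t2": [{"attempt_1": [[0]], "attempt_2": [[0]]}],
}
solutions = {"t1": [[[2]]], "t2": [[[5]]]}
print(score_submission(submission, solutions))
```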

### Future Improvements:
1. Add more specialized solvers (object tracking, counting, etc.)
2. Implement learned features from training data
3. Add symbolic reasoning for complex tasks
4. Improve variation generation for attempt_2
5. Meta-learning across tasks
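
As a starting point for the object-oriented solvers mentioned in item 1, the sketch below extracts 4-connected, same-color regions from a grid with a breadth-first flood fill. The `find_objects` name, the choice of 0 as background, and the 4-connectivity rule are assumptions for illustration; a counting solver could then work from `len(find_objects(grid))`.

```python
from collections import deque


def find_objects(grid, background=0):
    """Return (color, cells) for each 4-connected same-color region."""
    height, width = len(grid), len(grid[0])
    seen = set()
    objects = []
    for r in range(height):
        for c in range(width):
            color = grid[r][c]
            if color == background or (r, c) in seen:
                continue
            # Breadth-first flood fill from this unvisited cell.
            cells = []
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and (ny, nx) not in seen and grid[ny][nx] == color:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            objects.append((color, sorted(cells)))
    return objects


grid = [
    [1, 1, 0],
    [0, 0, 2],
    [3, 0, 2],
]
for color, cells in find_objects(grid):
    print(color, cells)
```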