<a href="https://www.kaggle.com/code/psywarrior/project-abstraction?scriptVersionId=255091060" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [1]:
# ==============================================================================
# ARC Prize 2025 - Full Submission Template
#
# Author: Sanyam Sanjay Sharma
# Date: August 9, 2025
#
# Description:
# This script serves as a complete, self-contained, and compliant submission
# for the Kaggle ARC Prize 2025 competition. It is designed to run within the
# Kaggle Notebook environment, adhering to all rules, including the 12-hour
# runtime limit and the "no internet access" constraint.
#
# The script includes:
#   1. The foundational "Object Identification" module (`find_objects`).
#   2. A placeholder for the Domain-Specific Language (DSL) functions.
#   3. A main solver function (`solve_task`) to orchestrate the process.
#   4. A main execution block that simulates the Kaggle environment by reading
#      the input JSON, processing each task, and writing the required
#      `submission.json` file.
#
# This template provides the complete structure needed to compete. Our goal is
# to incrementally add intelligence to the `solve_task` function.
# ==============================================================================

import json
import os
from pathlib import Path
import numpy as np
from collections import deque

In [2]:
# ==============================================================================
# SECTION 1: CORE PERCEPTION MODULE (The "Eyes")
# This is the `find_objects` function we developed. It is the foundation of
# our solver's ability to understand the content of a grid.
# ==============================================================================

def find_objects(grid):
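    """
    Identifies all distinct, contiguous objects in a grid.

    An "object" is a group of 4-connected (up, down, left, right) cells of the
    same color that is not the background color. The background color is the
    most frequent color in the grid; ties go to the lowest color value.

    Returns a list of dicts with keys 'id', 'color', 'pixels' (a list of
    (row, col) tuples), and the bounding box 'min_row', 'max_row', 'min_col',
    'max_col'.
    """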
    if not isinstance(grid, np.ndarray):
        grid = np.array(grid, dtype=int)

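    # Check for emptiness before unpacking the shape: an empty list becomes a
    # 1-D array of shape (0,), which cannot be unpacked into (height, width).
    if grid.size == 0:
        return []

    height, width = grid.shape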

    colors, counts = np.unique(grid, return_counts=True)
    background_color = colors[np.argmax(counts)]

    visited = np.zeros_like(grid, dtype=bool)
    objects = []
    object_id_counter = 1

    for r in range(height):
        for c in range(width):
            if visited[r, c] or grid[r, c] == background_color:
                continue

            obj_color = grid[r, c]
            new_object = {
                'id': object_id_counter, 'color': int(obj_color), 'pixels': [],
                'min_row': r, 'max_row': r, 'min_col': c, 'max_col': c
            }
            
            q = deque([(r, c)])
            visited[r, c] = True
            
            while q:
                row, col = q.popleft()
                new_object['pixels'].append((row, col))
                new_object['min_row'] = min(new_object['min_row'], row)
                new_object['max_row'] = max(new_object['max_row'], row)
                new_object['min_col'] = min(new_object['min_col'], col)
                new_object['max_col'] = max(new_object['max_col'], col)
                
                for dr, dc in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
                    nr, nc = row + dr, col + dc
                    if 0 <= nr < height and 0 <= nc < width and \
                       not visited[nr, nc] and grid[nr, nc] == obj_color:
                        visited[nr, nc] = True
                        q.append((nr, nc))
            
            objects.append(new_object)
            object_id_counter += 1
            
    return objects
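
The object dictionaries returned by `find_objects` carry both a pixel list and a bounding box, which is enough to cut an object out of its grid. The helper below is a minimal sketch of that idea, not part of the original notebook: `crop_object` and its `fill` parameter are hypothetical names, and the object dict in the example is written by hand in the same shape `find_objects` produces.

```python
def crop_object(grid, obj, fill=0):
    """Return the object's bounding box as a new nested list.

    `obj` is assumed to be a dict shaped like those built by `find_objects`
    (keys: 'pixels', 'min_row', 'max_row', 'min_col', 'max_col').
    Cells inside the box that do not belong to the object are set to `fill`,
    an assumed background value that is not derived from the grid.
    """
    top, left = obj['min_row'], obj['min_col']
    height = obj['max_row'] - top + 1
    width = obj['max_col'] - left + 1
    patch = [[fill] * width for _ in range(height)]
    for r, c in obj['pixels']:
        patch[r - top][c - left] = grid[r][c]
    return patch


# Hand-written example: an L-shaped object of color 3.
grid = [
    [0, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 3, 3, 0],
]
obj = {
    'id': 1, 'color': 3,
    'pixels': [(1, 1), (2, 1), (2, 2)],
    'min_row': 1, 'max_row': 2, 'min_col': 1, 'max_col': 2,
}
print(crop_object(grid, obj))  # [[3, 0], [3, 3]]
```

Working on a cropped patch makes it easier to compare object shapes across training pairs regardless of where they sit in the grid.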

In [3]:
# ==============================================================================
# SECTION 2: DOMAIN-SPECIFIC LANGUAGE (DSL) (The "Hands")
# This section will contain our library of functions that can manipulate grids
# and objects. We will build these out in our next steps.
# ==============================================================================

# --- Placeholder DSL Functions ---
# These are examples of functions we will implement later. For now, they
# simply return the grid unmodified.

def move_object(grid, obj, dx, dy):
    # Future logic to move an object will go here.
    return grid

def recolor_object(grid, obj, new_color):
    # Future logic to change an object's color will go here.
    return grid
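
As a rough illustration of what the two placeholders might eventually do, here is a sketch that works on plain nested lists and on object dicts shaped like those from `find_objects`. The `_sketch` names are hypothetical so they do not shadow the placeholders above. The choices that `dx` means columns, `dy` means rows, vacated cells become `background=0`, and out-of-bounds pixels are dropped are all assumptions, not decisions made in the notebook.

```python
def recolor_object_sketch(grid, obj, new_color):
    """Return a copy of `grid` with every pixel of `obj` set to `new_color`."""
    out = [list(row) for row in grid]
    for r, c in obj['pixels']:
        out[r][c] = new_color
    return out


def move_object_sketch(grid, obj, dx, dy, background=0):
    """Return a copy of `grid` with `obj` shifted by dx columns and dy rows.

    Assumptions: vacated cells become `background`, and pixels that would
    land outside the grid are silently dropped.
    """
    height, width = len(grid), len(grid[0])
    out = [list(row) for row in grid]
    # Erase the whole object first, then draw it, so that overlapping
    # source and target cells do not wipe freshly drawn pixels.
    for r, c in obj['pixels']:
        out[r][c] = background
    for r, c in obj['pixels']:
        nr, nc = r + dy, c + dx
        if 0 <= nr < height and 0 <= nc < width:
            out[nr][nc] = obj['color']
    return out


# Hand-written example: a single-pixel object of color 1 at the top-left.
demo_grid = [[1, 0, 0],
             [0, 0, 0]]
demo_obj = {'id': 1, 'color': 1, 'pixels': [(0, 0)],
            'min_row': 0, 'max_row': 0, 'min_col': 0, 'max_col': 0}

print(move_object_sketch(demo_grid, demo_obj, dx=1, dy=1))  # [[0, 0, 0], [0, 1, 0]]
print(recolor_object_sketch(demo_grid, demo_obj, 5))        # [[5, 0, 0], [0, 0, 0]]
```

Both functions return a new grid instead of mutating their input, which keeps the training grids intact when several candidate programs are tried against them.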

In [4]:
# ==============================================================================
# SECTION 3: THE MAIN SOLVER
# This is the "brain" of our operation. It takes a single task and tries to
# find a solution.
# ==============================================================================

def solve_task(task):
    """
    Analyzes a task's training pairs and attempts to predict the test outputs.

    Args:
        task (dict): A dictionary containing 'train' and 'test' pairs for a task.

    Returns:
        list: A list of predicted output grids for each test input.
    """
    predictions = []
    
    # --- Solver Logic ---
    # This is where the core reasoning will happen. For now, we will implement
    # a very simple baseline strategy: for each test input, our prediction
    # will be the input grid itself. This is a valid, though low-scoring,
    # starting point.

    # In the future, this logic will:
    # 1. Use `find_objects` on all training pairs.
    # 2. Search for a program (a sequence of DSL functions) that transforms
    #    each training input to its corresponding output.
    # 3. If a program is found, apply it to the test inputs to generate predictions.

    for test_pair in task['test']:
        test_input_grid = test_pair['input']
        
        # Baseline prediction: return the input as the output.
        predicted_grid = test_input_grid
        
        predictions.append(predicted_grid)
        
    return predictions
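
The program search described in the comments above can be sketched in its simplest form: try each candidate transformation on every training pair, and apply the first one that reproduces all training outputs to the test inputs. This is only an illustration under stated assumptions. `solve_task_with_search` and the candidate functions are hypothetical names, the candidates are whole-grid operations rather than the object-level DSL, and only single-step programs are tried.

```python
def identity(g):
    return [list(row) for row in g]

def flip_horizontal(g):
    return [list(reversed(row)) for row in g]

def flip_vertical(g):
    return [list(row) for row in reversed(g)]

def transpose(g):
    return [list(col) for col in zip(*g)]

def rotate_90(g):
    # Clockwise rotation: reverse the row order, then transpose.
    return [list(col) for col in zip(*reversed(g))]

# Candidates are tried in this order; the first full match wins.
CANDIDATE_PROGRAMS = [
    ("identity", identity),
    ("flip_horizontal", flip_horizontal),
    ("flip_vertical", flip_vertical),
    ("transpose", transpose),
    ("rotate_90", rotate_90),
]

def solve_task_with_search(task):
    """Return one predicted grid per test input.

    Picks the first candidate that maps every training input to its
    training output; falls back to the identity baseline if none does.
    """
    for _name, program in CANDIDATE_PROGRAMS:
        if all(program(pair['input']) == pair['output'] for pair in task['train']):
            return [program(pair['input']) for pair in task['test']]
    return [identity(pair['input']) for pair in task['test']]


example_task = {
    "train": [{"input": [[1, 0], [0, 0]], "output": [[0, 1], [0, 0]]}],
    "test": [{"input": [[0, 2], [0, 0]]}],
}
print(solve_task_with_search(example_task))  # [[[2, 0], [0, 0]]]
```

With a single training pair, several candidates can fit equally well (here both `flip_horizontal` and `rotate_90` match), so the order of the candidate list decides the prediction. Since the submission format allows two attempts, one natural extension is to keep the first two matching programs instead of one.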

In [5]:
# ==============================================================================
# SECTION 4: KAGGLE SUBMISSION BOILERPLATE
# This main execution block handles the competition's file I/O requirements.
# It reads the test file, iterates through tasks, calls our solver, and
# writes the `submission.json` file in the specified format.
# ==============================================================================

if __name__ == '__main__':
    # Define file paths. In the Kaggle environment, the input data is located
    # in `/kaggle/input/` and the output must be written to `/kaggle/working/`.
    data_path = Path('/kaggle/input/arc-prize-2025')
    test_challenges_file = data_path / 'arc-agi_test_challenges.json'
    submission_file = Path('/kaggle/working/submission.json')

    # If not in the Kaggle environment, fall back to the current directory so
    # the script can still run end to end without writing under `/kaggle`.
    if not data_path.exists():
        print("Kaggle environment not found. Using local files for testing.")
        data_path = Path('.')
        test_challenges_file = data_path / 'arc-agi_test_challenges.json'
        submission_file = data_path / 'submission.json'

        # Create a dummy test file with one task, unless a local copy already
        # exists. As in the real test file, test pairs carry only an 'input';
        # the expected output is hidden.
        if not test_challenges_file.exists():
            dummy_task_id = "00576224"
            dummy_data = {
                dummy_task_id: {
                    "train": [{"input": [[1,0],[0,0]], "output": [[0,1],[0,0]]}],
                    "test": [{"input": [[0,2],[0,0]]}]
                }
            }
            with open(test_challenges_file, 'w') as f:
                json.dump(dummy_data, f)

    # Load the test challenges from the JSON file.
    try:
        with open(test_challenges_file, 'r') as f:
            tasks = json.load(f)
    except FileNotFoundError:
        print(f"Error: Test file not found at {test_challenges_file}")
        tasks = {}

    # Initialize the submission dictionary.
    submission = {}

    # Process each task.
    for task_id, task_data in tasks.items():
        print(f"Processing task: {task_id}")
        
        # Call our main solver function for the current task.
        predicted_grids = solve_task(task_data)
        
        # Format the predictions according to the competition's `submission.json` schema.
        # Each prediction must have two attempts. We will use the same grid for both.
        task_predictions = []
        for grid in predicted_grids:
            task_predictions.append({
                "attempt_1": grid,
                "attempt_2": grid
            })
        
        submission[task_id] = task_predictions

    # Write the final submission dictionary to `submission.json`.
    with open(submission_file, 'w') as f:
        json.dump(submission, f)

    print("-" * 30)
    print(f"Submission file created at: {submission_file}")
    print(f"Total tasks processed: {len(submission)}")
    print("Script finished successfully.")

Processing task: 00576224
Processing task: 007bbfb7
Processing task: 009d5c81
Processing task: 00d62c1b
Processing task: 00dbd492
Processing task: 017c7c7b
Processing task: 025d127b
Processing task: 03560426
Processing task: 045e512c
Processing task: 0520fde7
Processing task: 05269061
Processing task: 05a7bcf2
Processing task: 05f2a901
Processing task: 0607ce86
Processing task: 0692e18c
Processing task: 06df4c85
Processing task: 070dd51e
Processing task: 08ed6ac7
Processing task: 09629e4f
Processing task: 0962bcdd
Processing task: 09c534e7
Processing task: 0a1d4ef5
Processing task: 0a2355a6
Processing task: 0a938d79
Processing task: 0b148d64
Processing task: 0b17323b
Processing task: 0bb8deee
Processing task: 0becf7df
Processing task: 0c786b71
Processing task: 0c9aba6e
Processing task: 0ca9ddb6
Processing task: 0d3d703e
Processing task: 0d87d2a6
Processing task: 0e206a2e
Processing task: 0e671a1a
Processing task: 0f63c0b9
Processing task: 103eff5b
Processing task: 10fcaaa3
Processing t