## Pipeline: LLM-powered program generation for solving ARC-AGI

### Imports

In [113]:
import numpy as np
import ollama
import os
import json
from dotenv import load_dotenv
from openai import OpenAI

### Preparation

#### Load API-Key

In [114]:
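def load_api_key(file_path="key.env"):
    # Read OPENAI_API_KEY from the .env file into the process environment.
    load_dotenv(file_path)
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of letting OpenAI() raise later.
        raise RuntimeError("No API key found. Please set OPENAI_API_KEY in key.env.")
    # call_gpt() below relies on this module-level client.
    global client
    client = OpenAI(api_key=api_key)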

### Shared Variables

In [115]:
COLOR_MAP = """\
Each integer of the grids corresponds to a color using the following mapping:

0 = black  
1 = blue  
2 = red  
3 = green  
4 = yellow  
5 = grey  
6 = pink  
7 = orange  
8 = light blue  
9 = brown
"""

### Prompts

#### Basic Prompt

In [116]:
basic_prompt = f"""
You are a programming expert specialized in solving ARC-AGI (Abstraction and Reasoning Corpus - Artificial General Intelligence) tasks. 
Your task is to write efficient and correct Python functions that solve given ARC tasks based on example input-output pairs.

{COLOR_MAP}

Your Python solution should include a function called `solve(grid: List[List[int]]) -> List[List[int]]` that performs the transformation.
Only return the function definition and any necessary imports (e.g., `import numpy as np`).
Avoid explanation, comments, or print statements—only return valid code that could be run as-is in a script or notebook cell.
"""

In [117]:
print(basic_prompt)


You are a programming expert specialized in solving ARC-AGI (Abstraction and Reasoning Corpus - Artificial General Intelligence) tasks. 
Your task is to write efficient and correct Python functions that solve given ARC tasks based on example input-output pairs.

Each integer of the grids corresponds to a color using the following mapping:

0 = black  
1 = blue  
2 = red  
3 = green  
4 = yellow  
5 = grey  
6 = pink  
7 = orange  
8 = light blue  
9 = brown


Your Python solution should include a function called `solve(grid: List[List[int]]) -> List[List[int]]` that performs the transformation.
Only return the function definition and any necessary imports (e.g., `import numpy as np`).
Avoid explanation, comments, or print statements—only return valid code that could be run as-is in a script or notebook cell.
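
The basic prompt asks the model to return only a `solve` function, so the reply has to be turned into a callable and checked against the demonstration pairs at some point. The following is a minimal sketch of how that could look, not part of the pipeline above. The helper names `extract_code`, `load_solve`, `to_lists`, and `passes_training_pairs` are hypothetical, and the sketch assumes the reply is either bare code or code wrapped in a single Markdown fence.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]
FENCE = "`" * 3  # Markdown code-fence marker, built indirectly for readability


def extract_code(response: str) -> str:
    """Strip one optional surrounding Markdown fence from a model reply."""
    lines = response.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]  # opening fence, possibly with a language tag
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]  # closing fence
    return "\n".join(lines)


def load_solve(code: str) -> Callable[[Grid], Grid]:
    """Execute generated code in a fresh namespace and return its `solve`."""
    namespace: Dict[str, object] = {}
    # Caution: this runs untrusted, model-written code. Sandbox it in practice.
    exec(code, namespace)
    return namespace["solve"]


def to_lists(grid):
    """Convert a NumPy array result to nested lists; leave lists unchanged."""
    return grid.tolist() if hasattr(grid, "tolist") else grid


def passes_training_pairs(solve, train_pairs) -> bool:
    """Return True only if `solve` reproduces every training output exactly."""
    return all(to_lists(solve(p["input"])) == p["output"] for p in train_pairs)


# Made-up reply and made-up demonstration pair (horizontal mirroring).
reply = FENCE + "python\ndef solve(grid):\n    return [row[::-1] for row in grid]\n" + FENCE
solve = load_solve(extract_code(reply))
pairs = [{"input": [[1, 0, 0]], "output": [[0, 0, 1]]}]
print(passes_training_pairs(solve, pairs))  # True
```

Exact equality on the full grid is used because ARC scoring is all-or-nothing per grid. A generated function that uses `List` in its signature without importing it would fail inside `exec`, which is one reason the prompt asks for the necessary imports.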



#### Prompt 1

In [None]:
prompt_1 = f"""
You are an expert in visual pattern recognition. Your task is to analyze ARC-AGI tasks and list direct visual observations from the training pairs.

Instructions:
- List only what is visually present in the input and output grids.
- Focus on colors, shapes, positions, object counts, and differences between input and output.
- Avoid any reasoning, explanations, or rules.
- Use bullet points — maximum 10 items total.
- Be concise. No full sentences, no introductions, no extra formatting.

Color mapping for reference:
{COLOR_MAP}

Begin listing your visual observations:
"""

In [119]:
print(prompt_1)


You are an expert in visual pattern recognition. Your task is to analyze ARC-AGI tasks and list direct visual observations from the training pairs.

Instructions:
- List only what is visually present in the input and output grids.
- Focus on colors, shapes, positions, object counts, and differences between input and output.
- Avoid any reasoning, explanations, or rules.
- Use bullet points — maximum 10 items total.
- Be concise. No full sentences, no introductions, no extra formatting.

Color mapping for reference:
Each integer of the grids corresponds to a color using the following mapping:

0 = black  
1 = blue  
2 = red  
3 = green  
4 = yellow  
5 = grey  
6 = pink  
7 = orange  
8 = light blue  
9 = brown


Begin listing your visual observations:



#### Prompt 2

In [120]:
prompt_2 = f"""
You are an expert in abstract reasoning and transformation analysis.

Your task is to examine an ARC-AGI (Abstraction and Reasoning Corpus – Artificial General Intelligence) task and describe, in natural language, the transformation that occurs from input to output in each demonstration pair.

{COLOR_MAP}

Instructions:
- Provide a concise transformation analysis using 3 to 5 short sentences.
- Focus on what changes between the input and output grids: object movement, color changes, deletion, duplication, or reorganization.
- Emphasize if transformations are applied individually or based on spatial/contextual relationships.
- Identify any patterns, conditional logic, or geometric rules (e.g., alignment, mirroring, filtering).
- Do not include code, pseudocode, or implementation hints.
"""

In [121]:
print(prompt_2)


You are an expert in abstract reasoning and transformation analysis.

Your task is to examine an ARC-AGI (Abstraction and Reasoning Corpus – Artificial General Intelligence) task and describe, in natural language, the transformation that occurs from input to output in each demonstration pair.

Each integer of the grids corresponds to a color using the following mapping:

0 = black  
1 = blue  
2 = red  
3 = green  
4 = yellow  
5 = grey  
6 = pink  
7 = orange  
8 = light blue  
9 = brown


Instructions:
- Provide a concise transformation analysis using 3 to 5 short sentences.
- Focus on what changes between the input and output grids: object movement, color changes, deletion, duplication, or reorganization.
- Emphasize if transformations are applied individually or based on spatial/contextual relationships.
- Identify any patterns, conditional logic, or geometric rules (e.g., alignment, mirroring, filtering).
- Do not include code, pseudocode, or implementation hints.



#### Prompt 3

In [122]:
prompt_3 = f"""
You are a programming expert preparing to solve an ARC-AGI (Abstraction and Reasoning Corpus – Artificial General Intelligence) task using Python.

Your task is to reflect — in natural language — on how you would implement a function that transforms the input grid(s) into the correct output grid(s), based on the demonstrated examples.

{COLOR_MAP}

Write a short, clear reflection (3–5 sentences) covering:
- Your high-level approach to solving the task using Python,
- The logical steps or operations the program might need to perform,
- How the identified patterns and transformations would guide your implementation,
- Any assumptions or uncertainties you'd consider during implementation.

Do not return code or pseudocode. Focus only on your reasoning and planning process.
"""

In [123]:
print(prompt_3)


You are a programming expert preparing to solve an ARC-AGI (Abstraction and Reasoning Corpus – Artificial General Intelligence) task using Python.

Your task is to reflect — in natural language — on how you would implement a function that transforms the input grid(s) into the correct output grid(s), based on the demonstrated examples.

Each integer of the grids corresponds to a color using the following mapping:

0 = black  
1 = blue  
2 = red  
3 = green  
4 = yellow  
5 = grey  
6 = pink  
7 = orange  
8 = light blue  
9 = brown


Write a short, clear reflection (3–5 sentences) covering:
- Your high-level approach to solving the task using Python,
- The logical steps or operations the program might need to perform,
- How the identified patterns and transformations would guide your implementation,
- Any assumptions or uncertainties you'd consider during implementation.

Do not return code or pseudocode. Focus only on your reasoning and planning process.



#### Prompt 4

In [124]:
#TODO: Add outputs of secondary prompts to other secondary prompts (especially for prompt 3). Maybe via a "buildPrompt" function.
#TODO: Create Revision prompt.

In [125]:
prompt_4 = ""

### Functions

In [126]:
def load_tasks(folder):
    tasks = []
    for filename in sorted(os.listdir(folder)):
        if filename.endswith(".json"):
            with open(os.path.join(folder, filename), "r") as f:
                data = json.load(f)
                tasks.append({"filename": filename, "data": data})
    return tasks
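
To make the shape of the loaded data concrete, here is a small sketch that writes a toy task to a temporary folder and loads it back. It assumes the public ARC JSON layout, with `train` and `test` lists of `{"input": ..., "output": ...}` pairs. The grids are made up, and `load_tasks` is repeated from the cell above only so that the block runs on its own.

```python
import json
import os
import tempfile


def load_tasks(folder):
    # Same logic as the notebook cell above.
    tasks = []
    for filename in sorted(os.listdir(folder)):
        if filename.endswith(".json"):
            with open(os.path.join(folder, filename), "r") as f:
                data = json.load(f)
                tasks.append({"filename": filename, "data": data})
    return tasks


# Made-up task: every 0 becomes 1 and every 1 becomes 0.
toy_task = {
    "train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}],
    "test": [{"input": [[0, 0], [1, 1]], "output": [[1, 1], [0, 0]]}],
}

with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "toy.json"), "w") as f:
        json.dump(toy_task, f)
    # Files without a .json suffix are skipped by the filter in load_tasks.
    with open(os.path.join(folder, "notes.txt"), "w") as f:
        f.write("ignored")
    loaded = load_tasks(folder)

print(loaded[0]["filename"])            # toy.json
print(len(loaded[0]["data"]["train"]))  # 1
```

Each entry keeps the filename next to the parsed data, which makes it easy to report results per task later.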

#### Building Prompts

Adds the task's demonstration pairs to the prompt:

In [127]:
def build_prompt(prompt, task_data):
    full_prompt = prompt.strip() + "\n\nHere are the demonstration pairs (JSON data):\n"
    for i, pair in enumerate(task_data['train']):
        full_prompt += f"\nTrain Input {i+1}: {pair['input']}\n"
        full_prompt += f"Train Output {i+1}: {pair['output']}\n"
    return full_prompt
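
As a quick illustration of the resulting text, the sketch below runs `build_prompt` on a made-up one-pair task. The function is repeated from the cell above so the block is self-contained; the template string and the grids are placeholders.

```python
def build_prompt(prompt, task_data):
    # Same logic as the notebook cell above.
    full_prompt = prompt.strip() + "\n\nHere are the demonstration pairs (JSON data):\n"
    for i, pair in enumerate(task_data['train']):
        full_prompt += f"\nTrain Input {i+1}: {pair['input']}\n"
        full_prompt += f"Train Output {i+1}: {pair['output']}\n"
    return full_prompt


# Placeholder task: only the "train" pairs are used by build_prompt.
toy_task = {
    "train": [{"input": [[1]], "output": [[2]]}],
    "test": [{"input": [[3]], "output": [[4]]}],
}

example_prompt = build_prompt("  Describe the task.  ", toy_task)
print(example_prompt)
# Describe the task.
#
# Here are the demonstration pairs (JSON data):
#
# Train Input 1: [[1]]
# Train Output 1: [[2]]
```

Two details are worth noting: the test pairs are never shown to the model, and the grids are rendered with Python's default list formatting, which for nested lists of integers coincides with valid JSON.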

Combines secondary prompts 1 and 2 by appending the response to prompt 1 to the prompt 2 template:

In [128]:
def combine_prompts_1_and_2(prompt_1_response, prompt_2_template):
    combined_prompt = f"""{prompt_2_template.strip()}

Here are visual observations of the task at hand that may assist you in identifying the transformation:

{prompt_1_response.strip()}

Now provide your transformation analysis based on these observations."""
    return combined_prompt

Combines secondary prompts 1, 2, and 3 by appending the responses to prompts 1 and 2 to the prompt 3 template:

In [129]:
def combine_prompts_1_2_and_3(prompt_1_response, prompt_2_response, prompt_3_template):
    combined_prompt = f"""{prompt_3_template.strip()}

Here are visual observations of the task that may help inform your implementation:
{prompt_1_response.strip()}

Here are the transformation rules that have been identified based on the task:
{prompt_2_response.strip()}

Now reflect on how you would implement a solution to this task in Python, following the instructions above.
"""
    return combined_prompt
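
The two combine functions share one pattern: a template followed by labeled blocks of earlier responses. One possible generalization, in the spirit of the "buildPrompt" TODO under Prompt 4, is sketched below. The name `combine_with_context` and its signature are hypothetical and not part of the notebook.

```python
from typing import List, Tuple


def combine_with_context(template: str, sections: List[Tuple[str, str]]) -> str:
    """Append (heading, response) blocks to a prompt template.

    Each earlier response is placed under its heading, and blocks are
    separated by blank lines, mirroring the hand-written combine functions.
    """
    parts = [template.strip()]
    for heading, response in sections:
        parts.append(f"{heading}\n{response.strip()}")
    return "\n\n".join(parts)


# Placeholder template and responses, for illustration only.
combined = combine_with_context(
    "  Reflect on the implementation.  ",
    [
        ("Here are visual observations of the task:", "- red block top left\n"),
        ("Here are the identified transformation rules:", "Blocks are mirrored."),
    ],
)
print(combined)
```

With a helper like this, adding a fourth stage would only require another `(heading, response)` pair rather than a new function.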


#### Call GPT

In [130]:
def call_gpt(prompt):
    response = client.chat.completions.create(
        model="o3-mini",
        messages=[
            {"role": "system", "content": "You are a Python programming assistant."},
            {"role": "user", "content": prompt}
        ]
    )
    return response.choices[0].message.content.strip()
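
Because `call_gpt` depends on a global `client` and a live API, it is hard to exercise offline. The sketch below shows one way to make the call testable by passing the client in as an argument. The name `call_model`, the `FakeCompletions` stand-in, and the echo reply are all assumptions made for illustration. The sketch also guards against a reply whose `content` is `None`, which would make `.strip()` fail.

```python
from types import SimpleNamespace


def call_model(client, prompt, model="o3-mini"):
    """Send one user prompt through a chat-completions style client."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a Python programming assistant."},
            {"role": "user", "content": prompt},
        ],
    )
    content = response.choices[0].message.content
    # The content field can be None, so fall back to an empty string.
    return (content or "").strip()


class FakeCompletions:
    """Offline stand-in that mimics only the attributes call_model reads."""

    def create(self, model, messages):
        reply = f"echo: {messages[-1]['content']}"
        message = SimpleNamespace(content=reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))
print(call_model(fake_client, "hi"))  # echo: hi
```

Passing a real `OpenAI()` instance instead of `fake_client` would send the same request as `call_gpt`, while the fake lets the prompt-chaining logic be checked without network access or API cost.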


### Pipeline

In [131]:
tasks = load_tasks("evaluation_set")
load_api_key()

for i, task in enumerate(tasks[:1]):
    # Build and run secondary prompts:
    full_prompt_1 = build_prompt(prompt_1, task["data"])
    response_1 = call_gpt(full_prompt_1)
    print(f"Prompt 1 Response for task {i + 1}:\n{response_1}\n")
    combined_prompt_2 = combine_prompts_1_and_2(response_1, prompt_2)
    full_prompt_2 = build_prompt(combined_prompt_2, task["data"])
    response_2 = call_gpt(full_prompt_2)
    print(f"Prompt 2 Response for task {i + 1}:\n{response_2}\n")
    combined_prompt_3 = combine_prompts_1_2_and_3(response_1, response_2, prompt_3)
    full_prompt_3 = build_prompt(combined_prompt_3, task["data"])
    response_3 = call_gpt(full_prompt_3)
    print(f"Prompt 3 Response for task {i + 1}:\n{response_3}\n")


Prompt 1 Response for task 1:
- Train Pair 1: Input shows scattered green (3) marks amid continuous pink (6) bands and blocky light blue (8) areas.
- Train Pair 1: Output replaces many colored elements with black (0), especially in border rows.
- Train Pair 1: Central rows exhibit horizontal groupings of pink (6) and light blue (8) with fewer isolated greens.
- Train Pair 2: Input contains clusters of blue (1), red (2), green (3), and light blue (8) arranged in patterned rows.
- Train Pair 2: Output removes most blue (1) and red (2) markings, leaving clearer green (3) and light blue (8) blocks.
- Train Pair 2: Several rows in output are entirely black (0) compared to patterned input rows.
- Train Pair 3: Input displays blocky arrangements of red (2) and green (3) with occasional blue (1) details.
- Train Pair 3: Repeating horizontal stripes of colored blocks are visible in the central portion of the input.
- Train Pair 3: Output features a uniform black (0) background with preserved ce

In [132]:
print(full_prompt_1)
print("="*50)
print(full_prompt_2)
print("="*50)
print(full_prompt_3)

You are an expert in visual pattern recognition. Your task is to analyze ARC-AGI tasks and list direct visual observations from the training pairs.

Instructions:
- List only what is visually present in the input and output grids.
- Focus on colors, shapes, positions, object counts, and differences between input and output.
- Avoid any reasoning, explanations, or rules.
- Use bullet points — maximum 10 items total.
- Be concise. No full sentences, no introductions, no extra formatting.

Color mapping for reference:
Each integer of the grids corresponds to a color using the following mapping:

0 = black  
1 = blue  
2 = red  
3 = green  
4 = yellow  
5 = grey  
6 = pink  
7 = orange  
8 = light blue  
9 = brown


Begin listing your visual observations:

Here are the demonstration pairs (JSON data):

Train Input 1: [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0], [0, 0, 6, 6, 6, 6, 6, 6, 0, 0, 6, 6, 6, 6, 3, 6, 0, 0, 0, 0, 0, 0], [0, 0, 8, 8, 3, 3, 8, 8, 0, 0, 8, 3, 3