# LLM-based Permutation Solver - Testing Notebook

This notebook demonstrates how to use the permutation solver with different LLM providers.

In [1]:
# Import required modules
import numpy as np
import json
from main import (
    PermutationSolver,
    PermutationConfig,
    get_llm_client,
    generate_random_permutation,
    apply_moves,
    is_trivial,
    verify_solution,
    get_system_prompt,
    get_main_prompt,
    extract_python_code
)

## 1. Basic Permutation Operations

Let's first understand the basic operations (L, R, X).

In [2]:
from main import apply_L, apply_R, apply_X

# Example array
arr = np.array([0, 1, 2, 3])
print(f"Original: {arr}")
print(f"After L (left shift): {apply_L(arr)}")
print(f"After R (right shift): {apply_R(arr)}")
print(f"After X (swap first two): {apply_X(arr)}")

Original: [0 1 2 3]
After L (left shift): [1 2 3 0]
After R (right shift): [3 0 1 2]
After X (swap first two): [1 0 2 3]
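
The implementations of `apply_L`, `apply_R`, and `apply_X` live in `main.py` and are not shown in this notebook. As a rough sketch of the behaviour the output above implies, the three moves can be written on plain Python lists. The `sketch_*` names are hypothetical, and the real functions may differ (for instance, they operate on NumPy arrays).

```python
def sketch_L(a):
    """Left cyclic shift: the first element moves to the end."""
    return a[1:] + a[:1]


def sketch_R(a):
    """Right cyclic shift: the last element moves to the front."""
    return a[-1:] + a[:-1]


def sketch_X(a):
    """Swap the elements at positions 0 and 1 (needs at least two elements)."""
    return [a[1], a[0]] + a[2:]


arr = [0, 1, 2, 3]
print(sketch_L(arr))  # [1, 2, 3, 0]
print(sketch_R(arr))  # [3, 0, 1, 2]
print(sketch_X(arr))  # [1, 0, 2, 3]
```

Note that L and R undo each other and X undoes itself, which is why a move sequence can always be reversed.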


In [3]:
# Apply a sequence of moves
arr = np.array([1, 2, 3, 0])
moves = ['L', 'L', 'X', 'R']

print(f"Original: {arr}")
result = apply_moves(arr, moves)
print(f"After moves {moves}: {result}")
print(f"Is trivial: {is_trivial(result)}")

Original: [1 2 3 0]
After moves ['L', 'L', 'X', 'R']: [2 0 3 1]
Is trivial: False
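
To see how the result above comes about, it can help to print every intermediate state. The sketch below re-implements the three moves on plain lists rather than calling `apply_moves`; `trace_moves` and `is_identity` are hypothetical helpers, and `is_identity` assumes that "trivial" means the sorted (identity) permutation.

```python
MOVES = {
    "L": lambda a: a[1:] + a[:1],         # left cyclic shift
    "R": lambda a: a[-1:] + a[:-1],       # right cyclic shift
    "X": lambda a: [a[1], a[0]] + a[2:],  # swap first two elements
}


def trace_moves(start, moves):
    """Return every state visited, starting with the initial one."""
    states = [list(start)]
    for move in moves:
        states.append(MOVES[move](states[-1]))
    return states


def is_identity(a):
    """True when a is [0, 1, ..., n-1]."""
    return list(a) == list(range(len(a)))


states = trace_moves([1, 2, 3, 0], ["L", "L", "X", "R"])
for step, state in enumerate(states):
    print(step, state)
# 0 [1, 2, 3, 0]
# 1 [2, 3, 0, 1]
# 2 [3, 0, 1, 2]
# 3 [0, 3, 1, 2]
# 4 [2, 0, 3, 1]
print(is_identity(states[-1]))  # False
```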


## 2. Generate Random Permutations

In [4]:
# Generate random permutations of different lengths
for n in [4, 6, 8]:
    perm = generate_random_permutation(n)
    print(f"Length {n}: {perm}")

Length 4: [0 2 1 3]
Length 6: [1 3 4 5 0 2]
Length 8: [7 2 1 3 0 6 5 4]
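
The permutations above change on every run. When a failing case needs to be reproduced, a seeded generator is convenient. Whether `generate_random_permutation` accepts a seed is not shown here, so the following is a standalone sketch using only the standard library; `sketch_random_permutation` is a hypothetical name.

```python
import random


def sketch_random_permutation(n, seed=None):
    """Return a shuffled list of 0..n-1; the same seed gives the same result."""
    rng = random.Random(seed)  # private generator, leaves global state alone
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


for n in [4, 6, 8]:
    print(f"Length {n}: {sketch_random_permutation(n, seed=n)}")
```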


## 3. Prompts Configuration

### 3.1 Custom Prompts (Editable)

You can customize the prompts below and use them with the solver. Edit these cells and set `USE_CUSTOM_PROMPTS = True` in section 5 to use your custom prompts.

In [5]:
# Custom System Prompt - edit as needed
CUSTOM_SYSTEM_PROMPT = """You are an expert algorithm designer specializing in combinatorial puzzles and permutation theory.
Your task is to create efficient, constructive algorithms that solve permutation problems using only allowed moves.
You must provide working Python code that is self-contained and can be executed directly.
Always follow the constraints specified in the task and provide polynomial-time solutions."""

In [6]:
CUSTOM_MAIN_PROMPT = """Task: Implement a constructive sorting algorithm that sorts a given vector using ONLY allowed moves (L, R, X).

Input: A vector a of length n (0-indexed) containing distinct integers from 0 to n-1.

Allowed moves:
L: Left cyclic shift — shifts all elements one position to the left, with the first element moving to the end. Example: [0,1,2,3] -> [1,2,3,0].
R: Right cyclic shift — shifts all elements one position to the right, with the last element moving to the beginning. Example: [0,1,2,3] -> [3,0,1,2].
X: Transposition of the first two elements — swaps the elements at positions 0 and 1. Example: [0,1,2,3] -> [1,0,2,3].

CRITICAL CONSTRAINTS:
1. NO BFS, DFS, or any graph search algorithms are allowed
2. The algorithm must run in POLYNOMIAL TIME (O(n^k) for some constant k)
3. No exponential-time algorithms (like brute force search through permutations)
4. Must use a constructive, iterative approach that builds the solution step by step
5. No storing or exploring multiple states simultaneously

Strict operational constraints:
- No other operations, slicing, built-in sorting functions, or creating new arrays are allowed (except for a copy to simulate sorting)
- All moves must be appended to the moves list immediately after performing them (as strings: 'L', 'R', or 'X')
- Applying the sequence of moves sequentially to a copy of the input vector must yield a fully sorted ascending array [0, 1, 2, ..., n-1]
- Moves can be used multiple times as needed
- The algorithm must continue applying moves until the array is fully sorted

ALGORITHMIC REQUIREMENTS:
- Use a constructive approach: develop a strategy that systematically brings elements to their correct positions
- Think in terms of bringing the smallest unsorted element to the front, then "locking" it in place
- Consider how L and R can help position elements for X swaps
- The solution should work for any n and have predictable, polynomial-time complexity

Expected approach types (choose one):
1. Adaptation of bubble sort/insertion sort using available moves
2. Strategy of bringing smallest element to front, then second smallest, etc.
3. Any other polynomial-time constructive approach

Implementation requirements:
- Implement a function solve(vector) that returns a tuple (moves, sorted_array):
    - moves: list of strings representing all moves performed (e.g., ['L', 'X', 'R', ...])
    - sorted_array: the final sorted array after applying all moves (as a list)
- Include CLI interface:
    - When script is executed directly, accept vector as command-line argument (parse sys.argv[1] as JSON)
    - Use {default_vector} as fallback if no arg is given
    - Output should be JSON object with keys "moves" and "sorted_array"
- Include minimal example in main block for quick testing
- Code must be fully self-contained and executable without external dependencies (only sys, json allowed)
- JSON output must always be structured and parseable for automated testing

Example usage:
    python solve_module.py "[3,1,2,0,4]"

Example output (for illustration):
{{
    "moves": ["X", "L", "R", "X"],
    "sorted_array": [0,1,2,3,4]
}}

IMPORTANT: Focus on developing a polynomial-time constructive algorithm, NOT graph search.
Provide ONLY the Python code, no explanations before or after."""
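
The template contains a `{default_vector}` placeholder and doubled braces (`{{`, `}}`) around the JSON example, which suggests it is filled in with `str.format`. How `main.py` actually renders the template is not shown, so the following is only a sketch under that assumption, using a shortened stand-in template.

```python
# Shortened stand-in for CUSTOM_MAIN_PROMPT (assumption: rendered with str.format)
template = (
    "Use {default_vector} as fallback if no arg is given\n"
    "Example output:\n"
    "{{\n"
    '    "moves": ["X", "L"]\n'
    "}}"
)

# Single braces are placeholders; doubled braces come out as literal braces
rendered = template.format(default_vector=[3, 1, 2, 0])
print(rendered)
```

If you add literal braces to a custom prompt (for example another JSON snippet), they would need to be doubled in the same way, otherwise `str.format` treats them as placeholders.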

## 4. Using Different LLM Providers

Set your API keys as environment variables:
- `OPENAI_API_KEY` for OpenAI
- `GOOGLE_API_KEY` for Gemini
- `ANTHROPIC_API_KEY` for Claude

In [13]:
# Check that the API key is available (set it via Kaggle Secrets or environment variables)
import os

if os.getenv("OPENAI_API_KEY") is None:
    raise RuntimeError(
        "OPENAI_API_KEY is not set. "
        "Add it via Kaggle Secrets or environment variables."
    )
model_name = 'gpt-5.2'
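
The same check is needed for each provider, so it can be factored into a small helper. This is a sketch rather than part of `main.py`; `require_env` and `key_status` are hypothetical names. `key_status` reports only whether each key is present, so the secret itself never appears in the notebook output.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. "
            "Add it via Kaggle Secrets or environment variables."
        )
    return value


def key_status(names):
    """Map each variable name to True/False without exposing its value."""
    return {name: bool(os.getenv(name)) for name in names}


print(key_status(["OPENAI_API_KEY", "GOOGLE_API_KEY", "ANTHROPIC_API_KEY"]))
```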

### 4.1 OpenAI Example

In [9]:
# Create OpenAI client
try:
    openai_client = get_llm_client('openai', model=model_name)
    print(f"Created client: {openai_client.name}")
except ValueError as e:
    print(f"Error: {e}")

Created client: OpenAI (gpt-5.2)


### 4.2 Anthropic Example

In [10]:
# Check whether the key is set without echoing its value
os.getenv("ANTHROPIC_API_KEY") is not None

In [12]:
if os.getenv("ANTHROPIC_API_KEY") is None:
    raise RuntimeError(
        "ANTHROPIC_API_KEY is not set. "
        "Add it via Kaggle Secrets or environment variables."
    )
try:
    claude_client = get_llm_client('claude', model='claude-sonnet-4-20250514')
    print(f"Created client: {claude_client.name}")
except ValueError as e:
    print(f"Error: {e}")

RuntimeError: ANTHROPIC_API_KEY is not set. Add it via Kaggle Secrets or environment variables.

## 5. Full Solver Pipeline

This section creates a solver, asks the LLM to generate a sorting algorithm, and tests the generated code.

In [None]:
# Choose your provider (change as needed).
# model_name was set in section 4 for OpenAI; pick a matching model if you change PROVIDER.
PROVIDER = 'openai'  # 'openai', 'gemini', or 'claude'

# Set USE_CUSTOM_PROMPTS = True to use your custom prompts from section 3.1
USE_CUSTOM_PROMPTS = False

try:
    client = get_llm_client(PROVIDER, model=model_name)
    
    if USE_CUSTOM_PROMPTS:
        # Use custom prompts defined in section 3.1
        config = PermutationConfig(
            length=4,
            system_prompt=CUSTOM_SYSTEM_PROMPT,
            main_prompt_template=CUSTOM_MAIN_PROMPT
        )
        print("Using CUSTOM prompts")
    else:
        # Use default prompts from main.py
        config = PermutationConfig(length=4)
        print("Using DEFAULT prompts")
    
    solver = PermutationSolver(client, config)
    print(f"Solver created with {client.name}")
except Exception as e:
    print(f"Error creating solver: {e}")
    solver = None

Using DEFAULT prompts
Solver created with OpenAI (gpt-5.2)


In [11]:
# Generate the algorithm (this calls the LLM)
if solver:
    code = solver.generate_algorithm(4)
    print("Generated code:")
    print("=" * 60)
    print(code)
    print("=" * 60)

Generating algorithm using OpenAI (gpt-5.2)...
Generated code:
import sys
import json

def solve(vector):
    a = list(vector)  # allowed copy to simulate sorting
    n = len(a)
    moves = []

    def do_L():
        if n <= 1:
            return
        first = a[0]
        i = 0
        while i < n - 1:
            a[i] = a[i + 1]
            i += 1
        a[n - 1] = first
        moves.append("L")

    def do_R():
        if n <= 1:
            return
        last = a[n - 1]
        i = n - 1
        while i > 0:
            a[i] = a[i - 1]
            i -= 1
        a[0] = last
        moves.append("R")

    def do_X():
        if n <= 1:
            return
        a[0], a[1] = a[1], a[0]
        moves.append("X")

    def rotate_to_front(pos):
        # Rotate (L or R) to bring index pos to 0 with fewer moves
        if n <= 1:
            return
        if pos <= n - pos:
            i = 0
            while i < pos:
                do_L()
                i += 1
        else:
  

In [12]:
# Test on a specific permutation
if solver and solver.generated_code:
    test_perm = np.array([1, 2, 3, 0])
    print(f"Testing permutation: {test_perm}")
    
    result = solver.solve(test_perm)
    
    if result['success']:
        print("Success!")
        print(f"Moves: {result['moves']}")
        print(f"Number of moves: {len(result['moves'])}")
        print(f"Sorted array: {result['sorted_array']}")
        print(f"Verification: {result['verification']['is_correct']}")
    else:
        print(f"Failed: {result.get('error', 'Unknown error')}")

Testing permutation: [1 2 3 0]
Success!
Moves: ['L', 'R', 'L', 'L', 'X', 'R', 'R', 'L', 'X', 'R', 'L', 'L', 'R', 'R', 'X', 'L', 'R', 'L', 'L', 'R', 'R']
Number of moves: 21
Sorted array: [0, 1, 2, 3]
Verification: True
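
The 21-move answer contains pairs that undo each other, such as the leading `'L', 'R'`. A simple post-processing pass can cancel adjacent inverse pairs (`LR`, `RL`, `XX`) with a stack. This is a sketch that is not part of the solver; `simplify` and `replay` are hypothetical helpers working on plain lists.

```python
INVERSE = {"L": "R", "R": "L", "X": "X"}


def simplify(moves):
    """Cancel adjacent inverse pairs; popping may expose further pairs."""
    out = []
    for move in moves:
        if out and out[-1] == INVERSE[move]:
            out.pop()
        else:
            out.append(move)
    return out


def replay(start, moves):
    """Apply moves to a copy of start and return the final state."""
    a = list(start)
    for move in moves:
        if move == "L":
            a = a[1:] + a[:1]
        elif move == "R":
            a = a[-1:] + a[:-1]
        else:
            a = [a[1], a[0]] + a[2:]
    return a


moves = ['L', 'R', 'L', 'L', 'X', 'R', 'R', 'L', 'X', 'R', 'L', 'L',
         'R', 'R', 'X', 'L', 'R', 'L', 'L', 'R', 'R']
short = simplify(moves)
print(len(short), short)              # 7 ['L', 'L', 'X', 'R', 'X', 'R', 'X']
print(replay([1, 2, 3, 0], short))    # [0, 1, 2, 3]
```

The pass only removes local redundancy. It does not use length-dependent identities (for example, n consecutive `L` moves returning to the start), so the result is not necessarily the shortest possible sequence.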


In [21]:
# Run multiple random tests
if solver and solver.generated_code:
    print("Running 10 random tests...")
    results = solver.test_random(n=4, num_tests=10)
    
    print("\nFinal Results:")
    print(f"Success rate: {results['success_rate']*100:.1f}%")
    print(f"Passed: {results['success_count']}/{results['num_tests']}")

Running 10 random tests...
Test 1/10: [1, 3, 2, 0]
  ✗ Failed: Unknown error
Test 2/10: [1, 3, 0, 2]
  ✗ Failed: Unknown error
Test 3/10: [3, 2, 1, 0]
  ✗ Failed: Unknown error
Test 4/10: [3, 1, 0, 2]
  ✗ Failed: Unknown error
Test 5/10: [0, 3, 2, 1]
  ✗ Failed: Unknown error
Test 6/10: [1, 3, 2, 0]
  ✗ Failed: Unknown error
Test 7/10: [0, 2, 3, 1]
  ✗ Failed: Unknown error
Test 8/10: [1, 3, 2, 0]
  ✗ Failed: Unknown error
Test 9/10: [3, 1, 0, 2]
  ✗ Failed: Unknown error
Test 10/10: [3, 2, 1, 0]
  ✗ Failed: Unknown error

Final Results:
Success rate: 0.0%
Passed: 0/10
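
Every test above reports `Unknown error`, which hides the actual cause. One way to surface it is to run the generated script yourself and capture its stderr. How `test_random` executes the code is not shown, so this sketch assumes only the CLI contract stated in the prompt (vector as a JSON argument, JSON on stdout); `run_solver_script` is a hypothetical helper.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_solver_script(code, vector, timeout=10):
    """Run code as a script with the vector as a JSON argument; report errors."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate_solver.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script), json.dumps(vector)],
                capture_output=True, text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "error": f"timed out after {timeout}s"}
    if proc.returncode != 0:
        return {"ok": False, "error": proc.stderr.strip()}
    try:
        return {"ok": True, "data": json.loads(proc.stdout)}
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc}", "stdout": proc.stdout}


# Stand-in script for demonstration; in the notebook you would pass
# solver.generated_code and one of the failing permutations instead.
demo_code = (
    "import sys, json\n"
    "v = json.loads(sys.argv[1])\n"
    "print(json.dumps({'moves': [], 'sorted_array': v}))\n"
)
print(run_solver_script(demo_code, [0, 1, 2]))
print(run_solver_script("raise ValueError('boom')", [0, 1, 2])["error"])
```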


## 6. Manual Code Execution and Verification

You can also run a code snippet directly with `execute_generated_code`, without calling an LLM.

In [28]:
from main import execute_generated_code

# Example: A simple (but inefficient) sorting algorithm
sample_code = '''
import sys
import json


def apply_L(a):
    n = len(a)
    first = a[0]
    for i in range(n - 1):
        a[i] = a[i + 1]
    a[n - 1] = first


def apply_R(a):
    n = len(a)
    last = a[n - 1]
    for i in range(n - 1, 0, -1):
        a[i] = a[i - 1]
    a[0] = last


def apply_X(a):
    a[0], a[1] = a[1], a[0]


def swap_adjacent(a, pos, moves):
    for _ in range(pos):
        apply_L(a)
        moves.append('L')
    
    apply_X(a)
    moves.append('X')
    
    for _ in range(pos):
        apply_R(a)
        moves.append('R')


def solve(vector):

    a = list(vector)
    moves = []
    n = len(a)
    
    # Handle trivial cases
    if n <= 1:
        return moves, a
    
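    # Bubble sort: at most n - 1 passes, O(n^2) comparisons in total
    # Each swap at position j costs 2*j + 1 moves, so the move count is O(n^3)
    # In each pass, bubble the largest unsorted element to its correct position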
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                swap_adjacent(a, j, moves)
                swapped = True
        # Early termination if no swaps occurred (array is sorted)
        if not swapped:
            break
    
    return moves, a

if __name__ == "__main__":
    if len(sys.argv) > 1:
        vector = json.loads(sys.argv[1])
    else:
        vector = [3, 1, 2]
    
    moves, sorted_array = solve(vector)
    print(json.dumps({"moves": moves, "sorted_array": sorted_array}))

'''

# Test the sample code
test_vector = [1,2,3,4,5,6,8,7,0]
result = execute_generated_code(sample_code, test_vector)

print(f"Input: {test_vector}")
if result['success']:
    print(f"Output: {result['sorted_array']}")
    print(f"Moves ({len(result['moves'])}): {result['moves']}")
else:
    print(f"Error: {result.get('error', 'Unknown')}")

Input: [1, 2, 3, 4, 5, 6, 8, 7, 0]
Output: [0, 1, 2, 3, 4, 5, 6, 7, 8]
Moves (77): ['L', 'L', 'L', 'L', 'L', 'L', 'X', 'R', 'R', 'R', 'R', 'R', 'R', 'L', 'L', 'L', 'L', 'L', 'L', 'L', 'X', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'L', 'L', 'L', 'L', 'L', 'L', 'X', 'R', 'R', 'R', 'R', 'R', 'R', 'L', 'L', 'L', 'L', 'L', 'X', 'R', 'R', 'R', 'R', 'R', 'L', 'L', 'L', 'L', 'X', 'R', 'R', 'R', 'R', 'L', 'L', 'L', 'X', 'R', 'R', 'R', 'L', 'L', 'X', 'R', 'R', 'L', 'X', 'R', 'X']
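
The move count can be predicted without generating the moves. In `sample_code`, swapping positions `j` and `j + 1` costs `j` left shifts, one `X`, and `j` right shifts, so `2*j + 1` moves. The sketch below runs the same bubble sort on a plain list and only adds up these costs; `count_bubble_moves` is a hypothetical helper.

```python
def count_bubble_moves(vector):
    """Count the moves sample_code would emit, without building the list."""
    a = list(vector)
    n = len(a)
    total = 0
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                total += 2 * j + 1  # j L's, one X, j R's
                swapped = True
        if not swapped:
            break
    return total


print(count_bubble_moves([1, 2, 3, 4, 5, 6, 8, 7, 0]))  # 77
```

For the input above, the swaps happen at positions 6, 7, 6, 5, 4, 3, 2, 1, 0, giving 13 + 15 + 13 + 11 + 9 + 7 + 5 + 3 + 1 = 77 moves.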


In [29]:
# Verify the solution manually
if result['success']:
    original = np.array(test_vector)
    expected = np.arange(len(test_vector))
    
    verification = verify_solution(original, result['moves'], expected)
    print(f"Verification result: {verification['is_correct']}")
    print(f"Original: {verification['original']}")
    print(f"Result: {verification['result']}")
    print(f"Expected: {verification['expected']}")

Verification result: True
Original: [1, 2, 3, 4, 5, 6, 8, 7, 0]
Result: [0, 1, 2, 3, 4, 5, 6, 7, 8]
Expected: [0, 1, 2, 3, 4, 5, 6, 7, 8]
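
A single input does not show that an algorithm works in general. For small n it is cheap to check every permutation. The sketch below re-implements the bubble-sort strategy from `sample_code` and the three moves on plain lists, so it runs without `main.py`; `bubble_moves`, `replay`, and `check_all` are hypothetical helpers.

```python
from itertools import permutations


def bubble_moves(vector):
    """Same strategy as sample_code: swap positions j, j+1 via L*j, X, R*j."""
    a = list(vector)
    moves = []
    n = len(a)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                moves += ["L"] * j + ["X"] + ["R"] * j
    return moves


def replay(start, moves):
    """Apply moves to a copy of start and return the final state."""
    a = list(start)
    for move in moves:
        if move == "L":
            a = a[1:] + a[:1]
        elif move == "R":
            a = a[-1:] + a[:-1]
        else:
            a = [a[1], a[0]] + a[2:]
    return a


def check_all(n):
    """Return every permutation of 0..n-1 that the move list fails to sort."""
    target = list(range(n))
    return [p for p in permutations(range(n))
            if replay(p, bubble_moves(p)) != target]


for n in range(2, 7):
    print(n, len(check_all(n)))  # the second number is the failure count
```

The number of permutations grows as n!, so this kind of check is only practical for small n; random testing remains the tool for larger inputs.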


## 7. Compare Different Providers

Test the same permutation with different LLM providers.

In [None]:
def compare_providers(providers: list, permutation: np.ndarray):
    """Compare solutions from different LLM providers."""
    results = {}
    
    for provider in providers:
        print(f"\n{'='*60}")
        print(f"Testing {provider.upper()}")
        print(f"{'='*60}")
        
        try:
            client = get_llm_client(provider)
            solver = PermutationSolver(client, PermutationConfig(length=len(permutation)))
            solver.generate_algorithm(len(permutation))
            result = solver.solve(permutation)
            
            results[provider] = {
                'success': result['success'],
                'num_moves': len(result.get('moves', [])) if result['success'] else None,
                'is_correct': result.get('verification', {}).get('is_correct', False)
            }
            
            if result['success']:
                print(f"✓ Success with {len(result['moves'])} moves")
            else:
                print(f"✗ Failed: {result.get('error', 'Unknown')}")
                
        except Exception as e:
            results[provider] = {'error': str(e)}
            print(f"✗ Error: {e}")
    
    return results

# Run the comparison (requires API keys for all providers; comment out to skip)
test_perm = np.array([1, 2, 3, 0])
comparison = compare_providers(['openai', 'gemini', 'claude'], test_perm)
print("\nComparison Results:")
print(json.dumps(comparison, indent=2))

## 8. Save Generated Code for Later Use

In [None]:
from main import save_code_to_file

# If you have generated code, save it
if solver and solver.generated_code:
    filepath = save_code_to_file(solver.generated_code, 'generated_solver.py')
    print(f"Code saved to: {filepath}")
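
A saved solver can later be loaded and called without invoking the LLM again. The sketch below assumes the saved file is a plain Python module that defines `solve(vector)`, as the prompt requests; what `save_code_to_file` writes is not shown here. `load_solve` is a hypothetical helper, and the file written in the demo is a stand-in, not real generated code.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_solve(path):
    """Import a solver file by path and return its solve function."""
    spec = importlib.util.spec_from_file_location("saved_solver", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # the __main__ block does not run on import
    return module.solve


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in file so the example runs on its own; in the notebook you
    # would pass the path returned by save_code_to_file instead.
    path = Path(tmp) / "generated_solver.py"
    path.write_text(
        "def solve(vector):\n    return [], sorted(vector)\n",
        encoding="utf-8",
    )
    solve = load_solve(path)
    print(solve([2, 0, 1]))  # ([], [0, 1, 2])
```

Loading executes the file's top-level code, so only load files you have reviewed.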