# LLM-based Permutation Solver - Testing Notebook

This notebook demonstrates how to use the permutation solver with different LLM providers.

In [1]:
# Import required modules
import numpy as np
import json
from main import (
    PermutationSolver,
    PermutationConfig,
    get_llm_client,
    generate_random_permutation,
    apply_moves,
    is_trivial,
    verify_solution,
    get_system_prompt,
    get_main_prompt,
    extract_python_code
)

## 1. Basic Permutation Operations

The solver works with three moves: `L` (left cyclic shift), `R` (right cyclic shift), and `X` (swap of the first two elements).

In [31]:
from main import apply_L, apply_R, apply_X

# Example array
arr = np.array([0, 1, 2, 3])
print(f"Original: {arr}")
print(f"After L (left shift): {apply_L(arr)}")
print(f"After R (right shift): {apply_R(arr)}")
print(f"After X (swap first two): {apply_X(arr)}")

Original: [0 1 2 3]
After L (left shift): [1 2 3 0]
After R (right shift): [3 0 1 2]
After X (swap first two): [1 0 2 3]
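The three moves can be reproduced with plain NumPy. The following is a minimal sketch that assumes only the behavior visible in the output above; the `sketch_*` names are hypothetical stand-ins and are not the functions exported by `main`.

```python
import numpy as np

def sketch_L(arr):
    """Left cyclic shift: the first element moves to the end."""
    return np.roll(arr, -1)

def sketch_R(arr):
    """Right cyclic shift: the last element moves to the front."""
    return np.roll(arr, 1)

def sketch_X(arr):
    """Swap the elements at positions 0 and 1, working on a copy."""
    out = arr.copy()
    out[[0, 1]] = out[[1, 0]]
    return out

arr = np.array([0, 1, 2, 3])
print(sketch_L(arr))  # [1 2 3 0]
print(sketch_R(arr))  # [3 0 1 2]
print(sketch_X(arr))  # [1 0 2 3]

# L and R undo each other, and X undoes itself.
print(np.array_equal(sketch_R(sketch_L(arr)), arr))  # True
print(np.array_equal(sketch_X(sketch_X(arr)), arr))  # True
```

Because `L` and `R` are inverses and `X` is its own inverse, any move sequence can be undone by reversing it and exchanging `L` with `R`.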


In [32]:
# Apply a sequence of moves
arr = np.array([1, 2, 3, 0])
moves = ['L', 'L', 'X', 'R']

print(f"Original: {arr}")
result = apply_moves(arr, moves)
print(f"After moves {moves}: {result}")
print(f"Is trivial: {is_trivial(result)}")

Original: [1 2 3 0]
After moves ['L', 'L', 'X', 'R']: [2 0 3 1]
Is trivial: False
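To see how this sequence arrives at `[2 0 3 1]`, it can help to print every intermediate state. The sketch below works on plain lists; `step` and `trace_moves` are hypothetical helpers written for this illustration and say nothing about how `apply_moves` is implemented in `main`.

```python
def step(state, move):
    """Return the list obtained by applying one move (needs len(state) >= 2)."""
    if move == 'L':
        return state[1:] + state[:1]
    if move == 'R':
        return state[-1:] + state[:-1]
    if move == 'X':
        return [state[1], state[0]] + state[2:]
    raise ValueError(f"unknown move: {move!r}")

def trace_moves(start, moves):
    """Return the list of states: the start, then one state per move."""
    states = [list(start)]
    for move in moves:
        states.append(step(states[-1], move))
    return states

moves = ['L', 'L', 'X', 'R']
for label, state in zip(['start'] + moves, trace_moves([1, 2, 3, 0], moves)):
    print(f"{label:>5}: {state}")
# start: [1, 2, 3, 0]
#     L: [2, 3, 0, 1]
#     L: [3, 0, 1, 2]
#     X: [0, 3, 1, 2]
#     R: [2, 0, 3, 1]
```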


## 2. Generate Random Permutations

In [33]:
# Generate random permutations of different lengths
for n in [4, 6, 8]:
    perm = generate_random_permutation(n)
    print(f"Length {n}: {perm}")

Length 4: [2 3 0 1]
Length 6: [5 1 4 3 0 2]
Length 8: [0 7 1 3 4 6 5 2]
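When a failing case needs to be reproduced, a seeded generator is handy. This is a sketch using NumPy's `Generator.permutation`; whether `generate_random_permutation` accepts a seed is not shown in this notebook, so `seeded_permutation` is a hypothetical stand-in.

```python
import numpy as np

def seeded_permutation(n, seed):
    """Return a random permutation of 0..n-1 that is repeatable for a given seed."""
    rng = np.random.default_rng(seed)
    return rng.permutation(n)

a = seeded_permutation(6, seed=0)
b = seeded_permutation(6, seed=0)
print(a)
print(np.array_equal(a, b))                  # True: same seed, same permutation
print(sorted(a.tolist()) == list(range(6)))  # True: each value appears exactly once
```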


## 3. View Prompts

Let's examine the system and main prompts.

In [34]:
print("=" * 60)
print("SYSTEM PROMPT:")
print("=" * 60)
print(get_system_prompt())

SYSTEM PROMPT:
You are an expert algorithm designer specializing in combinatorial puzzles and permutation theory.
Your task is to create efficient, constructive algorithms that solve permutation problems using only allowed moves.
You must provide working Python code that is self-contained and can be executed directly.
Always follow the constraints specified in the task and provide polynomial-time solutions.


In [35]:
print("=" * 60)
print("MAIN PROMPT (for length=4):")
print("=" * 60)
print(get_main_prompt(4))

MAIN PROMPT (for length=4):
Task: Implement a constructive sorting algorithm that sorts a given vector using ONLY allowed moves (L, R, X).

Input: A vector a of length n (0-indexed) containing distinct integers from 0 to n-1.

Allowed moves:
L: Left cyclic shift — shifts all elements one position to the left, with the first element moving to the end. Example: [0,1,2,3] -> [1,2,3,0].
R: Right cyclic shift — shifts all elements one position to the right, with the last element moving to the beginning. Example: [0,1,2,3] -> [3,0,1,2].
X: Transposition of the first two elements — swaps the elements at positions 0 and 1. Example: [0,1,2,3] -> [1,0,2,3].

CRITICAL CONSTRAINTS:
1. NO BFS, DFS, or any graph search algorithms are allowed
2. The algorithm must run in POLYNOMIAL TIME (O(n^k) for some constant k)
3. No exponential-time algorithms (like brute force search through permutations)
4. Must use a constructive, iterative approach that builds the solution step by step
5. No storing or explo

## 4. Using Different LLM Providers

Set your API keys as environment variables:
- `OPENAI_API_KEY` for OpenAI
- `GOOGLE_API_KEY` for Gemini
- `ANTHROPIC_API_KEY` for Claude

In [None]:
# Set API keys (uncomment and fill in your keys)
import os

# os.environ['OPENAI_API_KEY'] = 'your-openai-key'
# os.environ['GOOGLE_API_KEY'] = 'your-google-key'
# os.environ['ANTHROPIC_API_KEY'] = 'your-anthropic-key'
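Before creating clients, it can be useful to check which keys are actually set, without printing their values. A small sketch; the mapping from provider name to environment variable follows the list above, and `available_providers` is a hypothetical helper, not part of `main`.

```python
import os

# Provider name -> environment variable, as listed in this section.
PROVIDER_KEYS = {
    'openai': 'OPENAI_API_KEY',
    'gemini': 'GOOGLE_API_KEY',
    'claude': 'ANTHROPIC_API_KEY',
}

def available_providers(env=None):
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]

print("Providers with a key set:", available_providers())
```

An empty string counts as "not set" here, so a placeholder assignment such as `os.environ['OPENAI_API_KEY'] = ''` is reported as missing.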

### 4.1 OpenAI Example

In [37]:
# Create OpenAI client
try:
    openai_client = get_llm_client('openai', model='gpt-4o')
    print(f"Created client: {openai_client.name}")
except ValueError as e:
    print(f"Error: {e}")

Created client: OpenAI (gpt-4o)


## 5. Full Solver Pipeline

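The pipeline has three steps: create a solver for a provider, have the LLM generate an algorithm, then run the generated code on test permutations and verify the result.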

In [38]:
# Choose your provider (change as needed)
PROVIDER = 'openai'  # 'openai', 'gemini', or 'claude'

try:
    client = get_llm_client(PROVIDER)
    config = PermutationConfig(length=4)
    solver = PermutationSolver(client, config)
    print(f"Solver created with {client.name}")
except Exception as e:
    print(f"Error creating solver: {e}")
    solver = None

Solver created with OpenAI (gpt-4o)


In [39]:
# Generate the algorithm (this calls the LLM)
if solver:
    code = solver.generate_algorithm(4)
    print("Generated code:")
    print("=" * 60)
    print(code)
    print("=" * 60)

Generating algorithm using OpenAI (gpt-4o)...
Generated code:
import sys
import json

def solve(vector):
    n = len(vector)
    moves = []

    # We will use a simple constructive approach:
    # 1. Bring each element to the front one by one.
    # 2. Use X to swap it into the correct position.

    for i in range(n):
        # Find the position of the smallest element not yet sorted
        min_pos = i
        for j in range(i, n):
            if vector[j] < vector[min_pos]:
                min_pos = j
        
        # Bring the smallest element to the front using R moves
        while min_pos > 0:
            vector = [vector[-1]] + vector[:-1]
            moves.append('R')
            min_pos -= 1
        
        # Swap it into the correct position using X
        for j in range(i, n-1):
            vector[0], vector[1] = vector[1], vector[0]
            moves.append('X')
            vector = vector[1:] + [vector[0]]
            moves.append('L')
    
    return moves, vector

i

In [40]:
# Test on a specific permutation
if solver and solver.generated_code:
    test_perm = np.array([1, 2, 3, 0])
    print(f"Testing permutation: {test_perm}")
    
    result = solver.solve(test_perm)
    
    if result['success']:
        print("Success!")
        print(f"Moves: {result['moves']}")
        print(f"Number of moves: {len(result['moves'])}")
        print(f"Sorted array: {result['sorted_array']}")
        print(f"Verification: {result['verification']['is_correct']}")
    else:
        print(f"Failed: {result.get('error', 'Unknown error')}")

Testing permutation: [1 2 3 0]
Success!
Moves: ['R', 'R', 'R', 'X', 'L', 'X', 'L', 'X', 'L', 'R', 'R', 'X', 'L', 'X', 'L', 'R', 'R', 'X', 'L', 'R', 'R', 'R']
Number of moves: 22
Sorted array: [0, 3, 2, 1]
Verification: False
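In the run above, `result['success']` is true while the verification flag is false, so `success` on its own appears to mean only that the generated code ran and returned output. A sketch of a stricter check, assuming the result dictionary has the keys used in the cell above; `is_verified` is a hypothetical helper.

```python
def is_verified(result):
    """True only if the code ran AND its moves were verified to sort the input."""
    if not result.get('success'):
        return False
    return bool(result.get('verification', {}).get('is_correct', False))

# Shape mirrors the result printed above: ran fine, but did not sort.
example = {
    'success': True,
    'sorted_array': [0, 3, 2, 1],
    'verification': {'is_correct': False},
}
print(is_verified(example))  # False
```

Counting a test as passed only when `is_verified(result)` holds avoids reporting runs like this one as successes.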


In [27]:
result

{'success': True,
 'moves': ['L', 'L', 'X', 'R', 'R', 'R', 'L', 'X', 'R', 'R', 'X', 'R'],
 'sorted_array': [1, 3, 0, 2],
 'code': 'import sys\nimport json\n\ndef solve(vector):\n    n = len(vector)\n    moves = []\n    \n    def perform_L(arr):\n        arr.append(arr.pop(0))\n        moves.append(\'L\')\n    \n    def perform_R(arr):\n        arr.insert(0, arr.pop())\n        moves.append(\'R\')\n    \n    def perform_X(arr):\n        arr[0], arr[1] = arr[1], arr[0]\n        moves.append(\'X\')\n    \n    # Create a copy of the vector to perform operations on\n    arr = vector[:]\n    \n    for i in range(n):\n        # Find the index of the smallest unsorted element\n        min_index = i\n        for j in range(i + 1, n):\n            if arr[j] < arr[min_index]:\n                min_index = j\n\n        # Bring the smallest unsorted element to the front of the unsorted section\n        while min_index > i:\n            if min_index == i + 1:\n                perform_X(arr)\n        

In [21]:
# Run multiple random tests
if solver and solver.generated_code:
    print("Running 10 random tests...")
    results = solver.test_random(n=4, num_tests=10)
    
    print("\nFinal Results:")
    print(f"Success rate: {results['success_rate']*100:.1f}%")
    print(f"Passed: {results['success_count']}/{results['num_tests']}")

Running 10 random tests...
Test 1/10: [1, 3, 2, 0]
  ✗ Failed: Unknown error
Test 2/10: [1, 3, 0, 2]
  ✗ Failed: Unknown error
Test 3/10: [3, 2, 1, 0]
  ✗ Failed: Unknown error
Test 4/10: [3, 1, 0, 2]
  ✗ Failed: Unknown error
Test 5/10: [0, 3, 2, 1]
  ✗ Failed: Unknown error
Test 6/10: [1, 3, 2, 0]
  ✗ Failed: Unknown error
Test 7/10: [0, 2, 3, 1]
  ✗ Failed: Unknown error
Test 8/10: [1, 3, 2, 0]
  ✗ Failed: Unknown error
Test 9/10: [3, 1, 0, 2]
  ✗ Failed: Unknown error
Test 10/10: [3, 2, 1, 0]
  ✗ Failed: Unknown error

Final Results:
Success rate: 0.0%
Passed: 0/10


## 6. Manual Code Execution and Verification

Code snippets can also be run directly with `execute_generated_code` and checked with `verify_solution`.

In [None]:
from main import execute_generated_code

# Example: a simple hand-written attempt (not guaranteed to sort every input)
sample_code = '''
import sys
import json

def solve(vector):
    a = list(vector)
    n = len(a)
    moves = []
    
    def do_L():
        nonlocal a
        a = a[1:] + [a[0]]
        moves.append('L')
    
    def do_R():
        nonlocal a
        a = [a[-1]] + a[:-1]
        moves.append('R')
    
    def do_X():
        nonlocal a
        a[0], a[1] = a[1], a[0]
        moves.append('X')
    
    # Simple bubble-sort-like approach
    for target in range(n):
        # Find position of element 'target'
        pos = a.index(target)
        
        # Rotate to bring target to position 0 or 1
        while pos > 1:
            do_L()
            pos = a.index(target)
        
        # If at position 1, swap to position 0
        if pos == 1:
            do_X()
        
        # Rotate right to move element to its final position
        for _ in range(target):
            do_R()
    
    return moves, a

if __name__ == "__main__":
    if len(sys.argv) > 1:
        vector = json.loads(sys.argv[1])
    else:
        vector = [3, 1, 2]
    
    moves, sorted_array = solve(vector)
    print(json.dumps({"moves": moves, "sorted_array": sorted_array}))
'''

# Test the sample code
test_vector = [1, 2, 3, 0]
result = execute_generated_code(sample_code, test_vector)

print(f"Input: {test_vector}")
if result['success']:
    print(f"Output: {result['sorted_array']}")
    print(f"Moves ({len(result['moves'])}): {result['moves']}")
else:
    print(f"Error: {result.get('error', 'Unknown')}")

Input: [1, 2, 3, 0]
Error: Execution error: Traceback (most recent call last):
  File "/var/folders/bc/6rm76y3s2f7180l31j70ss7h0000gn/T/solve_rat4pjrh.py", line 51, in <module>
    moves, sorted_array = solve(vector)
                          ~~~~~^^^^^^^^
  File "/var/folders/bc/6rm76y3s2f7180l31j70ss7h0000gn/T/solve_rat4pjrh.py", line 28, in solve
    pos = a.index(target)
ValueError: 4 is not in list



In [None]:
# Verify the solution manually
if result['success']:
    original = np.array(test_vector)
    expected = np.arange(len(test_vector))
    
    verification = verify_solution(original, result['moves'], expected)
    print(f"Verification result: {verification['is_correct']}")
    print(f"Original: {verification['original']}")
    print(f"Result: {verification['result']}")
    print(f"Expected: {verification['expected']}")
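As a reference point for the checks in this section, here is a sketch of one constructive approach that fits the prompt's constraints: bubble sort, where each adjacent swap at positions `i` and `i+1` is realized as `i` left shifts, one `X`, and `i` right shifts. This is an illustration written for this notebook, not the algorithm produced by any provider; `bubble_solve` and `replay` are hypothetical names.

```python
from itertools import permutations

def replay(vector, moves):
    """Apply a move list to a copy of vector and return the final state."""
    a = list(vector)
    for m in moves:
        if m == 'L':
            a = a[1:] + a[:1]
        elif m == 'R':
            a = a[-1:] + a[:-1]
        elif m == 'X':
            a[0], a[1] = a[1], a[0]
        else:
            raise ValueError(f"unknown move: {m!r}")
    return a

def bubble_solve(vector):
    """Return (moves, sorted_list) using only L, R, X; no search involved."""
    a = list(vector)
    n = len(a)
    moves = []
    for end in range(n - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                # L^i brings positions i, i+1 to 0, 1; X swaps them; R^i restores.
                moves += ['L'] * i + ['X'] + ['R'] * i
                a[i], a[i + 1] = a[i + 1], a[i]
    return moves, a

# Exhaustive check for n = 4 (used only for testing, not for solving).
for p in permutations(range(4)):
    moves, final = bubble_solve(p)
    assert replay(p, moves) == [0, 1, 2, 3] == final

moves, _ = bubble_solve([1, 2, 3, 0])
print(len(moves), moves)
# 9 ['L', 'L', 'X', 'R', 'R', 'L', 'X', 'R', 'X']
```

The move count is polynomial: at most n(n-1)/2 swaps, each costing at most 2(n-2)+1 moves, so O(n^3) overall. It is far from the shortest possible sequence, but it is easy to verify.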

## 7. Compare Different Providers

Test the same permutation with different LLM providers.

In [None]:
def compare_providers(providers: list, permutation: np.ndarray):
    """Compare solutions from different LLM providers."""
    results = {}
    
    for provider in providers:
        print(f"\n{'='*60}")
        print(f"Testing {provider.upper()}")
        print(f"{'='*60}")
        
        try:
            client = get_llm_client(provider)
            solver = PermutationSolver(client, PermutationConfig(length=len(permutation)))
            solver.generate_algorithm(len(permutation))
            result = solver.solve(permutation)
            
            results[provider] = {
                'success': result['success'],
                'num_moves': len(result.get('moves', [])) if result['success'] else None,
                'is_correct': result.get('verification', {}).get('is_correct', False)
            }
            
            if result['success']:
                print(f"✓ Success with {len(result['moves'])} moves")
            else:
                print(f"✗ Failed: {result.get('error', 'Unknown')}")
                
        except Exception as e:
            results[provider] = {'error': str(e)}
            print(f"✗ Error: {e}")
    
    return results

# Run the comparison (requires API keys for all providers)
test_perm = np.array([1, 2, 3, 0])
comparison = compare_providers(['openai', 'gemini', 'claude'], test_perm)
print("\nComparison Results:")
print(json.dumps(comparison, indent=2))

## 8. Save Generated Code for Later Use

In [None]:
from main import save_code_to_file

# If you have generated code, save it
if solver and solver.generated_code:
    filepath = save_code_to_file(solver.generated_code, 'generated_solver.py')
    print(f"Code saved to: {filepath}")
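A saved script can later be run without the notebook. The sketch below assumes the saved file follows the same command-line contract as the sample code in section 6 (a JSON vector as the first argument, a JSON object with `moves` and `sorted_array` on stdout); `run_saved_solver` is a hypothetical helper, not part of `main`.

```python
import json
import subprocess
import sys

def run_saved_solver(path, vector, timeout=10):
    """Run a saved solver script on one vector and return its parsed JSON output."""
    # Convert to plain ints so NumPy integer types serialize cleanly.
    arg = json.dumps([int(v) for v in vector])
    proc = subprocess.run(
        [sys.executable, path, arg],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip())
    return json.loads(proc.stdout)

# Example (assumes 'generated_solver.py' exists and follows the contract above):
# out = run_saved_solver('generated_solver.py', [1, 2, 3, 0])
# print(out['moves'], out['sorted_array'])
```

The returned move list should still be replayed and verified against the original permutation, since the script's own `sorted_array` is not proof that the moves sort the input.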