# Program of Thought (PoT) with DSPy

This notebook demonstrates how to implement Program of Thought reasoning with DSPy for mathematical problem solving and code generation.

Based on the DSPy tutorial: [Program Of Thought](https://dspy.ai/tutorials/program_of_thought/)

## Setup

Import necessary libraries and configure the environment.

In [None]:
import os
import sys
sys.path.append('../../')

import dspy
from utils import setup_default_lm, print_step, print_result, print_error
from utils.datasets import get_sample_math_data
from dotenv import load_dotenv
import ast
import traceback

# Load environment variables
load_dotenv('../../.env')

## Language Model Configuration

Set up DSPy with a language model capable of code generation.

In [None]:
print_step("Setting up Language Model", "Configuring DSPy for program generation")

try:
    # Use a model good at code generation
    lm = setup_default_lm(provider="openai", model="gpt-4o", max_tokens=1500)
    dspy.configure(lm=lm)
    print_result("Language model configured successfully!", "Status")
except Exception as e:
    print_error(f"Failed to configure language model: {e}")

## Program of Thought Signatures

Define signatures for generating and executing programs to solve problems.

In [None]:
class GenerateProgram(dspy.Signature):
    """Generate a Python program to solve the given problem."""
    
    problem = dspy.InputField(desc="Mathematical or logical problem to solve")
    program = dspy.OutputField(desc="Python code that solves the problem and prints the answer")

class VerifyProgram(dspy.Signature):
    """Verify if a program correctly solves the given problem."""
    
    problem = dspy.InputField(desc="Original problem statement")
    program = dspy.InputField(desc="Python program to verify")
    expected_answer = dspy.InputField(desc="Expected answer if known")
    verification = dspy.OutputField(desc="Whether the program is correct and why")

class RefineProgram(dspy.Signature):
    """Refine a program based on errors or verification feedback."""
    
    problem = dspy.InputField(desc="Original problem statement")
    program = dspy.InputField(desc="Current program that needs refinement")
    error_feedback = dspy.InputField(desc="Error messages or verification feedback")
    refined_program = dspy.OutputField(desc="Improved Python program")
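
Models often return the `program` field wrapped in a Markdown code fence, and such text does not parse as Python until the fence is removed. The helper below is a minimal sketch of unwrapping that output before validation. `extract_code` is a hypothetical name introduced here for illustration and is not part of DSPy.

```python
import re

# Hypothetical helper (not part of DSPy): unwrap a fenced code block if present.
FENCE = "`" * 3  # built programmatically so this example contains no literal fence
_FENCE_RE = re.compile(
    FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)

def extract_code(text: str) -> str:
    """Return the body of the first fenced block, or the stripped text if there is none."""
    match = _FENCE_RE.search(text)
    body = match.group(1) if match else text
    return body.strip()
```

One way to use it would be to apply it to `program_result.program` and `refine_result.refined_program` before syntax validation, so a fence alone does not consume a refinement attempt.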

## Safe Code Execution Environment

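Create a restricted environment for executing generated programs. Limiting the available built-ins and modules reduces accidental damage, but it is not a security sandbox; run untrusted code in an isolated process or container.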

In [None]:
import ast
import contextlib
import io
from typing import Any, Tuple

class SafeCodeExecutor:
    """Execute Python code with a restricted set of built-ins and modules."""
    
    def __init__(self):
        # Define safe built-ins for code execution
        self.safe_builtins = {
            'abs': abs, 'round': round, 'min': min, 'max': max,
            'sum': sum, 'len': len, 'range': range, 'enumerate': enumerate,
            'zip': zip, 'list': list, 'dict': dict, 'set': set, 'tuple': tuple,
            'int': int, 'float': float, 'str': str, 'bool': bool,
            'print': print, 'sorted': sorted, 'reversed': reversed,
        }
        
        # Safe modules
        self.safe_modules = {
            'math': __import__('math'),
            'random': __import__('random'),
            'itertools': __import__('itertools'),
            'collections': __import__('collections'),
        }
    
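    def execute_code(self, code: str, timeout: int = 5) -> Tuple[bool, str, Any]:
        """
        Execute code in a restricted namespace.

        Note: `timeout` is accepted but is not enforced by this implementation.

        Returns:
            (success, output, result): execution status, captured stdout (or the
            error message on failure), and the value of `result`/`answer` if set
        """
        def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
            # Let `import math` and similar statements work, but only for approved modules
            if name in self.safe_modules:
                return self.safe_modules[name]
            raise ImportError(f"Import of '{name}' is not allowed")

        try:
            # Use a single namespace for globals and locals so that functions
            # defined by the program can call each other (and themselves)
            namespace = {
                '__builtins__': {**self.safe_builtins, '__import__': restricted_import},
                **self.safe_modules
            }

            # Capture stdout
            output_buffer = io.StringIO()

            with contextlib.redirect_stdout(output_buffer):
                # Execute the code
                exec(code, namespace)

            output = output_buffer.getvalue()

            # Prefer an explicit `result` or `answer` variable; fall back to stdout
            result = namespace.get('result', namespace.get('answer', output.strip()))

            return True, output, result

        except Exception as e:
            error_msg = f"Execution error: {type(e).__name__}: {e}"
            return False, error_msg, None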
    
    def validate_code(self, code: str) -> Tuple[bool, str]:
        """Validate code syntax without executing it."""
        try:
            ast.parse(code)
            return True, "Code syntax is valid"
        except SyntaxError as e:
            return False, f"Syntax error: {str(e)}"

# Initialize the code executor
code_executor = SafeCodeExecutor()

# Test the executor
test_code = """
import math
x = 5
y = 3
result = x + y
print(f"The sum of {x} and {y} is {result}")
"""

success, output, result = code_executor.execute_code(test_code)
print_result(f"Execution successful: {success}")
print_result(f"Output: {output}")
print_result(f"Result: {result}")
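
`execute_code` accepts a `timeout` argument but does not enforce it, so a generated program containing an infinite loop would hang the notebook. One possible way to enforce a limit is to run the program in a separate Python process, sketched below. `run_with_timeout` is a hypothetical helper, not part of this notebook's `utils` package or of DSPy. A child process gives a hard time limit but is still not a sandbox: it keeps normal file and network access.

```python
import subprocess
import sys
from typing import Tuple

def run_with_timeout(code: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Run `code` in a fresh Python process; return (success, stdout or error text)."""
    try:
        completed = subprocess.run(
            # -I: isolated mode, ignores PYTHON* environment variables and the user site directory
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed when this limit is exceeded
        )
    except subprocess.TimeoutExpired:
        return False, f"Timed out after {timeout} seconds"
    if completed.returncode != 0:
        # A non-zero exit code usually means an uncaught exception; stderr holds the traceback
        return False, completed.stderr.strip()
    return True, completed.stdout
```

Unlike the in-process executor, this variant cannot read a `result` variable back from the program, so the generated code has to print its answer.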

## Program of Thought Module

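Create a custom module that generates a program, validates and executes it, and asks the model to refine it when validation or execution fails. DSPy also provides a built-in `dspy.ProgramOfThought`; the class below is a hand-written variant that makes each step explicit.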

In [None]:
class ProgramOfThought(dspy.Module):
    """Program of Thought reasoning module."""
    
    def __init__(self, max_attempts: int = 3):
        super().__init__()
        self.generate_program = dspy.ChainOfThought(GenerateProgram)
        self.verify_program = dspy.ChainOfThought(VerifyProgram) 
        self.refine_program = dspy.ChainOfThought(RefineProgram)
        self.code_executor = SafeCodeExecutor()
        self.max_attempts = max_attempts
    
    def forward(self, problem: str, expected_answer: str = None):
        """Solve a problem using program generation and execution."""
        
        print_step("Program of Thought Reasoning", f"Solving: {problem}")
        
        attempts = 0
        current_program = None
        
        while attempts < self.max_attempts:
            attempts += 1
            print_step(f"Attempt {attempts}")
            
            if current_program is None:
                # Generate initial program
                print("Generating initial program...")
                program_result = self.generate_program(problem=problem)
                current_program = program_result.program
            
            print_result(f"Generated Program:\n{current_program}", "Code")
            
            # Validate syntax
            is_valid, validation_msg = self.code_executor.validate_code(current_program)
            if not is_valid:
                print_error(f"Syntax validation failed: {validation_msg}")
                if attempts < self.max_attempts:
                    refine_result = self.refine_program(
                        problem=problem,
                        program=current_program,
                        error_feedback=validation_msg
                    )
                    current_program = refine_result.refined_program
                    continue
                else:
                    return dspy.Prediction(
                        program=current_program,
                        success=False,
                        error=validation_msg,
                        answer=None
                    )
            
            # Execute the program
            print("Executing program...")
            success, output, result = self.code_executor.execute_code(current_program)
            
            if success:
                print_result(f"Execution Output:\n{output}", "Output")
                print_result(f"Final Answer: {result}", "Answer")
                
                # Verify the result if expected answer is provided
                if expected_answer:
                    verify_result = self.verify_program(
                        problem=problem,
                        program=current_program,
                        expected_answer=expected_answer
                    )
                    print_result(verify_result.verification, "Verification")
                
                return dspy.Prediction(
                    program=current_program,
                    success=True,
                    output=output,
                    answer=result,
                    attempts=attempts
                )
            else:
                print_error(f"Execution failed: {output}")
                if attempts < self.max_attempts:
                    # Try to refine the program
                    refine_result = self.refine_program(
                        problem=problem,
                        program=current_program,
                        error_feedback=output
                    )
                    current_program = refine_result.refined_program
                else:
                    return dspy.Prediction(
                        program=current_program,
                        success=False,
                        error=output,
                        answer=None
                    )
        
        return dspy.Prediction(
            program=current_program,
            success=False,
            error="Max attempts reached",
            answer=None
        )

# Initialize the Program of Thought module
pot = ProgramOfThought(max_attempts=3)
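
`VerifyProgram` asks the model to judge correctness from the program text alone. When an expected answer is known, a deterministic comparison against the executed result can complement that judgment. The sketch below shows one possible check. `answers_match` and its tolerance are hypothetical choices made for illustration.

```python
import math
import re
from typing import Any, Optional

def _first_number(value: Any) -> Optional[float]:
    """Extract the first numeric literal from a value's string form, if any."""
    match = re.search(r"-?\d+(?:\.\d+)?", str(value).replace(",", ""))
    return float(match.group()) if match else None

def answers_match(actual: Any, expected: Any, rel_tol: float = 1e-9) -> bool:
    """Compare numerically when both sides contain a number, else compare as text."""
    a, e = _first_number(actual), _first_number(expected)
    if a is not None and e is not None:
        return math.isclose(a, e, rel_tol=rel_tol)
    return str(actual).strip().lower() == str(expected).strip().lower()
```

The first-number heuristic is deliberately simple: output such as `12 x 8 = 96` would be read as `12`, so the check works best when the program stores its answer in `result` or prints only the final value.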

## Example 1: Mathematical Calculation

Solve a mathematical problem using program generation.

In [None]:
# Simple mathematical problem
math_problem = """
A rectangle has a length of 12 meters and a width of 8 meters. 
What is the area of the rectangle in square meters?
"""

result = pot(problem=math_problem, expected_answer="96")

if result.success:
    print_step("Solution Summary")
    print(f"✓ Problem solved successfully in {result.attempts} attempt(s)")
    print(f"✓ Final answer: {result.answer}")
else:
    print_error(f"Failed to solve problem: {result.error}")

## Example 2: Complex Mathematical Reasoning

Solve a more complex problem involving multiple steps.

In [None]:
# Complex mathematical problem
complex_problem = """
A company has 150 employees. They want to give each employee a bonus based on their years of service:
- 0-2 years: $500 bonus
- 3-5 years: $750 bonus  
- 6-10 years: $1000 bonus
- 11+ years: $1500 bonus

The distribution of employees is:
- 45 employees with 0-2 years of service
- 38 employees with 3-5 years of service
- 42 employees with 6-10 years of service
- 25 employees with 11+ years of service

What is the total bonus amount the company will pay?
"""

result = pot(problem=complex_problem)

if result.success:
    print_step("Complex Problem Solution")
    print(f"✓ Problem solved successfully")
    print(f"✓ Total bonus amount: {result.answer}")
else:
    print_error(f"Failed to solve complex problem: {result.error}")
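
For reference, a program of the kind the model is expected to generate for this problem might look like the sketch below. It is hand-written from the figures in the problem statement, and the code the model actually produces will vary between runs.

```python
# (bonus per employee, number of employees) for each service band, from the problem statement
bands = {
    "0-2 years": (500, 45),
    "3-5 years": (750, 38),
    "6-10 years": (1000, 42),
    "11+ years": (1500, 25),
}

# Sanity check: the band counts should add up to the stated 150 employees
headcount = sum(count for _, count in bands.values())
total_bonus = sum(bonus * count for bonus, count in bands.values())

print(f"Employees covered: {headcount}")
print(f"Total bonus: ${total_bonus:,}")
```

The four bands contribute 22,500 + 28,500 + 42,000 + 37,500 = 130,500 dollars, which is the value to compare against `result.answer`.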

## Example 3: Algorithm Implementation

Use PoT to implement and test an algorithm.

In [None]:
# Algorithm problem
algorithm_problem = """
Write a program to find the nth Fibonacci number using an efficient method.
Calculate the 15th Fibonacci number.

Remember: The Fibonacci sequence starts with 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, ...
So the 15th Fibonacci number (0-indexed) should be 610.
"""

result = pot(problem=algorithm_problem, expected_answer="610")

if result.success:
    print_step("Algorithm Implementation Success")
    print(f"✓ Algorithm implemented and executed successfully")
    print(f"✓ 15th Fibonacci number: {result.answer}")
else:
    print_error(f"Failed to implement algorithm: {result.error}")
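
A hand-written reference for this problem, using the same indexing as the prompt (F(0) = 0, F(1) = 1), is sketched below. It is one efficient method among several; the model may instead produce memoized recursion or a closed-form approach.

```python
def fibonacci(n: int) -> int:
    """Return F(n) with F(0) = 0 and F(1) = 1, computed iteratively in O(n) time."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    for _ in range(n):
        # After k iterations, a holds F(k) and b holds F(k + 1)
        a, b = b, a + b
    return a

result = fibonacci(15)
print(result)
```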

## Example 4: Data Analysis Problem

Use PoT for data analysis and statistics.

In [None]:
# Data analysis problem
data_problem = """
Given the following test scores: [85, 92, 78, 96, 88, 91, 87, 93, 89, 84]

Calculate:
1. The mean (average) score
2. The median score
3. The standard deviation
4. How many scores are above the mean

Print each result clearly.
"""

result = pot(problem=data_problem)

if result.success:
    print_step("Data Analysis Complete")
    print(f"✓ Statistical analysis completed successfully")
    print("✓ All statistics calculated and displayed")
else:
    print_error(f"Failed to analyze data: {result.error}")
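
The expected statistics can be cross-checked with the standard library, as in the sketch below. The problem statement does not say whether the population or the sample standard deviation is wanted, so both are computed here.

```python
import statistics

scores = [85, 92, 78, 96, 88, 91, 87, 93, 89, 84]

mean = statistics.mean(scores)
median = statistics.median(scores)
population_std = statistics.pstdev(scores)  # divides by n
sample_std = statistics.stdev(scores)       # divides by n - 1
above_mean = sum(1 for score in scores if score > mean)

print(f"Mean: {mean}")
print(f"Median: {median}")
print(f"Population standard deviation: {population_std:.3f}")
print(f"Sample standard deviation: {sample_std:.3f}")
print(f"Scores above the mean: {above_mean}")
```

Note that `statistics` is not in the executor's module list above, so a generated program would have to compute these values with `math` or plain arithmetic unless `statistics` is added to `safe_modules`.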

## Example 5: Word Problem with Logic

Solve a word problem requiring logical reasoning.

In [None]:
# Logical reasoning problem
logic_problem = """
Three friends Alice, Bob, and Charlie are comparing their ages.
- Alice is 3 years older than Bob
- Charlie is 2 years younger than Alice
- The sum of all their ages is 63

What are their individual ages?
"""

result = pot(problem=logic_problem)

if result.success:
    print_step("Logic Problem Solved")
    print(f"✓ Ages calculated successfully using algebraic equations")
    print(f"✓ Solution: {result.answer}")
else:
    print_error(f"Failed to solve logic problem: {result.error}")
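
The constraints reduce to a single linear equation, so the answer can be checked by hand or with the sketch below. With the stated sum of 63 the ages are not whole numbers (Bob is 59/3, about 19.67), so fractional output from a generated program is correct here rather than a sign of failure. `solve_ages` is a hypothetical helper written for this check.

```python
from fractions import Fraction

def solve_ages(total: int) -> dict:
    """Solve bob + (bob + 3) + (bob + 1) = total exactly."""
    # Alice = Bob + 3 and Charlie = Alice - 2 = Bob + 1, so 3 * Bob + 4 = total
    bob = Fraction(total - 4, 3)
    return {"Alice": bob + 3, "Bob": bob, "Charlie": bob + 1}

ages = solve_ages(63)
for name, age in ages.items():
    print(f"{name}: {age} (about {float(age):.2f})")
```

A total of 64 would give whole-number ages of 23, 20, and 21.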

## Advanced PoT: Self-Improving Programs

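Wrap the basic module in a class that records how many attempts a solution needed and flags programs that exceed a target. This version is a simplified placeholder: the `improve_program` predictor is declared but not yet called, and the "improved" program is the original annotated with performance notes.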

In [None]:
class SelfImprovingPoT(dspy.Module):
    """Program of Thought with self-improvement capabilities."""
    
    def __init__(self):
        super().__init__()
        self.basic_pot = ProgramOfThought()
        self.improve_program = dspy.ChainOfThought(
            "problem, current_program, performance_metrics -> improved_program"
        )
    
    def solve_with_improvement(self, problem: str, performance_target: dict = None):
        """Solve a problem and improve the solution if needed."""
        
        # First attempt with basic PoT
        initial_result = self.basic_pot(problem=problem)
        
        if not initial_result.success:
            return initial_result
        
        print_step("Performance Analysis")
        
        # Analyze performance (simplified)
        performance_metrics = {
            "attempts_needed": initial_result.attempts,
            "code_efficiency": "needs_analysis",
            "readability": "needs_analysis"
        }
        
        print_result(f"Initial performance: {performance_metrics}")
        
        # If performance can be improved, try to enhance the program
        if performance_target and initial_result.attempts > performance_target.get("max_attempts", 1):
            print_step("Attempting Program Improvement")
            
            # This would ideally use a more sophisticated improvement process
            improved_result = f"""
# Improved version of the solution
{initial_result.program}

# Performance notes:
# - Solved in {initial_result.attempts} attempts
# - Could be optimized for better efficiency
"""
            
            return dspy.Prediction(
                program=improved_result,
                success=True,
                output=initial_result.output,
                answer=initial_result.answer,
                improvements_made=True
            )
        
        return initial_result

# Test self-improvement
self_improving_pot = SelfImprovingPoT()

improvement_test = """
Calculate the compound interest on $1000 invested at 5% annual interest for 3 years,
compounded annually. Use the formula A = P(1 + r)^t where A is final amount, 
P is principal, r is rate, and t is time.
"""

improved_result = self_improving_pot.solve_with_improvement(
    problem=improvement_test,
    performance_target={"max_attempts": 1}
)

print_step("Self-Improvement Results")
if improved_result.success:
    print("✓ Problem solved with self-improvement analysis")
    if hasattr(improved_result, 'improvements_made'):
        print("✓ Program improvements suggested")
    print(f"✓ Final answer: {improved_result.answer}")
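
The expected value for this test can be worked out directly from A = P(1 + r)^t. The sketch below uses `decimal.Decimal` to avoid binary floating-point rounding. This is a choice made for the check only; a generated program running in the executor above would more likely use plain floats, since `decimal` is not in its module list.

```python
from decimal import Decimal

principal = Decimal("1000")
rate = Decimal("0.05")
years = 3

# A = P(1 + r)^t
amount = principal * (1 + rate) ** years
interest = amount - principal

print(f"Final amount: {amount:.3f}")
print(f"Interest earned: {interest:.3f}")
```

The final amount is 1000 × 1.157625 = 1157.625, so the interest earned is 157.625.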

## Benefits of Program of Thought

### Advantages of PoT over Standard Chain of Thought:

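1. **Computational Accuracy**: Arithmetic is carried out by the Python interpreter instead of being predicted token by token
2. **Complex Logic**: Loops, conditionals, and data structures can express multi-step algorithms and data processing
3. **Verifiability**: The generated code can be inspected, executed, and tested
4. **Reusability**: Generated programs can be saved and rerun on new inputs
5. **Debugging**: Reasoning errors surface as syntax errors, exceptions, or wrong outputs that can be fed back for refinement
6. **Scalability**: A loop can process inputs that would be impractical to work through step by step in prose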

### Use Cases:

- Mathematical problem solving
- Data analysis and statistics
- Algorithm implementation
- Financial calculations
- Scientific computations
- Logic puzzles and reasoning

### Limitations:

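- Generated code must run in an isolated environment; a restricted `exec` namespace like the one above is not a security boundary
- Limited to the libraries and built-ins exposed to the executor
- May need several generate-and-refine attempts for complex problems, each costing an additional model call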
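
As one illustration of the security point, generated code can be screened statically before it is executed. The sketch below walks the syntax tree and flags imports outside an allow-list as well as double-underscore attribute access. `find_disallowed_constructs` and `ALLOWED_MODULES` are hypothetical names, and the check is a heuristic that complements process isolation but does not replace it.

```python
import ast
from typing import List

# Assumed allow-list, mirroring the modules exposed by the executor in this notebook
ALLOWED_MODULES = {"math", "random", "itertools", "collections"}

def find_disallowed_constructs(code: str) -> List[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for `from . import x`
        else:
            names = []
        for name in names:
            if name.split(".")[0] not in ALLOWED_MODULES:
                findings.append(f"line {node.lineno}: import of '{name}'")
        # Dunder attribute access is a common route out of restricted namespaces
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            findings.append(f"line {node.lineno}: access to '{node.attr}'")
    return findings
```

`ast.parse` raises `SyntaxError` on invalid input, so this check fits after syntax validation and before execution.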

## Conclusion

Program of Thought (PoT) extends Chain of Thought reasoning by generating executable code to solve problems. This approach is particularly powerful for:

- Problems requiring precise calculations
- Multi-step algorithms
- Data processing tasks
- Mathematical and scientific reasoning

The combination of natural language reasoning and code generation makes PoT a versatile tool for AI problem-solving applications.