### Self-Consistency and Multiple Paths of Reasoning Tutorial
#### Overview
This tutorial covers self-consistency in prompt engineering: generating several independent reasoning paths for the same problem and aggregating them into a single, more reliable answer.

#### Motivation
Large language models can sometimes produce inconsistent or unreliable outputs. By leveraging multiple reasoning paths and aggregating results, we can enhance the robustness and accuracy of AI-generated responses. This approach is particularly useful for complex problem-solving tasks where a single path of reasoning might be insufficient or prone to errors.

#### Key Components
- Generating multiple reasoning paths
- Aggregating results for better answers
- Implementing self-consistency checks
- Applying these techniques to various problem-solving scenarios

In [1]:
import os
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from dotenv import load_dotenv
import random
from collections import Counter

# Load environment variables
load_dotenv()

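# Make sure the OpenAI API key is available (load_dotenv() reads it from a .env file if present).
# Assigning os.getenv(...) back into os.environ raises a TypeError when the variable is unset.
if not os.getenv("OPENAI_API_KEY"):
    raise EnvironmentError("OPENAI_API_KEY is not set; add it to your environment or .env file.")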

# Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini")

#### Method Details
Our approach involves the following steps:

1. Setting up the environment with necessary libraries (OpenAI and LangChain)
2. Designing prompts that encourage diverse reasoning paths
3. Generating multiple responses using these prompts
4. Implementing aggregation methods to combine and analyze the generated responses
5. Applying self-consistency checks to evaluate the reliability of the results
6. Demonstrating the effectiveness of this approach on various problem types

Throughout the tutorial, we'll use practical examples to illustrate how these techniques can be applied to enhance the quality and reliability of AI-generated answers.

#### Generating Multiple Reasoning Paths
Let's create a function that generates multiple reasoning paths for a given problem.

In [2]:
def generate_multiple_paths(problem, num_paths=3):
    """
    Generate multiple reasoning paths for a given problem.
    
    Args:
    problem (str): The problem statement.
    num_paths (int): Number of reasoning paths to generate.
    
    Returns:
    list: A list of generated reasoning paths.
    """
    prompt_template = PromptTemplate(
        input_variables=["problem", "path_number"],
        template="""Solve the following problem using a unique approach. This is reasoning path {path_number}.
        Problem: {problem}
        Reasoning path {path_number}:"""
    )

    # Build the chain once and reuse it for every path
    chain = prompt_template | llm

    paths = []
    for i in range(num_paths):
        response = chain.invoke({"problem": problem, "path_number": i + 1}).content
        paths.append(response)
    
    return paths
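
The paths returned by `generate_multiple_paths` are free-form text, so comparing them programmatically requires pulling a final answer out of each one. The sketch below is one possible heuristic, not part of the original notebook: the helper name `extract_final_number` is hypothetical, and it assumes the last number mentioned in a path is its final answer, which will not hold for every response.

```python
import re


def extract_final_number(path_text):
    """Return the last number mentioned in a reasoning path, or None if there is none.

    Heuristic (an assumption): the final answer is the last number in the text.
    """
    # Match integers and decimals, with an optional leading minus sign.
    matches = re.findall(r"-?\d+(?:\.\d+)?", path_text)
    return float(matches[-1]) if matches else None


# Illustrative text, not real model output.
print(extract_final_number("Therefore the maximum height is about 20.39 meters."))
```

A more robust variant would ask the model, in the prompt, to end each path with a fixed marker line and parse only that line.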

In [3]:
problem = "A ball is thrown upwards with an initial velocity of 20 m/s. How high will it go?"
paths = generate_multiple_paths(problem)

for i, path in enumerate(paths, 1):
    print(f"Path {i}:\n{path}\n")

Path 1:
To solve the problem using Reasoning Path 1, we can utilize the principles of physics, particularly kinematics, to find the maximum height reached by the ball.

1. **Identify the given values:**
   - Initial velocity (\( u \)) = 20 m/s
   - Final velocity at the maximum height (\( v \)) = 0 m/s (since the ball momentarily stops at its peak)
   - Acceleration due to gravity (\( a \)) = -9.81 m/s² (the negative sign indicates that gravity acts downwards)

2. **Use the kinematic equation:**
   One of the kinematic equations relates initial velocity, final velocity, acceleration, and displacement (height in this case):
   \[
   v^2 = u^2 + 2a s
   \]
   where:
   - \( s \) is the displacement (maximum height),
   - \( u \) is the initial velocity,
   - \( v \) is the final velocity,
   - \( a \) is the acceleration.

3. **Substitute the values:**
   Plugging in the given values:
   \[
   0^2 = (20 \, \text{m/s})^2 + 2(-9.81 \, \text{m/s}^2) s
   \]

   Simplifying:
   \[
   0 = 400

#### Aggregating Results
Now that we have multiple reasoning paths, let's create a function to aggregate the results and determine the most consistent answer.

In [4]:
def aggregate_results(paths):
    """
    Aggregate results from multiple reasoning paths.
    
    Args:
    paths (list): List of reasoning paths.
    
    Returns:
    str: The most consistent answer.
    """
    prompt_template = PromptTemplate(
        input_variables=["paths"],
        template="""Analyze the following reasoning paths and determine the most consistent answer. If there are discrepancies, explain why and provide the most likely correct answer.
        Reasoning paths:
        {paths}
        
        Most consistent answer:"""
    )

    chain = prompt_template | llm
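    # Label each path so the model can tell where one ends and the next begins
    labeled_paths = "\n\n".join(f"Path {i}:\n{path}" for i, path in enumerate(paths, 1))
    response = chain.invoke({"paths": labeled_paths}).content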
    return response
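
The function above delegates aggregation to the model itself. A common alternative for self-consistency is a plain majority vote over the final answers. The sketch below is a minimal illustration under the assumption that a short final answer has already been extracted from each path; `majority_vote` is a hypothetical helper, and the sample answers are made up rather than taken from a real run.

```python
from collections import Counter


def majority_vote(answers):
    """Return (most_common_answer, agreement_ratio) for a list of extracted answers."""
    # Ignore paths from which no answer could be extracted.
    valid = [a for a in answers if a is not None]
    if not valid:
        return None, 0.0
    winner, count = Counter(valid).most_common(1)[0]
    # The agreement ratio is the share of valid paths that voted for the winner.
    return winner, count / len(valid)


answers = ["20.39 m", "20.39 m", "20.4 m"]  # illustrative values, not real model output
answer, agreement = majority_vote(answers)
print(answer, round(agreement, 2))  # prints: 20.39 m 0.67
```

The agreement ratio can serve as a rough confidence signal: a low value suggests the paths disagree and the answer deserves a closer look. Note that exact string matching treats `20.39 m` and `20.4 m` as different answers, so numeric answers may need rounding before the vote.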

In [5]:
aggregated_result = aggregate_results(paths)
print("Aggregated Result:\n", aggregated_result)

Aggregated Result:
 After analyzing the provided reasoning paths, both approaches to solving the problem yield the same result: the maximum height reached by a ball thrown upwards with an initial velocity of 20 m/s is approximately **20.39 meters**. Here’s a summary of the reasoning and conclusions from both paths:

### Reasoning Path 1: Kinematic Equations
1. **Given Values**: Initial velocity (\(u = 20 \, \text{m/s}\)), final velocity at maximum height (\(v = 0 \, \text{m/s}\)), and acceleration due to gravity (\(a = -9.81 \, \text{m/s}^2\)).
2. **Kinematic Equation**: Used the equation \(v^2 = u^2 + 2as\) to solve for height \(s\).
3. **Calculation**: Arriving at the conclusion \(s \approx 20.39 \, \text{m}\).

### Reasoning Path 2: Energy Conservation
1. **Energy Concepts**: Evaluated the conversion of kinetic energy into potential energy.
2. **Kinetic Energy Calculation**: The ball’s kinetic energy at launch was calculated as \(200m \, \text{Joules}\).
3. **Potential Energy at Max

#### Self-Consistency Check
To further improve our results, let's implement a self-consistency check that evaluates the reliability of our aggregated answer.

In [6]:
def self_consistency_check(problem, aggregated_result):
    """
    Perform a self-consistency check on the aggregated result.
    
    Args:
    problem (str): The original problem statement.
    aggregated_result (str): The aggregated result to check.
    
    Returns:
    str: An evaluation of the result's consistency and reliability.
    """
    prompt_template = PromptTemplate(
        input_variables=["problem", "result"],
        template="""Evaluate the consistency and reliability of the following result for the given problem.
        Problem: {problem}
        Result: {result}
        
        Evaluation (consider factors like logical consistency, adherence to known facts, and potential biases):"""
    )

    chain = prompt_template | llm
    response = chain.invoke({"problem": problem, "result": aggregated_result}).content
    return response
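
An LLM-based evaluation can itself be wrong, so for problems with a closed-form solution it helps to cross-check the number independently. The sketch below does this for the ball example by computing the height two ways, through kinematics and through energy conservation, and confirming they agree. The function names are hypothetical, and g = 9.81 m/s² is the value used in the reasoning paths shown earlier.

```python
import math

G = 9.81  # m/s^2, the value used in the reasoning paths


def max_height_kinematics(u, g=G):
    # From v^2 = u^2 + 2as with v = 0 and a = -g: s = u^2 / (2g)
    return u ** 2 / (2 * g)


def max_height_energy(u, g=G, mass=1.0):
    # Kinetic energy at launch equals potential energy at the peak:
    # (1/2) m u^2 = m g h, so h = u^2 / (2g); the mass cancels.
    kinetic_energy = 0.5 * mass * u ** 2
    return kinetic_energy / (mass * g)


h1 = max_height_kinematics(20.0)
h2 = max_height_energy(20.0)
print(round(h1, 2), round(h2, 2), math.isclose(h1, h2))  # prints: 20.39 20.39 True
```

Both routes give about 20.39 m, matching the aggregated result above.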

In [7]:
consistency_evaluation = self_consistency_check(problem, aggregated_result)
print("Self-Consistency Evaluation:\n", consistency_evaluation)

Self-Consistency Evaluation:
 The evaluation of the result regarding the maximum height reached by a ball thrown upwards with an initial velocity of \(20 \, \text{m/s}\) reveals a few important aspects regarding the consistency and reliability of the conclusion.

### Logical Consistency
1. **Kinematic Equations**: The derivation using kinematic equations is appropriate and follows a logical sequence:
   - The use of the equation \(v^2 = u^2 + 2as\) to find the maximum height (where final velocity \(v=0\) at max height) is standard. 
   - Plugging in the values gives:
     \[
     0 = (20)^2 + 2(-9.81)s \implies 0 = 400 - 19.62s \implies 19.62s = 400 \implies s \approx 20.39 \, \text{meters}.
     \]
   - This correctly leads to the conclusion of approximately \(20.39\) meters.

2. **Energy Conservation**: The energy conservation approach is also sound:
   - The kinetic energy at the launch was calculated correctly as \(\frac{1}{2}mv^2\), leading to \(KE = 200m\) (correct if mass is tak

In [None]:
def solve_problem(problem):
    """
    Solve a problem using multiple reasoning paths, aggregation, and self-consistency check.
    
    Args:
    problem (str): The problem statement.
    
    Returns:
    tuple: (aggregated_result, consistency_evaluation)
    """
    paths = generate_multiple_paths(problem)
    aggregated_result = aggregate_results(paths)
    consistency_evaluation = self_consistency_check(problem, aggregated_result)
    return aggregated_result, consistency_evaluation

# Example problems
problems = [
    "What is the capital of France?",
    "Explain the concept of supply and demand in economics.",
    "If a train travels at 60 km/h, how long will it take to cover 180 km?"
]

for problem in problems:
    print(f"Problem: {problem}")
    result, evaluation = solve_problem(problem)
    print("Aggregated Result:\n", result)
    print("\nConsistency Evaluation:\n", evaluation)
    print("\n" + "-"*50 + "\n")

Problem: What is the capital of France?
Aggregated Result:
 The most consistent answer across all three reasoning paths is **Paris**. 

**Discrepancies and Explanation:**
1. All reasoning paths consistently point towards Paris by highlighting different aspects: historical context, cultural significance, geography, and political importance.
2. None of the paths provide conflicting information or alternatives; each reinforces the notion that Paris is the capital by showcasing its unique characteristics and significance.
3. The conclusion across all three paths is aligned, solidifying Paris as the capital without any contradictions.

By synthesizing these perspectives, it's clear that the reasoning solidly supports the answer of **Paris** without any discrepancies.

Consistency Evaluation:
 The evaluation of the consistency and reliability of the result regarding the capital of France, which is identified as **Paris**, reveals several positive aspects. Here are the key factors:

1. **Logi