# Self-Consistency Prompting Example

This notebook demonstrates **Self-Consistency**, a technique that improves accuracy on reasoning tasks by "asking the audience" (where the audience is the model itself).

**The Concept:**
LLMs are probabilistic. Asked a complex question only once, a model might take a wrong turn in its reasoning.
**Self-Consistency** mitigates this by:
1.  **Sampling**: Asking the *same* question multiple times (e.g., 5 times).
2.  **Reasoning**: Letting the model generate a full Chain of Thought for each attempt.
3.  **Voting**: Comparing the final answers and selecting the one that appears most frequently.

**Key Mechanics:**
* **Diversity**: We use a non-zero temperature to ensure each reasoning path is slightly different.
* **Consensus**: If 4 out of 5 reasoning paths lead to "Answer: 42," we can be much more confident than if we only asked once.
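
The voting step needs no model at all, so it can be sketched in isolation. The snippet below is a minimal illustration using hard-coded sample answers (made up for this example, not real model output). `majority_vote` is a hypothetical helper name, not part of any library.

```python
from collections import Counter

def majority_vote(answers):
    """Return (winner, votes_for_winner, total_votes) for a list of answers."""
    if not answers:
        return None, 0, 0
    counts = Counter(answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes, len(answers)

# Hypothetical final answers from 5 reasoning paths (illustrative only)
sampled_answers = ["42", "42", "41", "42", "42"]
winner, votes, total = majority_vote(sampled_answers)
print(f"{winner} won with {votes}/{total} votes")
```

One stray path ("41") is simply outvoted, which is the whole point of sampling more than once.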

In [None]:
%pip install openai python-dotenv --quiet

### 1. Setup and Authorization

We start by importing the necessary libraries and loading your OpenAI API key. We also import `Counter` from Python's standard library to help us tally the votes.

In [None]:
from openai import OpenAI
import os
from dotenv import load_dotenv
from collections import Counter

load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    api_key = input("Paste your OpenAI API key: ").strip()

client = OpenAI(api_key=api_key)
print("OpenAI client ready!")

### 2. The Self-Consistency Engine

This function runs the core logic.

**How it works:**
1.  **Loop**: It runs a loop `num_samples` times (default is 5).
2.  **Generate**: It calls the API for each sample.
    * **Crucial Detail**: We set `temperature=0.7`. At 0, the samples would be nearly identical, so the vote would tell us nothing. We need some variation in the reasoning paths to test whether the answer is robust.
3.  **Extract**: It parses the model's output to find the final answer (assuming the answer is on the last line).
4.  **Vote**: It uses `Counter` to find the most common result.
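
The last-line heuristic in step 3 is fragile: "The answer is 2." and "2" would count as two different votes. One possible hardening, sketched here under the assumption that the prompt asks the model to end with a line of the form `Answer: <X>`, is to pull the answer out with a regular expression and normalize it before voting. `extract_answer` is a hypothetical helper and is not used by the function in this notebook.

```python
import re

# Assumed output convention: the reply contains a line like "Answer: 42"
ANSWER_PATTERN = re.compile(r"^\s*answer\s*:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE)

def extract_answer(reply):
    """Return the normalized answer from the last 'Answer: ...' line, or None if absent."""
    matches = ANSWER_PATTERN.findall(reply)
    if not matches:
        return None
    # Normalize: drop trailing punctuation and lowercase so "Yes." and "yes" match
    return matches[-1].strip().rstrip(".!").lower()

reply = "3 - 1 = 2 apples.\n2 + 2 = 4 apples.\nHalf of 4 is 2.\nAnswer: 2."
print(extract_answer(reply))
```

Returning `None` when no answer line is found lets the caller skip that sample instead of voting on arbitrary text.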

In [None]:
SYSTEM_PROMPT = """
Solve using chain of thought: Break down step by step and give a final answer.
Ensure your final answer is on the very last line of your response.
"""

def get_self_consistent_answer(query, num_samples=5):
    print(f"Generating {num_samples} reasoning paths for: '{query}'...\n")
    answers = []
    
    for i in range(num_samples):
        try:
            # 1. Generate a Reasoning Path
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": query}
                ],
                temperature=0.7  # Higher temp = diverse reasoning paths
            )
            
            reply = response.choices[0].message.content.strip()
            
            # 2. Extract Final Answer (Simple heuristic: take the last line)
            # In a real app, you might ask for a specific format like "Answer: <X>"
            final_ans = reply.split("\n")[-1].strip()
            answers.append(final_ans)
            
            print(f"--- Path {i+1} ---")
            print(f"Reasoning: {reply[:100]}...") # Show snippet
            print(f"Result: {final_ans}\n")
            
        except Exception as e:
            print(f"Error on path {i+1}: {e}")
            continue

    if not answers:
        return None, 0, {}

    # 3. Aggregation (Majority Voting)
    print("-" * 30)
    vote_counts = Counter(answers)
    most_common, count = vote_counts.most_common(1)[0]
    
    return most_common, count, vote_counts

### 3. Run the Consistency Check

Now we test it. This technique works best on math word problems and logic puzzles, where there is a single correct answer but many ways to get lost along the way.

**Try asking:**
* "If I have 3 apples, eat one, buy two more, and drop half of them, how many do I have?"
* "Janet is older than Bob. Bob is younger than Tom. Tom is older than Janet. Is this possible?"

In [None]:
query = input("Enter a problem for self-consistency: ")

if query:
    winner, count, all_votes = get_self_consistent_answer(query, num_samples=5)
    
    total_votes = sum(all_votes.values())  # Failed API calls cast no vote
    print(f"🏆 MOST CONSISTENT ANSWER: {winner}")
    print(f"Confidence: {count}/{total_votes} votes")
    print(f"Vote Distribution: {dict(all_votes)}")
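
A raw vote count hides two cases worth checking before trusting the winner: a tie for first place and a weak plurality. The sketch below is one way to flag both. `summarize_votes` and the 0.6 threshold are illustrative assumptions, not part of this notebook's code or of any library.

```python
from collections import Counter

def summarize_votes(vote_counts, min_agreement=0.6):
    """Summarize a Counter of answers: winner, agreement ratio, tie flag, trust flag."""
    total = sum(vote_counts.values())
    if total == 0:
        return {"winner": None, "agreement": 0.0, "tied": False, "trusted": False}
    ranked = vote_counts.most_common()
    winner, top = ranked[0]
    # A tie means the runner-up has as many votes as the winner
    tied = len(ranked) > 1 and ranked[1][1] == top
    agreement = top / total
    return {
        "winner": winner,
        "agreement": agreement,
        "tied": tied,
        # min_agreement is an arbitrary illustrative threshold
        "trusted": (not tied) and agreement >= min_agreement,
    }

print(summarize_votes(Counter({"2": 4, "3": 1})))
print(summarize_votes(Counter({"yes": 2, "no": 2, "maybe": 1})))
```

Since `all_votes` in the cell above is already a `Counter`, `summarize_votes(all_votes)` could be called right after the vote. When the result is not trusted, a reasonable next step is to draw more samples rather than report the winner.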