# Chain of thought (CoT) prompting

This notebook introduces chain of thought (CoT) prompting, a powerful prompt engineering technique that guides LLMs to solve problems by reasoning through them step by step. Much like a human who “shows their work” in math or logic, the model is encouraged to think aloud before arriving at a final answer.

As language models grow more powerful, our expectations shift from just correct outputs to transparent and trustworthy reasoning. CoT prompting addresses this need by asking the model, in the prompt itself, to write out its reasoning before it answers. CoT prompting is useful because it helps with:
- Accuracy — by making the reasoning process more structured.
- Transparency — because we can see and verify each step.
- Debugging — since it’s easier to spot where things might have gone wrong in the model’s thought process.

This approach is especially helpful for tasks that involve multiple steps, like math problems, logic puzzles, or anything that requires reasoning beyond a simple lookup.

In [1]:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

# Load environment variables
load_dotenv()

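# Check that the OpenAI API key is available (load_dotenv() reads it from a .env file if present)
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your environment or a .env file.")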

### Initialize the language model
We instantiate a lightweight GPT model from OpenAI using LangChain.

In [2]:
# Initialize the language model
llm = ChatOpenAI(model="gpt-4o-mini-2024-07-18")

## Basic chain of thought prompting
To understand how CoT prompting works, we will start with a simple example: calculating average speed. The goal is to compare a standard direct prompt with a CoT-style prompt that encourages the model to reason through each step before answering.

In traditional prompting, we ask the model for an answer directly. With CoT prompting, we add an instruction to reason "step by step," nudging the model to walk through the logic before producing the final answer.

In [3]:
# Standard prompt: Asks for a concise answer without elaboration
standard_prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer the following question concisely: {question}"
)

# Chain of Thought prompt: Adds an instruction to reason step by step
cot_prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer the following question step by step concisely: {question}"
)

# Create chains
standard_chain = standard_prompt | llm
cot_chain = cot_prompt | llm

# Define a simple math question to test both prompts
question = "If a train travels 120 km in 2 hours, what is its average speed in km/h?"

# Run both prompts through the chains and collect responses
standard_response = standard_chain.invoke(question).content
cot_response = cot_chain.invoke(question).content

print("Standard Response:")
print(standard_response)
print("\nChain of Thought Response:")
print(cot_response)

Standard Response:
The average speed of the train is 60 km/h.

Chain of Thought Response:
To find the average speed, you can use the formula:

\[ \text{Average Speed} = \frac{\text{Total Distance}}{\text{Total Time}} \]

1. **Identify the total distance**: The train travels 120 km.
2. **Identify the total time**: The time taken is 2 hours.
3. **Apply the formula**:

   \[ \text{Average Speed} = \frac{120 \text{ km}}{2 \text{ hours}} \]

4. **Calculate**:

   \[ \text{Average Speed} = 60 \text{ km/h} \]

Therefore, the average speed of the train is **60 km/h**.


This side-by-side comparison shows how a minor change in prompt wording produces a more transparent output: both responses reach 60 km/h, but only the CoT response shows the formula and the intermediate steps. The code does three things:
1. Prompt definition: We define two `PromptTemplate`s:
   - The standard prompt simply asks the model to answer the question concisely.
   - The CoT prompt adds the phrase “step by step,” signaling the model to work through the reasoning process.
2. Prompt execution: Using LangChain’s `|` operator, we pipe each prompt into the language model (`llm`), creating two chains, `standard_chain` and `cot_chain`. These chains handle prompt formatting and response generation.
3. Inference: We provide both chains with the same input question.
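
To make the pipe step less opaque, here is a toy sketch of what `prompt | llm` does conceptually: fill in the template, then hand the resulting string to the model. `ToyStep`, `make_prompt`, and `fake_llm` are hypothetical stand-ins written for this illustration. They are not LangChain classes, the templates are only modeled on the two above, and no API call is made.

```python
class ToyStep:
    """Toy stand-in for a chain component (not a LangChain class)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b
        return ToyStep(lambda value: other.invoke(self.invoke(value)))


def make_prompt(template):
    """Return a step that fills the {question} placeholder of a template."""
    return ToyStep(lambda question: template.format(question=question))


# Hypothetical fake model: it only reports the prompt it would receive
fake_llm = ToyStep(lambda prompt: f"[prompt sent to model] {prompt}")

toy_standard_chain = make_prompt(
    "Answer the following question concisely: {question}"
) | fake_llm
toy_cot_chain = make_prompt(
    "Answer the following question step by step concisely: {question}"
) | fake_llm

toy_question = "If a train travels 120 km in 2 hours, what is its average speed in km/h?"
toy_standard_output = toy_standard_chain.invoke(toy_question)
toy_cot_output = toy_cot_chain.invoke(toy_question)

print(toy_standard_output)
print(toy_cot_output)
```

Printing the two outputs side by side shows that the only difference reaching the model is the phrase "step by step".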

This basic comparison illustrates a key idea behind CoT prompting: even small prompt changes can significantly affect how the model reasons and communicates. In more complex problems, this difference becomes even more valuable — CoT prompting helps the model avoid mistakes by breaking the problem into smaller, more manageable parts.

## Advanced chain of thought techniques
Let’s now extend the CoT idea by introducing a more structured format that explicitly encourages multi-step reasoning. Instead of simply asking the model to "think step by step," we guide it through a defined sequence: first stating what needs to be calculated, then identifying the relevant formula, performing the computation, and finally explaining the result.

This kind of structured reasoning closely mirrors how humans approach problem-solving — particularly in domains where precision is critical, such as mathematics, finance and decision support tools. By making each reasoning step explicit, we reduce the likelihood of the model skipping over important details or making silent mistakes.

The result is not only greater reliability and accuracy, but also improved interpretability. Each step is clearly laid out, making it easier for users to verify the model’s logic and understand how it arrived at its final answer.

In [4]:
# Define a structured CoT prompt for multi-step quantitative reasoning
advanced_cot_prompt = PromptTemplate(
    input_variables=["question"],
    template="""Solve the following problem step by step. For each step:
1. State what you're going to calculate
2. Write the formula you'll use (if applicable)
3. Perform the calculation
4. Explain the result

Question: {question}

Solution:"""
)

# Pipe the prompt into the language model to create a CoT chain
advanced_cot_chain = advanced_cot_prompt | llm

# Example: Multi-leg trip average speed problem
complex_question = "A car travels 150 km at 60 km/h, then another 100 km at 50 km/h. What is the average speed for the entire journey?"

# Run the advanced CoT chain with the complex question
advanced_cot_response = advanced_cot_chain.invoke(complex_question).content
print(advanced_cot_response)

To find the average speed for the entire journey, we will use the following steps.

### Step 1: Calculate the time taken for each segment of the journey.
**What I'm going to calculate:** The time taken to travel each segment of the journey.

**Formula:** 
\[
\text{Time} = \frac{\text{Distance}}{\text{Speed}}
\]

#### Segment 1:
- Distance = 150 km
- Speed = 60 km/h

**Calculation:**
\[
\text{Time}_1 = \frac{150 \text{ km}}{60 \text{ km/h}} = 2.5 \text{ hours}
\]

#### Segment 2:
- Distance = 100 km
- Speed = 50 km/h

**Calculation:**
\[
\text{Time}_2 = \frac{100 \text{ km}}{50 \text{ km/h}} = 2 \text{ hours}
\]

### Explanation of result for Step 1:
The time taken for the first segment of the journey is 2.5 hours, and for the second segment, it is 2 hours. This means it took a total of 4.5 hours for the entire journey.

---

### Step 2: Calculate the total distance traveled.
**What I'm going to calculate:** The total distance traveled during the journey.

**Formula:** 
\[
\text{Total D

- We first define a detailed CoT-style prompt that gives the model a reasoning structure to follow — not just a hint, but a full step-by-step scaffold.
- We then create a LangChain pipeline (`advanced_cot_chain`) by combining the prompt and the model.
- A more complex math question is passed into this chain: the average speed of a car with two legs of different distances and speeds.
- The model walks through the problem using the structure we provided, showing its full reasoning process along the way.
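
Because the printed output above is cut off, a quick independent check of the arithmetic can be useful. The sketch below follows the same scaffold (time per segment, total distance, total time, then average speed). The variable names are our own, and the result comes from the formulas, not from the model's response.

```python
# Each leg of the trip as (distance in km, speed in km/h), taken from the question
segments = [(150, 60), (100, 50)]

# Time = Distance / Speed for each segment
segment_times = [distance / speed for distance, speed in segments]

total_distance = sum(distance for distance, _ in segments)  # km
total_time = sum(segment_times)                              # hours

# Average speed = Total Distance / Total Time
average_speed = total_distance / total_time

# For contrast: the plain mean of the two speeds, which ignores
# how long the car spends on each leg
naive_mean_speed = sum(speed for _, speed in segments) / len(segments)

print(f"Segment times: {segment_times} h")
print(f"Total: {total_distance} km in {total_time} h")
print(f"Average speed: {average_speed:.2f} km/h")
print(f"Mean of the two speeds: {naive_mean_speed:.2f} km/h")
```

The average speed works out to about 55.56 km/h (250 km over 4.5 hours), slightly above the 55 km/h obtained by simply averaging the two speeds. Computing the time per segment first, as the scaffold asks, is what avoids that shortcut.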

## Comparative analysis
Now that we have explored both basic and advanced chain of thought prompting, let’s put them to the test. In this section, we compare how standard prompting and CoT prompting perform on a more complex, real-world problem that involves volume calculations, unit conversions, and reasoning over rates.

This kind of problem is often where large language models can struggle. Without a structured approach, the model may:
- Misapply formulas.
- Skip key unit conversions.
- Give a final answer without showing the necessary steps.

In [5]:
# Define a challenging question to test both prompts
challenging_question = """
A cylindrical water tank with a radius of 1.5 meters and a height of 4 meters is 2/3 full.
If water is being added at a rate of 10 liters per minute, how long will it take for the tank to overflow?
Give your answer in hours and minutes, rounded to the nearest minute.
(Use 3.14159 for π and 1000 liters = 1 cubic meter)"""

# Invoke the standard prompting chain
standard_response = standard_chain.invoke(challenging_question).content
# Invoke the advanced CoT prompting chain
cot_response = advanced_cot_chain.invoke(challenging_question).content

print("Standard Response:")
print(standard_response)
print("\nChain of Thought Response:")
print(cot_response)

Standard Response:
First, we calculate the volume of the cylindrical tank using the formula for the volume of a cylinder: 

\[ V = \pi r^2 h \]

Where:
- \( r = 1.5 \) meters
- \( h = 4 \) meters

Calculating the volume:

\[ V = 3.14159 \times (1.5)^2 \times 4 \]
\[ V = 3.14159 \times 2.25 \times 4 \]
\[ V = 3.14159 \times 9 = 28.27436 \text{ cubic meters} \]

Since the tank is 2/3 full, we find the current volume of water:

\[ \text{Current Volume} = \frac{2}{3} \times 28.27436 = 18.84957 \text{ cubic meters} \]

The volume of the tank when full is:

\[ 28.27436 \text{ cubic meters} \]

The volume needed to fill the tank to the top is:

\[ \text{Volume to fill} = 28.27436 - 18.84957 = 9.42479 \text{ cubic meters} \]

Converting cubic meters to liters (1 cubic meter = 1000 liters):

\[ \text{Volume to fill in liters} = 9.42479 \times 1000 = 9424.79 \text{ liters} \]

Water is being added at a rate of 10 liters per minute. To find the time it takes to fill the tank to the top, we use:



- Defines a challenging problem: This question requires understanding geometry (volume of a cylinder), fractional values (2/3 full), unit conversion (cubic meters to liters), and applying a rate (liters per minute).
- Sends the same input to both chains:
  - The `standard_chain` uses a direct prompt that asks for the answer.
  - The `advanced_cot_chain` walks the model through a multi-step reasoning process.
- Prints and compares the results: This allows us to assess not just the correctness of the answer, but also the transparency of the model’s thinking.

Real-world tasks often require layered reasoning. Standard prompting may suffice for straightforward queries, but when multiple steps are involved, CoT prompting helps ensure that each part of the logic is processed carefully and correctly. By observing the two outputs side by side, we can clearly see the benefits of CoT in action — particularly when the problem requires mathematical precision and logical sequencing.
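
When comparing model outputs on a problem like this, it helps to have a reference value to check against. The sketch below works through the tank question directly, treating "overflow" as the moment the tank becomes full. The variable names are our own assumptions, and the numbers come from the question, not from either model response.

```python
PI = 3.14159           # value of pi prescribed by the question
LITERS_PER_M3 = 1000   # conversion factor given in the question

radius_m = 1.5
height_m = 4.0
fill_fraction = 2 / 3
fill_rate_l_per_min = 10

# Volume of a cylinder: V = pi * r^2 * h
tank_volume_m3 = PI * radius_m ** 2 * height_m

# Only the empty third still has to be filled
remaining_m3 = tank_volume_m3 * (1 - fill_fraction)
remaining_liters = remaining_m3 * LITERS_PER_M3

# Time = volume / rate, rounded to the nearest minute as the question asks
total_minutes = round(remaining_liters / fill_rate_l_per_min)
hours, minutes = divmod(total_minutes, 60)

print(f"Tank volume: {tank_volume_m3:.5f} cubic meters")
print(f"Remaining volume: {remaining_liters:.2f} liters")
print(f"Time to fill: {hours} hours {minutes} minutes")
```

This gives 942 minutes, or 15 hours 42 minutes. The intermediate volume in the transcript above (28.27436) differs from the product 3.14159 × 9 = 28.27431 in the last digit, which does not change the rounded result.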

## Problem-solving applications: Logical reasoning
Now, let's apply CoT prompting to a more complex logical reasoning task. CoT prompting is not just for math — it excels in logical reasoning and deduction tasks too.

**Why use CoT for logical reasoning?**

CoT prompting encourages the model to approach the problem step-by-step. Instead of simply asking for an answer, we guide the model through a structured process that mirrors how humans solve logical puzzles:
1. List the facts: Summarize the given information clearly.
2. Identify possible roles: Determine what roles could be assigned to each person (truth-teller, liar, alternator).
3. Generate scenarios: Consider all possible combinations of roles.
4. Test each scenario: Analyze each scenario based on the statements provided.
5. Eliminate inconsistent scenarios: Discard any scenarios that lead to contradictions.
6. Conclude the solution: Identify the correct roles and explain the reasoning behind the solution.

This method ensures a transparent and thorough reasoning process, which is important for solving complex logical problems.

In [6]:
# Define the prompt template
logical_reasoning_prompt = PromptTemplate(
    input_variables=["scenario"],
    template="""Analyze the following logical puzzle thoroughly. Follow these steps in your analysis:

List the Facts:
- Summarize all the given information and statements clearly.
- Identify all the characters or elements involved.

Identify Possible Roles or Conditions:
- Determine all possible roles, behaviors, or states applicable to the characters or elements (e.g., truth-teller, liar, alternator).

Note the Constraints:
- Outline any rules, constraints, or relationships specified in the puzzle.

Generate Possible Scenarios:
- Systematically consider all possible combinations of roles or conditions for the characters or elements.
- Ensure that all permutations are accounted for.

Test Each Scenario:
- For each possible scenario:
  - Assume the roles or conditions you've assigned.
  - Analyze each statement based on these assumptions.
  - Check for consistency or contradictions within the scenario.

Eliminate Inconsistent Scenarios:
- Discard any scenarios that lead to contradictions or violate the constraints.
- Keep track of the reasoning for eliminating each scenario.

Conclude the Solution:
- Identify the scenario(s) that remain consistent after testing.
- Summarize the findings.

Provide a Clear Answer:
- State definitively the role or condition of each character or element.
- Explain why this is the only possible solution based on your analysis.

Scenario:
{scenario}

Analysis:""")

# Pipe the prompt into the language model to create a logical reasoning chain
logical_reasoning_chain = logical_reasoning_prompt | llm

# Define a logically challenging puzzle to test the prompt
logical_puzzle = """In a room, there are three people: Amy, Bob, and Charlie.
One of them always tells the truth, one always lies, and one alternates between truth and lies.
Amy says, 'Bob is a liar.'
Bob says, 'Charlie alternates between truth and lies.'
Charlie says, 'Amy and I are both liars.'
Determine the nature (truth-teller, liar, or alternator) of each person."""

# Invoke the logical reasoning chain
logical_reasoning_response = logical_reasoning_chain.invoke(logical_puzzle).content
print(logical_reasoning_response)

Let's analyze the logical puzzle step-by-step as requested.

### List the Facts:

1. **Characters Involved**: 
   - Amy
   - Bob
   - Charlie

2. **Statements Made**:
   - **Amy**: "Bob is a liar."
   - **Bob**: "Charlie alternates between truth and lies."
   - **Charlie**: "Amy and I are both liars."

### Identify Possible Roles or Conditions:

- **Roles**: There are three roles:
  - Truth-teller (always tells the truth)
  - Liar (always lies)
  - Alternator (alternates between truth and lies)

### Note the Constraints:

1. One of the three characters must be a truth-teller.
2. One must be a liar.
3. One must be an alternator.
4. The statements made must be evaluated based on their assigned roles.

### Generate Possible Scenarios:

Let's denote the roles as follows:
- T = Truth-teller
- L = Liar
- A = Alternator

We will evaluate all permutations of roles for the three characters (Amy, Bob, Charlie).

**Possible Role Combinations**:
1. Amy (T), Bob (L), Charlie (A)
2. Amy (T), Bob (A)

1. Define the prompt template: The `logical_reasoning_prompt` is a CoT prompt template that guides the model through the logical puzzle. This template breaks down the problem into a structured series of steps, ensuring the model does not just give an answer, but explains the reasoning behind its solution.
   - The **facts** are listed first, helping the model clarify what information is provided.
   - The **roles** are considered next (truth-teller, liar, alternator), which is a critical part of solving the puzzle.
   - The **constraints** are noted, which describe how each role behaves.
   - The model then **generates all possible scenarios**, assuming different roles for Amy, Bob, and Charlie.
   - Each **scenario is tested** by analyzing the statements made by each individual based on the assumed roles.
   - **Inconsistent scenarios** are eliminated (for example, if a contradiction arises when assuming someone is the truth-teller, that scenario is discarded).
   - Finally, the model **concludes the solution** by identifying the correct roles and explaining why those are the only valid solutions.

2. Create the chain: The `logical_reasoning_chain` is created by piping the `logical_reasoning_prompt` into the language model (`llm`). This chain lets the model apply the reasoning process described in the prompt template to the puzzle.
3. Define the puzzle: The `logical_puzzle` variable holds the puzzle text, which describes the scenario the model will analyze.
4. Invoke the chain: Calling `invoke` on `logical_reasoning_chain` with the puzzle makes the model follow the reasoning process outlined in the template. The `.content` attribute extracts the text of the model's response, which includes the logical steps and the final answer.
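
Since the printed response above is cut off before the scenarios are tested, a small brute-force check can serve as a reference for what the model should conclude. The sketch below mirrors the generate, test, and eliminate steps from the prompt. It is our own illustration, not part of the chain, and it rests on one stated assumption: because each person makes only one statement, the alternator's statement may be either true or false.

```python
from itertools import permutations

PEOPLE = ("Amy", "Bob", "Charlie")
ROLES = ("truth-teller", "liar", "alternator")


def statement_truth(roles):
    """Truth value of each person's statement under a given role assignment."""
    return {
        "Amy": roles["Bob"] == "liar",
        "Bob": roles["Charlie"] == "alternator",
        "Charlie": roles["Amy"] == "liar" and roles["Charlie"] == "liar",
    }


def is_consistent(roles):
    """Check every statement against the speaker's assumed role."""
    truth = statement_truth(roles)
    for person in PEOPLE:
        if roles[person] == "truth-teller" and not truth[person]:
            return False
        if roles[person] == "liar" and truth[person]:
            return False
        # Assumption: the alternator's single statement is unconstrained
    return True


# Generate all six role assignments and keep the consistent ones
solutions = []
for assignment in permutations(ROLES):
    roles = dict(zip(PEOPLE, assignment))
    if is_consistent(roles):
        solutions.append(roles)

print(solutions)
```

Under that assumption, exactly one assignment survives: Amy is the liar, Bob is the truth-teller, and Charlie is the alternator. A model response that follows the template should eliminate the other five scenarios and arrive at the same roles.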