
# dspy.ChainOfThought &mdash; Chain-of-Thought Prompting Strategy

This notebook focuses on `dspy.ChainOfThought`, a module that teaches the LM to think step by step before committing to the signature's response.

According to `openai/gpt-5-nano-2025-08-07`:

> **Chain-of-thought prompting** asks a LM model to generate its step-by-step reasoning before providing the final answer, which can improve accuracy on complex tasks.

In [0]:
%pip install -qU "dspy>=3.0.4" "mlflow>=3.8.1"
%restart_python

In [0]:
import dspy
print(f"DSPy version: {dspy.__version__}")

In [0]:
import mlflow
print(f"MLflow version: {mlflow.__version__}")

In [0]:
import mlflow

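# Enable MLflow autologging for DSPy.
mlflow.dspy.autolog(
    log_traces=True,  # Log traces of DSPy module calls during inference.
    log_traces_from_compile=True,  # Also log traces produced while an optimizer compiles a program.
    log_traces_from_eval=True,  # Also log traces produced by DSPy's built-in evaluator.
    log_compiles=True,  # Log information about optimizer compile runs.
    log_evals=True,  # Log information about evaluation runs.
    disable=False,  # Set to True to turn the DSPy autologging integration off.
    silent=False,  # Set to True to suppress MLflow event logs and warnings.
)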

## DSPy Modules

From [DSPy's official docs](https://dspy.ai/learn/programming/modules/) with some changes (mostly inspired by the [PyTorch docs](https://docs.pytorch.org/docs/stable/notes/modules.html)):

1. A **DSPy module** is a building block for LM programs in DSPy.
1. DSPy's built-in modules abstract prompting techniques (e.g., [Chain-of-Thought](https://www.promptingguide.ai/techniques/cot), [ReAct](https://www.promptingguide.ai/techniques/react)).
1. A DSPy module inherits from the base `dspy.Module` class.
1. Modules can contain other modules, which lets you compose them into bigger modules (_programs_).
1. DSPy modules are inspired directly by [PyTorch modules](https://docs.pytorch.org/docs/stable/notes/modules.html), but applied to LM programs.
1. Like PyTorch modules, DSPy modules have **learnable parameters** for DSPy Optimizers to update.
    1. For any given module, its parameters consist of its direct parameters as well as the parameters of all submodules. This means that calls to `parameters()` and `named_parameters()` will recursively include child parameters, allowing for convenient optimization of all parameters within the parent module.
    1. Parameters can be optimized with one of the [DSPy Optimizers](https://dspy.ai/learn/optimization/optimizers/).
1. DSPy modules define a `forward()` method that performs an "LM computation".
    * Following PyTorch's naming convention, _"This name (`forward()`) is in reference to the concept of "forward pass", which applies to each module. The "forward pass" is responsible for applying the computation represented by the module to the given input(s)."_
1. Modules are callable (`__call__`), and calling them invokes their `forward()` functions.
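
The composition and parameter rules above can be illustrated with a toy sketch in plain Python. It does not use DSPy at all: `ToyModule`, `ToyParameter`, `Summarize`, `Translate`, and `Pipeline` are hypothetical names invented for this illustration, and the real `dspy.Module` may differ in its details.

```python
class ToyParameter:
    """Hypothetical learnable parameter: here, just an instruction string."""

    def __init__(self, instructions):
        self.instructions = instructions


class ToyModule:
    """Toy stand-in for a module base class (this is NOT dspy.Module)."""

    def forward(self, **kwargs):
        raise NotImplementedError

    def __call__(self, **kwargs):
        # Calling a module invokes its forward() method.
        return self.forward(**kwargs)

    def named_parameters(self, prefix=""):
        # Yield (name, parameter) pairs, recursing into submodules.
        for name, value in vars(self).items():
            path = f"{prefix}.{name}" if prefix else name
            if isinstance(value, ToyParameter):
                yield path, value
            elif isinstance(value, ToyModule):
                yield from value.named_parameters(prefix=path)

    def parameters(self):
        return [parameter for _, parameter in self.named_parameters()]


class Summarize(ToyModule):
    def __init__(self):
        self.prompt = ToyParameter("Summarize the text.")

    def forward(self, text):
        return f"[{self.prompt.instructions}] {text}"


class Translate(ToyModule):
    def __init__(self):
        self.prompt = ToyParameter("Translate to French.")

    def forward(self, text):
        return f"[{self.prompt.instructions}] {text}"


class Pipeline(ToyModule):
    """A bigger module (a "program") composed of two submodules."""

    def __init__(self):
        self.summarize = Summarize()
        self.translate = Translate()

    def forward(self, text):
        return self.translate(text=self.summarize(text=text))


pipeline = Pipeline()
names = [name for name, _ in pipeline.named_parameters()]
print(names)
print(pipeline(text="hi"))
```

The parent module reports the parameters of both children, which is what lets an optimizer update a whole program through its top-level module.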

## DSPy Built-In Modules

Built-in modules mainly change the internal behavior with which your signature is implemented.

Many DSPy modules (except `dspy.Predict`) return auxiliary information by expanding your signature under the hood.

| DSPy Module | Description |
|-|-|
| `dspy.BestOfN` | Runs the given module up to `N` times with different rollout IDs at `temperature=1.0` and returns the best prediction out of `N` attempts or the first prediction that passes the `threshold`.<br>Similar to `dspy.Refine`. |
| `dspy.ChainOfThought` | Teaches the LM to reason step-by-step to predict the output (committing to the signature's response). |
| `dspy.CodeAct` | Uses the Code Interpreter (`ProgramOfThought`) and predefined tools (`ReAct`) to solve the problem. |
| `dspy.MultiChainComparison` | Compares multiple outputs from `dspy.ChainOfThought` to produce the final prediction. |
| `dspy.Predict` | The most fundamental DSPy module that maps inputs to outputs using a language model.<br>Does not modify the signature.<br>Handles the key forms of learning (i.e., storing the instructions and demonstrations and updates to the LM).<br>All other DSPy modules are built using `dspy.Predict`.<br>Positional arguments not allowed. |
| `dspy.ProgramOfThought` | Teaches the LM to output Python code that will be executed to solve a problem. |
| `dspy.ReAct` | Implements the [Reasoning and Acting (ReAct)](https://arxiv.org/abs/2210.03629) pattern for building tool-using agents.<br>An agent that can use tools to implement the given signature. |
| `dspy.Refine` | Refines the given module by running it up to `N` times with different rollout IDs at `temperature=1.0` and returns the best prediction.<br>Similar to `dspy.BestOfN`. |
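
To make the "best of `N`" idea from the table concrete, here is a toy sketch in plain Python. It is an illustration under stated assumptions, not DSPy's implementation: `best_of_n`, `guess_capital`, and `one_word_reward` are hypothetical names, and the rollout-ID and temperature handling of the real modules is left out.

```python
def best_of_n(module, reward_fn, n, threshold, **inputs):
    """Call `module` up to `n` times.

    Returns the first prediction whose reward reaches `threshold`,
    otherwise the highest-scoring prediction seen.
    """
    best, best_score = None, float("-inf")
    for attempt in range(n):
        prediction = module(attempt=attempt, **inputs)
        score = reward_fn(inputs, prediction)
        if score >= threshold:
            return prediction  # good enough: stop early
        if score > best_score:
            best, best_score = prediction, score
    return best


# Hypothetical "module": returns a different canned answer per attempt.
def guess_capital(attempt, question):
    answers = ["It might be Paris", "Paris, I think", "Paris"]
    return answers[attempt % len(answers)]


# Hypothetical reward: 1.0 for a one-word answer, 0.0 otherwise.
def one_word_reward(inputs, prediction):
    return 1.0 if len(prediction.split()) == 1 else 0.0


result = best_of_n(
    guess_capital,
    one_word_reward,
    n=3,
    threshold=1.0,
    question="What is the capital of France?",
)
print(result)
```

With `n=3` the third attempt passes the threshold and is returned; with a smaller `n` the loop falls back to the best-scoring attempt it saw.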

## dspy.ChainOfThought

`dspy.ChainOfThought` reasons step by step in order to predict the output of a task.

It adds a `reasoning` output field that holds the LM's reasoning, which is generated before the signature's own output field(s).
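
As a rough mental model, the effect on a string signature can be sketched with plain string manipulation. The helper `add_reasoning` below is hypothetical and only mimics the visible result; it is not how DSPy modifies signatures internally, and the `str` type of the field is an assumption.

```python
def add_reasoning(signature: str) -> str:
    """Return a string signature with a `reasoning` output placed first."""
    inputs, outputs = (part.strip() for part in signature.split("->"))
    return f"{inputs} -> reasoning: str, {outputs}"


extended = add_reasoning("sentence -> sentiment: bool")
print(extended)
```

The inputs stay untouched; only the output side gains a leading field, so the LM writes its reasoning before the answer.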

In [0]:
help(dspy.ChainOfThought)

In [0]:
import dspy
lm = dspy.LM(model="databricks/databricks-claude-sonnet-4-5")
dspy.configure(lm=lm)

In [0]:
sentence = "it's a charming and often affecting journey."  # example from the SST-2 dataset.

classify = dspy.ChainOfThought(
    signature="sentence -> sentiment: bool",
)
response = classify(sentence=sentence)

print(response.sentiment)


## Module Parameters

`parameters()` returns the full set of parameters registered by a module, including those of its submodules.

`named_parameters()` returns the same parameters paired with their names.

The `dspy.ChainOfThought` module injects a `reasoning` field before the output field(s) of your signature.

In [0]:
print(response.reasoning)

In [0]:
import rich
for parameter in classify.parameters():
  rich.print(parameter)

In [0]:
import rich
# named_parameters() yields (name, parameter) pairs.
for name, parameter in classify.named_parameters():
    rich.print(name, parameter)

In [0]:
import rich
rich.print(classify.dump_state())

## Examples of DSPy Modules on Simple Tasks

From the [DSPy official docs](https://dspy.ai/learn/programming/modules/#what-other-dspy-modules-are-there-how-can-i-use-them).

In [0]:
math = dspy.ChainOfThought(
    signature="question -> answer: float"
)

question = "Two dice are tossed. What is the probability that the sum equals two?"
prediction = math(question=question)

print(f"answer: {prediction.answer}")
print(f"reasoning:\n{prediction.reasoning}")
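
To sanity-check the LM's answer to the dice question, the probability can be computed directly by enumerating all outcomes. This is a plain-Python check that is independent of DSPy; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of tossing two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Only (1, 1) sums to two.
favorable = [roll for roll in outcomes if sum(roll) == 2]

probability = Fraction(len(favorable), len(outcomes))
print(probability, float(probability))
```

The exact value is 1/36, or about 0.0278, which is the number `prediction.answer` should be close to.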

In [0]:
# FIXME This does not work

# def search(query: str) -> list[str]:
#     """Retrieves abstracts from Wikipedia."""
#     results = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")(query, k=3)
#     return [x['text'] for x in results]

# rag = dspy.ChainOfThought("context, question -> response")

# question = "What's the name of the castle that David Gregory inherited?"
# rag(context=search(question), question=question)

In [0]:
from typing import Literal

class Classify(dspy.Signature):
    """Classify sentiment of a given sentence."""

    sentence: str = dspy.InputField()
    sentiment: Literal['positive', 'negative', 'neutral'] = dspy.OutputField()
    confidence: float = dspy.OutputField()

classify = dspy.Predict(Classify)
classify(sentence="This book was super fun to read, though not the last chapter.")

In [0]:
text = "Apple Inc. announced its latest iPhone 14 today. The CEO, Tim Cook, highlighted its new features in a press release."

module = dspy.Predict(
    signature="text -> title, headings: list[str], entities_and_metadata: list[dict[str, str]]",
)
module(text=text)

In [0]:
import dspy

# dspy.PythonInterpreter requires deno to be installed, and deno does not
# have the proper permissions to execute here, so this tool is a stub.
def evaluate_math(expression: str) -> float:
    """Evaluates a math expression (stub: always returns 0)."""
    # return dspy.PythonInterpreter({}).execute(expression)
    return 0.0

# FIXME
def search_wikipedia(query: str) -> list[str]:
    """Retrieves abstracts from Wikipedia."""
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]

react = dspy.ReAct(
    signature="question -> answer: float",
    tools=[
        evaluate_math,
    ]
)

question = "What is 9362158 divided by the year of birth of David Gregory of Kinnairdy castle?"
react(question=question)
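
Because the `evaluate_math` stub always returns 0, the agent cannot actually compute anything. One possible stand-in that needs no deno is a small arithmetic evaluator built on Python's `ast` module. This is a sketch, assuming the tool only needs `+`, `-`, `*`, `/`, and unary minus; `evaluate_arithmetic` is a hypothetical name, not part of DSPy.

```python
import ast
import operator

# Supported operators, keyed by AST node type.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg}


def evaluate_arithmetic(expression: str) -> float:
    """Evaluates a basic arithmetic expression without calling eval()."""

    def visit(node):
        # Plain numbers (bool is excluded even though it subclasses int).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](visit(node.operand))
        # Names, calls, attribute access, etc. are rejected.
        raise ValueError(f"Unsupported expression: {expression!r}")

    return float(visit(ast.parse(expression, mode="eval").body))


print(evaluate_arithmetic("(2 + 3) * 4"))
print(evaluate_arithmetic("-7 / 2"))
```

A function like this could be passed in `tools=[...]` in place of the stub; anything other than plain arithmetic raises `ValueError` instead of being executed.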

In [0]:
# https://dspy.ai/learn/programming/signatures/#example-a-sentiment-classification

summarize = dspy.ChainOfThought('document -> summary')

# Example from the XSum dataset.
document = """
The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season.
Lee had two loan spells in League One last term, with Blackpool and then Colchester United.
He scored twice for the U's but was unable to save them from relegation.
The length of Lee's contract with the promoted Tykes has not been revealed.
Find all the latest football transfers on our dedicated page.
"""
response = summarize(document=document)

print(response.summary)

In [0]:
# https://dspy.ai/learn/programming/signatures/#example-d-a-metric-that-evaluates-faithfulness-to-citations
class CheckCitationFaithfulness(dspy.Signature):
    """Verify that the text is based on the provided context."""

    context: str = dspy.InputField(desc="facts here are assumed to be true")
    text: str = dspy.InputField()
    faithfulness: bool = dspy.OutputField()
    evidence: dict[str, list[str]] = dspy.OutputField(desc="Supporting evidence for claims")

context = "The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season. Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. The length of Lee's contract with the promoted Tykes has not been revealed. Find all the latest football transfers on our dedicated page."

text = "Lee scored 3 goals for Colchester United."

faithfulness = dspy.ChainOfThought(CheckCitationFaithfulness)
faithfulness(context=context, text=text)

## Resources

1. The original paper [Chain-of-Thought Prompting Elicits Reasoning in Large Language Models](https://arxiv.org/abs/2201.11903)
1. [Prompt Engineering Guide](https://www.promptingguide.ai/techniques/cot)
1. [DSPy Tutorial: Math Reasoning](https://dspy.ai/tutorials/math/)