# DSPy Tutorial: Understanding Prompt Engineering Enhancement

This notebook demonstrates how **DSPy** (Declarative Self-improving Python) enhances and optimizes prompts for LLMs. We'll explore:

1. **Zero-shot prompting** - Direct predictions without examples
2. **Few-shot prompting** - Learning from examples
3. **Chain of Thought** - Step-by-step reasoning
4. **Custom Signatures** - Structured input/output definitions

## What is DSPy?

DSPy is a framework that treats prompts as code, making them:
- **Composable**: Build complex pipelines from simple components
- **Optimizable**: Automatically improve prompts using validation
- **Maintainable**: Separate logic from prompt engineering
- **Type-safe**: Use Python types and signatures

---



**Important: install these dependencies before running the notebook.**

In [1]:
# Setup: This cell ensures all necessary libraries are installed for the user
import sys
!{sys.executable} -m pip install -U google-generativeai
!{sys.executable} -m pip install -U dspy-ai "litellm>=1.63.2" "openai<1.62.0" requests python-dotenv



### Configure DSPy with Language Model

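We'll use Google's Gemini model through `dspy.LM`. The cell below reads the API key from a `GOOGLE_API_KEY` entry in a `.env` file, so put your own key there rather than hard-coding it.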


In [1]:
import dspy
import os
from dotenv import load_dotenv

# Load keys from .env file
load_dotenv()
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
# Configure DSPy with a language model
lm = dspy.LM("gemini/gemini-2.5-flash", api_key=GOOGLE_API_KEY)
dspy.configure(lm=lm)
print("DSPy configured successfully!")

DSPy configured successfully!


## 2. Basic Language Model Usage

Before diving into DSPy's enhancements, let's see basic LM usage:


In [2]:
# Basic LM call with a simple string
result1 = lm("Say this is a test!", temperature=0.0)
print("Result 1:", result1)

# Basic LM call with messages format
result2 = lm(messages=[{"role": "user", "content": "hii this is a test!"}])
print("Result 2:", result2)

Result 1: ['This is a test!']
Result 2: ['Hi there! Test received. How can I help you today? 😊']


## 3. Zero-Shot Prompting with `dspy.Predict`

**Zero-shot** means making predictions without providing examples. DSPy's `Predict` module structures your inputs and outputs.

### Simple Zero-Shot Example: Question Answering


In [5]:
# Define a simple signature using a string format
# Format: "input_field -> output_field: type"

qa = dspy.Predict("question -> answer")
result = qa(question="What is the capital of France?")
print(f"Question: What is the capital of France?")
print(f"Answer: {result.answer}")


Question: What is the capital of France?
Answer: Paris
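
To make the string signature format more concrete, here is a minimal sketch of how a spec such as `"question -> answer: float"` can be split into named input and output fields. `parse_signature` is a hypothetical helper written for illustration; it is not DSPy's parser, and it assumes that fields without a type annotation default to `str` and that type names contain no commas.

```python
def parse_signature(spec: str) -> dict:
    """Split 'a, b -> c: float' into input and output (name, type) pairs."""

    def parse_fields(part: str) -> list[tuple[str, str]]:
        fields = []
        for item in part.split(","):
            # "answer: float" -> ("answer", "float"); "question" -> ("question", "str")
            name, _, type_name = item.partition(":")
            fields.append((name.strip(), type_name.strip() or "str"))
        return fields

    inputs, arrow, outputs = spec.partition("->")
    if not arrow:
        raise ValueError("a signature must contain '->'")
    return {"inputs": parse_fields(inputs), "outputs": parse_fields(outputs)}


parsed = parse_signature("question -> answer: float")
print(parsed)
```

The naive comma split would break on a type such as `dict[str, int]`, which is one reason the class-based signatures shown next are the safer choice for anything beyond simple fields.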




### Zero-Shot with Custom Signature Class

For more complex scenarios, define a Signature class with type hints and documentation.


In [5]:
from typing import Literal

# Define a custom signature for sentiment classification
class SentimentAnalysis(dspy.Signature):
    """Analyze the sentiment of a given text and provide a confidence score."""
    
    text: str = dspy.InputField(desc="The text to analyze")
    sentiment: Literal["positive", "negative", "neutral"] = dspy.OutputField(desc="The sentiment classification")
    confidence: float = dspy.OutputField(desc="Confidence score between 0 and 1")

# Create a Predict module from the signature
classify = dspy.Predict(SentimentAnalysis)

# Use it for zero-shot prediction
result = classify(text="I absolutely love this new product! It's amazing!")
print(f"Text: I absolutely love this new product! It's amazing!")
print(f"Sentiment: {result.sentiment}")
print(f"Confidence: {result.confidence}")


Text: I absolutely love this new product! It's amazing!
Sentiment: positive
Confidence: 0.98
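
The type hints in `SentimentAnalysis` describe what a valid prediction looks like. As a rough illustration of the checks those hints imply, the sketch below validates a sentiment label and a confidence score using only the standard library. `validate_prediction` is a hypothetical helper, not part of DSPy, and DSPy's own parsing may behave differently.

```python
from typing import Literal, get_args

Sentiment = Literal["positive", "negative", "neutral"]


def validate_prediction(sentiment: str, confidence) -> tuple[str, float]:
    """Check a (sentiment, confidence) pair against the declared constraints."""
    allowed = get_args(Sentiment)  # ("positive", "negative", "neutral")
    if sentiment not in allowed:
        raise ValueError(f"sentiment must be one of {allowed}, got {sentiment!r}")
    confidence = float(confidence)  # accepts numeric strings such as "0.98"
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {confidence}")
    return sentiment, confidence


print(validate_prediction("positive", "0.98"))
```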


### Zero-Shot: Text Summarization

Another example showing how DSPy structures prompt engineering:


In [7]:
class Summarize(dspy.Signature):
    """Summarize the given text in 2-3 sentences."""
    
    long_text: str = dspy.InputField()
    summary: str = dspy.OutputField()

summarizer = dspy.Predict(Summarize)

long_text = """
Machine learning is a subset of artificial intelligence that enables computers to learn 
and improve from experience without being explicitly programmed. It uses algorithms to 
analyze large amounts of data, identify patterns, and make predictions or decisions. 
Applications range from recommendation systems and image recognition to natural language 
processing and autonomous vehicles.
"""

result = summarizer(long_text=long_text)
print("Original text:")
print(long_text)
print("\nSummary:")
print(result.summary)


Original text:

Machine learning is a subset of artificial intelligence that enables computers to learn 
and improve from experience without being explicitly programmed. It uses algorithms to 
analyze large amounts of data, identify patterns, and make predictions or decisions. 
Applications range from recommendation systems and image recognition to natural language 
processing and autonomous vehicles.


Summary:
Machine learning, a branch of artificial intelligence, allows computers to learn and improve autonomously through experience rather than explicit programming. It operates by employing algorithms to analyze extensive data, pinpoint patterns, and consequently make informed predictions or decisions. This technology finds wide application in areas such as recommendation systems, image recognition, and autonomous vehicles.


## 4. Few-Shot Prompting with `dspy`

**Few-shot learning** provides examples to help the model understand the task better.

### Creating Training Examples

First, let's define some examples:


In [8]:
# Define a signature for our task
class Translate(dspy.Signature):
    """Translate English text to French."""
    
    english: str = dspy.InputField()
    french: str = dspy.OutputField()

# Create training examples
examples = [
    dspy.Example(english="Hello", french="Bonjour").with_inputs("english"),
    dspy.Example(english="Good morning", french="Bonjour").with_inputs("english"),
    dspy.Example(english="How are you?", french="Comment allez-vous?").with_inputs("english"),
    dspy.Example(english="Thank you", french="Merci").with_inputs("english"),
]

print(f"Created {len(examples)} examples for few-shot learning")


Created 4 examples for few-shot learning


### Few-Shot Predictor

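The cell below compares a zero-shot baseline with a predictor whose signature docstring embeds the same example pairs. Note that the `examples` list defined above is not passed to either predictor here: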


In [9]:
# In DSPy, few-shot learning can be demonstrated by including examples in the signature description
# or by using the teleprompter system for optimization. For this tutorial, we'll show a conceptual approach.

# Method 1: Zero-shot (no examples) - baseline
zero_shot_translator = dspy.Predict(Translate)

# Method 2: Enhanced signature with example pattern in description
class TranslateWithExamples(dspy.Signature):
    """Translate English text to French.
    
    Examples:
    - Hello -> Bonjour
    - Good morning -> Bonjour  
    - How are you? -> Comment allez-vous?
    - Thank you -> Merci
    """
    english: str = dspy.InputField()
    french: str = dspy.OutputField()

few_shot_translator = dspy.Predict(TranslateWithExamples)

# Test with a new phrase
test_phrase = "Good evening"
print("=== Zero-Shot Translation ===")
zero_result = zero_shot_translator(english=test_phrase)
print(f"English: {test_phrase}")
print(f"French: {zero_result.french}\n")

print("=== Few-Shot Translation (with examples in signature) ===")
few_result = few_shot_translator(english=test_phrase)
print(f"English: {test_phrase}")
print(f"French: {few_result.french}")


=== Zero-Shot Translation ===
English: Good evening
French: Bonsoir

=== Few-Shot Translation (with examples in signature) ===
English: Good evening
French: Bonsoir


### Note on Few-Shot Learning in DSPy

**Important**: In DSPy, few-shot learning can be done in several ways:
1. **Signature-based** (shown here): Include examples in the signature docstring
2. **Optimizers (formerly called "teleprompters")**: Use `dspy.LabeledFewShot` or other optimizers to attach demonstrations automatically
3. **Manual prompt engineering**: Include examples directly in prompts

For production use and automatic optimization, DSPy's teleprompter system (like `LabeledFewShot`, `BootstrapFewShot`) is recommended. The signature-based approach shown here demonstrates the concept clearly.
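
Whichever route you choose, the examples end up as extra text in the prompt. The following sketch shows one plausible way to render demonstrations ahead of a new input. `render_few_shot_prompt` and its layout are assumptions made for illustration; DSPy's adapters use their own prompt format.

```python
def render_few_shot_prompt(instructions, demos, query):
    """Build a plain-text prompt: instructions, worked examples, then the new input."""
    lines = [instructions, ""]
    for english, french in demos:
        lines.append(f"English: {english}")
        lines.append(f"French: {french}")
        lines.append("")
    # The final block leaves the output slot empty for the model to complete.
    lines.append(f"English: {query}")
    lines.append("French:")
    return "\n".join(lines)


demos = [
    ("Hello", "Bonjour"),
    ("How are you?", "Comment allez-vous?"),
    ("Thank you", "Merci"),
]
prompt = render_few_shot_prompt("Translate English text to French.", demos, "Good evening")
print(prompt)
```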


### Few-Shot: Classification with Examples

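Let's apply the same docstring-based few-shot approach to classification. The cell defines both a zero-shot and a few-shot classifier but runs only the few-shot one, so it illustrates the pattern rather than measuring an improvement: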


In [13]:
class TopicClassifier(dspy.Signature):
    """Classify the topic of the given article title."""
    
    title: str = dspy.InputField()
    topic: Literal["technology", "sports", "science", "politics", "entertainment"] = dspy.OutputField()

# Create classification examples for reference
# These examples demonstrate the pattern we want the model to follow
classification_examples = [
    dspy.Example(title="New AI Model Breaks Records", topic="technology").with_inputs("title"),
    dspy.Example(title="World Cup Finals This Weekend", topic="sports").with_inputs("title"),
    dspy.Example(title="Discovery of New Planet", topic="science").with_inputs("title"),
    dspy.Example(title="Election Results Announced", topic="politics").with_inputs("title"),
    dspy.Example(title="Oscar Nominations Revealed", topic="entertainment").with_inputs("title"),
]

# Enhanced signature with few-shot examples in the docstring
class TopicClassifierFewShot(dspy.Signature):
    """Classify the topic of the given article title.
    
    Examples:
    - "New AI Model Breaks Records" -> technology
    - "World Cup Finals This Weekend" -> sports
    - "Discovery of New Planet" -> science
    - "Election Results Announced" -> politics
    - "Oscar Nominations Revealed" -> entertainment
    """
    title: str = dspy.InputField()
    topic: Literal["technology", "sports", "science", "politics", "entertainment"] = dspy.OutputField()

# Create classifiers
zero_shot_classifier = dspy.Predict(TopicClassifier)
few_shot_classifier = dspy.Predict(TopicClassifierFewShot)

# Test classification on new titles
test_titles = [
    "Machine Learning Advances in Healthcare",
    "Championship Game Results",
    "Breakthrough in Quantum Computing"
]

print("Few-Shot Classification Results:")
print("="*60)
for title in test_titles:
    few_result = few_shot_classifier(title=title)
    print(f"Title: {title}")
    print(f"Topic: {few_result.topic}\n")


Few-Shot Classification Results:
Title: Machine Learning Advances in Healthcare
Topic: technology

Title: Championship Game Results
Topic: sports

Title: Breakthrough in Quantum Computing
Topic: technology



### Comparing Zero-Shot vs Few-Shot

Let's compare the same task with and without examples:


In [14]:
class FormatDate(dspy.Signature):
    """Format the date in a readable format."""
    
    date_input: str = dspy.InputField()
    formatted_date: str = dspy.OutputField()

# Enhanced signature with few-shot examples
class FormatDateFewShot(dspy.Signature):
    """Format the date in a readable format.
    
    Examples:
    - "2024-01-15" -> "January 15, 2024"
    - "2023-12-25" -> "December 25, 2023"
    - "2024-07-04" -> "July 4, 2024"
    """
    date_input: str = dspy.InputField()
    formatted_date: str = dspy.OutputField()

# Zero-shot (no examples)
zero_shot_formatter = dspy.Predict(FormatDate)

# Few-shot (with examples in signature)
few_shot_formatter = dspy.Predict(FormatDateFewShot)

test_date = "2024-03-20"

print("=== Zero-Shot Formatting ===")
zero_result = zero_shot_formatter(date_input=test_date)
print(f"Input: {test_date}")
print(f"Output: {zero_result.formatted_date}\n")

print("=== Few-Shot Formatting ===")
few_result = few_shot_formatter(date_input=test_date)
print(f"Input: {test_date}")
print(f"Output: {few_result.formatted_date}")


=== Zero-Shot Formatting ===
Input: 2024-03-20
Output: March 20, 2024

=== Few-Shot Formatting ===
Input: 2024-03-20
Output: March 20, 2024
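
Both predictors return the same string here, so this input does not separate them. For a task with a deterministic answer like this one, a plain-Python reference can serve as ground truth when comparing outputs. The helper below, `format_date_reference`, is a hypothetical name; it assumes ISO `YYYY-MM-DD` input and the "Month D, YYYY" style used in the signature examples.

```python
from datetime import date

# Month names are spelled out so the result does not depend on the system locale.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_date_reference(date_input: str) -> str:
    """Convert '2024-03-20' to 'March 20, 2024' without calling a model."""
    parsed = date.fromisoformat(date_input)
    return f"{MONTHS[parsed.month - 1]} {parsed.day}, {parsed.year}"


print(format_date_reference("2024-03-20"))
```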


## 5. Chain of Thought (CoT) Reasoning

**Chain of Thought** breaks down complex problems into reasoning steps. DSPy's `ChainOfThought` module automatically structures this.

### Basic Chain of Thought Example


In [10]:
# ChainOfThought automatically adds reasoning steps
math = dspy.ChainOfThought("question -> answer: float")

result = math(question="Two dice are tossed. What is the probability that the sum equals two?")
print(f"Question: Two dice are tossed. What is the probability that the sum equals two?")
print(f"\nReasoning: {result.reasoning}")
print(f"Answer: {result.answer}")


Question: Two dice are tossed. What is the probability that the sum equals two?

Reasoning: When two dice are tossed, there are 6 possible outcomes for the first die and 6 possible outcomes for the second die. The total number of possible outcomes is 6 * 6 = 36.

We want to find the number of outcomes where the sum of the two dice equals two.
The minimum value a single die can show is 1.
Therefore, the only way to get a sum of two is if both dice show a 1.
This corresponds to the outcome (1, 1).

There is only 1 favorable outcome.

The probability is calculated as:
P(sum = 2) = (Number of favorable outcomes) / (Total number of possible outcomes)
P(sum = 2) = 1 / 36

To express this as a float: 1 / 36 = 0.027777777777777776
Rounding to a reasonable number of decimal places or using the exact fraction if not explicitly asked for a specific precision. As it asks for a float, I'll provide the exact division.
Answer: 0.027777777777777776
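
The model's reasoning can be checked independently by enumerating every outcome of the two dice. This short sketch uses only the standard library and is separate from DSPy.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes for two six-sided dice: (1, 1), (1, 2), ..., (6, 6).
outcomes = list(product(range(1, 7), repeat=2))

# Keep only the outcomes whose faces add up to two.
favorable = [roll for roll in outcomes if sum(roll) == 2]

probability = Fraction(len(favorable), len(outcomes))
print(probability, float(probability))
```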


### Chain of Thought with Custom Signature

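For more control, define a custom signature. `dspy.ChainOfThought` already adds a `reasoning` output field on its own, so declaring one explicitly is mainly useful for customizing its description: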


In [16]:
class SolveProblem(dspy.Signature):
    """Solve the given math problem step by step."""
    
    problem: str = dspy.InputField()
    reasoning: str = dspy.OutputField(desc="Show your step-by-step reasoning")
    answer: float = dspy.OutputField()

# Use ChainOfThought with custom signature
math_solver = dspy.ChainOfThought(SolveProblem)

result = math_solver(problem="If a train travels 120 km in 2 hours, what is its average speed?")
print(f"Problem: If a train travels 120 km in 2 hours, what is its average speed?")
print(f"\nReasoning:\n{result.reasoning}")
print(f"\nAnswer: {result.answer} km/h")


Problem: If a train travels 120 km in 2 hours, what is its average speed?

Reasoning:
The problem asks for the average speed of a train.
The formula for average speed is:
Average Speed = Total Distance / Total Time

Given:
Total Distance = 120 km
Total Time = 2 hours

Substitute the given values into the formula:
Average Speed = 120 km / 2 hours
Average Speed = 60 km/h

Therefore, the average speed of the train is 60 km/h.

Answer: 60.0 km/h


### Chain of Thought: Logic Problems

CoT is especially useful for logical reasoning:


In [None]:
class LogicPuzzle(dspy.Signature):
    """Solve the logic puzzle step by step."""
    
    puzzle: str = dspy.InputField()
    reasoning: str = dspy.OutputField()
    answer: str = dspy.OutputField()

logic_solver = dspy.ChainOfThought(LogicPuzzle)

puzzle = """
Three friends are sitting in a row: Alice, Bob, and Charlie.
Alice is sitting to the left of Bob.
Charlie is sitting to the left of Alice.
Who is sitting in the middle?
"""

result = logic_solver(puzzle=puzzle)
print("Puzzle:")
print(puzzle)
print(f"\nReasoning:\n{result.reasoning}")
print(f"\nAnswer: {result.answer}")


Puzzle:

Three friends are sitting in a row: Alice, Bob, and Charlie.
Alice is sitting to the left of Bob.
Charlie is sitting to the left of Alice.
Who is sitting in the middle?


Reasoning:
Let's denote the positions from left to right.
1. "Alice is sitting to the left of Bob."
   This implies the partial order: A, B

2. "Charlie is sitting to the left of Alice."
   This implies the partial order: C, A

Now, we combine these two pieces of information:
From (2), we have C, A.
From (1), we know that B is to the right of A.
So, extending C, A to include B, we get C, A, B.

The full order from left to right is Charlie, Alice, Bob.
In this sequence, Alice is in the middle.

Answer: Alice
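
A puzzle this small can also be verified by brute force, which is a handy way to confirm a chain-of-thought answer. The sketch below tries every seating order and keeps those that satisfy both clues. It assumes "to the left of" means anywhere to the left, not necessarily adjacent.

```python
from itertools import permutations

friends = ("Alice", "Bob", "Charlie")


def satisfies_clues(order):
    """Return True if the left-to-right order matches both clues in the puzzle."""
    position = {name: index for index, name in enumerate(order)}
    alice_left_of_bob = position["Alice"] < position["Bob"]
    charlie_left_of_alice = position["Charlie"] < position["Alice"]
    return alice_left_of_bob and charlie_left_of_alice


valid_orders = [order for order in permutations(friends) if satisfies_clues(order)]
middle_seats = {order[1] for order in valid_orders}
print(valid_orders, middle_seats)
```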


## 6. Combining Techniques: Few-Shot + Chain of Thought

You can combine few-shot examples with chain of thought reasoning:


In [12]:
class WordProblem(dspy.Signature):
    """Solve the word problem showing your work.
    
    Examples with reasoning:
    - Problem: "Sarah has 5 apples. She gives 2 to her friend. How many does she have left?"
      Reasoning: "Sarah starts with 5 apples. After giving 2 away: 5 - 2 = 3"
      Answer: 3.0
    
    - Problem: "A pizza has 8 slices. If 3 people each eat 2 slices, how many slices are left?"
      Reasoning: "Total slices eaten: 3 people × 2 slices = 6 slices. Remaining: 8 - 6 = 2"
      Answer: 2.0
    """
    problem: str = dspy.InputField()
    reasoning: str = dspy.OutputField()
    answer: float = dspy.OutputField()

# Combine ChainOfThought with few-shot examples (in signature)
# This gives us both example-based learning AND step-by-step reasoning
combined_solver = dspy.ChainOfThought(WordProblem)

test_problem = "Tom has 12 stickers. He buys 5 more, then uses 3. How many stickers does he have now?"
result = combined_solver(problem=test_problem)

print("="*60)
print(f"Problem: {test_problem}")
print("\n" + "="*60)
print(f"Reasoning:\n{result.reasoning}")
print("\n" + "="*60)
print(f"Answer: {result.answer}")


Problem: Tom has 12 stickers. He buys 5 more, then uses 3. How many stickers does he have now?

Reasoning:
Tom starts with 12 stickers. He buys 5 more, so he has 12 + 5 = 17 stickers. Then he uses 3 stickers, so he has 17 - 3 = 14 stickers left.

Answer: 14.0


## 7. How DSPy Enhances Prompts

DSPy automatically enhances prompts in several ways:

1. **Structured Formatting**: Converts signatures into well-formatted prompts
2. **Type Safety**: Parses outputs into the types declared in the signature
3. **Documentation**: Uses docstrings and field descriptions in prompts
4. **Example Selection**: Optimizers such as `LabeledFewShot` and `BootstrapFewShot` can pick demonstrations for few-shot prompts
5. **Reasoning Chains**: Automatically structures reasoning steps



## 8. Advanced Example: Multi-Step Reasoning

Let's create a more complex example combining multiple techniques:


In [11]:
class AnalyzeText(dspy.Signature):
    """Analyze the given text for sentiment, extract key topics, and summarize."""
    
    text: str = dspy.InputField()
    reasoning: str = dspy.OutputField(desc="Think step by step about sentiment and topics")
    sentiment: Literal["positive", "negative", "neutral"] = dspy.OutputField()
    topics: str = dspy.OutputField(desc="Comma-separated list of main topics")
    summary: str = dspy.OutputField(desc="Brief summary in 1-2 sentences")

# Use ChainOfThought for multi-aspect analysis
analyzer = dspy.ChainOfThought(AnalyzeText)

sample_text = """
Artificial intelligence is transforming healthcare in remarkable ways. 
Machine learning algorithms can now detect diseases earlier than ever before, 
leading to better patient outcomes. However, there are concerns about data 
privacy and the need for human oversight in critical medical decisions.
"""

result = analyzer(text=sample_text)

print("Input Text:")
print(sample_text)
print("\n" + "="*60)
print(f"Reasoning:\n{result.reasoning}\n")
print("="*60)
print(f"Sentiment: {result.sentiment}")
print(f"Topics: {result.topics}")
print(f"Summary: {result.summary}")


Input Text:

Artificial intelligence is transforming healthcare in remarkable ways. 
Machine learning algorithms can now detect diseases earlier than ever before, 
leading to better patient outcomes. However, there are concerns about data 
privacy and the need for human oversight in critical medical decisions.


Reasoning:
The text highlights the positive impact of AI and machine learning in healthcare, specifically mentioning earlier disease detection and better patient outcomes. However, it also raises concerns about data privacy and the necessity of human oversight. The overall tone is cautiously optimistic, focusing on the benefits while acknowledging critical challenges. Therefore, the sentiment is primarily positive due to the stated benefits, with important caveats. Topics clearly include artificial intelligence, healthcare, disease detection, data privacy, and human oversight.

Sentiment: positive
Topics: Artificial intelligence, Healthcare, Machine learning, Disease detection,
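
Because `topics` is declared as a comma-separated string, downstream code usually needs to split it into a list. A minimal sketch, using a hypothetical `split_topics` helper and the portion of the output visible above:

```python
def split_topics(topics: str) -> list[str]:
    """Split a comma-separated topics string, dropping blanks and extra spaces."""
    return [part.strip() for part in topics.split(",") if part.strip()]


topics_text = "Artificial intelligence, Healthcare, Machine learning, Disease detection,"
topic_list = split_topics(topics_text)
print(topic_list)
```

An alternative worth considering is to declare the output field as `list[str]` in the signature, so that the parsing happens inside DSPy rather than in your own code.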

## 9. Using DSPy Evaluators to Measure Performance

You can use DSPy's `dspy.Evaluate` utility together with a metric function to measure the accuracy or quality of your modules. Below is an example that scores a few-shot classifier with a simple exact-match accuracy metric.

In [None]:
# Example: Evaluating a Few-Shot Classifier with DSPy Evaluators
import dspy
from typing import Literal

# Define a signature for topic classification
class TopicClassifier(dspy.Signature):
    """Classify the topic of the given article title."""
    title: str = dspy.InputField()
    topic: Literal["technology", "sports", "science", "politics", "entertainment"] = dspy.OutputField()

# Few-shot examples
examples = [
    dspy.Example(title="New AI Model Breaks Records", topic="technology").with_inputs("title"),
    dspy.Example(title="World Cup Finals This Weekend", topic="sports").with_inputs("title"),
    dspy.Example(title="Discovery of New Planet", topic="science").with_inputs("title"),
    dspy.Example(title="Election Results Announced", topic="politics").with_inputs("title"),
    dspy.Example(title="Oscar Nominations Revealed", topic="entertainment").with_inputs("title"),
]

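# Create a few-shot classifier using the examples.
# dspy.Predict does not take demonstrations through an `examples=` argument
# (extra keyword arguments are treated as LM config), so attach them with
# the LabeledFewShot optimizer instead.
classifier = dspy.LabeledFewShot(k=len(examples)).compile(
    dspy.Predict(TopicClassifier), trainset=examples
)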

# Evaluation set
eval_set = [
    dspy.Example(title="Breakthrough in Quantum Computing", topic="technology").with_inputs("title"),
    dspy.Example(title="World Cup Qualifiers Announced", topic="sports").with_inputs("title"),
    dspy.Example(title="Mars Rover Sends New Photos", topic="science").with_inputs("title"),
    dspy.Example(title="Senate Passes New Bill", topic="politics").with_inputs("title"),
    dspy.Example(title="Film Festival Winners Announced", topic="entertainment").with_inputs("title"),
]

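# Define an exact-match metric; dspy.Evaluate averages it over the devset
def accuracy_metric(example, pred, trace=None):
    return example.topic == pred.topic  # Adjust field names as needed

evaluate = dspy.Evaluate(devset=eval_set, metric=accuracy_metric)
results = evaluate(classifier)
# Depending on the DSPy version, the result is either a plain number or an
# object with a `.score` attribute; handle both cases.
score = getattr(results, "score", results)
print(f"Accuracy: {score}")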

2025/12/22 15:56:55 INFO dspy.evaluate.evaluate: Average Metric: 4 / 5 (80.0%)


Accuracy: 80.0
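
The reported figure is the mean of the metric over the evaluation set, expressed as a percentage. The sketch below reproduces that arithmetic without calling a model. The predicted labels are made up for illustration (with one deliberate mismatch) and are not the model's actual outputs.

```python
def exact_match_accuracy(gold_labels, predicted_labels):
    """Return the percentage of predictions that exactly match the gold labels."""
    if len(gold_labels) != len(predicted_labels):
        raise ValueError("gold and predicted label lists must have the same length")
    if not gold_labels:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    matches = sum(gold == pred for gold, pred in zip(gold_labels, predicted_labels))
    return 100.0 * matches / len(gold_labels)


gold = ["technology", "sports", "science", "politics", "entertainment"]
# Hypothetical predictions: the first label is intentionally wrong.
hypothetical_predictions = ["science", "sports", "science", "politics", "entertainment"]

accuracy = exact_match_accuracy(gold, hypothetical_predictions)
print(f"Accuracy: {accuracy}")
```

With only five evaluation examples, each mistake moves the score by 20 points, so a larger evaluation set is needed before drawing conclusions from the number.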


## Summary: Key Takeaways

### What We've Learned

1. **Zero-Shot (`dspy.Predict`)**: Direct predictions using signatures
   - Use for tasks where the model has inherent understanding
   - Simple and fast
   - Best for straightforward tasks

2. **Few-Shot (Examples in Signatures)**: Learning from examples
   - Provides context through examples in signature docstrings
   - Improves performance on specific patterns
   - Best when you have good examples
   - Can also use DSPy's teleprompter for automatic few-shot optimization

3. **Chain of Thought (`dspy.ChainOfThought`)**: Step-by-step reasoning
   - Breaks down complex problems
   - Shows reasoning process
   - Best for math, logic, and complex reasoning tasks

4. **Custom Signatures**: Structured input/output definitions
   - Type safety and validation
   - Clear documentation
   - Reusable components

5. **DSPy Evaluators (`dspy.Evaluate`)**: Measuring module quality
   - Scores a module against a labeled evaluation set using a metric function
   - Makes it possible to compare variants, such as zero-shot vs few-shot

### How DSPy Enhances Prompts

- ✅ **Automatic formatting** from Python code
- ✅ **Type validation** ensures correct outputs
- ✅ **Smart example selection** for few-shot
- ✅ **Reasoning chain structure** for CoT
- ✅ **Composable modules** for complex pipelines

### Next Steps

- Explore **optimization** with the optimizers in `dspy.teleprompt`, such as `BootstrapFewShot`
- Learn about **retrieval** and **RAG** integration
- Build **pipelines** combining multiple modules
