## <b><font color='darkblue'>Preface</font></b>
([source](https://www.analyticsvidhya.com/blog/2025/01/prompting-with-dspy/)) <font size='3ptx'><b>[DSPy](https://dspy.ai/) (Declarative Self-improving Python) changes how developers work with Large Language Models (LLMs)</b>. By abstracting away the details of prompt engineering, it lets users develop, test, and improve their applications more effectively and reliably.</font>

This tutorial walks through DSPy's core concepts and several practical examples to help you get started building LLM-powered applications.

In [109]:
import os
from dotenv import load_dotenv
import dspy

# Expand the '~' to the user's home directory path
home_directory = os.path.expanduser('~')

# Construct the full path to your .env file
dotenv_path = os.path.join(home_directory, '.env')

# Load the environment variables from the specified path
load_dotenv(dotenv_path)
API_KEY = os.environ.get('GOOGLE_API_KEY')

### <b><font color='darkgreen'>Learning Objectives</font></b>
* Understand DSPy’s declarative approach for simplifying language model application development.
* Learn how DSPy automates prompt engineering and optimizes performance for complex tasks.
* Explore practical examples of DSPy in action, such as math problem-solving and [sentiment analysis](https://www.analyticsvidhya.com/blog/2021/06/nlp-sentiment-analysis/).
* Discover the advantages of DSPy, including modularity, scalability, and continuous self-improvement.
* Gain insights into integrating DSPy into existing systems and optimizing LLM-powered workflows.

### <b><font color='darkgreen'>Background</font></b>
In the traditional approach, we manually construct the prompt string, manage the API key configuration directly in the logic, and parse the unstructured text response to perform sentiment analysis. For example:

In [112]:
#!pip install google-generativeai
!pip freeze | grep 'google-generativeai'

google-generativeai==0.8.5


In [115]:
import google.generativeai as genai

# 1. Setup (Vendor specific)
genai.configure(api_key=API_KEY)
model = genai.GenerativeModel("gemini-2.5-flash")

In [124]:
def analyze_sentiment_manual(text):
    # 2. Manual Prompt Engineering (String manipulation)
    prompt = f"""
    You are a sentiment classifier. 
    Analyze the following text and output ONLY one of these labels: Positive, Negative, Neutral.
    
    Text: "{text}"
    Label:
    """

    # 3. Call the API
    response = model.generate_content(prompt)

    # 4. Manual Parsing (Cleaning whitespace, handling potential markdown)
    return response.text.strip()

# Run it
sentence = "I got a perfect score and feel happy!"
print(f"Manual Result: {analyze_sentiment_manual(sentence)}")

Manual Result: Positive
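
The `response.text.strip()` call in step 4 only removes surrounding whitespace. If the model wraps the label in markdown or adds punctuation, the caller has to handle that as well. A minimal sketch of more defensive parsing follows; `normalize_label` and `ALLOWED_LABELS` are hypothetical names introduced for illustration and are not part of the Gemini SDK or DSPy.

```python
import re

# Hypothetical label set for this illustration; mirrors the labels named in the prompt.
ALLOWED_LABELS = ("Positive", "Negative", "Neutral")


def normalize_label(raw: str) -> str:
    """Map a free-form model reply onto one of ALLOWED_LABELS, or raise ValueError."""
    # Drop markdown emphasis, backticks, quotes, and punctuation around the reply.
    cleaned = re.sub(r"[*_`.!:'\"]", "", raw).strip().lower()
    for label in ALLOWED_LABELS:
        # Accept the bare label or a reply such as "Label Positive".
        if cleaned == label.lower() or cleaned.endswith(" " + label.lower()):
            return label
    raise ValueError(f"Unrecognized sentiment label: {raw!r}")


print(normalize_label("**Positive**"))
print(normalize_label(" label: negative. "))
```

Every new output format needs another rule of this kind, which is the maintenance burden the declarative approach is meant to remove.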


With DSPy, the same task is written declaratively: you configure an `lm`, declare the inputs and outputs, and call a module. Switching providers only requires changing the `lm` definition. <b>DSPy handles the prompt structure, meaning you don't write "You are a classifier..." text</b>.

In [117]:
import dspy

# 1. Setup (Unified interface)
# We tell DSPy to use Gemini. The syntax is "provider/model_name"
lm = dspy.LM("gemini/gemini-2.5-flash", api_key=API_KEY)
dspy.settings.configure(lm=lm)

In [127]:
# 2. Define the Signature (Declarative Logic)
# Notice: No prompt engineering here. Just inputs and outputs.
class SentimentAnalysis(dspy.Signature):
    sentence = dspy.InputField()
    sentiment = dspy.OutputField(desc="Positive, Negative, or Neutral")
    reason=dspy.OutputField(desc="The judging reason.")

In [129]:
# 3. Define the Module
# dspy.Predict builds the prompt from the signature automatically
classifier = dspy.Predict(SentimentAnalysis)

In [130]:
# 4. Run it (Structured Object In -> Structured Object Out)
response = classifier(sentence=sentence)

print(f"DSPy Result:   {response.sentiment} ({response.reason})")

DSPy Result:   Positive (The sentence contains two strong indicators of positive sentiment: "perfect score" which signifies a great achievement, and "feel happy!" which is an explicit declaration of a positive emotion.)


## <b><font color='darkblue'>What is DSPy?</font></b>
<font size='3ptx'><b>[DSPy](https://dspy.ai/) is a framework designed to simplify the development of language model-powered applications</b>. It introduces a declarative approach where users specify what they want the model to do without getting bogged down in the implementation details.</font>

Here are the core components of DSPy:

### <b><font color='darkgreen'>Key Components of DSPy</font></b>
* <b><font size='3ptx'>[Signatures](https://dspy.ai/learn/programming/signatures/)</font></b>: Declarative specifications known as signatures specify how a DSPy module should behave both in terms of input and output. For instance, “`question -> answer`” could be a signature for a task that requires answering questions. Signatures make it easier to specify exactly what the model is supposed to do.
* <b><font size='3ptx'>[Modules](https://dspy.ai/learn/programming/modules/)</font></b>: Within an LLM pipeline, modules abstract standard prompting mechanisms. Every built-in module manages a distinct DSPy signature and prompting method. Building complicated LLM applications is made easier by the ability to combine modules to form larger, more intricate modules.
* <b><font size='3ptx'>[Optimizers](https://dspy.ai/learn/optimization/overview/)</font></b>: Optimizers modify a DSPy program’s parameters, such as language model weights and prompts, to improve predetermined metrics, such as accuracy. <b>Developers can concentrate on higher-level program logic since this automation eliminates the need for manual prompt engineering</b>.
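
To make the signature idea concrete, here is a minimal sketch of how a string such as `question -> answer: float` can be split into named, typed input and output fields. `parse_signature` is a hypothetical helper written for this illustration; DSPy's own parser is more capable and may behave differently.

```python
def parse_signature(spec: str) -> tuple[dict[str, str], dict[str, str]]:
    """Split 'a, b -> c: float' into ({'a': 'str', 'b': 'str'}, {'c': 'float'})."""
    if "->" not in spec:
        raise ValueError("A signature needs '->' between inputs and outputs")
    inputs_part, outputs_part = spec.split("->", 1)

    def parse_fields(part: str) -> dict[str, str]:
        fields = {}
        for item in part.split(","):
            # Each field is 'name' or 'name: type'; untyped fields default to str here.
            name, _, type_name = item.partition(":")
            fields[name.strip()] = type_name.strip() or "str"
        return fields

    return parse_fields(inputs_part), parse_fields(outputs_part)


inputs, outputs = parse_signature("question, context -> answer: float")
print(inputs)   # {'question': 'str', 'context': 'str'}
print(outputs)  # {'answer': 'float'}
```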

## <b><font color='darkblue'>How DSPy Works</font></b>
<font size='3ptx'><b>DSPy is a framework that helps simplify the creation of workflows by using modular components and a declarative programming style.</b> It automates many aspects of workflow design, optimization, and execution, allowing users to focus on defining their goals rather than the implementation details.</font>

Below is a detailed explanation of how DSPy works:

### <b><font color='darkgreen'>Task Definition</font></b>
* <b><font size='3ptx'>Objective Specification</font></b>: Clearly define the task you aim to accomplish, such as text summarization, question answering, or sentiment analysis.
* <b><font size='3ptx'>Performance Metrics</font></b>: Establish criteria to evaluate the success of the task, like accuracy, relevance, or response time.
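
A performance metric is ultimately a function that scores a prediction against a reference. The sketch below follows the `(example, prediction, trace=None)` calling convention that DSPy uses for metrics. `SimpleNamespace` objects stand in for DSPy's example and prediction objects, and the function names `sentiment_match` and `accuracy` are hypothetical.

```python
from types import SimpleNamespace


def sentiment_match(example, prediction, trace=None) -> bool:
    """Return True when the predicted label equals the gold label, ignoring case and whitespace."""
    return example.sentiment.strip().lower() == prediction.sentiment.strip().lower()


def accuracy(metric, pairs) -> float:
    """Average a boolean metric over (example, prediction) pairs."""
    scores = [metric(example, prediction) for example, prediction in pairs]
    return sum(scores) / len(scores)


# Stand-in data: two gold examples paired with two model predictions.
pairs = [
    (SimpleNamespace(sentiment="positive"), SimpleNamespace(sentiment="Positive ")),
    (SimpleNamespace(sentiment="neutral"), SimpleNamespace(sentiment="negative")),
]
print(accuracy(sentiment_match, pairs))  # 0.5
```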

### <b><font color='darkgreen'>Data Collection</font></b>
* <b><font size='3ptx'>Example Gathering</font></b>: Collect input examples pertinent to the task. These can be labeled (with expected outputs) or unlabeled, depending on the requirements.
* <b><font size='3ptx'>Dataset Preparation</font></b>: Organize the collected data into a structured format suitable for processing within DSPy.

### <b><font color='darkgreen'>Pipeline Construction</font></b>
* <b><font size='3ptx'>[Module](https://dspy.ai/learn/programming/modules/) Selection</font></b>: Choose from DSPy’s built-in modules that correspond to various natural language processing tasks.
* <b><font size='3ptx'>[Signature](https://dspy.ai/learn/programming/signatures/) Definition</font></b>: Define the input and output types for each module using signatures, ensuring compatibility and clarity in data flow.
* <b><font size='3ptx'>Pipeline Assembly</font></b>: Arrange the selected modules into a coherent pipeline that processes inputs to produce the desired outputs.

### <b><font color='darkgreen'>Optimization</font></b>
* <b><font size='3ptx'>Prompt Refinement</font></b>: Utilize DSPy’s [**optimizers**](https://dspy.ai/learn/optimization/overview/) to automatically refine prompts and adjust parameters, enhancing the performance of each module.
* <b><font size='3ptx'>Few-Shot Example Generation</font></b>: Leverage in-context learning to generate examples that improve the model’s understanding and output quality.
* <b><font size='3ptx'>Self-Improvement</font></b>: Enable the pipeline to learn from its outputs and feedback, continuously improving performance.

### <b><font color='darkgreen'>Compilation and Execution</font></b>
* <b><font size='3ptx'>Compilation</font></b>: Compile the pipeline with an optimizer to obtain an optimized DSPy program. The result is still an ordinary Python object, now carrying tuned prompts and demonstrations, which eases integration into applications.
* <b><font size='3ptx'>Deployment</font></b>: Deploy the compiled pipeline within your application’s environment to perform the specified tasks.
* <b><font size='3ptx'>Evaluation</font></b>: Assess the pipeline’s performance using the predefined metrics, ensuring it meets the desired standards.

### <b><font color='darkgreen'>Iteration</font></b>
* <b><font size='3ptx'>Feedback Incorporation</font></b>: Analyze performance evaluations to identify areas for improvement.
* <b><font size='3ptx'>Pipeline Refinement</font></b>: Iteratively refine the pipeline by revisiting previous steps, such as adjusting modules, updating data, or modifying optimization parameters, to achieve better results.

![idea](https://cdn.analyticsvidhya.com/wp-content/uploads/2025/01/dspy.webp)
Source: [Click Here](https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6dd1229a-71ea-41f7-af23-091ce219bce4_1908x1070.png)

By following this structured workflow, DSPy facilitates the development of robust, efficient, and adaptable language model applications. <b>It allows developers to <font size='4ptx'>concentrate on defining tasks and metrics</font> while the framework handles the intricacies of optimization and execution.</b>

## <b><font color='darkblue'>How DSPy Automates Prompt Engineering</font></b>
<b><font size='3ptx'>DSPy uses an optimization technique that views [prompt engineering](https://www.analyticsvidhya.com/blog/2023/06/what-is-prompt-engineering/) as a [machine learning](https://www.analyticsvidhya.com/machine-learning/) problem rather than creating prompts by hand.</font></b>

This procedure entails:
* <b><font size='3ptx'>[Bootstrapping](https://www.datacamp.com/tutorial/bootstrapping)</font></b>: DSPy iteratively improves the initial seed prompt based on user-provided examples or assertions and the model’s outputs.
* <b><font size='3ptx'>Prompt chaining</font></b>: Difficult tasks are divided into a series of simpler sub-prompts so that the model can better handle complex questions.
* <b><font size='3ptx'>Prompt ensembling</font></b>: Several prompt variations are combined to increase robustness and performance.

By automating these prompt engineering procedures, DSPy makes them more effective and efficient, which leads to more dependable LLM applications.
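
Combining prompt variations can be as simple as a majority vote over the answers they produce. The sketch below assumes the variants have already been run and shows only the aggregation step; `majority_vote` is a hypothetical helper, not a DSPy API.

```python
from collections import Counter


def majority_vote(labels: list[str]) -> str:
    """Return the most frequent label; ties go to the label that appeared first."""
    if not labels:
        raise ValueError("Need at least one label to vote on")
    normalized = [label.strip().lower() for label in labels]
    # Counter.most_common orders equal counts by first appearance.
    return Counter(normalized).most_common(1)[0][0]


# Outputs from three hypothetical prompt variants for the same sentence.
print(majority_vote(["Positive", "positive", "Neutral"]))  # positive
```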

## <b><font color='darkblue'>Practical Examples of Prompting with DSPy</font></b>
<b><font size='3ptx'>Below we will explore real-world applications of DSPy through practical examples, showcasing how to efficiently handle tasks like sentiment analysis and math problem-solving. </font></b>

But first we will start with the environment setup.

### <b><font color='darkgreen'>Install the library</font></b>

In [1]:
#pip install dspy
!pip freeze | grep 'dspy'

Set up the library with your model and API key. This initializes DSPy for use with your preferred language model.

In [28]:
import os
from dotenv import load_dotenv
import dspy

# Expand the '~' to the user's home directory path
home_directory = os.path.expanduser('~')

# Construct the full path to your .env file
dotenv_path = os.path.join(home_directory, '.env')

# Load the environment variables from the specified path
load_dotenv(dotenv_path)


API_KEY = os.environ.get('GOOGLE_API_KEY')
lm = dspy.LM("gemini/gemini-2.5-flash", api_key=API_KEY)
dspy.configure(lm=lm)

Now let's start with our first practical example.

### <b><font color='darkgreen'>Solving Math Problems with Chain of Thought</font></b>
* <b><font size='3ptx'>Purpose</font></b>: Solve mathematical problems step-by-step.
* <b><font size='3ptx'>Concept</font></b>: Use the [**Chain of Thought**](https://www.datacamp.com/tutorial/chain-of-thought-prompting?utm_cid=19589720824&utm_aid=157098106775&utm_campaign=230119_1-ps-other~dsa-tofu~all_2-b2c_3-apac_4-prc_5-na_6-na_7-le_8-pdsh-go_9-nb-e_10-na_11-na&utm_loc=9197821-&utm_mtd=-c&utm_kw=&utm_source=google&utm_medium=paid_search&utm_content=ps-other~apac-en~dsa~tofu~tutorial~artificial-intelligence&gad_source=1&gad_campaignid=19589720824&gbraid=0AAAAADQ9WsGGLCZqlrmrC_-6OEcO3H_SS&gclid=Cj0KCQiAuvTJBhCwARIsAL6DemgXsrk2FgmnjRCpbssQpbDL7Fq_J2x5P0MGD49NYTnweyRrdSyX7yEaAju5EALw_wcB) (CoT) approach to break down tasks into logical sequences.

In [3]:
math = dspy.ChainOfThought("question -> answer: float")
response = math(question="What is the distance between Earth and the Sun in kilometers?")
print(response) 

Prediction(
    reasoning='The average distance between Earth and the Sun is approximately 1 Astronomical Unit (AU). One AU is defined as 149.6 million kilometers.',
    answer=149600000.0
)


#### <b>Explanation:</b>
* <b><font size='3ptx'>ChainOfThought</font></b>: This creates a prompt structure for solving problems.
  - Input: “`question`” is the math problem.
  - Output: “`answer: float`” specifies the expected result type (a floating-point number).
* <b><font size='3ptx'>The model reasons through the problem</font></b> step by step before answering, which tends to improve accuracy but does not guarantee it.

#### <b>Practical Use:</b>
* Scientific calculations.
* Business analytics requiring precise mathematical reasoning.

### <b><font color='darkgreen'>Sentiment Analysis</font></b>
* <b><font size='3ptx'>Purpose</font></b>: Determine the emotional tone (positive, negative, or neutral) of a given sentence.
* <b><font size='3ptx'>Concept</font></b>: Use a Signature to define the input and output fields explicitly.

In [4]:
from typing import Literal


class Classify(dspy.Signature):
    """Classify sentiment of a given sentence."""

    sentence: str = dspy.InputField()
    sentiment: Literal['positive', 'negative', 'neutral'] = dspy.OutputField()
    confidence: float = dspy.OutputField()


classify = dspy.Predict(Classify)
classify(sentence="I love learning new skills!")

Prediction(
    sentiment='positive',
    confidence=0.99
)

#### <b>Explanation:</b>
* <b><font size='3ptx'>Signature</font></b>: A structured template to define:
  - Input: sentence (a string containing the text).
  - Output:
    - `sentiment` (categorical: positive, negative, or neutral).
    - `confidence` (a float indicating the model’s certainty in its prediction).
* <b><font size='3ptx'>Predict</font></b>: Applies the defined `Classify` signature to the input sentence.
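
The `Literal[...]` annotation restricts `sentiment` to a closed set of values, and `confidence` is a plain float. The sketch below shows the kind of check such type hints make possible. `validate_prediction` is a hypothetical helper, the `[0, 1]` range for `confidence` is an assumption made here, and how DSPy enforces types internally is not shown.

```python
from typing import Literal, get_args

Sentiment = Literal["positive", "negative", "neutral"]


def validate_prediction(sentiment: str, confidence: float) -> tuple[str, float]:
    """Check that sentiment is an allowed label and confidence lies in [0, 1]."""
    allowed = get_args(Sentiment)  # ('positive', 'negative', 'neutral')
    if sentiment not in allowed:
        raise ValueError(f"sentiment must be one of {allowed}, got {sentiment!r}")
    # Assumption for this sketch: confidence is expected to be a fraction between 0 and 1.
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be within [0, 1], got {confidence}")
    return sentiment, confidence


print(validate_prediction("positive", 0.99))  # ('positive', 0.99)
```

Keep in mind that a confidence value reported by the model is a self-assessment rather than a calibrated probability.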

#### <b>Practical Use:</b>
* Monitor customer feedback for businesses.
* Gauge public opinion on social media.

### <b><font color='darkgreen'>Spam Detection</font></b>
* <b><font size='3ptx'>Purpose</font></b>: Detect whether an email or message is spam.
* <b><font size='3ptx'>Concept</font></b>: Use a Signature to classify text into spam or non-spam categories.

In [5]:
class SpamDetect(dspy.Signature):
    """Detect if an email is spam."""
    email: str = dspy.InputField()
    is_spam: bool = dspy.OutputField()
    confidence: float = dspy.OutputField()

spam_detector = dspy.Predict(SpamDetect)
response = spam_detector(email="Congratulations! You've won a free vacation. Click here to claim!")
print(f"Is Spam: {response.is_spam}, Confidence: {response.confidence:.2f}")

Is Spam: True, Confidence: 0.95


#### <b>Explanation:</b>
* <b><font size='3ptx'>Input</font></b>: email field contains the text of the email.
* <b><font size='3ptx'>Output</font></b>:
  - `is_spam` (boolean indicating whether the email is spam).
  - `confidence` (a float showing the certainty of the classification).
* <b><font size='3ptx'>Practical Workflow</font></b>: The model detects patterns common in spam messages, such as exaggerated claims or links to unknown websites.

#### <b>Practical Use:</b>
* Email filtering systems.
* Protecting users from phishing attempts.

### <b><font color='darkgreen'>FAQ Automation</font></b>
* <b><font size='3ptx'>Purpose:</font></b> Answer Frequently Asked Questions (FAQs) using AI.
* <b><font size='3ptx'>Concept:</font></b> Define a custom Signature for FAQ inputs and outputs.

In [6]:
class FAQ(dspy.Signature):
    """Answer FAQ queries."""
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

faq_handler = dspy.Predict(FAQ)
response = faq_handler(question="What is the capital of France?")
print(response.answer)  # A full-sentence answer, not just "Paris"

The capital of France is Paris.


#### <b>Explanation:</b>
* <b><font size='3ptx'>Input:</font></b> question, containing the FAQ query.
* <b><font size='3ptx'>Output:</font></b> answer, providing the AI-generated response.
* <b><font size='3ptx'>The model</font></b> answers the question from its own knowledge; no retrieval step is involved in this example.

#### <b>Practical Use:</b>
* Chatbots for customer service.
* Automated knowledge bases for websites or applications.

## <b><font color='darkblue'>Building AI Applications by Customizing DSPy Modules</font></b>
([source](https://dspy.ai/tutorials/custom_module/)) <b><font size='3ptx'>In this guide, we will walk you through how to build a GenAI application by customizing dspy.Module.</font></b>

A [**DSPy module**](https://dspy.ai/learn/programming/modules/) is the building block for DSPy programs.

* Each built-in module abstracts a prompting technique (<font color='brown'>like chain of thought or ReAct</font>). Crucially, they are generalized to handle any signature.
* A DSPy module has learnable parameters (<font color='brown'>i.e., the little pieces comprising the prompt and the LM weights</font>) and can be invoked (<font color='brown'>called</font>) to process inputs and return outputs.
* Multiple modules can be composed into bigger modules (<font color='brown'>programs</font>). DSPy modules are inspired directly by NN modules in PyTorch, but applied to LM programs.

Although you can build a DSPy program without implementing a custom module, we highly recommend putting your logic in a custom module so that you can use other DSPy features, such as DSPy optimizers or MLflow DSPy tracing.

### <b><font color='darkgreen'>Customize DSPy Module</font></b>
<b><font size='3ptx'>You can implement custom prompting logic and integrate external tools or services by customizing a [DSPy module](https://dspy.ai/learn/programming/modules/).</font></b>

To achieve this, subclass `dspy.Module` and implement the following two key methods:

* `__init__`: This is the constructor, where you define the attributes and sub-modules of your program.
* `forward`: This method contains the core logic of your DSPy program.

Within the `forward()` method, you are not limited to calling other DSPy modules; you can also call any standard Python function, such as those for interacting with LangChain/Agno agents, MCP tools, database handlers, and more.

The basic structure for a custom DSPy module looks like this:
```python
class MyProgram(dspy.Module):
    
    def __init__(self, ...):
        # Define attributes and sub-modules here
        {constructor_code}

    def forward(self, input_name1, input_name2, ...):
        # Implement your program's logic here
        {custom_logic_code}
```

Let's illustrate this with a practical code example. We will build a simple [**Retrieval-Augmented Generation**](https://www.datacamp.com/blog/what-is-retrieval-augmented-generation-rag?utm_cid=19589720824&utm_aid=152984013054&utm_campaign=230119_1-ps-other~dsa-tofu~all_2-b2c_3-apac_4-prc_5-na_6-na_7-le_8-pdsh-go_9-nb-e_10-na_11-na&utm_loc=9197821-&utm_mtd=-c&utm_kw=&utm_source=google&utm_medium=paid_search&utm_content=ps-other~apac-en~dsa~tofu~blog~artificial-intelligence&gad_source=1&gad_campaignid=19589720824&gbraid=0AAAAADQ9WsGGLCZqlrmrC_-6OEcO3H_SS&gclid=Cj0KCQiAuvTJBhCwARIsAL6DemiMqL5H_X7QA_hFp7W2m3GJGfb_T4xGxurWXa8CZmxwInViKTy4ZmYaAstbEALw_wcB) (RAG) application with multiple stages:
1. <b><font size='3ptx'>Query Generation</font></b>: Generate a suitable query based on the user's question to retrieve relevant context.
2. <b><font size='3ptx'>Context Retrieval</font></b>: Fetch context using the generated query.
3. <b><font size='3ptx'>Answer Generation</font></b>: Produce a final answer based on the retrieved context and the original question.

The code implementation for this multi-stage program is shown below.

In [9]:
import dspy

class QueryGenerator(dspy.Signature):
    """Generate a query based on question to fetch relevant context"""
    question: str = dspy.InputField()
    query: str = dspy.OutputField()

def mock_search_wikipedia(query: str) -> list[str]:
    """Mock retriever standing in for a ColBERT endpoint over Wikipedia data; returns fixed passages."""
    return [
        'Gemini 3 has officially begun its rollout, with the Gemini 3 Pro model entering preview in mid-November 2025 and seeing wider release in December.',
        'Two of the most highlighted features of Gemini 3 are "Deep Think" and "Vibe Coding."',
        'Gemini 3 represents a major pivot from simple chatbots to autonomous agents. It powers a new agentic development platform called Google Antigravity, designed to help developers build AI that can execute multi-step workflows independently.',
    ]

class RAG(dspy.Module):
    def __init__(self):
        super().__init__()
        self.query_generator = dspy.Predict(QueryGenerator)
        self.answer_generator = dspy.ChainOfThought("question,context->answer")

    def forward(self, question, **kwargs):
        query = self.query_generator(question=question).query
        context = '\n'.join(mock_search_wikipedia(query))
        return self.answer_generator(question=question, context=context).answer

Let's take a look at the `forward` method. We first send the question to `self.query_generator`, which is a `dspy.Predict`, to get the `query` for context retrieval. Then we pass the `query` to `mock_search_wikipedia`, which stands in for a real retriever such as ColBERT, and join the returned passages into a single `context` string. Finally, we send the `question` and `context` to `self.answer_generator`, which is a `dspy.ChainOfThought`, to generate the final answer.

Next, we'll create an instance of our RAG module to run the program.

<b><font color='orange'>Important</font></b>: When invoking a custom DSPy module, you should use the module instance directly (<font color='brown'>which calls the `__call__` method internally</font>), rather than calling the `forward()` method explicitly. The `__call__` method handles necessary internal processing before executing the forward logic.
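
The reason for this advice is easier to see in a toy version. The sketch below is not DSPy's implementation; `MiniModule` is a hypothetical base class whose `__call__` performs some bookkeeping (here, a simple call counter) before delegating to `forward`. Calling `forward()` directly skips that bookkeeping.

```python
class MiniModule:
    """Toy base class: __call__ wraps forward() with extra bookkeeping."""

    def __init__(self):
        self.call_count = 0

    def __call__(self, *args, **kwargs):
        # Bookkeeping that only happens when the instance itself is called.
        self.call_count += 1
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError


class Shout(MiniModule):
    def forward(self, text: str) -> str:
        return text.upper()


shout = Shout()
shout("hello")           # goes through __call__, so the call is counted
shout.forward("hello")   # bypasses __call__, so the call is not counted
print(shout.call_count)  # 1
```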

In [10]:
rag = RAG()
print(rag(question="Tell me something about Gemini 3?"))

Gemini 3 began its official rollout, with the Gemini 3 Pro model entering preview in mid-November 2025 and a wider release in December. Its most highlighted features are "Deep Think" and "Vibe Coding." Gemini 3 marks a significant shift from simple chatbots to autonomous agents, and it powers Google Antigravity, a new agentic development platform that helps developers build AI capable of executing multi-step workflows independently.


That's it! In summary, to build a GenAI application, put your custom logic into the `forward()` method, create a module instance, and call the instance itself.
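
Because the program is ordinary Python, the three-stage flow can also be exercised without any LM by passing in stand-in callables. The sketch below mirrors the structure of `RAG.forward`; `PipelineSketch` and the lambda stand-ins are hypothetical and exist only to show the data flow.

```python
from typing import Callable


class PipelineSketch:
    """Same three stages as RAG.forward, with each stage injected as a plain callable."""

    def __init__(
        self,
        generate_query: Callable[[str], str],
        retrieve: Callable[[str], list[str]],
        generate_answer: Callable[[str, str], str],
    ):
        self.generate_query = generate_query
        self.retrieve = retrieve
        self.generate_answer = generate_answer

    def __call__(self, question: str) -> str:
        query = self.generate_query(question)           # 1. query generation
        context = "\n".join(self.retrieve(query))       # 2. context retrieval
        return self.generate_answer(question, context)  # 3. answer generation


# Stand-ins that make the data flow visible without calling a model.
pipeline = PipelineSketch(
    generate_query=lambda question: question.rstrip("?").lower(),
    retrieve=lambda query: [f"passage about {query}"],
    generate_answer=lambda question, context: f"Based on: {context}",
)
print(pipeline("What is DSPy?"))  # Based on: passage about what is dspy
```

Injecting the stages this way is one option for unit-testing the control flow before connecting a real retriever or model.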

### <b><font color='darkgreen'>Why Customize a Module?</font></b>
<font size='3ptx'><b>DSPy is a lightweight authoring and optimization framework, and our focus is to resolve the mess of prompt engineering by turning LLM prompting</b> (<font color='brown'>string in, string out</font>) <b>into LLM programming</b> (<font color='brown'>structured inputs in, structured outputs out</font>) <b>for robust AI systems.</b></font>

While we provide pre-built modules with custom prompting logic, such as `dspy.ChainOfThought` for reasoning and `dspy.ReAct` for tool-calling agents, to facilitate building your AI applications, we don't aim to standardize how you build agents.

In DSPy, your application logic simply goes in the `forward` method of your custom Module, which imposes no constraints as long as you are writing Python code. <b>With this layout, DSPy is easy to migrate to from other frameworks or vanilla SDK usage, and easy to migrate off because essentially it's just Python code</b>.

## <b><font color='darkblue'>Few-Shot Learning Optimizers Overview</font></b>
([source](https://codesignal.com/learn/courses/how-to-optimize-with-dspy/lessons/automatic-few-shot-learning-with-dspy)) <b><font size='3ptx'>Now, we'll dive deeper into the first category: Few-Shot Learning optimizers.</font></b>

**As you may recall, [few-shot learning](https://www.promptingguide.ai/techniques/fewshot) is a technique where we provide the language model with examples of the task before asking it to solve a new instance**. This approach helps the model understand what we're asking for and improves its performance. While you could manually select and include examples in your prompts, DSPy's few-shot optimizers automate this process, finding the most effective examples to include.

1. <b><font size='3ptx'>[LabeledFewShot](#LabeledFewShot:-Basic-Example-Selection)</font></b>: The simplest approach, which randomly selects examples from your training data.
2. <b><font size='3ptx'>[BootstrapFewShot](#BootstrapFewShot:-Self-Generated-Examples)</font></b>: A more advanced approach that generates new examples using your program itself.
3. <b><font size='3ptx'>[BootstrapFewShotWithRandomSearch](#BootstrapFewShotWithRandomSearch:-Finding-Optimal-Example-Sets)</font></b>: Extends <b><font color='blue'>BootstrapFewShot</font></b> by exploring multiple sets of examples to find the best combination.
4. <b><font size='3ptx'>[KNNFewShot](#KNNFewShot:-Context-Aware-Example-Selection)</font></b>: A retrieval-based approach that selects examples most similar to the current input.

Each optimizer has its strengths and is suited for different scenarios. If you have very few examples (<font color='brown'>around 10</font>), <b><font color='blue'>BootstrapFewShot</font></b> is a good starting point. With more data (<font color='brown'>50+ examples</font>), <b><font color='blue'>BootstrapFewShotWithRandomSearch</font></b> can yield better results. <b><font color='blue'>KNNFewShot</font></b> is particularly useful when the relevance of examples varies significantly depending on the input.

Let's explore each of these optimizers in detail, with practical examples to help you understand how to implement them in your own projects.

### <b><font color='darkgreen'>LabeledFewShot: Basic Example Selection</font></b>
<font size='3ptx'>The simplest few-shot optimizer in DSPy is [<b>LabeledFewShot</b>](https://dspy.ai/api/optimizers/LabeledFewShot/?h=labeledfewshot).</font>

<b>This optimizer takes examples from your training data and includes them in the prompt sent to the language model</b>. It's straightforward but effective, especially when you have high-quality labeled examples.

Here's how to implement [**LabeledFewShot**](https://dspy.ai/api/optimizers/LabeledFewShot/?h=labeledfewshot):
```python
from dspy.teleprompt import LabeledFewShot

# Create the optimizer with k=8 (8 examples will be included in each prompt)
labeled_fewshot_optimizer = LabeledFewShot(k=8)

# Compile your DSPy program with the optimizer
your_dspy_program_compiled = labeled_fewshot_optimizer.compile(
    student=your_dspy_program, 
    trainset=trainset
)
```

In this example, we create a [**LabeledFewShot**](https://dspy.ai/api/optimizers/LabeledFewShot/?h=labeledfewshot) optimizer that will include 8 examples in each prompt. The k parameter controls the number of examples, and <b>you can adjust it based on your needs and the context window size of your language model</b>.

<b>When you call `compile()`, the optimizer randomly selects `k` examples from your training set and incorporates them into the prompts of your DSPy program.</b> The result is a new program (`your_dspy_program_compiled`) that includes these examples in its prompts. Since [**LabeledFewShot**](https://dspy.ai/api/optimizers/LabeledFewShot/?h=labeledfewshot) selects examples randomly, the examples used may differ between runs unless you fix the random seed. If reproducibility is important—for example, when comparing results or debugging—you may want to <b>set a random seed before compiling the program to ensure consistency across runs</b>.
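
The effect of seeding can be illustrated with the standard library alone. The sketch below is a conceptual stand-in for random demo selection, not the `LabeledFewShot` source code; `sample_demos` and its `seed` parameter are hypothetical.

```python
import random


def sample_demos(trainset: list, k: int, seed: int) -> list:
    """Pick up to k examples at random; the same seed always yields the same selection."""
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return rng.sample(trainset, min(k, len(trainset)))


trainset = [f"example-{i}" for i in range(20)]
first = sample_demos(trainset, k=8, seed=0)
second = sample_demos(trainset, k=8, seed=0)
print(first == second)  # True
```

Capping `k` at the size of the training set also explains why a request for more demos than are available simply returns all of them.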

Let's walk through a complete, runnable example:

In [11]:
class SentimentSignature(dspy.Signature):
    """Classify the sentiment of a sentence."""
    sentence: str = dspy.InputField()
    sentiment: str = dspy.OutputField(desc="positive, negative, or neutral")


class SentimentClassifier(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predict = dspy.Predict(SentimentSignature)

    def forward(self, sentence):
        return self.predict(sentence=sentence)

In [12]:
train_examples = [
    dspy.Example(
        sentence="I love this product, it works perfectly!",
        sentiment="positive"
    ).with_inputs("sentence"),

    dspy.Example(
        sentence="This is the worst experience I’ve ever had.",
        sentiment="negative"
    ).with_inputs("sentence"),

    dspy.Example(
        sentence="The package arrived on time.",
        sentiment="neutral"
    ).with_inputs("sentence"),
]

#### <b>Apply `LabeledFewShot` Teleprompter</b>

In [13]:
from dspy.teleprompt import LabeledFewShot

teleprompter = LabeledFewShot(k=3)

In [14]:
classifier = SentimentClassifier()

compiled_classifier = teleprompter.compile(
    classifier,
    trainset=train_examples
)

What this does:
* Selects up to k=3 labeled examples
* Injects them into the prompt
* Locks them into the compiled program

#### <b>Run Inference</b>

In [15]:
result = compiled_classifier(
    sentence="The movie was okay, nothing special."
)

print(result.sentiment)

neutral


#### <b>What `LabeledFewShot` Is Actually Doing</b>
Internally, DSPy turns your examples into something like:
```
Sentence: I love this product, it works perfectly!
Sentiment: positive

Sentence: This is the worst experience I’ve ever had.
Sentiment: negative

Sentence: The package arrived on time.
Sentiment: neutral

Sentence: The movie was okay, nothing special.
Sentiment:
```

But:
- You don’t write prompts
- You don’t manage formatting
- You can later swap teleprompters (BootstrapFewShot, COPRO, etc.)
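
For intuition only, the layout shown above can be imitated with a few lines of string formatting. `render_few_shot_prompt` is a hypothetical helper; the prompt DSPy actually sends depends on its adapter and will differ in detail.

```python
def render_few_shot_prompt(demos: list[dict[str, str]], sentence: str) -> str:
    """Lay out labeled demos followed by the new sentence with an empty Sentiment slot."""
    blocks = [
        f"Sentence: {demo['sentence']}\nSentiment: {demo['sentiment']}"
        for demo in demos
    ]
    # The final block leaves the label empty for the model to complete.
    blocks.append(f"Sentence: {sentence}\nSentiment:")
    return "\n\n".join(blocks)


demos = [
    {"sentence": "I love this product, it works perfectly!", "sentiment": "positive"},
    {"sentence": "The package arrived on time.", "sentiment": "neutral"},
]
print(render_few_shot_prompt(demos, "The movie was okay, nothing special."))
```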

#### <b>When to Use `LabeledFewShot`</b>
Use it when:
- You already have labeled data
- You want deterministic, simple few-shot behavior
- You don’t need optimization or self-bootstrapping

Don’t use it when:
- You want the model to discover examples
- You want automatic prompt optimization

**The main advantage of [LabeledFewShot](https://dspy.ai/api/optimizers/LabeledFewShot/?h=labeledfewshot) is its simplicity. It doesn't require a metric function and doesn't perform any complex optimization.** However, this simplicity also means it doesn't adapt the examples to the specific input or try to find the most effective examples. It's a good baseline approach, especially when you have a small but high-quality training set.

### <b><font color='darkgreen'>BootstrapFewShot: Self-Generated Examples</font></b>
<font size='3ptx'>While [**LabeledFewShot**](https://dspy.ai/api/optimizers/LabeledFewShot/?h=labeledfewshot) simply uses examples from your training data, <b>[BootstrapFewShot](https://dspy.ai/api/optimizers/BootstrapFewShot/?h=bootstrapfewshot) goes a step further by generating additional examples using your program itself</b>.</font>

This is particularly useful when you have limited labeled data or when you want to create more diverse examples.

Here's how to implement [**BootstrapFewShot**](https://dspy.ai/api/optimizers/BootstrapFewShot/?h=bootstrapfewshot):
```python
from dspy.teleprompt import BootstrapFewShot

# Create the optimizer
fewshot_optimizer = BootstrapFewShot(
    metric=your_defined_metric,
    max_bootstrapped_demos=4,
    max_labeled_demos=16,
    max_rounds=1,
    max_errors=5
)

# Compile your DSPy program with the optimizer
your_dspy_program_compiled = fewshot_optimizer.compile(
    student=your_dspy_program,   # DSPy program
    trainset=trainset,           # Dataset
)
```

In this example, we create a [**BootstrapFewShot**](https://dspy.ai/api/optimizers/BootstrapFewShot/?h=bootstrapfewshot) optimizer with several parameters:
* <b><font size='3ptx'>`metric`</font></b>: A function that evaluates the quality of generated examples.
* <b><font size='3ptx'>`max_bootstrapped_demos`</font></b>: The maximum number of examples to generate (4 in this case).
* <b><font size='3ptx'>`max_labeled_demos`</font></b>: The maximum number of examples to use from the training set (16 in this case).
* <b><font size='3ptx'>`max_rounds`</font></b>: The number of rounds of bootstrapping to perform.
* <b><font size='3ptx'>`max_errors`</font></b>: The maximum number of errors allowed before stopping the bootstrapping process.

The bootstrapping process works as follows:
1. The optimizer selects examples from your training set (<font color='brown'>up to `max_labeled_demos`</font>).
2. It uses your program (<font color='brown'>or a specified "teacher" program</font>) to generate complete demonstrations for these examples.
3. It evaluates the generated demonstrations using your `metric` function.
4. It includes only the successful demonstrations (<font color='brown'>those that pass the `metric`</font>) in the compiled program.
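
The four steps above can be sketched as a simple filter loop. This is a simplified illustration in plain Python, not DSPy's implementation: `bootstrap_demos`, the lookup-table teacher, and the dictionary-based examples are all hypothetical stand-ins, and the exact error-threshold semantics in DSPy may differ.

```python
def bootstrap_demos(trainset, teacher, metric, max_bootstrapped_demos=4, max_errors=5):
    """Keep teacher outputs that pass the metric, up to a fixed budget."""
    demos, errors = [], 0
    for example in trainset:
        if len(demos) >= max_bootstrapped_demos:
            break
        try:
            prediction = teacher(example["question"])
        except Exception:
            # Tolerate a limited number of failed calls before giving up.
            errors += 1
            if errors > max_errors:
                raise
            continue
        # Only demonstrations that satisfy the metric are kept.
        if metric(example, prediction):
            demos.append({"question": example["question"], "answer": prediction})
    return demos


# Toy stand-ins: the "teacher" is a lookup table with one deliberately wrong answer.
knowledge = {
    "What is 1+1?": "2",
    "Capital of France?": "Paris",
    "Who wrote Hamlet?": "Marlowe",
}
toy_trainset = [
    {"question": "What is 1+1?", "answer": "2"},
    {"question": "Capital of France?", "answer": "Paris"},
    {"question": "Who wrote Hamlet?", "answer": "William Shakespeare"},
]


def toy_metric(example, prediction):
    return example["answer"] == prediction


demos = bootstrap_demos(toy_trainset, knowledge.get, toy_metric)
print(demos)
```

The wrong Hamlet answer fails the metric and is dropped, which is the filtering behaviour described in step 4.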

You can also use a different language model for the teacher by specifying it in `teacher_settings` (here `gpt4` is assumed to be a `dspy.LM` instance that you have created beforehand):
```python
# Using another LM for compilation
fewshot_optimizer = BootstrapFewShot(
    metric=your_defined_metric,
    max_bootstrapped_demos=4,
    max_labeled_demos=16,
    max_rounds=1,
    max_errors=5,
    teacher_settings=dict(lm=gpt4)  # Use GPT-4 as the teacher
)
```

This is particularly useful when you have access to a more powerful model (<font color='brown'>like GPT-4</font>) that can generate high-quality examples but want to optimize a program that will run on a smaller, more efficient model.

The main advantage of [**BootstrapFewShot**](https://dspy.ai/api/optimizers/BootstrapFewShot/?h=bootstrapfewshot) over [**LabeledFewShot**](https://dspy.ai/api/optimizers/LabeledFewShot/?h=labeledfewshot) is that it can <b>generate new, high-quality examples beyond what's in your training set. This can lead to better performance, especially when your training data is limited</b>.

Let's check a real example:

In [39]:
from dspy.teleprompt import BootstrapFewShot


# 1. Define the task signature
class QA(dspy.Signature):
    """You are going to receive a question and provide an answer.

    Regarding the answer, please don't use more than 5 sentences to answer the question.
    If you don't know or have no confidence, just reply "I don't know, sorry."
    """
    question = dspy.InputField()
    answer = dspy.OutputField(desc="A short, factual answer")


# 2. Create a module
class QAModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predict = dspy.Predict(QA)

    def forward(self, question):
        return self.predict(question=question)

In [48]:
# 3. Prepare training examples
trainset = [
    dspy.Example(
        question="What is the capital of France?",
        answer="Paris"
    ).with_inputs("question"),
    dspy.Example(
        question="Who wrote Hamlet?",
        answer="William Shakespeare"
    ).with_inputs("question"),
    dspy.Example(
        question="What is Cockatoo.AI?",
        answer="It is a framework to use models (A-B-C as shown below) to form the pipeline as the final language tutor."
    ).with_inputs("question"),
    dspy.Example(
        question="What is 1+1?",
        answer="2"
    ).with_inputs("question"),
    dspy.Example(
        question="What is LLM?",
        answer="Large Language Model"
    ).with_inputs("question"),
    dspy.Example(
        question="Give me 3 kinds of animals.",
        answer="Dog, Cat and Bird."
    ).with_inputs("question"),
]

In [49]:
# 4. Define a metric
def exact_match(example, pred, trace=None):
    return example.answer.strip().lower() == pred.answer.strip().lower()

In [54]:
%%time
# 5. Bootstrap few-shot examples
teleprompter = BootstrapFewShot(
    metric=exact_match,
    max_bootstrapped_demos=4,
    max_labeled_demos=3)

compiled_module = teleprompter.compile(
    student=QAModule(),
    trainset=trainset
)

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:08<00:00,  1.49s/it]

Bootstrapped 0 full traces after 5 examples for up to 1 rounds, amounting to 6 attempts.
CPU times: user 225 ms, sys: 12.9 ms, total: 237 ms
Wall time: 8.94 s
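
The log above reports 0 bootstrapped traces. One plausible explanation (an assumption, not verified against this run) is that `exact_match` is too strict: the model tends to answer in full sentences, as in "The capital of Taiwan is Taipei.", while the labels are short strings like "Paris". A more lenient metric, sketched below under the hypothetical name `contains_match`, accepts a prediction when it contains the gold answer.

```python
from types import SimpleNamespace


def contains_match(example, pred, trace=None):
    """Pass when the gold answer appears anywhere in the prediction (case-insensitive)."""
    gold = example.answer.strip().lower().rstrip(".")
    guess = pred.answer.strip().lower()
    return gold in guess


# SimpleNamespace stands in for dspy.Example / dspy.Prediction here.
example = SimpleNamespace(answer="Paris")
pred = SimpleNamespace(answer="The capital of France is Paris.")

print(contains_match(example, pred))
```

Substring matching has its own failure mode: very short answers such as "2" can match unrelated text, so a stricter comparison may still be the better choice for numeric tasks.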





In [55]:
# 6. Run inference
dspy.settings.configure(trace=[])
result = compiled_module(question="What is the capital of Taiwan?")
print(result.answer)

The capital of Taiwan is Taipei.


In [56]:
dspy.settings.trace[-1]

(Predict(QA(question -> answer
     instructions='You are going to receive a question and provide an answer.\n\nRegarding the answer, please don\'t use more than 5 sentences to answer the question.\nIf you don\'t know or have no confidence, just reply "I don\'t know, sorry."'
     question = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Question:', 'desc': '${question}'})
     answer = Field(annotation=str required=True json_schema_extra={'desc': 'A short, factual answer', '__dspy_field_type': 'output', 'prefix': 'Answer:'})
 )),
 {'question': 'What is the capital of Taiwan?'},
 Prediction(
     answer='The capital of Taiwan is Taipei.'
 ))

In [57]:
# Check last interaction from history
lm.inspect_history(n=1)





[2025-12-14T04:59:48.915081]

System message:

Your input fields are:
1. `question` (str):
Your output fields are:
1. `answer` (str): A short, factual answer
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## question ## ]]
{question}

[[ ## answer ## ]]
{answer}

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        You are going to receive a question and provide an answer.
        
        Regarding the answer, please don't use more than 5 sentences to answer the question.
        If you don't know or have no confidence, just reply "I don't know, sorry."


User message:

[[ ## question ## ]]
What is the capital of France?


Assistant message:

[[ ## answer ## ]]
Paris

[[ ## completed ## ]]


User message:

[[ ## question ## ]]
What is 1+1?


Assistant message:

[[ ## answer ## ]]
2

[[ ## completed ## ]]


User message:

[[ ## question ##

#### <b>What [BootstrapFewShot](https://dspy.ai/api/optimizers/BootstrapFewShot/?h=bootstrapfewshot) does here</b>
1. Runs the model on the training set.
2. Collects **successful model-generated examples**.
3. Selects the best ones according to your metric.
4. Automatically inserts them as few-shot demonstrations.

#### <b>When to use [BootstrapFewShot](https://dspy.ai/api/optimizers/BootstrapFewShot/?h=bootstrapfewshot)</b>
* You have **few or no hand-written prompts**.
* You want the model to **discover effective examples itself**.
* You can **define a clear evaluation metric**.

### <b><font color='darkgreen'>BootstrapFewShotWithRandomSearch: Finding Optimal Example Sets</font></b>
<font size='3ptx'>Building on [BootstrapFewShot](https://dspy.ai/api/optimizers/BootstrapFewShot/?h=bootstrapfewshot), the [**BootstrapFewShotWithRandomSearch**](https://dspy.ai/api/optimizers/BootstrapFewShotWithRandomSearch/?h=bootstrapfewshotwithrandomsearch) optimizer adds another layer of optimization by exploring multiple sets of examples to find the best combination.</font>

<b>This is particularly useful when you have a larger training set and want to find the most effective subset of examples.</b>

Here's how to implement [**BootstrapFewShotWithRandomSearch**](https://dspy.ai/api/optimizers/BootstrapFewShotWithRandomSearch/?h=bootstrapfewshotwithrandomsearch):
```python
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

# Configure the optimizer
config = dict(
    max_bootstrapped_demos=4,
    max_labeled_demos=4,
    num_candidate_programs=10,
    num_threads=4
)

# Create the optimizer
teleprompter = BootstrapFewShotWithRandomSearch(
    metric=YOUR_METRIC_HERE,
    **config
)

# Compile your DSPy program with the optimizer
optimized_program = teleprompter.compile(
    YOUR_PROGRAM_HERE,
    trainset=YOUR_TRAINSET_HERE
)
```

In this example, we configure the optimizer with several parameters:
* <b><font size='3ptx'>max_bootstrapped_demos</font></b>: The maximum number of examples to generate (<font color='brown'>4 in this case</font>).
* <b><font size='3ptx'>max_labeled_demos</font></b>: The maximum number of examples to use from the training set (<font color='brown'>4 in this case</font>).
* <b><font size='3ptx'>num_candidate_programs</font></b>: The number of random programs to evaluate (<font color='brown'>10 in this case</font>).
* <b><font size='3ptx'>num_threads</font></b>: The number of threads to use for parallel evaluation (<font color='brown'>4 in this case</font>).

The random search process works as follows:
1. The optimizer creates multiple candidate programs, each with a different set of examples.
2. These candidates include the uncompiled program, a program optimized with `LabeledFewShot`, a program optimized with `BootstrapFewShot` using unshuffled examples, and `num_candidate_programs` programs optimized with `BootstrapFewShot` using randomized example sets.
3. It evaluates all these candidates on your validation set using your metric function.
4. It returns the candidate program that performs best according to your metric.
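
The search loop described above can be sketched in a few lines of plain Python. Everything here is a toy stand-in rather than DSPy's code: `random_search`, `build_program`, and the numeric "small/large" task are hypothetical, and the sketch uses only two baseline candidates where the description above lists three.

```python
import random


def random_search(demo_pool, valset, build_program, metric, num_candidates=5, k=2, seed=0):
    """Try several random demo subsets and return the best-scoring one."""
    rng = random.Random(seed)
    # Baselines first: no demos, then the first k demos in their original order.
    candidates = [[], demo_pool[:k]]
    candidates += [rng.sample(demo_pool, k) for _ in range(num_candidates)]

    best_demos, best_score = None, -1.0
    for demos in candidates:
        program = build_program(demos)
        score = sum(metric(ex, program(ex["x"])) for ex in valset) / len(valset)
        if score > best_score:  # strict '>' keeps the earliest best candidate
            best_demos, best_score = demos, score
    return best_demos, best_score


def build_program(demos):
    """Toy 'program': answer with the label of the numerically closest demo."""
    def program(x):
        if not demos:
            return "unknown"
        nearest = min(demos, key=lambda d: abs(d[0] - x))
        return nearest[1]
    return program


def label_metric(example, prediction):
    return example["y"] == prediction


demo_pool = [(1, "small"), (9, "large"), (2, "small"), (10, "large")]
valset = [{"x": 0, "y": "small"}, {"x": 11, "y": "large"}]

best_demos, best_score = random_search(demo_pool, valset, build_program, label_metric)
print(best_demos, best_score)
```

By the description in step 2, DSPy evaluates three baseline candidates plus `num_candidate_programs` randomized ones, so `num_candidate_programs=5` would give 8 candidates in total.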

Parallelizing evaluation with `num_threads` can significantly speed up optimization, especially when many candidate programs must be scored. Let's walk through a working example:

#### <b>Example</b>

In [65]:
import dspy
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

# 1. Define the Signature (Input -> Output)
class TicketTriage(dspy.Signature):
    """Classify the severity of a software issue based on the description."""
    
    issue_description = dspy.InputField(desc="The raw text from the user ticket")
    severity = dspy.OutputField(desc="Severity level: High, Medium, or Low")
    reasoning = dspy.OutputField(desc="Brief explanation for the classification")

In [66]:
# 2. Create a small labeled dataset (Train & Dev)
# In a real scenario, you'd want 10-50 examples here.
trainset = [
    dspy.Example(issue_description="The entire system is down and no one can login.", severity="High").with_inputs('issue_description'),
    dspy.Example(issue_description="There is a typo in the footer of the dashboard.", severity="Low").with_inputs('issue_description'),
    dspy.Example(issue_description="The reports are taking 5 minutes to load instead of 10 seconds.", severity="Medium").with_inputs('issue_description'),
    dspy.Example(issue_description="Data loss occurred when clicking save.", severity="High").with_inputs('issue_description'),
]

devset = [
    dspy.Example(issue_description="The color of the 'Submit' button is slightly off.", severity="Low").with_inputs('issue_description'),
    dspy.Example(issue_description="API returns 500 error for all POST requests.", severity="High").with_inputs('issue_description'),
]

In [67]:
# 3. Define the Module (ChainOfThought is usually best for reasoning tasks)
module = dspy.ChainOfThought(TicketTriage)

In [68]:
# 4. Define the Metric
# We check if the predicted severity matches the actual severity
def validate_severity(example, prediction, trace=None):
    # Normalize strings for comparison (case-insensitive)
    return example.severity.lower() == prediction.severity.lower()
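
Language models sometimes wrap a label in whitespace, punctuation, or Markdown emphasis, which would make the comparison above fail. A more forgiving variant is sketched below; the names `normalize_severity` and `validate_severity_lenient` are hypothetical, and `SimpleNamespace` stands in for DSPy's example and prediction objects.

```python
from types import SimpleNamespace

VALID_LEVELS = {"high", "medium", "low"}


def normalize_severity(text):
    """Lowercase and strip whitespace plus common wrapping punctuation."""
    return text.strip().strip("*.`\"' ").lower()


def validate_severity_lenient(example, prediction, trace=None):
    predicted = normalize_severity(prediction.severity)
    # Reject anything outside the three expected labels.
    if predicted not in VALID_LEVELS:
        return False
    return predicted == normalize_severity(example.severity)


gold = SimpleNamespace(severity="High")
noisy = SimpleNamespace(severity=" **High**.\n")
print(validate_severity_lenient(gold, noisy))
```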

In [69]:
# 5. Initialize the Optimizer
print("Compiling...")
optimizer = BootstrapFewShotWithRandomSearch(
    metric=validate_severity,
    max_bootstrapped_demos=4,    # How many examples to generate total
    max_labeled_demos=4,         # How many real examples to use
    num_candidate_programs=5,    # How many different prompts to try testing
    num_threads=4                # Parallel threads for speed
)

Compiling...
Going to sample between 1 and 4 traces per predictor.
Will attempt to bootstrap 5 candidate sets.
CPU times: user 309 μs, sys: 95 μs, total: 404 μs
Wall time: 375 μs


In [70]:
%%time
# 6. Compile (Optimize) the module
# This will run the random search over your trainset
compiled_module = optimizer.compile(student=module, trainset=trainset, valset=devset)

Average Metric: 2.00 / 2 (100.0%): 100%|██████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.09s/it]

2025/12/14 05:22:44 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)



New best score: 100.0 for seed -3
Scores so far: [100.0]
Best score so far: 100.0
Average Metric: 2.00 / 2 (100.0%): 100%|██████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.24s/it]

2025/12/14 05:22:46 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)



Scores so far: [100.0, 100.0]
Best score so far: 100.0


100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:07<00:00,  1.98s/it]


Bootstrapped 4 full traces after 3 examples for up to 1 rounds, amounting to 4 attempts.
Average Metric: 2.00 / 2 (100.0%): 100%|██████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.25s/it]

2025/12/14 05:22:56 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)



Scores so far: [100.0, 100.0, 100.0]
Best score so far: 100.0


100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:05<00:00,  1.49s/it]


Bootstrapped 4 full traces after 3 examples for up to 1 rounds, amounting to 4 attempts.
Average Metric: 2.00 / 2 (100.0%): 100%|██████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.39s/it]

2025/12/14 05:23:05 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)



Scores so far: [100.0, 100.0, 100.0, 100.0]
Best score so far: 100.0


 50%|████████████████████████████████████████████████████████████▌                                                            | 2/4 [00:01<00:01,  1.35it/s]


Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Average Metric: 2.00 / 2 (100.0%): 100%|██████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.14it/s]

2025/12/14 05:23:08 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)



Scores so far: [100.0, 100.0, 100.0, 100.0, 100.0]
Best score so far: 100.0


 25%|██████████████████████████████▎                                                                                          | 1/4 [00:01<00:05,  1.84s/it]


Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Average Metric: 2.00 / 2 (100.0%): 100%|██████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.13s/it]

2025/12/14 05:23:13 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)



Scores so far: [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
Best score so far: 100.0


 50%|████████████████████████████████████████████████████████████▌                                                            | 2/4 [00:00<00:00, 35.50it/s]


Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Average Metric: 2.00 / 2 (100.0%): 100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 1734.62it/s]

2025/12/14 05:23:13 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)



Scores so far: [100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
Best score so far: 100.0


 50%|████████████████████████████████████████████████████████████▌                                                            | 2/4 [00:03<00:03,  1.92s/it]


Bootstrapped 2 full traces after 2 examples for up to 1 rounds, amounting to 2 attempts.
Average Metric: 2.00 / 2 (100.0%): 100%|██████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00,  1.11s/it]

2025/12/14 05:23:19 INFO dspy.evaluate.evaluate: Average Metric: 2 / 2 (100.0%)



Scores so far: [100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
Best score so far: 100.0
8 candidate programs found.
CPU times: user 812 ms, sys: 76 ms, total: 888 ms
Wall time: 37.4 s


In [71]:
# 7. Test it out
print("\n--- Testing Optimized Module ---")
test_issue = "I can't export the CSV file, it just does nothing when I click."
pred = compiled_module(issue_description=test_issue)

print(f"Issue: {test_issue}")
print(f"Predicted Severity: {pred.severity}")
print(f"Reasoning: {pred.reasoning}")


--- Testing Optimized Module ---
Issue: I can't export the CSV file, it just does nothing when I click.
Predicted Severity: High
Reasoning: The user is completely blocked from exporting data, which is a core functionality and impacts their ability to use or analyze data outside the application. There is no workaround mentioned.


#### <b>When to use this vs. standard `BootstrapFewShot`</b>

<b>The main advantage of [BootstrapFewShotWithRandomSearch](https://dspy.ai/api/optimizers/BootstrapFewShotWithRandomSearch/?h=bootstrapfewshotwithrandomsearch) over [BootstrapFewShot](https://dspy.ai/api/optimizers/BootstrapFewShot/?h=bootstrapfewshot) is that it explores a larger space of possible example combinations, increasing the chances of finding a particularly effective set</b>. However, this comes at <b><font color='red'>the cost of increased computation time</font></b>, as it needs to evaluate multiple candidate programs.

### <b><font color='darkgreen'>KNNFewShot: Context-Aware Example Selection</font></b>
<font size='3ptx'>The final few-shot optimizer we'll explore is [**KNNFewShot**](https://dspy.ai/api/optimizers/KNNFewShot/?h=knnfewshot), which takes a different approach by selecting examples based on their similarity to the current input. <b>This is particularly useful when the relevance of examples varies significantly depending on the input.</b></font>

Here's how to implement [**KNNFewShot**](https://dspy.ai/api/optimizers/KNNFewShot/?h=knnfewshot):

```python
from sentence_transformers import SentenceTransformer
from dspy import Embedder
from dspy.teleprompt import KNNFewShot
from dspy import ChainOfThought

# Create an embedder using SentenceTransformer
embedder = Embedder(SentenceTransformer("all-MiniLM-L6-v2").encode)

# Create the optimizer
knn_optimizer = KNNFewShot(
    k=3,
    trainset=trainset,
    vectorizer=embedder
)

# Compile your DSPy program with the optimizer
qa_compiled = knn_optimizer.compile(
    student=ChainOfThought("question -> answer")
)
```

In this example, we first create an [**Embedder**](https://dspy.ai/api/models/Embedder/?h=embedder) using a pre-trained `SentenceTransformer` model. This `embedder` converts text into vector representations that capture semantic meaning. Then, we create a [**KNNFewShot**](https://dspy.ai/api/optimizers/KNNFewShot/?h=knnfewshot) optimizer with several parameters:
* <b><font size='3ptx'>k</font></b>: The number of nearest neighbors (examples) to include in each prompt (3 in this case).
* <b><font size='3ptx'>trainset</font></b>: Your training set of examples.
* <b><font size='3ptx'>vectorizer</font></b>: The embedder that converts text to vectors for similarity comparison.

The KNN ([k-nearest neighbors](https://www.geeksforgeeks.org/machine-learning/k-nearest-neighbours/)) process works as follows:
1. When your program receives a new input, the optimizer converts it to a vector using the embedder.
2. It compares this vector to the vectors of all examples in your training set.
3. It selects the k examples that are most similar to the current input.
4. It includes these examples in the prompt sent to the language model.
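
The retrieval steps above can be illustrated without any embedding model. The sketch below is an assumption-laden toy: `embed` is a bag-of-words counter rather than a sentence embedding, and `nearest_examples` is a hypothetical helper, not part of DSPy.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a real setup would use a sentence embedding model."""
    return Counter(text.lower().replace(".", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_examples(query, examples, k=2):
    """Return the k examples whose text is most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(examples, key=lambda ex: cosine(query_vec, embed(ex)), reverse=True)
    return ranked[:k]


tickets = [
    "System crash on startup.",
    "Typo in welcome message.",
    "Font size too small on settings page.",
    "Payment gateway timeout.",
]
top = nearest_examples("App crash on startup after update", tickets, k=1)
print(top)
```

The selected tickets would then be placed in the prompt as demonstrations, so a crash report sees crash-related examples while a cosmetic report sees cosmetic ones.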

Let's look at a working example:

#### <b>Example</b>

In [82]:
#!pip install sentence_transformers
!pip freeze | grep 'sentence-transformers'

sentence-transformers==5.2.0


In [80]:
import dspy
from dspy.teleprompt import KNNFewShot
from sentence_transformers import SentenceTransformer
from dspy import Embedder

# 1. Define the Signature (Input -> Output)
class TicketTriage(dspy.Signature):
    """Classify the severity of a software issue."""
    issue_description = dspy.InputField(desc="The raw text from the user ticket")
    severity = dspy.OutputField(desc="Severity level: High, Medium, or Low")
    reasoning = dspy.OutputField(desc="Brief explanation for the classification")


In [95]:
# 2. Create a small labeled dataset (Train & Dev)
# KNN shines when it has a "library" of examples to choose from.
trainset = [
    dspy.Example(issue_description="System crash on startup.", severity="High", reasoning="Prevents use.").with_inputs('issue_description'),
    dspy.Example(issue_description="Typo in 'Welcome' message.", severity="Low", reasoning="Cosmetic only.").with_inputs('issue_description'),
    dspy.Example(issue_description="SQL injection vulnerability found.", severity="High", reasoning="Security risk.").with_inputs('issue_description'),
    dspy.Example(issue_description="Blue button looks purple on mobile.", severity="Low", reasoning="Cosmetic UI.").with_inputs('issue_description'),
    dspy.Example(issue_description="Export to PDF fails with error 404.", severity="Medium", reasoning="Feature broken but system works.").with_inputs('issue_description'),
    dspy.Example(issue_description="Payment gateway timeout.", severity="High", reasoning="Revenue impact.").with_inputs('issue_description'),
    dspy.Example(issue_description="Font size too small on settings page.", severity="Low", reasoning="Accessibility/Cosmetic.").with_inputs('issue_description'),
    dspy.Example(issue_description="Search bar takes 10 seconds to respond.", severity="Medium", reasoning="Performance degradation.").with_inputs('issue_description'),
]

In [96]:
# 3. Define the Module (ChainOfThought is usually best for reasoning tasks)
module = dspy.ChainOfThought(TicketTriage)

In [97]:
# 4. Initialize the Optimizer
# k: The number of similar examples to fetch for each query
# Create an embedder using SentenceTransformer
embedder = Embedder(SentenceTransformer("all-MiniLM-L6-v2").encode)

optimizer = KNNFewShot(k=3, trainset=trainset, vectorizer=embedder)

In [99]:
%%time
# 5. Compile the module with the optimizer
print("Compiling...")
compiled_module = optimizer.compile(module)

Compiling...
CPU times: user 308 μs, sys: 0 ns, total: 308 μs
Wall time: 302 μs


In [103]:
# 6. Test it out
print("\n--- Testing KNN Module ---")

# Case A: A security/crash issue -> Should pull 'System crash' or 'SQL injection' examples
print(">> Input: 'User password exposed in logs'")
pred_a = compiled_module(issue_description="User password exposed in plain text logs.")
print(f"Predicted: {pred_a.severity}\n")


--- Testing KNN Module ---
>> Input: 'User password exposed in logs'


100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 13.15it/s]

Bootstrapped 3 full traces after 2 examples for up to 1 rounds, amounting to 3 attempts.
Predicted: High






In [104]:
# Check last interaction from history
lm.inspect_history(n=1)





[2025-12-14T05:46:40.473761]

System message:

Your input fields are:
1. `issue_description` (str): The raw text from the user ticket
Your output fields are:
1. `reasoning` (str): Brief explanation for the classification
2. `severity` (str): Severity level: High, Medium, or Low
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## issue_description ## ]]
{issue_description}

[[ ## reasoning ## ]]
{reasoning}

[[ ## severity ## ]]
{severity}

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        Classify the severity of a software issue.


User message:

[[ ## issue_description ## ]]
Typo in 'Welcome' message.


Assistant message:

[[ ## reasoning ## ]]
Cosmetic issue, does not affect functionality.

[[ ## severity ## ]]
Low

[[ ## completed ## ]]


User message:

[[ ## issue_description ## ]]
SQL injection vulnerability found.


Assistant message:

[[ #

In [105]:
# Case B: A UI issue -> Should pull 'Typo' or 'Blue button' examples
print(">> Input: 'The header icon is misaligned'")
pred_b = compiled_module(issue_description="The header icon is misaligned by 5 pixels.")
print(f"Predicted: {pred_b.severity}")

>> Input: 'The header icon is misaligned'


100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 22.99it/s]

Bootstrapped 3 full traces after 2 examples for up to 1 rounds, amounting to 3 attempts.
Predicted: Low





In [106]:
# Check last interaction from history
lm.inspect_history(n=1)





[2025-12-14T05:46:58.276767]

System message:

Your input fields are:
1. `issue_description` (str): The raw text from the user ticket
Your output fields are:
1. `reasoning` (str): Brief explanation for the classification
2. `severity` (str): Severity level: High, Medium, or Low
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## issue_description ## ]]
{issue_description}

[[ ## reasoning ## ]]
{reasoning}

[[ ## severity ## ]]
{severity}

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        Classify the severity of a software issue.


User message:

[[ ## issue_description ## ]]
Font size too small on settings page.


Assistant message:

[[ ## reasoning ## ]]
Cosmetic UI issue, does not block functionality.

[[ ## severity ## ]]
Low

[[ ## completed ## ]]


User message:

[[ ## issue_description ## ]]
Blue button looks purple on mobile.


Assistant mess

#### <b>When to use</b>
<b>The main advantage of [**KNNFewShot**](https://dspy.ai/api/optimizers/KNNFewShot/?h=knnfewshot) over the other optimizers is that it dynamically selects examples based on their relevance to the current input</b>. This can lead to better performance, especially when different types of inputs benefit from different types of examples. However, it <b><font color='red'>requires an additional step of computing embeddings, which adds some computational overhead</font></b>.

## <b><font color='darkblue'>Advantages of DSPy</font></b>
DSPy offers the following advantages:

* <b><font size='3ptx'>Declarative Programming</font></b>: Allows developers to specify desired outcomes without detailing the implementation steps.
* <b><font size='3ptx'>Modularity</font></b>: Encourages the creation of reusable components for building complex workflows.
* <b><font size='3ptx'>Automatic Optimization</font></b>: Enhances performance by fine-tuning prompts and configurations without manual intervention.
* <b><font size='3ptx'>Self-Improvement</font></b>: Continuously refines workflows based on feedback, leading to better results over time.
* <b><font size='3ptx'>Scalability</font></b>: Efficiently manages workflows of varying complexity and size.
* <b><font size='3ptx'>Easy Integration</font></b>: Seamlessly incorporates into existing systems and applications.
* <b><font size='3ptx'>Continuous Monitoring</font></b>: Provides tools to track and maintain workflow performance.

## <b><font color='darkblue'>Supplement</font></b>
* [ithelp - (Day 27) Farewell to Prompt Engineering: How DSPy Is Revolutionizing Large Language Model Application Development](https://ithelp.ithome.com.tw/m/articles/10348919)
* [CodeSignal - Introduction to Optimization with DSPy](https://codesignal.com/learn/courses/how-to-optimize-with-dspy/lessons/introduction-to-optimization-with-dspy)