# DSPy Introduction

## Table of Contents

- [Concepts](#concepts)
- [Building Blocks](#building-blocks)
    - [Language Models](#language-models)
    - [Signatures](#signatures)
    - [Modules](#modules)
    - [Data](#data)
    - [Metrics](#metrics)
    - [Optimizers](#optimizers)
    - [Assertions](#assertions)
    - [Type Predictors](#type-predictors)
- [Workflow](#workflow)
- [Examples](#examples)
- [Roadmap](#roadmap)
- [References](#references)


## Concepts

## Building Blocks

In [26]:
import dspy
from dotenv import load_dotenv

load_dotenv()

True

### Language Models

Notes:
- Earlier versions of DSPy involved tens of clients for different LM providers. These are deprecated and will be removed in DSPy 2.6. Starting from 2.5, use `dspy.LM` instead, which uses litellm under the hood.
- Inspecting history: each LM object records its calls in `lm.history` (see the examples below).
- Adapters
    - DSPy 2.5 introduces **Adapters** as a layer between Signatures and LMs, responsible for formatting these pieces (Signature I/O fields, instructions, and examples) as well as generating and parsing the outputs.
- Using `dspy.configure` and `dspy.context` is thread-safe!
- By default, LMs in DSPy are cached: if you repeat the same call, you get the same outputs. You can turn caching off by passing `cache=False` when creating the `dspy.LM` object.
- Any OpenAI-compatible endpoint can be used as well, by prefixing the model name with `openai/` and pointing `api_base` at the server.
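
The caching note above can be pictured with a toy memoizing wrapper. This is only a sketch of the idea, not DSPy's actual cache: `ToyCachedLM`, `fake_backend`, and the choice of cache key are hypothetical and exist purely for illustration.

```python
import json
from typing import Callable, Dict, List


class ToyCachedLM:
    """Hypothetical stand-in for an LM client with an optional response cache."""

    def __init__(self, backend: Callable[[str], str], cache: bool = True):
        self.backend = backend          # function that actually "calls the model"
        self.cache = cache              # mirrors the idea behind `cache=False`
        self._store: Dict[str, List[str]] = {}
        self.backend_calls = 0          # how many times the backend was really hit

    def __call__(self, prompt: str, **kwargs) -> List[str]:
        # Key on the prompt plus the call parameters, so that a different
        # temperature or max_tokens counts as a different request.
        key = json.dumps({"prompt": prompt, **kwargs}, sort_keys=True)
        if self.cache and key in self._store:
            return self._store[key]
        self.backend_calls += 1
        outputs = [self.backend(prompt)]
        if self.cache:
            self._store[key] = outputs
        return outputs


def fake_backend(prompt: str) -> str:
    # Assumption: a deterministic fake model, so the sketch runs offline.
    return f"echo: {prompt}"


cached = ToyCachedLM(fake_backend)
cached("hello!")
cached("hello!")        # second call is served from the cache

uncached = ToyCachedLM(fake_backend, cache=False)
uncached("hello!")
uncached("hello!")      # second call hits the backend again

print(cached.backend_calls, uncached.backend_calls)
```

With a real, non-deterministic model the practical difference is that the cached client keeps returning the first answer, while the uncached one may return a new answer each time.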



References:
- documentation: https://dspy-docs.vercel.app/building-blocks/1-language_models/
- source code: https://github.com/stanfordnlp/dspy/blob/main/dspy/clients/lm.py


#### Examples

setting up LLM

In [27]:
lm = dspy.LM(model="gpt-4o-mini")
dspy.configure(lm=lm)

directly calling the LLM (not recommended)

In [28]:
lm("hello!")

['Hello! How can I assist you today?']

In [29]:
# for chat LLMs
lm(messages=[{"role": "system", "content": "You are a helpful assistant."},
             {"role": "user", "content": "What is 2+2?"}])

['2 + 2 equals 4.']

using the LLM with DSPy signatures and modules

In [30]:
# Define a module (ChainOfThought) and assign it a signature (return an answer, given a question).
qa = dspy.ChainOfThought('question -> answer')

# Run with the default LM configured with `dspy.configure` above.
response = qa(question="How many floors are in the castle David Gregory inherited?")
print(response.answer)

Insufficient information to determine the number of floors in the castle David Gregory inherited.


using multiple LLMs at once

In [31]:
# Run with the default LM configured above, i.e. GPT-4o-mini
response = qa(question="How many floors are in the castle David Gregory inherited?")
print('gpt-4o-mini:', response.answer)

gpt_4o = dspy.LM(model='gpt-4o', max_tokens=300)

# Run with GPT-4o instead
with dspy.context(lm=gpt_4o):
    response = qa(question="How many floors are in the castle David Gregory inherited?")
    print('gpt-4o:', response.answer)

gpt-4o-mini: Insufficient information to determine the number of floors in the castle David Gregory inherited.
gpt-4o: Unknown


configuring LLM attributes

In [32]:
gpt_4o_mini = dspy.LM(
	'gpt-4o-mini',
	temperature=0.9,
	max_tokens=3000,
	stop=None,
	cache=False
)

using locally hosted LLMs

In [33]:
ollama_port = 11434 
ollama_url = f"http://localhost:{ollama_port}"
ollama_llm = dspy.LM(model="ollama/llama3.2:1b", api_base=ollama_url)

inspecting LLM output and usage metadata

In [34]:
len(lm.history)

4

In [35]:
for k, v in lm.history[-1].items():
    print(f"{k}: {v}")

prompt: None
messages: [{'role': 'system', 'content': 'Your input fields are:\n1. `question` (str)\n\nYour output fields are:\n1. `reasoning` (str)\n2. `answer` (str)\n\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\n[[ ## question ## ]]\n{question}\n\n[[ ## reasoning ## ]]\n{reasoning}\n\n[[ ## answer ## ]]\n{answer}\n\n[[ ## completed ## ]]\n\nIn adhering to this structure, your objective is: \n        Given the fields `question`, produce the fields `answer`.'}, {'role': 'user', 'content': '[[ ## question ## ]]\nHow many floors are in the castle David Gregory inherited?\n\nRespond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.'}]
kwargs: {'temperature': 0.0, 'max_tokens': 1000}
response: ModelResponse(id='chatcmpl-AP1gFYUpR7PAotKStTcGdeP89I0dM', choices=[Choices(finish_reason='stop', index=0, message=Messa
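
The messages in the history entry above use `[[ ## field ## ]]` markers to delimit fields. As a rough illustration of the parsing half of an adapter, the sketch below splits a completion written in that layout back into named fields. It is not DSPy's implementation; `parse_fields` and the sample completion are hypothetical.

```python
import re
from typing import Dict

# Matches header lines such as "[[ ## answer ## ]]".
FIELD_HEADER = re.compile(r"^\[\[ ## (\w+) ## \]\][ \t]*$", re.MULTILINE)


def parse_fields(completion: str) -> Dict[str, str]:
    """Split a marker-delimited completion into {field_name: value}."""
    fields: Dict[str, str] = {}
    matches = list(FIELD_HEADER.finditer(completion))
    for i, match in enumerate(matches):
        name = match.group(1)
        if name == "completed":     # end-of-response marker, carries no value
            break
        start = match.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(completion)
        fields[name] = completion[start:end].strip()
    return fields


# Hypothetical completion, written to follow the layout requested in the prompt.
sample = """[[ ## reasoning ## ]]
The question gives no details about the castle.

[[ ## answer ## ]]
Insufficient information.

[[ ## completed ## ]]
"""

parsed = parse_fields(sample)
print(parsed["answer"])
```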

#### Creating custom LLM class (Advanced)

Creating a custom LM class is quite straightforward in DSPy. You can inherit from the `dspy.LM` class or create a new class with a similar interface. The methods to implement or override are:

- `__init__`: Initialize the LM with the given model and other keyword arguments.
- `__call__`: Call the LM with the given input prompt and return a list of string outputs.
- `inspect_history`: Print the history of interactions with the LM. This is optional but is needed by some optimizers in DSPy.

```python
import os
import dspy
import google.generativeai as genai

class GeminiLM(dspy.LM):
    def __init__(self, model, api_key=None, endpoint=None, **kwargs):
        # Prefer an explicitly passed key; otherwise fall back to the environment.
        genai.configure(api_key=api_key or os.environ["GEMINI_API_KEY"])

        self.endpoint = endpoint
        self.history = []

        super().__init__(model, **kwargs)
        self.model = genai.GenerativeModel(model)

    def __call__(self, prompt=None, messages=None, **kwargs):
        # Flatten chat messages into one text prompt; fall back to a bare `prompt`.
        messages = messages or [{"role": "user", "content": prompt}]
        prompt = '\n\n'.join([x['content'] for x in messages] + ['BEGIN RESPONSE:'])

        completions = self.model.generate_content(prompt)
        self.history.append({"prompt": prompt, "completions": completions})

        # Must return a list of strings
        return [completions.candidates[0].content.parts[0].text]

    def inspect_history(self):
        for interaction in self.history:
            print(f"Prompt: {interaction['prompt']} -> Completions: {interaction['completions']}")

lm = GeminiLM("gemini-1.5-flash", temperature=0)
dspy.configure(lm=lm)

qa = dspy.ChainOfThought("question->answer")
qa(question="What is the capital of France?")
```

#### TODO: Structured LLM output with Adapters (Advanced)

### Signatures

Notes:
- inline-based signature prompt creation
![](https://dspy-docs.vercel.app/deep-dive/signature/img/prompt_creation.png)
- class-based signature prompt creation
![](https://dspy-docs.vercel.app/deep-dive/signature/img/class_based_prompt_creation.png)

References:
- documentation
    - https://dspy-docs.vercel.app/building-blocks/2-signatures/
    - https://dspy-docs.vercel.app/deep-dive/signature/understanding-signatures/
    - https://dspy-docs.vercel.app/deep-dive/signature/executing-signatures/
- source code: https://github.com/stanfordnlp/dspy/tree/main/dspy/signatures


When we assign tasks to LMs in DSPy, we specify the behavior we need as a Signature.

A signature is a declarative specification of input/output behavior of a DSPy module. Signatures allow you to tell the LM what it needs to do, rather than specify how we should ask the LM to do it.

You're probably familiar with function signatures, which specify the input and output arguments and their types. DSPy Signatures are similar, with two differences:

- While typical function signatures just describe things, DSPy Signatures define and control the behavior of modules.
- The field names matter in DSPy Signatures. You express semantic roles in plain English: a `question` is different from an `answer`, a `sql_query` is different from `python_code`.

Why should I use a DSPy Signature?

tl;dr For modular and clean code, in which LM calls can be optimized into high-quality prompts (or automatic finetunes).

Long Answer: Most people coerce LMs to do tasks by hacking long, brittle prompts, or by collecting/generating data for fine-tuning.

Writing signatures is far more modular, adaptive, and reproducible than hacking at prompts or finetunes. The DSPy compiler will figure out how to build a highly-optimized prompt for your LM (or finetune your small LM) for your signature, on your data, and within your pipeline. In many cases, we found that compiling leads to better prompts than humans write. Not because DSPy optimizers are more creative than humans, but simply because they can try more things and tune the metrics directly.

#### Inline DSPy Signatures

Signatures can be defined as a short string, with argument names that define semantic roles for inputs/outputs.

1. Question Answering: `"question -> answer"`
2. Sentiment Classification: `"sentence -> sentiment"`
3. Summarization: `"document -> summary"`

Your signatures can also have multiple input/output fields.

1. Retrieval-Augmented Question Answering: `"context, question -> answer"`
2. Multiple-Choice Question Answering with Reasoning: `"question, choices -> reasoning, selection"`
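
To make the string format concrete, here is a minimal sketch of splitting an inline signature into input and output field names. DSPy does its own parsing internally; `split_signature` is a hypothetical helper written only for illustration, and it handles plain field names only.

```python
from typing import List, Tuple


def split_signature(signature: str) -> Tuple[List[str], List[str]]:
    """Split "a, b -> c" into (["a", "b"], ["c"])."""
    if signature.count("->") != 1:
        raise ValueError(f"expected exactly one '->' in {signature!r}")
    left, right = signature.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    for name in inputs + outputs:
        # Field names are expected to be valid variable names.
        if not name.isidentifier():
            raise ValueError(f"invalid field name: {name!r}")
    return inputs, outputs


print(split_signature("context, question -> answer"))
print(split_signature("question, choices -> reasoning, selection"))
```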

Tip: For fields, any valid variable names work! Field names should be semantically meaningful, but start simple and don't prematurely optimize keywords! Leave that kind of hacking to the DSPy compiler. For example, for summarization, it's probably fine to say `"document -> summary"`, `"text -> gist"`, or `"long_context -> tldr"`.

Notes:
- Many DSPy modules (except `dspy.Predict`) return auxiliary information by expanding your signature under the hood. For example, `dspy.ChainOfThought` adds a `reasoning` field (named `rationale` in earlier versions) that holds the LM's reasoning before it generates the requested output.
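
The history output in the Language Models section lists `reasoning` ahead of `answer` among the output fields, even though the signature was only `'question -> answer'`. A rough way to picture that expansion at the string level is sketched below. `add_reasoning_field` is a hypothetical helper, not how `dspy.ChainOfThought` is actually implemented.

```python
def add_reasoning_field(signature: str, field: str = "reasoning") -> str:
    """Return the signature with an extra output field placed first."""
    inputs, outputs = [part.strip() for part in signature.split("->")]
    output_names = [name.strip() for name in outputs.split(",")]
    if field not in output_names:
        # The extra field goes before the requested outputs, so the model
        # writes its reasoning before committing to an answer.
        output_names.insert(0, field)
    return f"{inputs} -> {', '.join(output_names)}"


print(add_reasoning_field("document -> summary"))
```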

In [37]:
sentence = "it's a charming and often affecting journey."  # example from the SST-2 dataset.

classify = dspy.Predict('sentence -> sentiment')
classify(sentence=sentence).sentiment

'positive'

In [38]:
# Example from the XSum dataset.
document = """The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season. Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. The length of Lee's contract with the promoted Tykes has not been revealed. Find all the latest football transfers on our dedicated page."""

summarize = dspy.ChainOfThought('document -> summary')
response = summarize(document=document)

print(response.summary)

Lee, a 21-year-old footballer, made seven appearances for the Hammers, scoring once in a Europa League match. He had loan spells at Blackpool and Colchester United, scoring twice for Colchester, but could not prevent their relegation. His contract details with the Tykes remain undisclosed.


#### Class-based DSPy Signatures

For some advanced tasks, you need more verbose signatures. This is typically to:

1. Clarify something about the nature of the task (expressed below as a docstring).
2. Supply hints on the nature of an input field, expressed as a `desc` keyword argument for `dspy.InputField`.
3. Supply constraints on an output field, expressed as a `desc` keyword argument for `dspy.OutputField`.

Tips:
- There's nothing wrong with specifying your requests to the LM more clearly. Class-based Signatures help you with that. However, don't prematurely tune the keywords of your signature by hand. The DSPy optimizers will likely do a better job (and will transfer better across LMs).
- How `Predict` works:
    - https://dspy-docs.vercel.app/deep-dive/signature/executing-signatures/#how-predict-works
    - source code: https://github.com/stanfordnlp/dspy/blob/main/dspy/predict/predict.py

In [39]:
class Emotion(dspy.Signature):
    """Classify emotion among sadness, joy, love, anger, fear, surprise."""

    sentence = dspy.InputField()
    sentiment = dspy.OutputField()

sentence = "i started feeling a little vulnerable when the giant spotlight started blinding me"  # from dair-ai/emotion

classify = dspy.Predict(Emotion)
classify(sentence=sentence)

Prediction(
    sentiment='fear'
)
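
To see how a docstring and field descriptions might end up in a prompt, the sketch below renders a system message loosely following the layout of the history entry shown in the Language Models section. It is a simplification, not DSPy's adapter code: `FieldSpec`, `SignatureSpec`, and `render_system_message` are hypothetical names, the `desc` value is made up, and where a `desc` appears in the rendered text is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldSpec:
    name: str
    desc: str = ""              # optional hint or constraint for the field


@dataclass
class SignatureSpec:
    instructions: str           # plays the role of the class docstring
    inputs: List[FieldSpec] = field(default_factory=list)
    outputs: List[FieldSpec] = field(default_factory=list)


def _numbered(specs: List[FieldSpec]) -> str:
    lines = []
    for i, spec in enumerate(specs, start=1):
        line = f"{i}. `{spec.name}` (str)"
        if spec.desc:
            line += f": {spec.desc}"    # assumption: where a desc would be shown
        lines.append(line)
    return "\n".join(lines)


def render_system_message(sig: SignatureSpec) -> str:
    return (
        f"Your input fields are:\n{_numbered(sig.inputs)}\n\n"
        f"Your output fields are:\n{_numbered(sig.outputs)}\n\n"
        f"In adhering to this structure, your objective is: {sig.instructions}"
    )


emotion = SignatureSpec(
    instructions="Classify emotion among sadness, joy, love, anger, fear, surprise.",
    inputs=[FieldSpec("sentence")],
    outputs=[FieldSpec("sentiment", desc="one of the six listed emotions")],
)
message = render_system_message(emotion)
print(message)
```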

#### Using signatures to build modules & compiling them

While signatures are convenient for prototyping with structured inputs/outputs, that's not the main reason to use them!

You should compose multiple signatures into bigger DSPy modules and compile these modules into optimized prompts and finetunes.

### Modules

### Data

### Metrics

### Optimizers

### Assertions

### Type Predictors

## Workflow

## Examples

## Roadmap

## References

- Documentation: https://dspy-docs.vercel.app/intro/
- GitHub: https://github.com/stanfordnlp/dspy
- Introduction by Author
    - Video: https://www.youtube.com/live/JEMYuzrKLUw?si=iwAzhwobN52zgIZ_
    - Slides: https://llmagents-learning.org/slides/dspy_lec.pdf