# dspy methodology 101

1. programming
   1. LMs (tasks)
   2. signatures (I/O spec, e.g. `"context: list[str], question: str -> answer: str"`) - compiled prompts often end up better than hand-written ones
      1. declare the task: tell the model what it needs to do
      2. the underlying DSPy compiler (optimizer) does the prompt tuning, so you don't maintain brittle hand-written prompts
   3. modules (e.g. `dspy.Predict`, `dspy.ChainOfThought`)
      1. each module wraps a prompting technique
2. evaluation
3. optimization

## TOC:
* [intro](#dspy-methodology-101)
* [LMs](#set-a-generator-lm)
* [signatures](#testing-with-dspy-signatures)
* [class-based signatures](#class-based-dspy-signatures)

## set a generator LM

<a class="anchor" id="LM"></a>

In [1]:
import dspy
import os

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
TOGETHER_API_KEY = os.getenv("TOGETHER_API_KEY")

if OPENAI_API_KEY:
    print("OPENAI API Key found!")
else:
    print("OPENAI API Key not found!")
if ANTHROPIC_API_KEY:
    print("ANTHROPIC API Key found!")
else:
    print("ANTHROPIC API Key not found!")

lm = dspy.LM('openai/gpt-4o-mini', temperature=0.9, max_tokens=3000, stop=None, cache=False, api_key=OPENAI_API_KEY)
dspy.configure(lm=lm)

OPENAI API Key found!
ANTHROPIC API Key found!


## testing with dspy signatures

- A signature is a declarative specification of input/output behavior of a DSPy module. Signatures allow you to tell the LM what it needs to do, rather than specify how we should ask the LM to do it.
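- The underlying DSPy compiler (optimizer) tunes the prompt for you, instead of you maintaining brittle hand-written prompts; some of the optimizers use `optuna` for the search.
- A signature also defines the argument names and types that a module accepts and returns.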
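To make the string form concrete, here is a rough sketch of how an inline signature such as `"context: list[str], question: str -> answer: str"` splits into named, typed inputs and outputs. This is plain Python for illustration only, not DSPy's own parser; `parse_signature` and `split_top_level` are hypothetical helpers, and defaulting untyped fields to `str` mirrors how DSPy treats them.

```python
def split_top_level(text: str) -> list[str]:
    """Split on commas that are not nested inside [...] (e.g. dict[str, int])."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_fields(part: str) -> dict[str, str]:
    """Map each field name to its type hint; untyped fields default to 'str'."""
    fields = {}
    for item in split_top_level(part):
        name, _, type_hint = item.partition(":")
        fields[name.strip()] = type_hint.strip() or "str"
    return fields


def parse_signature(spec: str) -> tuple[dict[str, str], dict[str, str]]:
    """Hypothetical helper: split 'inputs -> outputs' into two {name: type} dicts."""
    inputs_part, outputs_part = spec.split("->", 1)
    return parse_fields(inputs_part), parse_fields(outputs_part)


inputs, outputs = parse_signature("context: list[str], question: str -> answer: str")
print(inputs)
print(outputs)
```

The field names become the keyword arguments of the module call (`classify(sentence=...)`) and the attributes of the returned prediction (`.sentiment`).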

In [12]:
with dspy.context(lm=dspy.LM('together_ai/deepseek-ai/DeepSeek-R1', temperature=0.1, max_tokens=2500, stop=None, cache=False, api_key=TOGETHER_API_KEY)):
    sentence = "it's a charming and often affecting journey."
    classify = dspy.Predict('sentence -> sentiment: bool')
    # print inside the block: a call made after the `with` would use the default LM, not DeepSeek-R1
    print(classify(sentence=sentence).sentiment)

True
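
`dspy.context` overrides the LM only inside the `with` block; calls made after the block fall back to the LM set with `dspy.configure`. A toy stand-in for that scoping, in plain Python (not DSPy's implementation; `_settings`, `use_lm` and `current_lm` are hypothetical names):

```python
from contextlib import contextmanager

# Hypothetical global configuration, playing the role of dspy.configure(lm=...)
_settings = {"lm": "default-lm"}


@contextmanager
def use_lm(name: str):
    """Temporarily swap the active LM, restoring the previous one on exit."""
    previous = _settings["lm"]
    _settings["lm"] = name
    try:
        yield
    finally:
        _settings["lm"] = previous


def current_lm() -> str:
    """Return whichever LM a call would use right now."""
    return _settings["lm"]


with use_lm("temporary-lm"):
    inside = current_lm()   # the override is active here
outside = current_lm()      # the override is gone here

print(inside, outside)
```

So any call that should hit the temporary model has to sit inside the `with` block.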


In [15]:
document = """The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season. Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. The length of Lee's contract with the promoted Tykes has not been revealed. Find all the latest football transfers on our dedicated page."""

summarize = dspy.ChainOfThought('document -> summary')
response = summarize(document=document)

print(response.summary)
print("\nReasoning:", response.reasoning)

A 21-year-old football player made seven appearances for the Hammers, scoring once in a Europa League match. He spent last season on loan at Blackpool and Colchester United, where he scored two goals but could not prevent Colchester from relegation. The details of his new contract with the promoted Tykes are not disclosed.

Reasoning: The document discusses a young football player who has had a brief career with the Hammers and mentions his experiences during loan spells. It highlights his performance, including his goal-scoring and the challenges he faced with relegation. The mention of his new contract with the Tykes adds a current aspect to his career trajectory. Overall, the emphasis is on his recent activities and transitions in professional football.


In [20]:
from typing import Literal

class Emotion(dspy.Signature):
    """Classify emotion."""

    sentence: str = dspy.InputField()
    sentiment: Literal['sadness', 'joy', 'love', 'anger', 'fear', 'surprise'] = dspy.OutputField()

sentence = "i started feeling a little vulnerable when the giant spotlight started blinding me"

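# build the predictor from the class-based signature before calling it;
# otherwise the earlier 'sentence -> sentiment: bool' predictor is still in use
classify = dspy.Predict(Emotion)
print(classify(sentence=sentence).sentiment)

classify(sentence=sentence)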

fear


Prediction(
    sentiment='fear'
)

In [22]:
with dspy.context(lm = dspy.LM('ollama_chat/llama3.2', temperature=0.9, max_tokens=3000, stop=None, cache=False, api_base='http://localhost:11434', api_key='')):
    class CheckCitationFaithfulness(dspy.Signature):
        """Verify that the text is based on the provided context."""

        context: str = dspy.InputField(desc="facts here are assumed to be true")
        text: str = dspy.InputField()
        faithfulness: bool = dspy.OutputField()
        evidence: dict[str, list[str]] = dspy.OutputField(desc="Supporting evidence for claims")

    context = "The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season. Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. The length of Lee's contract with the promoted Tykes has not been revealed. Find all the latest football transfers on our dedicated page."

    text = "Lee scored 3 goals for Colchester United."

    faithfulness = dspy.ChainOfThought(CheckCitationFaithfulness)
    result = faithfulness(context=context, text=text)
    print(result)

Prediction(
    reasoning='The text claims that Lee scored 3 goals for Colchester United, but the context only mentions him scoring twice for them. This inconsistency suggests that the information may not be entirely accurate.',
    faithfulness=False,
    evidence={'additionalProperties': [], 'type': ['context', 'text']}
)
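
Note that the `evidence` value above has the declared `dict[str, list[str]]` shape but contains nothing quoted from the context, so a type check alone would not flag it. One possible sanity check, as a minimal sketch: treat each evidence string as a quote and confirm it appears in the context. `ungrounded_evidence` is a hypothetical helper, not part of DSPy, and case-insensitive substring matching is an assumption that will miss paraphrased evidence.

```python
def ungrounded_evidence(context: str, evidence: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (claim, snippet) pairs whose snippet is not found verbatim in the context."""
    haystack = context.lower()
    missing = []
    for claim, snippets in evidence.items():
        for snippet in snippets:
            if snippet.lower() not in haystack:
                missing.append((claim, snippet))
    return missing


# Shortened context, taken from the document used above
sample_context = "He scored twice for the U's but was unable to save them from relegation."

grounded = {"goals for Colchester": ["scored twice for the U's"]}
schema_like = {"additionalProperties": [], "type": ["context", "text"]}

print(ungrounded_evidence(sample_context, grounded))
print(ungrounded_evidence(sample_context, schema_like))
```

An empty result means every snippet was found in the context; anything else is worth inspecting before trusting the `faithfulness` flag.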


### Class-based DSPy Signatures

- Class-based signatures are the more advanced form: the docstring carries the detailed instructions, and each field can carry a type constraint and a `desc` hint.
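
A `Literal` type is one such constraint: it limits the output to a fixed set of labels. A small sketch of what that constraint means, using only the standard library (`EmotionLabel` and `is_valid_label` are hypothetical names for illustration; DSPy does its own parsing and validation of typed outputs):

```python
from typing import Literal, get_args

# Same label set as the Emotion signature's `sentiment` field
EmotionLabel = Literal['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']


def is_valid_label(value: str) -> bool:
    """Check whether a predicted label is one of the allowed Literal values."""
    return value in get_args(EmotionLabel)


print(is_valid_label("fear"))
print(is_valid_label("happy"))
```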

In [26]:
from typing import Literal

class Emotion(dspy.Signature):
    """Classify emotion."""

    sentence: str = dspy.InputField()
    sentiment: Literal['sadness', 'joy', 'love', 'anger', 'fear', 'surprise'] = dspy.OutputField()

sentence = "i started feeling a little vulnerable when the giant spotlight started blinding me"  # from dair-ai/emotion

classify = dspy.Predict(Emotion)
classify(sentence=sentence)
print(classify)

Predict(Emotion(sentence -> sentiment
    instructions='Classify emotion.'
    sentence = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Sentence:', 'desc': '${sentence}'})
    sentiment = Field(annotation=Literal['sadness', 'joy', 'love', 'anger', 'fear', 'surprise'] required=True json_schema_extra={'__dspy_field_type': 'output', 'prefix': 'Sentiment:', 'desc': '${sentiment}'})
))


In [28]:
class CheckCitationFaithfulness(dspy.Signature):
    """Verify that the text is based on the provided context."""

    context: str = dspy.InputField(desc="facts here are assumed to be true")
    text: str = dspy.InputField()
    faithfulness: bool = dspy.OutputField()
    evidence: dict[str, list[str]] = dspy.OutputField(desc="Supporting evidence for claims")

context = "The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season. Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. The length of Lee's contract with the promoted Tykes has not been revealed. Find all the latest football transfers on our dedicated page."

text = "Lee scored 3 goals for Colchester United."

faithfulness = dspy.ChainOfThought(CheckCitationFaithfulness)
faithfulness(context=context, text=text)
print(faithfulness)

predict = Predict(StringSignature(context, text -> reasoning, faithfulness, evidence
    instructions='Verify that the text is based on the provided context.'
    context = Field(annotation=str required=True json_schema_extra={'desc': 'facts here are assumed to be true', '__dspy_field_type': 'input', 'prefix': 'Context:'})
    text = Field(annotation=str required=True json_schema_extra={'__dspy_field_type': 'input', 'prefix': 'Text:', 'desc': '${text}'})
    reasoning = Field(annotation=str required=True json_schema_extra={'prefix': "Reasoning: Let's think step by step in order to", 'desc': '${reasoning}', '__dspy_field_type': 'output'})
    faithfulness = Field(annotation=bool required=True json_schema_extra={'__dspy_field_type': 'output', 'prefix': 'Faithfulness:', 'desc': '${faithfulness}'})
    evidence = Field(annotation=dict[str, list[str]] required=True json_schema_extra={'desc': 'Supporting evidence for claims', '__dspy_field_type': 'output', 'prefix': 'Evidence:'})
))


In [31]:
class DogPictureSignature(dspy.Signature):
    """Output the dog breed of the dog in the image."""
    image_1: dspy.Image = dspy.InputField(desc="An image of a dog")
    answer: str = dspy.OutputField(desc="The dog breed of the dog in the image")

image_url = "https://picsum.photos/id/237/200/300"
classify = dspy.Predict(DogPictureSignature)
classify(image_1=dspy.Image.from_url(image_url))


Prediction(
    answer='Labrador Retriever'
)

Next: combine multiple signatures into bigger DSPy [modules](./2modules.ipynb)