In [None]:
DSPy is a declarative framework for programming language models rather than prompting them by hand. Instead of writing prompt strings, you describe a task as a signature with typed input and output fields, pick a module that implements a prompting strategy, and let DSPy build the prompt and parse the reply.

Some of the key features of DSPy include:
- Signatures that declare the input and output fields of a task
- Modules such as `dspy.Predict`, `dspy.ChainOfThought`, and `dspy.ReAct`
- Typed outputs (`float`, `list[str]`, `Literal[...]`, Pydantic models) that are validated after each call
- `lm.inspect_history` for viewing the exact prompt sent to the model

The DSPy community shares use cases and examples from real-world applications:

https://dspy.ai/community/use-cases/

In [None]:
The examples below are adapted from https://github.com/gabrielvanderlei/DSPy-examples

In [None]:
import dspy

lm = dspy.LM('ollama_chat/llama3.2', api_base='http://localhost:11434', api_key='')
dspy.configure(lm=lm)

# Direct prompt
#response = lm("Say this is a test!", temperature=0.7)

# Chat format
#chat_response = lm(messages=[{"role": "user", "content": "Say this is a test!"}])

In [6]:

class ExtractInfo(dspy.Signature):
    """Extract structured information from text."""

    text: str = dspy.InputField()
    title: str = dspy.OutputField()
    headings: list[str] = dspy.OutputField()
    entities: list[dict[str, str]] = dspy.OutputField(desc="a list of entities and their metadata")

module = dspy.Predict(ExtractInfo)

# Trailing space keeps the two sentences separate after implicit concatenation.
text = "Apple Inc. announced its latest iPhone 14 today. " \
    "The CEO, Tim Cook, highlighted its new features in a press release."
response = module(text=text)

print(response.title)
print(response.headings)
print(response.entities)

Extracting Structured Information from Text
['Company', 'Feature', 'Person']
[{'name': 'Apple Inc.', 'type': 'Organization'}, {'name': 'iPhone 14', 'type': 'Product'}, {'name': 'Tim Cook', 'type': 'Person'}]


In [7]:
lm.inspect_history(n=1)

[2025-02-08T23:06:22.879790]

System message:

Your input fields are:
1. `text` (str)

Your output fields are:
1. `title` (str)
2. `headings` (list[str])
3. `entities` (list[dict[str, str]]): a list of entities and their metadata

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## text ## ]]
{text}

[[ ## title ## ]]
{title}

[[ ## headings ## ]]
{headings}        # note: the value you produce must be pareseable according to the following JSON schema: {"type": "array", "items": {"type": "string"}}

[[ ## entities ## ]]
{entities}        # note: the value you produce must be pareseable according to the following JSON schema: {"type": "array", "items": {"type": "object", "additionalProperties": {"type": "string"}}}

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Extract structured information from text.


User message:

[[ ## text ## ]]
Apple Inc. announced its latest iP
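
The system message above asks the model to wrap each field in `[[ ## name ## ]]` markers. As a rough illustration of how a reply in that layout could be split back into fields, here is a minimal sketch using only the standard library. The helper `split_fields` and the sample reply are hypothetical and are not DSPy's own parser.

```python
import re

# Hypothetical helper, not part of DSPy: split a completion that uses the
# "[[ ## name ## ]]" markers into a {field_name: raw_text} dictionary.
FIELD_MARKER = re.compile(r"\[\[ ## (\w+) ## \]\]")

def split_fields(completion: str) -> dict[str, str]:
    parts = FIELD_MARKER.split(completion)
    # parts = [text_before_first_marker, name1, body1, name2, body2, ...]
    names, bodies = parts[1::2], parts[2::2]
    return {
        name: body.strip()
        for name, body in zip(names, bodies)
        if name != "completed"  # the end marker carries no value
    }

# Made-up reply in the same layout as the prompt above.
sample = """[[ ## title ## ]]
Apple announces iPhone 14

[[ ## headings ## ]]
["Company", "Product"]

[[ ## completed ## ]]
"""
fields = split_fields(sample)
print(fields["title"])
print(fields["headings"])
```

The values come back as raw strings; typed fields such as `headings` would still need a further step (for example `json.loads`) to become Python objects.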

In [8]:
math = dspy.ChainOfThought("question -> answer: float")
math(question="Two dice are tossed. What is the probability that the sum equals two?")

Prediction(
    reasoning='To calculate the probability, we need to find all possible outcomes where the sum of the dice equals two. The only way this can happen is if both dice show a 1. There are 6 possible outcomes when rolling two dice (1,1), (1,2), (1,3), (1,4), (1,5), and (1,6). Only one of these outcomes has a sum of two. Therefore, the probability is 1/6.',
    answer=0.16666666666666666
)
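
Note that the model's reasoning in this output is wrong: two dice have 36 equally likely outcomes, not 6, and only (1, 1) sums to two, so the probability is 1/36 rather than 1/6. A quick enumeration with the standard library, independent of DSPy, confirms this:

```python
from fractions import Fraction
from itertools import product

# Every ordered pair of faces for two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Keep only the rolls whose faces add up to two.
favourable = [roll for roll in outcomes if sum(roll) == 2]

probability = Fraction(len(favourable), len(outcomes))
print(probability, float(probability))
```

A typed `float` output only guarantees the shape of the answer, not its correctness, so results like this one still need checking.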

In [9]:
lm.inspect_history(n=2)

[2025-02-08T23:07:28.295187]

System message:

Your input fields are:
1. `question` (str)

Your output fields are:
1. `reasoning` (str)
2. `answer` (float)

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## question ## ]]
{question}

[[ ## reasoning ## ]]
{reasoning}

[[ ## answer ## ]]
{answer}        # note: the value you produce must be a single float value

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Given the fields `question`, produce the fields `answer`.


User message:

[[ ## question ## ]]
Two dice are tossed. What is the probability that the sum equals two?

Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]` (must be formatted as a valid Python float), and then ending with the marker for `[[ ## completed ## ]]`.


Response:

[[ ## reasoning ## ]]
When two dice are tossed, 

In [10]:
def search_wikipedia(query: str) -> list[str]:
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]

rag = dspy.ChainOfThought('context, question -> response')

question = "What's the name of the castle that David Gregory inherited?"
rag(context=search_wikipedia(question), question=question)

Prediction(
    reasoning='The text mentions that David Gregory inherited Kinnairdy Castle in 1664.',
    response='Kinnairdy Castle.'
)

In [11]:
lm.inspect_history(n=2)

[2025-02-08T23:07:36.061621]

System message:

Your input fields are:
1. `question` (str)

Your output fields are:
1. `reasoning` (str)
2. `answer` (float)

All interactions will be structured in the following way, with the appropriate values filled in.

Inputs will have the following structure:

[[ ## question ## ]]
{question}

Outputs will be a JSON object with the following fields.

{
  "reasoning": "{reasoning}",
  "answer": "{answer}        # note: the value you produce must be a single float value"
}

In adhering to this structure, your objective is: 
        Given the fields `question`, produce the fields `answer`.


User message:

[[ ## question ## ]]
Two dice are tossed. What is the probability that the sum equals two?

Respond with a JSON object in the following order of fields: `reasoning`, then `answer` (must be formatted as a valid Python float).


Response:

{
  "reasoning": "To calculate the probability, we need to find all po
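
This history entry uses a JSON layout for the same dice question instead of the `[[ ## field ## ]]` markers. That suggests the call was retried with a JSON-based format, although this is an inference from the log rather than something the notebook states. A minimal sketch of reading such a reply with the standard library follows. The helper `parse_json_reply` and the sample reply are hypothetical.

```python
import json

def parse_json_reply(reply: str) -> dict:
    """Parse a JSON reply and coerce the `answer` field to a float."""
    data = json.loads(reply)
    # The model may return the number as a string, so convert explicitly.
    data["answer"] = float(data["answer"])
    return data

# Made-up reply in the shape requested by the prompt above.
sample_reply = '{"reasoning": "Only (1, 1) sums to two.", "answer": "0.0278"}'
parsed = parse_json_reply(sample_reply)
print(parsed["answer"])
```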

In [15]:
from typing import Literal

class Classify(dspy.Signature):
    """Classify sentiment of a given sentence."""

    sentence: str = dspy.InputField()
    sentiment: Literal['positive', 'negative', 'neutral'] = dspy.OutputField()
    confidence: float = dspy.OutputField()

classify = dspy.Predict(Classify)
classify(sentence="This book was super fun to read, though not the last chapter.")

Prediction(
    sentiment='neutral',
    confidence=0.75
)
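
The `Literal` annotation restricts `sentiment` to three labels. A standard-library sketch of the same kind of check is shown below. The helper `check_prediction` is hypothetical, and the assumption that `confidence` lies in [0, 1] is ours, since the signature only declares it as a `float`.

```python
from typing import Literal, get_args

Sentiment = Literal["positive", "negative", "neutral"]

def check_prediction(sentiment: str, confidence: float) -> bool:
    """Return True when the label is allowed and the confidence is in [0, 1]."""
    allowed = get_args(Sentiment)  # ('positive', 'negative', 'neutral')
    return sentiment in allowed and 0.0 <= confidence <= 1.0

print(check_prediction("neutral", 0.75))
print(check_prediction("mixed", 0.75))
```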

In [16]:
lm.inspect_history(n=1)

[2025-02-08T23:14:32.488538]

System message:

Your input fields are:
1. `sentence` (str)

Your output fields are:
1. `sentiment` (Literal['positive', 'negative', 'neutral'])
2. `confidence` (float)

All interactions will be structured in the following way, with the appropriate values filled in.

Inputs will have the following structure:

[[ ## sentence ## ]]
{sentence}

Outputs will be a JSON object with the following fields.

{
  "sentiment": "{sentiment}        # note: the value you produce must be one of: positive; negative; neutral",
  "confidence": "{confidence}        # note: the value you produce must be a single float value"
}

In adhering to this structure, your objective is: 
        Classify sentiment of a given sentence.


User message:

[[ ## sentence ## ]]
This book was super fun to read, though not the last chapter.

Respond with a JSON object in the following order of fields: `sentiment` (must be formatted as a valid Python Literal['posit

In [17]:
def evaluate_math(expression: str):
    return dspy.PythonInterpreter({}).execute(expression)

def search_wikipedia(query: str):
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]

react = dspy.ReAct("question -> answer: float", tools=[evaluate_math, search_wikipedia])

pred = react(question="What is 9362158 divided by the year of birth of David Gregory of Kinnairdy castle?")
print(pred.answer)

ValidationError: 1 validation error for float
  Input should be a valid number [type=float_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.10/v/float_type
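
The error reports `input_value=None`: the agent finished without producing an `answer`, so validation against `float` failed. One defensive pattern is to coerce the value yourself and treat a missing answer as `None`. The sketch below uses a hypothetical helper `coerce_answer`. It also computes a reference value under the assumption that the birth year is 1625, which this notebook never retrieved.

```python
from typing import Optional

def coerce_answer(value: object) -> Optional[float]:
    """Return value as a float, or None when it is missing or not numeric."""
    if value is None:
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

print(coerce_answer(None))    # the missing answer seen in the error above
print(coerce_answer("12.5"))  # a numeric string converts cleanly

# Reference value, ASSUMING the birth year is 1625 (not confirmed by this run).
assumed_birth_year = 1625
expected = 9362158 / assumed_birth_year
print(expected)
```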

In [18]:
class Outline(dspy.Signature):
    """Outline a thorough overview of a topic."""

    topic: str = dspy.InputField()
    title: str = dspy.OutputField()
    sections: list[str] = dspy.OutputField()
    section_subheadings: dict[str, list[str]] = dspy.OutputField(desc="mapping from section headings to subheadings")

class DraftSection(dspy.Signature):
    """Draft a top-level section of an article."""

    topic: str = dspy.InputField()
    section_heading: str = dspy.InputField()
    section_subheadings: list[str] = dspy.InputField()
    content: str = dspy.OutputField(desc="markdown-formatted section")

class DraftArticle(dspy.Module):
    def __init__(self):
        super().__init__()
        # One sub-module per stage: build the outline, then draft each section.
        self.build_outline = dspy.ChainOfThought(Outline)
        self.draft_section = dspy.ChainOfThought(DraftSection)

    def forward(self, topic):
        outline = self.build_outline(topic=topic)
        sections = []
        for heading, subheadings in outline.section_subheadings.items():
            # Pass the headings to the drafter already formatted as markdown.
            section_heading = f"## {heading}"
            section_subheadings = [f"### {subheading}" for subheading in subheadings]
            drafted = self.draft_section(
                topic=outline.title,
                section_heading=section_heading,
                section_subheadings=section_subheadings,
            )
            sections.append(drafted.content)
        return dspy.Prediction(title=outline.title, sections=sections)

draft_article = DraftArticle()
article = draft_article(topic="World Cup 2002")

In [19]:
print(article)

Prediction(
    title='2002 FIFA World Cup',
    sections=['### Teams that qualified\n* Argentina\n* Australia\n* Belgium\n* Brazil\n* Cameroon\n* Chile\n* Colombia\n* Costa Rica\n* Croatia\n* Czech Republic\n* Denmark\n* Egypt\n* England\n* France\n* Germany\n* Greece\n* Hungary\n* Iceland\n* Ireland\n* Italy\n* Jamaica\n* Japan\n* Mexico\n* Netherlands\n* New Zealand\n* Norway\n* Paraguay\n* Peru\n* Poland\n* Portugal\n* Romania\n* Russia\n* Saudi Arabia\n* Senegal\n* Serbia and Montenegro\n* South Africa\n* Spain\n* Sweden\n* Switzerland\n* Turkey\n\n### Qualifying process\nThe qualification process was divided into two stages. The first stage consisted of a series of group-stage matches, with the top teams from each group advancing to the second stage. The second stage was a knockout tournament, where teams played each other in a single-elimination format until the final.', '### Format of the tournament\nThe tournament featured two groups of 16 teams each, with the top two teams fr
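
The prediction holds a `title` and a list of markdown `sections`. To read the result as a single document, the pieces can be joined, as in this standard-library sketch. The helper `to_markdown` and the stub data are hypothetical stand-ins for the real `article` object.

```python
from types import SimpleNamespace

def to_markdown(title: str, sections: list[str]) -> str:
    """Join a title and drafted sections into one markdown document."""
    return "\n\n".join([f"# {title}", *(s.strip() for s in sections)])

# Stub with the same attribute names as the prediction printed above.
article_stub = SimpleNamespace(
    title="2002 FIFA World Cup",
    sections=[
        "### Teams that qualified\n* Argentina",
        "### Format of the tournament\nPlaceholder text.",
    ],
)
markdown = to_markdown(article_stub.title, article_stub.sections)
print(markdown)
```

With the real object, the call would presumably be `to_markdown(article.title, article.sections)`.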

In [20]:
lm.inspect_history(n=1)

[2025-02-08T23:25:48.421825]

System message:

Your input fields are:
1. `topic` (str)
2. `section_heading` (str)
3. `section_subheadings` (list[str])

Your output fields are:
1. `reasoning` (str)
2. `content` (str): markdown-formatted section

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## topic ## ]]
{topic}

[[ ## section_heading ## ]]
{section_heading}

[[ ## section_subheadings ## ]]
{section_subheadings}

[[ ## reasoning ## ]]
{reasoning}

[[ ## content ## ]]
{content}

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Draft a top-level section of an article.


User message:

[[ ## topic ## ]]
2002 FIFA World Cup

[[ ## section_heading ## ]]
## Legacy

[[ ## section_subheadings ## ]]
["### Impact of the tournament on football", "### Changes to the World Cup format"]

Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, t

In [None]:
code_string = "print('Hello'); 1 + 2"
interp = dspy.PythonInterpreter()
output = interp(code_string)
print(output)  # If final statement is non-None, prints the numeric result, else prints captured output
interp.shutdown()

The code below is adapted from https://www.youtube.com/watch?v=_ROckQHGHsU

In [None]:
from pydantic import BaseModel, Field

class Answer(BaseModel):
    country: str = Field()
    year: int = Field()

class QAList(dspy.Signature):
    """Given user's question, answer with a JSON readable python list"""
    question: str = dspy.InputField()
    answer_list: list[Answer] = dspy.OutputField()

question = "Generate a list of countries and the years of FIFA World Cup winners from 2002 to the present"

# dspy.TypedChainOfThought is not available in this DSPy release; calling it raises
# "AttributeError: module 'dspy' has no attribute 'TypedChainOfThought'".
# dspy.ChainOfThought accepts typed signatures directly, so it is used instead.
predict = dspy.ChainOfThought(QAList)

answer = predict(question=question)
lm.inspect_history(n=1)