<a href="https://colab.research.google.com/github/olanigan/agentic-framework/blob/main/tools/dspy/DSPy_Overview.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DSPy Zero-to-Hero


## Setup

In [None]:
!pip install -q dspy-ai
from google.colab import userdata
import os
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')

In [None]:
import dspy
lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)

In [None]:
lm(messages=[{"role": "user", "content": "Say oruko mi ni Bayo!"}])

['Bawo ni, Bayo! Bawo ni mo ṣe le ran ọ lọwọ loni?']

## Signatures

In [None]:
qa = dspy.Predict('question -> answer')
response = qa(question="Who is the governor of Lagos?")
print("Response: ", response.answer)

Response:  As of October 2023, the governor of Lagos State is Babajide Sanwo-Olu. He has been in office since May 29, 2019.


In [None]:
# Single Input & Output
# Named `summarize` to avoid shadowing Python's built-in `sum`
summarize = dspy.Predict('document -> summary')
document = """
Aucanquilcha is a massive stratovolcano located in the Antofagasta Region of northern Chile, just west of the border with Bolivia and within the Alto Loa National Reserve. Part of the Central Volcanic Zone of the Andes, the stratovolcano has the form of a ridge with a maximum height of 6,176 metres (20,262 ft). The volcano is embedded in a larger cluster of volcanoes known as the Aucanquilcha cluster. This cluster of volcanoes was formed in stages over eleven million years of activity with varying magma output, including lava domes and lava flows. Aucanquilcha volcano proper is formed from four units that erupted between 1.04–0.23 million years ago. During the ice ages, both the principal Aucanquilcha complex and the other volcanoes of the cluster were subject to glaciation, resulting in the formation of moraines and cirques.

The cluster has generated lava ranging in composition from andesite to dacite, with the main volcano being exclusively of dacitic composition. Systematic variations in temperature, crystal and biotite content have been recorded during the evolution of the cluster.
"""
response = summarize(document=document)

print("Summary: ", response.summary)

Summary:  Aucanquilcha is a prominent stratovolcano in northern Chile, part of the Central Volcanic Zone of the Andes, with a height of 6,176 meters. It is located within the Alto Loa National Reserve and is part of the Aucanquilcha cluster, which formed over eleven million years through various volcanic activities. The main volcano consists of four units that erupted between 1.04 and 0.23 million years ago. The region experienced glaciation during ice ages, leading to the formation of moraines and cirques. The cluster's lava varies in composition from andesite to dacite, with Aucanquilcha primarily composed of dacite, showing systematic variations in temperature and mineral content throughout its evolution.


In [None]:
# Multi Inputs & Outputs
multi = dspy.Predict('question, context -> name, profile')

question = "Who did I meet?"
context = "I met Ibrahim Ola, an AI practitioner and educator"

response = multi(question=question, context=context)

print("Name: ", response.name)
print("\nProfile: ", response.profile)

Name:  Ibrahim Ola

Profile:  AI practitioner and educator
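
The string signatures used so far share a simple shape: comma-separated input names, an arrow, then comma-separated output names. As a rough illustration of that shape only (DSPy does its own parsing internally, and `split_signature` is a hypothetical helper, not part of the library), such a string could be broken into its parts like this:

```python
def split_signature(signature: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c, d' into (['a', 'b'], ['c', 'd']).

    Hypothetical helper for illustration; not DSPy's parser.
    """
    inputs, arrow, outputs = signature.partition("->")
    if not arrow:
        raise ValueError("signature must contain '->'")

    def names(part: str) -> list[str]:
        # Drop surrounding whitespace and ignore empty entries.
        return [field.strip() for field in part.split(",") if field.strip()]

    return names(inputs), names(outputs)


inputs, outputs = split_signature("question, context -> name, profile")
print(inputs)   # ['question', 'context']
print(outputs)  # ['name', 'profile']
```

Each name on the left becomes a keyword argument when calling the module, and each name on the right becomes an attribute on the response.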


In [None]:
# Classification with Type Hints

emotion = dspy.Predict('input -> sentiment: str, confidence: float, reasoning: str')

text = "I may have enjoyed the movie, I wouldn't be watching it again"

response = emotion(input=text)

print("Sentiment Classification: ", response.sentiment)
print("\nConfidence: ", response.confidence)
print("\nReasoning: ", response.reasoning)

Sentiment Classification:  neutral

Confidence:  0.75

Reasoning:  The statement expresses a mixed feeling about the movie. The phrase "I may have enjoyed the movie" suggests a positive sentiment, but the follow-up "I wouldn't be watching it again" indicates a lack of strong enthusiasm or a negative aspect. This combination leads to a neutral sentiment overall, as the enjoyment is tempered by the decision not to rewatch it.
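
The type hints in the signature declare what type each output should have, so a field like `confidence` can be handled as a number rather than as text. The sketch below imitates that idea in plain Python. The `raw` values and the `coerce_outputs` helper are assumptions made up for illustration; this is not how DSPy implements its parsing.

```python
# Hypothetical raw text values, as an LM might emit them for each output field.
raw = {"sentiment": "neutral", "confidence": "0.75", "reasoning": "Mixed feelings."}

# Declared output types, mirroring 'sentiment: str, confidence: float, reasoning: str'.
declared = {"sentiment": str, "confidence": float, "reasoning": str}


def coerce_outputs(raw_values: dict[str, str], types: dict[str, type]) -> dict[str, object]:
    """Convert each raw string to its declared type (illustrative only)."""
    return {name: types[name](value) for name, value in raw_values.items()}


typed = coerce_outputs(raw, declared)
print(type(typed["confidence"]).__name__, typed["confidence"])  # float 0.75
```

With a numeric `confidence`, downstream code can compare it against a threshold directly instead of parsing a string first.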


### Class-based Signatures

In [None]:
from typing import Literal

class TextStyleTransfer(dspy.Signature):
    """Transfer text between different writing styles while preserving content."""
    text: str = dspy.InputField()
    source_style: Literal["academic", "casual", "business", "poetic"] = dspy.InputField()
    target_style: Literal["academic", "casual", "business", "poetic"] = dspy.InputField()

    preserved_keywords: list[str] = dspy.OutputField()
    transformed_text: str = dspy.OutputField()
    style_metrics: dict[str, float] = dspy.OutputField(desc="Scores for formality, complexity, emotiveness")


text = "This coffee shop makes the best lattes ever! Their new barista really knows what he's doing with the espresso machine."

style_transfer = dspy.Predict(TextStyleTransfer)

response = style_transfer(
    text=text,
    source_style="casual",
    target_style="business"
)

print("Transformed Text: ", response.transformed_text)
print("\nStyle Metrics: ", response.style_metrics)
print("\nPreserved Keywords: ", response.preserved_keywords)

Transformed Text:  This coffee establishment offers exceptional lattes. The new barista demonstrates a high level of proficiency with the espresso machine.

Style Metrics:  {'formality': 0.8, 'complexity': 0.6, 'emotiveness': 0.3}

Preserved Keywords:  ['coffee shop', 'lattes', 'barista', 'espresso machine']
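
The `Literal[...]` annotations on `source_style` and `target_style` restrict those fields to a fixed set of strings. To see what that constraint means in plain Python, here is a small sketch using `typing.get_args`. The `Style` alias and the `is_valid_style` helper are hypothetical names introduced for this example and are not part of DSPy.

```python
from typing import Literal, get_args

# Same allowed values as the signature's style fields.
Style = Literal["academic", "casual", "business", "poetic"]


def is_valid_style(value: str) -> bool:
    """Return True if value is one of the strings allowed by the Literal."""
    return value in get_args(Style)


print(is_valid_style("business"))  # True
print(is_valid_style("legal"))     # False
```

A check like this can be useful before calling the module, so that an unsupported style is caught early rather than sent to the LM.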


## Modules

In [None]:
# ChainOfThought module

# Define the Signature and Module
cot_emotion = dspy.ChainOfThought('input -> sentiment: str')
text = "I may have enjoyed the movie, I wouldn't be watching it again"

# Run
cot_response = cot_emotion(input=text)

# Output
print("Sentiment: ", cot_response.sentiment)
# ChainOfThought adds a `reasoning` output automatically, even though the signature only declares `sentiment`
print("\nReasoning: ", cot_response.reasoning)

Sentiment:  Neutral

Reasoning:  The statement expresses a mixed sentiment. The phrase "I may have enjoyed the movie" indicates a positive experience, suggesting that there were enjoyable aspects. However, the follow-up "I wouldn't be watching it again" implies a negative conclusion, indicating that while the experience was pleasant, it was not compelling enough to warrant a repeat viewing. This combination of enjoyment and reluctance to rewatch leads to a neutral sentiment overall.
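
The `reasoning` attribute is available here even though the signature only declares `sentiment`, because `dspy.ChainOfThought` adds a reasoning output ahead of the declared outputs. Conceptually, that looks something like the sketch below. The `add_reasoning` helper is hypothetical and works on plain lists of field names; it is not DSPy's code.

```python
def add_reasoning(output_fields: list[str]) -> list[str]:
    """Return the output fields with 'reasoning' placed first, if not already present."""
    if "reasoning" in output_fields:
        return list(output_fields)
    return ["reasoning", *output_fields]


declared_outputs = ["sentiment"]
cot_outputs = add_reasoning(declared_outputs)
print(cot_outputs)  # ['reasoning', 'sentiment']
```

Placing the reasoning first means the model writes out its thinking before committing to the final `sentiment` value.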


In [None]:
# ProgramOfThought module: the LM writes Python code, which is executed to compute the answer

# Define the Signature
class MathAnalysis(dspy.Signature):
    """Analyze a dataset and compute various statistical metrics."""

    numbers: list[float] = dspy.InputField(desc="List of numerical values to analyze")
    required_metrics: list[str] = dspy.InputField(desc="List of metrics to calculate (e.g. ['mean', 'variance', 'quartiles'])")
    analysis_results: dict[str, float] = dspy.OutputField(desc="Dictionary containing the calculated metrics")

# Create the module with defined Signature
math_analyzer = dspy.ProgramOfThought(MathAnalysis)

# Example
data = [1.5, 2.8, 3.2, 4.7, 5.1, 2.3, 3.9, 5.3]
metrics = ['mean', 'median', 'standard_deviation']

# Run
pot_response = math_analyzer(
    numbers=data,
    required_metrics=metrics
)

import pprint
pprint.pprint(pot_response.analysis_results)
pprint.pprint(pot_response.reasoning)
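
Because `ProgramOfThought` relies on LM-generated code, it is worth checking the numbers independently. A minimal cross-check on the same data with Python's standard `statistics` module might look like the sketch below. One assumption to keep in mind: `statistics.stdev` computes the sample standard deviation, while the generated code may choose the population version (`statistics.pstdev`), so a small difference in that value would not necessarily be an error.

```python
import statistics

# Same data as passed to the ProgramOfThought module above.
data = [1.5, 2.8, 3.2, 4.7, 5.1, 2.3, 3.9, 5.3]

expected = {
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    # Sample standard deviation; swap in statistics.pstdev for the population version.
    "standard_deviation": statistics.stdev(data),
}

for name, value in expected.items():
    print(f"{name}: {value:.4f}")
```

Comparing `expected` with `pot_response.analysis_results` (within a small tolerance) gives a quick sanity check on the module's output.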