# DSPy Learn - Modules

https://dspy.ai/learn/programming/modules/

In [1]:
#!pip install dspy

In [2]:
import dspy 
from common.my_settings import MySettings  
from common.utils import md

settings = MySettings().get()

#lm_gpt35 = dspy.LM('gpt-3.5-turbo', temperature=0.8, model_type='chat', cache=False, api_key=settings.OPENAI_API_KEY)
lm_gpt4omin = dspy.LM('gpt-4o-mini', temperature=0.9, model_type='chat', cache=False, api_key=settings.OPENAI_API_KEY)
dspy.configure(lm=lm_gpt4omin)

lm = lm_gpt4omin
lm('hello there')


Getting keys from environment variables


['Hello! How can I assist you today?']

In [3]:
sentence = "it's a charming and often affecting journey"
classify = dspy.Predict("sentence -> sentiment: bool")
response = classify(sentence=sentence)
print(response.sentiment)

True
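The string `"sentence -> sentiment: bool"` is an inline signature: input fields on the left of `->`, output fields on the right, each with an optional type. The sketch below is a toy parser, not DSPy's own implementation, and `parse_signature` is a hypothetical helper name. It only handles simple types (a type such as `list[dict[str, str]]` contains commas and would need a real parser), and it assumes untyped fields default to `str`.

```python
def parse_signature(signature: str) -> tuple[dict[str, str], dict[str, str]]:
    """Split 'a, b: int -> c: bool' into (input fields, output fields)."""

    def parse_fields(part: str) -> dict[str, str]:
        fields = {}
        for item in part.split(","):
            # "sentiment: bool" -> ("sentiment", "bool"); "sentence" -> ("sentence", "")
            name, _, type_name = item.partition(":")
            fields[name.strip()] = type_name.strip() or "str"  # assumed default type
        return fields

    inputs, outputs = signature.split("->")
    return parse_fields(inputs), parse_fields(outputs)


inputs, outputs = parse_signature("sentence -> sentiment: bool")
print(inputs)   # {'sentence': 'str'}
print(outputs)  # {'sentiment': 'bool'}
```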


In [4]:
question = "What's something great about the ColBERT retrieval module"
classify = dspy.ChainOfThought("question -> answer", n=5)
response = classify(question=question)
response.completions.answer

['One great aspect of the ColBERT retrieval module is its ability to efficiently perform late interaction during the retrieval process, which combines fast matching with rich semantic information using dense embeddings, thereby improving retrieval performance while maintaining scalability.',
 'A great aspect of the ColBERT retrieval module is its ability to achieve high retrieval effectiveness while maintaining efficiency through late interaction with BERT embeddings, enabling quick and accurate document retrieval in large datasets.',
 'One great aspect of the ColBERT retrieval module is its ability to efficiently combine dense and traditional retrieval techniques, allowing for fast and accurate information retrieval through late interaction mechanisms.',
 'One great aspect of the ColBERT retrieval module is its efficient use of late interaction mechanisms, which allows it to combine powerful BERT embeddings for queries and documents while maintaining scalability for large-scale retrie

In [5]:
print(f"Reasoning: {response.reasoning}")
print(f"Answer: {response.answer}")

Reasoning: ColBERT is notable for its efficiency in combining the strengths of dense and scalable retrieval techniques. One of its standout features is the ability to perform late interaction, which allows for fast matching between query vectors and document representations while still leveraging the rich semantic information provided by dense embeddings. This results in improved retrieval performance without significant trade-offs in speed, making it suitable for large-scale applications. Additionally, ColBERT can effectively handle diverse document types and formats, enhancing its versatility in various information retrieval tasks.
Answer: One great aspect of the ColBERT retrieval module is its ability to efficiently perform late interaction during the retrieval process, which combines fast matching with rich semantic information using dense embeddings, thereby improving retrieval performance while maintaining scalability.


# Math

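`dspy.ProgramOfThought` runs the Python code it generates in a sandboxed interpreter that depends on Deno, so Deno must be installed and on the `PATH` before the module is called.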

In [6]:
import os, pathlib, shutil, subprocess

# Point to your Deno install (Codespaces default)
deno_bin = str(pathlib.Path.home() / ".deno" / "bin")

# Add to PATH for this kernel session
os.environ["PATH"] = deno_bin + os.pathsep + os.environ.get("PATH", "")

# (Optional) sanity checks
print("deno path:", deno_bin)
print("which deno:", shutil.which("deno"))
subprocess.run(["deno","--version"], check=True)


deno path: /home/codespace/.deno/bin
which deno: /home/codespace/.deno/bin/deno
deno 2.4.5 (stable, release, x86_64-unknown-linux-gnu)
v8 13.7.152.14-rusty
typescript 5.8.3


CompletedProcess(args=['deno', '--version'], returncode=0)

In [7]:
maths = dspy.ProgramOfThought("question -> answer")
result = maths(question="Whats the volume of a circle that is 3cm in diameter?")
md(result)

Prediction(  
    reasoning='To find the volume related to a circle, we assume the question is about the volume of a cylinder with a circular base since a circle itself does not have volume. Given the diameter of the circle is 3 cm, we first calculate the radius, which is half of the diameter. We then assume a height of 1 cm for the cylinder. Using the formula for the volume of a cylinder, V = πr²h, we substitute the radius and height values to compute the volume. The result is approximately 7.07 cm³.',  
    answer='The volume of the cylinder with a circular base of diameter 3 cm and height 1 cm is approximately 7.07 cm³.'  
)
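The figure in the answer can be checked by hand. The sketch below reproduces the model's assumption (a cylinder of height 1 cm on a circular base of diameter 3 cm) and, as one alternative reading of the ambiguous question, a sphere of the same diameter. The function names are illustrative and not part of DSPy.

```python
import math


def cylinder_volume(diameter_cm: float, height_cm: float) -> float:
    """V = pi * r^2 * h, with r taken as half the diameter."""
    radius = diameter_cm / 2
    return math.pi * radius**2 * height_cm


def sphere_volume(diameter_cm: float) -> float:
    """V = 4/3 * pi * r^3, with r taken as half the diameter."""
    radius = diameter_cm / 2
    return 4 / 3 * math.pi * radius**3


print(round(cylinder_volume(3, 1), 2))  # 7.07, matching the answer above
print(round(sphere_volume(3), 2))       # 14.14, if the question meant a sphere
```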

In [8]:
maths.history

[{'prompt': None,
  'messages': [{'role': 'system',
    'content': 'Your input fields are:\n1. `question` (str):\nYour output fields are:\n1. `reasoning` (str): \n2. `generated_code` (str): python code that answers the question\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\n[[ ## question ## ]]\n{question}\n\n[[ ## reasoning ## ]]\n{reasoning}\n\n[[ ## generated_code ## ]]\n{generated_code}\n\n[[ ## completed ## ]]\nIn adhering to this structure, your objective is: \n        You will be given `question` and you will respond with `generated_code`.\n        Generating executable Python code that programmatically computes the correct `generated_code`.\n        After you\'re done with the computation and think you have the answer, make sure to provide your answer by calling the preloaded function `final_answer()`.\n        You should structure your answer in a dict object, like {"field_a": answer_a, ...}, evaluates to the correct value 

# RAG
https://dspy.ai/learn/programming/modules/#__tabbed_1_2


In [9]:
lm('Whats the name of the Castle that David Gregory inherited?')

['David Gregory inherited the historic "Castle of Duns" in Scotland, which has been associated with his family for generations. It is an important estate with a rich history in the region.']

In [10]:
def search(query: str) -> list[str]:
    """Retrieves abstracts from Wikipedia"""
    # ColBERTv2 is a late-interaction neural retrieval model, not a general-purpose vector database.
    # The line below queries a hosted ColBERTv2 index of Wikipedia abstracts (2017) and fetches the top 3 passages.
    result = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")(query, k=3)

    md("**Search result**:", result[0])
    return [item['text'] for item in result]

rag = dspy.ChainOfThought("context, question -> response")
question = "Whats the name of the Castle that David Gregory inherited?"

md("**RAG with answerable question**")
md(rag(context=search(question), question=question))

md("**RAG with unanswerable question**")
# The context is still retrieved for the David Gregory question, so it cannot answer "Who is the president of Germany?".
# Ideally the model would reply with something like "I don't know" or "The context does not provide this information."
# In the recorded run below, the reasoning notes the gap but the response still answers from the model's own knowledge.
md(rag(context=search(question), question="Who is the president of Germany?"))

**RAG with answerable question**

**Search result**:{'text': 'David Gregory (physician) | David Gregory (20 December 1625 – 1720) was a Scottish physician and inventor. His surname is sometimes spelt as Gregorie, the original Scottish spelling. He inherited Kinnairdy Castle in 1664. Three of his twenty-nine children became mathematics professors. He is credited with inventing a military cannon that Isaac Newton described as "being destructive to the human species". Copies and details of the model no longer exist. Gregory\'s use of a barometer to predict farming-related weather conditions led him to be accused of witchcraft by Presbyterian ministers from Aberdeen, although he was never convicted.', 'pid': 3296134, 'rank': 1, 'score': 25.856355667114258, 'prob': 0.9928459586069872, 'long_text': 'David Gregory (physician) | David Gregory (20 December 1625 – 1720) was a Scottish physician and inventor. His surname is sometimes spelt as Gregorie, the original Scottish spelling. He inherited Kinnairdy Castle in 1664. Three of his twenty-nine children became mathematics professors. He is credited with inventing a military cannon that Isaac Newton described as "being destructive to the human species". Copies and details of the model no longer exist. Gregory\'s use of a barometer to predict farming-related weather conditions led him to be accused of witchcraft by Presbyterian ministers from Aberdeen, although he was never convicted.'}

Prediction(  
    reasoning='David Gregory, the physician, inherited Kinnairdy Castle in 1664. This is the specific detail related to the question about the name of the castle he inherited.',  
    response='Kinnairdy Castle'  
)

**RAG with unanswerable question**

**Search result**:{'text': 'David Gregory (physician) | David Gregory (20 December 1625 – 1720) was a Scottish physician and inventor. His surname is sometimes spelt as Gregorie, the original Scottish spelling. He inherited Kinnairdy Castle in 1664. Three of his twenty-nine children became mathematics professors. He is credited with inventing a military cannon that Isaac Newton described as "being destructive to the human species". Copies and details of the model no longer exist. Gregory\'s use of a barometer to predict farming-related weather conditions led him to be accused of witchcraft by Presbyterian ministers from Aberdeen, although he was never convicted.', 'pid': 3296134, 'rank': 1, 'score': 25.856355667114258, 'prob': 0.9928459586069872, 'long_text': 'David Gregory (physician) | David Gregory (20 December 1625 – 1720) was a Scottish physician and inventor. His surname is sometimes spelt as Gregorie, the original Scottish spelling. He inherited Kinnairdy Castle in 1664. Three of his twenty-nine children became mathematics professors. He is credited with inventing a military cannon that Isaac Newton described as "being destructive to the human species". Copies and details of the model no longer exist. Gregory\'s use of a barometer to predict farming-related weather conditions led him to be accused of witchcraft by Presbyterian ministers from Aberdeen, although he was never convicted.'}

Prediction(  
    reasoning='The context provided does not contain any information related to the current president of Germany. Therefore, I cannot derive an answer from the context.',  
    response='As of October 2023, the president of Germany is Frank-Walter Steinmeier.'  
)
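The second prediction shows a response that is not supported by the retrieved context. One crude way to flag this is a lexical overlap check between the response and the context. The sketch below is a heuristic written for illustration only: `content_words` and `grounding_score` are hypothetical helpers, the three-character cutoff is an arbitrary assumption, and the context string is abridged from the search result above.

```python
import re


def content_words(text: str) -> set[str]:
    """Lowercased alphabetic tokens longer than three characters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def grounding_score(response: str, context: list[str]) -> float:
    """Fraction of the response's content words that also occur in the context."""
    response_words = content_words(response)
    if not response_words:
        return 0.0
    context_words = content_words(" ".join(context))
    return len(response_words & context_words) / len(response_words)


# Abridged from the retrieved passage shown above.
context = [
    "David Gregory (physician) | David Gregory was a Scottish physician and "
    "inventor. He inherited Kinnairdy Castle in 1664."
]

print(grounding_score("Kinnairdy Castle", context))  # 1.0
print(grounding_score(
    "As of October 2023, the president of Germany is Frank-Walter Steinmeier.",
    context,
))  # 0.0
```

A low score does not prove the response is wrong, only that the context does not support it, which is the situation in the unanswerable example.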

# Classification
https://dspy.ai/learn/programming/modules/#__tabbed_1_3

In [11]:
from typing import Literal 

class Classify(dspy.Signature):
    """Classify the sentiment"""
    sentence: str = dspy.InputField()
    sentiment: Literal['positive', 'neutral', 'negative'] = dspy.OutputField()
    confidence: float = dspy.OutputField()

classify = dspy.Predict(Classify)
md(classify(sentence='It is raining this evening, the roads will be dangerous'))


Prediction(  
    sentiment='negative',  
    confidence=0.85  
)
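The `Literal` annotation restricts `sentiment` to three labels. The sketch below shows how such a constraint can be checked in plain Python with `typing.get_args`. It is not how DSPy validates outputs internally, `validate_prediction` is a hypothetical helper, and the `[0, 1]` range for `confidence` is an assumption, since the signature only declares it as a `float`.

```python
from typing import Literal, get_args

Sentiment = Literal["positive", "neutral", "negative"]


def validate_prediction(sentiment: str, confidence: float) -> tuple[str, float]:
    """Raise ValueError if the values fall outside the declared constraints."""
    allowed = get_args(Sentiment)  # ('positive', 'neutral', 'negative')
    if sentiment not in allowed:
        raise ValueError(f"sentiment must be one of {allowed}, got {sentiment!r}")
    # Assumption: confidence is meant as a probability-like score in [0, 1].
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    return sentiment, confidence


print(validate_prediction("negative", 0.85))  # ('negative', 0.85)
```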

# Information Extraction
https://dspy.ai/learn/programming/modules/#__tabbed_1_4

In [12]:
#text = "Apple Inc. announced its latest iPhone 14 today. The CEO, Tim Cook, highlighted its new features in a press release."
#text = "The DSPy framework aims to resolve consistency and reliability issues by prioritizing declarative, systematic programming over manual prompt writing."
text = """
The Mongol Empire was the largest contiguous empire in history. Originating in present-day Mongolia in East Asia, the empire at its height stretched from the Sea of Japan 
to Eastern Europe, extending northward into Siberia and east and southward into the Indian subcontinent, mounting invasions of Southeast Asia, and conquering the Iranian 
plateau; and reaching westward as far as the Levant and the Carpathian Mountains.
"""

extract = dspy.Predict("text -> title, headings: list[str], entities_and_metadata: list[dict[str, str]]")
md(extract(text=text))

Prediction(  
    title='The Mongol Empire: The Largest Contiguous Empire in History',  
    headings=['Introduction', 'Geographic Extent', 'Invasions and Conquests', 'Historical Significance'],  
    entities_and_metadata=[{'entity': 'Mongol Empire', 'metadata': 'largest contiguous empire in history'}, {'entity': 'Mongolia', 'metadata': 'origin of the empire in East Asia'}, {'entity': 'Sea of Japan', 'metadata': 'eastern boundary at height'}, {'entity': 'Eastern Europe', 'metadata': 'western boundary at height'}, {'entity': 'Siberia', 'metadata': 'northern extension'}, {'entity': 'Indian subcontinent', 'metadata': 'southern extension'}, {'entity': 'Southeast Asia', 'metadata': 'site of invasions'}, {'entity': 'Iranian plateau', 'metadata': 'conquered region'}, {'entity': 'Levant', 'metadata': 'western boundary'}, {'entity': 'Carpathian Mountains', 'metadata': 'western boundary'}]  
)
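Because `entities_and_metadata` is typed as `list[dict[str, str]]`, the result can be used as ordinary Python data. The sketch below indexes a subset of the entries copied from the output above. Note that the keys `entity` and `metadata` were chosen by the model in this run and are not guaranteed by the signature, so the code reads them defensively with `.get`.

```python
# Subset of the recorded output above.
entities_and_metadata = [
    {"entity": "Mongol Empire", "metadata": "largest contiguous empire in history"},
    {"entity": "Sea of Japan", "metadata": "eastern boundary at height"},
    {"entity": "Eastern Europe", "metadata": "western boundary at height"},
    {"entity": "Levant", "metadata": "western boundary"},
    {"entity": "Carpathian Mountains", "metadata": "western boundary"},
]

# Build a lookup table, skipping any item that lacks the expected keys.
by_entity = {
    item["entity"]: item.get("metadata", "")
    for item in entities_and_metadata
    if "entity" in item
}

# Filter entities whose metadata mentions the western boundary.
western = [name for name, meta in by_entity.items() if "western boundary" in meta]
print(western)  # ['Eastern Europe', 'Levant', 'Carpathian Mountains']
```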

# Agent 
https://dspy.ai/learn/programming/modules/#__tabbed_1_5

In [None]:
def evaluate_math(expression: str) -> float:
    md("Python code: ", expression)
    return dspy.PythonInterpreter({}).execute(expression)

def search_wikipedia(query: str) -> list[str]:
    md("Searching Wikipedia for:", query)
    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)
    return [x['text'] for x in results]

react = dspy.ReAct("question -> answer: float", tools=[evaluate_math, search_wikipedia])

pred = react(question="What is 9362158 divided by the year of birth of David Gregory of Kinnairdy castle?")

md(pred.answer)

Searching Wikipedia for:David Gregory Kinnairdy Castle year of birth

Python code: 9362158 / 1625

5761.328

In [None]:
md("**Answer**: ", pred.answer)
md("**Reasoning**: ", pred.reasoning)

**Answer**: 5761.328

**Reasoning**: To answer the question, I first determined the year of birth of David Gregory associated with Kinnairdy Castle, which is 1625. I then performed the division of 9362158 by 1625, resulting in approximately 5761.328.
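The agent's arithmetic can be verified independently. The short check below uses `fractions.Fraction` to confirm that the quotient is exactly 5761.328 rather than a rounded value.

```python
from fractions import Fraction

numerator = 9362158
birth_year = 1625  # taken from the retrieved abstract

quotient = numerator / birth_year
print(quotient)  # 5761.328

# Exact rational check: 9362158 / 1625 equals 5761328 / 1000 with no remainder.
is_exact = Fraction(numerator, birth_year) == Fraction(5761328, 1000)
print(is_exact)  # True
```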

## Agent reasoning
Looping through the trajectory shows how the agent works at each step:
1. thinks about what to do
1. decides which tool to call to fetch information
1. calls the tool
1. observes the result of the tool
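The four steps can be sketched as a plain loop that fills a trajectory dictionary with `thought_i`, `tool_name_i`, `tool_args_i` and `observation_i` keys. This is a simplified illustration, not the `dspy.ReAct` implementation: `run_react_loop`, `scripted_policy`, `lookup_birth_year` and `divide` are hypothetical names, and a fixed script stands in for the language model that would normally choose each step.

```python
from typing import Any, Callable


def run_react_loop(
    policy: Callable[[dict[str, Any]], tuple[str, str, dict[str, Any]]],
    tools: dict[str, Callable[..., Any]],
    max_iters: int = 5,
) -> dict[str, Any]:
    """Think, pick a tool, call it, record the observation, repeat until 'finish'."""
    trajectory: dict[str, Any] = {}
    for i in range(max_iters):
        thought, tool_name, tool_args = policy(trajectory)  # steps 1 and 2
        trajectory[f"thought_{i}"] = thought
        trajectory[f"tool_name_{i}"] = tool_name
        trajectory[f"tool_args_{i}"] = tool_args
        if tool_name == "finish":
            trajectory[f"observation_{i}"] = "Completed."
            break
        # Steps 3 and 4: call the tool and store what it returned.
        trajectory[f"observation_{i}"] = tools[tool_name](**tool_args)
    return trajectory


def lookup_birth_year(name: str) -> int:
    """Stub standing in for a search tool."""
    return {"David Gregory": 1625}[name]


def divide(a: float, b: float) -> float:
    return a / b


# A fixed script replaces the language model in this sketch.
SCRIPT = [
    ("I need the birth year first.", "lookup_birth_year", {"name": "David Gregory"}),
    ("Now I can divide.", "divide", {"a": 9362158, "b": 1625}),
    ("I have the answer.", "finish", {}),
]


def scripted_policy(trajectory: dict[str, Any]) -> tuple[str, str, dict[str, Any]]:
    step = len(trajectory) // 4  # each completed step adds four keys
    return SCRIPT[step]


trajectory = run_react_loop(
    scripted_policy, {"lookup_birth_year": lookup_birth_year, "divide": divide}
)
for key, value in trajectory.items():
    print(f"{key}: {value}")
```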

In [19]:
# Loop through the trajectory to see how the agent thought and acted.
# Keys follow the pattern thought_i, tool_name_i, tool_args_i, observation_i.

for key, value in pred.trajectory.items():
    md(f"**{key}**: ", ": ", value)

**thought_0**: : I need to find out the year of birth of David Gregory of Kinnairdy Castle in order to perform the division of 9362158 by that year. I will search for this information on Wikipedia.

**tool_name_0**: : search_wikipedia

**tool_args_0**: : {'query': 'David Gregory Kinnairdy Castle year of birth'}

**observation_0**: : ['David Gregory (physician) | David Gregory (20 December 1625 – 1720) was a Scottish physician and inventor. His surname is sometimes spelt as Gregorie, the original Scottish spelling. He inherited Kinnairdy Castle in 1664. Three of his twenty-nine children became mathematics professors. He is credited with inventing a military cannon that Isaac Newton described as "being destructive to the human species". Copies and details of the model no longer exist. Gregory\'s use of a barometer to predict farming-related weather conditions led him to be accused of witchcraft by Presbyterian ministers from Aberdeen, although he was never convicted.', 'David Castle (cricketer) | David James Castle (born 25 May 1972 in Launceston, Tasmania) is a former cricket player who represented the Tasmanian Tigers in one first-class match.', 'David Castle (philosopher) | David Castle (born 1967 in Edmonton, Alberta) is a Canadian philosopher and bioethicist. He is Vice-President of Research at the University of Victoria. Previously he was Professor and Chair of Innovations in the Life Sciences at University of Edinburgh, where he investigated how to get others to innovate. From 2006–2010 he served as Canada Research Chair in Science and Society at the University of Ottawa where he developed ideas leading to the creation of the Institute for Science, Society and Policy.']

**thought_1**: : I found that David Gregory, the physician associated with Kinnairdy Castle, was born on December 20, 1625. I will now perform the division of 9362158 by 1625 to find the result.

**tool_name_1**: : evaluate_math

**tool_args_1**: : {'expression': '9362158 / 1625'}

**observation_1**: : 5761.328

**thought_2**: : I have completed the division and found that 9362158 divided by 1625 equals approximately 5761.328. Since I have all the information needed to finalize my answer, I will mark the task as complete.

**tool_name_2**: : finish

**tool_args_2**: : {}

**observation_2**: : Completed.