<a href="https://colab.research.google.com/github/phukon/notebooks/blob/main/dspy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install dspy
!pip install pydantic

# Simple prompt and response

In [7]:
import dspy
turbo = dspy.OpenAI(model='gpt-3.5-turbo-0125', api_key='xxx')
dspy.settings.configure(lm=turbo)

In [8]:
predict = dspy.Predict("question -> answer")
prediction = predict(question="Whats the capital of Arunachal Pradesh?")
prediction.answer

'Itanagar'

In [9]:
turbo.inspect_history(1)





Given the fields `question`, produce the fields `answer`.

---

Follow the following format.

Question: ${question}
Answer: ${answer}

---

Question: Whats the capital of Arunachal Pradesh?
Answer: Itanagar
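
The logged prompt follows directly from the string signature. As a rough illustration only (this is not DSPy's internal code; `parse_signature`, `label`, and `render_prompt` are hypothetical names), the mapping from `"question -> answer"` to the prompt layout above can be sketched in plain Python:

```python
def parse_signature(signature: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c' into input and output field names."""
    inputs, outputs = signature.split("->")
    return ([name.strip() for name in inputs.split(",")],
            [name.strip() for name in outputs.split(",")])


def label(name: str) -> str:
    """Turn a field name such as 'step_by_step_thought' into 'Step By Step Thought'."""
    return " ".join(part.capitalize() for part in name.split("_"))


def render_prompt(signature: str, **values: str) -> str:
    """Hypothetical re-creation of the prompt layout seen in inspect_history."""
    inputs, outputs = parse_signature(signature)
    instruction = "Given the fields {}, produce the fields {}.".format(
        ", ".join(f"`{n}`" for n in inputs),
        ", ".join(f"`{n}`" for n in outputs))
    # Format section: one "Label: ${field}" placeholder line per field.
    template = "\n".join(f"{label(n)}: ${{{n}}}" for n in inputs + outputs)
    # Final section: inputs filled in, first output left open for the model.
    filled = "\n".join(f"{label(n)}: {values[n]}" for n in inputs)
    filled += f"\n{label(outputs[0])}:"
    return "\n\n---\n\n".join(
        [instruction, "Follow the following format.\n\n" + template, filled])


print(render_prompt("question -> answer",
                    question="Whats the capital of Arunachal Pradesh?"))
```

The model's completion is whatever it writes after the trailing `Answer:` line.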





### Signatures

In [10]:
class QA(dspy.Signature):
  lol = dspy.InputField()  # field names are arbitrary; pick whatever suits the task
  qwe = dspy.OutputField()

predict = dspy.Predict(QA) # Instead of dspy.Predict("question -> answer")

In [11]:
prediction = predict(lol="What was the name of the first moon landing mission by NASA?")
print(prediction.qwe)

Apollo 11


### With description

In [12]:
class QA2(dspy.Signature):
  """Given the question, generate the answer"""
  question = dspy.InputField(desc="User's question")
  answer = dspy.OutputField(desc="in JSON format")

predict = dspy.Predict(QA2)
prediction = predict(question="What is the date of India's independence in UTC?")
print(prediction.answer)

{
    "date": "1947-08-15",
    "time": "00:00:00",
    "timezone": "UTC"
}


In [13]:
turbo.inspect_history(1)





Given the question, generate the answer

---

Follow the following format.

Question: User's question
Answer: in JSON format

---

Question: What is the date of India's independence in UTC?
Answer: {
    "date": "1947-08-15",
    "time": "00:00:00",
    "timezone": "UTC"
}
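
The `desc="in JSON format"` hint only steers the model through the prompt, so `prediction.answer` is still a plain string. A minimal sketch of parsing it with the standard library, using the output above as a literal (`raw_answer` is a stand-in for `prediction.answer`, and the error handling is an assumption about how a caller might want to fail):

```python
import json
from datetime import date

# Stand-in for prediction.answer: the text printed by the cell above.
raw_answer = """{
    "date": "1947-08-15",
    "time": "00:00:00",
    "timezone": "UTC"
}"""

try:
    parsed = json.loads(raw_answer)
except json.JSONDecodeError as err:
    # The model is not guaranteed to emit valid JSON, so fail explicitly.
    raise ValueError(f"model did not return valid JSON: {err}") from err

# Convert the ISO date string into a real date object for further use.
independence_day = date.fromisoformat(parsed["date"])
print(independence_day.year, parsed["timezone"])
```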





# Chain of Thought (CoT)

In [14]:
multiStepQuestion = "what is the capital of the birth state of the person who provided the assist to Mario Gotze's goal in football world cup finals in 2014?"

class QuestionAnswer(dspy.Signature):
  question = dspy.InputField()
  answer = dspy.OutputField()

generateAnswer = dspy.ChainOfThought(QuestionAnswer)
prediction = generateAnswer(question=multiStepQuestion)
print(prediction)  # wrong answer: the capital of Rhineland-Palatinate is Mainz, not Stuttgart

Prediction(
    rationale="produce the answer. We know that Mario Gotze scored the winning goal for Germany in the 2014 World Cup finals. The person who provided the assist to him was Andre Schurrle. Andre Schurrle was born in Ludwigshafen, Germany. Therefore, the capital of the birth state of the person who provided the assist to Mario Gotze's goal in the 2014 World Cup finals is Stuttgart.",
    answer='Stuttgart'
)


DSPy building blocks: modules and signatures

### Module

In [15]:
class DoubleChainOfThoughtModule(dspy.Module):
  def __init__(self):
    super().__init__()  # initialize dspy.Module before registering sub-modules
    # Step 1 produces free-form reasoning; step 2 condenses it into a one-word answer.
    self.cot1 = dspy.ChainOfThought("question -> step_by_step_thought")
    self.cot2 = dspy.ChainOfThought("question, thought -> one_word_answer")

  def forward(self, question):
    thought = self.cot1(question=question).step_by_step_thought
    answer = self.cot2(question=question, thought=thought).one_word_answer
    return dspy.Prediction(thought=thought, answer=answer)

In [16]:
doubleCot = DoubleChainOfThoughtModule()
output = doubleCot(question=multiStepQuestion)
print(output)

Prediction(
    thought="The player who provided the assist to Mario Gotze's goal in the 2014 World Cup finals was Andre Schurrle. Andre Schurrle was born in Ludwigshafen, Rhineland-Palatinate, Germany. Therefore, the capital of Rhineland-Palatinate is Mainz.",
    answer='Mainz'
)
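
The module pattern above is ordinary composition: `forward` feeds the first predictor's output into the second. A minimal sketch of the same data flow without DSPy or an API key, using hand-written stubs in place of the two `ChainOfThought` predictors (`TwoStepPipeline`, `fake_think`, and `fake_answer` are hypothetical names, and the stubs return canned text taken from the output above rather than calling a model):

```python
from typing import Callable


class TwoStepPipeline:
    """Chains two steps: question -> thought, then (question, thought) -> answer."""

    def __init__(self, think: Callable[[str], str],
                 answer: Callable[[str, str], str]) -> None:
        self.think = think
        self.answer = answer

    def __call__(self, question: str) -> dict[str, str]:
        thought = self.think(question)
        # The second step sees both the original question and the intermediate thought.
        return {"thought": thought, "answer": self.answer(question, thought)}


def fake_think(question: str) -> str:
    # Stub standing in for the first language-model call.
    return ("Andre Schurrle was born in Ludwigshafen, Rhineland-Palatinate. "
            "Therefore, the capital of Rhineland-Palatinate is Mainz.")


def fake_answer(question: str, thought: str) -> str:
    # Stub standing in for the second call: keep only the last word of the thought.
    return thought.rstrip(".").split()[-1]


pipeline = TwoStepPipeline(fake_think, fake_answer)
result = pipeline("What is the capital of the assist provider's birth state?")
print(result["answer"])
```

Splitting the task this way lets the second step work from an explicit intermediate thought instead of having to reason and answer in one pass.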


In [17]:
turbo.inspect_history(2)





Given the fields `question`, produce the fields `step_by_step_thought`.

---

Follow the following format.

Question: ${question}
Reasoning: Let's think step by step in order to ${produce the step_by_step_thought}. We ...
Step By Step Thought: ${step_by_step_thought}

---

Question: what is the capital of the birth state of the person who provided the assist to Mario Gotze's goal in football world cup finals in 2014?
Reasoning: Let's think step by step in order to find the capital of the birth state of the person who provided the assist to Mario Gotze's goal in the 2014 football world cup finals. We need to first identify the player who assisted Mario Gotze's goal in the 2014 World Cup finals and then determine their birth state.
Step By Step Thought: The player who provided the assist to Mario Gotze's goal in the 2014 World Cup finals was Andre Schurrle. Andre Schurrle was born in Ludwigshafen, Rhineland-Palatinate, Germany. Therefore, the capital of Rhineland-Palatinate is M

# Typed Predictors

In [22]:
from pydantic import BaseModel, Field

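class AnswerConfidence(BaseModel):
  # Note: Field's first positional argument is the default value, not a description,
  # so these strings show up under "default" in the JSON schema logged below.
  # Use Field(description="...") to attach them as descriptions instead.
  answer: str = Field("Answer in 1-5 words")
  confidence: float = Field("Your confidence between 0-1")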

class QAWithConfidence(dspy.Signature):
  """Given user's question, answer it and also give your confidence value"""
  question = dspy.InputField()
  answer: AnswerConfidence = dspy.OutputField()

predict = dspy.TypedChainOfThought(QAWithConfidence)


In [24]:
output = predict(question=multiStepQuestion)
print(output.answer)
print(type(output.answer.confidence))
turbo.inspect_history(1)

answer='Rio de Janeiro' confidence=0.8
<class 'float'>




Given user's question, answer it and also give your confidence value

---

Follow the following format.

Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: ${answer}. Respond with a single JSON object. JSON Schema: {"properties": {"answer": {"default": "Answer in 1-5 words", "title": "Answer", "type": "string"}, "confidence": {"default": "Your confidence between 0-1", "title": "Confidence", "type": "number"}}, "title": "AnswerConfidence", "type": "object"}

---

Question: what is the capital of the birth state of the person who provided the assist to Mario Gotze's goal in football world cup finals in 2014?
Reasoning: Let's think step by step in order to
Answer: Rio de Janeiro
Confidence: 0.8
Answer: {"answer": "Rio de Janeiro", "confidence": 0.8}
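
The typed predictor asks for JSON matching the schema and returns a validated `AnswerConfidence` object rather than a string. As a rough stand-in for that validation step (not DSPy's or Pydantic's code; `ParsedAnswer` and `parse_answer` are hypothetical names), the standard library is enough to show the idea, including a range check on `confidence` that the schema above does not enforce:

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedAnswer:
    answer: str
    confidence: float

    def __post_init__(self) -> None:
        # Reject wrong types first, then check the documented 0-1 range.
        if not isinstance(self.answer, str):
            raise TypeError("answer must be a string")
        if isinstance(self.confidence, bool) or not isinstance(self.confidence, (int, float)):
            raise TypeError("confidence must be a number")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie between 0 and 1")


def parse_answer(raw: str) -> ParsedAnswer:
    """Parse the model's JSON reply and validate it."""
    data = json.loads(raw)
    return ParsedAnswer(answer=data["answer"], confidence=data["confidence"])


result = parse_answer('{"answer": "Rio de Janeiro", "confidence": 0.8}')
print(result)
```

Validation only checks the shape of the reply. The content can still be wrong, as it is here: the model reports 0.8 confidence in an incorrect answer.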





## Typed Predictors, Advanced


In [27]:
class Answer(BaseModel):
  country: str = Field()
  year: int = Field()

class QAList(dspy.Signature):
  """Given a user's question, answer with a JSON readable python list"""
  question = dspy.InputField()
  answer_list: list[Answer] = dspy.OutputField()

question = "Generate a list of countries and the year of FIFA world cup winners from 2002-present"
predict = dspy.TypedChainOfThought(QAList)

In [28]:
answer = predict(question=question)
print(answer)
print(answer.answer_list)

Prediction(
    reasoning='produce the answer_list. We need to identify the countries that won the FIFA World Cup from 2002 to the present year and the corresponding years.',
    answer_list=[Answer(country='Brazil', year=2002), Answer(country='Italy', year=2006), Answer(country='Spain', year=2010), Answer(country='Germany', year=2014), Answer(country='France', year=2018)]
)
[Answer(country='Brazil', year=2002), Answer(country='Italy', year=2006), Answer(country='Spain', year=2010), Answer(country='Germany', year=2014), Answer(country='France', year=2018)]
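
Since `answer_list` comes back as a list of typed objects rather than a string, ordinary Python works on it directly. A small sketch, using a `NamedTuple` as a stand-in for the Pydantic `Answer` model and the values printed above as literals (nothing here calls a model):

```python
from typing import NamedTuple


class Answer(NamedTuple):
    """Stand-in for the Pydantic Answer model defined earlier."""
    country: str
    year: int


# Copied from the printed answer_list above.
answer_list = [
    Answer(country="Brazil", year=2002),
    Answer(country="Italy", year=2006),
    Answer(country="Spain", year=2010),
    Answer(country="Germany", year=2014),
    Answer(country="France", year=2018),
]

# Build a year -> country lookup from the typed items.
winner_by_year = {item.year: item.country for item in answer_list}

# Sanity check: collect the distinct gaps between consecutive tournament years.
years = sorted(winner_by_year)
gaps = {later - earlier for earlier, later in zip(years, years[1:])}

print(winner_by_year[2014])
print(gaps)
```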
