<img src="docs/images/DSPy8.png" alt="DSPy7 Image" height="150"/>

# DSPy: Tutorial @ SkyCamp

This notebook contains the **DSPy tutorial** for **SkyCamp 2023**.

Let's begin by setting things up. The snippet below will also install **DSPy** if it's not there already.

In [None]:
%load_ext autoreload
%autoreload 2

import sys
import os

try:  # When on Google Colab, clone the repo so we also download the cache.
    import google.colab
    repo_path = 'dspy'
    !git -C $repo_path pull origin || git clone https://github.com/stanfordnlp/dspy $repo_path
except ImportError:  # Not on Colab: assume we are running from the repo root.
    repo_path = '.'

if repo_path not in sys.path:
    sys.path.append(repo_path)

# Set up the cache for this notebook
os.environ["DSP_NOTEBOOK_CACHEDIR"] = os.path.join(repo_path, 'cache')

import pkg_resources  # Install the package if it's not installed
if "dspy-ai" not in {pkg.key for pkg in pkg_resources.working_set}:
    !pip install -U pip
    # !pip install dspy-ai
    !pip install -e $repo_path

!pip install transformers

In [None]:
import dspy
from dspy.evaluate import Evaluate
from dspy.teleprompt import BootstrapFewShot, BootstrapFewShotWithRandomSearch, BootstrapFinetune

import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())

### 1) Configure the default LM and retriever

We'll start by setting up the language model (LM) and retrieval model (RM). **DSPy** supports multiple API and local models.

In this notebook, the cell below configures OpenAI's `gpt-3.5-turbo-0125` (bound to the name `turbo`) as the default LM. DSPy can also work with locally hosted models, for example `Llama2-13b-chat` served through HuggingFace TGI on your own GPUs. For this tutorial all examples are pre-cached, so you don't need to worry about cost.

We will use the retriever `ColBERTv2`. To make things easy, we've set up a ColBERTv2 server hosting a Wikipedia 2017 "abstracts" search index (i.e., containing the first paragraph of each article from this [2017 dump](https://hotpotqa.github.io/wiki-readme.html)), so you don't need to worry about setting one up! It's free.

**Note:** _If you run this notebook as instructed, you don't need an API key. All examples are already cached internally so you can inspect them!_

In [None]:
turbo = dspy.OpenAI(
    model='gpt-3.5-turbo-0125',
    api_key=os.getenv('OPENAI_API_KEY'),
)
colbertv2 = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')

# NOTE: `turbo` (GPT-3.5) is the LM used throughout this notebook.
# It is set as the default LM, together with the ColBERTv2 retriever, below.

dspy.settings.configure(rm=colbertv2, lm=turbo)

# import openai

# client = openai.OpenAI(
#     api_key=os.getenv('OPENAI_API_KEY'),
# )

# completion = client.chat.completions.create(
#     model="gpt-3.5-turbo-0125",
#     messages=[
#         {"role": "user", "content": "Given the fields `question`, produce the fields `answer`.\n\n---\n\nFollow the following format.\n\nQuestion: ${question}\nReasoning: Let's think step by step in order to ${produce the answer}. We ...\nAnswer: ${answer}\n\n---\n\nQuestion: What is the capital of Paris?\nReasoning: Let's think step by step in order to"},
#     ],
#     temperature=0,
# )
# completion

In [None]:
# Example DSPy chain-of-thought (CoT) QA program
qa = dspy.ChainOfThought('question -> answer')

response = qa(question="What is the capital of Paris?")
print(response.answer)

### 2) Create a few question–answer pairs for our task

In [None]:
train = [('Who was the director of the 2009 movie featuring Peter Outerbridge as William Easton?', 'Kevin Greutert'),
         ('The heir to the Du Pont family fortune sponsored what wrestling team?', 'Foxcatcher'),
         ('In what year was the star of To Hell and Back born?', '1925'),
         ('Which award did the first book of Gary Zukav receive?', 'U.S. National Book Award'),
         ('What documentary about the Gilgo Beach Killer debuted on A&E?', 'The Killing Season'),
         ('Which author is English: John Braine or Studs Terkel?', 'John Braine'),
         ('Who produced the album that included a re-recording of "Lithium"?', 'Butch Vig')]

train = [dspy.Example(question=question, answer=answer).with_inputs('question') for question, answer in train]

In [None]:
dev = [('Who has a broader scope of profession: E. L. Doctorow or Julia Peterkin?', 'E. L. Doctorow'),
       ('Right Back At It Again contains lyrics co-written by the singer born in what city?', 'Gainesville, Florida'),
       ('What year was the party of the winner of the 1971 San Francisco mayoral election founded?', '1828'),
       ('Anthony Dirrell is the brother of which super middleweight title holder?', 'Andre Dirrell'),
       ('The sports nutrition business established by Oliver Cookson is based in which county in the UK?', 'Cheshire'),
       ('Find the birth date of the actor who played roles in First Wives Club and Searching for the Elephant.', 'February 13, 1980'),
       ('Kyle Moran was born in the town on what river?', 'Castletown River'),
       ("The actress who played the niece in the Priest film was born in what city, country?", 'Surrey, England'),
       ('Name the movie in which the daughter of Noel Harrison plays Violet Trefusis.', 'Portrait of a Marriage'),
       ('What year was the father of the Princes in the Tower born?', '1442'),
       ('What river is near the Crichton Collegiate Church?', 'the River Tyne'),
       ('Who purchased the team Michael Schumacher raced for in the 1995 Monaco Grand Prix in 2000?', 'Renault'),
       ('André Zucca was a French photographer who worked with a German propaganda magazine published by what Nazi organization?', 'the Wehrmacht')]

dev = [dspy.Example(question=question, answer=answer).with_inputs('question') for question, answer in dev]

### 3) Key Concepts: Signatures & Modules

In [None]:
# Define a dspy.Predict module with the signature `question -> answer` (i.e., takes a question and outputs an answer).
predict = dspy.Predict('question -> answer')

# Use the module!
predict(question="What is the capital of Germany?")

In the example above, we used the `dspy.Predict` module **zero-shot**, i.e. without compiling it on any examples.
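
To make the signature notation concrete, here is a minimal sketch of how a string such as `question -> answer` can be read as a list of input fields and a list of output fields. `parse_signature` is a hypothetical helper written for illustration; it is not DSPy's actual parser.

```python
def parse_signature(signature: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c' into (['a', 'b'], ['c']). Hypothetical helper, not DSPy's parser."""
    inputs, arrow, outputs = signature.partition("->")
    if not arrow:
        raise ValueError(f"signature needs '->': {signature!r}")

    def split_fields(side: str) -> list[str]:
        # Field names are comma-separated; surrounding whitespace is ignored.
        return [name.strip() for name in side.split(",") if name.strip()]

    return split_fields(inputs), split_fields(outputs)


print(parse_signature("question -> answer"))
print(parse_signature("context, question -> search_query"))
```

Everything to the left of the arrow is what the caller must supply, and everything to the right is what the LM is asked to produce.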

Let's now build a slightly more advanced program. Our program will use the `dspy.ChainOfThought` module, which asks the LM to think step by step.

We will call this program `CoT`.

In [None]:
class CoT(dspy.Module):  # let's define a new module
    def __init__(self):
        super().__init__()

        # here we declare the chain of thought sub-module, so we can later compile it (e.g., teach it a prompt)
        self.generate_answer = dspy.ChainOfThought('question -> answer')

    def forward(self, question):
        return self.generate_answer(question=question)  # here we use the module

Now let's compile this using our seven `train` examples. We will use the very simple `BootstrapFewShot` in DSPy.
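
The core idea behind bootstrapping can be sketched in plain Python: run the program on training questions and keep, as demonstrations, only the traces whose answers pass the metric. This is a simplified illustration under stated assumptions, not DSPy's implementation; `bootstrap_demos`, `toy_program`, and the toy data are hypothetical, and the real optimizer has more options and details not shown here.

```python
def bootstrap_demos(program, trainset, metric, max_bootstrapped_demos=4):
    """Collect up to `max_bootstrapped_demos` examples that the program itself gets right."""
    demos = []
    for example in trainset:
        if len(demos) >= max_bootstrapped_demos:
            break
        prediction = program(example["question"])
        if metric(example, prediction):
            # Keep the full trace (including the rationale) as a demonstration.
            demos.append({**example, **prediction})
    return demos


# Toy stand-ins (hypothetical): a "program" that only knows one fact.
def toy_program(question):
    known = {"Capital of France?": "Paris"}
    return {"rationale": "recall the fact", "answer": known.get(question, "unknown")}


def toy_exact_match(example, prediction):
    return example["answer"] == prediction["answer"]


toy_train = [
    {"question": "Capital of France?", "answer": "Paris"},
    {"question": "Capital of Peru?", "answer": "Lima"},
]
demos = bootstrap_demos(toy_program, toy_train, toy_exact_match)
print(demos)
```

Only the first toy example survives, because it is the only one the toy program answers correctly.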

In [None]:
metric_EM = dspy.evaluate.answer_exact_match

teleprompter = BootstrapFewShot(metric=metric_EM, max_bootstrapped_demos=4)
cot_compiled = teleprompter.compile(CoT(), trainset=train)

Let's ask a question to this new program.

In [None]:
cot_compiled("What is the capital of Germany?")

You might be curious what's happening under the hood. Let's inspect the last call to our LM (`turbo`) to see the prompt and the output.

In [None]:
turbo.inspect_history(n=1)

Notice how the prompt ends with the question we asked ("What is the capital of Germany?"), but before that it includes few-shot examples.

The final example in the prompt contains a rationale (step-by-step reasoning) self-generated by the LM for use as a demonstration, for the training question "Which author is English: John Braine or Studs Terkel?".
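
As a rough picture of how such a prompt is put together, the sketch below assembles an instruction, a list of demonstrations, and the new question into one string. The layout is only loosely modeled on the prompts shown by `inspect_history`; `build_prompt` and the demo content are hypothetical and do not reproduce DSPy's template.

```python
def build_prompt(instruction, demos, question):
    """Assemble a few-shot prompt: instruction, demonstrations, then the new question."""
    blocks = [instruction]
    for demo in demos:
        lines = [f"Question: {demo['question']}"]
        if "rationale" in demo:  # bootstrapped demos carry the LM's own reasoning
            lines.append(f"Reasoning: Let's think step by step in order to {demo['rationale']}")
        lines.append(f"Answer: {demo['answer']}")
        blocks.append("\n".join(lines))
    # The final block is left open so the LM continues with its reasoning.
    blocks.append(f"Question: {question}\nReasoning: Let's think step by step in order to")
    return "\n\n---\n\n".join(blocks)


# Made-up demonstration, for illustration only.
toy_demos = [{
    "question": "What is the capital of France?",
    "rationale": "produce the answer. We recall that Paris is the capital of France.",
    "answer": "Paris",
}]
prompt = build_prompt(
    "Given the fields `question`, produce the fields `answer`.",
    toy_demos,
    "What is the capital of Germany?",
)
print(prompt)
```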

Now, let's evaluate on our development set.
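
Exact match is a strict metric, so it helps to see what it does and does not forgive. The sketch below assumes a SQuAD-style normalization (lowercasing, stripping punctuation and English articles); the exact rules inside `dspy.evaluate.answer_exact_match` may differ, and `normalize` and `exact_match` here are illustrative names.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(gold: str, predicted: str) -> bool:
    return normalize(gold) == normalize(predicted)


print(exact_match("the River Tyne", "River Tyne"))                    # True
print(exact_match("Castletown River", "River Castletown"))            # False
print(exact_match("Surrey, England", "Guildford, Surrey, England"))   # False
```

Under this kind of metric, a reordered or more specific answer counts as wrong even when a human would accept it.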

In [None]:
NUM_THREADS = 32
evaluate_hotpot = Evaluate(devset=dev, metric=metric_EM, num_threads=NUM_THREADS, display_progress=True, display_table=15)

First, let's evaluate the compiled `CoT` program with our LM. Feel free to replace `cot_compiled` below with `CoT()` (notice the parentheses) to test the zero-shot version of CoT.
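
Conceptually, an evaluation run calls the program on every dev example, applies the metric, and reports the share of examples that pass. The sketch below is a sequential simplification with hypothetical names and toy data; it leaves out the threading, progress bar, and table display that `Evaluate` is configured with above.

```python
def evaluate(program, devset, metric):
    """Run `program` on every example and return the percentage the metric accepts."""
    correct = sum(bool(metric(example, program(example["question"]))) for example in devset)
    return round(100.0 * correct / len(devset), 2)


# Toy check with hypothetical data: 6 correct predictions out of 13 examples.
toy_dev = [{"question": f"q{i}", "answer": "yes" if i < 6 else "no"} for i in range(13)]


def always_yes(question):
    return "yes"


score = evaluate(always_yes, toy_dev, lambda example, pred: example["answer"] == pred)
print(score)  # 46.15
```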

In [32]:
evaluate_hotpot(cot_compiled)

Average Metric: 6 / 13  (46.2): 100%|██████████| 13/13 [00:00<00:00, 3387.76it/s]

Average Metric: 6 / 13  (46.2%)





row,question,example_answer,rationale,pred_answer,answer_exact_match
0,Who has a broader scope of profession: E. L. Doctorow or Julia Peterkin?,E. L. Doctorow,"produce the answer. E. L. Doctorow was an American novelist and editor, while Julia Peterkin was an American author. Since E. L. Doctorow's profession includes...",E. L. Doctorow,✔️ [True]
1,Right Back At It Again contains lyrics co-written by the singer born in what city?,"Gainesville, Florida","produce the answer. We know that ""Right Back At It Again"" is a song by A Day to Remember. The singer of A Day to...","Ocala, Florida",False
2,What year was the party of the winner of the 1971 San Francisco mayoral election founded?,1828,"produce the answer. We know that the winner of the 1971 San Francisco mayoral election was Joseph Alioto, who was a member of the Democratic...",1828,✔️ [True]
3,Anthony Dirrell is the brother of which super middleweight title holder?,Andre Dirrell,"produce the answer. We know that Anthony Dirrell is the brother of Andre Dirrell, who is a super middleweight title holder.",Andre Dirrell,✔️ [True]
4,The sports nutrition business established by Oliver Cookson is based in which county in the UK?,Cheshire,"produce the answer. We know that Oliver Cookson founded Myprotein, a sports nutrition business. Myprotein is based in Cheshire, a county in the UK.",Cheshire,✔️ [True]
5,Find the birth date of the actor who played roles in First Wives Club and Searching for the Elephant.,"February 13, 1980","produce the answer. We know that the actor who played roles in ""First Wives Club"" and ""Searching for the Elephant"" is Timothy Olyphant. Therefore, the...","May 20, 1968",False
6,Kyle Moran was born in the town on what river?,Castletown River,"produce the answer. We know that Kyle Moran was born in the town of Riverdale. Therefore, the town where Kyle Moran was born is located...",Riverdale River,False
7,"The actress who played the niece in the Priest film was born in what city, country?","Surrey, England","produce the answer. We know that the actress who played the niece in the film ""Priest"" was Lily Collins, who was born in Guildford, Surrey,...","Guildford, Surrey, England",False
8,Name the movie in which the daughter of Noel Harrison plays Violet Trefusis.,Portrait of a Marriage,"produce the answer. We know that the daughter of Noel Harrison is Carey Mulligan. Therefore, the movie in which Carey Mulligan plays Violet Trefusis is...",The Royal Night Out,False
9,What year was the father of the Princes in the Tower born?,1442,produce the answer. We know that the father of the Princes in the Tower was King Edward IV of England. He was born on April...,1442,✔️ [True]


46.15

### 4) Bonus 1: RAG with query generation

As a bonus, let's define a more sophisticated program called `RAG`. This program will:

- Use the LM to generate a search query based on the input question
- Retrieve three passages using our retriever
- Use the LM to generate a final answer using these passages
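
The three steps above can be sketched without any LM or search server by passing in plain functions. Everything here (`rag_answer`, the toy corpus, and the toy query, retrieval, and answer functions) is a hypothetical stand-in meant only to show the data flow.

```python
def rag_answer(question, generate_query, retrieve, generate_answer, k=3):
    """Query generation -> retrieval -> answer generation, mirroring the three steps above."""
    search_query = generate_query(question)
    passages = retrieve(search_query, k)
    return generate_answer(passages, question)


# Hypothetical stand-ins for the LM and the retriever.
CORPUS = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "Lima is the capital of Peru.",
    "Germany is a country in Central Europe.",
]


def toy_generate_query(question):
    return question.rstrip("?").lower()


def toy_retrieve(query, k):
    # Rank passages by how many query words they share (a crude lexical retriever).
    words = set(query.split())

    def overlap(passage):
        return len(words & set(passage.lower().rstrip(".").split()))

    return sorted(CORPUS, key=overlap, reverse=True)[:k]


def toy_generate_answer(passages, question):
    return passages[0].split(" is ")[0]  # pretend the LM reads the top passage


answer = rag_answer("What is the capital of Germany?",
                    toy_generate_query, toy_retrieve, toy_generate_answer)
print(answer)
```

Because each stage is just a function of the previous stage's output, any one of them can be swapped or optimized independently.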

In [None]:
class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()

        # declare three modules: the retriever, a query generator, and an answer generator
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_query = dspy.ChainOfThought("question -> search_query")
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # generate a search query from the question, and use it to retrieve passages
        search_query = self.generate_query(question=question).search_query
        passages = self.retrieve(search_query).passages

        # generate an answer from the passages and the question
        return self.generate_answer(context=passages, question=question)

Out of curiosity, we can evaluate the **uncompiled** (or **zero-shot**) version of this program.

In [None]:
evaluate_hotpot(RAG())

Let's now compile this RAG program. We'll use a slightly more advanced teleprompter (automatic prompt optimizer) this time, which relies on random search.
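
The random-search idea can be illustrated as "sample several candidate demo sets, score each on validation data, keep the best". The sketch below shows only that loop; `random_search`, its parameters, and the toy scoring function are assumptions for illustration, and the real optimizer also bootstraps demonstrations for each candidate.

```python
import random


def random_search(trainset, build_program, score, num_candidates=8, demos_per_program=4, seed=0):
    """Try several random demo subsets and keep the one that scores best on validation."""
    rng = random.Random(seed)
    best_score, best_demos = float("-inf"), []
    for _ in range(num_candidates):
        demos = rng.sample(trainset, k=min(demos_per_program, len(trainset)))
        candidate_score = score(build_program(demos))
        if candidate_score > best_score:
            best_score, best_demos = candidate_score, demos
    return best_demos, best_score


# Toy setup (hypothetical): "programs" are just their demo lists, scored by their sum.
toy_train = list(range(7))  # stand-ins for seven training examples
best_demos, best_score = random_search(toy_train, build_program=lambda demos: demos, score=sum)
print(best_demos, best_score)
```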

In [None]:
teleprompter2 = BootstrapFewShotWithRandomSearch(metric=metric_EM, max_bootstrapped_demos=4, num_candidate_programs=8, num_threads=NUM_THREADS)
rag_compiled = teleprompter2.compile(RAG(), trainset=train, valset=dev)

Let's now evaluate this compiled version of RAG.

In [31]:
evaluate_hotpot(rag_compiled)

Average Metric: 7 / 13  (53.8): 100%|██████████| 13/13 [00:00<00:00, 1185.58it/s]

Average Metric: 7 / 13  (53.8%)





row,question,example_answer,rationale,pred_answer,answer_exact_match
0,Who has a broader scope of profession: E. L. Doctorow or Julia Peterkin?,E. L. Doctorow,"produce the answer. We know from the context that E. L. Doctorow was an American novelist, editor, and professor, while Julia Peterkin was an American...",E. L. Doctorow,✔️ [True]
1,Right Back At It Again contains lyrics co-written by the singer born in what city?,"Gainesville, Florida","produce the answer. We know from the context that the song ""Right Back At It Again"" contains lyrics co-written by the singer born in the...",,False
2,What year was the party of the winner of the 1971 San Francisco mayoral election founded?,1828,"produce the answer. We know from the context that the winner of the 1971 San Francisco mayoral election was Joseph Alioto, who was a member...",1828,✔️ [True]
3,Anthony Dirrell is the brother of which super middleweight title holder?,Andre Dirrell,"produce the answer. We know from the context that Anthony Dirrell is the younger brother of Andre Dirrell, who is a professional boxer and has...",Andre Dirrell,✔️ [True]
4,The sports nutrition business established by Oliver Cookson is based in which county in the UK?,Cheshire,"produce the answer. We know from the context that Oliver Cookson established the sports nutrition business Myprotein, which is based in Cheshire, United Kingdom.",Cheshire,✔️ [True]
5,Find the birth date of the actor who played roles in First Wives Club and Searching for the Elephant.,"February 13, 1980","produce the answer. We know from the context that Oh Dae-gyu was born on May 13, 1968.","May 13, 1968",False
6,Kyle Moran was born in the town on what river?,Castletown River,"produce the answer. We know from the context that Kyle Moran was born in Dundalk, Ireland, and that Blennerville is a suburb of Tralee, County...",River Lee,False
7,"The actress who played the niece in the Priest film was born in what city, country?","Surrey, England","produce the answer. We know from the context that the actress Lily Collins played the niece in the film ""Priest"" and that she was born...","Guildford, Surrey, England",False
8,Name the movie in which the daughter of Noel Harrison plays Violet Trefusis.,Portrait of a Marriage,produce the answer. We know from the context that Cathryn Harrison is the daughter of Noel Harrison and that she is an actress. Violet Trefusis...,Vita & Virginia,False
9,What year was the father of the Princes in the Tower born?,1442,produce the answer. We know from the context that the father of the Princes in the Tower was Edward IV of England. According to historical...,1442,✔️ [True]


53.85

Let's inspect one of the LM calls for this. Focus in particular on the structure of the last few input/output examples in the prompt.

In [None]:
rag_compiled("What year was the party of the winner of the 1971 San Francisco mayoral election founded?")
turbo.inspect_history(n=1)

### 5) Bonus 2: Multi-Hop Retrieval and Reasoning

Let's now build a simple multi-hop program, which will interleave multiple calls to the LM and the retriever.
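
Before looking at the DSPy version, here is the control flow in plain Python: each hop generates a query from the question plus the passages gathered so far, retrieves more passages, and the final answer is produced from the de-duplicated context. All names below (`multihop`, `deduplicate_keep_order`, and the toy functions and index) are hypothetical stand-ins.

```python
def deduplicate_keep_order(items):
    """Drop repeated items while preserving first-seen order."""
    return list(dict.fromkeys(items))


def multihop(question, generate_query, retrieve, generate_answer, hops=2):
    passages = []
    for _ in range(hops):
        # The first hop sees only the question; later hops also see what was retrieved so far.
        query = generate_query(question, deduplicate_keep_order(passages))
        passages += retrieve(query)
    return generate_answer(deduplicate_keep_order(passages), question)


# Toy stand-ins: the second query depends on having some context already.
INDEX = {
    "first": ["passage A", "passage B"],
    "second": ["passage B", "passage C"],
}


def toy_query(question, context):
    return "second" if context else "first"


def toy_retrieve(query):
    return INDEX[query]


def toy_answer(context, question):
    return context  # return the context so we can inspect what the "LM" would see


final_context = multihop("any question", toy_query, toy_retrieve, toy_answer)
print(final_context)  # ['passage A', 'passage B', 'passage C']
```

The overlapping passage is seen only once by the answer step, which keeps the prompt shorter.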

The **TODO** comments below mark the steps of the exercise; the cell already contains the completed solution for each of them.

In [None]:
from dsp.utils.utils import deduplicate

class MultiHop(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()

        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_query = dspy.ChainOfThought("question -> search_query")

        # TODO: Define a dspy.ChainOfThought module with the signature 'context, question -> search_query'.
        self.generate_query_from_context = dspy.ChainOfThought("context, question -> search_query")

        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        passages = []

        search_query = self.generate_query(question=question).search_query
        passages += self.retrieve(search_query).passages

        # TODO (solved below): Call self.generate_query_from_context to generate a second search query.
        # Note: In DSPy, always pass keyword arguments (e.g., context=..., question=...) to the modules to avoid ambiguity.
        # Note 2: Don't forget to access the field .search_query to extract that from the output of the module.
        search_query2 = self.generate_query_from_context(context=deduplicate(passages), question=question).search_query

        # TODO (solved below): Call self.retrieve with the second query and append the results to `passages`.
        passages += self.retrieve(search_query2).passages

        return self.generate_answer(context=deduplicate(passages), question=question)

In [None]:
multihop_compiled = teleprompter2.compile(MultiHop(), trainset=train, valset=dev)

In [30]:
evaluate_hotpot(multihop_compiled, devset=dev)

Average Metric: 9 / 13  (69.2): 100%|██████████| 13/13 [00:00<00:00, 845.04it/s]

Average Metric: 9 / 13  (69.2%)





row,question,example_answer,rationale,pred_answer,answer_exact_match
0,Who has a broader scope of profession: E. L. Doctorow or Julia Peterkin?,E. L. Doctorow,"produce the answer. E. L. Doctorow was an American novelist, editor, and professor, known for his historical fiction works. On the other hand, Julia Peterkin...",E. L. Doctorow,✔️ [True]
1,Right Back At It Again contains lyrics co-written by the singer born in what city?,"Gainesville, Florida","produce the answer. We need to identify the singer who co-wrote the lyrics for the song ""Right Back At It Again.""","The singer born in what city co-wrote the lyrics for ""Right Back At It Again"" is Pierre Tubbs, who was born in the city of...",False
2,What year was the party of the winner of the 1971 San Francisco mayoral election founded?,1828,produce the answer. The winner of the 1971 San Francisco mayoral election was Willie Brown. To find out the year the party of the winner...,1828,✔️ [True]
3,Anthony Dirrell is the brother of which super middleweight title holder?,Andre Dirrell,"produce the answer. Anthony Dirrell is the younger brother of Andre Dirrell, who held the WBC super middleweight title in 2009.",Andre Dirrell,✔️ [True]
4,The sports nutrition business established by Oliver Cookson is based in which county in the UK?,Cheshire,produce the answer. We know from the context that Oliver Cookson established the sports nutrition business Myprotein. The context also mentions that Myprotein is based...,Cheshire,✔️ [True]
5,Find the birth date of the actor who played roles in First Wives Club and Searching for the Elephant.,"February 13, 1980","produce the answer. We need to identify the actor who appeared in both ""First Wives Club"" and ""Searching for the Elephant"" from the provided information.","December 11, 1977",False
6,Kyle Moran was born in the town on what river?,Castletown River,"produce the answer. We know that Kyle Moran was born in Dundalk, Ireland. Dundalk is a town located on the River Castletown.",River Castletown,False
7,"The actress who played the niece in the Priest film was born in what city, country?","Surrey, England",produce the answer. We know from the context that the actress who played the niece in the Priest film is Lily Collins. Lily Collins was...,"Guildford, Surrey, England",False
8,Name the movie in which the daughter of Noel Harrison plays Violet Trefusis.,Portrait of a Marriage,"produce the answer. Cathryn Harrison played the role of Violet Trefusis in the British television miniseries ""Portrait of a Marriage.""",Portrait of a Marriage,✔️ [True]
9,What year was the father of the Princes in the Tower born?,1442,"produce the answer. We know that the father of the Princes in the Tower was Edward IV of England. According to the context, Edward IV...",1442,✔️ [True]


69.23

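Let's now inspect the most recent LM call for one of the questions. With `n=1` this is the final answer-generation step; passing `n=2` to `inspect_history` should also show the second-hop search-query prompt that precedes it.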

In [29]:
multihop_compiled(question="Who purchased the team Michael Schumacher raced for in the 1995 Monaco Grand Prix in 2000?")
turbo.inspect_history(n=1)





Given the fields `context`, `question`, produce the fields `answer`.

---

Question: In what year was the star of To Hell and Back born?
Answer: 1925

Question: The heir to the Du Pont family fortune sponsored what wrestling team?
Answer: Foxcatcher

Question: Who was the director of the 2009 movie featuring Peter Outerbridge as William Easton?
Answer: Kevin Greutert

Question: Who produced the album that included a re-recording of "Lithium"?
Answer: Butch Vig

Question: Which award did the first book of Gary Zukav receive?
Answer: U.S. National Book Award

Question: What documentary about the Gilgo Beach Killer debuted on A&E?
Answer: The Killing Season

Question: Which author is English: John Braine or Studs Terkel?
Answer: John Braine

---

Follow the following format.

Context: ${context}

Question: ${question}

Reasoning: Let's think step by step in order to ${produce the answer}. We ...

Answer: ${answer}

---

Context:
[1] «1995 Monaco Grand Prix | The 1995 Monaco Grand Prix