# Multi-hop Question Answering

## Configure LM and RM

In [1]:
import dspy

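# Set up the LM. Strip whitespace so a trailing newline in the key file is not sent as part of the key.
OPENAI_API_KEY = open("../.secrets/openai-api_key.txt").read().strip()
turbo = dspy.OpenAI(model='gpt-3.5-turbo-instruct', max_tokens=250, api_key=OPENAI_API_KEY)

# Set up the RM: a ColBERTv2 server that indexes Wikipedia 2017 abstracts.
colbertv2 = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')

# Make this language model and retrieval model the DSPy defaults.
dspy.settings.configure(lm=turbo, rm=colbertv2)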


## Load dataset

In [2]:
from dspy.datasets import HotPotQA

# Load the dataset.
dataset = HotPotQA(train_seed=1, train_size=20, eval_seed=2023, dev_size=50, test_size=0)

# Tell DSPy that the 'question' field is the input. Any other fields are labels and/or metadata.
trainset = [x.with_inputs('question') for x in dataset.train]
devset = [x.with_inputs('question') for x in dataset.dev]

len(trainset), len(devset)

(20, 50)

## Signature

In [3]:
class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""

    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

Next, define a second Signature for the "hop" behavior: given some context and a question, generate a search query that looks for the missing information.

In [6]:
class GenerateSearchQuery(dspy.Signature):
    """Write a simple search query that will help answer a complex question."""

    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    query = dspy.OutputField()

## Build Pipeline

In [4]:
from dsp.utils import deduplicate

class MultiHopRAG(dspy.Module):
    def __init__(self, passages_per_hop=3, max_hops=2):
        super().__init__()

        self.generate_query = [dspy.ChainOfThought(GenerateSearchQuery) for _ in range(max_hops)]
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)
        self.max_hops = max_hops
    
    def forward(self, question):
        context = []
        
        for hop in range(self.max_hops):
            query = self.generate_query[hop](context=context, question=question).query
            passages = self.retrieve(query).passages
            context = deduplicate(context + passages)

        pred = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=pred.answer)
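
To make the control flow of `forward` easier to follow, here is a minimal sketch of the same hop loop in plain Python, with no LM or retrieval server involved. Everything in it is an assumption made for illustration: the toy corpus, `toy_retrieve`, `scripted_query`, and `multi_hop_context` are hypothetical stand-ins, and the local `deduplicate` only assumes that the imported helper removes repeats while keeping order.

```python
import re
from typing import Callable, List

# Hypothetical toy corpus in the same "title | text" shape as the retrieved passages.
TOY_PASSAGES = [
    "Alice Example | Alice Example inherited Foo Castle.",
    "Foo Castle | Foo Castle is a tower house with five storeys.",
    "Bar Bridge | Bar Bridge crosses the river.",
]


def deduplicate(items: List[str]) -> List[str]:
    """Drop repeated items while keeping first-seen order (assumed helper behavior)."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


def toy_retrieve(query: str, k: int) -> List[str]:
    """Return up to k passages that share at least one word with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))
    hits = [p for p in TOY_PASSAGES if query_words & set(re.findall(r"\w+", p.lower()))]
    return hits[:k]


def scripted_query(context: List[str], question: str) -> str:
    """Stand-in for the LM call: search for the person first, then for the castle."""
    return "Alice Example" if not context else "Foo Castle storeys"


def multi_hop_context(
    question: str,
    generate_query: Callable[[List[str], str], str],
    retrieve: Callable[[str, int], List[str]],
    max_hops: int = 2,
    k: int = 3,
) -> List[str]:
    """Accumulate passages over several hops; each query can use what was found so far."""
    context: List[str] = []
    for _ in range(max_hops):
        query = generate_query(context, question)
        passages = retrieve(query, k)
        context = deduplicate(context + passages)
    return context


context = multi_hop_context(
    "How many storeys are in the castle that Alice Example inherited?",
    scripted_query,
    toy_retrieve,
)
for passage in context:
    print(passage)
```

The second hop retrieves the first passage again, so de-duplication keeps the accumulated context from filling up with repeats.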

In [7]:
# Ask this multi-hop RAG program any question you like.
my_question = "How many storeys are in the castle that David Gregory inherited?"

mhrag = MultiHopRAG()  # uncompiled = zero-shot program
pred = mhrag(my_question)

# Print the contexts and the answer.
print(f"Question: {my_question}")
print(f"Predicted Answer: {pred.answer}")
print(f"Retrieved Contexts (truncated): {[c[:200] + '...' for c in pred.context]}")

Question: How many storeys are in the castle that David Gregory inherited?
Predicted Answer: Unknown
Retrieved Contexts (truncated): ['David Gregory (physician) | David Gregory (20 December 1625 – 1720) was a Scottish physician and inventor. His surname is sometimes spelt as Gregorie, the original Scottish spelling. He inherited Kinn...', 'David Gregory (footballer, born 1970) | Born in Polstead, Gregory began his career at Ipswich Town, making 32 appearances between 1987–1995. He made two appearances on loan at Hereford United and thre...', 'David Gregory (historian) | David Gregory (1696–1767) was an English churchman and academic, Dean of Christ Church, Oxford and the first Regius Professor of Modern History at Oxford....', 'Gregory III, Count of Tusculum | Gregory III (died about 1108) was the son of Gregory II. He was the Count of Tusculum from 1058, when his father died. He is usually considered to stand at the fore of...']


In [8]:
turbo.inspect_history(n=3)




Write a simple search query that will help answer a complex question.

---

Follow the following format.

Context: may contain relevant facts

Question: ${question}

Reasoning: Let's think step by step in order to ${produce the query}. We ...

Query: ${query}

---

Context: N/A

Question: How many storeys are in the castle that David Gregory inherited?

Reasoning: Let's think step by step in order to find the answer. We know that David Gregory inherited a castle, so we need to find information about the castle and its features.

Query: SELECT COUNT(storeys) FROM castle WHERE owner = 'David Gregory'





Write a simple search query that will help answer a complex question.

---

Follow the following format.

Context: may contain relevant facts

Question: ${question}

Reasoning: Let's think step by step in order to ${produce the query}. We ...

Query: ${query}

---

Context:
[1] «David Gregory (physician) | David Gregory (20 December 1625 – 1720) was a Scottish physician and 

## Optimize the Pipeline

### Set Up Metric

A prediction passes the metric only if all four of the following conditions hold:

1. The predicted answer matches the gold answer.
2. The retrieved context contains the gold answer.
3. None of the generated queries is rambling (i.e., none exceeds 100 characters in length).
4. None of the generated queries roughly repeats an earlier one (i.e., none reaches an F1 score of 0.8 or higher against an earlier query).
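
Conditions 3 and 4 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the DSPy implementation: `token_f1` and `queries_are_acceptable` are hypothetical helpers, tokenization is a simple lowercase word split, and the sketch compares queries only with each other.

```python
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lowercase word tokens (a simplified tokenizer chosen for this sketch)."""
    return re.findall(r"\w+", text.lower())


def token_f1(a: str, b: str) -> float:
    """Token-level F1 between two strings, counting repeated tokens."""
    tokens_a, tokens_b = tokens(a), tokens(b)
    overlap = sum((Counter(tokens_a) & Counter(tokens_b)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(tokens_a)
    recall = overlap / len(tokens_b)
    return 2 * precision * recall / (precision + recall)


def queries_are_acceptable(queries: List[str], max_chars: int = 100, max_f1: float = 0.8) -> bool:
    """Reject rambling queries and queries that roughly repeat an earlier one."""
    for idx, query in enumerate(queries):
        if len(query) > max_chars:
            return False  # condition 3: query is too long
        if any(token_f1(query, earlier) >= max_f1 for earlier in queries[:idx]):
            return False  # condition 4: near-duplicate of an earlier query
    return True


print(queries_are_acceptable(["castle inherited by David Gregory", "number of storeys in the castle"]))
print(queries_are_acceptable(["David Gregory castle storeys", "storeys David Gregory castle"]))
```

The first call accepts two queries that share only one word; the second rejects a reordered copy of the same query, because token-level F1 ignores word order.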

In [10]:
def validate_context_and_answer_and_hops(example, pred, trace=None):
    if not dspy.evaluate.answer_exact_match(example, pred): return False
    if not dspy.evaluate.answer_passage_match(example, pred): return False

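    # Collect the question plus every generated query. `trace` defaults to None, so guard against that.
    hops = [example.question] + [outputs.query for *_, outputs in (trace or []) if 'query' in outputs]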

    if max([len(h) for h in hops]) > 100: return False
    if any(dspy.evaluate.answer_exact_match_str(hops[idx], hops[:idx], frac=0.8) for idx in range(2, len(hops))): return False

    return True

## Compile RAG model

In [11]:
from dspy.teleprompt import BootstrapFewShot

teleprompter = BootstrapFewShot(metric=validate_context_and_answer_and_hops)
compiled_baleen = teleprompter.compile(MultiHopRAG(), teacher=MultiHopRAG(passages_per_hop=2), trainset=trainset)

100%|██████████| 20/20 [02:07<00:00,  6.37s/it]

Bootstrapped 0 full traces after 20 examples in round 0.
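
The log line above reports that no trace passed the metric. As a rough mental model, bootstrapping can be thought of as running a teacher program on each training example and keeping only the traces that the metric accepts. The sketch below illustrates that idea with toy stand-ins; it is not the `BootstrapFewShot` implementation, and `bootstrap_demos`, `toy_teacher`, `max_demos`, and both metrics are hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Tuple

Example = Dict[str, str]
Trace = List[Tuple[str, str]]


def bootstrap_demos(
    trainset: List[Example],
    teacher: Callable[[Example], Tuple[str, Trace]],
    metric: Callable[[Example, str, Trace], bool],
    max_demos: int = 4,  # hypothetical cap on collected demonstrations
) -> List[Tuple[Example, Trace]]:
    """Keep (example, trace) pairs whose prediction the metric accepts."""
    demos = []
    for example in trainset:
        if len(demos) >= max_demos:
            break
        prediction, trace = teacher(example)
        if metric(example, prediction, trace):
            demos.append((example, trace))
    return demos


def toy_teacher(example: Example) -> Tuple[str, Trace]:
    """Toy teacher that always answers in a full sentence."""
    prediction = f"The answer is {example['answer']}."
    return prediction, [("generate_answer", prediction)]


def strict_metric(example: Example, prediction: str, trace: Trace = None) -> bool:
    """Accept only an exact string match with the gold answer."""
    return prediction == example["answer"]


def lenient_metric(example: Example, prediction: str, trace: Trace = None) -> bool:
    """Accept any prediction that contains the gold answer."""
    return example["answer"] in prediction


toy_trainset = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What colour is a clear daytime sky?", "answer": "blue"},
]

print(len(bootstrap_demos(toy_trainset, toy_teacher, strict_metric)))   # prints 0
print(len(bootstrap_demos(toy_trainset, toy_teacher, lenient_metric)))  # prints 2
```

Under this model, a metric that combines several strict conditions can reject every trace from a small training set, which would leave the compiled program without demonstrations. Whether that is what happened in the run above is not established by the log alone.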





In [13]:
from dspy.evaluate.evaluate import Evaluate

# Define metric to check if we retrieved the correct documents
def gold_passages_retrieved(example, pred, trace=None):
    gold_titles = set(map(dspy.evaluate.normalize_text, example["gold_titles"]))
    found_titles = set(
        map(dspy.evaluate.normalize_text, [c.split(" | ")[0] for c in pred.context])
    )
    return gold_titles.issubset(found_titles)

# Set up the `evaluate_on_hotpotqa` function. We'll use this many times below.
evaluate_on_hotpotqa = Evaluate(devset=devset, num_threads=1, display_progress=True, display_table=5)

uncompiled_baleen_retrieval_score = evaluate_on_hotpotqa(mhrag, metric=gold_passages_retrieved)

compiled_baleen_retrieval_score = evaluate_on_hotpotqa(compiled_baleen, metric=gold_passages_retrieved)

print(f"## Retrieval Score for uncompiled Baleen: {uncompiled_baleen_retrieval_score}")
print(f"## Retrieval Score for compiled Baleen: {compiled_baleen_retrieval_score}")

Average Metric: 15 / 50  (30.0): 100%|██████████| 50/50 [05:02<00:00,  6.05s/it]


| | question | example_answer | gold_titles | context | pred_answer | gold_passages_retrieved |
|---|---|---|---|---|---|---|
| 0 | Are both Cangzhou and Qionghai in the Hebei province of China? | no | {'Qionghai', 'Cangzhou'} | ['Cangzhou \| Cangzhou () is a prefecture-level city in eastern Hebei province, People\'s Republic of China. At the 2010 census, Cangzhou\'s built-up ("or metro") area... | Yes | False |
| 1 | Who conducts the draft in which Marc-Andre Fleury was drafted to the Vegas Golden Knights for the 2017-18 season? | National Hockey League | {'2017–18 Pittsburgh Penguins season', '2017 NHL Expansion Draft'} | ["2017 NHL Expansion Draft \| The 2017 NHL Expansion Draft was an expansion draft conducted by the National Hockey League on June 18–20, 2017 to... | National Hockey League (NHL) | ✔️ [True] |
| 2 | The Wings entered a new era, following the retirement of which Canadian retired professional ice hockey player and current general manager of the Tampa Bay... | Steve Yzerman | {'Steve Yzerman', '2006–07 Detroit Red Wings season'} | ['Steve Yzerman \| Stephen Gregory "Steve" Yzerman ( ; born May 9, 1965) is a Canadian retired professional ice hockey player and current general manager... | Steve Yzerman | False |
| 3 | What river is near the Crichton Collegiate Church? | the River Tyne | {'Crichton Castle', 'Crichton Collegiate Church'} | ['River Esk, Lothian \| The River Esk (Brythonic: Isca (water), Scottish Gaelic: "Easg (water)" ), also called the Lothian Esk, is a river that flows... | River Tyne | ✔️ [True] |
| 4 | In the 10th Century A.D. Ealhswith had a son called Æthelweard by which English king? | King Alfred the Great | {'Ealhswith', 'Æthelweard (son of Alfred)'} | ["Beonna of East Anglia \| Beonna (also known as Beorna) was King of East Anglia from 749. He is notable for being the first East... | Alfred the Great | ✔️ [True] |


Average Metric: 15 / 50  (30.0): 100%|██████████| 50/50 [01:27<00:00,  1.75s/it]


| | question | example_answer | gold_titles | context | pred_answer | gold_passages_retrieved |
|---|---|---|---|---|---|---|
| 0 | Are both Cangzhou and Qionghai in the Hebei province of China? | no | {'Qionghai', 'Cangzhou'} | ['Cangzhou \| Cangzhou () is a prefecture-level city in eastern Hebei province, People\'s Republic of China. At the 2010 census, Cangzhou\'s built-up ("or metro") area... | No | False |
| 1 | Who conducts the draft in which Marc-Andre Fleury was drafted to the Vegas Golden Knights for the 2017-18 season? | National Hockey League | {'2017–18 Pittsburgh Penguins season', '2017 NHL Expansion Draft'} | ["2017 NHL Expansion Draft \| The 2017 NHL Expansion Draft was an expansion draft conducted by the National Hockey League on June 18–20, 2017 to... | NHL | ✔️ [True] |
| 2 | The Wings entered a new era, following the retirement of which Canadian retired professional ice hockey player and current general manager of the Tampa Bay... | Steve Yzerman | {'Steve Yzerman', '2006–07 Detroit Red Wings season'} | ['Steve Yzerman \| Stephen Gregory "Steve" Yzerman ( ; born May 9, 1965) is a Canadian retired professional ice hockey player and current general manager... | Steve Yzerman | False |
| 3 | What river is near the Crichton Collegiate Church? | the River Tyne | {'Crichton Castle', 'Crichton Collegiate Church'} | ['River Esk, Lothian \| The River Esk (Brythonic: Isca (water), Scottish Gaelic: "Easg (water)" ), also called the Lothian Esk, is a river that flows... | River Esk | ✔️ [True] |
| 4 | In the 10th Century A.D. Ealhswith had a son called Æthelweard by which English king? | King Alfred the Great | {'Ealhswith', 'Æthelweard (son of Alfred)'} | ["Beonna of East Anglia \| Beonna (also known as Beorna) was King of East Anglia from 749. He is notable for being the first East... | Alfred the Great | ✔️ [True] |


## Retrieval Score for uncompiled Baleen: 30.0
## Retrieval Score for compiled Baleen: 30.0
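
The retrieval metric used for these scores comes down to a set-inclusion test on passage titles: every gold title has to appear among the titles of the retrieved passages. Here is a minimal, self-contained sketch of that test. `normalize_title` and `titles_retrieved` are hypothetical names, the normalizer is a simplified stand-in for `dspy.evaluate.normalize_text`, and the passage text marked as a placeholder is invented for illustration.

```python
import string
from typing import Iterable, List


def normalize_title(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace (simplified normalizer)."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    return " ".join(text.split())


def titles_retrieved(gold_titles: Iterable[str], passages: List[str]) -> bool:
    """True only if every gold title is the title of some retrieved passage."""
    gold = {normalize_title(title) for title in gold_titles}
    # Passages are formatted as "title | text", so the title is the part before " | ".
    found = {normalize_title(passage.split(" | ")[0]) for passage in passages}
    return gold.issubset(found)


gold = {"Qionghai", "Cangzhou"}
only_one = ["Cangzhou | Cangzhou () is a prefecture-level city in eastern Hebei province..."]
both = only_one + ["Qionghai | (placeholder passage text)"]

print(titles_retrieved(gold, only_one))  # prints False
print(titles_retrieved(gold, both))      # prints True
```

Because the test requires all gold titles, a run that retrieves only one of the two supporting passages scores False for that example even when the predicted answer is correct.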
