# DSPy KNN few-shot example
This notebook shows how to implement KNN few-shot prompting in DSPy with the **KNNFewShot** teleprompter, using the HotPotQA dataset as an illustration. Please see [intro.ipynb](../intro.ipynb) for other example use cases of DSPy.


In [1]:
import openai
import dspy
import json

In [2]:
# Read the OpenAI API key from a local credentials file.
with open("creds.json", "r") as creds:
    api_key = json.load(creds)["openai_key"]

In [3]:
# Use GPT-4 as the default language model for every DSPy module.
lm = dspy.OpenAI(model='gpt-4', api_key=api_key, model_type='chat', max_tokens=500)
dspy.settings.configure(lm=lm)

In [4]:
from dspy.datasets import HotPotQA

# Load the dataset.
dataset = HotPotQA(train_seed=1, train_size=20, eval_seed=2023, dev_size=50, test_size=0)

trainset = [x.with_inputs('question') for x in dataset.train]
devset = [x.with_inputs('question') for x in dataset.dev]


In [5]:
train_example = trainset[0]
print(train_example)
print(f"Question: {train_example.question}")
print(f"Answer: {train_example.answer}")

Example({'question': 'At My Window was released by which American singer-songwriter?', 'answer': 'John Townes Van Zandt'}) (input_keys={'question'})
Question: At My Window was released by which American singer-songwriter?
Answer: John Townes Van Zandt


In [6]:
class BasicQA(dspy.Signature):
    """Answer questions with short factoid answers."""

    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

In [7]:
class BasicQABot(dspy.Module):
    def __init__(self):
        super().__init__()

        self.generate = dspy.Predict(BasicQA)

    def forward(self, question):
        prediction = self.generate(question=question)
        return dspy.Prediction(answer=prediction.answer)

In [8]:
# Zero-shot baseline: call the module directly rather than invoking `forward` by hand.
qa_bot = BasicQABot()
pred = qa_bot(question="In the 10th Century A.D. Ealhswith had a son called Æthelweard by which English king?")
pred.answer

'Alfred the Great'

In [9]:
from dspy.teleprompt import KNNFewShot
from dspy.predict.knn import KNN

# Arguments: the KNN retriever class, k=7 neighbors per query, and the pool of training examples.
knn_teleprompter = KNNFewShot(KNN, 7, trainset)
compiled_knn = knn_teleprompter.compile(BasicQABot(), trainset=trainset)
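
The idea behind KNN few-shot selection can be sketched independently of DSPy: for each incoming question, rank the training questions by similarity and keep the top `k` as candidate demonstrations. The snippet below is only an illustration under stated assumptions. `embed`, `cosine`, and `nearest_examples` are hypothetical helpers, and the bag-of-words vectors stand in for whatever embedding model the real retriever uses.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase token counts (an assumption for illustration only)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_examples(query, train_questions, k):
    """Return the k training questions most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        train_questions,
        key=lambda question: cosine(query_vec, embed(question)),
        reverse=True,
    )
    return ranked[:k]


# A tiny pool of training questions taken from the examples in this notebook.
toy_train = [
    "On the coast of what ocean is the birthplace of Diogal Sakho?",
    "Which is taller, the Empire State Building or the Bank of America Tower?",
    "At My Window was released by which American singer-songwriter?",
]

neighbors = nearest_examples(
    "Which building is taller, the Empire State Building or the Chrysler Building?",
    toy_train,
    k=2,
)
print(neighbors)
```

Because the neighbors are chosen per question, each call to the compiled program can see a different set of demonstrations, which is consistent with the bootstrapping log being printed again on every prediction.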

In [10]:
example = devset[0]
pred = compiled_knn(question = example.question)
print("Question: ", example.question)
print("Expected answer: ", example.answer)
print("Prediction: ", pred.answer)

 57%|█████▋    | 4/7 [00:00<00:00, 76.67it/s]

Bootstrapped 4 full traces after 5 examples in round 0.
Question:  Are both Cangzhou and Qionghai in the Hebei province of China?
Expected answer:  no
Prediction:  No





In [11]:
lm.inspect_history(1)





Answer questions with short factoid answers.

---

Follow the following format.

Question: ${question}
Answer: often between 1 and 5 words

---

Question: On the coast of what ocean is the birthplace of Diogal Sakho?
Answer: Atlantic Ocean

---

Question: Which is taller, the Empire State Building or the Bank of America Tower?
Answer: Empire State Building

---

Question: Samantha Cristoforetti and Mark Shuttleworth are both best known for being first in their field to go where?
Answer: Space

---

Question: Which Pakistani cricket umpire who won 3 consecutive ICC umpire of the year awards in 2009, 2010, and 2011 will be in the ICC World Twenty20?
Answer: Aleem Dar

---

Question: What is the code name for the German offensive that started this Second World War engagement on the Eastern Front (a few hundred kilometers from Moscow) between Soviet and German forces, which included 102nd Infantry Division?
Answer: Operation Citadel

---

Question: Which of these publications was most 

In [13]:
from dspy.evaluate.evaluate import Evaluate

# Set up the `evaluate_on_hotpotqa` evaluator over the 50-example dev set.
evaluate_on_hotpotqa = Evaluate(devset=devset, num_threads=1, display_progress=True, display_table=5)

# Evaluate the `compiled_knn` program with the `answer_exact_match` metric.
metric = dspy.evaluate.answer_exact_match

evaluate_on_hotpotqa(compiled_knn, metric)

Average Metric: 20 / 50  (40.0%)


| | question | example_answer | gold_titles | pred_answer | answer_exact_match |
|---|---|---|---|---|---|
| 0 | Are both Cangzhou and Qionghai in the Hebei province of China? | no | {'Qionghai', 'Cangzhou'} | No | ✔️ [True] |
| 1 | Who conducts the draft in which Marc-Andre Fleury was drafted to the Vegas Golden Knights for the 2017-18 season? | National Hockey League | {'2017–18 Pittsburgh Penguins season', '2017 NHL Expansion Draft'} | National Hockey League | ✔️ [True] |
| 2 | The Wings entered a new era, following the retirement of which Canadian retired professional ice hockey player and current general manager of the Tampa Bay... | Steve Yzerman | {'2006–07 Detroit Red Wings season', 'Steve Yzerman'} | Steve Yzerman | ✔️ [True] |
| 3 | What river is near the Crichton Collegiate Church? | the River Tyne | {'Crichton Castle', 'Crichton Collegiate Church'} | Tyne River | ❌ [False] |
| 4 | In the 10th Century A.D. Ealhswith had a son called Æthelweard by which English king? | King Alfred the Great | {'Ealhswith', 'Æthelweard (son of Alfred)'} | Alfred the Great | ❌ [False] |


40.0
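
The results table shows that `No` is accepted for the gold answer `no`, while `Tyne River` and `Alfred the Great` are rejected. A minimal sketch of an exact-match metric with SQuAD-style normalization (lowercasing, stripping punctuation and articles) reproduces that pattern. The helpers `normalize` and `exact_match` are hypothetical, and the assumption that `answer_exact_match` normalizes in this way is not confirmed by this notebook.

```python
import re
import string


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """True only if the normalized strings are identical."""
    return normalize(prediction) == normalize(gold)


# (prediction, gold answer) pairs copied from the results table.
rows = [
    ("No", "no"),
    ("Tyne River", "the River Tyne"),
    ("Alfred the Great", "King Alfred the Great"),
]
results = [exact_match(pred, gold) for pred, gold in rows]
print(results)
```

Under this sketch, word order and extra tokens such as "King" still count as mismatches, so the 40.0 score is a strict lower bound on how often the answers are semantically right.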