# KNN Few-Shot Prompting Demo

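This notebook demonstrates how to use DSPy with a KNN-based few-shot retriever for dynamic prompt construction: for each query, the k training examples nearest to it are selected and inserted into the prompt as demonstrations.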

## Overview

1. **Setup**: Configure API and DSPy.
2. **Load Data**: Load training examples for few-shot retrieval.
3. **Compile Pipeline**: Build a DSPy pipeline using KNN retrieval.
4. **Inference**: Query the pipeline with test questions.
5. **Inspect Prompt**: View the selected few-shot examples for each query.

In [5]:
# Import the pipeline helpers and supporting libraries
import os
import dspy
from src.knn_fewshot_pipeline import (
    QuestionAnswer, compile_knn_dspy_pipeline,
    inference_knn_dspy_pipeline, load_examples,
)

In [None]:

os.environ["OPENAI_API_KEY"] = "OPENAI_API_KEY"  # Placeholder: replace the right-hand string with your actual OpenAI API key
dspy.settings.configure(lm=dspy.LM("openai/gpt-4o-mini"))
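
A key typed into a notebook cell is easy to commit by accident. A minimal sketch of an alternative, assuming the key was exported in the shell before Jupyter started. `require_api_key` is a hypothetical helper, not part of DSPy or of this repository:

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the key stored under `name`.

    Raises RuntimeError if the variable is missing, empty, or still set to
    the placeholder string (the variable's own name).
    """
    key = env.get(name, "").strip()
    if not key or key == name:
        raise RuntimeError(f"Set {name} in your environment before running this notebook.")
    return key
```

Calling `require_api_key()` at the top of the notebook fails fast with a clear message instead of surfacing an authentication error on the first model call.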

## Build and compile the KNN few-shot prompting pipeline

In [10]:
example_path = "data/examples.json"
embedder_model_name = "all-MiniLM-L6-v2"
k = 3

# Load training examples
trainset = load_examples(example_path)

# Create DSPy module for question answering
qa_module = dspy.Predict(QuestionAnswer)

# Compile the KNN few-shot pipeline
compiled_qa = compile_knn_dspy_pipeline(trainset, qa_module, k, embedder_model_name)
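
To make the retrieval step concrete, here is a minimal, dependency-free sketch of KNN demonstration selection. It is not the implementation behind `compile_knn_dspy_pipeline`: the pipeline above is configured with the `all-MiniLM-L6-v2` embedder, whereas this sketch substitutes a toy bag-of-words vector so it runs without any model. The names `embed`, `cosine`, and `select_demos` and the three-item training set are illustrative assumptions.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a sentence embedder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_demos(query, trainset, k=3):
    """Return the k training examples whose questions are most similar to the query."""
    q = embed(query)
    ranked = sorted(trainset, key=lambda ex: cosine(q, embed(ex["question"])), reverse=True)
    return ranked[:k]


# Illustrative data only; the real examples come from data/examples.json.
toy_trainset = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What language is primarily spoken in Brazil?", "answer": "Portuguese"},
    {"question": "What is the currency of Japan?", "answer": "Yen"},
]

for demo in select_demos("What is the capital of Belgium?", toy_trainset, k=2):
    print(demo["question"], "->", demo["answer"])
```

With word overlap as the similarity signal, the France question ranks first for the Belgium query because it shares the most tokens. A sentence embedder plays the same role but captures meaning rather than surface overlap.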

## Query test examples

### Example 1: Capital of Belgium

In [11]:
question = "What is the capital of Belgium?"
answer = inference_knn_dspy_pipeline(compiled_qa, question=question)
print(f"Q: {question}")
print(f"A: {answer}")

  0%|          | 0/3 [00:00<?, ?it/s]

100%|██████████| 3/3 [00:00<00:00, 17.37it/s]

Bootstrapped 3 full traces after 2 examples for up to 1 rounds, amounting to 3 attempts.
Q: What is the capital of Belgium?
A: Prediction(
    answer='Brussels'
)

#### Few-shot examples selected for this prompt

In [12]:
print(compiled_qa.history[-1])

{'prompt': None, 'messages': [{'role': 'system', 'content': 'Your input fields are:\n1. `question` (str): input text to answer the question\nYour output fields are:\n1. `answer` (str): the answer to the question\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\n[[ ## question ## ]]\n{question}\n\n[[ ## answer ## ]]\n{answer}\n\n[[ ## completed ## ]]\nIn adhering to this structure, your objective is: \n        Answer the question.'}, {'role': 'user', 'content': '[[ ## question ## ]]\nWhat is the capital of France?'}, {'role': 'assistant', 'content': '[[ ## answer ## ]]\nParis\n\n[[ ## completed ## ]]\n'}, {'role': 'user', 'content': '[[ ## question ## ]]\nWhat language is primarily spoken in Brazil?'}, {'role': 'assistant', 'content': '[[ ## answer ## ]]\nPortuguese\n\n[[ ## completed ## ]]\n'}, {'role': 'user', 'content': '[[ ## question ## ]]\nWhat is the currency of Japan?'}, {'role': 'assistant', 'content': '[[ ## answer ## ]]\nYen\
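
The raw history entry is hard to scan. As a small sketch, assuming the entry has the shape printed above (a dict with a `messages` list in which user/assistant pairs carry `[[ ## question ## ]]` and `[[ ## answer ## ]]` markers), the hypothetical helper `extract_demos` pulls out the question/answer pairs used as demonstrations. It is not a DSPy API.

```python
import re

# Matches "[[ ## name ## ]]\nvalue" up to the next marker or the end of the string.
FIELD = re.compile(r"\[\[ ## (\w+) ## \]\]\n(.*?)(?=\n\n\[\[ ## |\Z)", re.DOTALL)


def parse_fields(content):
    """Map each field marker in a message body to its stripped value."""
    return {name: value.strip() for name, value in FIELD.findall(content)}


def extract_demos(entry):
    """Return (question, answer) pairs for each user message answered by an assistant message.

    The final user message (the live query) has no assistant reply in
    `messages`, so it is skipped automatically.
    """
    msgs = [m for m in entry["messages"] if m["role"] != "system"]
    demos = []
    for user, assistant in zip(msgs, msgs[1:]):
        if user["role"] == "user" and assistant["role"] == "assistant":
            question = parse_fields(user["content"]).get("question")
            answer = parse_fields(assistant["content"]).get("answer")
            demos.append((question, answer))
    return demos
```

In this notebook the call would be `extract_demos(compiled_qa.history[-1])`, provided the history entry keeps the format shown above.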

### Example 2: Language in Japan

In [13]:
question = "What language is primarily spoken in Japan?"
answer = inference_knn_dspy_pipeline(compiled_qa, question=question)
print(f"Q: {question}")
print(f"A: {answer}")

  0%|          | 0/3 [00:00<?, ?it/s]

100%|██████████| 3/3 [00:00<00:00, 21.01it/s]

Bootstrapped 3 full traces after 2 examples for up to 1 rounds, amounting to 3 attempts.
Q: What language is primarily spoken in Japan?
A: Prediction(
    answer='Japanese'
)

#### Few-shot examples selected for this prompt

In [14]:
print(compiled_qa.history[-1])

{'prompt': None, 'messages': [{'role': 'system', 'content': 'Your input fields are:\n1. `question` (str): input text to answer the question\nYour output fields are:\n1. `answer` (str): the answer to the question\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\n[[ ## question ## ]]\n{question}\n\n[[ ## answer ## ]]\n{answer}\n\n[[ ## completed ## ]]\nIn adhering to this structure, your objective is: \n        Answer the question.'}, {'role': 'user', 'content': '[[ ## question ## ]]\nWhat language is primarily spoken in Brazil?'}, {'role': 'assistant', 'content': '[[ ## answer ## ]]\nPortuguese\n\n[[ ## completed ## ]]\n'}, {'role': 'user', 'content': '[[ ## question ## ]]\nWhat is the currency of Japan?'}, {'role': 'assistant', 'content': '[[ ## answer ## ]]\nYen\n\n[[ ## completed ## ]]\n'}, {'role': 'user', 'content': '[[ ## question ## ]]\nWhat is the largest ocean on Earth?'}, {'role': 'assistant', 'content': '[[ ## answer ## ]]\nP