In [7]:
import os
import getpass

if "OPENAI_API_KEY" not in os.environ:
    key = getpass.getpass("OpenAI API key: ")
    if key:
        os.environ["OPENAI_API_KEY"] = key
    else:
        print("No key entered. Set OPENAI_API_KEY in this notebook or in your shell.")

In [2]:
import dspy
import orjson
from dspy.utils import download
from dspy.evaluate import SemanticF1




In [20]:
lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)

In [3]:
download("https://huggingface.co/dspy/cache/resolve/main/ragqa_arena_tech_corpus.jsonl")

Downloading 'ragqa_arena_tech_corpus.jsonl'...


In [None]:
# %pip install -U faiss-cpu

Collecting faiss-cpu
  Downloading faiss_cpu-1.13.0-cp39-abi3-macosx_14_0_arm64.whl (3.4 MB)
[K     |████████████████████████████████| 3.4 MB 3.4 MB/s eta 0:00:01
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.13.0
You should consider upgrading via the '/Users/eddiej@nisos.com/mespace/dspy/dspy-venv/bin/python -m pip install --upgrade pip' command.[0m
Note: you may need to restart the kernel to use updated packages.


In [8]:
max_characters = 6000  # for truncating >99th percentile of documents
topk_docs_to_retrieve = 5  # number of documents to retrieve per search query

with open("ragqa_arena_tech_corpus.jsonl") as f:
    corpus = [orjson.loads(line)['text'][:max_characters] for line in f]
    print(f"Loaded {len(corpus)} documents. Will encode them below.")

embedder = dspy.Embedder('openai/text-embedding-3-small', dimensions=512)
search = dspy.retrievers.Embeddings(embedder=embedder, corpus=corpus, k=topk_docs_to_retrieve)

Loaded 28436 documents. Will encode them below.
Training a 32-byte FAISS index with 337 partitions, based on 28436 x 512-dim embeddings


In [19]:
search("what is Android?")

Prediction(
    passages=['The Short Answer Theoretically, all devices that meet Androids minimum requirements can run Android, its just a matter of customizing Android for the device. The Long Answer While Android is open source and can be modified to suit many devices, firmware and hardware drivers are most often not made readily available -- especially not the source code. Android wont run on a device without drivers for that specific device, so this means that you cant simply compile the code for Android and run it on your phone. Android is a very different operating system than other phone platforms; Android and Windows Phone 7, for example, are just as different as Ubuntu and Windows 7 for the PC. This means that even if you have WP7 drivers for your device, those drivers wont work on Android. Youll have to modify those drivers to be compatible with Android, and you may need to reverse-engineer a lot of code. This is very difficult and time-consuming, and sometimes even a team of

In [15]:
class RAG(dspy.Module):
    def __init__(self):
        self.respond = dspy.ChainOfThought('context, question -> response')
    def __call__(self, question: str) -> str:
        context = search(question)
        return self.respond(context=context, question=question)




In [21]:

rag = RAG()
rag(question="what is the capital of France?")

Prediction(
    reasoning='The context provided does not contain any information regarding the capital of France. However, based on general knowledge, the capital of France is Paris.',
    response='The capital of France is Paris.'
)

In [22]:
rag(question="what are high memory and low memory on linux?")

Prediction(
    reasoning="High memory and low memory in Linux refer to two distinct segments of the system's memory that are used for different purposes. Low memory is the portion of memory that the Linux kernel can access directly and is statically mapped at boot time. This memory is used for kernel operations and data structures. High memory, on the other hand, is not permanently mapped in the kernel's address space and is used primarily for user-space applications. When the kernel needs to access high memory, it must first map it into its address space temporarily. This distinction is crucial for managing memory efficiently in a 32-bit architecture, where the kernel must handle both user processes and hardware devices while ensuring that user-space applications cannot directly access kernel memory.",
    response="In Linux, low memory is the segment of memory that the kernel can access directly, while high memory is the segment that is not permanently mapped in the kernel's address

In [23]:
dspy.inspect_history()





[34m[2026-02-11T22:57:04.604594][0m

[31mSystem message:[0m

Your input fields are:
1. `context` (str): 
2. `question` (str):
Your output fields are:
1. `reasoning` (str): 
2. `response` (str):
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## context ## ]]
{context}

[[ ## question ## ]]
{question}

[[ ## reasoning ## ]]
{reasoning}

[[ ## response ## ]]
{response}

[[ ## completed ## ]]
In adhering to this structure, your objective is: 
        Given the fields `context`, `question`, produce the fields `response`.


[31mUser message:[0m

[[ ## context ## ]]
Prediction(
    passages=['As far as I remember, High Memory is used for application space and Low Memory for the kernel. Advantage is that (user-space) applications cant access kernel-space memory.', 'HIGHMEM is a range of kernels memory space, but it is NOT memory you access but its a place where you put what you want to access. A typical 32bit Linux virtual memory 