# Building optimized RAG with LlamaIndex + DSPy

This notebook provides a comprehensive overview of LlamaIndex + DSPy integrations.

We show **three** core integrations:
1. **Build and optimize Query Pipelines with DSPy predictors**: The first section shows how to write DSPy signatures that define LLM inputs and outputs, port those components into LlamaIndex query pipelines, and then optimize the entire system end to end.

2. **Build and optimize Query Pipelines with Existing Prompts**: Instead of writing DSPy signatures, you can just define a LlamaIndex prompt template, and our converter will auto-optimize it for you.

3. **Port over DSPy-Optimized Prompts to any LlamaIndex Module**: Our `DSPyPromptTemplate` translates a prompt optimized through DSPy into a template usable by any LlamaIndex module that requires prompts.

In [14]:
!pip uninstall llama-index -y
!pip install "llama-index==0.11.6"

Found existing installation: llama-index 0.11.6
Uninstalling llama-index-0.11.6:
  Successfully uninstalled llama-index-0.11.6


Collecting llama-index==0.11.6
  Using cached llama_index-0.11.6-py3-none-any.whl.metadata (11 kB)
Using cached llama_index-0.11.6-py3-none-any.whl (6.8 kB)
Installing collected packages: llama-index
Successfully installed llama-index-0.11.6


In [15]:
from dotenv import load_dotenv
import os

# Load environment variables from the .env file
load_dotenv()
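# Fail early with a clear message if the key is missing or empty
# (calling len() on a missing key would raise a TypeError instead).
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"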

## Setup

Configure the LLM for DSPy (note: this is separate from the LLM settings that LlamaIndex uses) and define the answer signature.

In [16]:
import dspy

turbo = dspy.OpenAI(model='gpt-3.5-turbo')
dspy.settings.configure(lm=turbo)

In [17]:
import dspy

class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""

    context_str = dspy.InputField(desc="contains relevant facts")
    query_str = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")
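
To build some intuition for what a signature describes, here is a rough plain-Python sketch of how a docstring plus input/output fields could be laid out as a prompt. The `Field` class and `render_prompt` helper are hypothetical and only for illustration; this is not DSPy's actual prompt formatting.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical stand-in for a signature field (not a DSPy class)."""
    name: str
    desc: str = ""
    is_input: bool = True


def render_prompt(instructions: str, fields: list[Field], values: dict[str, str]) -> str:
    """Lay out the instructions, the filled-in inputs, then the open output slots."""
    lines = [instructions, ""]
    for f in fields:
        if f.is_input:
            lines.append(f"{f.name}: {values[f.name]}")
        else:
            hint = f" ({f.desc})" if f.desc else ""
            lines.append(f"{f.name}{hint}:")
    return "\n".join(lines)


# Mirrors the fields of GenerateAnswer above.
fields = [
    Field("context_str", "contains relevant facts"),
    Field("query_str"),
    Field("answer", "often between 1 and 5 words", is_input=False),
]
prompt = render_prompt(
    "Answer questions with short factoid answers.",
    fields,
    {"context_str": "YC funds startups in batches.", "query_str": "What does YC fund?"},
)
print(prompt)
```

The point is only that the field names (`context_str`, `query_str`, `answer`) are the contract: whatever feeds this signature later has to supply values under exactly those names.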

## [Part 1] Build and Optimize a Query Pipeline with DSPy Modules

Use our DSPy query components to plug in DSPy prompts/LLMs, and stitch them together with our query pipeline abstraction.

Any query pipeline can be plugged into our `LlamaIndexModule`. We can then let DSPy optimize the entire thing end to end.

#### Load Data, Build Index

In [18]:
# Download the Paul Graham essay that we use as the example corpus

!wget https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt -O paul_graham_essay.txt

--2024-09-06 15:10:38--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8002::154, 2606:50c0:8003::154, 2606:50c0:8001::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8002::154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘paul_graham_essay.txt’


2024-09-06 15:10:39 (2.01 MB/s) - ‘paul_graham_essay.txt’ saved [75042/75042]



In [19]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

reader = SimpleDirectoryReader(input_files=["paul_graham_essay.txt"])
docs = reader.load_data()

index = VectorStoreIndex.from_documents(docs)

In [20]:
retriever = index.as_retriever(similarity_top_k=2)

#### Build Query Pipeline

Replace the synthesis step with the DSPy component. Make sure the input field names of `GenerateAnswer` match the keys the pipeline links into it (`query_str` and `context_str`).

In [21]:
from llama_index.core.query_pipeline import QueryPipeline as QP, InputComponent, FnComponent
from dspy.predict.llamaindex import DSPyComponent, LlamaIndexModule

dspy_component = DSPyComponent(
    dspy.Predict(GenerateAnswer)
)

retriever_post = FnComponent(
    lambda contexts: "\n\n".join([n.get_content() for n in contexts])
)


p = QP(verbose=True)
p.add_modules(
    {
        "input": InputComponent(),
        "retriever": retriever,
        "retriever_post": retriever_post,
        "synthesizer": dspy_component,
    }
)
p.add_link("input", "retriever")
p.add_link("retriever", "retriever_post")
p.add_link("input", "synthesizer", dest_key="query_str")
p.add_link("retriever_post", "synthesizer", dest_key="context_str")


dspy_qp = LlamaIndexModule(p)
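
The data flow wired up by the `add_link` calls above can be sketched in plain Python. This is only an illustration of the graph shape (input feeds both the retriever and the synthesizer; the retriever output is joined into a single string first). The `fake_retriever` and `fake_synthesizer` functions and the tiny corpus are hypothetical stand-ins, not LlamaIndex or DSPy code.

```python
def fake_retriever(query: str, top_k: int = 2) -> list[str]:
    """Hypothetical retriever: rank a toy corpus by word overlap with the query."""
    corpus = [
        "The author funded startups in YC batches.",
        "The author painted still lifes in New York.",
        "The author wrote essays about Lisp.",
    ]
    query_words = set(query.lower().split())
    ranked = sorted(corpus, key=lambda doc: -len(query_words & set(doc.lower().split())))
    return ranked[:top_k]


def join_contexts(contexts: list[str]) -> str:
    """Same role as retriever_post: merge retrieved chunks into one string."""
    return "\n\n".join(contexts)


def fake_synthesizer(query_str: str, context_str: str) -> dict:
    """Hypothetical synthesizer: just echoes the top-ranked chunk as the answer."""
    return {"answer": context_str.split("\n\n")[0]}


def run_pipeline(query_str: str) -> dict:
    contexts = fake_retriever(query_str)           # input -> retriever
    context_str = join_contexts(contexts)          # retriever -> retriever_post
    return fake_synthesizer(                       # input + retriever_post -> synthesizer
        query_str=query_str, context_str=context_str
    )


result = run_pipeline("what did the author do in YC")
print(result["answer"])
```

Note how the synthesizer receives two keyword arguments, which is what the `dest_key` arguments express in the real pipeline.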

In [22]:
output = dspy_qp(query_str="what did the author do in YC")

[1;3;38;2;155;135;227m> Running module input with input: 
query_str: what did the author do in YC

[0m[1;3;38;2;155;135;227m> Running module retriever with input: 
input: what did the author do in YC

[0m[1;3;38;2;155;135;227m> Running module retriever_post with input: 
contexts: [NodeWithScore(node=TextNode(id_='fb626aa5-b633-48b3-8a79-7614ca3ac4a5', embedding=None, metadata={'file_path': 'paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain'...

[0m[1;3;38;2;155;135;227m> Running module synthesizer with input: 
query_str: what did the author do in YC
context_str: YC was different from other kinds of work I've done. Instead of deciding for myself what to work on, the problems came to me. Every 6 months there was a new batch of startups, and their problems, what...

[0m

In [23]:
output

Prediction(
    answer='Funded startups, helped founders.'
)

#### Optimize Query Pipeline

Let's try optimizing the query pipeline with few-shot examples.

We define a toy dataset with two examples, then use our `SemanticSimilarityEvaluator` to define a custom eval function to pass to the DSPy teleprompter.
- Because the passing threshold is set very low, every example should pass with a reasonable LLM.
- In practice, this means all training examples will be added to the prompt as few-shot examples.
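
To make the two bullets concrete, here is a minimal sketch of the idea under loud assumptions: a bag-of-words `toy_embed` stands in for a real embedding model, `toy_program` stands in for the query pipeline, the training data is made up, and `bootstrap_demos` is a simplified illustration of metric-filtered bootstrapping rather than DSPy's `BootstrapFewShot` implementation.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def make_metric(threshold: float):
    """Build a pass/fail metric: similarity to the reference must reach the threshold."""
    def metric(example: dict, prediction: str) -> bool:
        score = cosine_similarity(toy_embed(prediction), toy_embed(example["answer"]))
        return score >= threshold
    return metric


def bootstrap_demos(program, trainset, metric) -> list[dict]:
    """Run the program on each example and keep passing runs as few-shot demos."""
    demos = []
    for example in trainset:
        prediction = program(example["query_str"])
        if metric(example, prediction):
            demos.append({"query_str": example["query_str"], "answer": prediction})
    return demos


def toy_program(query_str: str) -> str:
    """Hypothetical pipeline: one good canned answer, otherwise a refusal."""
    canned = {"What did the author do growing up?": "The author wrote short stories and programmed."}
    return canned.get(query_str, "I do not know.")


# Toy data, shortened for illustration.
trainset = [
    {"query_str": "What did the author do growing up?",
     "answer": "The author wrote short stories and also worked on programming."},
    {"query_str": "What did the author do during his time at YC?",
     "answer": "Funded startups and wrote essays."},
]

strict = bootstrap_demos(toy_program, trainset, make_metric(0.5))
lenient = bootstrap_demos(toy_program, trainset, make_metric(0.0))
print(len(strict), len(lenient))
```

With these toy vectors a threshold of 0.5 filters out the refusal, while a threshold of 0.0 lets everything through. A threshold that is low relative to the scores your embedding model produces has the same "everything becomes a demo" effect.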

In [24]:
from dspy import Example

train_examples = [
    Example(query_str="What did the author do growing up?", answer="The author wrote short stories and also worked on programming."),
    Example(query_str="What did the author do during his time at YC?", answer="organizing a Summer Founders Program, funding startups, writing essays, working on a new version of Arc, creating Hacker News, and developing internal software for YC")
]

train_examples = [t.with_inputs("query_str") for t in train_examples]

In [25]:
import nest_asyncio
nest_asyncio.apply()

In [26]:
from dspy.teleprompt import BootstrapFewShot
from llama_index.core.evaluation import SemanticSimilarityEvaluator

evaluator = SemanticSimilarityEvaluator(similarity_threshold=0.5)

# Validation logic: check that the predicted answer is semantically similar
# to the reference answer (the retrieved context is not checked here).
def validate_context_and_answer(example, pred, trace=None):
    result = evaluator.evaluate(response=pred.answer, reference=example.answer)
    return result.passing

# Set up a basic teleprompter, which will compile our RAG program.
teleprompter = BootstrapFewShot(max_labeled_demos=0, metric=validate_context_and_answer)

# Compile!
compiled_dspy_qp = teleprompter.compile(dspy_qp, trainset=train_examples)

its ok 'QueryPipeline' object has no attribute '__pydantic_fields_set__'
its ok 'InputComponent' object has no attribute '__pydantic_fields_set__'
its ok 'RetrieverComponent' object has no attribute '__pydantic_fields_set__'
its ok 'OpenAIEmbedding' object has no attribute '__pydantic_fields_set__'
its ok 'SimpleVectorStore' object has no attribute '__pydantic_fields_set__'
its ok 'SentenceSplitter' object has no attribute '__pydantic_fields_set__'
Unexpected error during deepcopy: function() missing required argument 'code' (pos 1)

  0%|          | 0/2 [00:00<?, ?it/s]
ERROR:dspy.teleprompt.bootstrap:2024-09-06T22:10:46.440926Z [error    ] Failed to run or to evaluate example Example({'query_str': 'What did the author do growing up?', 'answer': 'The author wrote short stories and also worked on programming.'}) (input_keys={'query_str'}) with <function validate_context_and_answer at 0x7fc438f49300> due to 'QueryPipeline' object has no attribute '__pydantic_fields_set__'. [dspy.teleprompt.bootstrap] filename=bootstrap.py lineno=211
ERROR:dspy.teleprompt.bootstrap:2024-09-06T22:10:46.441891Z [error    ] Failed to run or to evaluate example Example({'query_str': 'What did the author do during his time at YC?', 'answer': 'organizing a Summer Founders Program, funding startups, writing essays, working on a new version of Arc, creating Hacker News, and developing internal software for YC'}) (input_keys={'query_s

Bootstrapped 0 full traces after 2 examples in round 0.





In [None]:
# test this out 
compiled_dspy_qp(query_str="How did PG meet Jessica Livingston?")

In [None]:
# [optional]: inspect history
turbo.inspect_history(n=1)

## [Part 2] Build and Optimize Query Pipelines with Existing Prompts

Build a query pipeline similar to the one in the previous section. Instead of using DSPy signatures/predictors directly, we build `DSPyComponent` modules from LlamaIndex prompts.

This allows you to write any LlamaIndex prompt and trust that it'll be optimized in DSPy.
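
One piece of what such a conversion has to recover is the list of template variables, since those become the inputs. As a hedged illustration (this is not the converter's actual logic, and the `template_variables` helper is hypothetical), Python's standard `string.Formatter` can list the placeholders in a format-style template:

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the distinct {placeholders} of a str.format-style template, in order."""
    seen: list[str] = []
    for _literal, field_name, _spec, _conversion in Formatter().parse(template):
        if field_name and field_name not in seen:
            seen.append(field_name)
    return seen


template = (
    "Context information is below.\n"
    "{context_str}\n"
    "Query: {query_str}\n"
    "Answer: "
)
print(template_variables(template))  # ['context_str', 'query_str']
```

This sketch says nothing about the output field, whose name is chosen by the converter itself, so it is worth inspecting the inferred signature before writing training examples against it.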

In [None]:
from llama_index.core.prompts import PromptTemplate

# let's try a fun prompt that writes in Shakespeare! 
qa_prompt_template = PromptTemplate("""\
Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, \
answer the query.

Write in the style of a Shakespearean sonnet.

Query: {query_str}
Answer: 
""")

In [None]:
from llama_index.core.query_pipeline import QueryPipeline as QP, InputComponent, FnComponent
from dspy.predict.llamaindex import DSPyComponent, LlamaIndexModule

dspy_component = DSPyComponent.from_prompt(qa_prompt_template)

retriever_post = FnComponent(
    lambda contexts: "\n\n".join([n.get_content() for n in contexts])
)


p = QP(verbose=True)
p.add_modules(
    {
        "input": InputComponent(),
        "retriever": retriever,
        "retriever_post": retriever_post,
        "synthesizer": dspy_component,
    }
)
p.add_link("input", "retriever")
p.add_link("retriever", "retriever_post")
p.add_link("input", "synthesizer", dest_key="query_str")
p.add_link("retriever_post", "synthesizer", dest_key="context_str")


dspy_qp = LlamaIndexModule(p)

In [None]:
dspy_component

In [None]:
# check the inferred signature
dspy_component.predict_module.signature

In [None]:
from dspy.teleprompt import BootstrapFewShot
from llama_index.core.evaluation import SemanticSimilarityEvaluator
from dspy import Example

# Assumed to match the output field name of the inferred signature printed above
output_key = "sonnet_answer"
train_example_dicts = [
    {"query_str": "What did the author do growing up?", output_key: "The author wrote short stories and also worked on programming."},
    {"query_str": "What did the author do during his time at YC?", output_key: "organizing a Summer Founders Program, funding startups, writing essays, working on a new version of Arc, creating Hacker News, and developing internal software for YC"}
]
train_examples = [Example(**t).with_inputs("query_str") for t in train_example_dicts]

evaluator = SemanticSimilarityEvaluator(similarity_threshold=0.5)
# Validation logic: check that the predicted answer is semantically similar
# to the reference answer (the retrieved context is not checked here).
def validate_context_and_answer(example, pred, trace=None):
    result = evaluator.evaluate(response=getattr(pred, output_key), reference=getattr(example, output_key))
    return result.passing

# Set up a basic teleprompter, which will compile our RAG program.
teleprompter = BootstrapFewShot(max_labeled_demos=0, metric=validate_context_and_answer)

# Compile!
compiled_dspy_qp = teleprompter.compile(dspy_qp, trainset=train_examples)

In [None]:
# test this out 
compiled_dspy_qp(query_str="How did PG meet Jessica Livingston?")

In [None]:
# [optional]: inspect the optimized prompt 
turbo.inspect_history(n=1)

## [Part 3] Port over Optimized Prompts to LlamaIndex using the DSPy Prompt Template

Extract a prompt from an existing compiled DSPy module, and then port it over to any LlamaIndex pipeline!

In the example below we use our `DSPyPromptTemplate` to extract the compiled few-shot prompt from the optimized query pipeline.

We then plug it into a separate query engine over the PG essay.
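
Conceptually, a compiled few-shot prompt is a template with the demos baked in and the usual `context_str`/`query_str` slots left open. The sketch below shows that shape in plain Python. The `build_few_shot_template` helper and its layout are hypothetical and do not reflect how `DSPyPromptTemplate` renders prompts.

```python
def escape_braces(text: str) -> str:
    """Escape literal braces so demo text survives a later str.format call."""
    return text.replace("{", "{{").replace("}", "}}")


def build_few_shot_template(instructions: str, demos: list[dict]) -> str:
    """Return a str.format-style template with demos fixed and two open slots."""
    parts = [escape_braces(instructions), ""]
    for demo in demos:
        parts.append(f"Query: {escape_braces(demo['query_str'])}")
        parts.append(f"Answer: {escape_braces(demo['answer'])}")
        parts.append("")
    # These are the only placeholders that remain unfilled.
    parts.append("Context: {context_str}")
    parts.append("Query: {query_str}")
    parts.append("Answer:")
    return "\n".join(parts)


# Toy demo; the braces check that escaping works.
demos = [{"query_str": "What is {x}?", "answer": "A placeholder."}]
few_shot_template = build_few_shot_template("Answer briefly.", demos)
rendered = few_shot_template.format(context_str="this is my context", query_str="hello?")
print(rendered)
```

Because the result still exposes `context_str` and `query_str`, it can be dropped in wherever a question-answering template with those two variables is expected.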

In [None]:
from dspy.predict.llamaindex import DSPyPromptTemplate

# NOTE: you cannot do DSPyPromptTemplate(dspy_component.predict_module) - the predict_module is replaced.
qa_prompt_tmpl = DSPyPromptTemplate(compiled_dspy_qp.query_pipeline.module_dict["synthesizer"].predict_module)

In [None]:
print(qa_prompt_tmpl.format(query_str="hello?", context_str="this is my context"))

In [None]:
query_engine = index.as_query_engine(
    text_qa_template=qa_prompt_tmpl
)

In [None]:
response = query_engine.query("what did the author do at RISD?")

In [None]:
print(str(response))