# DSPy sandbox

Initial notebook based on [this post](https://x.com/jerryjliu0/status/1805626753551155243) and complemented with material from the [DSPy intro notebook](https://github.com/stanfordnlp/dspy/blob/main/intro.ipynb).

## Brief Intro

DSPy is a framework for developing LLM programs, with a programming model inspired by PyTorch that gives you significant control over them. It chains LLM calls to build robust systems, where the output of one call becomes the input of the next. Each LLM call acts as a function that takes text as input and produces text as output.
The Signature abstraction streamlines LLM program codebases by encapsulating prompts and structured input/output data. DSPy's compiler optimizes the instructions for each component of an LLM program and sources task examples (demonstrations) for it.



In [None]:
!pip install dspy-ai

In [None]:
!pip install llama-index
# The original notebook pinned a specific version:
# !pip install llama-index==0.10.44

**DSPy** can be used for various tasks (QA, information extraction, text-to-SQL); the general workflow is:

1. **Collect a little bit of data.** Define examples of the inputs and outputs of your program (e.g., questions and their answers). This could just be a handful of quick examples you wrote down. If large datasets exist, the more the merrier!

1. **Write your program.** Define the modules (i.e., sub-tasks) of your program and the way they should interact together to solve your task.
1. **Define some validation logic.** What makes for a good run of your program? Maybe the answers need to have a certain length or stick to a particular format? Specify the logic that checks that.
1. **Compile!** Ask **DSPy** to _compile_ your program using your data. The compiler will use your data and validation logic to optimize your program (e.g., prompts and modules) so it's efficient and effective!
1. **Iterate.** Repeat the process by improving your data, program, validation, or by using more advanced features of the **DSPy** compiler.
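
The workflow above can be sketched in plain Python without calling any LLM. This is only a toy illustration of the collect / program / validate / compile loop, not DSPy's API: `toy_program`, `exact_match`, `toy_compile`, and the canned replies are all hypothetical names and data made up for this sketch. The "compiler" here simply keeps the training examples whose predictions pass the metric, so they could later be used as few-shot demonstrations.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]

# 1. Collect a little bit of data (hypothetical examples).
TRAINSET: List[Example] = [
    {"question": "Capital of France?", "answer": "Paris"},
    {"question": "Capital of Japan?", "answer": "Tokyo"},
    {"question": "Capital of Peru?", "answer": "Lima"},
]

# Stand-in for an LLM: canned replies, one of them deliberately wrong.
CANNED_REPLIES = {
    "Capital of France?": "Paris",
    "Capital of Japan?": "Kyoto",
    "Capital of Peru?": "Lima",
}


# 2. Write your program (a single "module" in this toy).
def toy_program(question: str) -> str:
    return CANNED_REPLIES.get(question, "unknown")


# 3. Define some validation logic.
def exact_match(example: Example, prediction: str) -> bool:
    return prediction.strip().lower() == example["answer"].strip().lower()


# 4. Compile: run the program on the data and keep the passing runs as demos.
def toy_compile(
    program: Callable[[str], str],
    trainset: List[Example],
    metric: Callable[[Example, str], bool],
    max_demos: int = 4,
) -> List[Example]:
    demos: List[Example] = []
    for example in trainset:
        prediction = program(example["question"])
        if metric(example, prediction):
            demos.append({"question": example["question"], "answer": prediction})
        if len(demos) >= max_demos:
            break
    return demos


demos = toy_compile(toy_program, TRAINSET, exact_match)
for demo in demos:
    print(demo)
```

The Japan example is dropped because the canned reply fails the metric. Iterating (step 5) would mean improving the data, the program, or the metric and compiling again.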

In [None]:
## Minimal sample coding





# Building optimized RAG with LlamaIndex + DSPy

This notebook provides a comprehensive overview of LlamaIndex + DSPy integrations.

We show **three** core integrations:
1. **Build and optimize Query Pipelines with DSPy predictors**: The first section shows you how to write DSPy code to define signatures for LLM inputs/outputs. You then port these components into LlamaIndex query pipelines and optimize the entire system end to end.

2. **Build and optimize Query Pipelines with Existing Prompts**: Instead of writing DSPy signatures, you can just define a LlamaIndex prompt template, and our converter will auto-optimize it for you.

3. **Port over DSPy-Optimized Prompts to any LlamaIndex Module**: Through our `DSPyPromptTemplate`, you can translate a prompt optimized by DSPy into any LlamaIndex module that requires prompts.

## Setup

Define the LLM setting for DSPy (note: this is separate from using the LlamaIndex LLMs), and also the answer signature.

In [None]:
import dspy

turbo = dspy.OpenAI(model='gpt-3.5-turbo')
dspy.settings.configure(lm=turbo)

In [None]:
import dspy

class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""

    context_str = dspy.InputField(desc="contains relevant facts")
    query_str = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")
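
As a rough mental model of what a signature encapsulates, the sketch below renders named input and output fields into a prompt string. It is a toy under stated assumptions, not how DSPy builds prompts internally: `ToyField`, `ToySignature`, and `render` are hypothetical names invented for this illustration, and the field descriptions are stored but not used in the rendered text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyField:
    name: str
    desc: str = ""


@dataclass
class ToySignature:
    instructions: str
    inputs: List[ToyField]
    outputs: List[ToyField]

    def render(self, **values: str) -> str:
        # Every declared input must be supplied by the caller.
        missing = [f.name for f in self.inputs if f.name not in values]
        if missing:
            raise ValueError(f"missing inputs: {missing}")
        lines = [self.instructions, ""]
        for f in self.inputs:
            lines.append(f"{f.name}: {values[f.name]}")
        # Output fields are left blank for the model to complete.
        for f in self.outputs:
            lines.append(f"{f.name}:")
        return "\n".join(lines)


toy_signature = ToySignature(
    instructions="Answer questions with short factoid answers.",
    inputs=[ToyField("context_str", "contains relevant facts"), ToyField("query_str")],
    outputs=[ToyField("answer", "often between 1 and 5 words")],
)

prompt = toy_signature.render(
    context_str="Paris is the capital of France.",
    query_str="What is the capital of France?",
)
print(prompt)
```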

## [Part 1] Build and Optimize a Query Pipeline with DSPy Modules

Use our DSPy query components to plug in DSPy prompts/LLMs, and stitch them together with our query pipeline abstraction.

Any query pipeline can be plugged into our `LlamaIndexModule`. We can then let DSPy optimize the entire thing end to end.

#### Load Data, Build Index

In [None]:
# port it over to another index  (paul graham example) 

!wget https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt -O paul_graham_essay.txt

In [None]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

reader = SimpleDirectoryReader(input_files=["paul_graham_essay.txt"])
docs = reader.load_data()

index = VectorStoreIndex.from_documents(docs)

In [None]:
retriever = index.as_retriever(similarity_top_k=2)

#### Build Query Pipeline

Replace the synthesis piece with the DSPy component. Make sure the input and output fields of the `GenerateAnswer` signature match the keys used in the pipeline links (`query_str`, `context_str`).

In [None]:
from llama_index.core.query_pipeline import QueryPipeline as QP, InputComponent, FnComponent
from dspy.predict.llamaindex import DSPyComponent, LlamaIndexModule

dspy_component = DSPyComponent(
    dspy.ChainOfThought(GenerateAnswer)
)

retriever_post = FnComponent(
    lambda contexts: "\n\n".join([n.get_content() for n in contexts])
)


p = QP(verbose=True)
p.add_modules(
    {
        "input": InputComponent(),
        "retriever": retriever,
        "retriever_post": retriever_post,
        "synthesizer": dspy_component,
    }
)
p.add_link("input", "retriever")
p.add_link("retriever", "retriever_post")
p.add_link("input", "synthesizer", dest_key="query_str")
p.add_link("retriever_post", "synthesizer", dest_key="context_str")


dspy_qp = LlamaIndexModule(p)
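
To see how links with a `dest_key` route values between modules, here is a minimal sketch of a pipeline executor in plain Python. It is not LlamaIndex's `QueryPipeline` implementation: `ToyPipeline` is a hypothetical class, the retriever is a stub that returns made-up strings, and the sketch assumes modules are registered in a valid execution order.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class ToyPipeline:
    def __init__(self) -> None:
        self.modules: Dict[str, Callable[..., Any]] = {}
        self.links: List[Tuple[str, str, Optional[str]]] = []

    def add_module(self, name: str, fn: Callable[..., Any]) -> None:
        self.modules[name] = fn

    def add_link(self, src: str, dest: str, dest_key: Optional[str] = None) -> None:
        self.links.append((src, dest, dest_key))

    def run(self, **inputs: Any) -> Any:
        outputs: Dict[str, Any] = {}
        # Assumes insertion order is a valid topological order.
        for name, fn in self.modules.items():
            incoming = [(s, k) for s, d, k in self.links if d == name]
            if not incoming:
                # Root module: receives the pipeline inputs directly.
                outputs[name] = fn(**inputs)
                continue
            args, kwargs = [], {}
            for src, key in incoming:
                if key is None:
                    args.append(outputs[src])   # positional when no dest_key
                else:
                    kwargs[key] = outputs[src]  # routed to the named argument
            outputs[name] = fn(*args, **kwargs)
        return outputs[list(self.modules)[-1]]


toy = ToyPipeline()
toy.add_module("input", lambda query_str: query_str)
toy.add_module("retriever", lambda q: [f"fact about {q}", "another fact"])
toy.add_module("retriever_post", lambda contexts: "\n\n".join(contexts))
toy.add_module("synthesizer", lambda query_str, context_str: f"Q: {query_str} | C: {context_str}")

toy.add_link("input", "retriever")
toy.add_link("retriever", "retriever_post")
toy.add_link("input", "synthesizer", dest_key="query_str")
toy.add_link("retriever_post", "synthesizer", dest_key="context_str")

result = toy.run(query_str="YC")
print(result)
```

The two links into `synthesizer` carry a `dest_key` because that module takes two named inputs, which is why those keys have to match the field names of the signature.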

In [None]:
output = dspy_qp(query_str="what did the author do in YC")

In [None]:
output

#### Optimize Query Pipeline

Let's try optimizing the query pipeline with few-shot examples.

We define a toy dataset with two examples. We then use our `SemanticSimilarityEvaluator` to define a custom eval function to pass to the DSPy teleprompter.
- Because our passing threshold is set very low, every example should pass with a reasonable LLM.
- In practice, this means all training examples will be added to the prompt as few-shot examples.
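
To make the passing threshold concrete, the sketch below assumes the evaluator scores a prediction by the cosine similarity between two embedding vectors and passes it when the score reaches the threshold. The vectors are made-up two-dimensional stand-ins for real embeddings, and `cosine_similarity` and `passes` are hypothetical helper names, not part of LlamaIndex.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def passes(pred_vec: Sequence[float], ref_vec: Sequence[float], threshold: float = 0.5) -> bool:
    return cosine_similarity(pred_vec, ref_vec) >= threshold


# Hypothetical embeddings of a predicted answer and a reference answer.
prediction_vec = [1.0, 0.0]
reference_vec = [1.0, 1.0]

print(cosine_similarity(prediction_vec, reference_vec))
print(passes(prediction_vec, reference_vec, threshold=0.5))
print(passes(prediction_vec, reference_vec, threshold=0.8))
```

With these vectors the similarity is about 0.71, so the pair passes at a threshold of 0.5 but would fail at a stricter 0.8. A low threshold therefore lets loosely matching answers count as correct.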

In [None]:
from dspy import Example

train_examples = [
    Example(query_str="What did the author do growing up?", answer="The author wrote short stories and also worked on programming."),
    Example(query_str="What did the author do during his time at YC?", answer="organizing a Summer Founders Program, funding startups, writing essays, working on a new version of Arc, creating Hacker News, and developing internal software for YC")
]

train_examples = [t.with_inputs("query_str") for t in train_examples]

In [None]:
import nest_asyncio
nest_asyncio.apply()

In [None]:
from dspy.teleprompt import BootstrapFewShot
from llama_index.core.evaluation import SemanticSimilarityEvaluator

evaluator = SemanticSimilarityEvaluator(similarity_threshold=0.5)

# Validation logic: check that the predicted answer is semantically similar
# to the reference answer. (This metric does not inspect the retrieved context.)
def validate_context_and_answer(example, pred, trace=None):
    result = evaluator.evaluate(response=pred.answer, reference=example.answer)
    return result.passing

# Set up a basic teleprompter, which will compile our RAG program.
teleprompter = BootstrapFewShot(max_labeled_demos=0, metric=validate_context_and_answer)

# Compile!
compiled_dspy_qp = teleprompter.compile(dspy_qp, trainset=train_examples)

In [None]:
# test this out 
compiled_dspy_qp(query_str="How did PG meet Jessica Livingston?")

In [None]:
# [optional]: inspect history
turbo.inspect_history(n=1)

## [Part 2] Build and Optimize Query Pipelines with Existing Prompts

Build a query pipeline similar to the one in the previous section, but instead of using DSPy signatures/predictors directly, build the `DSPyComponent` modules from LlamaIndex prompts.

This allows you to write any LlamaIndex prompt and trust that it'll be optimized in DSPy.
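
One way to picture the conversion is that the template's placeholders become the input fields of a signature. The sketch below only lists the placeholder names of a format-style template using the standard library; whether `DSPyComponent.from_prompt` works this way internally is an assumption, and `template_variables` is a hypothetical helper.

```python
from string import Formatter
from typing import List


def template_variables(template: str) -> List[str]:
    """Return the placeholder names of a str.format-style template, in order."""
    names: List[str] = []
    for _literal, field_name, _spec, _conversion in Formatter().parse(template):
        # field_name is None for trailing literal text with no placeholder.
        if field_name and field_name not in names:
            names.append(field_name)
    return names


toy_template = (
    "Context information is below.\n"
    "{context_str}\n"
    "Query: {query_str}\n"
    "Answer: "
)

print(template_variables(toy_template))
```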

In [None]:
from llama_index.core.prompts import PromptTemplate

# let's try a fun prompt that writes in the style of Shakespeare!
qa_prompt_template = PromptTemplate("""\
Context information is below.
---------------------
{context_str}
---------------------
Given the context information and not prior knowledge, \
answer the query.

Write in the style of a Shakespearean sonnet.

Query: {query_str}
Answer: 
""")

In [None]:
from llama_index.core.query_pipeline import QueryPipeline as QP, InputComponent, FnComponent
from dspy.predict.llamaindex import DSPyComponent, LlamaIndexModule

dspy_component = DSPyComponent.from_prompt(qa_prompt_template)

retriever_post = FnComponent(
    lambda contexts: "\n\n".join([n.get_content() for n in contexts])
)


p = QP(verbose=True)
p.add_modules(
    {
        "input": InputComponent(),
        "retriever": retriever,
        "retriever_post": retriever_post,
        "synthesizer": dspy_component,
    }
)
p.add_link("input", "retriever")
p.add_link("retriever", "retriever_post")
p.add_link("input", "synthesizer", dest_key="query_str")
p.add_link("retriever_post", "synthesizer", dest_key="context_str")


dspy_qp = LlamaIndexModule(p)

In [None]:
# check the inferred signature
dspy_component.predict_module.signature

In [None]:
from dspy.teleprompt import BootstrapFewShot
from llama_index.core.evaluation import SemanticSimilarityEvaluator
from dspy import Example

output_key = "sonnet_answer"
train_example_dicts = [
    {"query_str": "What did the author do growing up?", output_key: "The author wrote short stories and also worked on programming."},
    {"query_str": "What did the author do during his time at YC?", output_key: "organizing a Summer Founders Program, funding startups, writing essays, working on a new version of Arc, creating Hacker News, and developing internal software for YC"}
]
train_examples = [Example(**t).with_inputs("query_str") for t in train_example_dicts]

evaluator = SemanticSimilarityEvaluator(similarity_threshold=0.5)
# Validation logic: check that the predicted answer is semantically similar
# to the reference answer. (This metric does not inspect the retrieved context.)
def validate_context_and_answer(example, pred, trace=None):
    result = evaluator.evaluate(response=getattr(pred, output_key), reference=getattr(example, output_key))
    return result.passing

# Set up a basic teleprompter, which will compile our RAG program.
teleprompter = BootstrapFewShot(max_labeled_demos=0, metric=validate_context_and_answer)

# Compile!
compiled_dspy_qp = teleprompter.compile(dspy_qp, trainset=train_examples)

In [None]:
# test this out 
compiled_dspy_qp(query_str="How did PG meet Jessica Livingston?")

In [None]:
# [optional]: inspect the optimized prompt 
turbo.inspect_history(n=1)

## [Part 3] Port over Optimized Prompts to LlamaIndex using the DSPy Prompt Template

Extract a prompt from an existing compiled DSPy module, and then port it over to any LlamaIndex pipeline!

In the example below we use our `DSPyPromptTemplate` to extract the compiled few-shot prompt from the optimized query pipeline.

We then plug it into a separate query engine over the PG essay.
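
For intuition about what a compiled few-shot prompt looks like once formatted, here is a small plain-Python sketch that prepends demonstrations to a new query. The layout, the separator, and the function name `render_few_shot_prompt` are assumptions made for illustration; the actual text produced by `DSPyPromptTemplate` will differ.

```python
from typing import Dict, List


def render_few_shot_prompt(
    instructions: str,
    demos: List[Dict[str, str]],
    query_str: str,
    context_str: str,
) -> str:
    parts = [instructions, "---"]
    # Each demonstration shows the model a worked query/answer pair.
    for demo in demos:
        parts.append(f"Query: {demo['query_str']}\nAnswer: {demo['answer']}")
        parts.append("---")
    # The final block is the new query, with the answer left open.
    parts.append(f"Context: {context_str}\nQuery: {query_str}\nAnswer:")
    return "\n\n".join(parts)


toy_demos = [
    {
        "query_str": "What did the author do growing up?",
        "answer": "The author wrote short stories and also worked on programming.",
    },
]

few_shot_prompt = render_few_shot_prompt(
    instructions="Answer the query using the context.",
    demos=toy_demos,
    query_str="hello?",
    context_str="this is my context",
)
print(few_shot_prompt)
```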

In [None]:
from dspy.predict.llamaindex import DSPyPromptTemplate

# NOTE: you cannot do DSPyPromptTemplate(dspy_component.predict_module) because the
# predict_module is replaced during compilation; read it from the compiled pipeline instead.
qa_prompt_tmpl = DSPyPromptTemplate(compiled_dspy_qp.query_pipeline.module_dict["synthesizer"].predict_module)

In [None]:
print(qa_prompt_tmpl.format(query_str="hello?", context_str="this is my context"))

In [None]:
query_engine = index.as_query_engine(
    text_qa_template=qa_prompt_tmpl
)

In [None]:
response = query_engine.query("what did the author do at RISD?")

In [None]:
print(str(response))