# Optimizing with DSPy

Most people who have heard of DSPy know that it offers automated optimization, both for prompts and for fine-tuning. This is one of DSPy's standout features: using **[Optimizers](https://dspy.ai/learn/optimization/optimizers/?h=optimizer)** is a powerful way to further improve the quality of your solution.

##### Note:
As we saw in the previous notebook, DSPy offers many benefits beyond optimization. We emphasize this because you shouldn't rule out DSPy just because your customer lacks the training examples needed for optimization (we typically recommend 70+ examples) when you start building your solution. You can still benefit from all of DSPy's other features.

**The defining value proposition of DSPy is providing a declarative approach to developing GenAI solutions.**

Additionally, while most customers won't have a curated dataset when they start a project, they will want to log interactions for their production applications, which can then be used to create a training set for the DSPy Optimizers.

### Optimizing DSPy programs in Databricks

With that said, let's look at how you can leverage DSPy's Optimizers.

This notebook optimizes the RAG app that we built previously. Once again, the focus is on the seamless integration with native Databricks and MLflow features rather than on results, since this uses a dummy dataset.

In [0]:
# TODO change to pypi version when PR merged
%pip install -q dspy databricks-agents mlflow databricks-vectorsearch
dbutils.library.restartPython()

In [0]:
import dspy
import math
import mlflow
import pandas as pd
from mlflow.models import ModelConfig

from dspy.retrieve.databricks_rm import DatabricksRM

from databricks.agents.evals import judges
from databricks.agents.evals import generate_evals_df

from databricks.vector_search.client import VectorSearchClient

In [0]:
config_file = "config.yaml"
model_config = ModelConfig(development_config=config_file)

In [0]:
CATALOG = model_config.get("catalog")
SCHEMA = model_config.get("schema")
VOLUME = model_config.get("volume")

VECTOR_SEARCH_ENDPOINT = model_config.get("vector_search_endpoint")
VECTOR_SEARCH_INDEX = model_config.get("vector_search_index")
index_path = f"{CATALOG}.{SCHEMA}.{VECTOR_SEARCH_INDEX}"

model = model_config.get("chat_endpoint_name")

LM = f"databricks/{model}"

path = f"{CATALOG}/{SCHEMA}/{VOLUME}"

We'll start with the same RAG program module we wrote in the previous notebook.

In [0]:
class RAG(dspy.Module):
  def __init__(self, num_docs=40, for_mosaic_agent=False): 
    # setup mlflow tracing
    mlflow.dspy.autolog()

    # setup flag indicating if the object will be deployed as a Mosaic Agent
    self.for_mosaic_agent = for_mosaic_agent
    # TODO: check with Luis, caching was on by default and didn't require me to set up a volume
    self.lm = dspy.LM(LM, cache=False)

    # setup the primary retriever pointing to the chunked documents
    self.retriever = DatabricksRM(
        databricks_index_name=index_path,
        text_column_name="page_content",
        docs_id_column_name="unique_chunk_index",
        # docs_uri_column_name="unique_chunk_index",
        columns=["source"],
        k=5,
        use_with_databricks_agent_framework=for_mosaic_agent # TODO add details from slack thread for use_with_databricks_agent_framework
    )
    # signature = QnASignature()
    self.response_generator = dspy.Predict("context, question -> answer, relevant_sources")


  def forward(self, question):
    # TODO: review with Luis & Rafi - do we need to manipulate the objects this much or is there a better way?
    if self.for_mosaic_agent:
      # When using a mosaic agent, questions can be of type ChatAgentMessage or (TODO ChatModelMessage) 
      question = question["messages"][-1]["content"]

    context = self.retriever(
        question, 
        query_type="hybrid"  # Use hybrid search (embeddings + keyword search)
    )

    with dspy.context(lm=self.lm):
      response = self.response_generator(context=context, question=question)

      if self.for_mosaic_agent:
        # The signature defines `answer` and `relevant_sources`; there is no `response` field
        return response.answer
      return response

In [0]:
rag = RAG()

## Optimization breakdown

A typical DSPy Optimizer requires three things:

1. Your DSPy program. This may be a single module (e.g., dspy.Predict) or a complex multi-module program. In our example, this is our RAG module. 
2. A curated dataset as training inputs - the more the better, but you can start with what's feasible to create/access.
3. Your metric. This is a function that evaluates the output of your program, and assigns it a score (higher is better).
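
To make the third ingredient concrete, here is a minimal sketch of a metric with the `(example, pred, trace=None)` shape used later in this notebook. The name `exact_match_metric` and the sample data are hypothetical, and `SimpleNamespace` objects stand in for real `dspy.Example` and `dspy.Prediction` objects so the snippet runs without any external service.

```python
from types import SimpleNamespace


def exact_match_metric(example, pred, trace=None):
    """Return True when the predicted answer matches the expected one.

    The comparison ignores case and surrounding whitespace.
    """
    expected = example.answer.strip().lower()
    predicted = pred.answer.strip().lower()
    return expected == predicted


# Stand-ins for a dspy.Example and two dspy.Prediction objects (made-up data).
example = SimpleNamespace(question="What is DSPy?", answer="A declarative framework")
good_pred = SimpleNamespace(answer="  a declarative framework ")
bad_pred = SimpleNamespace(answer="A database")

print(exact_match_metric(example, good_pred))
print(exact_match_metric(example, bad_pred))
```

An exact-match check like this is rarely suitable for free-form answers, which is why this notebook uses an LLM judge as its metric instead.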

### Loading Optimization data

In [0]:
dataset = spark.read.csv(f"/Volumes/{path}/DSPy Databricks QnA - Sheet1.csv", multiLine=True, header=True, quote='"', escape='"')

In [0]:
dataset.display()

#### DSPy Examples
Once we've loaded our dataset, we need to map its rows to DSPy Examples.

In [0]:
trainset, valset = dataset.randomSplit([0.7, 0.3], seed=15)

trainset = trainset.select("Question", "Answer").rdd.map(
    lambda row: dspy.Example({"question": row["Question"], "answer": row["Answer"]}).with_inputs("question")
).collect()

valset = valset.select("Question", "Answer").rdd.map(
    lambda row: dspy.Example({"question": row["Question"], "answer": row["Answer"]}).with_inputs("question")
).collect()
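
The split-and-map step can also be pictured without Spark. The sketch below is only an illustration under stated assumptions: `split_rows` is a hypothetical helper, the rows are made up, and plain dicts stand in for `dspy.Example` objects. Unlike `randomSplit`, which assigns each row independently and therefore yields only approximately 70/30 partitions, this helper shuffles once and slices, so the sizes are exact.

```python
import random


def split_rows(rows, train_fraction=0.7, seed=15):
    """Shuffle rows with a fixed seed and split them into train/validation lists."""
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def to_example(row):
    """Map a raw row to the lowercase field names the program expects."""
    return {"question": row["Question"], "answer": row["Answer"]}


# Made-up rows that mimic the CSV columns used in this notebook.
rows = [{"Question": f"Q{i}", "Answer": f"A{i}"} for i in range(10)]

train_rows, val_rows = split_rows(rows)
train_examples = [to_example(r) for r in train_rows]
val_examples = [to_example(r) for r in val_rows]

print(len(train_examples), len(val_examples))
```

Fixing the seed keeps the split reproducible across runs, which matters when you compare optimized programs against each other.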

In [0]:
ex = trainset[0]
ex

In [0]:
ex.question

In [0]:
def evalute_using_mosaic_agent(example, pred, trace=None):
    # use https://docs.databricks.com/aws/en/generative-ai/agent-evaluation/llm-judge-metrics#call-judges-using-the-python-sdk
    # Running evaluation using the Mosaic Agent Evaluation
    return judges.correctness(
        request=example.question,
        response=pred.answer,
        expected_response=example.answer,
        ).value.name == "YES"
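
A per-example metric like the one above only becomes useful once it is aggregated over a set of examples, since that aggregate is what lets you compare one version of a program against another. The sketch below shows the aggregation in plain Python. Everything in it is hypothetical: `average_metric`, the toy `fixed_answer_program`, and the `SimpleNamespace` stand-ins are illustrative names, and the judge call is replaced by a simple substring check so the snippet needs no Databricks access.

```python
from types import SimpleNamespace


def contains_expected(example, pred, trace=None):
    """Toy metric: True if the expected answer appears in the prediction."""
    return example.answer.lower() in pred.answer.lower()


def average_metric(program, examples, metric):
    """Run the program on each example and return the mean metric score."""
    if not examples:
        return 0.0
    scores = [float(metric(ex, program(ex.question))) for ex in examples]
    return sum(scores) / len(scores)


def fixed_answer_program(question):
    """Toy program that answers every question with the same string."""
    return SimpleNamespace(answer="Databricks is a data and AI platform.")


# Made-up validation examples.
valset_demo = [
    SimpleNamespace(question="What is Databricks?", answer="data and AI platform"),
    SimpleNamespace(question="What is DSPy?", answer="declarative framework"),
]

score = average_metric(fixed_answer_program, valset_demo, contains_expected)
print(score)
```

Because the metric returns a boolean, the average reads directly as the fraction of examples the program answered correctly.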

In [0]:
from dspy.evaluate.evaluate import Evaluate
from dspy.teleprompt import MIPROv2

# Set up the MIPROv2 optimizer, which tunes the instructions and few-shot demos of the RAG program.
optimizer = MIPROv2(
    metric=evalute_using_mosaic_agent, # Use defined evaluation function
    prompt_model=dspy.LM(LM)
)

# Start a new MLflow run to track all evaluation metrics
with mlflow.start_run(run_name="dspy_rag_optimization"):
    # Optimize the program by searching for the best instructions and few-shot examples for the prompt used by the `response_generator` step
    optimized_rag = optimizer.compile(rag, 
                                    trainset=trainset,
                                    max_bootstrapped_demos=3,
                                    requires_permission_to_run=False
                                    )

optimized_rag.save("optimized_rag.json")

In [0]:
optimized_rag = RAG(for_mosaic_agent=True)
optimized_rag.load("optimized_rag.json")

In [0]:
optimized_rag({'messages': [{'content': 'What is Databricks?', 'role': 'user'}]})

In [0]:
eval_table = model_config.get("eval_table")
eval_table_path = f"{CATALOG}.{SCHEMA}.{eval_table}"
eval_df = spark.table(eval_table_path)

In [0]:
eval_df.display()

In [0]:
mlflow.evaluate(
  model=optimized_rag,
  data=eval_df,
  model_type="databricks-agent"
)

So far, we built a very simple chain-of-thought module for question answering and evaluated it on a small dataset, but can we do better?

In the rest of this guide, we will build a retrieval-augmented generation (RAG) program in DSPy for the same task. We'll see how this can boost the score substantially, then we'll use one of the DSPy Optimizers to compile our RAG program to higher-quality prompts, raising our scores even more.

Set up your system's retriever.
As far as DSPy is concerned, you can plug in any Python code for calling tools or retrievers. Here, we'll use the Databricks Vector Search index we set up earlier to run a hybrid semantic search (using embeddings and keywords).