# Prompt Optimization with DSPy

<a target="_blank" href="https://colab.research.google.com/github/deepset-ai/haystack-cookbook/blob/main/notebooks/prompt_optimization_with_dspy.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" width="200" alt="Open In Colab"/>
</a>

<img src="https://raw.githubusercontent.com/stanfordnlp/dspy/main/docs/images/DSPy8.png" width="400" style="display:inline;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<img src="https://haystack.deepset.ai/images/haystack-ogimage.png" width="430" style="display:inline;">

When building applications with LLMs, writing effective prompts is a long process of trial and error.
Often, if you switch models, you also have to change the prompt.
What if you could automate this process?

That's where **DSPy** comes in - a framework designed to algorithmically optimize prompts for Language Models.
By applying classical machine learning concepts (training and evaluation data, metrics, optimization), DSPy generates better prompts for a given model and task.

In this notebook, we will see **how to combine DSPy with the robustness of Haystack Pipelines**.
- ▶️ Start from a Haystack RAG pipeline with a basic prompt
- 🎯 Define a goal (in this case, get correct and concise answers)
- 📊 Create a DSPy program, define data and metrics
- ✨ Optimize and evaluate -> improved prompt
- 🚀 Build a refined Haystack RAG pipeline using the optimized prompt

>[Prompt Optimization with DSPy](#scrollTo=OWWPapD1mhqs)

>>[Setup](#scrollTo=S-j3AJ3lne-o)

>>[Load data](#scrollTo=45hUq3KCp-bG)

>>[Initial Haystack pipeline](#scrollTo=eTy01somr_xl)

>>[DSPy](#scrollTo=mZQd-J9zzpCF)

>>>[DSPy Signature](#scrollTo=QzUT-Nqc1GRp)

>>>[DSPy RAG module](#scrollTo=dmee88jC29qp)

>>>[Create training and dev sets](#scrollTo=yX1q7a_J5YgP)

>>>[Define a metric](#scrollTo=JRjG8dPU8dpp)

>>>[Evaluate unoptimized RAG module](#scrollTo=eQX9ypSN-cEU)

>>>[Optimization](#scrollTo=qOJRB6h6_SHE)

>>>[Evaluate optimized RAG module](#scrollTo=4aa2dU0BAngE)

>>>[Inspect the optimized prompt](#scrollTo=VgADv_j2BObc)

>>[Optimized Haystack Pipeline](#scrollTo=ZSn00x5FBisv)



## Setup

In [1]:
! pip install haystack-ai datasets dspy-ai sentence-transformers python-dotenv rich


In [2]:
import os
from getpass import getpass
from rich import print

from dotenv import load_dotenv
load_dotenv()

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API key:")

## Load data

We will use the first 1000 rows of a [labeled PubMed dataset](https://huggingface.co/datasets/vblagoje/PubMedQA_instruction/viewer/default/train?row=0) with questions, contexts and answers.

Initially, we will use only the contexts as documents and write them to a Document Store.

(Later, we will also use the questions and answers from a small subset of the dataset to create training and dev sets for optimization.)

In [3]:
from datasets import load_dataset
from haystack import Document

dataset = load_dataset("vblagoje/PubMedQA_instruction", split="train")
dataset = dataset.select(range(1000))



In [4]:
dataset

Dataset({
    features: ['instruction', 'context', 'response', 'category'],
    num_rows: 1000
})

In [10]:
# show the first question, its context and the reference answer
print(dataset["instruction"][0])
print(dataset["context"][0])
print(dataset["response"][0])

In [11]:
docs = [Document(content=doc["context"]) for doc in dataset]

In [12]:
docs

[Document(id=f6fde0752a035f7a15860dfa6c45d3ee05380198f18abf43b7b4923ec44c9985, content: 'Chronic rhinosinusitis (CRS) is a heterogeneous disease with an uncertain pathogenesis. Group 2 inna...'),
 Document(id=8889ef27dbfe0b3cc5ba24652b393fc9e39cd49d21db1b7dbeb8a79363b4fb12, content: 'Phosphatidylethanolamine N-methyltransferase (PEMT), a liver enriched enzyme, is responsible for app...'),
 Document(id=699ac0bd51891960eb58709be9f2ffef41fcbc7ea51f247c7b922e3ad1e358c3, content: 'Psammaplin A (PsA) is a natural product isolated from marine sponges, which has been demonstrated to...'),
 Document(id=8c714387bf3999ef9b3e11997d8dc5f894ee2ffc281202db4c1a66faf65e5752, content: 'This study examined links between DNA methylation and birth weight centile (BWC), and explored the i...'),
 Document(id=7a59e616dab6b59e767513a13b3cd3551843c52d3349c1fd2a6c827b9885912c, content: 'Tumor microenvironment immunity is associated with breast cancer outcome. A high lymphocytic infiltr...'),
 Document(id=1a48285

In [13]:
from haystack.document_stores.in_memory import InMemoryDocumentStore

document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

1000

In [14]:
document_store.filter_documents()[:5]

[Document(id=f6fde0752a035f7a15860dfa6c45d3ee05380198f18abf43b7b4923ec44c9985, content: 'Chronic rhinosinusitis (CRS) is a heterogeneous disease with an uncertain pathogenesis. Group 2 inna...'),
 Document(id=8889ef27dbfe0b3cc5ba24652b393fc9e39cd49d21db1b7dbeb8a79363b4fb12, content: 'Phosphatidylethanolamine N-methyltransferase (PEMT), a liver enriched enzyme, is responsible for app...'),
 Document(id=699ac0bd51891960eb58709be9f2ffef41fcbc7ea51f247c7b922e3ad1e358c3, content: 'Psammaplin A (PsA) is a natural product isolated from marine sponges, which has been demonstrated to...'),
 Document(id=8c714387bf3999ef9b3e11997d8dc5f894ee2ffc281202db4c1a66faf65e5752, content: 'This study examined links between DNA methylation and birth weight centile (BWC), and explored the i...'),
 Document(id=7a59e616dab6b59e767513a13b3cd3551843c52d3349c1fd2a6c827b9885912c, content: 'Tumor microenvironment immunity is associated with breast cancer outcome. A high lymphocytic infiltr...')]

## Initial Haystack pipeline

Let's create a simple RAG Pipeline in Haystack. For more information, see [the documentation](https://docs.haystack.deepset.ai/docs/get_started).

Next, we will see how to improve the prompt.

In [15]:
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack import Pipeline


retriever = InMemoryBM25Retriever(document_store, top_k=3)
generator = OpenAIGenerator(model="gpt-3.5-turbo")

template = """
Given the following information, answer the question.

Context:
{% for document in documents %}
    {{ document.content }}
{% endfor %}

Question: {{question}}
Answer:
"""

prompt_builder = PromptBuilder(template=template)


rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", generator)

rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm")

<haystack.core.pipeline.pipeline.Pipeline object at 0x000002771C9B7730>
🚅 Components
  - retriever: InMemoryBM25Retriever
  - prompt_builder: PromptBuilder
  - llm: OpenAIGenerator
🛤️ Connections
  - retriever.documents -> prompt_builder.documents (List[Document])
  - prompt_builder.prompt -> llm.prompt (str)

Let's ask some questions...

In [19]:
question = "What effects does ketamine have on rat neural stem cells?"

response = rag_pipeline.run({"retriever": {"query": question}, "prompt_builder": {"question": question}})

print(response["llm"]["replies"][0])

In [20]:
response

{'llm': {'replies': ['Ketamine at high concentrations (200, 500, 800, and 1000μM) significantly inhibits the proliferation of rat neural stem cells (NSCs). It also decreases intracellular Ca(2+) concentration, suppresses the activation of protein kinase C-α (PKCα), and the phosphorylation of extracellular signal-regulated kinases 1/2 (ERK1/2) in NSCs. Additionally, a combination of subthreshold concentrations of ketamine and certain inhibitors produces suprathreshold effects on NSC proliferation.'],
  'meta': [{'model': 'gpt-3.5-turbo-0125',
    'index': 0,
    'finish_reason': 'stop',
    'usage': {'completion_tokens': 114,
     'prompt_tokens': 769,
     'total_tokens': 883}}]}}

In [17]:
question = "Is the anterior cingulate cortex linked to pain-induced depression?"

response = rag_pipeline.run({"retriever": {"query": question}, "prompt_builder": {"question": question}})

print(response["llm"]["replies"][0])

The answers seem correct, but suppose that **our use case requires shorter answers**. How can we adjust the prompt to achieve this while maintaining correctness?

## DSPy

We will use DSPy to automatically improve the prompt for our goal: getting correct and short answers.

We will perform several steps:
- define a DSPy module for RAG
- create training and dev sets
- define a metric
- evaluate the unoptimized RAG module
- optimize the module
- evaluate the optimized RAG

Broadly speaking, these steps follow those listed in the [DSPy guide](https://dspy-docs.vercel.app/docs/building-blocks/solving_your_task).

In [21]:
import dspy
from dspy.primitives.prediction import Prediction


lm = dspy.OpenAI(model='gpt-3.5-turbo')
dspy.settings.configure(lm=lm)

### DSPy Signature

The RAG module involves two main tasks (smaller modules): retrieval and generation.

For generation, we need to define a signature: a declarative specification of input/output behavior of a DSPy module.
In particular, the generation module receives the `context` and a `question` as input and returns an `answer`.

In DSPy, the docstring and the field description are used to create the prompt.

In [22]:
class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""

    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="short and precise answer")
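To make the idea of "docstring plus field descriptions become the prompt" more concrete, here is a deliberately simplified, pure-Python illustration. It is not DSPy's implementation: the `Field` class, the `render_prompt_header` function, and the exact layout are hypothetical and only mimic the general shape of the prompt shown later in this notebook.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical stand-in for a signature field (not a DSPy class)."""
    name: str
    desc: str = ""


def render_prompt_header(instructions: str, fields: list) -> str:
    """Join the task instructions and one line per field into a prompt header."""
    lines = [instructions.strip(), "", "---", ""]
    for field in fields:
        # Use the description when there is one, otherwise a named placeholder.
        placeholder = field.desc or "${" + field.name + "}"
        lines.append(f"{field.name.capitalize()}: {placeholder}")
    return "\n".join(lines)


header = render_prompt_header(
    "Answer questions with short factoid answers.",
    [
        Field("context", "may contain relevant facts"),
        Field("question"),
        Field("answer", "short and precise answer"),
    ],
)
print(header)
```

The point of the sketch is only that changing the docstring or a `desc` changes the text sent to the model, which is why both are worth writing carefully.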

### DSPy RAG module

A DSPy module is a Python class with two key methods:
- the `__init__` method declares sub-modules;
- the `forward` method contains the logic of the module.

Two details are specific to our RAG module:
- the `ChainOfThought` module encourages Language Model reasoning with a specific prompt ("Let's think step by step") and examples. [Paper](https://arxiv.org/abs/2201.11903)
- we want to reuse our Haystack retriever and the already indexed data, so we also define a `retrieve` method.





In [23]:
class RAG(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    # this makes it possible to use the Haystack retriever
    def retrieve(self, question):
        results = retriever.run(query=question)
        passages = [res.content for res in results['documents']]
        return Prediction(passages=passages)

    def forward(self, question):
        context = self.retrieve(question).passages
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)

### Create training and dev sets

In general, to use DSPy for prompt optimization, you have to prepare some examples for your task (or use a similar dataset).

The training set is used for optimization, while the dev set is used for evaluation.

We build them from the first 20 and the following 50 examples (question and answer) of our labeled PubMed dataset, respectively.



In [26]:
print(dspy.Example(question = dataset["instruction"][0], answer=dataset["response"][0]).with_inputs('question'))

In [27]:
trainset, devset = [], []

for i, ex in enumerate(dataset):
    example = dspy.Example(question=ex["instruction"], answer=ex["response"]).with_inputs("question")

    if i < 20:
        trainset.append(example)  # first 20 examples: used for optimization
    elif i < 70:
        devset.append(example)  # next 50 examples: used for evaluation
    else:
        break

In [28]:
len(trainset),len(devset)

(20, 50)

### Define a metric

Defining a metric is a crucial step for evaluating and optimizing our prompt.

As we show in this example, metrics can be defined in a very customized way.

In our case, we want to focus on two aspects: correctness and brevity of the answers.
- for correctness, we use semantic similarity between the predicted answer and the ground truth answer ([Haystack SASEvaluator](https://docs.haystack.deepset.ai/docs/sasevaluator)). SAS score varies between 0 and 1.
- to encourage short answers, we subtract a penalty for long answers. The penalty is 0 for answers of 20 words or fewer, grows linearly by 0.025 per extra word, and is capped at 0.5 for answers of 40 words or more.

In [29]:
from haystack.components.evaluators import SASEvaluator
sas_evaluator = SASEvaluator()
sas_evaluator.warm_up()

def mixed_metric(example, pred, trace=None):
    semantic_similarity = sas_evaluator.run(ground_truth_answers=[example.answer], predicted_answers=[pred.answer])["score"]

    n_words=len(pred.answer.split())
    long_answer_penalty=0
    if 20<n_words<40:
      long_answer_penalty = 0.025 * (n_words - 20)
    elif n_words>=40:
      long_answer_penalty = 0.5

    return semantic_similarity - long_answer_penalty
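To see how the length penalty behaves on its own, the sketch below isolates it from the semantic similarity score. The function name `long_answer_penalty` is introduced here for illustration; the thresholds and slope are the ones used in `mixed_metric`.

```python
def long_answer_penalty(n_words: int) -> float:
    """Length penalty only: same thresholds as in mixed_metric."""
    if n_words <= 20:
        return 0.0  # short answers are not penalized
    if n_words >= 40:
        return 0.5  # the penalty is capped for very long answers
    return 0.025 * (n_words - 20)  # linear growth between 20 and 40 words


# Print the penalty for a few answer lengths.
for n in (10, 20, 30, 39, 40, 60):
    print(n, long_answer_penalty(n))
```

Because the similarity score lies between 0 and 1, a 40-word answer can lose up to half of the maximum score, so the optimizer has a strong incentive to prefer concise answers.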




### Evaluate unoptimized RAG module

Let's first check how the unoptimized RAG module performs on the dev set.
Then we will optimize it.

In [30]:
uncompiled_rag = RAG()

In [32]:
len(devset),devset

(50,
 [Example({'question': 'Is increased time from neoadjuvant chemoradiation to surgery associated with higher pathologic complete response rates in esophageal cancer?', 'answer': 'A longer interval between completion of neoadjuvant chemoradiation and surgery was associated with higher pathologic complete response rates without an impact on surgical morbidity.'}) (input_keys={'question'}),
  Example({'question': 'Is epileptic focus localization based on resting state interictal MEG recordings feasible irrespective of the presence or absence of spikes?', 'answer': 'Our preliminary results suggest that accurate localization of the epileptogenic focus may be accomplished using noninvasive spontaneous "resting-state" recordings of relatively brief duration and without the need to capture definite interictal and/or ictal abnormalities.'}) (input_keys={'question'}),
  Example({'question': 'Does seminal Helicobacter pylori treatment improve sperm motility in infertile asthenozoospermic men?

In [33]:
from dspy.evaluate.evaluate import Evaluate

evaluate = Evaluate(
    metric=mixed_metric, devset=devset, num_threads=1, display_progress=True, display_table=5
)
evaluate(uncompiled_rag)

  0%|          | 0/50 [00:00<?, ?it/s]

Average Metric: 16.2783067018725 / 50  (32.6): 100%|██████████| 50/50 [01:22<00:00,  1.64s/it]  


|   | question | example_answer | context | pred_answer | mixed_metric |
|---|---|---|---|---|---|
| 0 | Is increased time from neoadjuvant chemoradiation to surgery associated with higher pathologic complete response rates in esophageal cancer? | A longer interval between completion of neoadjuvant chemoradiation and surgery was associated with higher pathologic complete response rates without an impact on surgical morbidity. | ['The interval between neoadjuvant chemoradiation treatment and surgery has been described as an important predictor of pathologic response to therapy in nonesophageal cancer sites. We... | Yes, increased time from neoadjuvant chemoradiation to surgery is associated with higher pathologic complete response rates in esophageal cancer. | ✔️ [0.7792506814002991] |
| 1 | Is epileptic focus localization based on resting state interictal MEG recordings feasible irrespective of the presence or absence of spikes? | Our preliminary results suggest that accurate localization of the epileptogenic focus may be accomplished using noninvasive spontaneous "resting-state" recordings of relatively brief duration and without... | ['To investigate whether epileptogenic focus localization is possible based on resting state connectivity analysis of magnetoencephalographic (MEG) data. A multivariate autoregressive (MVAR) model was constructed... | Yes. | ✔️ [0.0886337012052536] |
| 2 | Does seminal Helicobacter pylori treatment improve sperm motility in infertile asthenozoospermic men? | H pylori treatment significantly improves sperm motility in infertile asthenozoospermic men with elevated seminal H pylori IgA. | ['To assess the effect of treatment of seminal Helicobacter pylori in infertile asthenozoospermic men. In all, 223 infertile asthenozoospermic men were consecutively selected. They were... | Yes, treatment of seminal Helicobacter pylori in infertile asthenozoospermic men has been shown to improve sperm motility. | ✔️ [0.8661220669746399] |
| 3 | Does a migrating ciliary gate compartmentalize the site of axoneme assembly in Drosophila spermatids? | Our findings demonstrate that the ciliary gate can migrate away from the base of the cilium, thereby functioning independently of the centriole and of a... | ['In most cells, the cilium is formed within a compartment separated from the cytoplasm. Entry into the ciliary compartment is regulated by a specialized gate... | Yes. | ✔️ [0.06154946982860565] |
| 4 | Is individual Public Transportation Accessibility Positively Associated with Self-Reported Active Commuting? | This study extends the knowledge about the driving forces of using public transportation for commuting by examining the individual public transportation accessibility. Findings suggest that... | ['Active commuters have lower risk of chronic disease. Understanding which of the, to some extent, modifiable characteristics of public transportation that facilitate its use is... | Yes, individual public transportation accessibility is positively associated with self-reported active commuting. | ✔️ [0.8073875308036804] |


32.56

### Optimization

We can now compile (optimize) the DSPy program we created.

This can be done using a teleprompter/optimizer, based on our metric and training set.

In particular, `BootstrapFewShot` tries to improve the metric on the training set by adding few-shot examples to the prompt.

In [34]:
from dspy.teleprompt import BootstrapFewShot

optimizer = BootstrapFewShot(metric=mixed_metric)

compiled_rag = optimizer.compile(RAG(), trainset=trainset)

 20%|██        | 4/20 [00:08<00:32,  2.04s/it]


Bootstrapped 4 full traces after 5 examples in round 0.
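The log reports four bootstrapped traces collected after five training examples. The toy sketch below illustrates the general bootstrapping idea: run the program on training examples, keep the runs that the metric accepts, and stop once enough demonstrations are collected. It is not DSPy's implementation; `bootstrap_demos`, `toy_program`, `toy_metric`, the `threshold` parameter, and the made-up training data are all hypothetical.

```python
def bootstrap_demos(program, trainset, metric, max_demos=4, threshold=0.5):
    """Collect up to max_demos examples on which the program scores well."""
    demos = []
    for example in trainset:
        prediction = program(example["question"])
        # Keep only runs that the metric considers good enough.
        if metric(example, prediction) >= threshold:
            demos.append({"question": example["question"], "answer": prediction})
        if len(demos) >= max_demos:
            break
    return demos


def toy_program(question):
    """Stand-in for the RAG module: answers yes/no questions with 'Yes.'."""
    yes_no = question.lower().startswith(("is ", "does ", "do "))
    return "Yes." if yes_no else "Unknown."


def toy_metric(example, prediction):
    """Stand-in for mixed_metric: exact match against the reference answer."""
    return 1.0 if prediction == example["answer"] else 0.0


toy_trainset = [  # made-up data, for illustration only
    {"question": "Is BM25 a ranking function?", "answer": "Yes."},
    {"question": "What is DSPy?", "answer": "A framework."},
    {"question": "Does BM25 use term frequency?", "answer": "Yes."},
]

demos = bootstrap_demos(toy_program, toy_trainset, toy_metric)
print(demos)
```

The selected demonstrations are then placed in the prompt as few-shot examples, which is the part of the prompt inspected later in this notebook.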


### Evaluate optimized RAG module

Let's now see if the training has been successful, evaluating the compiled RAG module on the dev set.

In [35]:
evaluate = Evaluate(
    metric=mixed_metric, devset=devset, num_threads=1, display_progress=True, display_table=5
)
evaluate(compiled_rag)

Average Metric: 35.15759664177895 / 50  (70.3): 100%|██████████| 50/50 [01:34<00:00,  1.88s/it] 


|   | question | example_answer | context | pred_answer | mixed_metric |
|---|---|---|---|---|---|
| 0 | Is increased time from neoadjuvant chemoradiation to surgery associated with higher pathologic complete response rates in esophageal cancer? | A longer interval between completion of neoadjuvant chemoradiation and surgery was associated with higher pathologic complete response rates without an impact on surgical morbidity. | ['The interval between neoadjuvant chemoradiation treatment and surgery has been described as an important predictor of pathologic response to therapy in nonesophageal cancer sites. We... | Yes, increased time from neoadjuvant chemoradiation to surgery is associated with higher pathologic complete response rates in esophageal cancer. | ✔️ [0.7792506814002991] |
| 1 | Is epileptic focus localization based on resting state interictal MEG recordings feasible irrespective of the presence or absence of spikes? | Our preliminary results suggest that accurate localization of the epileptogenic focus may be accomplished using noninvasive spontaneous "resting-state" recordings of relatively brief duration and without... | ['To investigate whether epileptogenic focus localization is possible based on resting state connectivity analysis of magnetoencephalographic (MEG) data. A multivariate autoregressive (MVAR) model was constructed... | Yes, epileptic focus localization based on resting state interictal MEG recordings is feasible irrespective of the presence or absence of spikes. | ✔️ [0.8550739049911499] |
| 2 | Does seminal Helicobacter pylori treatment improve sperm motility in infertile asthenozoospermic men? | H pylori treatment significantly improves sperm motility in infertile asthenozoospermic men with elevated seminal H pylori IgA. | ['To assess the effect of treatment of seminal Helicobacter pylori in infertile asthenozoospermic men. In all, 223 infertile asthenozoospermic men were consecutively selected. They were... | Yes, seminal Helicobacter pylori treatment improved sperm motility in infertile asthenozoospermic men. | ✔️ [0.8639540672302246] |
| 3 | Does a migrating ciliary gate compartmentalize the site of axoneme assembly in Drosophila spermatids? | Our findings demonstrate that the ciliary gate can migrate away from the base of the cilium, thereby functioning independently of the centriole and of a... | ['In most cells, the cilium is formed within a compartment separated from the cytoplasm. Entry into the ciliary compartment is regulated by a specialized gate... | Yes, a migrating ciliary gate compartmentalizes the site of axoneme assembly in Drosophila spermatids. | ✔️ [0.6595481634140015] |
| 4 | Is individual Public Transportation Accessibility Positively Associated with Self-Reported Active Commuting? | This study extends the knowledge about the driving forces of using public transportation for commuting by examining the individual public transportation accessibility. Findings suggest that... | ['Active commuters have lower risk of chronic disease. Understanding which of the, to some extent, modifiable characteristics of public transportation that facilitate its use is... | Yes, individual public transportation accessibility is positively associated with self-reported active commuting. | ✔️ [0.8073875308036804] |


70.32

Based on our simple metric, we got a significant improvement: the average score on the dev set rose from 32.56 to 70.32.
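The percentages printed by `Evaluate` can be reproduced from the summed metric values shown in the progress bars: the total is divided by the number of dev examples and expressed as a percentage. The helper name `average_percent` is introduced here only for this small worked check.

```python
def average_percent(total_score: float, n_examples: int) -> float:
    """Average metric over the dev set, expressed as a percentage."""
    return round(100 * total_score / n_examples, 2)


# Summed metric values taken from the two evaluation runs above.
baseline = average_percent(16.2783067018725, 50)
optimized = average_percent(35.15759664177895, 50)

print(baseline, optimized, round(optimized - baseline, 2))
```

Keep in mind that the dev set has only 50 examples and that the model's outputs are not deterministic, so the exact numbers may vary between runs.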

### Inspect the optimized prompt

Let's take a look at the few shot examples that made our results improve...

In [36]:
lm.inspect_history(n=1)




Answer questions with short factoid answers.

---

Question: Do tumor-infiltrating immune cell profiles and their change after neoadjuvant chemotherapy predict response and prognosis of breast cancer?
Answer: Breast cancer immune cell subpopulation profiles, determined by immunohistochemistry-based computerized analysis, identify groups of patients characterized by high response (in the pre-treatment setting) and poor prognosis (in the post-treatment setting). Further understanding of the mechanisms underlying the distribution of immune cells and their changes after chemotherapy may contribute to the development of new immune-targeted therapies for breast cancer.

Question: Do large portion sizes increase bite size and eating rate in overweight women?
Answer: Increasing portion size led to a larger bite size and faster eating rate, but a slower reduction in eating speed during the meal. These changes may underlie greater energy intakes with exposure to large portions. Interventions 

## Optimized Haystack Pipeline

We can now use the static part of the optimized prompt (including examples) and create a better Haystack RAG Pipeline.

We include an `AnswerBuilder`, to capture only the relevant part of the generation (all text after `Answer: `).
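To get a feel for what the pattern `Answer: (.*)` captures, the sketch below applies it with Python's `re` module to a made-up reply. It does not reproduce `AnswerBuilder` internals; `extract_answer` is a hypothetical helper, and returning an empty string when nothing matches is a choice made for this sketch.

```python
import re


def extract_answer(reply: str, pattern: str = r"Answer: (.*)") -> str:
    """Return the text captured by the first group of the pattern, if any."""
    match = re.search(pattern, reply)
    return match.group(1) if match else ""


# Made-up reply in the style of a chain-of-thought generation.
reply = (
    "produce the answer. We look at the retrieved context first.\n"
    "Answer: Yes, it does.\n"
    "Some trailing text on another line"
)

print(extract_answer(reply))
```

Since `.` does not match a newline by default, the capture group stops at the end of the line that starts with `Answer: `, so the reasoning before it and any text on later lines are dropped.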

In [38]:
%%capture

static_prompt = lm.inspect_history(n=1).rpartition("---\n")[0]

In [40]:
print(static_prompt)
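The call to `rpartition("---\n")` splits the prompt at the last separator, so everything before it (instructions and few-shot examples) is kept and the final, question-specific block is discarded. A minimal illustration on a made-up string, where `toy_history` stands in for the real prompt history:

```python
# Made-up prompt with the same "---" separators as the DSPy prompt.
toy_history = (
    "Instructions for the task.\n\n---\n\n"
    "Question: example question\nAnswer: example answer\n\n---\n\n"
    "Question: the last question asked\nAnswer: the last answer"
)

# rpartition splits at the LAST occurrence of the separator.
static_part, separator, dynamic_part = toy_history.rpartition("---\n")

print(static_part)
```

Only `static_part` is reused in the Haystack template; the dynamic part is rebuilt for every new question by the `PromptBuilder`.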

In [42]:
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder, AnswerBuilder
from haystack import Pipeline


template = static_prompt+"""
---

Context:
{% for document in documents %}
    «{{ document.content }}»
{% endfor %}

Question: {{question}}
Reasoning: Let's think step by step in order to
"""

new_prompt_builder = PromptBuilder(template=template)

new_retriever = InMemoryBM25Retriever(document_store, top_k=3)
new_generator = OpenAIGenerator(model="gpt-3.5-turbo")

answer_builder = AnswerBuilder(pattern="Answer: (.*)")


optimized_rag_pipeline = Pipeline()
optimized_rag_pipeline.add_component("retriever", new_retriever)
optimized_rag_pipeline.add_component("prompt_builder", new_prompt_builder)
optimized_rag_pipeline.add_component("llm", new_generator)
optimized_rag_pipeline.add_component("answer_builder", answer_builder)

optimized_rag_pipeline.connect("retriever", "prompt_builder.documents")
optimized_rag_pipeline.connect("prompt_builder", "llm")
optimized_rag_pipeline.connect("llm.replies", "answer_builder.replies")

<haystack.core.pipeline.pipeline.Pipeline object at 0x0000027756C48C40>
🚅 Components
  - retriever: InMemoryBM25Retriever
  - prompt_builder: PromptBuilder
  - llm: OpenAIGenerator
  - answer_builder: AnswerBuilder
🛤️ Connections
  - retriever.documents -> prompt_builder.documents (List[Document])
  - prompt_builder.prompt -> llm.prompt (str)
  - llm.replies -> answer_builder.replies (List[str])

Let's ask the same questions as before...

In [43]:
question = "What effects does ketamine have on rat neural stem cells?"

response = optimized_rag_pipeline.run({"retriever": {"query": question}, "prompt_builder": {"question": question}, "answer_builder": {"query": question}})

print(response["answer_builder"]["answers"][0].data)

In [44]:
question = "Is the anterior cingulate cortex linked to pain-induced depression?"
response = optimized_rag_pipeline.run({"retriever": {"query": question}, "prompt_builder": {"question": question}, "answer_builder": {"query": question}})

print(response["answer_builder"]["answers"][0].data)

The answers are correct and shorter than before!

*(Notebook by [Stefano Fiorucci](https://github.com/anakin87))*