# Improving LangChain RAG with DSPy

In the following notebook, we'll explore the integration between DSPy and LangChain!

With this integration, we keep the familiar LangChain and LCEL interface while using DSPy to optimize how the chain prompts the LLM.

### API Keys and Dependencies

We'll provide our OpenAI API key, and install all the required libraries for this example!

In [None]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key:")

Enter your OpenAI API key:··········
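If the key is already exported in your environment, there's no need to be prompted again. Here's a minimal sketch of that check; `ensure_api_key` is a hypothetical helper name introduced for illustration and is not part of LangChain or DSPy.

```python
import os
import getpass


def ensure_api_key(name="OPENAI_API_KEY", prompt_fn=getpass.getpass):
    """Return the key from the environment, prompting only when it is missing."""
    # Hypothetical helper: it only wraps os.environ and getpass.
    if not os.environ.get(name):
        os.environ[name] = prompt_fn(f"Enter your {name}: ")
    return os.environ[name]
```

Calling `ensure_api_key()` at the top of the notebook would then behave like the cell above on a fresh machine and stay silent when the variable is already set.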


We'll want LangChain, Pydantic, and DSPy first and foremost.

> NOTE: As of the time of recording (07/24/2024), `dspy_ai==2.1.4` is required to ensure the notebook runs.

In [None]:
!pip install -qU langchain==0.2.7 pydantic==2.8.2 dspy_ai==2.1.4

[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m40.9/40.9 kB[0m [31m2.9 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.4/50.4 kB[0m [31m3.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m983.6/983.6 kB[0m [31m17.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m145.4/145.4 kB[0m [31m11.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m520.4/520.4 kB[0m [31m34.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m302.2/302.2 kB[0m [31m22.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m374.2/374.2 kB[0m [31m26.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m139.8/139.8 kB[0m [31m10.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━

Then we'll install our LangChain community packages.

In [None]:
!pip install -qU langchain_openai langchain_community langchain_core langchain_qdrant

[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/46.7 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m46.7/46.7 kB[0m [31m2.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.3/2.3 MB[0m [31m31.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m990.3/990.3 kB[0m [31m50.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m337.0/337.0 kB[0m [31m13.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m254.1/254.1 kB[0m [31m15.4 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.1/1.1 MB[0m [31m54.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.3/2.3 MB[0m [31m91.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m

We'll also install the backends those community packages rely on: the Qdrant client and PyMuPDF.

In [None]:
!pip install -qU qdrant-client pymupdf

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.5/3.5 MB[0m [31m32.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m15.9/15.9 MB[0m [31m115.7 MB/s[0m eta [36m0:00:00[0m
[?25h

## Simple RAG with LCEL

To begin, things will look very similar to traditional LCEL RAG with LangChain!

Let's start by grabbing our document that we'll be focusing on for this session!

In [None]:
from langchain_community.document_loaders import PyMuPDFLoader

document_loader = PyMuPDFLoader("https://d1lamhf6l6yk6d.cloudfront.net/uploads/2021/08/The-pmarca-Blog-Archives.pdf")
documents = document_loader.load()

Then we'll chunk our document into bite-sized pieces of context.

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
split_documents = text_splitter.split_documents(documents)
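To see what `chunk_size` and `chunk_overlap` mean, here's a simplified sketch using a plain sliding window over characters. This is a stand-in, not the splitter's real algorithm: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraph and line breaks. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap."""
    # Each new window starts (chunk_size - chunk_overlap) characters after
    # the previous one, so neighbours share `chunk_overlap` characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Tiny values make the overlap easy to see.
chunks = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.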

Then we'll embed our chunked documents into a vector store - for this example we'll be using Qdrant, but you could substitute your favourite vector database, vector store, or retriever here.

In [None]:
from langchain_community.vectorstores import Qdrant
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

vectorstore = Qdrant.from_documents(
    split_documents,
    embeddings,
    location=":memory:",
    collection_name="PMarca",
)

Finally, we'll convert our vector store into a retriever.

In [None]:
retriever = vectorstore.as_retriever()

We'll set a cache since we'll be making a lot of LLM calls throughout the notebook today.



In [None]:
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache

set_llm_cache(SQLiteCache(database_path="cache.db"))
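The idea behind the cache can be sketched with an in-memory dictionary: an identical request to the same model is answered from the store instead of triggering a second call. `MemoCache` is a hypothetical class for illustration only. LangChain's `SQLiteCache` persists entries to disk and handles keying itself.

```python
class MemoCache:
    """Tiny in-memory stand-in for an LLM cache keyed on (model, prompt)."""

    def __init__(self):
        self._store = {}
        self.misses = 0  # counts how many real "LLM" calls were made

    def call(self, model, prompt, llm_fn):
        key = (model, prompt)
        if key not in self._store:
            # Cache miss: run the expensive call once and remember the result.
            self.misses += 1
            self._store[key] = llm_fn(prompt)
        return self._store[key]


cache = MemoCache()
first = cache.call("demo-model", "hello", str.upper)   # computed
second = cache.call("demo-model", "hello", str.upper)  # served from the cache
```

This is why re-running evaluation cells later in the notebook can be much faster than the first run.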

We'll set up a helper function so that our retriever returns plain passage text in the form our LCEL chain below expects.

In [None]:
def retrieve(inputs):
  return [doc.page_content for doc in retriever.invoke(inputs["question"])]

We'll be using `gpt-4o-mini` as our base LLM today - you can read more about this inexpensive model [here](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/)!

In [None]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

We'll set our initial prompt - which will be optimized by DSPy - and initialize our LCEL chain.




In [None]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough

prompt = PromptTemplate.from_template(
    "Given {context}, answer the question `{question}` as a tweet. Your response should only contain the tweet."
)

naive_rag_chain = (
    RunnablePassthrough.assign(context=retrieve) | prompt | llm | StrOutputParser()
)
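The first step of the chain, `RunnablePassthrough.assign(context=retrieve)`, keeps the incoming dict and adds a `context` key computed from it. Here's a plain-Python sketch of that behaviour; `assign` and `fake_retrieve` are hypothetical stand-ins, with no LangChain or vector store involved.

```python
def assign(inputs, **computed):
    """Return a copy of `inputs` with extra keys computed from it."""
    # Each keyword maps a new key to a function of the original inputs.
    return {**inputs, **{key: fn(inputs) for key, fn in computed.items()}}


def fake_retrieve(inputs):
    # Stub retriever: a real one would query the vector store.
    return [f"passage about {inputs['question']}"]


state = assign({"question": "startups"}, context=fake_retrieve)
print(state)
```

The resulting dict carries both `question` and `context`, which are exactly the two variables the prompt template needs.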

### DSPy Integration

So far in the notebook, we've been relying on LangChain - but it's finally time to integrate DSPy!

We're going to rely on two key integrations:

- [`LangChainPredict`](https://github.com/stanfordnlp/dspy/blob/af5617aec7b298f8688e68d6087804e702e61ba0/dspy/predict/langchain.py#L34) - The `LangChainPredict` class bridges LangChain's language models with DSPy's prediction framework. It manages prompt generation, model execution, and result handling, while also offering features for state management and input/output structure determination.
- [`LangChainModule`](https://github.com/stanfordnlp/dspy/blob/af5617aec7b298f8688e68d6087804e702e61ba0/dspy/predict/langchain.py#L139) - The `LangChainModule` class wraps a LangChain Expression Language (LCEL) chain into a DSPy module. It extracts `LangChainPredict` instances from the LCEL graph, handles the chain's execution, and formats the output to be compatible with DSPy's prediction framework.

So the basic idea is: we wrap the LCEL chain in a DSPy module - and because DSPy can optimize based on the inputs/outputs of a "black box", we can now treat this module *exactly* as we would any other DSPy module!

In [None]:
from dspy.predict.langchain import LangChainModule, LangChainPredict

zeroshot_chain = (
    RunnablePassthrough.assign(context=retrieve)
    | LangChainPredict(prompt, llm)
    | StrOutputParser()
)

zeroshot_chain = LangChainModule(
    zeroshot_chain
)

Let's try out the system we've built!

In [None]:
question = "What is the best part about California?"

zeroshot_chain.invoke({"question": question})

"California is the heart of innovation! 🌟 From Silicon Valley's tech boom to LA's entertainment magic, it offers unmatched opportunities and a vibrant culture. #CaliforniaDreaming #InnovationHub"

## Synthetic Dataset for Evaluation/Optimization

In classic form - we'll use OpenAI models to generate test examples, which we'll then use with DSPy to evaluate our baseline module, optimize it, and re-evaluate the optimized version!

Our synthetic datagen incorporates the following simple steps:

1. Generate a question from the provided context
2. Generate a Tweet from the context and question

In [None]:
NUM_SAMPLES_TO_GENERATE = 250

In [None]:
import tqdm

question_list = []
answer_list = []

question_llm = ChatOpenAI(model="gpt-4o-mini")
question_prompt = PromptTemplate.from_template(
    "Given a context, generate a question that could be answered by that context. You must only respond with the question. Context:\n{context}\n\nQuestion:\n"
)

question_chain = question_prompt | question_llm | StrOutputParser()

answer_llm = ChatOpenAI(model="gpt-4o")
answer_prompt = PromptTemplate.from_template(
    "Given a context and a question, create a tweet about the question and context. You must only respond with the tweet. Context:\n{context}\n\nQuestion:\n{question}\n\nTweet:\n"
)

answer_chain = answer_prompt | answer_llm | StrOutputParser()

if NUM_SAMPLES_TO_GENERATE > len(split_documents):
  NUM_SAMPLES_TO_GENERATE = len(split_documents)
  print(f"WARNING: reducing number of samples to {NUM_SAMPLES_TO_GENERATE}")

for context in tqdm.tqdm(split_documents[:NUM_SAMPLES_TO_GENERATE]):
  question = question_chain.invoke({"context": context.page_content})
  answer = answer_chain.invoke({"context": context.page_content, "question": question})
  question_list.append(question)
  answer_list.append(answer)

100%|██████████| 250/250 [10:46<00:00,  2.58s/it]


We need to move our newly created data into a more useful form to be used with DSPy.

In [None]:
from dspy import Example

train_samples = int(NUM_SAMPLES_TO_GENERATE * 0.8)
dev_samples = int(NUM_SAMPLES_TO_GENERATE * 0.1)
val_samples = int(NUM_SAMPLES_TO_GENERATE * 0.1)

sample_count = 0

train_set = []
dev_set = []
val_set = []

for question, answer in zip(question_list, answer_list):
  if sample_count < train_samples:
    train_set.append(Example(question=question, answer=answer).with_inputs("question"))
  elif sample_count < train_samples + dev_samples:
    dev_set.append(Example(question=question, answer=answer).with_inputs("question"))
  else:
    val_set.append(Example(question=question, answer=answer).with_inputs("question"))
  sample_count += 1
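The same partition can be written with list slicing, which makes the resulting sizes easier to check. This is a sketch under the fractions used above, not a replacement for the cell. `split_examples` is a hypothetical helper, and integers stand in for the `Example` objects.

```python
def split_examples(items, train_frac=0.8, dev_frac=0.1):
    """Slice items into train/dev/val; val takes whatever remains."""
    n_train = int(len(items) * train_frac)
    n_dev = int(len(items) * dev_frac)
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    val = items[n_train + n_dev:]  # absorbs any rounding remainder
    return train, dev, val


train, dev, val = split_examples(list(range(250)))
print(len(train), len(dev), len(val))
```

As in the loop above, anything left over after the train and dev cut-offs lands in the validation set.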

Now we can create our validation logic - let's dive into the details a bit here.

First - we'll create a signature for how we want to evaluate our Tweet.

> NOTE: We're using LLM-As-A-Judge here to determine a number of scores that we will collect as "Yes" or "No".

In [None]:
import dspy

class Assess(dspy.Signature):
    """Assess the quality of a tweet along the specified dimension."""
    context = dspy.InputField(desc="ignore if N/A")
    assessed_text = dspy.InputField()
    assessment_question = dspy.InputField()
    assessment_answer = dspy.OutputField(desc="Yes or No")

Now we can set up our Judge model.

In [None]:
gpt4T = dspy.OpenAI(model="gpt-4-turbo", max_tokens=1000, model_type="chat")
METRIC = None

Let's evaluate on a number of metrics:

1. Engagement
2. Faithfulness
3. Dopeness
4. Correctness

In [None]:
def metric(gold, pred, trace=None):
    question, answer, tweet = gold.question, gold.answer, pred.output
    context = retriever.invoke(question)

    engaging = "Does the assessed text make for a self-contained, engaging tweet?"
    faithful = "Is the assessed text grounded in the context? Say no if it includes significant facts not in the context."
    dope = "Is the assessed text dope, lit, cool, fire?"
    correct = (
        f"The text above should answer `{question}`. The gold answer is `{answer}`."
        )
    correct = f"{correct} does the assessed text communicate the same idea as the gold answer?"

    with dspy.context(lm=gpt4T):
        faithful = dspy.Predict(Assess)(
            context=context, assessed_text=tweet, assessment_question=faithful
        )
        engaging = dspy.Predict(Assess)(
            context="N/A", assessed_text=tweet, assessment_question=engaging
        )
        dope = dspy.Predict(Assess)(
            context="N/A", assessed_text=tweet, assessment_question=dope
        )
        correct = dspy.Predict(Assess)(
            context="N/A", assessed_text=tweet, assessment_question=correct
        )

    correct, engaging, faithful, dope = [
        m.assessment_answer.split()[0].lower() == "yes"
        for m in [correct, engaging, faithful, dope]
    ]
    score = (engaging + faithful + dope) if correct and (len(tweet) <= 280) else 0

    if METRIC is not None:
        if METRIC == "correct":
            return correct
        if METRIC == "engaging":
            return engaging
        if METRIC == "faithful":
            return faithful
        if METRIC == "dope":
            return dope

    if trace is not None:
        return score >= 3
    return score / 3.0
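The scoring rule at the end of `metric` is easier to follow with the LLM judge calls stripped out. The sketch below mirrors only that arithmetic using plain booleans. `combine` is a hypothetical name, and the comment about `trace` reflects how DSPy optimizers typically call a metric during bootstrapping, which is an assumption here.

```python
def combine(correct, engaging, faithful, dope, tweet_len, trace=None):
    """Mirror the scoring rule above using plain booleans instead of LLM calls."""
    # Correctness and the 280-character limit act as gates; the other three
    # judgments each contribute one point.
    score = (engaging + faithful + dope) if correct and tweet_len <= 280 else 0
    if trace is not None:
        # Assumed bootstrapping mode: only a perfect score passes.
        return score >= 3
    # Evaluation mode: a fraction between 0 and 1.
    return score / 3.0


print(combine(True, True, True, False, tweet_len=120))
```

So an incorrect or over-length tweet scores 0 no matter how engaging it is, while a correct tweet that misses one of the three style judgments scores 2/3.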

Now we can evaluate!

In [None]:
from dspy.evaluate.evaluate import Evaluate

evaluate = Evaluate(
    metric=metric, devset=dev_set, num_threads=8, display_progress=True, display_table=5
)

In [None]:
evaluate(zeroshot_chain)

Average Metric: 31.33333333333334 / 50  (62.7): 100%|██████████| 50/50 [00:06<00:00,  7.77it/s]

Average Metric: 31.33333333333334 / 50  (62.7%)





| | question | answer | output | tweet_response | metric |
|---|---|---|---|---|---|
| 0 | What are the signs that indicate a problem with a newly hired executive in a startup? | 🚨 Signs of trouble with a new exec in a startup: 1️⃣ Team not noticeably better or respectful after a few months. 2️⃣ Other execs... | Signs of trouble with a new startup executive: lack of respect from peers, painful interactions, and team performance not improving. If you notice these, it’s... | Signs of trouble with a new startup executive: lack of respect from peers, painful interactions, and team performance not improving. If you notice these, it’s... | 1.0 |
| 1 | What are the two common mistakes people make when firing executives? | Two common mistakes when firing execs: 1) Long transition periods - confusing & demoralizing. 2) Pulling punches - be clear & decisive. Clean breaks are... | Avoid long transition periods and pulling punches! Make a clean break and put someone new in charge to keep the organization moving forward. #Leadership #StartupTips | Avoid long transition periods and pulling punches! Make a clean break and put someone new in charge to keep the organization moving forward. #Leadership #StartupTips | 0.6666666666666666 |
| 2 | What are the potential benefits of terminating an executive's position in a startup? | 🚀 Terminating an executive at a startup can often be a favor: it frees them to find a better fit where they'll be more valued,... | Terminating an executive in a startup can clear the way for better talent, reduce costs, and improve team morale. It often helps the executive find... | Terminating an executive in a startup can clear the way for better talent, reduce costs, and improve team morale. It often helps the executive find... | 0.0 |
| 3 | What is Ben Horowitz's perspective on micromanagement in the context of management practices? | Ben Horowitz argues that micromanagement shouldn't be condemned outright. While we all dread the hyper-controlling manager, he believes there’s value in the practice when applied... | Ben Horowitz argues that micromanagement isn't always bad; it can be essential for training and improving new executives. A little micromanagement at the right times... | Ben Horowitz argues that micromanagement isn't always bad; it can be essential for training and improving new executives. A little micromanagement at the right times... | 1.0 |
| 4 | What is "Task Relevant Maturity" and how does it relate to micromanaging employees and executives? | Ever heard of "Task Relevant Maturity"? 📚 Andy Grove explains it in High Output Management: Employees, even execs, need different levels of guidance based on... | "Task Relevant Maturity" refers to an employee's experience level with a specific task. Micromanaging is beneficial for those who are immature in a task, including... | "Task Relevant Maturity" refers to an employee's experience level with a specific task. Micromanaging is beneficial for those who are immature in a task, including... | 1.0 |


62.67
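The per-row `metric` values (1.0, 0.667, 0.0) are the `score / 3.0` fractions returned by `metric`, and the headline figure is their mean expressed as a percentage. Here's a small sketch of that arithmetic; `average_metric` is a hypothetical helper, not DSPy's own code.

```python
def average_metric(scores):
    """Return the mean score as a percentage, rounded to two decimals."""
    return round(100 * sum(scores) / len(scores), 2)


# Only the five rows displayed above, not the full 50-example dev set.
shown = [1.0, 2 / 3, 0.0, 1.0, 1.0]
print(average_metric(shown))

# The reported total: a summed score of 31.33... over 50 examples.
reported = round(100 * 31.33333333333334 / 50, 2)
print(reported)
```

The five displayed rows happen to average higher than the full dev set, so the preview table shouldn't be read as representative of the overall 62.67.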

In [None]:
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

optimizer = BootstrapFewShotWithRandomSearch(
    metric=metric, max_bootstrapped_demos=5, num_candidate_programs=3
)

Going to sample between 1 and 5 traces per predictor.
Will attempt to train 3 candidate sets.


In [None]:
optimized_chain = optimizer.compile(zeroshot_chain, trainset=train_set, valset=val_set)

Average Metric: 34.00000000000001 / 50  (68.0): 100%|██████████| 50/50 [00:07<00:00,  6.77it/s]


Average Metric: 34.00000000000001 / 50  (68.0%)
Score: 68.0 for set: [0]
New best score: 68.0 for seed -3
Scores so far: [68.0]
Best score: 68.0


Average Metric: 34.00000000000001 / 50  (68.0): 100%|██████████| 50/50 [00:06<00:00,  7.87it/s]


Average Metric: 34.00000000000001 / 50  (68.0%)
Score: 68.0 for set: [16]
Scores so far: [68.0, 68.0]
Best score: 68.0


  6%|▌         | 9/150 [00:10<02:44,  1.17s/it]


Bootstrapped 5 full traces after 10 examples in round 0.


Average Metric: 33.333333333333336 / 50  (66.7): 100%|██████████| 50/50 [00:46<00:00,  1.08it/s]


Average Metric: 33.333333333333336 / 50  (66.7%)
Score: 66.67 for set: [16]
Scores so far: [68.0, 68.0, 66.67]
Best score: 68.0
Average of max per entry across top 1 scores: 0.6800000000000002
Average of max per entry across top 2 scores: 0.6800000000000002
Average of max per entry across top 3 scores: 0.7733333333333333
Average of max per entry across top 5 scores: 0.7733333333333333
Average of max per entry across top 8 scores: 0.7733333333333333
Average of max per entry across top 9999 scores: 0.7733333333333333


  3%|▎         | 4/150 [00:03<02:04,  1.18it/s]


Bootstrapped 4 full traces after 5 examples in round 0.


Average Metric: 32.66666666666667 / 50  (65.3): 100%|██████████| 50/50 [00:06<00:00,  8.03it/s]


Average Metric: 32.66666666666667 / 50  (65.3%)
Score: 65.33 for set: [16]
Scores so far: [68.0, 68.0, 66.67, 65.33]
Best score: 68.0
Average of max per entry across top 1 scores: 0.6800000000000002
Average of max per entry across top 2 scores: 0.6800000000000002
Average of max per entry across top 3 scores: 0.7733333333333333
Average of max per entry across top 5 scores: 0.8266666666666665
Average of max per entry across top 8 scores: 0.8266666666666665
Average of max per entry across top 9999 scores: 0.8266666666666665


  2%|▏         | 3/150 [00:02<01:40,  1.47it/s]


Bootstrapped 2 full traces after 4 examples in round 0.


Average Metric: 36.66666666666667 / 50  (73.3): 100%|██████████| 50/50 [00:06<00:00,  7.78it/s]


Average Metric: 36.66666666666667 / 50  (73.3%)
Score: 73.33 for set: [16]
New best score: 73.33 for seed 1
Scores so far: [68.0, 68.0, 66.67, 65.33, 73.33]
Best score: 73.33
Average of max per entry across top 1 scores: 0.7333333333333334
Average of max per entry across top 2 scores: 0.8133333333333332
Average of max per entry across top 3 scores: 0.8133333333333332
Average of max per entry across top 5 scores: 0.84
Average of max per entry across top 8 scores: 0.84
Average of max per entry across top 9999 scores: 0.84


  1%|          | 1/150 [00:00<01:43,  1.44it/s]


Bootstrapped 1 full traces after 2 examples in round 0.


Average Metric: 33.66666666666667 / 50  (67.3): 100%|██████████| 50/50 [00:06<00:00,  8.02it/s]

Average Metric: 33.66666666666667 / 50  (67.3%)
Score: 67.33 for set: [16]
Scores so far: [68.0, 68.0, 66.67, 65.33, 73.33, 67.33]
Best score: 73.33
Average of max per entry across top 1 scores: 0.7333333333333334
Average of max per entry across top 2 scores: 0.8133333333333332
Average of max per entry across top 3 scores: 0.8133333333333332
Average of max per entry across top 5 scores: 0.8466666666666667
Average of max per entry across top 8 scores: 0.8533333333333333
Average of max per entry across top 9999 scores: 0.8533333333333333
6 candidate programs found.
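The log above can be read as a simple search: each candidate program is scored on the validation set and the highest scorer is kept. Here's a sketch of that selection step using the scores from the log. The seed numbering (starting at -3) is an assumption inferred from the `seed -3` and `seed 1` lines, not something taken from DSPy's documentation.

```python
# Validation scores copied from the log above, in the order they were reported.
scores = [68.0, 68.0, 66.67, 65.33, 73.33, 67.33]

# Assumption: seeds start at -3 and increase by one per candidate.
seeds = range(-3, -3 + len(scores))

# Keep the candidate with the highest validation score.
best_seed, best_score = max(zip(seeds, scores), key=lambda pair: pair[1])
print(best_seed, best_score)
```

Note that the winner is chosen on `val_set`, so its score on `dev_set` can differ from the 73.33 reported here.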





In [None]:
evaluate(optimized_chain)

Average Metric: 32.666666666666664 / 50  (65.3): 100%|██████████| 50/50 [00:38<00:00,  1.30it/s]

Average Metric: 32.666666666666664 / 50  (65.3%)





| | question | answer | output | tweet_response | metric |
|---|---|---|---|---|---|
| 0 | What are the signs that indicate a problem with a newly hired executive in a startup? | 🚨 Signs of trouble with a new exec in a startup: 1️⃣ Team not noticeably better or respectful after a few months. 2️⃣ Other execs... | Signs of trouble with a new executive in a startup include lack of respect from peers, poor communication, and avoidance in interactions. If these issues... | Signs of trouble with a new executive in a startup include lack of respect from peers, poor communication, and avoidance in interactions. If these issues... | 1.0 |
| 1 | What are the two common mistakes people make when firing executives? | Two common mistakes when firing execs: 1) Long transition periods - confusing & demoralizing. 2) Pulling punches - be clear & decisive. Clean breaks are... | When firing executives, avoid long transition periods and pulling punches. Make a clean break to prevent confusion and demoralization! #Leadership #StartupAdvice | When firing executives, avoid long transition periods and pulling punches. Make a clean break to prevent confusion and demoralization! #Leadership #StartupAdvice | 1.0 |
| 2 | What are the potential benefits of terminating an executive's position in a startup? | 🚀 Terminating an executive at a startup can often be a favor: it frees them to find a better fit where they'll be more valued,... | Terminating an executive can lead to a more motivated team, better alignment with startup goals, and the opportunity for the executive to find a role... | Terminating an executive can lead to a more motivated team, better alignment with startup goals, and the opportunity for the executive to find a role... | 1.0 |
| 3 | What is Ben Horowitz's perspective on micromanagement in the context of management practices? | Ben Horowitz argues that micromanagement shouldn't be condemned outright. While we all dread the hyper-controlling manager, he believes there’s value in the practice when applied... | Ben Horowitz argues that micromanagement can be beneficial in certain situations, especially for new executives who need detailed guidance to improve their skills. It's about... | Ben Horowitz argues that micromanagement can be beneficial in certain situations, especially for new executives who need detailed guidance to improve their skills. It's about... | 1.0 |
| 4 | What is "Task Relevant Maturity" and how does it relate to micromanaging employees and executives? | Ever heard of "Task Relevant Maturity"? 📚 Andy Grove explains it in High Output Management: Employees, even execs, need different levels of guidance based on... | "Task Relevant Maturity" refers to an employee's experience level with a specific task. Micromanaging is beneficial for those who are immature in a task, including... | "Task Relevant Maturity" refers to an employee's experience level with a specific task. Micromanaging is beneficial for those who are immature in a task, including... | 0.6666666666666666 |


65.33

In [None]:
prompt_used, output = dspy.settings.langchain_history[-1]

In [None]:
print(prompt_used)

Essential Instructions: Respond to the given question based on the provided context. Your answer should be concise and formatted as if it were a tweet. Ensure that the response does not exceed the Twitter character limit and is appropriate for a public audience.

---

Follow the following format.

Context: ${context}
Question: ${question}
Tweet Response: ${tweet_response}

---

Context:
[1] «companies can oaen tolerate internal rivalries and warfare;
startups cannot.
Being a startup executive is not an easy job. The rewards are
substantial — the ability to contribute directly to the startups’s
success; the latitude to build and run an organization according
to her own theories and principles; and a meaningful equity
stake that can lead to personal Xnancial independence if the
startup succeeds — but the responsibilities are demanding and
intense.
Hiring
First, if you’re not sure whether you need an executive for a function,
don’t hire one.
Startups, particularly well-funded startups, oa

In [None]:
demos = [
    eg
    for eg in optimized_chain.modules[0].demos
    if hasattr(eg, "augmented") and eg.augmented
]

In [None]:
demos

[Example({'augmented': True, 'question': 'What are the risks associated with hiring an executive in a startup?', 'context': ['companies can oaen tolerate internal rivalries and warfare;\nstartups cannot.\nBeing a startup executive is not an easy job. The rewards are\nsubstantial — the ability to contribute directly to the startups’s\nsuccess; the latitude to build and run an organization according\nto her own theories and principles; and a meaningful equity\nstake that can lead to personal Xnancial independence if the\nstartup succeeds — but the responsibilities are demanding and\nintense.\nHiring\nFirst, if you’re not sure whether you need an executive for a function,\ndon’t hire one.\nStartups, particularly well-funded startups, oaen hire executives\ntoo early. Particularly before a startup has achieved product/\nmarket Xt, it is oMen better to have a highly motivated manager or\ndirector running a function than an executive.\nHiring an executive too quickly can lead to someone who i