# Step-Back Prompting (Question-Answering)

"Step-Back" prompting is a technique that can improve performance on complex questions by first asking a more generic "step-back" question. It can be combined with a regular question-answering application by running retrieval on both the original question and the step-back question.

Read the paper [here](https://arxiv.org/abs/2310.06117).

See an excellent blog post on this by Cobus Greyling [here](https://cobusgreyling.medium.com/a-new-prompt-engineering-technique-has-been-introduced-called-step-back-prompting-b00e8954cacb).

In this cookbook we will replicate this technique. We modify the prompts used slightly to work better with chat models.
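
Before wiring this up with LangChain, the overall flow can be sketched in plain Python. This is only an illustration under stated assumptions: `step_back_answer`, `generate_step_back`, `retrieve`, and `answer` are hypothetical names, not LangChain APIs, and the stubs in the usage part stand in for a model and a search engine.

```python
from typing import Callable, List


def step_back_answer(
    question: str,
    generate_step_back: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[str], List[str]], str],
) -> str:
    """Answer `question` using context retrieved for it and for its step-back form."""
    # 1. Ask the model for a more generic version of the question.
    step_back_question = generate_step_back(question)
    # 2. Retrieve context for both the original and the step-back question.
    normal_context = retrieve(question)
    step_back_context = retrieve(step_back_question)
    # 3. Answer the original question with both contexts available.
    return answer(question, normal_context, step_back_context)


# Usage with stand-in stubs (hypothetical; no model or search engine is called).
queries_seen: List[str] = []


def fake_retrieve(query: str) -> List[str]:
    queries_seen.append(query)
    return [f"snippet about: {query}"]


result = step_back_answer(
    "was langchain around in 2021?",
    generate_step_back=lambda q: "what is the history of langchain?",
    retrieve=fake_retrieve,
    answer=lambda q, normal, step_back: f"{q} | {len(normal) + len(step_back)} snippets",
)
print(result)
```

The cells that follow implement the same three steps with a chat model, a few-shot prompt, and a Google search wrapper.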

In [17]:
from langchain.chat_models import ChatAnthropic
from langchain.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnableLambda

In [2]:
# Few Shot Examples
examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?"
    },
    {
        "input": "Jan Sindel’s was born in what country?", 
        "output": "what is Jan Sindel’s personal history?"
    },
]
# We now transform these to example messages
example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)
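
Conceptually, the few-shot template turns each example into a human message followed by an AI message, so the model sees the examples as prior conversation turns. A plain-Python sketch of that expansion, for illustration only (`expand_examples` is a hypothetical helper, not part of LangChain):

```python
from typing import Dict, List, Tuple


def expand_examples(examples: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Turn input/output examples into alternating (role, text) message pairs."""
    messages: List[Tuple[str, str]] = []
    for example in examples:
        messages.append(("human", example["input"]))
        messages.append(("ai", example["output"]))
    return messages


demo_examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?",
    },
    {
        "input": "Jan Sindel’s was born in what country?",
        "output": "what is Jan Sindel’s personal history?",
    },
]
demo_messages = expand_examples(demo_examples)
for role, text in demo_messages:
    print(f"{role}: {text}")
```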

In [3]:
# Build the step-back prompt: a system instruction, the two few-shot examples, then the new question.
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:"""),
    # Few shot examples
    few_shot_prompt,
    # New question
    ("user", "{question}"),
])

In [32]:
question_gen = prompt | ChatAnthropic(model="claude-2") | StrOutputParser()

In [52]:
question = "was langchain around in 2021?"

In [53]:
question_gen.invoke({"question": question})

' what is the history of langchain?'

In [54]:
from langchain.utilities import GoogleSearchAPIWrapper


search = GoogleSearchAPIWrapper()

def retriever(query):
    return search.results(query, 5)
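
`search.results` returns a list of result dictionaries rather than a string, so the prompt ends up receiving the list's default string form. If cleaner context is preferred, the results could be flattened first. A minimal sketch, assuming each result is a dict with `title`, `link`, and `snippet` keys (worth checking against your installed version); `format_results` is a hypothetical helper and the sample data is made up for illustration:

```python
from typing import Dict, List


def format_results(results: List[Dict[str, str]]) -> str:
    """Flatten search results into a numbered plain-text block for a prompt."""
    lines = []
    for i, result in enumerate(results, start=1):
        # Use .get so a result that lacks a key does not raise KeyError.
        title = result.get("title", "")
        snippet = result.get("snippet", "")
        lines.append(f"[{i}] {title}: {snippet}")
    return "\n".join(lines)


# Made-up sample data, only to show the output shape.
sample = [
    {"title": "Example page", "link": "https://example.com/a", "snippet": "First snippet."},
    {"title": "Another page", "link": "https://example.com/b", "snippet": "Second snippet."},
]
formatted = format_results(sample)
print(formatted)
```

One option would be to call it inside `retriever`, for example `return format_results(search.results(query, 5))`.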

In [55]:
response_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.

{normal_context}
{step_back_context}

Original Question: {question}
Answer:"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

In [56]:
chain = {
    # Retrieve context using the normal question
    "normal_context": RunnableLambda(lambda x: x['question']) | retriever,
    # Retrieve context using the step-back question
    "step_back_context": question_gen | retriever,
    # Pass on the question
    "question": lambda x: x["question"]
} | response_prompt | ChatAnthropic(model="claude-2") | StrOutputParser()
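
In the chain above, the leading dictionary is run as a parallel step: each value receives the same input, and the outputs are collected under the matching keys, which then fill the prompt's variables. A rough plain-Python analogy of that step (an illustration only, not LangChain's implementation; `run_parallel` and the stub functions are hypothetical):

```python
from typing import Any, Callable, Dict


def run_parallel(steps: Dict[str, Callable[[Any], Any]], value: Any) -> Dict[str, Any]:
    """Call every step with the same input and collect the outputs by key."""
    return {key: step(value) for key, step in steps.items()}


# Stubs standing in for question_gen and retriever.
def fake_question_gen(x: dict) -> str:
    return "what is the history of langchain?"


def fake_retriever(query: str) -> str:
    return f"<results for {query!r}>"


steps = {
    "normal_context": lambda x: fake_retriever(x["question"]),
    "step_back_context": lambda x: fake_retriever(fake_question_gen(x)),
    "question": lambda x: x["question"],
}
prompt_inputs = run_parallel(steps, {"question": "was langchain around in 2021?"})
print(prompt_inputs)
```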

In [57]:
chain.invoke({"question": question})

' Based on the provided context, it does not appear that LangChain existed in 2021. The key points indicating this:\n\n- The Wikipedia article on LangChain states it was launched in October 2022 by Harrison Chase while working at Robust Intelligence.\n\n- Several articles mention capabilities of LangChain in comparison to GPT-3 capabilities in 2021, implying LangChain came after GPT-3 was already established.\n\n- The StackOverflow post asking about best databases for LangChain history indicates the project is newly exploring storage options, not something established in 2021. \n\n- The Medium article about an Elasticsearch agent using LangChain shows examples querying data from 2015-2021, again implying LangChain arose after that timeframe.\n\nSo in summary, multiple sources suggest LangChain first launched in 2022, not 2021. There is no indication in the provided context that LangChain or a predecessor existed in 2021.'

## Baseline

In [58]:
response_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.

{normal_context}

Original Question: {question}
Answer:"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

In [59]:
chain = {
    # Retrieve context using the normal question only (no step-back context)
    "normal_context": RunnableLambda(lambda x: x['question']) | retriever,
    # Pass on the question
    "question": lambda x: x["question"]
} | response_prompt | ChatAnthropic(model="claude-2") | StrOutputParser()

In [60]:
chain.invoke({"question": question})

' Based on the provided context, there are a few relevant points:\n\n- The Towards Data Science article notes that LLMs like GPT-3 were trained on data before September 2021, implying LangChain did not exist yet. \n\n- The Hacker News post mentions that GPT-3 based companies emerged in a "first wave" around 2021. LangChain is not mentioned as part of this first wave.\n\n- The Medium article by gil fernandes uses LangChain for a project and is from October 2022, suggesting LangChain emerged sometime after 2021.\n\n- I could not find any direct mentions of LangChain existing in 2021 in the provided snippets. \n\nIn summary, the available context indicates that LangChain most likely did not exist in 2021 and instead emerged sometime after that. The sources suggest it was not part of the initial wave of GPT-3 based companies and products in 2021.'