### Step Back Prompting

Step-back prompting encourages the LLM to reason at a higher level of abstraction before tackling a task. We first generate a more abstract "step-back" question from the user query and use it to retrieve broader background context. We then retrieve context with the original query as well, and pass both sets of context to the model to produce the final answer. This approach tends to work well for reasoning-heavy questions, because the model can compare the general principles surfaced by the step-back query with the specific details retrieved for the direct query, which yields a more nuanced response. In short, the method combines abstraction with reasoning: the model answers the specific question while also drawing on the underlying principles.

For example, if we ask a question of the form "Why does my LangGraph agent astream_events return {LONG_TRACE} instead of {DESIRED_OUTPUT}?", we will likely retrieve more relevant documents by searching with the more generic question "How does astream_events work with a LangGraph agent?" than by searching with the specific user question.
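The two-pass retrieval described above can be sketched without any LLM or vector store. This is a toy illustration only: the corpus, `keyword_retriever`, `step_back_retrieve`, and `fake_step_back` are hypothetical names invented for this sketch, and the keyword-overlap scoring stands in for the embedding search used later in the notebook.

```python
from typing import Callable, Dict, List

# Hypothetical toy corpus standing in for the indexed documents.
CORPUS: List[str] = [
    "astream_events streams intermediate events from every runnable in a LangGraph agent",
    "filter astream_events output by event name to keep only the final answer",
    "Chroma is a vector store that keeps embeddings for similarity search",
]


def keyword_retriever(query: str, k: int = 2) -> List[str]:
    """Toy retriever: rank corpus entries by lowercase words shared with the query."""
    query_words = set(query.lower().replace("?", "").split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in CORPUS]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def step_back_retrieve(
    question: str,
    make_step_back: Callable[[str], str],
    retriever: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Retrieve once with the original question and once with its step-back version."""
    return {
        "normal_context": retriever(question),
        "step_back_context": retriever(make_step_back(question)),
    }


def fake_step_back(question: str) -> str:
    """Stand-in for the LLM call that would write the step-back question."""
    return "How does astream_events work with a LangGraph agent?"


contexts = step_back_retrieve(
    "Why does my LangGraph agent astream_events return a long trace?",
    fake_step_back,
    keyword_retriever,
)
for name, docs in contexts.items():
    print(name, docs)
```

Both result lists are kept under separate keys so that the answering prompt can present the direct context and the broader context side by side.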

In [1]:
from dotenv import load_dotenv, dotenv_values
import google.generativeai as genai
from IPython.display import Markdown, display
import os

# Load GOOGLE_API_KEY from a local .env file
load_dotenv()
my_api_key = os.getenv("GOOGLE_API_KEY")
genai.configure(api_key=my_api_key)

In [10]:
#### INDEXING ####

# Load blog
import bs4
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
blog_docs = loader.load()

# Split
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300,
    chunk_overlap=50,
)

# Make splits
splits = text_splitter.split_documents(blog_docs)

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [11]:
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import Chroma

## Call Embedding Model
embedding = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")


vectorstore = Chroma.from_documents(documents=splits, 
                                    embedding=embedding)

retriever = vectorstore.as_retriever()

In [13]:
from langchain_core.output_parsers import StrOutputParser

from langchain_google_genai import ChatGoogleGenerativeAI

from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate

# LLM
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)


examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?",
    },
    {
        "input": "Jan Sindel was born in what country?",
        "output": "what is Jan Sindel’s personal history?",
    },
]
# We now transform these to example messages
example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:""",
        ),
        # Few shot examples
        few_shot_prompt,
        # New question
        ("user", "{question}"),
    ]
)


In [14]:
generate_queries_step_back = prompt | llm | StrOutputParser()
question = "What is task decomposition for LLM agents?"
generate_queries_step_back.invoke({"question": question})

'How can complex tasks be broken down for AI agents? \n'

In [16]:
from langchain_core.runnables import RunnablePassthrough, RunnableLambda

# Response prompt 
response_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and should not contradict the following context if it is relevant. Otherwise, ignore the context if it is not relevant.

# {normal_context}
# {step_back_context}

# Original Question: {question}
# Answer:"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

chain = (
    {
        # Retrieve context using the normal question
        "normal_context": RunnableLambda(lambda x: x["question"]) | retriever,
        # Retrieve context using the step-back question
        "step_back_context": generate_queries_step_back | retriever,
        # Pass on the question
        "question": lambda x: x["question"],
    }
    | response_prompt
    | llm
    | StrOutputParser()
)

res = chain.invoke({"question": question})
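
In the chain above, each retriever returns a list of document objects, and the prompt template renders those lists as they are. One option, not used in the original notebook, is to join the text of the retrieved chunks into a single string before it reaches the prompt. The following is a minimal sketch, assuming each document exposes a `page_content` attribute as LangChain `Document` objects do; `Doc` and `format_docs` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document with a page_content field."""
    page_content: str


def format_docs(docs: Iterable[Doc], separator: str = "\n\n") -> str:
    """Join chunk texts into one string, skipping empty and repeated chunks."""
    seen = set()
    parts = []
    for doc in docs:
        text = doc.page_content.strip()
        if text and text not in seen:
            seen.add(text)
            parts.append(text)
    return separator.join(parts)


print(format_docs([Doc("Task decomposition splits a task. "), Doc("CoT thinks step by step.")]))
```

In the notebook, a helper like this could be piped after each retriever, for example `retriever | RunnableLambda(format_docs)`; that wiring is an assumption and was not run against the chain shown here.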

In [17]:
print(res)

Task decomposition is a crucial aspect of LLM-powered autonomous agents, enabling them to tackle complex tasks effectively. It involves breaking down a large, intricate task into smaller, more manageable subtasks. This approach allows the agent to handle each subtask individually, making the overall problem easier to solve.

Here's how task decomposition works in LLM agents:

* **Chain of Thought (CoT):** This prompting technique encourages the LLM to "think step by step," decomposing a complex task into smaller, simpler steps. This approach helps the model utilize more test-time computation and provides insights into its reasoning process.
* **Tree of Thoughts (ToT):** ToT extends CoT by exploring multiple reasoning possibilities at each step. It creates a tree structure by generating multiple thoughts for each subtask, allowing for a more comprehensive exploration of potential solutions.
* **Methods for Task Decomposition:**
    * **LLM Prompting:** Simple prompts like "Steps for XYZ