# RAG part 2

## Query transformation

The idea is that we can transform the query into something which can make retrieval easier

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = os.getenv("LANGCHAIN_API_KEY")
os.environ['OPENAI_API_KEY'] = os.getenv("OPENAI_API_KEY")

## Create Retriever

In [2]:


# Load blog
import bs4
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
blog_docs = loader.load()

# Split
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300, 
    chunk_overlap=50)

# Make splits
splits = text_splitter.split_documents(blog_docs)

# Index
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(documents=splits, 
                                    embedding=OpenAIEmbeddings())

retriever = vectorstore.as_retriever()


### Idea 1: Multi query

A single query be reframed into multiple queries, so that we can query into a vector DB multiple times and take the union of the documents fetched everytime. 
![rag-multi-query](rag_part_2_multi_query.png)

Using prompts

In [3]:
from langchain.prompts import ChatPromptTemplate

# Multi Query: Different Perspectives
template = """You are an AI language model assistant. Your task is to generate five 
different versions of the given user question to retrieve relevant documents from a vector 
database. By generating multiple perspectives on the user question, your goal is to help
the user overcome some of the limitations of the distance-based similarity search. 
Provide these alternative questions separated by newlines. Original question: {question}"""
prompt_perspectives = ChatPromptTemplate.from_template(template)

from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-turbo", temperature=0.0)

generate_queries = (
    prompt_perspectives 
    | llm
    | StrOutputParser() 
    | (lambda x: x.split("\n"))
)

In [4]:
from IPython.display import display, Markdown

generate_queries.invoke({"question": "What is task decomposition for LLM agents?"})

['How do LLM agents break down complex tasks into simpler components?',
 'What does task decomposition entail in the context of large language models?',
 'Can you explain the process of task decomposition in large language model agents?',
 'What are the methods used by LLM agents for decomposing tasks?',
 'How do large language models manage task decomposition?']

In [5]:
from langchain.load import dumps, loads

def get_unique_union(documents: list[list]):
    """ Unique union of retrieved docs """
    # Flatten list of lists, and convert each Document to string
    flattened_docs = [dumps(doc) for sublist in documents for doc in sublist]
    # Get unique documents
    unique_docs = list(set(flattened_docs))
    # Return
    return [loads(doc) for doc in unique_docs]

# Retrieve
question = "What is task decomposition for LLM agents?"
retrieval_chain = generate_queries | retriever.map() | get_unique_union

In [6]:
docs = retrieval_chain.invoke({"question": question})
len(docs)

  warn_beta(


9

In [7]:
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough

# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

# llm = ChatOpenAI(temperature=0)

final_rag_chain = (
    {"context": retrieval_chain, 
     "question": itemgetter("question")} 
    | prompt
    | llm
    | StrOutputParser()
)

In [8]:
display(Markdown(final_rag_chain.invoke({"question":question})))


Task decomposition for LLM (Large Language Model) agents refers to the process of breaking down a complex task into smaller, more manageable subtasks. This technique enhances the performance of LLMs on intricate tasks by allowing them to focus on simpler components one at a time. The concept is rooted in the "Chain of Thought" (CoT) prompting technique, which instructs the model to think step-by-step, thereby utilizing more test-time computation to dissect difficult tasks into simpler steps. This not only transforms large tasks into multiple manageable tasks but also provides insights into the model’s reasoning process.

Further extending this concept, the "Tree of Thoughts" approach, as mentioned in the provided documents, explores multiple reasoning possibilities at each step of task decomposition. It generates a tree structure by creating multiple thoughts per step, which can be navigated using search strategies like breadth-first search (BFS) or depth-first search (DFS). Each state in this tree can be evaluated using methods such as classifier prompts or majority votes.

Task decomposition can be implemented in various ways:
1. By using simple LLM prompting techniques, such as asking for "Steps for XYZ" or querying about the subgoals for achieving a specific objective.
2. By employing task-specific instructions, for example, instructing the model to "Write a story outline" for novel writing.
3. Through human inputs, where the decomposition is guided by direct human interaction or oversight.

This methodical breakdown not only aids in the systematic handling of tasks but also aligns with enhancing the interpretability and effectiveness of LLMs in performing complex tasks.

In [12]:
display(Markdown(final_rag_chain.invoke({"question":"Describe some key sections of the article in a pointwise manner."})))


Based on the provided documents, here are some key sections of the article described in a pointwise manner:

1. **ReAct Prompt Template**:
   - Introduces a structured format for Language Model (LLM) operations, which includes steps like Thought, Action, and Observation, repeated multiple times to guide the LLM's processing and response generation.

2. **Challenges in Building LLM-Centered Agents**:
   - Discusses common limitations encountered when developing agents centered around large language models, highlighting practical challenges in implementation and operation.

3. **Detailed Coding Instructions**:
   - Provides comprehensive guidelines for coding, including the layout of core classes, functions, and methods with comments on their purposes.
   - Specifies the format for outputting code in markdown blocks, ensuring that filenames and languages are appropriately formatted and that the code is fully functional and compatible across different files.

4. **Performance Evaluation**:
   - Outlines methods for continuous review and analysis of actions to ensure optimal performance.
   - Emphasizes the importance of self-criticism, reflection on past decisions, and efficient command execution to minimize resource usage.

5. **Resources for Agent Operation**:
   - Lists resources available to the agent, such as internet access, long-term memory management, and GPT-3.5 powered agents for task delegation.

6. **Experiments and Observations**:
   - Mentions specific experiments like those conducted in AlfWorld Env and HotpotQA, noting particular issues like hallucination and inefficient planning.

These sections collectively provide insights into the structure, challenges, and operational guidelines for building and managing LLM-centered agents, as well as evaluating their performance in practical scenarios.

In [14]:
display(Markdown(final_rag_chain.invoke({"question":"Describe some of the studies mentioned in this article done in the area of 'Generative Agents Simulation' . "})))


The article mentions a study titled "Generative Agents" by Park et al. (2023), which is a significant experiment in the area of Generative Agents Simulation. In this study, 25 virtual characters, each controlled by a large language model (LLM)-powered agent, interact within a sandbox environment inspired by The Sims. This simulation aims to create believable simulacra of human behavior for interactive applications.

The design of these generative agents integrates LLMs with memory, planning, and reflection mechanisms. This allows the agents to behave based on past experiences and interact with other agents. A key feature of this system is the memory stream, which is an external database functioning as a long-term memory module. It records a comprehensive list of the agents' experiences in natural language, capturing each observation or event directly provided by the agent. This memory stream enables the agents to recall past interactions and use this information to inform future behaviors and decisions.

Inter-agent communication within this simulation can trigger new natural language statements, enhancing the dynamic and interactive nature of the environment. This study showcases the potential of LLMs to simulate complex social interactions and behaviors in a controlled virtual setting.

### Using MultiQueryRetriever

Source: [Link](https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever/)

In [16]:
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

question = "What are the approaches to Task Decomposition?"
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
retriever_from_llm = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(), llm=llm
)

In [17]:
# Set logging for the queries
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

In [20]:
question = "What are the approaches to Task Decomposition in context of LLMs?"
unique_docs = retriever_from_llm.invoke(question)
len(unique_docs)

INFO:langchain.retrievers.multi_query:Generated queries: ['What methods are used for breaking down tasks in large language models?', '', 'How do large language models handle task decomposition?', '', 'Can you describe the strategies for task decomposition in large language models?']


10

In [21]:
# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

# llm = ChatOpenAI(temperature=0)

final_rag_chain = (
    {"context": retriever_from_llm, 
     "question": itemgetter("question")} 
    | prompt
    | llm
    | StrOutputParser()
)

In [23]:
display(Markdown(final_rag_chain.invoke({"question": question})))

INFO:langchain.retrievers.multi_query:Generated queries: ['What methods are used for breaking down tasks in large language models?', '', 'How do large language models handle task decomposition?', '', 'Can you describe the strategies for task decomposition in large language models?']


The approaches to task decomposition in the context of Large Language Models (LLMs) as described in the provided documents include:

1. **Chain of Thought (CoT)**: Introduced by Wei et al. in 2022, this technique involves instructing the model to "think step by step." This method allows the LLM to use more test-time computation to break down complex tasks into smaller, more manageable steps. The CoT approach helps in transforming large tasks into multiple manageable tasks and provides insights into the model’s reasoning process.

2. **Tree of Thoughts**: An extension of the CoT approach, developed by Yao et al. in 2023, which explores multiple reasoning possibilities at each step. It starts by decomposing the problem into multiple thought steps and then generates multiple thoughts per step, creating a tree structure. The search process within this structure can be conducted using either breadth-first search (BFS) or depth-first search (DFS), with each state evaluated by a classifier (via a prompt) or by majority vote.

3. **Simple Prompting Techniques**: These involve using straightforward prompts to guide the LLM in decomposing tasks. Examples include prompts like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", or task-specific instructions such as "Write a story outline." for writing a novel.

4. **Human Inputs**: Involving human interaction to guide or adjust the task decomposition process, ensuring that the decomposition aligns with human understanding and requirements.

These approaches highlight the versatility and adaptability of LLMs in handling complex tasks by breaking them down into simpler, more digestible components, either autonomously or with human guidance.

### Idea 2: RAG Fusion

This is very similar to multi-query, the only difference is that there is an extra re-ranking step added for all the retrieved documents instead of simple union.

![RAG Fusion](rag_part_2_rag_fusion.png)

Docs: [Link](https://github.com/langchain-ai/langchain/blob/master/cookbook/rag_fusion.ipynb?ref=blog.langchain.dev)

In [9]:
from langchain.prompts import ChatPromptTemplate

# RAG-Fusion: Related
template = """You are a helpful assistant that generates multiple search queries based on a single input query. \n
Generate multiple search queries related to: {question} \n
Output (4 queries):"""
prompt_rag_fusion = ChatPromptTemplate.from_template(template)

In [10]:
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-turbo", temperature=0.0)

generate_queries = (
    prompt_rag_fusion 
    | llm
    | StrOutputParser() 
    | (lambda x: x.split("\n"))
)

In [11]:
question = "What are the approaches to Task Decomposition in context of LLMs?"
generate_queries.invoke({"question": question})

['1. "Overview of task decomposition strategies in large language models"',
 '2. "How do large language models handle task decomposition?"',
 '3. "Examples of task decomposition in AI language processing"',
 '4. "Effective task decomposition techniques for LLMs"']

In [18]:
from langchain.load import dumps, loads

def reciprocal_rank_fusion(results: list[list], k=60):
    """ Reciprocal_rank_fusion that takes multiple lists of ranked documents 
        and an optional parameter k used in the RRF formula """
    
    # Initialize a dictionary to hold fused scores for each unique document
    fused_scores = {}

    # Iterate through each list of ranked documents
    for docs in results:
        # Iterate through each document in the list, with its rank (position in the list)
        for rank, doc in enumerate(docs):
            # Convert the document to a string format to use as a key (assumes documents can be serialized to JSON)
            doc_str = dumps(doc)
            # If the document is not yet in the fused_scores dictionary, add it with an initial score of 0
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            # Retrieve the current score of the document, if any
            previous_score = fused_scores[doc_str]
            # Update the score of the document using the RRF formula: 1 / (rank + k)
            fused_scores[doc_str] += 1 / (rank + k)

    # Sort the documents based on their fused scores in descending order to get the final reranked results
    reranked_results = [
        loads(doc)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ]

    # Return the reranked results as a list of tuples, each containing the document and its fused score
    return reranked_results

In [15]:
def get_unique_union(documents: list[list]):
    """ Unique union of retrieved docs """
    # Flatten list of lists, and convert each Document to string
    flattened_docs = [dumps(doc) for sublist in documents for doc in sublist]
    # Get unique documents
    unique_docs = list(set(flattened_docs))
    # Return
    return [loads(doc) for doc in unique_docs]

In [19]:
retrieval_chain_rag_fusion = generate_queries | retriever.map() | reciprocal_rank_fusion
retrieval_chain_mq_union = generate_queries | retriever.map() | get_unique_union

In [20]:
docs_rag_fusion = retrieval_chain_rag_fusion.invoke({"question": question})
docs_mq_union = retrieval_chain_mq_union.invoke({"question": question})

In [22]:
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough

# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

In [23]:
final_rag_chain_rag_fusion = (
    {"context": retrieval_chain_rag_fusion, 
     "question": itemgetter("question")} 
    | prompt
    | llm
    | StrOutputParser()
)

final_rag_chain_mq_union = (
    {"context": retrieval_chain_mq_union, 
     "question": itemgetter("question")} 
    | prompt
    | llm
    | StrOutputParser()
)

In [24]:
display(Markdown(final_rag_chain_rag_fusion.invoke({"question": question})))

The approaches to task decomposition in the context of Large Language Models (LLMs) as described in the provided documents include:

1. **Chain of Thought (CoT)**: Introduced by Wei et al. in 2022, this technique involves prompting the model to "think step by step." This method allows the model to use more test-time computation to break down complex tasks into smaller, simpler steps. The CoT approach helps in transforming large tasks into multiple manageable tasks and provides insights into the model’s reasoning process.

2. **Tree of Thoughts**: An extension of the Chain of Thought, developed by Yao et al. in 2023. This method further explores multiple reasoning possibilities at each step of the task decomposition. It starts by breaking the problem into multiple thought steps and then generates multiple thoughts per step, creating a tree structure. The search process within this structure can be conducted using either breadth-first search (BFS) or depth-first search (DFS), with each state evaluated by a classifier (prompted) or by majority vote.

3. **Simple Prompting Techniques**: These involve using straightforward prompts to guide the LLM in decomposing tasks. Examples include prompts like "Steps for XYZ.\n1." or "What are the subgoals for achieving XYZ?" These prompts are designed to elicit a structured breakdown of tasks directly from the LLM.

4. **Task-Specific Instructions**: This approach uses instructions tailored to specific types of tasks to aid in decomposition. For example, instructing an LLM to "Write a story outline." for writing a novel helps in structuring the task into manageable components.

5. **Human Inputs**: Involving human participation in the task decomposition process. This can include humans providing initial inputs or corrections to the decomposed tasks suggested by the LLM, ensuring that the decomposition aligns with human understanding and requirements.

These approaches highlight the versatility and adaptability of LLMs in handling complex tasks by breaking them down into more manageable sub-tasks, thereby enhancing their performance and applicability in various domains.

In [25]:
display(Markdown(final_rag_chain_mq_union.invoke({"question": question})))

The approaches to task decomposition in the context of Large Language Models (LLMs) as described in the provided documents include:

1. **Chain of Thought (CoT)**: This technique, as mentioned by Wei et al. in 2022, involves instructing the model to "think step by step." This method allows the model to use more test-time computation to break down complex tasks into smaller, more manageable steps. The CoT approach transforms large tasks into multiple manageable tasks and provides insights into the model’s thinking process.

2. **Tree of Thoughts**: An extension of the Chain of Thought, developed by Yao et al. in 2023, this method explores multiple reasoning possibilities at each step of the task decomposition. It starts by breaking the problem into multiple thought steps and then generates multiple thoughts per step, creating a tree structure. The search process within this structure can be conducted using either breadth-first search (BFS) or depth-first search (DFS), with each state evaluated by a classifier (via a prompt) or by majority vote.

3. **Task-Specific Instructions**: This approach involves using specific instructions tailored to particular types of tasks. For example, instructing an LLM to "Write a story outline" when the task is to write a novel. This method relies on the ability of the LLM to understand and generate responses based on the specific instructions provided.

4. **Human Inputs**: Involving human inputs in the task decomposition process allows for a more guided and potentially accurate breakdown of tasks. This can be particularly useful in complex scenarios where human expertise can significantly enhance the model's performance and the accuracy of the task decomposition.

These approaches highlight the versatility and adaptability of LLMs in handling complex tasks by breaking them down into simpler, more manageable components, thereby enhancing their effectiveness and efficiency in various applications.

In [26]:
question

'What are the approaches to Task Decomposition in context of LLMs?'

### Idea 3: Decomposition

The idea is that we can break down a question into sub questions. Then we can answer each subquestion and obtain the final answer.

![Recursive Answering](rag_part_2_recursive_answer.png)

In this idea, we recursively answer each subquestion to obtain the final answer.

#### Resouces:
- [Paper 1](https://arxiv.org/pdf/2205.10625): Break questions into least to most prompt.
- [Paper 2](https://arxiv.org/pdf/2212.10509): Interleaves information retrieved with CoT

In [3]:
from langchain.prompts import ChatPromptTemplate

# Decomposition
template = """You are a helpful assistant that generates multiple sub-questions related to an input question. \n
The goal is to break down the input into a set of sub-problems / sub-questions that can be answers in isolation. \n
Generate multiple search queries related to: {question} \n
Output (3 queries):"""
prompt_decomposition = ChatPromptTemplate.from_template(template)

In [4]:
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

# LLM
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)

# Chain
generate_queries_decomposition = ( prompt_decomposition | llm | StrOutputParser() | (lambda x: x.split("\n")))

# Run
question = "What are the main components of an LLM-powered autonomous agent system?"
questions = generate_queries_decomposition.invoke({"question":question})

In [5]:
questions

['1. What are the core components of a language model in LLM-powered systems?',
 '2. How do autonomous agents integrate with large language models (LLMs)?',
 '3. What are the essential functionalities required for an autonomous agent system powered by LLMs?']

In [6]:
# Prompt
template = """Here is the question you need to answer:

\n --- \n {question} \n --- \n

Here is any available background question + answer pairs:

\n --- \n {q_a_pairs} \n --- \n

Here is additional context relevant to the question: 

\n --- \n {context} \n --- \n

Use the above context and any background question + answer pairs to answer the question: \n {question}
"""

decomposition_prompt = ChatPromptTemplate.from_template(template)

In [7]:
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser

def format_qa_pair(question, answer):
    """Format Q and A pair"""
    
    formatted_string = ""
    formatted_string += f"Question: {question}\nAnswer: {answer}\n\n"
    return formatted_string.strip()


In [8]:
q_a_pairs = ""
for q in questions:
    
    rag_chain = (
    {"context": itemgetter("question") | retriever, 
     "question": itemgetter("question"),
     "q_a_pairs": itemgetter("q_a_pairs")} 
    | decomposition_prompt
    | llm
    | StrOutputParser())

    answer = rag_chain.invoke({"question":q,"q_a_pairs":q_a_pairs})
    q_a_pair = format_qa_pair(q,answer)
    q_a_pairs = q_a_pairs + "\n---\n"+  q_a_pair

In [9]:
from IPython.display import display, Markdown

display(Markdown(answer))

An autonomous agent system powered by Large Language Models (LLMs) requires a set of essential functionalities to effectively perform tasks and solve problems autonomously. These functionalities are derived from the integration of LLMs as the core controller of the agent, enabling it to handle complex, multi-step tasks and adapt to new situations. Here are the essential functionalities required:

1. **Task Planning and Decomposition**: The system must be capable of understanding and planning tasks. This involves breaking down complex tasks into smaller, manageable subgoals. Techniques like Chain of Thought (CoT) and Tree of Thoughts (ToT) are used to enhance the model's ability to decompose tasks into simpler steps and explore multiple reasoning possibilities, respectively.

2. **Dynamic Learning and Memory**: The agent should incorporate mechanisms for dynamic learning and memory. This allows the agent to remember past interactions, learn from these experiences, and adapt its responses based on new data. Such capabilities are crucial for handling similar or evolving situations more effectively over time.

3. **Self-Reflection and Refinement**: The ability to engage in self-reflection and refine strategies based on past actions and outcomes is essential. This continuous learning loop helps the agent to improve its performance and decision-making capabilities over time.

4. **Natural Language Interface**: The system should utilize natural language as the interface between the LLM and other system components or external APIs. This facilitates seamless communication and interaction with various data sources and services, enhancing the agent's functionality and the scope of tasks it can perform.

5. **API Interactions**: Advanced implementations should enable the LLM to interact with external APIs to perform actions that extend beyond basic data processing or response generation. This can include making API calls for additional data retrieval, computation, or even controlling other software or hardware components as part of task execution.

6. **Handling Unexpected Situations**: The agent must be equipped to handle unexpected errors or changes in the environment. This involves mechanisms for the LLM to adjust plans dynamically and respond to new challenges effectively.

7. **Reliability and Error Handling**: Given the limitations such as finite context length and the occasional unreliability of natural language interfaces, the system must include robust error handling and output parsing mechanisms to ensure the reliability of model outputs and to manage formatting errors or other inconsistencies.

These functionalities collectively enable an LLM-powered autonomous agent system to perform a wide range of tasks, from simple data retrieval to complex, multi-step problem-solving scenarios, making them valuable across various applications.

#### Combining Answers:

Instead of recursively feeding previous question and answers of the past prompts, we can individually prompt on the sub questions followed by combining everything in one big prompt

![Answer Individually](rag_part_2_answer_individual.png)

In [10]:
from langchain.prompts import ChatPromptTemplate

# Decomposition
template = """You are a helpful assistant that generates multiple sub-questions related to an input question. \n
The goal is to break down the input into a set of sub-problems / sub-questions that can be answered in isolation. \n
Generate multiple search queries related to: {question} \n
Output (3 queries):"""
prompt_decomposition = ChatPromptTemplate.from_template(template)

In [15]:
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

# LLM
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)

# Chain
generate_queries_decomposition = ( prompt_decomposition | llm | StrOutputParser() | (lambda x: x.split("\n")))

# Run
question = "What are the main components of an LLM-powered autonomous agent system?"
sub_questions = generate_queries_decomposition.invoke({"question":question})

In [16]:
sub_questions

['1. What are the core components of a language model in LLM-powered systems?',
 '2. How do autonomous agents integrate with large language models for decision-making?',
 '3. What are the essential hardware and software requirements for running an LLM-powered autonomous agent system?']

In [13]:
from langchain import hub
prompt_hub_rag = hub.pull("rlm/rag-prompt")

In [14]:
prompt_hub_rag

ChatPromptTemplate(input_variables=['context', 'question'], metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))])

In [17]:
sub_answers = []

for sub_question in sub_questions:
    docs = retriever.get_relevant_documents(question)

    rag_chain = (
        prompt_hub_rag
        | llm
        | StrOutputParser()
    )

    sub_answer = rag_chain.invoke({"context": docs, "question": sub_question})
    sub_answers.append(sub_answer)

In [18]:
def format_qa_pairs(questions, answers):
    """Format Q and A pairs"""
    
    formatted_string = ""
    for i, (question, answer) in enumerate(zip(questions, answers), start=1):
        formatted_string += f"Question {i}: {question}\nAnswer {i}: {answer}\n\n"
    return formatted_string.strip()

context = format_qa_pairs(sub_questions, sub_answers)

In [19]:
# Prompt
template = """Here is a set of Q+A pairs:

{context}

Use these to synthesize an answer to the question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    prompt
    | llm
    | StrOutputParser()
)

In [20]:
display(Markdown(final_rag_chain.invoke({"context":context,"question":question})))

An LLM-powered autonomous agent system primarily consists of several key components that work together to enable sophisticated decision-making and task execution. These components can be categorized into three main areas: the language model, integration mechanisms, and hardware/software infrastructure.

1. **Language Model**: At the core of an LLM-powered autonomous agent system is a large language model (LLM) such as GPT or similar models. This model is responsible for understanding and generating human-like text, which is crucial for processing and responding to complex tasks. The language model facilitates several critical functions:
   - **Planning**: The model strategizes future actions by understanding complex tasks.
   - **Task Decomposition**: It breaks down large tasks into smaller, manageable subgoals, making the problem-solving process more structured.
   - **Self-Reflection**: The ability to evaluate past actions and outcomes to refine strategies and improve performance over time.

2. **Integration Mechanisms**: To enhance decision-making capabilities, autonomous agents integrate with LLMs using various techniques:
   - **Chain of Thought and Tree of Thoughts**: These techniques help in decomposing tasks and exploring multiple reasoning pathways, which are essential for detailed planning and effective decision-making.
   - **External Tools**: Some systems may incorporate classical planners or specific APIs that assist in long-horizon planning and interaction with the environment. These tools help translate complex problems into a planning language that the LLM can understand and act upon.

3. **Hardware and Software Infrastructure**:
   - **Hardware Requirements**: Running an LLM-powered system requires robust computing resources, typically high-performance GPUs or TPUs, to handle the computational demands of large language models.
   - **Software Requirements**: Besides the language model itself, the system needs additional software for task-specific functionalities. This includes frameworks and interfaces like PDDL for planning, and software that supports task decomposition and planning methodologies like Chain of Thought or Tree of Thoughts.

Together, these components enable an LLM-powered autonomous agent system to perform complex tasks autonomously by processing natural language inputs, making reasoned decisions, and interacting effectively with its environment.

In [21]:
display(Markdown(context))

Question 1: 1. What are the core components of a language model in LLM-powered systems?
Answer 1: The core components of a language model in LLM-powered systems include planning, task decomposition, and self-reflection. Planning involves understanding complex tasks and strategizing future actions. Task decomposition breaks down large tasks into smaller, manageable subgoals, while self-reflection allows the model to learn from past actions and improve future performance.

Question 2: 2. How do autonomous agents integrate with large language models for decision-making?
Answer 2: Autonomous agents integrate with large language models (LLMs) for decision-making by using LLMs as the core controller that processes complex tasks into manageable subgoals and plans. Techniques such as Chain of Thought and Tree of Thoughts enhance LLMs' ability to decompose tasks and explore multiple reasoning pathways, thereby aiding in detailed planning and decision-making. Additionally, some systems may employ external tools like classical planners for long-horizon planning, where the LLM translates problems into a planning language and back, integrating detailed task-specific actions and natural language processing.

Question 3: 3. What are the essential hardware and software requirements for running an LLM-powered autonomous agent system?
Answer 3: The essential hardware requirements for running an LLM-powered autonomous agent system include powerful computing resources capable of handling large language models, such as high-performance GPUs or TPUs. On the software side, the system requires a sophisticated language model like GPT or similar, along with additional software for task-specific functionalities such as planning and self-reflection, which may involve using external tools like classical planners or specific APIs for interaction with the environment. Additionally, software frameworks that support task decomposition, such as Chain of Thought or Tree of Thoughts, and interfaces like PDDL for planning, are crucial for the system's operation.

### Idea 4: Step Back

We can generate a more abstract question from the question given , then combine the retrieved documents of both the normal and step-backed question to get the final answer.

![Step Back](rag_part_2_step_back.png)

Paper: [Link](https://arxiv.org/pdf/2310.06117)

In [23]:
# Few Shot Examples
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
# Examples picked from paper
examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?",
    },
    {
        "input": "Jan Sindel’s was born in what country?",
        "output": "what is Jan Sindel’s personal history?",
    },
]
# We now transform these to example messages
example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)
prompt_general_question = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples:""",
        ),
        # Few shot examples
        few_shot_prompt,
        # New question
        ("user", "{question}"),
    ]
)


In [32]:
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser

# LLM
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Chain
generate_step_back_question = ( prompt_general_question | llm | StrOutputParser())

# Run
question = "What is task decomposition for LLM agents?"
step_back_question = generate_step_back_question.invoke({"question":question})
step_back_question

'What is the process of breaking down tasks for LLM agents?'

In [33]:
response_prompt_template = """You are an expert of world knowledge. I am going to ask you a question. Your response should be comprehensive and not contradicted with the following context if they are relevant. Otherwise, ignore them if they are not relevant.

# {normal_context}
# {step_back_context}

# Original Question: {question}
# Answer:"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

In [34]:
normal_context = retriever.get_relevant_documents(question)
step_back_context = retriever.get_relevant_documents(step_back_question)

In [35]:
response_chain = (response_prompt | llm | StrOutputParser())
display(Markdown(response_chain.invoke({"normal_context": normal_context,\
                                        "step_back_context": step_back_context,\
                                            "question": question})))

Task decomposition for LLM agents refers to the process of breaking down complex tasks into smaller, more manageable subgoals or steps. This approach allows LLM-powered autonomous agents to efficiently handle intricate tasks by dividing them into simpler components. Task decomposition is essential for planning and executing tasks effectively, as it enables the agent to focus on one subgoal at a time, leading to a more structured and organized problem-solving process.

In the context of LLM-powered autonomous agents, task decomposition can be achieved through various techniques such as Chain of Thought (CoT) and Tree of Thoughts. CoT involves prompting the model to "think step by step," encouraging it to decompose difficult tasks into smaller steps. This method enhances the model's performance on complex tasks by utilizing more test-time computation. On the other hand, Tree of Thoughts extends CoT by exploring multiple reasoning possibilities at each step, creating a tree structure of thought processes. This allows the agent to consider different paths and options during task decomposition.

Additionally, task decomposition can be facilitated by providing LLM with simple prompts or task-specific instructions. For example, prompts like "Steps for XYZ" or "What are the subgoals for achieving XYZ?" can guide the model in breaking down tasks into manageable steps. Task-specific instructions, such as "Write a story outline" for writing a novel, can also help in structuring the task decomposition process.

Overall, task decomposition plays a crucial role in enabling LLM agents to handle complex tasks efficiently by breaking them down into smaller, more manageable subgoals. This approach enhances the agent's problem-solving capabilities and contributes to the overall effectiveness of the autonomous agent system.

### Idea 5: HyDE

Question and document embeddings may be far from each other. So, ask the llm to generate a passage for the question, and do retrieval based on that passage.

![HyDE](rag_part_2_hyde.png)

Paper: [Link](https://arxiv.org/pdf/2212.10496)

In [36]:
from langchain.prompts import ChatPromptTemplate

# HyDE document genration
template = """Please write a scientific paper passage to answer the question
Question: {question}
Passage:"""
prompt_hyde = ChatPromptTemplate.from_template(template)

from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)

generate_docs_for_retrieval = (
    prompt_hyde | llm | StrOutputParser() 
)

# Run
question = "What is task decomposition for LLM agents?"
display(Markdown(generate_docs_for_retrieval.invoke({"question":question})))

**Title: Understanding Task Decomposition in Large Language Models (LLMs)**

**Abstract:**
Task decomposition is a critical strategy in the field of artificial intelligence, particularly in the operation of large language models (LLMs). This paper explores the concept of task decomposition, its importance in enhancing the performance of LLMs, and the methodologies employed to achieve effective decomposition.

**1. Introduction**
Large Language Models (LLMs) such as GPT (Generative Pre-trained Transformer) and BERT (Bidirectional Encoder Representations from Transformers) have shown remarkable capabilities in generating human-like text and understanding complex language patterns. However, the efficiency and effectiveness of these models can be significantly enhanced through task decomposition. This involves breaking down a complex task into simpler, manageable subtasks, which can be individually addressed before synthesizing the results to form a comprehensive solution.

**2. Definition and Importance of Task Decomposition**
Task decomposition refers to the process of dividing a complex problem into smaller, more manageable components or subtasks. This approach is rooted in the divide-and-conquer algorithm strategy, which simplifies the problem-solving process, reduces computational overhead, and enhances focus on specific aspects of a problem. In the context of LLMs, task decomposition allows the model to handle specific parts of a language task more efficiently, leading to improved accuracy and faster processing times.

**3. Methodologies for Task Decomposition in LLMs**
Implementing task decomposition in LLMs can be approached through several methodologies:

**3.1 Hierarchical Processing:**
This method involves structuring the LLM’s processing layers to handle different levels of language complexity. For instance, initial layers could focus on basic syntax and grammar, while deeper layers could handle more complex semantic analysis and context integration.

**3.2 Modular Architecture:**
Another approach is to design LLMs with modular architectures, where different modules are responsible for different aspects of language processing, such as named entity recognition, sentiment analysis, or syntactic parsing. These modules can operate independently or in a coordinated manner, depending on the task requirements.

**3.3 Dynamic Attention Mechanisms:**
Dynamic attention mechanisms allow LLMs to focus on different parts of the input data at different times, effectively decomposing the task based on the relevance of information. This adaptability is crucial for handling tasks with varying contexts and complexities.

**4. Case Studies and Applications**
Several studies have demonstrated the effectiveness of task decomposition in LLMs. For example, in machine translation, decomposing the task into syntactic parsing and semantic analysis has led to more accurate translations by separately addressing the structural and meaning components of language.

**5. Challenges and Future Directions**
While task decomposition presents numerous benefits, it also introduces challenges such as the integration of outputs from different subtasks and the optimization of modular architectures. Future research could explore automated methods for dynamic task decomposition based on the nature of the task and the specific requirements of the application.

**6. Conclusion**
Task decomposition stands as a pivotal technique in enhancing the functionality and efficiency of LLMs. By breaking down complex tasks into simpler subtasks, LLMs can achieve higher performance metrics and adapt more readily to diverse language processing requirements. Continued advancements in this area are essential for the evolution of more sophisticated and capable language models.

**References:**
[Relevant academic and industry sources discussing LLMs, task decomposition, and related AI methodologies would be listed here.]

This passage provides a comprehensive overview of task decomposition in the context of LLMs, highlighting its definition, methodologies, applications, and the challenges faced in its implementation.

In [37]:
retrieval_chain = generate_docs_for_retrieval | retriever
retrieval_chain.invoke({"question":question})

[Document(page_content='Fig. 1. Overview of a LLM-powered autonomous agent system.\nComponent One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via 

In [38]:
# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

from operator import itemgetter

final_rag_chain = (
    {"context": itemgetter("question") | retrieval_chain,\
     "question": itemgetter("question")} 
    | prompt
    | llm
    | StrOutputParser()
)

display(Markdown(final_rag_chain.invoke({"question":question})))

Task decomposition for LLM (large language model) agents involves breaking down complex tasks into smaller, more manageable subgoals. This process enables the agents to handle intricate tasks more efficiently by focusing on simpler, individual components of the overall task. The decomposition can be achieved through various methods:

1. **Chain of Thought (CoT)**: This technique prompts the model to think step-by-step, allowing it to use more test-time computation to break down difficult tasks into simpler steps. It transforms large tasks into multiple manageable tasks and provides insights into the model’s reasoning process.

2. **Tree of Thoughts**: An extension of CoT, this method explores multiple reasoning possibilities at each step of the decomposition. It generates a tree structure by creating multiple thoughts per step, which can be navigated using algorithms like breadth-first search (BFS) or depth-first search (DFS). Each state in the tree is evaluated either by a classifier or through a majority vote.

3. **Simple Prompting and Task-Specific Instructions**: LLMs can also perform task decomposition by responding to simple prompts like "Steps for XYZ" or task-specific instructions such as "Write a story outline" for writing a novel.

4. **Human Inputs**: Involving human inputs can further aid in the decomposition process, ensuring that the tasks are broken down effectively according to human understanding and requirements.

Overall, task decomposition is a critical function in LLM-powered autonomous agents, allowing them to plan and execute complex tasks by addressing smaller, sequential components that contribute to the achievement of the main goal.

#### HyDE as per documentation

Docs: [Link](https://github.com/langchain-ai/langchain/blob/master/cookbook/hypothetical_document_embeddings.ipynb)

In [55]:
from langchain.chains.hyde.base import HypotheticalDocumentEmbedder
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.output_parsers import StrOutputParser


In [61]:
# Load with `web_search` prompt
base_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0.0)
hyde_chain = llm | StrOutputParser()
hyde_embeddings = HypotheticalDocumentEmbedder.from_llm(llm=hyde_chain,\
                                                    base_embeddings=base_embeddings,\
                                                          prompt_key='sci_fact')

In [60]:
question = "What is task decomposition for LLM agents?"
display(Markdown(hyde_embeddings.invoke(question)['text']))

**Title: Understanding Task Decomposition in Large Language Models (LLMs)**

**Abstract:**
Task decomposition is a critical concept in the field of artificial intelligence, particularly in the functioning of large language models (LLMs). This paper explores the role of task decomposition in enhancing the performance and applicability of LLMs in complex problem-solving scenarios. We provide a detailed analysis of how task decomposition is implemented in LLMs and discuss its implications for future developments in AI.

**1. Introduction**
Large Language Models (LLMs), such as GPT (Generative Pre-trained Transformer) and BERT (Bidirectional Encoder Representations from Transformers), have shown remarkable capabilities in generating human-like text and understanding context within natural language processing tasks. An essential aspect of their application involves task decomposition, which refers to the process of breaking down a complex task into simpler, manageable subtasks. This decomposition is pivotal for improving the efficiency and effectiveness of LLMs in handling diverse and complex tasks.

**2. Task Decomposition in LLMs**
Task decomposition in LLMs can be understood through the lens of both model architecture and the operational strategies employed during problem-solving. Architecturally, LLMs are designed with multiple layers of transformers, each contributing to a different aspect of language understanding and generation. Operationally, when presented with a complex task, an LLM decomposes it by identifying key components and subtasks. This process involves parsing the task into smaller segments that can be individually addressed, and subsequently integrating the solutions to form a coherent output.

**3. Methodology**
To investigate the effectiveness of task decomposition in LLMs, we conducted a series of experiments where LLMs were tasked with complex problem-solving scenarios requiring multiple steps of reasoning. The performance of the models was evaluated based on accuracy, efficiency in task resolution, and the ability to generalize across similar tasks.

**4. Results**
The results indicate that LLMs employing task decomposition strategies outperform those that do not. Models that decomposed tasks into logical subunits were able to more accurately generate responses and showed improved performance on tasks involving multiple steps or requiring integration of various information types.

**5. Discussion**
The ability of LLMs to decompose tasks significantly contributes to their versatility and utility in real-world applications. Decomposition allows LLMs to manage complexity by addressing components of a task sequentially or in parallel, enhancing their problem-solving capabilities. Furthermore, task decomposition aligns with cognitive strategies employed by humans, such as breaking down a problem into smaller, more manageable parts, suggesting a pathway towards more intuitive human-machine interaction.

**6. Conclusion**
Task decomposition is a fundamental aspect of the functionality of LLMs, crucial for their performance in complex scenarios. By breaking down tasks into simpler components, LLMs can effectively manage and process large amounts of information, leading to better problem-solving capabilities and broader applicability in various fields. Future research should focus on optimizing decomposition algorithms and exploring their applications in more diverse contexts.

**References**
[Appropriate academic references supporting the claims and methodologies used in this paper]

---

This passage supports the claim by explaining what task decomposition is in the context of LLMs and detailing how it is crucial for enhancing the models' ability to handle complex tasks efficiently.

In [62]:
# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

from operator import itemgetter

final_rag_chain = (
    {"context": itemgetter("question") | hyde_embeddings | (lambda x: x['text']) | retriever,\
     "question": itemgetter("question")} 
    | prompt
    | llm
    | StrOutputParser()
)

display(Markdown(final_rag_chain.invoke({"question":question})))

Task decomposition for LLM (large language model) agents involves breaking down complex tasks into smaller, more manageable subgoals. This process enables the agent to handle intricate tasks more efficiently by focusing on simpler components one at a time. The decomposition can be achieved through various methods:

1. **Simple Prompting**: Using straightforward prompts to guide the LLM, such as "Steps for XYZ.\n1." or "What are the subgoals for achieving XYZ?" This method leverages the LLM's ability to generate structured responses based on the input prompts.

2. **Chain of Thought (CoT)**: Introduced by Wei et al. in 2022, CoT is a prompting technique where the model is instructed to "think step by step." This approach allows the LLM to use more computation at test time to break down difficult tasks into simpler steps, providing a clearer interpretation of the model’s reasoning process.

3. **Tree of Thoughts (ToT)**: Developed by Yao et al. in 2023, ToT extends CoT by exploring multiple reasoning possibilities at each decomposition step. It creates a tree structure of thoughts, where each node can be explored using algorithms like breadth-first search (BFS) or depth-first search (DFS). Each state or node in this tree can be evaluated through methods such as classifier prompts or majority votes.

4. **Task-Specific Instructions**: Tailoring the instructions to fit specific tasks, such as "Write a story outline." for novel writing. This method uses the LLM's capabilities to generate content that aligns with specific goals or formats.

5. **Human Inputs**: Incorporating inputs from human users to guide or refine the task decomposition process, ensuring that the breakdown aligns with human understanding and objectives.

Overall, task decomposition is a critical component in the functionality of LLM-powered autonomous agents, allowing them to process and execute complex tasks by addressing smaller, sequential components effectively.