# RAG part 2

## Query transformation

The idea is to rewrite the user's query into a form that makes retrieval easier.

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
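# load_dotenv() already copies .env values into os.environ; fail early if a key is missing
for key in ("LANGCHAIN_API_KEY", "OPENAI_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to your .env file")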

### Idea 1: Multi query

A single query can be reframed as multiple queries, so that we can query the vector DB once per variant and take the union of the documents fetched each time.
![rag-multi-query](rag_part_2_multi_query.png)
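
To make the mechanics concrete before bringing in an LLM, here is a minimal toy sketch of the idea. Everything in it is an assumption made for illustration: `rewrite_query` stands in for the LLM with hand-written rephrasings, `keyword_retrieve` stands in for the vector store using plain word overlap, and `CORPUS` is made-up data.

```python
# Toy illustration of multi-query retrieval; no LLM or vector DB involved.
# `rewrite_query`, `keyword_retrieve` and `CORPUS` are hypothetical stand-ins.

CORPUS = {
    "d1": "Task decomposition breaks a complex task into smaller steps.",
    "d2": "Chain of thought prompting asks the model to think step by step.",
    "d3": "Agents can split a goal into subgoals before acting.",
    "d4": "Vector stores index embeddings for similarity search.",
}

def rewrite_query(question: str) -> list[str]:
    """Stand-in for the LLM: return hand-written rephrasings of the question."""
    return [
        question,
        "How do agents split a goal into subgoals?",
        "How does a model think step by step?",
    ]

def keyword_retrieve(query: str, k: int = 2) -> list[str]:
    """Stand-in for the vector store: rank documents by shared lowercase words."""
    query_words = set(query.lower().replace("?", "").split())
    scored = []
    for doc_id, text in CORPUS.items():
        doc_words = set(text.lower().replace(".", "").split())
        overlap = len(query_words & doc_words)
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for determinism
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]

def multi_query_retrieve(question: str) -> list[str]:
    """Retrieve once per rewritten query and return the order-preserving union."""
    seen: dict[str, None] = {}
    for query in rewrite_query(question):
        for doc_id in keyword_retrieve(query):
            seen.setdefault(doc_id, None)
    return list(seen)

single = keyword_retrieve("What is task decomposition?")
multi = multi_query_retrieve("What is task decomposition?")
print(single, multi)
```

In this toy setup the original question alone only matches `d1`, while the rephrasings also pull in `d3` and `d2`, which is the effect multi query is after.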

In [2]:
#### INDEXING ####

# Load blog
import bs4
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
blog_docs = loader.load()

# Split
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300, 
    chunk_overlap=50)

# Make splits
splits = text_splitter.split_documents(blog_docs)

# Index
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
vectorstore = Chroma.from_documents(documents=splits, 
                                    embedding=OpenAIEmbeddings())

retriever = vectorstore.as_retriever()


### Using prompts

In [3]:
from langchain.prompts import ChatPromptTemplate

# Multi Query: Different Perspectives
template = """You are an AI language model assistant. Your task is to generate five 
different versions of the given user question to retrieve relevant documents from a vector 
database. By generating multiple perspectives on the user question, your goal is to help
the user overcome some of the limitations of the distance-based similarity search. 
Provide these alternative questions separated by newlines. Original question: {question}"""
prompt_perspectives = ChatPromptTemplate.from_template(template)

from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-turbo", temperature=0.0)

generate_queries = (
    prompt_perspectives 
    | llm
    | StrOutputParser() 
    | (lambda x: x.split("\n"))
)

In [5]:
from IPython.display import display, Markdown

generate_queries.invoke({"question": "What is task decomposition for LLM agents?"})

['How do LLM agents break down complex tasks into simpler components?',
 'What does task decomposition entail in the context of large language models?',
 'Can you explain the process of task decomposition used by large language model agents?',
 'What are the methods of task decomposition applied in large language models?',
 'How do large language models manage task decomposition to handle complex problems?']
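
The output above happens to be clean, but a plain split on newlines keeps whatever the model emits, including blank lines or list numbering. A small parser along the following lines could guard against that. This is a sketch, and `parse_queries` is a hypothetical helper name, not part of LangChain.

```python
import re

def parse_queries(raw: str) -> list[str]:
    """Split raw LLM output into clean, non-empty query strings."""
    queries = []
    for line in raw.splitlines():
        # Drop leading list markers such as "1.", "2)", "-" or "*"
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", line).strip()
        if cleaned:
            queries.append(cleaned)
    return queries

raw_output = (
    "1. How do agents split tasks?\n"
    "\n"
    "2) What is task decomposition?\n"
    "- Why decompose tasks?\n"
)
print(parse_queries(raw_output))
```

Since a plain function on the right of `|` is coerced into a runnable, `parse_queries` should be usable in place of the `lambda` at the end of `generate_queries`.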

In [6]:
from langchain.load import dumps, loads

def get_unique_union(documents: list[list]):
    """ Unique union of retrieved docs """
    # Flatten list of lists, and convert each Document to string
    flattened_docs = [dumps(doc) for sublist in documents for doc in sublist]
    # Get unique documents
    unique_docs = list(set(flattened_docs))
    # Return
    return [loads(doc) for doc in unique_docs]

# Retrieve
question = "What is task decomposition for LLM agents?"
retrieval_chain = generate_queries | retriever.map() | get_unique_union
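
One thing to keep in mind about `get_unique_union`: because it goes through a `set`, the order of the returned documents is not guaranteed and may differ between runs. If the order matters (for example, to keep the hits of the first query at the top), an order-preserving variant could look like the sketch below. `Doc` is a hypothetical stand-in for a LangChain `Document` so that the example runs on its own, and using `(page_content, source)` as the identity key is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    source: str = ""

def unique_union_ordered(documents: list[list[Doc]]) -> list[Doc]:
    """Flatten per-query result lists, keeping the first occurrence of each doc."""
    seen: dict[tuple[str, str], Doc] = {}
    for sublist in documents:
        for doc in sublist:
            # Assumption: content plus source identifies a document
            key = (doc.page_content, doc.source)
            seen.setdefault(key, doc)
    return list(seen.values())

a, b, c = Doc("alpha"), Doc("beta"), Doc("gamma")
merged = unique_union_ordered([[a, b], [b, c], [a]])
print([doc.page_content for doc in merged])
```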

In [8]:
docs = retrieval_chain.invoke({"question": question})
len(docs)

8

In [9]:
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough

# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

# llm = ChatOpenAI(temperature=0)

final_rag_chain = (
    {"context": retrieval_chain, 
     "question": itemgetter("question")} 
    | prompt
    | llm
    | StrOutputParser()
)
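
Note that the chain above hands the list of retrieved documents straight to the prompt, where, as far as I can tell, it is rendered as the Python representation of the list (metadata included). A common alternative is to join only the text of each document first. The sketch below uses a hypothetical `Doc` stand-in so it runs without LangChain, and `format_docs` is an illustrative helper name.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str

def format_docs(docs: list[Doc]) -> str:
    """Join the text of each document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)

context = format_docs([Doc("First chunk."), Doc("Second chunk.")])
print(context)
```

With real documents, this would plug in as `"context": retrieval_chain | format_docs`, which keeps the prompt shorter and free of metadata noise.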

In [11]:
display(Markdown(final_rag_chain.invoke({"question":question})))


Task decomposition for LLM (Large Language Model) agents refers to the process of breaking down a complex task into smaller, more manageable subtasks. This technique is crucial for enhancing the performance of LLMs on intricate tasks that require multiple steps or stages. The decomposition not only simplifies the problem but also provides a structured approach that the LLM can follow, making it easier to generate coherent and contextually appropriate responses.

In the context of LLM-powered autonomous agents, task decomposition can be implemented in several ways:

1. **Chain of Thought (CoT)**: This is a prompting technique where the model is instructed to "think step by step." This method involves the LLM using more test-time computation to logically break down a task into simpler steps. The CoT approach helps the model to process and articulate each step, thereby improving its ability to handle complex tasks by focusing on one component at a time.

2. **Tree of Thoughts**: An extension of the CoT, this technique explores multiple reasoning possibilities at each decomposition step. It creates a tree structure where each node represents a thought or a sub-task, and branches represent different approaches or solutions to the sub-task. This method can utilize search techniques like breadth-first search (BFS) or depth-first search (DFS), with each state evaluated by a classifier or a majority vote to determine the most promising path forward.

3. **Task-Specific Instructions**: LLMs can also be guided through task decomposition by using specific instructions tailored to the task at hand. For example, if the task is to write a novel, the instruction might be "Write a story outline," which prompts the LLM to first outline the plot before delving into detailed writing.

4. **Human Inputs**: Involving human inputs for task decomposition allows for a hybrid approach where humans and LLMs collaborate. Humans can provide initial guidance on breaking down the task, and the LLM can then proceed with detailed execution based on these guidelines.

Overall, task decomposition is a fundamental strategy in the design of LLM-powered autonomous agents, enabling them to tackle complex tasks more effectively by reducing cognitive load and providing clear execution paths.

In [12]:
display(Markdown(final_rag_chain.invoke({"question":"Describe some key sections of the article in a pointwise manner."})))


Based on the provided documents, here are some key sections of the article described in a pointwise manner:

1. **ReAct Prompt Template**:
   - Introduces a structured format for Language Model (LLM) operations, which includes steps like Thought, Action, and Observation, repeated multiple times to guide the LLM's processing and response generation.

2. **Challenges in Building LLM-Centered Agents**:
   - Discusses common limitations encountered when developing agents centered around large language models, highlighting practical challenges in implementation and operation.

3. **Detailed Coding Instructions**:
   - Provides comprehensive guidelines for coding, including the layout of core classes, functions, and methods with comments on their purposes.
   - Specifies the format for outputting code in markdown blocks, ensuring that filenames and languages are appropriately formatted and that the code is fully functional and compatible across different files.

4. **Performance Evaluation**:
   - Outlines methods for continuous review and analysis of actions to ensure optimal performance.
   - Emphasizes the importance of self-criticism, reflection on past decisions, and efficient command execution to minimize resource usage.

5. **Resources for Agent Operation**:
   - Lists resources available to the agent, such as internet access, long-term memory management, and GPT-3.5 powered agents for task delegation.

6. **Experiments and Observations**:
   - Mentions specific experiments like those conducted in AlfWorld Env and HotpotQA, noting particular issues like hallucination and inefficient planning.

These sections collectively provide insights into the structure, challenges, and operational guidelines for building and managing LLM-centered agents, as well as evaluating their performance in practical scenarios.

In [14]:
display(Markdown(final_rag_chain.invoke({"question":"Describe some of the studies mentioned in this article done in the area of 'Generative Agents Simulation' . "})))


The article mentions a study titled "Generative Agents" by Park et al. (2023), which is a significant experiment in the area of Generative Agents Simulation. In this study, 25 virtual characters, each controlled by a large language model (LLM)-powered agent, interact within a sandbox environment inspired by The Sims. This simulation aims to create believable simulacra of human behavior for interactive applications.

The design of these generative agents integrates LLMs with memory, planning, and reflection mechanisms. This allows the agents to behave based on past experiences and interact with other agents. A key feature of this system is the memory stream, which is an external database functioning as a long-term memory module. It records a comprehensive list of the agents' experiences in natural language, capturing each observation or event directly provided by the agent. This memory stream enables the agents to recall past interactions and use this information to inform future behaviors and decisions.

Inter-agent communication within this simulation can trigger new natural language statements, enhancing the dynamic and interactive nature of the environment. This study showcases the potential of LLMs to simulate complex social interactions and behaviors in a controlled virtual setting.

### Using MultiQueryRetriever

Source: [MultiQueryRetriever documentation](https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever/)

In [16]:
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

question = "What are the approaches to Task Decomposition?"
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
retriever_from_llm = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(), llm=llm
)

In [17]:
# Set logging for the queries
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

In [20]:
question = "What are the approaches to Task Decomposition in context of LLMs?"
unique_docs = retriever_from_llm.invoke(question)
len(unique_docs)

INFO:langchain.retrievers.multi_query:Generated queries: ['What methods are used for breaking down tasks in large language models?', '', 'How do large language models handle task decomposition?', '', 'Can you describe the strategies for task decomposition in large language models?']


10

In [21]:
# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

# llm = ChatOpenAI(temperature=0)

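final_rag_chain = (
    # The retriever expects a query string, so extract it from the input dict first
    {"context": itemgetter("question") | retriever_from_llm, 
     "question": itemgetter("question")} 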
    | prompt
    | llm
    | StrOutputParser()
)

In [23]:
display(Markdown(final_rag_chain.invoke({"question": question})))

INFO:langchain.retrievers.multi_query:Generated queries: ['What methods are used for breaking down tasks in large language models?', '', 'How do large language models handle task decomposition?', '', 'Can you describe the strategies for task decomposition in large language models?']


The approaches to task decomposition in the context of Large Language Models (LLMs) as described in the provided documents include:

1. **Chain of Thought (CoT)**: Introduced by Wei et al. in 2022, this technique involves instructing the model to "think step by step." This method allows the LLM to use more test-time computation to break down complex tasks into smaller, more manageable steps. The CoT approach helps in transforming large tasks into multiple manageable tasks and provides insights into the model’s reasoning process.

2. **Tree of Thoughts**: An extension of the CoT approach, developed by Yao et al. in 2023, which explores multiple reasoning possibilities at each step. It starts by decomposing the problem into multiple thought steps and then generates multiple thoughts per step, creating a tree structure. The search process within this structure can be conducted using either breadth-first search (BFS) or depth-first search (DFS), with each state evaluated by a classifier (via a prompt) or by majority vote.

3. **Simple Prompting Techniques**: These involve using straightforward prompts to guide the LLM in decomposing tasks. Examples include prompts like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", or task-specific instructions such as "Write a story outline." for writing a novel.

4. **Human Inputs**: Involving human interaction to guide or adjust the task decomposition process, ensuring that the decomposition aligns with human understanding and requirements.

These approaches highlight the versatility and adaptability of LLMs in handling complex tasks by breaking them down into simpler, more digestible components, either autonomously or with human guidance.