## LangChain - Query translation

### Multi-Query

In [1]:
import bs4
from langchain import hub
from langchain.load import dumps, loads
from langchain.prompts import ChatPromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
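from dotenv import load_dotenv

# Load environment variables (such as the Azure OpenAI credentials) from a local .env file, if one exists
load_dotenv()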


In [2]:
## INDEXING ##

# Load blog
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
blog_docs = loader.load()

# Split
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300,
    chunk_overlap=50
)
# Make splits 
splits = text_splitter.split_documents(blog_docs)

# embedding
embedding = AzureOpenAIEmbeddings(
    model="text-embedding-3-large"
)

# Index
vector_store = Chroma.from_documents(documents=splits, embedding=embedding)
retriever = vector_store.as_retriever()

In [3]:
llm = AzureChatOpenAI(
    model="gpt-4o-mini",
    api_version="2024-12-01-preview"
)

In [4]:
## Prompt ##

# Multi Query: Different Perspectives
template = """
    You are an AI language model assistant. Your task is to generate five different
    versions of the given user question to retrieve relevant documents from a vector
    database. By generating multiple perspectives on the user question, your goal is
    to help the user overcome some of the limitations of the distance-based similarity
    search. Provide these alternative questions separated by newlines.
    Original question: {question}
"""
prompt_perspective = ChatPromptTemplate.from_template(template)

generate_queries = (
    prompt_perspective
    | llm
    | StrOutputParser()
    # Split into one query per line and drop blank lines
    | (lambda x: [q.strip() for q in x.split("\n") if q.strip()])
)

In [5]:
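def get_unique_union(documents: list[list]):
    """Return the unique union of the documents retrieved for each query."""
    # Flatten the list of lists and serialize each Document to a string so it is hashable
    flatten_docs = [dumps(doc) for sublist in documents for doc in sublist]
    # Deduplicate; note that set() does not preserve the retrieval order
    unique_docs = list(set(flatten_docs))
    # Deserialize the strings back into Document objects
    return [loads(doc) for doc in unique_docs]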

# Retrieve
question = "What is task decomposition for LLM agents?"
retrieval_chain = generate_queries | retriever.map() | get_unique_union
docs = retrieval_chain.invoke({"question": question})

In [6]:
len(docs)

7
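
`get_unique_union` deduplicates through `set`, so the order of the returned documents is not guaranteed. Below is a minimal sketch of an order-preserving variant. Plain strings stand in for the serialized documents, and the helper name `unique_union_ordered` is hypothetical (it is not part of LangChain).

```python
def unique_union_ordered(result_lists):
    """Flatten a list of lists and drop duplicates, keeping first-seen order."""
    flat = [item for sublist in result_lists for item in sublist]
    # dict keys keep insertion order, so fromkeys() behaves like an ordered set
    return list(dict.fromkeys(flat))


# Plain strings stand in for the dumps(doc) strings used in the notebook
retrieved = [["doc_a", "doc_b"], ["doc_b", "doc_c"], ["doc_a", "doc_d"]]
print(unique_union_ordered(retrieved))
```

Keeping first-seen order can matter when the context is later truncated, since documents retrieved for the earliest queries then stay at the front.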

In [7]:
from operator import itemgetter

template = """
    Answer the following question based on this context:

    {context}
    Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    {
        "context": retrieval_chain,
        "question": itemgetter("question")
    }
    | prompt
    | llm 
    | StrOutputParser()
)

final_rag_chain.invoke({"question": question})

"Task decomposition for LLM (large language model) agents involves breaking down complex tasks into smaller, more manageable subgoals to enhance efficiency in handling intricate problems. This process can be facilitated by techniques such as Chain of Thought (CoT) prompting, which encourages the model to “think step by step.” CoT helps in transforming a large task into a series of simpler steps, thereby illuminating the model's thinking process.\n\nAdditionally, the Tree of Thoughts approach extends CoT by allowing the exploration of multiple reasoning possibilities at each step. It involves decomposing the problem into various thought steps and generating multiple thoughts per step, forming a tree structure that can be traversed using search methods like breadth-first search (BFS) or depth-first search (DFS).\n\nTask decomposition can be achieved through various means, including:\n1. Simple prompting, such as asking for the steps needed to complete a task or identifying subgoals.\n2. 

### RAG Fusion

In [8]:
llm = AzureChatOpenAI(
    model="gpt-4o-mini",
    api_version="2024-12-01-preview",
    temperature=0
)

In [9]:
# RAG-Fusion: Related
fusion_template = """
    You are a helpful assistant that generates multiple search queries based
    on a single input query. \n Generate multiple search queries related to: \n
    {question}
    Output (4 queries):
"""
prompt_rag_fusion = ChatPromptTemplate.from_template(fusion_template)

In [10]:
generate_queries_fusion = (
    prompt_rag_fusion
    | llm
    | StrOutputParser()
    # Split into one query per line and drop blank lines
    | (lambda x: [q.strip() for q in x.split("\n") if q.strip()])
)

In [11]:
def reciprocal_rank_fusion(results: list[list], k=60):
    """
    Reciprocal rank fusion (RRF): takes multiple lists of ranked documents
    and an optional parameter k used in the RRF formula.
    """
    # Fused score for each unique document, keyed by its serialized form
    fused_scores = {}

    # Iterate through each list of ranked documents
    for docs in results:
        # enumerate() yields a 0-based rank (position in the list)
        for rank, doc in enumerate(docs):
            # Serialize the document so it can be used as a dictionary key
            doc_str = dumps(doc)
            # First time this document is seen: start from a score of 0
            if doc_str not in fused_scores:
                fused_scores[doc_str] = 0
            # Add this list's contribution using the RRF formula 1 / (rank + k)
            fused_scores[doc_str] += 1 / (rank + k)

    # Sort the documents based on their fused scores in descending order to get the final ranked results
    reranked_results = [
        (loads(doc), score)
        for doc, score in sorted(fused_scores.items(), key=lambda x: x[1], reverse=True)
    ]

    # Return the reranked results as a list of tuples, each containing the document and its fused score
    return reranked_results

retrieval_chain_rag_fusion = generate_queries_fusion | retriever.map() | reciprocal_rank_fusion
docs = retrieval_chain_rag_fusion.invoke({"question": question})
len(docs)

7
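
To make the scoring concrete, here is a small self-contained sketch of the same formula applied to plain string IDs instead of `Document` objects. The function name `rrf_scores` and the toy rankings are assumptions made for illustration. As in the cell above, ranks are 0-based.

```python
def rrf_scores(rankings, k=60):
    """Fuse several ranked lists of IDs with the 1 / (rank + k) formula (0-based rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1 / (rank + k)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Three toy result lists, one per generated query
rankings = [["a", "b", "c"], ["b", "c", "d"], ["b", "a"]]
fused = rrf_scores(rankings)
for doc_id, score in fused:
    print(doc_id, round(score, 4))
```

Here `b` ranks first because it appears in all three lists, twice at the top. With a large `k` such as 60, appearing in many lists tends to matter more than the exact position within any single list.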

In [12]:
template = """
    Answer the following question based on this context:

    {context}
    Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    {
        "context": retrieval_chain_rag_fusion,
        "question": itemgetter("question") 
    }
    | prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"question": question})

'Task decomposition for LLM (large language model) agents involves breaking down complex tasks into smaller, manageable subgoals. This process enables the agent to handle intricate tasks more efficiently. There are several methods for task decomposition:\n\n1. **Chain of Thought (CoT)**: This technique encourages the model to "think step by step," allowing it to decompose difficult tasks into simpler steps, thereby enhancing its performance.\n\n2. **Tree of Thoughts**: This approach extends CoT by exploring multiple reasoning possibilities at each step, creating a tree structure of thoughts. It decomposes the problem into multiple thought steps and generates various thoughts per step, which can be evaluated through methods like breadth-first search (BFS) or depth-first search (DFS).\n\n3. **Prompting Techniques**: Task decomposition can be achieved through simple prompts, such as asking for the steps to achieve a goal or identifying subgoals.\n\n4. **Task-Specific Instructions**: Provi

### Decomposition

In [13]:
template_decomposition = """
    You are a helpful assistant that generates multiple sub-questions related
    to an input question. \n The goal is to break down the input into a set of sub-problems /
    sub-questions that can be answered in isolation. \n Generate multiple search queries
    related to: {question}.
    Output (3 queries): 
"""
prompt_decomposition = ChatPromptTemplate.from_template(template_decomposition)

In [14]:
generate_query_decomposition = (
    prompt_decomposition
    | llm
    | StrOutputParser()
    # Split into one sub-question per line and drop blank lines
    | (lambda x: [q.strip() for q in x.split("\n") if q.strip()])
)

In [15]:
question = "What are the main components of an LLM-powered agent system?"
questions = generate_query_decomposition.invoke({"question": question})

In [16]:
questions

['1. What are the key components of a large language model (LLM) architecture?',
 '2. How do LLMs integrate with other systems in an agent-based architecture?',
 '3. What role does data preprocessing play in the performance of an LLM-powered agent system?']
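
The generated sub-questions keep the model's list numbering (`1.`, `2.`, ...), which is then sent to the retriever as part of each query. A minimal sketch of stripping that prefix before retrieval follows. The helper `strip_numbering` is hypothetical, and the regular expression only handles the `1.` / `1)` styles.

```python
import re


def strip_numbering(lines):
    """Remove a leading '1.' or '1)' style prefix and drop empty lines."""
    cleaned = []
    for line in lines:
        text = re.sub(r"^\s*\d+[.)]\s*", "", line).strip()
        if text:
            cleaned.append(text)
    return cleaned


# The sub-questions printed above, plus a blank line as the model might emit one
raw_questions = [
    "1. What are the key components of a large language model (LLM) architecture?",
    "",
    "2. How do LLMs integrate with other systems in an agent-based architecture?",
    "3. What role does data preprocessing play in the performance of an LLM-powered agent system?",
]
for q in strip_numbering(raw_questions):
    print(q)
```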

In [17]:
template = """
    Here is the question you need to answer:
    \n --- \n {question} \n --- \n

    Here is any available background question + answer pairs:
    \n --- \n {q_a_pairs} \n --- \n

    Here is the additional context relevant to the question:
    \n --- \n {context} \n --- \n

    Use the above context and any background question + answer pairs to answer
    the question: \n {question}.
"""
decomposition_prompt = ChatPromptTemplate.from_template(template)

In [18]:
def format_qa_pair(question, answer):
    """Format a question and its answer as a single text block."""
    return f"Question: {question}\nAnswer: {answer}"

# Build the chain once; every sub-question reuses it
rag_chain_decomposition = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
        "q_a_pairs": itemgetter("q_a_pairs")
    }
    | decomposition_prompt
    | llm
    | StrOutputParser()
)

# Answer the sub-questions in order, passing the earlier Q+A pairs to each later prompt
q_a_pairs = ""
for q in questions:
    answer = rag_chain_decomposition.invoke({"question": q, "q_a_pairs": q_a_pairs})
    q_a_pair = format_qa_pair(q, answer)
    q_a_pairs = q_a_pairs + "\n --- \n" + q_a_pair

In [19]:
answer

"Data preprocessing plays a crucial role in the performance of an LLM-powered agent system by ensuring that the input data is clean, relevant, and structured in a way that maximizes the model's ability to understand and generate appropriate responses. Here are several key aspects of how data preprocessing impacts the performance of such systems:\n\n1. **Quality of Input Data**: Preprocessing helps filter out noise and irrelevant information from the input data. High-quality, well-structured data allows the LLM to learn more effectively from the training phase and perform better during inference. This is particularly important in agent systems where the LLM needs to interact with various external components and APIs.\n\n2. **Normalization and Standardization**: Preprocessing often involves normalizing and standardizing data formats, which helps the LLM interpret inputs consistently. This is essential for maintaining reliability in the natural language interface, as inconsistencies in da

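The loop answers the sub-questions one after another, and each prompt receives the question and answer pairs accumulated so far. After the loop, `answer` therefore holds the response to the last sub-question, not to the original question. The sketch below reproduces only the accumulation logic. The `fake_answer` function is a hypothetical stand-in for the RAG chain, and the sub-questions are made up.

```python
def fake_answer(question, q_a_pairs):
    """Stand-in for the RAG chain: reports how many earlier pairs it was given."""
    seen = q_a_pairs.count("Question:")
    return f"(answer using {seen} earlier Q+A pair(s))"


sub_questions = ["What is an agent?", "What is planning?", "What is memory?"]

q_a_pairs = ""
answers = []
for q in sub_questions:
    answer = fake_answer(q, q_a_pairs)
    answers.append(answer)
    # Same separator and layout as in the notebook cell
    q_a_pairs += "\n --- \n" + f"Question: {q}\nAnswer: {answer}"

for q, a in zip(sub_questions, answers):
    print(q, "->", a)
```

Each later sub-question sees one more background pair than the previous one, which is what lets later answers build on earlier ones.
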
### Step-back

In [21]:
from langchain_core.prompts import FewShotChatMessagePromptTemplate

In [22]:
# Few-shot examples
examples = [
    {
        "input": "Could the members of The Police perform lawful arrests?",
        "output": "what can the members of The Police do?"
    },
    {
        "input": "Jan Sindel was born in what country?",
        "output": "what is Jan Sindel's personal history?"
    },
]

examples_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=examples_prompt,
    examples=examples
)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are an expert at world knowledge. Your task is to step back and paraphrase a question to a more generic step-back question, which is easier to answer. Here are a few examples: "
        ),
        few_shot_prompt,
        ("user", "{question}")
    ]
)

In [23]:
generate_queries_step_back = prompt | llm | StrOutputParser()
question = "What is task decomposition for LLM agents?"
generate_queries_step_back.invoke({"question": question})

'What does task decomposition mean in the context of agents?'

In [25]:
from langchain_core.runnables import RunnableLambda

In [26]:
response_prompt_template = """
    You are an expert in world knowledge. I am going to ask you a question.
    Your response should be comprehensive and should not contradict the following
    context if it is relevant. Otherwise, ignore the context if it is not relevant.

    # {normal_context}
    # {step_back_context}

    # Original question: {question}
    # Answer: 
"""
response_prompt = ChatPromptTemplate.from_template(response_prompt_template)

chain_step_back = (
    {
        # retrieve context using the normal question
        "normal_context": RunnableLambda(lambda x: x["question"]) | retriever,
        # retrieve context using the step-back question
        "step_back_context": generate_queries_step_back | retriever,
        # pass on the question
        "question": lambda x: x["question"],
    }
    | response_prompt
    | llm
    | StrOutputParser()
)

chain_step_back.invoke({"question": question})

'Task decomposition for LLM (large language model) agents refers to the process of breaking down complex tasks into smaller, more manageable subgoals or steps. This approach enhances the agent\'s ability to handle intricate problems by allowing it to focus on simpler components, thereby improving efficiency and effectiveness in task execution.\n\nThere are several techniques and methodologies for task decomposition in LLM agents:\n\n1. **Chain of Thought (CoT)**: This standard prompting technique encourages the model to "think step by step." By doing so, it utilizes more computational resources at test time to decompose difficult tasks into simpler, sequential steps. This method not only aids in task management but also provides insights into the model\'s reasoning process.\n\n2. **Tree of Thoughts**: An extension of CoT, this approach explores multiple reasoning possibilities at each step. It decomposes the problem into various thought steps and generates multiple thoughts for each st

### HyDE

In [28]:
# HyDE document generation
template_hyde = """
    Please write a scientific paper passage to answer the question
    Question: {question}
    Passage: 
"""
prompt_hyde = ChatPromptTemplate.from_template(template_hyde)

generate_docs_for_retrieval_hyde = (
    prompt_hyde | llm | StrOutputParser()
)

# run
question = "What is task decomposition for LLM Agents?"
generate_docs_for_retrieval_hyde.invoke({"question": question})

"**Task Decomposition for LLM Agents: A Comprehensive Overview**\n\nTask decomposition is a critical concept in the realm of Large Language Model (LLM) agents, referring to the systematic breakdown of complex tasks into smaller, more manageable sub-tasks. This process is essential for enhancing the efficiency and effectiveness of LLMs in performing intricate operations that require multi-step reasoning and decision-making.\n\nIn the context of LLM agents, task decomposition involves several key steps. First, the overarching task is analyzed to identify its constituent components. This may include recognizing distinct phases of the task, such as data gathering, processing, analysis, and output generation. By segmenting the task, LLM agents can leverage their capabilities to address each sub-task individually, thereby reducing cognitive load and improving accuracy.\n\nMoreover, task decomposition facilitates parallel processing, where multiple sub-tasks can be executed simultaneously or 

In [29]:
retrieval_chain_hyde = generate_docs_for_retrieval_hyde | retriever
retrieved_docs_hyde = retrieval_chain_hyde.invoke({"question": question})
retrieved_docs_hyde

[Document(metadata={'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}, page_content='Component One: Planning#\nA complicated task usually involves many steps. An agent needs to know what they are and plan ahead.\nTask Decomposition#\nChain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.\nTree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a
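
The idea behind HyDE is that a generated passage usually resembles the indexed documents more closely than a short question does, so its embedding lands nearer to the relevant chunks. The toy sketch below illustrates this with a bag-of-words vector and cosine similarity. Both are simplifying assumptions and not the embedding model used in this notebook, and the three texts are made up.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (an assumption, not a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


indexed_chunk = "chain of thought prompting decomposes hard tasks into smaller simpler steps"
question = "what is task decomposition"
hypothetical_doc = "task decomposition breaks hard tasks into smaller simpler steps"

sim_question = cosine(embed(question), embed(indexed_chunk))
sim_hyde = cosine(embed(hypothetical_doc), embed(indexed_chunk))
print(f"question vs chunk:         {sim_question:.3f}")
print(f"hypothetical doc vs chunk: {sim_hyde:.3f}")
```

In this toy setup the question shares no words with the chunk, while the hypothetical passage shares several, so the passage scores higher. Real dense embeddings behave less starkly, but the motivation is the same.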

In [30]:
# RAG
template = """
    Answer the following question based on this context:
    {context}

    Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
final_rag_chain = (
    prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"context": retrieved_docs_hyde, "question": question})

'Task decomposition for LLM (large language model) agents involves breaking down complex tasks into smaller, manageable subgoals. This process enhances the agent\'s ability to handle complicated tasks by allowing it to plan ahead and think step by step. \n\nThere are several methods for task decomposition:\n\n1. **Chain of Thought (CoT)**: This technique encourages the model to "think step by step," transforming large tasks into simpler, manageable steps, which also provides insight into the model\'s reasoning process.\n\n2. **Tree of Thoughts**: This approach extends CoT by exploring multiple reasoning possibilities at each step. It decomposes the problem into various thought steps and generates multiple thoughts per step, creating a tree structure that can be searched using breadth-first or depth-first search methods.\n\n3. **Prompting**: Simple prompts can be used to guide the LLM in task decomposition, such as asking for the steps to achieve a goal or identifying subgoals.\n\n4. **
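
The chain above receives `retrieved_docs_hyde` as a list of `Document` objects, so the prompt is filled with their string representation, metadata included. A common alternative is to pass only the page text. The sketch below assumes a hypothetical helper `format_docs` and uses a small `Doc` dataclass as a stand-in for LangChain's `Document`, modelling only the `page_content` attribute.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document (only page_content is modelled)."""
    page_content: str


def format_docs(docs):
    """Join the text of each document, separated by a blank line."""
    return "\n\n".join(doc.page_content for doc in docs)


docs = [
    Doc("Task decomposition breaks a task into subgoals."),
    Doc("Chain of thought asks the model to think step by step."),
]
context = format_docs(docs)
print(context)
```

In the notebook, the equivalent call would be `final_rag_chain.invoke({"context": format_docs(retrieved_docs_hyde), "question": question})`, assuming `format_docs` is defined as above.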