# Query-Decomposition Retriever Demo

The QueryDecompositionRetriever automates prompt tuning by using an LLM to decompose a user query into multiple sub-questions, each approaching the original question from a different perspective. For each sub-question, it retrieves a set of relevant documents and takes the unique union across all sub-questions to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the QueryDecompositionRetriever can mitigate some of the limitations of distance-based retrieval and return a richer set of results.
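
The "unique union" step can be illustrated without any vector store. The following is a minimal sketch, assuming documents can be compared by value; `unique_union`, `toy_retrieve`, and `TOY_INDEX` are hypothetical names introduced only for this illustration and are not part of LangChain. The cells shown in this notebook answer each sub-question separately, so the sketch only illustrates the merge described above.

```python
from typing import Callable, Iterable


def unique_union(
    queries: Iterable[str],
    retrieve: Callable[[str], list[str]],
) -> list[str]:
    """Retrieve for every query and merge the hits, dropping duplicates.

    First-seen order is preserved, so hits from earlier queries stay on top.
    """
    seen: set[str] = set()
    merged: list[str] = []
    for query in queries:
        for doc in retrieve(query):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical toy index standing in for a vector store lookup.
TOY_INDEX = {
    "planning": ["doc_planning", "doc_overview"],
    "memory": ["doc_memory", "doc_overview"],
    "tool use": ["doc_tools"],
}


def toy_retrieve(query: str) -> list[str]:
    """Return the toy hits for a query, or an empty list if there are none."""
    return TOY_INDEX.get(query, [])


merged_docs = unique_union(["planning", "memory", "tool use"], toy_retrieve)
print(merged_docs)
```

Here `doc_overview` is returned by two of the queries but appears only once in the merged list.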

## What this notebook contains
- Decomposing a query into multiple sub-problems.
- Retrieving documents for each sub-problem using a vector store (Chroma) and OpenAI embeddings.
- Building a simple RAG pipeline that answers the sub-questions either individually or recursively.

In [1]:
from langchain.text_splitter import RecursiveCharacterTextSplitter  
from langchain_community.document_loaders import WebBaseLoader  
from langchain_community.vectorstores import Chroma  
from langchain_core.output_parsers import StrOutputParser  
from langchain_openai import ChatOpenAI, OpenAIEmbeddings 
from langchain.prompts import ChatPromptTemplate
from langchain.load import dumps, loads
from langchain import hub
from operator import itemgetter
import yaml
import bs4  
import os

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [2]:
# Get the current working directory
cwd = os.getcwd()

# Build the path to config.yaml
config_path = os.path.join(cwd, '..', 'configs', 'config.yaml')

# Normalize the path
config_path = os.path.abspath(config_path)

# Load credentials from the config file
with open(config_path, 'r') as file:
    config = yaml.safe_load(file)

# Set environment variables
os.environ['LANGCHAIN_API_KEY'] = config['API']['LANGCHAIN']
os.environ['OPENAI_API_KEY'] = config['API']['OPENAI']

# Configure chat LLM (deterministic)
llm = ChatOpenAI(temperature=0) 

In [3]:
# Create a loader that fetches and parses the target web page
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),  # tuple of URLs to load
    bs_kwargs=dict(  # pass BeautifulSoup-specific kwargs to limit parsing
        parse_only=bs4.SoupStrainer(  # only parse these parts of the page to reduce noise
            class_=("post-content", "post-title", "post-header")
        )
    ),
)

# Fetch and return a list of Document objects
docs = loader.load()  

# Split long documents into smaller overlapping chunks suitable for embeddings
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
splits = text_splitter.split_documents(docs)  # list of smaller Document chunks

# Create embeddings and store them in a vector DB (Chroma)
vectorstore = Chroma.from_documents(documents=splits, 
                                    embedding=OpenAIEmbeddings())  # uses OpenAI embeddings under the hood

# Create a retriever to fetch relevant docs (return the top 2 results)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
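
To make the effect of `chunk_size=300` and `chunk_overlap=50` concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification and not how `RecursiveCharacterTextSplitter` works internally: that splitter prefers to break on separators such as paragraph and line boundaries, so its real chunks are usually shorter and start at different offsets. The function `sliding_chunks` and the `sample` string are hypothetical and exist only for this illustration.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


# 700 characters of dummy text: "0123456789" repeated.
sample = "".join(str(i % 10) for i in range(700))
chunks = sliding_chunks(sample, chunk_size=300, chunk_overlap=50)

print([len(c) for c in chunks])
print(chunks[0][-50:] == chunks[1][:50])  # the tail of one chunk opens the next
```

The overlap means a sentence that straddles a chunk boundary is still likely to appear whole in at least one chunk, at the cost of embedding some text twice.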

In [4]:
# Ask an LLM to decompose the question into multiple sub-questions
template ="""
You are a helpful assistant that generates multiple sub-questions related to an input question. \n
The goal is to break down the input into a set of sub-problems / sub-questions that can be answered in isolation. \n
Generate multiple search queries related to: {question} \n
Output (3 queries):
"""

# Build the prompt object
prompt_perspectives = ChatPromptTemplate.from_template(template)

# Pipeline: prompt -> chat model -> parse -> split into separate queries
generate_queries = (
    prompt_perspectives 
    | llm
    | StrOutputParser() 
    | (lambda x: [line.strip() for line in x.split("\n") if line.strip()])  # split on newlines, dropping blank lines
)

In [5]:
# Example input question
question = "What are the main components of an LLM-powered autonomous agent system?"

# decompose the question into multiple sub-questions
result = generate_queries.invoke({"question":question})  

# Display the generated sub-questions
display(result)

['1. What is LLM technology and how does it work in autonomous agent systems?',
 '2. What are the specific components that make up an LLM-powered autonomous agent system?',
 '3. How do the main components of an LLM-powered autonomous agent system interact with each other to enable autonomous behavior?']
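
The sub-questions above keep their list numbering (`1.`, `2.`, `3.`), which then becomes part of each retrieval query. A minimal sketch of a stricter parser follows, assuming the model returns one sub-question per line with an optional numeric or bullet prefix; `parse_sub_questions` and `raw_output` are hypothetical names, and the sample text is made up rather than real model output.

```python
import re

# Matches a leading "1.", "2)", "-", "*" or "•" plus surrounding whitespace.
_LIST_PREFIX = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s*")


def parse_sub_questions(raw: str) -> list[str]:
    """Split raw LLM text into clean sub-questions, one per non-empty line."""
    questions = []
    for line in raw.splitlines():
        cleaned = _LIST_PREFIX.sub("", line).strip()
        if cleaned:  # skip blank lines
            questions.append(cleaned)
    return questions


# Made-up text in the general shape of a numbered or bulleted answer.
raw_output = (
    "1. What is LLM technology?\n"
    "\n"
    "2) Which components make up the agent?\n"
    "- How do the components interact?\n"
)

parsed = parse_sub_questions(raw_output)
print(parsed)
```

A function like this could stand in for the final lambda in `generate_queries`; whether stripping the numbering improves retrieval for a given corpus is something to check empirically.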

##### Answer each sub-question individually via retrieval


In [6]:
# RAG prompt
prompt_rag = hub.pull("rlm/rag-prompt")

def retrieve_and_rag(question,prompt_rag,sub_question_generator_chain):
    """
    RAG on each sub-question
    """
    
    # Generate sub-questions with the decomposition chain
    sub_questions = sub_question_generator_chain.invoke({"question":question})
    
    # Initialize a list to hold RAG chain results
    rag_results = []
    
    for sub_question in sub_questions:
        
        # Retrieve documents for each sub-question
        retrieved_docs = retriever.invoke(sub_question)
        
        # Use retrieved documents and sub-question in RAG chain
        answer = (prompt_rag | llm | StrOutputParser()).invoke({"context": retrieved_docs, 
                                                                "question": sub_question})
        rag_results.append(answer)
    
    return rag_results,sub_questions

# Run retrieval and RAG for every sub-question generated from the original question
answers, questions = retrieve_and_rag(question, prompt_rag, generate_queries)
print(f"Questions: {questions}")
print(f"Answers: {answers}")


Questions: ['1. What is LLM technology and how does it work in autonomous agent systems?', '2. What are the specific components that make up an LLM-powered autonomous agent system?', '3. How do the main components of an LLM-powered autonomous agent system interact with each other to enable autonomous behavior?']
Answers: ['LLM technology functions as the brain of autonomous agent systems. It works by being complemented by key components within the system.', "The specific components that make up an LLM-powered autonomous agent system include LLM as the agent's brain, along with several key components.", "The main components of an LLM-powered autonomous agent system interact with LLM, which acts as the agent's brain. These components work together to enable autonomous behavior in the system. The key components complement LLM's functions to facilitate autonomous decision-making and actions."]


In [7]:
def format_qa_pairs(questions, answers):
    """
    Format Q and A pairs
    """
    
    formatted_string = ""
    for i, (question, answer) in enumerate(zip(questions, answers), start=1):
        formatted_string += f"Question {i}: {question}\nAnswer {i}: {answer}\n\n"
    return formatted_string.strip()

context = format_qa_pairs(questions, answers)

# Prompt
template = """Here is a set of Q+A pairs:

{context}

Use these to synthesize an answer to the question: {question}
"""

# Build the prompt object
prompt = ChatPromptTemplate.from_template(template)

# Compose pipeline: prompt -> llm -> parse
final_rag_chain = (
    prompt
    | llm
    | StrOutputParser()
)

# Execute the chain and get the answer
final_rag_chain.invoke({"context":context,"question":question})

'The main components of an LLM-powered autonomous agent system include LLM technology as the brain of the system, along with key components that work together to enable autonomous behavior. These components interact with LLM to facilitate autonomous decision-making and actions within the system.'
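
To show the shape of the `context` string that the synthesis prompt receives, here is a standalone sketch of the same formatting applied to two made-up pairs. The function below is a `join`-based rewrite that should produce the same text as `format_qa_pairs` for inputs without surrounding whitespace; the sample questions and answers are invented for illustration.

```python
def format_qa_pairs_sketch(questions: list[str], answers: list[str]) -> str:
    """Number each question/answer pair and separate pairs with a blank line."""
    blocks = [
        f"Question {i}: {question}\nAnswer {i}: {answer}"
        for i, (question, answer) in enumerate(zip(questions, answers), start=1)
    ]
    return "\n\n".join(blocks)


# Invented sample data, not output from the notebook.
toy_questions = ["What is planning?", "What is memory?"]
toy_answers = ["Breaking a task into steps.", "Storing past information."]

toy_context = format_qa_pairs_sketch(toy_questions, toy_answers)
print(toy_context)
```

Because the synthesis prompt sees only these Q+A pairs and not the retrieved documents, the final answer can be no more specific than the intermediate answers.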

##### Answer each sub-question recursively via retrieval


In [8]:
# Prompt
template = """Here is the question you need to answer:

\n --- \n {question} \n --- \n

Here are any available background question + answer pairs:

\n --- \n {q_a_pairs} \n --- \n

Here is additional context relevant to the question: 

\n --- \n {context} \n --- \n

Use the above context and any background question + answer pairs to answer the question: \n {question}
"""

recursive_prompt = ChatPromptTemplate.from_template(template)

In [9]:
def format_qa_pair(question, answer):
    """
    Format Q and A pair
    """
    
    formatted_string = ""
    formatted_string += f"Question: {question}\nAnswer: {answer}\n\n"
    return formatted_string.strip()

# Build the chain once: retrieve context for the sub-question, then fill the prompt
rag_chain = (
    {"context": itemgetter("question") | retriever,
     "question": itemgetter("question"),
     "q_a_pairs": itemgetter("q_a_pairs")}
    | recursive_prompt
    | llm
    | StrOutputParser()
)

# Answer the sub-questions in order, passing earlier Q+A pairs as background
q_a_pairs = ""
for q in questions:
    answer = rag_chain.invoke({"question": q, "q_a_pairs": q_a_pairs})
    q_a_pair = format_qa_pair(q, answer)
    q_a_pairs = q_a_pairs + "\n---\n" + q_a_pair
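
The control flow of the recursive variant can be seen in isolation by replacing the retrieval and LLM call with a stub. This is a minimal sketch under that assumption; `answer_recursively` and `stub_answer` are hypothetical names, and the stub only reports how much background it was given.

```python
from typing import Callable


def answer_recursively(
    sub_questions: list[str],
    answer_fn: Callable[[str, str], str],
) -> tuple[str, str]:
    """Answer sub-questions in order, feeding earlier Q+A pairs into later calls.

    Returns the answer to the last sub-question and the accumulated Q+A text.
    """
    q_a_pairs = ""
    answer = ""
    for sub_question in sub_questions:
        answer = answer_fn(sub_question, q_a_pairs)
        q_a_pairs += f"\n---\nQuestion: {sub_question}\nAnswer: {answer}"
    return answer, q_a_pairs


def stub_answer(sub_question: str, background: str) -> str:
    """Hypothetical stand-in for retrieval + LLM: counts the pairs it can see."""
    seen = background.count("Question:")
    return f"answered with {seen} earlier pair(s) as background"


final_answer, history = answer_recursively(["Q1?", "Q2?", "Q3?"], stub_answer)
print(final_answer)
print(history)
```

Note that the value left in `answer` after the loop is the answer to the last sub-question, informed by the earlier pairs, rather than a separate answer to the original question.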

In [10]:
answer

"The main components of an LLM-powered autonomous agent system interact with each other to enable autonomous behavior by working together in a coordinated manner. The LLM functions as the agent's brain, processing and generating human-like text to understand and respond to user inputs. This natural language processing capability allows the agent to communicate effectively with users. \n\nIn addition to the LLM, there are several key components that complement its functions within the system. These components may include modules for speech recognition, natural language understanding, dialogue management, and decision-making. Each component plays a specific role in the overall functioning of the autonomous agent system.\n\nThrough the interaction and collaboration of these components, the autonomous agent system is able to process user inputs, generate appropriate responses, make decisions based on the context of the conversation, and carry out tasks autonomously. This coordinated intera