# Advanced Retrieval-Augmented Generation: Query Expansion

Advanced RAG techniques introduce improvements to overcome the limitations of naive RAG, focusing on enhancing retrieval quality and integration with generation. These enhancements are often implemented in two key phases:

1. **Pre-retrieval Enhancements**

  These techniques focus on improving the indexing structure and refining the user’s query. By organizing the index more effectively and incorporating additional metadata or context (e.g., embeddings or domain-specific tags), retrieval becomes more accurate and relevant to the query.

2. **Post-retrieval Processing**
  
  After retrieval, we combine the retrieved data with the original query. This could involve reranking the retrieved chunks to highlight the most important content.



In this notebook, we demonstrate how to enhance the pre-retrieval phase of RAG using two methods:

1. Query expansion with hypothetical answer
2. Query expansion with multiple queries

Before introducing query expansion, let's first revisit RAG and its pitfalls.

## Naive RAG Overview

Naive RAG involves three primary phases:

1. **Indexing**:
   - Prepares the document collection for retrieval by cleaning and extracting relevant information from each document. This phase involves parsing and preprocessing documents, chunking the parsed documents, using an embedding model to generate vectors from the chunks, and storing them in a vector database.
   
2. **Retrieval**:
   - Converts the user query into a vector representation using the embedding model.
   - Compares the vectorized query with the vectors stored in the vector database to retrieve the most relevant chunks.

3. **Generation**:
   - Augments the user query with the retrieved chunks into a single prompt.
   - Generates an answer to the user query using a language model based on this combined information.
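
The three phases above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the `embed` function is a toy bag-of-words counter standing in for a real embedding model, the in-memory `index` list stands in for a vector database, and the three example chunks are made up. The generation step stops at building the augmented prompt, since no language model is called here.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding' (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 1. Indexing: embed every chunk and store (chunk, vector) pairs.
chunks = [
    "net sales of iphone increased during the year",
    "the company leases office space in california",
    "services net sales include advertising and cloud services",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Retrieval: embed the query and return the k most similar chunks.
def retrieve(query: str, k: int = 2) -> list[str]:
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# 3. Generation: augment the query with the retrieved chunks in a single prompt.
def build_prompt(query: str, k: int = 2) -> str:
    context = "\n\n".join(retrieve(query, k))
    return f"<context>\n{context}\n</context>\n\nQuestion: {query}"

top_chunks = retrieve("iphone net sales")
prompt = build_prompt("iphone net sales")
print(prompt)
```

In a real pipeline the prompt would then be sent to a language model, as done later in this notebook.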

## Naive RAG Pitfalls

Naive RAG faces several challenges that can hinder its performance. For instance, during the retrieval phase, crucial information relevant to the query might be overlooked, while irrelevant chunks may be selected. Additionally, hallucinations can occur during the generation phase, resulting in incorrect or fabricated information.

These challenges occur mainly due to:

1. **Limited Contextual Understanding**  
   Naive RAG often relies on keyword matching or basic semantic search, which may retrieve irrelevant or only partially relevant documents. For example, a query like "*...the impact of climate change on polar bears*" might retrieve documents related to *climate change* and *polar bears* individually, but fail to prioritize documents specifically addressing the relationship between *climate change* **and** *polar bears*. This leads to incomplete or misleading answers.
  
2. **Inconsistent Relevance and Quality of Retrieved Documents**  
   Naive RAG struggles with ranking retrieved documents effectively, often leading to the inclusion of irrelevant or low-quality inputs. This compromises the quality of the subsequent generation phase.

3. **Poor Integration Between Retrieval and Generation**  
   In naive RAG, the retriever and the generator operate independently, resulting in a lack of alignment between the retrieved information and the context required for accurate generation. This disjointed operation often leads to incoherent or irrelevant outputs.

4. **Inefficient Handling of Large-Scale Data**  
   Standard retrieval techniques in naive RAG can be inefficient when dealing with large datasets. This inefficiency may cause delays in retrieving relevant documents or lead to the omission of critical information due to inadequate indexing or suboptimal retrieval strategies.

5. **Lack of Robustness and Adaptability**  
   Naive RAG lacks advanced mechanisms for managing ambiguous or complex queries. When queries include multiple or nuanced components, it often fails to adapt effectively, producing incomplete or unsatisfactory answers due to its limited ability to process such complexity.

## Query Expansion

Query expansion is a technique used in information retrieval and search engines to improve search results by augmenting a user's original query with additional, related terms or phrases. The primary goal is to address issues such as ambiguous or overly broad queries and to increase the likelihood of retrieving relevant documents. This process is particularly beneficial for users who may be unfamiliar with the best search terms to use, helping them locate the information they need more effectively.

While query expansion can reduce irrelevant results and improve specificity, over-expansion — adding too many or inappropriate terms — can dilute the query, potentially introducing irrelevant results.



### Query Expansion Methods
1. **Phrase Expansion**
  
  Instead of adding single words, this method incorporates related phrases to make search results more precise. For example, a query for "heart attack" might expand to include "myocardial infarction," which is a synonymous medical term.


2. **Contextual or Conceptual Expansion**
  
  This approach leverages a knowledge base or thesaurus (e.g., WordNet, medical ontologies) to include terms related to the context or meaning of the original query. For instance, a search for "virus" could be expanded to include terms like "infection" or "disease" if the system recognizes the medical context.


3. **Synonym Expansion**
  
  Synonyms of the original query terms are added to broaden the search scope. For example, a query for "car" might expand to include "automobile" and "vehicle."


4. **Hypothetical Answer-Based Expansion**
  
  This method uses an LLM to generate a hypothetical response to the user’s query. The response is analyzed for relevant terms and phrases, which are then incorporated into the retrieval process. For example, given the query "What are the symptoms of flu?" the LLM might generate a response mentioning "fever, fatigue, and sore throat," which could enhance retrieval. However, inaccuracies in generated information could lead to noise, requiring careful handling to maintain retrieval quality.


5. **Multi-Query Expansion**
  
  This technique involves using an LLM to generate multiple variations of the user’s query. Each query variation, along with the original, is used for separate retrieval processes, increasing the likelihood of retrieving relevant information. Due to the larger volume of retrieved data, a reranker—typically a machine learning model—can be employed to prioritize the most relevant chunks of information. Find more [here](https://www.youtube.com/watch?v=JChPi0CRnDY).
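
As a small illustration of the simpler methods in this list, synonym and phrase expansion could look roughly like the sketch below. The `SYNONYMS` dictionary is a hypothetical, hand-made thesaurus (a real system might draw on WordNet or a domain ontology), and `max_terms` is an assumed knob for limiting over-expansion.

```python
import re

# Hypothetical, hand-made thesaurus used only for this illustration.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "heart attack": ["myocardial infarction"],
}

def expand_with_synonyms(query: str, max_terms: int = 3) -> str:
    """Append up to `max_terms` related terms to the query to limit over-expansion."""
    lowered = query.lower()
    extra = []
    for term, synonyms in SYNONYMS.items():
        # Match whole words or phrases only, so "car" does not match "care".
        if re.search(rf"\b{re.escape(term)}\b", lowered):
            extra.extend(s for s in synonyms if s not in lowered and s not in extra)
    return " ".join([query, *extra[:max_terms]])

expanded = expand_with_synonyms("cheap car insurance")
print(expanded)
```

Capping the number of added terms is one simple way to reduce the dilution risk mentioned earlier; other strategies (for example weighting the original terms higher) are equally plausible.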

# Implementation

## Install Dependencies

In [1]:
pip install -qU pypdf langchain-chroma langchain-huggingface langchain-text-splitters langchain-community umap-learn

## Import Libraries

In [45]:
import os
import logging
import re
import numpy as np
import nest_asyncio

from pydantic import BaseModel, Field
from typing import List

# from langchain import hub
from langchain_chroma import Chroma
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings, HuggingFaceEndpoint, ChatHuggingFace
from langchain_core.output_parsers import StrOutputParser, BaseOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain.retrievers.multi_query import MultiQueryRetriever


nest_asyncio.apply()

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "HF_APIKEY"  # placeholder: replace with your own Hugging Face API token
os.environ["VECTOR_DB_DIR"] = "./content/advanced_rag_vector_db"

## Download Data For RAG

We are going to use Apple Inc.'s 2023 Form 10-K report for RAG. A Form 10-K is an annual report mandated by the U.S. Securities and Exchange Commission (SEC) that provides a detailed overview of a company's financial performance. It includes critical information such as company history, organizational structure, executive compensation, equity holdings, subsidiaries, and audited financial statements. These reports are extensive and complex, containing essential financial data, risk factors, and operational insights.

In [3]:
!wget https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf

--2024-12-11 11:52:13--  https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf
Resolving s2.q4cdn.com (s2.q4cdn.com)... 68.70.205.1, 68.70.205.3, 68.70.205.2, ...
Connecting to s2.q4cdn.com (s2.q4cdn.com)|68.70.205.1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 714094 (697K) [application/pdf]
Saving to: ‘_10-K-Q4-2023-As-Filed.pdf’


2024-12-11 11:52:13 (11.9 MB/s) - ‘_10-K-Q4-2023-As-Filed.pdf’ saved [714094/714094]



## Extract Text From PDF File

In [5]:
FILE_PATH = "/content/_10-K-Q4-2023-As-Filed.pdf"

loader = PyPDFLoader(FILE_PATH)
pages = []
async for page in loader.alazy_load():
    pages.append(page)

In [6]:
print(f"{pages[0].metadata}\n")
print(pages[0].page_content)

{'source': '/content/_10-K-Q4-2023-As-Filed.pdf', 'page': 0}

UNITED STATES
SECURITIES AND EXCHANGE COMMISSION
Washington, D.C. 20549
FORM 10-K
(Mark One)
☒    ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934
For the fiscal year ended September 30, 2023
or
☐    TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934
For the transition period from              to             .
Commission File Number: 001-36743
Apple Inc.
(Exact name of Registrant as specified in its charter)
California 94-2404110
(State or other jurisdiction
of incorporation or organization)
(I.R.S. Employer Identification No.)
One Apple Park Way
Cupertino, California 95014
(Address of principal executive offices) (Zip Code)
(408) 996-1010
(Registrant’s telephone number, including area code)
Securities registered pursuant to Section 12(b) of the Act:
Title of each class
Trading 
symbol(s) Name of each exchange on which registered
Common Stock, $0.00001

## Preprocess Text Content

In [7]:
# Remove unnecessary pages
filtered_pages = pages[3:77]

In [8]:
# Preprocess text (modifies the Document objects in filtered_pages in place)
preprocessed_pages = filtered_pages
for preprocess_page in preprocessed_pages:
    page_content = preprocess_page.page_content
    page_content = re.sub(r"☒|☐|—|®|™", "", page_content)  # remove special symbols
    page_content = re.sub(r"Apple Inc\. \| \d{4} Form 10-K \| \d+[a-zA-Z]*", "", page_content, flags=re.IGNORECASE)  # remove page-footer boilerplate
    page_content = page_content.lower()  # convert text to lowercase
    page_content = re.sub(r"\s+", " ", page_content)  # collapse runs of whitespace into a single space
    page_content = page_content.strip()  # strip leading/trailing spaces

    preprocess_page.page_content = page_content

In [9]:
preprocessed_pages[0]

Document(metadata={'source': '/content/_10-K-Q4-2023-As-Filed.pdf', 'page': 3}, page_content='this annual report on form 10-k (“form 10-k”) contains forward-looking statements, within the meaning of the private securities litigation reform act of 1995, that involve risks and uncertainties. many of the forward-looking statements are located in part i, item 1 of this form 10-k under the heading “business” and part ii, item 7 of this form 10-k under the heading “management’s discussion and analysis of financial condition and results of operations.” forward-looking statements provide current expectations of future events based on certain assumptions and include any statement that does not directly relate to any historical or current fact. for example, statements in this form 10-k regarding the potential future impact of macroeconomic conditions on the company’s business and results of operations are forward-looking statements. forward- looking statements can also be identified by words suc

In [10]:
# Split text into chunks with a specified max length and overlap between chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=512,
    chunk_overlap=128,
    length_function=len,
    is_separator_regex=False,
)

In [11]:
chunks = await text_splitter.atransform_documents(preprocessed_pages)

In [12]:
chunks[0]

Document(metadata={'source': '/content/_10-K-Q4-2023-As-Filed.pdf', 'page': 3}, page_content='this annual report on form 10-k (“form 10-k”) contains forward-looking statements, within the meaning of the private securities litigation reform act of 1995, that involve risks and uncertainties. many of the forward-looking statements are located in part i, item 1 of this form 10-k under the heading “business” and part ii, item 7 of this form 10-k under the heading “management’s discussion and analysis of financial condition and results of operations.” forward-looking statements provide current')

In [17]:
print(f"Total number of chunks: {len(chunks)}")
print(f"Average chunk length: {np.mean([len(chunk.page_content) for chunk in chunks]).astype(int)}")

Total number of chunks: 666
Average chunk length: 487
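
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally: that class also tries to break on separators such as paragraph breaks and spaces, which is one reason the average chunk length reported above can fall below `chunk_size`. The function name `sliding_window_chunks` is an assumption made for this sketch.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows where consecutive windows share `chunk_overlap` characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far the window advances each time
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]

example_chunks = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(example_chunks)
```

The overlap means a sentence cut at a chunk boundary still appears intact in at least one neighbouring chunk, at the cost of storing some text twice.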


## Create a Chroma Vector Database

We will first load an embedding model. Next, we will create our vector database, embed and store the chunks. Finally, we save the vector database in a persistent directory, so we do not need to create it again.

In [18]:
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")


In [19]:
!unzip /content/advanced_rag_vector_db.zip

Archive:  /content/advanced_rag_vector_db.zip
   creating: content/advanced_rag_vector_db/
  inflating: content/advanced_rag_vector_db/chroma.sqlite3  
   creating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/
 extracting: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/link_lists.bin  
  inflating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/header.bin  
  inflating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/length.bin  
  inflating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/data_level0.bin  
   creating: content/advanced_rag_vector_db/.ipynb_checkpoints/
   creating: content/advanced_rag_vector_db/1aa2a274-82f6-444a-9d94-b4ef247b4cc9/
 extracting: content/advanced_rag_vector_db/1aa2a274-82f6-444a-9d94-b4ef247b4cc9/link_lists.bin  
  inflating: content/advanced_rag_vector_db/1aa2a274-82f6-444a-9d94-b4ef247b4cc9/header.bin  
  inflating: content/advanced_rag_vector_db

In [157]:
vector_db = Chroma(
    collection_name="example_collection",
    embedding_function=embedding_model,
    persist_directory=os.getenv("VECTOR_DB_DIR")
)

In [16]:
# _ = await vector_db.aadd_documents(preprocessed_pages)

In [17]:
# !zip -r advanced_rag_vector_db.zip /content/advanced_rag_vector_db

updating: content/advanced_rag_vector_db/ (stored 0%)
updating: content/advanced_rag_vector_db/chroma.sqlite3 (deflated 64%)
updating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/ (stored 0%)
updating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/link_lists.bin (stored 0%)
updating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/header.bin (deflated 61%)
updating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/length.bin (deflated 15%)
updating: content/advanced_rag_vector_db/fbe97fe7-5eeb-4d05-aa42-287a3b26ec61/data_level0.bin (deflated 100%)
  adding: content/advanced_rag_vector_db/.ipynb_checkpoints/ (stored 0%)
  adding: content/advanced_rag_vector_db/1aa2a274-82f6-444a-9d94-b4ef247b4cc9/ (stored 0%)
  adding: content/advanced_rag_vector_db/1aa2a274-82f6-444a-9d94-b4ef247b4cc9/link_lists.bin (stored 0%)
  adding: content/advanced_rag_vector_db/1aa2a274-82f6-444a-9d94-b4ef247b4cc9/header.bin (de

In [158]:
# Build a retriever that returns the 2 most similar documents for a query
retriever = vector_db.as_retriever(
    search_type="similarity", search_kwargs={"k": 2}
)

In [159]:
retriever.invoke("Apple's sales")

[Document(metadata={'page': 30, 'source': '/content/_10-K-Q4-2023-As-Filed.pdf'}, page_content='apple inc. consolidated statements of operations (in millions, except number of shares, which are reflected in thousands, and per-share amounts) years ended september 30, 2023 september 24, 2022 september 25, 2021 net sales: products $ 298,085 $ 316,199 $ 297,392 services 85,200 78,129 68,425 total net sales 383,285 394,328 365,817 cost of sales: products 189,282 201,471 192,266 services 24,855 22,075 20,715 total cost of sales 214,137 223,546 212,981 gross margin 169,148 170,782 152,836 operating expenses: research and development 29,915 26,251 21,914 selling, general and administrative 24,932 25,094 21,973 total operating expenses 54,847 51,345 43,887 operating income 114,301 119,437 108,949 other income/(expense), net (565) (334) 258 income before provision for income taxes 113,736 119,103 109,207 provision for income taxes 16,741 19,300 14,527 net income $ 96,995 $ 99,803 $ 94,680 earnin

## Pre-trained Model for Generation

We will send our prompts to a Hugging Face endpoint, which returns an answer generated by a model of our choice. Using an endpoint is convenient because we do not need to load the model or run inference locally.

In [50]:
llm = HuggingFaceEndpoint(
    repo_id="microsoft/Phi-3-mini-4k-instruct",
    task="text-generation",
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03
)

model = ChatHuggingFace(llm=llm, verbose=True)


## Query

In [51]:
query = "What was the net sales by category of product?"

## Naive RAG

We first implement naive RAG, which will serve as a baseline.

In [52]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

In [53]:
RAG_TEMPLATE = """
You are a helpful AI assistant specialized in question-answering tasks in the field of Finance.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.

<context>
{context}
</context>

Answer the following question:

{question}"""

rag_prompt = ChatPromptTemplate.from_template(RAG_TEMPLATE)

In [54]:
naive_rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)
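
For readers less familiar with the pipe syntax, the chain above can be read roughly as the plain-Python sketch below. The dictionary step sends the same input (the question) to each of its values. `fake_retriever` and `fake_model` are hypothetical stand-ins for the real retriever and endpoint, and `join_chunks` plays the role of `format_docs` on plain strings, so the sketch runs without any external service.

```python
# Stand-ins (hypothetical) for the real retriever and model used in this notebook.
def fake_retriever(question: str) -> list[str]:
    return ["chunk A about net sales", "chunk B about services"]

def fake_model(prompt: str) -> str:
    return f"(answer generated from a {len(prompt)}-character prompt)"

def join_chunks(chunks: list[str]) -> str:
    return "\n\n".join(chunks)

PROMPT = "<context>\n{context}\n</context>\n\nAnswer the following question:\n\n{question}"

def build_inputs(question: str) -> dict:
    # The dict step: every value receives the same input, the question.
    return {
        "context": join_chunks(fake_retriever(question)),  # retriever | format_docs
        "question": question,  # RunnablePassthrough()
    }

def naive_rag(question: str) -> str:
    prompt = PROMPT.format(**build_inputs(question))  # rag_prompt
    return fake_model(prompt)  # model | StrOutputParser()

inputs = build_inputs("What were net sales?")
answer = naive_rag("What were net sales?")
print(answer)
```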

In [108]:
print(f"Query:\n{query}\n\nResponse (Naive RAG):\n{naive_rag_chain.invoke(query)}")

Query:
What was the net sales by category of product?

Response (Naive RAG):
The net sales by category of product for 2023 were as follows (in millions): iPhone - $200,583; Mac - $29,357; iPad - $28,300; Wearables, Home, and Accessories - $39,845; Services - $85,200. The total net sales were $383,285. The net sales for the other


## Advanced RAG: Query Expansion with Hypothetical Answer

### Hypothetical Answer Chain

First, we define a chain that generates a hypothetical answer to our query.

In [56]:
GENERATE_HYPOTHETICAL_ANSWER_TEMPLATE = """
    You are a helpful AI assistant specialized in question-answering tasks in the field of Finance.
    Provide an example answer to the given question, that might be found in a document like an annual report.

    Answer the following question:

    {question}
"""

gen_hypothetical_answer_prompt = ChatPromptTemplate.from_template(GENERATE_HYPOTHETICAL_ANSWER_TEMPLATE)

In [58]:
gen_hypothetical_answer_chain = (
    gen_hypothetical_answer_prompt
    | model
    | StrOutputParser()
)

In [66]:
gen_hypothetical_answer_chain.invoke(query)

'The net sales by category of product were as follows: Electronics generated $25 million in net sales, while Apparel sales amounted to $15 million, and Home Goods sales reached $10 million, resulting in a total net sales of $50 million this year.'

### Expand Query With Hypothetical Answer

Next, we create the chain that expands our initial query with the hypothetical answer and retrieves documents based on the expanded query.

In [61]:
def expand_query(question):
    hypothetical_answer = gen_hypothetical_answer_chain.invoke(question)
    return f"""{question}\n{hypothetical_answer}"""
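
The intuition behind this expansion can be illustrated without any model. In the sketch below, a toy bag-of-words vector stands in for the real embedding model, and `fake_hypothetical_answer` is a hard-coded, hypothetical stand-in for `gen_hypothetical_answer_chain`. The hypothetical answer shares more vocabulary with the target passage than the bare question does, so the expanded query ends up closer to that passage.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding' (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def fake_hypothetical_answer(question: str) -> str:
    # Hard-coded stand-in for an LLM call; the wording is invented for illustration.
    return "net sales were led by iphone followed by services mac ipad and wearables"

def expand_query_sketch(question: str) -> str:
    return f"{question}\n{fake_hypothetical_answer(question)}"

question = "What was the net sales by category of product?"
passage = ("net sales disaggregated by significant products and services "
           "iphone mac ipad wearables services")

plain_score = cosine(embed(question), embed(passage))
expanded_score = cosine(embed(expand_query_sketch(question)), embed(passage))
print(f"plain: {plain_score:.3f}, expanded: {expanded_score:.3f}")
```

With a real embedding model the effect depends on how close the hypothetical answer is to the wording of the source document, so the gain is not guaranteed.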

In [62]:
rag_with_hypothetical_answer_chain = (
    {
        "context": expand_query | retriever | format_docs,
        "question": RunnablePassthrough()
     }
    | rag_prompt
    | model
    | StrOutputParser()
)

In [107]:
print(f"Query:\n{query}\n\nResponse (RAG With Hypothetical Answer):\n{rag_with_hypothetical_answer_chain.invoke(query)}")

Query:
What was the net sales by category of product?

Response (RAG With Hypothetical Answer):
The net sales by category of product for 2023 were as follows (in millions): iPhone - $200,583; Mac - $29,357; iPad - $28,300; Wearables, Home, and Accessories - $39,845; Services - $85,200. The total net sales were $383,285. The net sales for the other


## Advanced RAG: Query Expansion with Multiple Queries

### Multiple Queries Chain

Now, we will implement the second query expansion method we discussed. We first create a chain that generates multiple queries. Next, we use this multi-query chain to instantiate a multi-query retriever. This retriever returns the union of all unique documents retrieved from our vector database across all generated queries. Read more on [LangChain MultiQuery Retriever](https://python.langchain.com/docs/how_to/MultiQueryRetriever/).
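
The "union of unique documents" idea can be sketched as follows. This is not LangChain's implementation; `unique_union` is a hypothetical helper that merges per-query result lists and keeps the first occurrence of each document, with plain strings standing in for `Document` objects.

```python
def unique_union(results_per_query: list[list[str]]) -> list[str]:
    """Merge per-query result lists, dropping duplicates and keeping first-seen order."""
    seen = set()
    merged = []
    for results in results_per_query:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

# Each inner list holds the documents retrieved for one generated query.
merged_docs = unique_union([["a", "b"], ["b", "c"], ["a", "d"]])
print(merged_docs)
```

Because several query variants often retrieve the same document, the merged list is usually shorter than the sum of the individual result lists.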


In [79]:
GENERATE_MULTIPLE_QUERIES_TEMPLATE = """
    You are a helpful AI assistant specialized in Finance.
    Your task is to generate five different versions of the given user question to retrieve relevant documents from a vector database.
    By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of the distance-based similarity search.
    These five versions should be concise, single-topic questions (without compounding sentences) that cover various aspects of the topic.
    Ensure each question is complete and directly related to the original inquiry.
    List each question on a separate line.
    Do not number the queries.

    Original question: {question}
"""

gen_multiple_queries_prompt = PromptTemplate(input_variables=["question"], template=GENERATE_MULTIPLE_QUERIES_TEMPLATE)

In [80]:
# Output parser will split the LLM result into a list of queries
class LineListOutputParser(BaseOutputParser[List[str]]):
    """Output parser for a list of lines."""

    def parse(self, text: str) -> List[str]:
        lines = text.strip().split("\n")
        return list(filter(None, lines))  # Remove empty lines
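
LLMs do not always follow formatting instructions exactly and may wrap each query in quotation marks or add list markers. A stricter line parser could clean these up before retrieval. The function below is a hypothetical variant written as a plain function, not part of LangChain, and the cleanup rules are assumptions about what such output might look like.

```python
import re

def parse_query_lines(text: str) -> list[str]:
    """Split LLM output into queries, dropping empty lines, list markers, and wrapping quotes."""
    queries = []
    for line in text.splitlines():
        line = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line)  # drop "-", "*", "1.", "2)" markers
        line = line.strip().strip("\"'“”").strip()  # drop surrounding quotes and spaces
        if line:
            queries.append(line)
    return queries

raw_output = "\"What were net sales?\"\n\n1. How did sales change?\n- 'Which product sold most?'"
cleaned_queries = parse_query_lines(raw_output)
print(cleaned_queries)
```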

In [None]:
output_parser = LineListOutputParser()

In [81]:
gen_multiple_queries_chain = (
    gen_multiple_queries_prompt
    | model
    | output_parser
)

In [82]:
gen_multiple_queries_chain.invoke(query)

['"Can you provide the figures for total net sales categorized by product type?"',
 '"Could you break down the sales figures according to different categories of products?"',
 '"Please list the net sales numbers categorized by the type of products available."',
 '"How do the sales by product category compare for net sales figures?"',
 '"Show the net sales reported categorized by the different areas of product offerings."']

In [109]:
multi_query_retriever = MultiQueryRetriever(
    retriever=retriever,
    llm_chain=gen_multiple_queries_chain,
    parser_key="lines"
)

In [110]:
# Set logging for the queries
logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

In [135]:
unique_docs = multi_query_retriever.invoke(query)
print("\n")
unique_docs

INFO:langchain.retrievers.multi_query:Generated queries: ['"Can you provide the figures for total net sales categorized by product type?"', '"Could you break down the sales figures according to different categories of products?"', '"Please list the net sales numbers categorized by the type of products available."', '"How do the sales by product category compare for net sales figures?"', '"Show the net sales reported categorized by the different areas of product offerings."']






[Document(metadata={'page': 37, 'source': '/content/_10-K-Q4-2023-As-Filed.pdf'}, page_content='net sales disaggregated by significant products and services for 2023, 2022 and 2021 were as follows (in millions): 2023 2022 2021 iphone (1) $ 200,583 $ 205,489 $ 191,973 mac (1) 29,357 40,177 35,190 ipad (1) 28,300 29,292 31,862 wearables, home and accessories (1) 39,845 41,241 38,367 services (2) 85,200 78,129 68,425 total net sales $ 383,285 $ 394,328 $ 365,817 (1) products net sales include amortization of the deferred value of unspecified software upgrade rights, which are bundled in the sales price of the respective product. (2) services net sales include amortization of the deferred value of services bundled in the sales price of certain products. total net sales include $8.2 billion of revenue recognized in 2023 that was included in deferred revenue as of september 24, 2022, $7.5 billion of revenue recognized in 2022 that was included in deferred revenue as of september 25, 2021 , a

### RAG Based On Multiple Queries

Now, we use the multi-query retriever to retrieve the union of all documents using the multiple queries. To keep the context comparable to the previous chains, which use at most two documents, we pass only the first two retrieved documents to the generation step.

In [142]:
def limit_n_retrieved_docs(docs: List) -> List:
    return docs[:2]

In [145]:
multiple_queries_rag_chain = (
    {"context": multi_query_retriever | limit_n_retrieved_docs | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)

In [146]:
print(f"Query:\n{query}\n\nResponse (RAG With Multi-Query):\n{multiple_queries_rag_chain.invoke(query)}")

INFO:langchain.retrievers.multi_query:Generated queries: ['"Can you provide the figures for total net sales categorized by product type?"', '"Could you break down the sales figures according to different categories of products?"', '"Please list the net sales numbers categorized by the type of products available."', '"How do the sales by product category compare for net sales figures?"', '"Show the net sales reported categorized by the different areas of product offerings."']


Query:
What was the net sales by category of product?

Response (RAG With Multi-Query):
iPhone: $200,583 million
Mac: $29,357 million
iPad: $28,300 million
Wearables, Home and Accessories: $39,845 million
Services: $85,200 million

Total Net Sales: $383,285 million
