# File-level Retrieval with LlamaCloud

<a href="https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/10k_apple_tesla/demo_file_retrieval.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook we show how to perform file-level retrieval with LlamaCloud. File-level retrieval is useful for user questions that can only be answered properly with an entire document as context. We first build a file-level and a chunk-level retriever, and wrap each one in a query engine.

Because relying only on file-level retrieval can be slow and expensive, we then build an agent that dynamically decides whether to use file-level or chunk-level retrieval.

## Setup

Install the core packages and download the files. You will need to upload these documents to LlamaCloud.

In [None]:
!pip install llama-index
!pip install llama-index-core
!pip install llama-index-embeddings-openai
!pip install llama-index-question-gen-openai
!pip install llama-index-postprocessor-flag-embedding-reranker
!pip install git+https://github.com/FlagOpen/FlagEmbedding.git
!pip install llama-parse

In [None]:
# create the target directory first; wget -O fails if data/ does not exist
!mkdir -p data

# download Apple
!wget "https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf" -O data/apple_2023.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2022/q4/_10-K-2022-(As-Filed).pdf" -O data/apple_2022.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf" -O data/apple_2021.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2020/ar/_10-K-2020-(As-Filed).pdf" -O data/apple_2020.pdf
!wget "https://www.dropbox.com/scl/fi/i6vk884ggtq382mu3whfz/apple_2019_10k.pdf?rlkey=eudxh3muxh7kop43ov4bgaj5i&dl=1" -O data/apple_2019.pdf

# download Tesla
!wget "https://ir.tesla.com/_flysystem/s3/sec/000162828024002390/tsla-20231231-gen.pdf" -O data/tesla_2023.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000095017023001409/tsla-20221231-gen.pdf" -O data/tesla_2022.pdf
!wget "https://www.dropbox.com/scl/fi/ptk83fmye7lqr7pz9r6dm/tesla_2021_10k.pdf?rlkey=24kxixeajbw9nru1sd6tg3bye&dl=1" -O data/tesla_2021.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000156459021004599/tsla-10k_20201231-gen.pdf" -O data/tesla_2020.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000156459020004475/tsla-10k_20191231-gen_0.pdf" -O data/tesla_2019.pdf

Next, set up the OpenAI and LlamaCloud credentials. The OpenAI LLM is used for response synthesis.

In [1]:
# llama-parse is async-first; running async code in a notebook requires nest_asyncio
import nest_asyncio
nest_asyncio.apply()

In [2]:
import os
# API access to llama-cloud
os.environ["LLAMA_CLOUD_API_KEY"] = "llx-"

In [3]:
# Using OpenAI API for embeddings/llms
os.environ["OPENAI_API_KEY"] = "sk-"
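
The two cells above leave placeholder values (`llx-` and `sk-`) that must be replaced with real keys. As a minimal sketch, a small guard can fail early when a key is missing or still a bare placeholder. The helper name `require_api_key` is hypothetical and not part of any library.

```python
import os


def require_api_key(name: str, placeholder_prefix: str) -> str:
    """Return the key in environment variable `name`.

    Raises RuntimeError if the variable is unset, empty, or still equal to
    the bare placeholder prefix (for example "llx-" or "sk-").
    """
    value = os.environ.get(name, "").strip()
    if not value or value == placeholder_prefix:
        raise RuntimeError(f"{name} is not set to a real key")
    return value


# Example usage (commented out so the cell does not raise with placeholders):
# require_api_key("LLAMA_CLOUD_API_KEY", "llx-")
# require_api_key("OPENAI_API_KEY", "sk-")
```

Running the check right after setting the variables surfaces a forgotten key immediately instead of as an authentication error later in the notebook.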

## Load Documents into LlamaCloud

The first order of business is to download the ten 10-K filings (five for Apple and five for Tesla, covering 2019 through 2023) and upload them to LlamaCloud.

You can easily do this by creating a pipeline and uploading docs via the "Files" mode.

After this is done, proceed to the next section.
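
Before uploading, it can help to confirm that all ten downloads succeeded, since a failed `wget -O` can leave an empty or non-PDF file behind. The sketch below assumes the files live in `data/` under the names used in the download cell. The helper `check_downloads` is hypothetical, and it only checks for the `%PDF-` header that PDF files normally begin with.

```python
from pathlib import Path
from typing import List

# File names follow the download cell: <company>_<year>.pdf for 2019 to 2023.
EXPECTED_FILES = [
    f"{company}_{year}.pdf"
    for company in ("apple", "tesla")
    for year in range(2019, 2024)
]


def check_downloads(data_dir: str = "data") -> List[str]:
    """Return the expected file names that are missing or do not look like PDFs."""
    problems = []
    for name in EXPECTED_FILES:
        path = Path(data_dir) / name
        if not path.is_file():
            problems.append(name)
            continue
        # A PDF normally starts with the bytes "%PDF-".
        with path.open("rb") as handle:
            if handle.read(5) != b"%PDF-":
                problems.append(name)
    return problems
```

An empty list from `check_downloads()` suggests the files are ready to upload. Any names it returns are worth downloading again.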

## Define LlamaCloud File/Chunk Retriever over Documents

In this section we define both a file-level and chunk-level LlamaCloud Retriever over these documents.

The file-level LlamaCloud retriever returns entire documents, and `files_top_k` sets how many files come back. There are two retrieval modes:
- `files_via_content`: Retrieve top-k chunks, dereference into source files. Use a weighted average heuristic to determine the top files to return.
- `files_via_metadata`: Use an LLM to analyze the metadata of each file, and determine the top files that are most relevant to the query.
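
To make the `files_via_content` idea concrete, here is an illustrative sketch of dereferencing chunks into files. LlamaCloud's actual heuristic runs server-side and is not specified in this notebook, so the weighting below is an assumption: each file is scored by the sum of its chunk scores divided by the total number of retrieved chunks, which equals its mean chunk score weighted by its share of the retrieved chunks. The function name `rank_files` and the `(file_name, score)` input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rank_files(
    chunks: List[Tuple[str, float]], files_top_k: int = 1
) -> List[Tuple[str, float]]:
    """Rank source files from (file_name, score) pairs of retrieved chunks.

    Assumed weighting, not LlamaCloud's real one:
    file score = sum of that file's chunk scores / total retrieved chunks.
    """
    if not chunks:
        return []
    totals: Dict[str, float] = defaultdict(float)
    for file_name, score in chunks:
        totals[file_name] += score
    n_chunks = len(chunks)
    ranked = sorted(
        ((name, total / n_chunks) for name, total in totals.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:files_top_k]


# Toy input: two strong Tesla 2019 chunks outweigh one stronger Apple chunk.
example_chunks = [
    ("tesla_2019.pdf", 0.90),
    ("tesla_2019.pdf", 0.80),
    ("apple_2019.pdf", 0.95),
    ("tesla_2020.pdf", 0.50),
]
top_files = rank_files(example_chunks, files_top_k=1)
```

Under this weighting a file with several relevant chunks can outrank a file with a single, slightly better chunk, which is the behavior you would want for whole-document questions.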

The chunk-level LlamaCloud retriever is our default retriever. It returns chunks via hybrid search followed by reranking.

In [3]:
from llama_index.indices.managed.llama_cloud import LlamaCloudIndex
import os

index = LlamaCloudIndex(
  name="apple_tesla_demo_base",
  project_name="llamacloud_demo",
  api_key=os.environ["LLAMA_CLOUD_API_KEY"]
)

#### Define File Retriever

In this section we define the file-level retriever. By default we use `retrieval_mode="files_via_content"`, but you can also change it to `files_via_metadata`.

In [4]:
doc_retriever = index.as_retriever(
    retrieval_mode="files_via_content",
    # retrieval_mode="files_via_metadata",
    files_top_k=1
)

In [5]:
nodes = doc_retriever.retrieve("Give me a summary of Tesla in 2019") 

In [None]:
print(len(nodes))
print(nodes[0].get_content(metadata_mode="all"))

In [13]:
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")
query_engine_doc = RetrieverQueryEngine.from_args(
    doc_retriever, 
    llm=llm,
    response_mode="tree_summarize"
)

In [9]:
response = query_engine_doc.query("Give me a summary of Tesla in 2019")
print(str(response))

In 2019, Tesla, Inc. made significant strides in its mission to promote sustainable energy through the production of electric vehicles, solar energy systems, and energy storage solutions. The company achieved record vehicle deliveries and production, with 367,656 vehicles delivered and 365,232 produced. Key developments included the start of Model 3 production at Gigafactory Shanghai, preparations for Model Y production, and the unveiling of the Cybertruck. Enhancements to Autopilot and Full Self-Driving features improved user experience.

In the energy sector, Tesla saw a 48% increase in solar deployments in the latter half of the year and deployed 1.65 GWh of energy storage. Notable products launched included the third generation of the Solar Roof and the Megapack for utility-scale energy storage.

Financially, Tesla reported revenues of $24.58 billion, a 15% increase from the previous year, although it faced a net loss of $862 million, an improvement from the prior year's loss. The 

#### Define Chunk Retriever

The chunk-level retriever runs the search, reranks the results, and keeps the top `rerank_top_n=5` chunks.

In [14]:
chunk_retriever = index.as_retriever(
    retrieval_mode="chunks",
    rerank_top_n=5
)

llm = OpenAI(model="gpt-4o-mini")
query_engine_chunk = RetrieverQueryEngine.from_args(
    chunk_retriever, 
    llm=llm,
    response_mode="tree_summarize"
)
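
The effect of `rerank_top_n` can be illustrated with a toy two-stage pipeline: take a candidate list, re-score each candidate against the query, and keep only the best few. This is a sketch under stated assumptions, not LlamaCloud's reranker. The `overlap_score` function is a hypothetical stand-in for a real reranking model, and `rerank` is a hypothetical helper.

```python
from typing import Callable, List


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words)


def rerank(
    query: str,
    candidates: List[str],
    rerank_top_n: int = 5,
    scorer: Callable[[str, str], float] = overlap_score,
) -> List[str]:
    """Re-score candidates against the query and keep the top `rerank_top_n`."""
    ordered = sorted(candidates, key=lambda text: scorer(query, text), reverse=True)
    return ordered[:rerank_top_n]


candidates = [
    "apple revenue 2021",
    "tesla total revenue in 2021",
    "tesla vehicles",
]
best = rerank("tesla revenue 2021", candidates, rerank_top_n=2)
```

A smaller `rerank_top_n` gives the LLM less context to synthesize over, which is cheaper but risks dropping a chunk that holds the answer.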

## Build an Agent

In this section we build an agent that takes both the file-level and chunk-level query engines as tools. It decides which query engine to call depending on the nature of the question.

In [15]:
from llama_index.core.tools import FunctionTool, ToolMetadata, QueryEngineTool


# this variable tells the agent specific properties about your document.
doc_metadata_extra_str = """\
Each document represents a complete 10K report for a given year (e.g. Apple in 2019). 
Here's an example of relevant documents:
1. apple_2019.pdf
2. tesla_2020.pdf
"""

tool_doc_description = f"""\
Synthesizes an answer to your question by feeding in an entire relevant document as context. Best used for higher-level summarization questions.
Do NOT use if the answer can be found in a specific chunk of a given document. Use the chunk_query_engine instead for that purpose.

Below we give details on the format of each document:
{doc_metadata_extra_str}

"""

tool_chunk_description = f"""\
Synthesizes an answer to your question by feeding in a relevant chunk as context. Best used for questions that are more pointed in nature.
Do NOT use if the question seems to require a general summary of any given document. Use the doc_query_engine instead for that purpose.

Below we give details on the format of each document:
{doc_metadata_extra_str}
"""

tool_doc = QueryEngineTool(
    query_engine=query_engine_doc,
    metadata=ToolMetadata(
        name="doc_query_engine",
        description=tool_doc_description
    ),
)
tool_chunk = QueryEngineTool(
    query_engine=query_engine_chunk,
    metadata=ToolMetadata(
        name="chunk_query_engine",
        description=tool_chunk_description
    ),
)

In [18]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner
from llama_index.llms.openai import OpenAI

llm_agent = OpenAI(model="gpt-4o")
agent = FunctionCallingAgentWorker.from_tools(
    [tool_doc, tool_chunk], llm=llm_agent, verbose=True
).as_agent()

In [19]:
response = agent.chat("Tell me the revenue for Apple and Tesla in 2021?")

Added user message to memory: Tell me the revenue for Apple and Tesla in 2021?
=== Calling Function ===
Calling function: chunk_query_engine with args: {"input": "What was Apple's revenue in 2021?"}
=== Function Output ===
Apple's revenue in 2021 was $365.8 billion, which includes net sales from various regions and product categories.
=== Calling Function ===
Calling function: chunk_query_engine with args: {"input": "What was Tesla's revenue in 2021?"}
=== Function Output ===
Tesla's total revenue in 2021 was $53,823 million.
=== LLM Response ===
In 2021, Apple's revenue was $365.8 billion, while Tesla's revenue was $53.823 billion.
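
With `verbose=True` the routing decision is visible in the printed trace, but it can also be useful to check it programmatically. The sketch below assumes the agent's response exposes a `sources` list whose items carry a `tool_name` attribute. The helper `tools_used` is hypothetical, and it falls back gracefully if those attributes are absent.

```python
from collections import Counter
from typing import Any, Dict


def tools_used(response: Any) -> Dict[str, int]:
    """Count how many times each tool appears in an agent response's sources.

    Assumes `response.sources` is a list of tool outputs with a `tool_name`
    attribute; missing attributes are treated as no sources / "unknown".
    """
    sources = getattr(response, "sources", None) or []
    counts = Counter(getattr(source, "tool_name", "unknown") for source in sources)
    return dict(counts)


# Example usage with the agent response from the cell above:
# print(tools_used(response))
```

If the assumption holds, the revenue question above should report two calls to `chunk_query_engine`, matching the two function calls shown in the trace.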


In [20]:
response = agent.chat("Tell me the tailwinds for Apple and Tesla in 2021?")

Added user message to memory: Tell me the tailwinds for Apple and Tesla in 2021?
=== Calling Function ===
Calling function: chunk_query_engine with args: {"input": "What were the tailwinds for Apple in 2021?"}
=== Function Output ===
In 2021, Apple experienced several positive factors contributing to its growth. Key tailwinds included increased net sales of iPhone, Services, and Mac across various regions, including the Americas, Europe, Greater China, Japan, and the Rest of Asia Pacific. Additionally, favorable currency movements, particularly the strength of the Chinese renminbi and the impact of foreign currencies in Europe and the Rest of Asia Pacific, positively influenced net sales. The successful launch of new iPhone models and a favorable mix of iPhone sales also played a significant role in driving revenue growth.
=== Calling Function ===
Calling function: chunk_query_engine with args: {"input": "What were the tailwinds for Tesla in 2021?"}
=== Function Output ===
In 2021, Tes

In [21]:
response = agent.chat("How was apple doing generally in 2019?")

Added user message to memory: How was apple doing generally in 2019?
=== Calling Function ===
Calling function: doc_query_engine with args: {"input": "How was Apple doing generally in 2019?"}
=== Function Output ===
In 2019, Apple experienced a slight decline in total net sales, reporting $260.2 billion compared to $265.6 billion in 2018. The decrease was primarily driven by a significant drop in iPhone sales, which fell by 14% to $142.4 billion. However, other product categories showed growth; iPad sales increased by 16%, and the Wearables, Home and Accessories segment saw a substantial rise of 41%. Additionally, services revenue grew by 16%, reaching $46.3 billion, reflecting strong performance in digital content and subscription services.

Despite the overall decline in net sales, Apple maintained a solid net income of $55.3 billion, although this was lower than the previous year's $59.5 billion. The company also continued to invest heavily in research and development, with expenses