<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/multi_document_agents.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Multi-Document Agents

In this guide, you will learn how to set up an agent that can effectively answer different types of questions over a larger set of documents.

These questions include the following:

- QA over a specific doc
- QA comparing different docs
- Summaries over a specific doc
- Comparing summaries between different docs

We do this with the following architecture:

- Set up a "document agent" over each document: each document agent can do QA and summarization within its document.
- Set up a top-level agent over this set of document agents. It performs tool retrieval and then chain-of-thought (CoT) reasoning over the retrieved tools to answer a question.
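The two-level structure can be illustrated without any LLM. The following is a minimal sketch, not the LlamaIndex implementation: word overlap stands in for embedding-based tool retrieval, and the `Tool` class, `pick_tool`, `make_document_agent`, `toy_top_agent`, and the two toy document texts are all hypothetical names and data made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]


def words(text: str) -> set:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def pick_tool(query: str, tools: List[Tool]) -> Tool:
    """Toy tool retrieval: choose the tool whose description shares the most words."""
    return max(tools, key=lambda tool: len(words(query) & words(tool.description)))


def make_document_agent(doc_name: str, text: str) -> Callable[[str], str]:
    """Build a toy document agent that owns a 'vector' tool and a 'summary' tool."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    inner_tools = [
        Tool(
            "vector_tool",
            "specific question: who, what, which, how many",
            # Return the single sentence that best matches the query.
            lambda q: max(sentences, key=lambda s: len(words(q) & words(s))),
        ),
        Tool(
            "summary_tool",
            "summary, summarize, overview of everything",
            # Stand-in for summarization: return the opening sentence.
            lambda q: sentences[0],
        ),
    ]

    def agent(query: str) -> str:
        tool = pick_tool(query, inner_tools)
        return f"[{doc_name}/{tool.name}] {tool.run(query)}"

    return agent


# Made-up toy texts, not the papers used in this notebook.
toy_docs = {
    "Document_1": "The alpha paper studies rocks. It counts twelve wells.",
    "Document_2": "The beta paper describes a search platform. It runs in a cloud.",
}
top_tools = [
    Tool(f"tool_{name}", f"questions about {name}: {text}", make_document_agent(name, text))
    for name, text in toy_docs.items()
]


def toy_top_agent(query: str) -> str:
    """Top level: retrieve one document agent, then delegate the query to it."""
    return pick_tool(query, top_tools).run(query)


print(toy_top_agent("how many wells"))
print(toy_top_agent("summarize Document_2"))
```

The point of the sketch is the shape of the call chain: the top level only decides which document agent to call, and each document agent only decides which of its own two tools to use.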

## Setup and Download Data

In this section, we define imports and then load a small set of research papers from local text files. Each paper is stored in a separate file under `data/`.

We load three papers here. This is far from "hundreds" of documents, but the same architecture applies once a collection is large enough to warrant top-level document retrieval.

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

In [2]:
from llama_index.core import (
    VectorStoreIndex,
    SimpleKeywordTableIndex,
    SimpleDirectoryReader,
)
from llama_index.core import SummaryIndex
from llama_index.core.schema import IndexNode
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.callbacks import CallbackManager

In [3]:
wiki_titles = [
    "A Case Study of Understanding the Bonaparte Basin using Unstructured Data Analysis with Machine Learning Techniques",
    "An Automated Information Retrieval Platform For Unstructured Well Data Utilizing Smart Machine Learning Algorithms Within A Hybrid Cloud Container",
    "Supporting the UN 2050 Net Zero goals by reading the earth better",
]

In [4]:
# Load each paper from its own text file
city_docs = {}
for wiki_title in wiki_titles:
    city_docs[wiki_title] = SimpleDirectoryReader(
        input_files=[f"data/{wiki_title}.txt"]
    ).load_data()

In [5]:
print(city_docs)

{'A Case Study of Understanding the Bonaparte Basin using Unstructured Data Analysis with Machine Learning Techniques': [Document(id_='246df66b-6e36-4797-a348-80136f389a35', embedding=None, metadata={'file_path': 'data\\A Case Study of Understanding the Bonaparte Basin using Unstructured Data Analysis with Machine Learning Techniques.txt', 'file_name': 'A Case Study of Understanding the Bonaparte Basin using Unstructured Data Analysis with Machine Learning Techniques.txt', 'file_type': 'text/plain', 'file_size': 9304, 'creation_date': '2024-07-04', 'last_modified_date': '2024-07-02'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='Title:\r\nA CASE STUDY OF UNDERSTANDING THE BONAPARTE BASIN USING UNSTRUCTURED DATA ANALYSIS WITH MACHINE LEARNING TECHNIQUE

### Define Global LLM and Embeddings

In [141]:
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

llm = Ollama(
    model="phi3:medium",
    base_url="https://7a6b-35-201-236-203.ngrok-free.app",
    request_timeout=200.0
)

Settings.llm = llm
Settings.embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
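Because the `base_url` above points at a temporary tunnel, it can be worth checking that the server is reachable and serves the requested model before building any agents. The sketch below assumes the server exposes Ollama's `/api/tags` model-listing endpoint; `fetch_json` and `model_available` are hypothetical helper names, and the `fetch` parameter exists only so the check can be exercised without a network.

```python
import json
from typing import Callable
from urllib.request import urlopen


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Fetch a URL and decode its JSON body (raises on HTTP or network errors)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def model_available(
    base_url: str,
    model: str,
    fetch: Callable[[str], dict] = fetch_json,
) -> bool:
    """Return True if `model` is listed by the server at `base_url`."""
    try:
        payload = fetch(base_url.rstrip("/") + "/api/tags")
    except (OSError, ValueError):
        # Network failures, HTTP errors (such as 404), and invalid JSON
        # all mean the endpoint is not usable.
        return False
    names = [entry.get("name", "") for entry in payload.get("models", [])]
    return model in names
```

In the notebook this could be called as `model_available(llm.base_url, "phi3:medium")`, assuming the `Ollama` object exposes the `base_url` it was constructed with; a `False` result suggests fixing the endpoint before running any queries.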

## Building Multi-Document Agents

In this section we show you how to construct the multi-document agent. We first build a document agent for each document, and then define the top-level parent agent with an object index.

### Build Document Agent for each Document

In this section we define "document agents" for each document.

We define both a vector index (for semantic search) and a summary index (for summarization) for each document. The two query engines are then converted into tools that are passed to a `ReActAgent`.

This document agent can dynamically choose to perform semantic search or summarization within a given document.

We create a separate document agent for each paper.

In [142]:
from llama_index.core import load_index_from_storage, StorageContext
from llama_index.core.node_parser import SentenceSplitter
import os
from llama_index.core.agent import ReActAgent

node_parser = SentenceSplitter(chunk_size=1500, chunk_overlap=300)

# Build agents dictionary
agents = {}
query_engines = {}

# this is for the baseline
all_nodes = []

for idx, wiki_title in enumerate(wiki_titles):
    nodes = node_parser.get_nodes_from_documents(city_docs[wiki_title])
    all_nodes.extend(nodes)

    if not os.path.exists(f"./data/{wiki_title}"):
        # build vector index
        vector_index = VectorStoreIndex(nodes)
        vector_index.storage_context.persist(
            persist_dir=f"./data/{wiki_title}"
        )
    else:
        vector_index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=f"./data/{wiki_title}"),
        )

    # build summary index
    summary_index = SummaryIndex(nodes)
    # define query engines
    vector_query_engine = vector_index.as_query_engine(llm=Settings.llm)
    summary_query_engine = summary_index.as_query_engine(llm=Settings.llm)
    
    s = f"Document_{idx+1}"
    
    # define tools
    query_engine_tools = [
        QueryEngineTool(
            query_engine=vector_query_engine,
            metadata=ToolMetadata(
                name="vector_tool",
                description=(
                    f"Useful for questions related to specific aspects of {s}"
                    f" {wiki_title} (e.g. the title, authors,"
                    " content, references, or more)."
                ),
            ),
        ),
        QueryEngineTool(
            query_engine=summary_query_engine,
            metadata=ToolMetadata(
                name="summary_tool",
                description=(
                    "Useful for any requests that require a holistic summary"
                    f" of EVERYTHING about {s} which is about {wiki_title}. For questions about"
                    " more specific sections, please use the vector_tool."
                ),
            ),
        ),
    ]
    
    # build agent
    function_llm = Settings.llm
    agent = ReActAgent.from_tools(
        query_engine_tools,
        llm=function_llm,
        verbose=True,
        system_prompt=f"""\
You are a specialized agent designed to answer queries about {s} which is about {wiki_title}.
You must ALWAYS use at least one of the tools provided when answering a question; do NOT rely on prior knowledge.\
""",
    )
    
    print(s)
    agents[s] = agent
    query_engines[s] = vector_index.as_query_engine(
        similarity_top_k=1
    )

Document_1
Document_2
Document_3
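The `SentenceSplitter(chunk_size=1500, chunk_overlap=300)` call above cuts each paper into overlapping chunks, so text near a chunk boundary appears in both neighbours. The following is a simplified sketch of that sliding-window idea only: it counts whitespace-separated words, whereas `SentenceSplitter` measures tokens and tries to respect sentence boundaries. `chunk_words` is a hypothetical helper, not a LlamaIndex function.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into word windows of `chunk_size` that share `chunk_overlap` words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


print(chunk_words("a b c d e f g h i j", chunk_size=4, chunk_overlap=1))
```

A larger overlap reduces the chance that an answer is split across two chunks, at the cost of indexing more redundant text.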


### Build Retriever-Enabled ReAct Agent

We build a top-level agent that can orchestrate across the different document agents to answer any user query.

This agent takes in all document agents as tools. It is a `ReActAgent` constructed with a `tool_retriever`, so it performs tool retrieval before tool use (unlike a default agent that tries to put all tools in the prompt).

Here we use a top-k retriever, but we encourage you to customize the tool retriever method!
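To make the top-k step concrete, here is a minimal sketch of ranking tools by the similarity between the query and each tool description. It uses bag-of-words cosine similarity as a stand-in for the embedding similarity that the object index relies on; `bag_of_words`, `cosine`, and `retrieve_tools` are hypothetical helper names, and the descriptions are just the three paper titles.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def bag_of_words(text: str) -> Counter:
    """Lowercase word counts, ignoring punctuation."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_tools(query: str, descriptions: Dict[str, str], top_k: int = 1) -> List[str]:
    """Return the names of the `top_k` tools whose descriptions best match the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(
        descriptions,
        key=lambda name: cosine(query_vec, bag_of_words(descriptions[name])),
        reverse=True,
    )
    return ranked[:top_k]


tool_descriptions = {
    "tool_Document_1": "A Case Study of Understanding the Bonaparte Basin using Unstructured Data Analysis with Machine Learning Techniques",
    "tool_Document_2": "An Automated Information Retrieval Platform For Unstructured Well Data Utilizing Smart Machine Learning Algorithms Within A Hybrid Cloud Container",
    "tool_Document_3": "Supporting the UN 2050 Net Zero goals by reading the earth better",
}

print(retrieve_tools("Who wrote the Bonaparte Basin paper", tool_descriptions, top_k=1))
print(retrieve_tools("Who wrote the Bonaparte Basin paper", tool_descriptions, top_k=2))
```

With `top_k=1` the agent sees exactly one tool, so a wrong retrieval cannot be corrected later in the reasoning loop. Raising `similarity_top_k` gives the agent more candidates to choose from at the cost of a longer prompt.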


In [143]:
# define tool for each document agent
all_tools = []
for idx, wiki_title in enumerate(wiki_titles):
    s = f"Document_{idx+1}"
    wiki_summary = (
        f"This content contains {s} information about {wiki_title}. Use"
        f" this tool if you want to answer any questions about {wiki_title}.\n"
    )
    print(s)
    doc_tool = QueryEngineTool(
        query_engine=agents[s],
        metadata=ToolMetadata(
            name=f"tool_{s}",
            description=wiki_summary,
        ),
    )
    all_tools.append(doc_tool)

Document_1
Document_2
Document_3


In [144]:
# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

In [145]:
content = ""
for idx, doc in enumerate(wiki_titles):
    content += f"DOCUMENT {idx+1} refers to {doc}\n"

In [146]:
top_agent = ReActAgent.from_tools(
    tool_retriever=obj_index.as_retriever(similarity_top_k=1),
    system_prompt=f""" \
You are an agent designed to answer queries about a set of research papers.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.
Additionally, {content}. \n if any of those are called, use the tools depending on their document number.
\

""",
    verbose=True,
)

### Define Baseline Vector Store Index

As a point of comparison, we define a "naive" RAG pipeline which dumps all docs into a single vector index collection.

We set `similarity_top_k=4`.

In [147]:
base_index = VectorStoreIndex(all_nodes)
base_query_engine = base_index.as_query_engine(similarity_top_k=4)

## Running Example Queries

Let's run some example queries, ranging from QA / summaries over a single document to QA / summarization over multiple documents.

In [148]:
# should use the Document_1 agent -> vector tool
response = top_agent.chat("Tell me the author of the Bonaparte Basin Paper")

HTTPStatusError: Client error '404 Not Found' for url 'https://7a6b-35-201-236-203.ngrok-free.app/api/chat'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404

In [None]:
print(response)

In [None]:
# should use a document agent -> vector tool
response = top_agent.chat("What data was used in the paper?")

In [149]:
print(response)

The introduction of document 1 is about a case study that uses machine learning techniques to analyze unstructured data from the Bonaparte Basin, leading to improvements in analysis and decision-making time.


In [150]:
# should use a document agent -> vector tool
response = top_agent.chat("How many wells were used in the paper?")

HTTPStatusError: Client error '404 Not Found' for url 'https://7a6b-35-201-236-203.ngrok-free.app/api/chat'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404

In [77]:
print(response)

The paper mentions 440 wells.


In [79]:
# should use the Document_2 agent -> summary tool
response = top_agent.chat("Tell me something about document 2")

Thought: I need to use a tool to help me answer this question.
Action: tool_Document_3
Action Input: {'input': 'Document 2', 'type': 'object'}
Thought: The current input is "Document 2". I need to use a tool to help me answer the question.
Action: summary_tool
Action Input: {'input': 'Document 2'}
Observation: Strong, flexible technical skills will be needed to explore for suitable carbon capture facilities, assess their storage, containment and injectivity capacities. Meanwhile, the new energy industry will continue to gather, integrate and analyse empirical data, whether it is on the reservoir, sub-surface or at hydrocarbon or future hydrogen production facilities.
Thought: The user provided some text related to Document 2. I'll use the vector_tool to help me analyze this specific section.
Action: vector_tool
Action Input: {'input': 'Strong, flexible technical skills will be needed to explore for suitable carbon captur

In [80]:
print(response)

It seems that Document 2 is discussing the importance of strong technical skills, particularly geological knowledge, for exploring suitable carbon capture facilities and assessing their capabilities. The text also emphasizes the need for engineers to leverage diverse data sources to accelerate the transition towards cleaner energy.


In [81]:
# should use the Document_2 agent -> vector tool
response = top_agent.chat("In connection with the second question, who are the authors of the paper")

Thought: (Implicit) I can answer without any more tools!
Answer: I apologize, but since we only have access to Document 1, I don't have any information about Document 2 or its authors.

In [82]:
print(response)
# should use a document agent -> vector tool
response = top_agent.chat("Okay, can you tell me who is the author of that document")

I apologize, but since we only have access to Document 1, I don't have any information about Document 2 or its authors.
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: tool_Document_1
Action Input: {'input': 'author', 'type': 'string'}
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: summary_tool
Action Input: {'input': 'A Case Study of Understanding the Bonaparte Basin using Unstructured Data Analysis with Machine Learning Techniques'}
Observation: As part of exploration and production, the oil and gas industry produces substantial amounts of data within different disciplines, with 80% being unstructured like reports, presentations, spreadsheets, etc. To assist geoscientists and engineers, Machine Learning (ML) and Artificial Intelligence (AI) technologies are applied to process the unstructured data from

In [116]:
# baseline
response = base_query_engine.query(
    "Tell me the author of the Bonaparte Basin Paper"
)
print(str(response))

The new energy industry will continue to gather, integrate and analyze empirical data, whether it is on the reservoir, sub-surface or at hydrocarbon or future hydrogen production facilities.


In [None]:
# baseline: the response does not name the paper's authors; inspect one of the retrieved chunks
response.source_nodes[3].get_content()
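Since the baseline pools all papers into one index, it helps to see which file each retrieved chunk came from. The sketch below assumes each entry in `response.source_nodes` exposes `.node.metadata` with a `file_name` key, as in the `Document` metadata printed earlier; `count_sources` is a hypothetical helper, and the stand-in objects and file names are made up so the example runs on its own.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Iterable


def count_sources(source_nodes: Iterable) -> Counter:
    """Count retrieved chunks per originating file."""
    return Counter(
        item.node.metadata.get("file_name", "unknown") for item in source_nodes
    )


# Stand-in objects shaped like retrieved nodes (hypothetical data for illustration).
fake_nodes = [
    SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "paper_a.txt"})),
    SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "paper_b.txt"})),
    SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "paper_a.txt"})),
    SimpleNamespace(node=SimpleNamespace(metadata={})),
]

print(count_sources(fake_nodes))
```

In the notebook, `count_sources(response.source_nodes)` would show whether the four retrieved chunks come from the paper the question was about or are spread across unrelated papers.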

In [None]:
response = top_agent.query(
    "Tell me the differences between Shanghai and Beijing in terms of history"
    " and current economy"
)

=== Calling Function ===
Calling function: tool_Shanghai with args: {
  "input": "history"
}
=== Calling Function ===
Calling function: vector_tool with args: {
  "input": "history"
}
Got output: Shanghai has a rich history that dates back to ancient times. However, in the context provided, the history of Shanghai is mainly discussed in relation to its modern development. After the war, Shanghai's economy experienced significant growth, with increased agricultural and industrial output. The city's administrative divisions were rearranged, and it became a center for radical leftism during the 1950s and 1960s. The Cultural Revolution had a severe impact on Shanghai's society, but the city maintained economic production with a positive growth rate. Shanghai also played a significant role in China's Third Front campaign and has been a major contributor of tax revenue to the central government. Economic reforms were initiated in Shanghai in 1990, leading to the development of the Pudong dis

In [None]:
print(str(response))

In terms of history, both Shanghai and Beijing have rich and complex pasts. Shanghai's history dates back to ancient times, but its modern development is particularly noteworthy. It experienced significant economic growth after the war and played a major role in China's economic reforms. Beijing, on the other hand, has a history that spans several dynasties and served as the capital during the Ming and Qing dynasties. It has preserved its historical heritage while evolving into a modern metropolis.

In terms of current economy, Shanghai is a global center for finance and innovation. It has a diverse economy and has experienced rapid development, with a high GDP and significant foreign investment. It is a major player in the global financial industry and is home to the Shanghai Stock Exchange. Beijing's economy is primarily driven by the tertiary sector, with a focus on services such as professional services, information technology, and commercial real estate. It has identified high-end

In [None]:
# baseline
response = base_query_engine.query(
    "Tell me the differences between Shanghai and Beijing in terms of history"
    " and current economy"
)
print(str(response))

Shanghai and Beijing have distinct differences in terms of history and current economy. Historically, Shanghai was the largest and most prosperous city in East Asia during the 1930s, while Beijing served as the capital of the Republic of China and later the People's Republic of China. Shanghai experienced significant growth and redevelopment in the 1990s, while Beijing expanded its urban area and underwent rapid development in the last two decades.

In terms of the current economy, Shanghai is considered the "showpiece" of China's booming economy. It is a global center for finance and innovation, with a strong focus on industries such as retail, finance, IT, real estate, machine manufacturing, and automotive manufacturing. Shanghai is also home to the world's busiest container port, the Port of Shanghai. The city has a high GDP and is classified as an Alpha+ city by the Globalization and World Cities Research Network.

On the other hand, Beijing is a global financial center and ranks t