# Building Agentic RAG with LlamaIndex - Complete Notebook Content

This notebook contains all lessons from the course on building agentic RAG systems using LlamaIndex.

# Setup and Installation
First, let's install the required packages.

In [1]:
!pip install llama-index
!pip install llama-index-llms-openai
!pip install llama-index-embeddings-openai
!pip install nest-asyncio
!pip install openai



## Set up OpenAI API Key

In [2]:
# Set up OpenAI API Key
import os
from getpass import getpass

# You can either set it directly or use getpass for security
OPENAI_API_KEY = getpass("Enter your OpenAI API Key:")
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

Enter your OpenAI API Key:··········
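The comment in the cell above mentions two ways to supply the key. A minimal sketch that combines them is shown here: it reuses a key already present in the environment and only prompts when none is found. The helper name `resolve_api_key` and its `prompt_fn` parameter are illustrative assumptions, not part of the course code.

```python
import os
from getpass import getpass
from typing import Callable


def resolve_api_key(
    env_var: str = "OPENAI_API_KEY",
    prompt_fn: Callable[[str], str] = getpass,
) -> str:
    """Return the key from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; `prompt_fn` is injectable so the
    function can be exercised without an interactive prompt.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fall back to an interactive, non-echoing prompt.
        key = prompt_fn(f"Enter your {env_var}: ").strip()
        os.environ[env_var] = key
    return key


# In the notebook this could stand in for the getpass cell:
# OPENAI_API_KEY = resolve_api_key()
```

This avoids retyping the key every time the kernel restarts if the variable is already exported in the shell that launched Jupyter.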


In [3]:
import nest_asyncio
nest_asyncio.apply()

# Lesson 1: Router Engine

## Load Data
Download the MetaGPT paper:

In [4]:
!wget "https://openreview.net/pdf?id=VtmBAGCN7o" -O metagpt.pdf

--2025-10-28 17:26:26--  https://openreview.net/pdf?id=VtmBAGCN7o
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16911937 (16M) [application/pdf]
Saving to: ‘metagpt.pdf’


2025-10-28 17:26:27 (68.3 MB/s) - ‘metagpt.pdf’ saved [16911937/16911937]



In [5]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

## Define LLM and Embedding Model

In [6]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)

In [7]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-3.5-turbo")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

## Define Summary Index and Vector Index

In [8]:
from llama_index.core import SummaryIndex, VectorStoreIndex

summary_index = SummaryIndex(nodes)
vector_index = VectorStoreIndex(nodes)

## Define Query Engines and Set Metadata

In [9]:
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()

In [10]:
from llama_index.core.tools import QueryEngineTool

summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description=(
        "Useful for summarization questions related to MetaGPT"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description=(
        "Useful for retrieving specific context from the MetaGPT paper."
    ),
)

## Define Router Query Engine

In [11]:
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
    verbose=True
)
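To make the routing step concrete, here is a toy sketch of the control flow: one tool is selected from its description, then the query is forwarded to it. This is not how `LLMSingleSelector` works internally; the LLM-based choice is replaced by a hypothetical word-overlap score, and `ToyTool`, `select_tool`, and `route` are invented names for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyTool:
    description: str
    run: Callable[[str], str]


def select_tool(query: str, tools: List[ToyTool]) -> int:
    """Return the index of the tool whose description best overlaps the query.

    Stand-in for an LLM selector: counts shared lowercase words.
    """
    query_words = set(query.lower().split())
    scores = [
        len(query_words & set(tool.description.lower().split()))
        for tool in tools
    ]
    return scores.index(max(scores))


def route(query: str, tools: List[ToyTool]) -> Tuple[int, str]:
    """Select exactly one tool, then forward the query to it."""
    index = select_tool(query, tools)
    return index, tools[index].run(query)


toy_tools = [
    ToyTool("summary of the whole document", lambda q: "summary engine"),
    ToyTool("specific facts and details about agents", lambda q: "vector engine"),
]

print(route("give me a summary of the document", toy_tools))
print(route("how do agents share details", toy_tools))
```

The point of the sketch is that the tool descriptions are the only signal the selector has, which is why the `description` strings passed to `QueryEngineTool.from_defaults` matter.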

In [12]:
response = query_engine.query("What is the summary of the document?")
print(str(response))
print(len(response.source_nodes))

Selecting query engine 0: This choice indicates that the document is useful for summarization questions related to MetaGPT..
The document introduces MetaGPT, a meta-programming framework that utilizes Standardized Operating Procedures (SOPs) to enhance multi-agent systems based on Large Language Models (LLMs). It outlines the roles of various agents in the software development process and discusses the efficient workflows, structured communication interfaces, and executable feedback mechanism incorporated in the framework. Through extensive experiments, MetaGPT demonstrates state-of-the-art performance on different benchmarks, showcasing its potential for developing LLM-based multi-agent systems. The document also provides insights into the development process of a software application called the "Drawing App" using MetaGPT, highlighting stages like requirements gathering, UI design, system architecture, implementation approach, testing, and task allocation. It discu

In [13]:
response = query_engine.query(
    "How do agents share information with other agents?"
)
print(str(response))

Selecting query engine 1: This choice is more relevant as it specifically mentions retrieving specific context from the MetaGPT paper, which would likely include information on how agents share information with other agents..
Agents share information with other agents by utilizing a shared message pool where they can publish structured messages and subscribe to relevant messages based on their profiles. This shared message pool allows all agents to exchange messages directly, enabling them to access messages from other entities transparently. By storing information in this global message pool, agents can retrieve required information without the need to inquire about other agents and wait for their responses, thus enhancing communication efficiency.


# Lesson 2: Tool Calling

### 1. Define a Simple Tool

In [14]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)

add_tool = FunctionTool.from_defaults(fn=add)
mystery_tool = FunctionTool.from_defaults(fn=mystery)

In [15]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")
response = llm.predict_and_call(
    [add_tool, mystery_tool],
    "Tell me the output of the mystery function on 2 and 9",
    verbose=True
)
print(str(response))

=== Calling Function ===
Calling function: mystery with args: {"x": 2, "y": 9}
=== Function Output ===
121
121
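The model chose `mystery` over `add` using only what each tool exposes: its name, its typed parameters, and its docstring. The sketch below uses the standard library to show roughly what kind of information can be read off a plain function. It is an illustration of the idea, not LlamaIndex's actual schema-building code, and `describe_function` is a hypothetical helper.

```python
import inspect
from typing import Any, Callable, Dict


def describe_function(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Collect the name, docstring, and parameter types of a function."""
    signature = inspect.signature(fn)
    parameters = {
        name: getattr(param.annotation, "__name__", str(param.annotation))
        for name, param in signature.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": parameters,
    }


def mystery(x: int, y: int) -> int:
    """Mystery function that operates on top of two numbers."""
    return (x + y) * (x + y)


info = describe_function(mystery)
print(info)
print(mystery(2, 9))
```

Because the docstring is the only description the model sees, a vague docstring such as the one on `mystery` gives it little to reason with beyond the function name.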


### 2. Define an Auto-Retrieval Tool

In [16]:
from llama_index.core import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader(input_files=["metagpt.pdf"]).load_data()

In [17]:
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024)
nodes = splitter.get_nodes_from_documents(documents)
print(nodes[0].get_content(metadata_mode="all"))

page_label: 1
file_name: metagpt.pdf
file_path: metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2025-10-28
last_modified_date: 2025-10-28

Preprint
METAGPT: M ETA PROGRAMMING FOR A
MULTI -AGENT COLLABORATIVE FRAMEWORK
Sirui Hong1∗, Mingchen Zhuge2∗, Jonathan Chen1, Xiawu Zheng3, Yuheng Cheng4,
Ceyao Zhang4, Jinlin Wang1, Zili Wang, Steven Ka Shing Yau5, Zijuan Lin4,
Liyang Zhou6, Chenyu Ran1, Lingfeng Xiao1,7, Chenglin Wu1†, J¨urgen Schmidhuber2,8
1DeepWisdom, 2AI Initiative, King Abdullah University of Science and Technology,
3Xiamen University, 4The Chinese University of Hong Kong, Shenzhen,
5Nanjing University, 6University of Pennsylvania,
7University of California, Berkeley, 8The Swiss AI Lab IDSIA/USI/SUPSI
ABSTRACT
Remarkable progress has been made on automated problem solving through so-
cieties of agents based on large language models (LLMs). Existing LLM-based
multi-agent systems can already solve simple dialogue tasks. Solutions to more
complex tasks

In [18]:
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex(nodes)
query_engine = vector_index.as_query_engine(similarity_top_k=2)

In [19]:
from llama_index.core.vector_stores import MetadataFilters

query_engine = vector_index.as_query_engine(
    similarity_top_k=2,
    filters=MetadataFilters.from_dicts(
        [
            {"key": "page_label", "value": "2"}
        ]
    )
)

response = query_engine.query(
    "What are some high-level results of MetaGPT?",
)
print(str(response))
for n in response.source_nodes:
    print(n.metadata)

Some high-level results of MetaGPT include achieving a new state-of-the-art in code generation benchmarks with 85.9% and 87.7% in Pass@1, outperforming other popular frameworks like AutoGPT, LangChain, AgentVerse, and ChatDev. Additionally, MetaGPT demonstrates robustness and efficiency by achieving a 100% task completion rate in experimental evaluations, highlighting its effectiveness in handling higher levels of software complexity and offering extensive functionality.
{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2025-10-28', 'last_modified_date': '2025-10-28'}


### Define the Auto-Retrieval Tool

In [20]:
from typing import List, Optional
from llama_index.core.vector_stores import FilterCondition

def vector_query(
    query: str,
    page_numbers: Optional[List[str]] = None
) -> str:
    """Perform a vector search over an index.

    query (str): the string query to be embedded.
    page_numbers (Optional[List[str]]): Filter by set of pages. Leave as None to perform a vector search
        over all pages. Otherwise, filter by the set of specified pages.

    """

    # Treat a missing page list as "no page filter".
    page_numbers = page_numbers or []
    metadata_dicts = [
        {"key": "page_label", "value": p} for p in page_numbers
    ]

    query_engine = vector_index.as_query_engine(
        similarity_top_k=2,
        filters=MetadataFilters.from_dicts(
            metadata_dicts,
            condition=FilterCondition.OR
        )
    )
    response = query_engine.query(query)
    return response


vector_query_tool = FunctionTool.from_defaults(
    name="vector_tool",
    fn=vector_query
)
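The `FilterCondition.OR` argument above means a chunk is kept when its `page_label` equals any of the requested pages. A plain-Python sketch of that intended behavior follows, including the "no pages requested means search everything" case described in the docstring. The function `matches_any_page` and the toy metadata are assumptions made for illustration; the real filtering happens inside the vector store.

```python
from typing import Dict, List


def matches_any_page(metadata: Dict[str, str], page_numbers: List[str]) -> bool:
    """OR semantics: keep a node if its page_label is any requested page.

    An empty page list is treated as "no filter", mirroring the docstring
    of vector_query.
    """
    if not page_numbers:
        return True
    return any(metadata.get("page_label") == page for page in page_numbers)


toy_nodes = [{"page_label": "1"}, {"page_label": "2"}, {"page_label": "8"}]

kept = [node for node in toy_nodes if matches_any_page(node, ["2", "8"])]
unfiltered = [node for node in toy_nodes if matches_any_page(node, [])]
print(kept)
print(len(unfiltered))
```

Note that page labels are compared as strings, which matches the `'page_label': '2'` values printed in the metadata earlier.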

In [21]:
llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
response = llm.predict_and_call(
    [vector_query_tool],
    "What are the high-level results of MetaGPT as described on page 2?",
    verbose=True
)
for n in response.source_nodes:
    print(n.metadata)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "high-level results of MetaGPT", "page_numbers": ["2"]}
=== Function Output ===
MetaGPT achieves a new state-of-the-art (SoTA) in code generation benchmarks with 85.9% and 87.7% in Pass@1. It stands out in handling higher levels of software complexity and offering extensive functionality, demonstrating a 100% task completion rate in experimental evaluations.
{'page_label': '2', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2025-10-28', 'last_modified_date': '2025-10-28'}


### Add More Tools

In [22]:
from llama_index.core import SummaryIndex
from llama_index.core.tools import QueryEngineTool

summary_index = SummaryIndex(nodes)
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
summary_tool = QueryEngineTool.from_defaults(
    name="summary_tool",
    query_engine=summary_query_engine,
    description=(
        "Useful if you want to get a summary of MetaGPT"
    ),
)

In [23]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    "What are the MetaGPT comparisons with ChatDev described on page 8?",
    verbose=True
)
for n in response.source_nodes:
    print(n.metadata)

=== Calling Function ===
Calling function: vector_tool with args: {"query": "MetaGPT comparisons with ChatDev", "page_numbers": ["8"]}
=== Function Output ===
MetaGPT outperforms ChatDev on the SoftwareDev dataset in various metrics. For example, MetaGPT achieves a higher score in executability, takes less time for execution, requires more tokens but fewer tokens per line of code compared to ChatDev. Additionally, MetaGPT demonstrates autonomous software generation capabilities and highlights the benefits of using SOPs in collaborations between multiple agents.
{'page_label': '8', 'file_name': 'metagpt.pdf', 'file_path': 'metagpt.pdf', 'file_type': 'application/pdf', 'file_size': 16911937, 'creation_date': '2025-10-28', 'last_modified_date': '2025-10-28'}


In [24]:
response = llm.predict_and_call(
    [vector_query_tool, summary_tool],
    "What is a summary of the paper?",
    verbose=True
)

=== Calling Function ===
Calling function: summary_tool with args: {"input": "Please provide a summary of the paper."}
=== Function Output ===
The paper introduces MetaGPT, a meta-programming framework that enhances multi-agent systems based on Large Language Models (LLMs) through Standardized Operating Procedures (SOPs). It incorporates role specialization, workflow management, and efficient communication mechanisms to improve problem-solving capabilities. MetaGPT utilizes an executable feedback mechanism to enhance code generation quality during runtime and achieves state-of-the-art performance on various benchmarks. The framework facilitates natural language programming by transforming abstract requirements into detailed class and function designs, generating code for tasks like creating games, data processing, and developing GUI applications. It involves various agents like Architects, Engineers, and QA Engineers to ensure high-quality software output. The paper also discusses chal

# Lesson 3: Building an Agent Reasoning Loop

## Set Up Function Calling Agent

In [25]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0)

In [26]:
from llama_index.core.agent.workflow import FunctionAgent

agent = FunctionAgent(
    tools=[vector_tool, summary_tool],
    llm=llm,
    verbose=True
)

In [27]:
# For FunctionAgent - must use run() with await
response = await agent.run(
    "Tell me about the agent roles in MetaGPT, and how they communicate."
)
print(str(response))

Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced event StopEvent
In MetaGPT, the agent roles include Product Manager, Architect, Project Manager, Engineer, and QA Engineer. These agents communicate through a shared message pool where they publish structured messages and subscribe to relevant messages based on their profiles. This communication protocol enables agents to exchange information transparen
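The step log above alternates between an agent step, a tool call, and an aggregation step until a stop event is produced. The following toy loop sketches that pattern in plain Python. It is a simplified illustration, not the `FunctionAgent` implementation: the LLM is replaced by a hypothetical scripted `decide` function, and all names (`run_agent_loop`, `scripted_decide`) are invented for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Tuple[str, str]               # (role, content)
ToolCall = Tuple[str, Dict[str, int]]   # (tool name, keyword arguments)
Decision = Tuple[Optional[ToolCall], Optional[str]]


def run_agent_loop(
    decide: Callable[[List[Message]], Decision],
    tools: Dict[str, Callable[..., int]],
    question: str,
    max_steps: int = 5,
) -> Tuple[str, List[Message]]:
    """Alternate between deciding and calling tools until an answer appears."""
    history: List[Message] = [("user", question)]
    for _ in range(max_steps):
        tool_call, final_answer = decide(history)
        if tool_call is None:
            # No tool requested: the loop stops with the final answer.
            return final_answer or "", history
        name, kwargs = tool_call
        result = tools[name](**kwargs)
        # Feed the tool result back so the next decision can use it.
        history.append(("tool", str(result)))
    raise RuntimeError("No final answer within max_steps")


def scripted_decide(history: List[Message]) -> Decision:
    """Stand-in for the LLM: call `add` once, then answer with its result."""
    tool_results = [content for role, content in history if role == "tool"]
    if not tool_results:
        return ("add", {"x": 2, "y": 9}), None
    return None, f"The result is {tool_results[-1]}"


answer, history = run_agent_loop(
    scripted_decide, {"add": lambda x, y: x + y}, "What is 2 + 9?"
)
print(answer)
```

The `max_steps` guard is a design choice of this sketch: it keeps a misbehaving decision function from looping forever.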

In [28]:
response = await agent.run(
    "Tell me about the evaluation datasets used."
)
print(str(response))

Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced event StopEvent
The evaluation datasets used are HumanEval, MBPP, and SoftwareDev.


In [29]:
response = await agent.run("Tell me the results over one of the above datasets.")
print(str(response))

Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced event StopEvent
The results over the SoftwareDev dataset show that MetaGPT achieved an average score of 3.9, outperforming ChatDev's score of 2.1. Other general intelligent algorithms like AutoGPT scored 1.0, indicating a lower performance in generating executable code effectively. Models such as AutoGPT, Langchain, and Agent-Verse demonstrate strong pr

# Lesson 4: Building a Multi-Document Agent

## 1. Set Up an Agent Over 3 Papers

In [30]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=hSyW5go0v8",
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "selfrag.pdf",
]

In [31]:
# Download papers
for url, paper in zip(urls, papers):
    !wget "{url}" -O "{paper}"

--2025-10-28 17:27:39--  https://openreview.net/pdf?id=VtmBAGCN7o
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16911937 (16M) [application/pdf]
Saving to: ‘metagpt.pdf’


2025-10-28 17:27:40 (56.3 MB/s) - ‘metagpt.pdf’ saved [16911937/16911937]

--2025-10-28 17:27:40--  https://openreview.net/pdf?id=6PmJoRfdaK
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1168720 (1.1M) [application/pdf]
Saving to: ‘longlora.pdf’


2025-10-28 17:27:40 (10.3 MB/s) - ‘longlora.pdf’ saved [1168720/1168720]

--2025-10-28 17:27:40--  https://openreview.net/pdf?id=hSyW5go0v8
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP req

In [32]:
# Helper function to create tools for each paper
from pathlib import Path

def get_doc_tools(file_path: str, name: str):
    """Get vector and summary query engine tools from a document."""

    # Load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)

    # Create indices
    vector_index = VectorStoreIndex(nodes)
    summary_index = SummaryIndex(nodes)

    # Create query engines
    vector_query_engine = vector_index.as_query_engine()
    summary_query_engine = summary_index.as_query_engine(
        response_mode="tree_summarize",
        use_async=True,
    )

    # Create tools
    vector_tool = QueryEngineTool.from_defaults(
        query_engine=vector_query_engine,
        description=f"Useful for retrieving specific context from {name}.",
    )

    summary_tool = QueryEngineTool.from_defaults(
        query_engine=summary_query_engine,
        description=f"Useful for summarization questions related to {name}.",
    )

    return vector_tool, summary_tool
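The helper above does not pass `name=` to `QueryEngineTool.from_defaults`, so every tool it builds gets the same default name; the `ToolMetadata` output later in this notebook shows `name='query_engine_tool'`. When several papers are loaded, distinct names can help the model tell the tools apart. One possible way to derive them is sketched below. The helper `make_tool_name` is hypothetical, and the 64-character, `[a-zA-Z0-9_-]` restriction is an assumption based on OpenAI's function-name rules.

```python
import re


def make_tool_name(kind: str, paper_stem: str, max_len: int = 64) -> str:
    """Build a tool name such as 'vector_tool_metagpt'.

    Characters outside [a-zA-Z0-9_-] are replaced with underscores and the
    result is truncated to max_len (assumed limit, see the note above).
    """
    raw = f"{kind}_tool_{paper_stem}"
    cleaned = re.sub(r"[^a-zA-Z0-9_-]", "_", raw)
    return cleaned[:max_len]


print(make_tool_name("vector", "metagpt"))
print(make_tool_name("summary", "finetune_fair_diffusion"))
```

Inside `get_doc_tools`, the result could be passed as `name=make_tool_name("vector", name)` and `name=make_tool_name("summary", name)` in the two `from_defaults` calls.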

In [33]:
paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

Getting tools for paper: metagpt.pdf
Getting tools for paper: longlora.pdf
Getting tools for paper: selfrag.pdf


In [34]:
initial_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

In [35]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo")
print(f"Number of tools: {len(initial_tools)}")

Number of tools: 6


In [36]:
# LlamaIndex >= 0.14.6
from llama_index.core.agent.workflow import FunctionAgent
# If any items in `initial_tools` are plain Python functions, wrap them first:
# from llama_index.core.tools import FunctionTool
# initial_tools = [FunctionTool.from_defaults(fn) for fn in initial_tools]

agent = FunctionAgent(
    tools=initial_tools,
    llm=llm,
    verbose=True,  # optional: shows workflow logs
)


In [37]:
import asyncio

response = asyncio.run(agent.run(
    user_msg=(
        "Tell me about the evaluation dataset used in LongLoRA, "
        "and then tell me about the evaluation results"
    )
))
print(str(response))


Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced no event
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced event StopEvent
The evaluation dataset used in LongLoRA is KILT. 

The evaluation results of LongLoRA suggest that the model's predictions are frequently plausible and well-supported by relevant passages, 

In [38]:
import asyncio

response = asyncio.run(
    agent.run(
        user_msg="Give me a summary of both Self-RAG and LongLoRA"
    )
)
print(str(response))


Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced no event
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced event StopEvent
Self-RAG is a framework that enhances the quality and factuality of large language models through retrieval on demand and self-reflection. It trains a single LM to retrieve, generate, and c

## 2. Set Up an Agent Over 11 Papers

In [39]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=LzPWWPAdY4",
    "https://openreview.net/pdf?id=VTF8yNQM66",
    "https://openreview.net/pdf?id=hSyW5go0v8",
    "https://openreview.net/pdf?id=9WD9KwssyT",
    "https://openreview.net/pdf?id=yV6fD7LYkF",
    "https://openreview.net/pdf?id=hnrB5YHoYu",
    "https://openreview.net/pdf?id=WbWtOYIzIK",
    "https://openreview.net/pdf?id=c5pwL0Soay",
    "https://openreview.net/pdf?id=TpD2aG1h0D"
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "loftq.pdf",
    "swebench.pdf",
    "selfrag.pdf",
    "zipformer.pdf",
    "values.pdf",
    "finetune_fair_diffusion.pdf",
    "knowledge_card.pdf",
    "metra.pdf",
    "vr_mcl.pdf"
]

In [40]:
# Download all papers
for url, paper in zip(urls, papers):
    !wget "{url}" -O "{paper}"

--2025-10-28 17:28:09--  https://openreview.net/pdf?id=VtmBAGCN7o
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16911937 (16M) [application/pdf]
Saving to: ‘metagpt.pdf’


2025-10-28 17:28:11 (10.9 MB/s) - ‘metagpt.pdf’ saved [16911937/16911937]

--2025-10-28 17:28:11--  https://openreview.net/pdf?id=6PmJoRfdaK
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1168720 (1.1M) [application/pdf]
Saving to: ‘longlora.pdf’


2025-10-28 17:28:12 (7.47 MB/s) - ‘longlora.pdf’ saved [1168720/1168720]

--2025-10-28 17:28:12--  https://openreview.net/pdf?id=LzPWWPAdY4
Resolving openreview.net (openreview.net)... 35.184.86.251
Connecting to openreview.net (openreview.net)|35.184.86.251|:443... connected.
HTTP req

In [41]:
from pathlib import Path

paper_to_tools_dict = {}

for paper in papers:
    print(f"Getting tools for paper: {paper}")
    try:
        vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
        paper_to_tools_dict[paper] = [vector_tool, summary_tool]

    except UnicodeEncodeError as e:
        print(f" Unicode error while processing {paper}: {e}")
        try:
            # Attempt to re-read text safely and re-generate tools
            text_bytes = Path(paper).read_bytes()
            safe_text = text_bytes.decode("utf-8", errors="replace")

            # Optionally, save the cleaned text for inspection
            clean_path = Path(paper).with_name(Path(paper).stem + "_clean.txt")
            clean_path.write_text(safe_text, encoding="utf-8")
            print(f"Saved cleaned version: {clean_path}")

            # Retry tool creation; pass the path as a string to match get_doc_tools' signature
            vector_tool, summary_tool = get_doc_tools(str(clean_path), Path(paper).stem)
            paper_to_tools_dict[paper] = [vector_tool, summary_tool]
            print(f" Retried successfully for {paper}")

        except Exception as inner_e:
            print(f" Still failed on {paper}: {inner_e}")

    except Exception as e:
        print(f" Unexpected error for {paper}: {e}")


Getting tools for paper: metagpt.pdf
Getting tools for paper: longlora.pdf
Getting tools for paper: loftq.pdf
Getting tools for paper: swebench.pdf
Getting tools for paper: selfrag.pdf
Getting tools for paper: zipformer.pdf
Getting tools for paper: values.pdf
Getting tools for paper: finetune_fair_diffusion.pdf
Getting tools for paper: knowledge_card.pdf
Getting tools for paper: metra.pdf
Getting tools for paper: vr_mcl.pdf
 Unicode error while processing vr_mcl.pdf: 'utf-8' codec can't encode character '\ud835' in position 94129: surrogates not allowed
Saved cleaned version: vr_mcl_clean.txt
 Retried successfully for vr_mcl.pdf


## Extend the Agent with Tool Retrieval

In [42]:
all_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

In [43]:
# Define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

In [44]:
obj_retriever = obj_index.as_retriever(similarity_top_k=3)

In [45]:
tools = obj_retriever.retrieve(
    "Tell me about the eval dataset used in MetaGPT and SWE-Bench"
)
tools[2].metadata

ToolMetadata(description='Useful for summarization questions related to vr_mcl.', name='query_engine_tool', fn_schema=<class 'llama_index.core.tools.types.DefaultToolFnSchema'>, return_direct=False)
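The retriever above returns the `similarity_top_k` tools whose descriptions are closest to the query. The sketch below shows the same top-k idea in plain Python, with embeddings replaced by a hypothetical word-overlap (Jaccard) score. It is an illustration of ranking descriptions against a query, not the `ObjectIndex` implementation, and all function names are invented.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into word-like tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_descriptions(query: str, descriptions: List[str], k: int = 3) -> List[str]:
    """Return the k descriptions most similar to the query (toy similarity)."""
    query_tokens = tokenize(query)
    ranked = sorted(
        descriptions,
        key=lambda d: jaccard(query_tokens, tokenize(d)),
        reverse=True,
    )
    return ranked[:k]


descriptions = [
    "Useful for summarization questions related to metagpt.",
    "Useful for retrieving specific context from swebench.",
    "Useful for summarization questions related to vr_mcl.",
]
print(top_k_descriptions("summarization of metagpt", descriptions, k=2))
```

Since the generated descriptions differ only in the paper name, many tools score almost identically for a given query, which may explain why an unrelated tool such as the `vr_mcl` one can appear in the top three.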

In [46]:
from llama_index.core.agent.workflow import FunctionAgent

# Pass the object retriever as `tool_retriever` so the agent fetches the
# most relevant paper tools for each query instead of loading all of them.
agent = FunctionAgent(
    tool_retriever=obj_retriever,
    llm=llm,
    system_prompt=(
        "You are an agent designed to answer queries over a set of given papers. "
        "Always use the provided tools to answer a question. Do not rely on prior knowledge."
    ),
    verbose=True,
)


In [47]:
import asyncio

response = asyncio.run(
    agent.run(
        user_msg="Give me a summary of both Self-RAG and LongLoRA"
    )
)
print(str(response))


Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced no event
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced no event
Step call_tool produced

In [48]:
import asyncio

response = asyncio.run(
    agent.run(
        user_msg=(
            "Tell me about the evaluation dataset used "
            "in MetaGPT and compare it against SWE-Bench."
        )
    )
)
print(str(response))


Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_ag

In [49]:
import asyncio

response = asyncio.run(
    agent.run(
        user_msg=(
            "Compare and contrast the LoRA papers (LongLoRA, LoftQ). "
            "Analyze the approach in each paper first."
        )
    )
)
print(str(response))


Running step init_run
Step init_run produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_agent_output produced no event
Running step call_tool
Step call_tool produced event ToolCallResult
Running step aggregate_tool_results
Step aggregate_tool_results produced event AgentInput
Running step setup_agent
Step setup_agent produced event AgentSetup
Running step run_agent_step
Step run_agent_step produced event AgentOutput
Running step parse_agent_output
Step parse_ag

**End of Notebook**

This notebook covers all four lessons on building agentic RAG systems with LlamaIndex:

*   Router Engine - Route queries to appropriate tools
*   Tool Calling - Create and use custom function tools
*   Agent Reasoning Loop - Build agents with multi-step reasoning
*   Multi-Document Agent - Scale to multiple documents with tool retrieval