# Agentic RAG with LlamaIndex


In this notebook we walk through the basic concepts needed to create an agent that routes each query to one of two query engine tools. The steps are:

- Define a reader to load the sample PDF file, the [AraGPT2](./data/aragpt2.pdf) paper.
- Define a `splitter` to split the document text into chunks (nodes).
- Set the embedding and generation model IDs.
- Build a vector index and a summary index, create a query engine from each, and wrap each engine in a tool.
- Define a router engine to select the proper tool based on the input query.
- Use the agent to answer some queries.


## Setup


In [1]:
import os
from rich import print
from dotenv import load_dotenv

In [2]:
# load env variables
_ = load_dotenv()
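
The OpenAI clients used later read the API key from the `OPENAI_API_KEY` environment variable, so it can help to check that it was actually loaded before building the indexes. The following is a minimal sketch; `missing_env_vars` is a hypothetical helper, not part of `python-dotenv` or LlamaIndex.

```python
import os


def missing_env_vars(names: list[str]) -> list[str]:
    """Return the names that are unset or empty in the current environment."""
    return [name for name in names if not os.environ.get(name)]


# Hypothetical check: report missing variables instead of failing later
# with a less obvious authentication error.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```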

In [3]:
# define some constants
GENERATION_MODEL_ID = "gpt-4o-mini"
EMBEDDING_MODEL_ID = "text-embedding-3-small"

## Load Documents


In [4]:
from llama_index.core import SimpleDirectoryReader

documents_reader = SimpleDirectoryReader(input_files=["./data/aragpt2.pdf"])
documents = documents_reader.load_data()

In [5]:
print(documents[0])

In [6]:
from llama_index.core.node_parser import SentenceSplitter

sentence_splitter = SentenceSplitter(chunk_size=512, chunk_overlap=32)
nodes = sentence_splitter.get_nodes_from_documents(documents)

In [7]:
print(len(nodes))
print(nodes[0])
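
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of overlapping windows. It is not how `SentenceSplitter` is implemented, since that class counts tokens and tries to respect sentence boundaries. The hypothetical `chunk_words` helper below splits on whitespace only, to show how consecutive chunks share their edges.

```python
def chunk_words(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of `chunk_size` words that share `chunk_overlap` words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = "one two three four five six seven eight nine ten"
chunks = chunk_words(sample, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

With these toy settings, each chunk starts with the last word of the previous one. A larger overlap keeps more shared context between neighbouring chunks at the cost of more chunks to embed.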

## Define Vector and Summary Indexes


In [8]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model=GENERATION_MODEL_ID)
Settings.embed_model = OpenAIEmbedding(model=EMBEDDING_MODEL_ID)

In [9]:
from llama_index.core import VectorStoreIndex, SummaryIndex

vector_index = VectorStoreIndex(nodes=nodes)
summary_index = SummaryIndex(nodes=nodes)
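
The two indexes differ mainly in what they hand to the LLM at query time: a vector index retrieves only the nodes most similar to the query, while a summary index, by default, goes over every node. The toy sketch below illustrates that difference with hand-written 2-D vectors. The node names, vectors, and helper functions are all made up for illustration; in the notebook the embeddings come from the configured embedding model.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical nodes with made-up 2-D "embeddings".
toy_nodes = {
    "training data": [1.0, 0.0],
    "model architecture": [0.0, 1.0],
    "evaluation": [0.7, 0.7],
}


def vector_retrieve(query_vec: list[float], top_k: int = 2) -> list[str]:
    """Vector-index style: keep only the top_k most similar nodes."""
    ranked = sorted(
        toy_nodes,
        key=lambda name: cosine(query_vec, toy_nodes[name]),
        reverse=True,
    )
    return ranked[:top_k]


def summary_retrieve() -> list[str]:
    """Summary-index style: return every node, regardless of the query."""
    return list(toy_nodes)


print(vector_retrieve([0.9, 0.1]))
print(summary_retrieve())
```

This is why the vector index suits targeted questions and the summary index suits questions that need the whole document.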

## Define Query Engines

Create a query engine from each index, then wrap each engine in a tool. The tool's description tells the router when to use that tool, so keep it short and specific.


In [10]:
vector_query_engine = vector_index.as_query_engine()
summary_query_engine = summary_index.as_query_engine(response_mode="tree_summarize")

In [11]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata


vector_tool = QueryEngineTool(
    query_engine=vector_query_engine,
    metadata=ToolMetadata(
        description="Useful for retrieving specific context from the aragpt2 paper."
    ),
)

summary_tool = QueryEngineTool(
    query_engine=summary_query_engine,
    metadata=ToolMetadata(
        description="Useful for summarization questions related to the aragpt2 paper."
    ),
)

## Define Router Query Engine


In [12]:
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine

agent = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[vector_tool, summary_tool],
    verbose=True,
)
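
Conceptually, a single-choice router numbers the tool descriptions, asks a model to pick one, and maps the pick back to an engine. The sketch below is not LlamaIndex's implementation: it replaces the LLM call with a hypothetical keyword rule (`fake_llm_pick`) so that it runs offline. It also illustrates how a router can present 1-based choice numbers to the model while indexing engines from 0, which appears to be why the verbose log reports "query engine 1" together with "choice 2".

```python
descriptions = [
    "Useful for retrieving specific context from the aragpt2 paper.",
    "Useful for summarization questions related to the aragpt2 paper.",
]


def build_choices_prompt(descriptions: list[str]) -> str:
    """Number the tool descriptions from 1, as they would appear in a prompt."""
    return "\n".join(f"({i}) {d}" for i, d in enumerate(descriptions, start=1))


def fake_llm_pick(query: str) -> int:
    """Hypothetical stand-in for the LLM: returns a 1-based choice number."""
    return 2 if "summar" in query.lower() else 1


def route(query: str) -> int:
    """Return the 0-based index of the engine that should handle the query."""
    choice = fake_llm_pick(query)
    return choice - 1  # convert the 1-based choice to a 0-based engine index


print(build_choices_prompt(descriptions))
print(route("Summarize the AraGPT2 paper for me please."))
print(route("What data was used to train the AraGPT2 model?"))
```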

In [13]:
query = "Summarize the AraGPT2 paper for me please."
response = agent.query(query)

Selecting query engine 1: The question specifically asks for a summary of the AraGPT2 paper, making choice 2, which is useful for summarization questions, the most relevant..

In [18]:
print(response.response)

In [19]:
query = "What data was used to train the AraGPT2 model?"
response = agent.query(query)

Selecting query engine 0: The question asks for specific context regarding the data used to train the AraGPT2 model, which aligns with retrieving specific information..

In [20]:
print(response.response)