<a href="https://colab.research.google.com/github/MikeG27/colab_backups/blob/main/Agentic_AI_LlamaIndex.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LlamaIndex - Another Step into the Agentic World!

Welcome to the world of **LlamaIndex**, where AI agents gain the ability to understand, index, and retrieve information efficiently! In this notebook, we will explore how LlamaIndex enhances **knowledge retrieval and orchestration** in agentic workflows.

We will dive into the **core concepts**, set up a working environment, and experiment with LlamaIndex’s capabilities through hands-on examples.

## Overview
LlamaIndex is a **data framework** that helps AI agents **connect to, retrieve from, and use** structured and unstructured data. It acts as a **bridge** between LLMs and external data sources, so that agents in single- or multi-agent workflows can ground their reasoning in your own documents.

## Why LlamaIndex?
LlamaIndex provides key advantages in agentic workflows:

- **Flexible Data Integration:** Supports a variety of data sources, including databases, documents, and APIs.
- **Efficient Retrieval:** Utilizes **vector-based search** and **keyword indexing** to fetch relevant information quickly.
- **Seamless AI Compatibility:** Integrates with **LLMs**, allowing agents to **understand, analyze, and generate** knowledge-based responses.
- **Optimized for Agentic Systems:** Works smoothly with multi-agent frameworks like CrewAI, LangChain, and OpenAI tools.

## Core Concepts in LlamaIndex
LlamaIndex is built on a few fundamental abstractions:

- **Index:** A structured representation of data that enables efficient retrieval (e.g., vector index, keyword index).
- **Node:** The smallest unit of indexed data, typically a chunk of a source document (e.g., a paragraph) together with its metadata.
- **Retriever:** A mechanism that searches an index and returns the nodes most relevant to a user query.
- **Query Engine:** Combines a retriever with an LLM to answer questions over the indexed data.
- **Storage:** Manages persistence for indexed knowledge, ensuring efficient lookups.

These components work together to **enhance AI agents' memory and reasoning capabilities**, making them smarter and more context-aware.
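To make these abstractions concrete, here is a minimal pure-Python sketch of how nodes, an index, a retriever, and a query engine relate to each other. It does not use LlamaIndex at all: `ToyNode`, `ToyKeywordIndex`, and `toy_query` are hypothetical names invented for illustration, and word overlap stands in for the embedding similarity a real vector index would use.

```python
from dataclasses import dataclass, field


def _tokenize(text):
    """Lowercase the text and split it into words, dropping basic punctuation."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


@dataclass
class ToyNode:
    """A chunk of text plus metadata (toy stand-in for a node)."""
    text: str
    metadata: dict = field(default_factory=dict)


class ToyKeywordIndex:
    """Toy index: maps each word to the positions of the nodes containing it."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.postings = {}
        for position, node in enumerate(self.nodes):
            for word in set(_tokenize(node.text)):
                self.postings.setdefault(word, set()).add(position)

    def retrieve(self, query, top_k=2):
        """Toy retriever: rank nodes by how many query words they share."""
        scores = {}
        for word in set(_tokenize(query)):
            for position in self.postings.get(word, ()):
                scores[position] = scores.get(position, 0) + 1
        ranked = sorted(scores, key=lambda p: (-scores[p], p))
        return [self.nodes[p] for p in ranked[:top_k]]


def toy_query(index, question, top_k=2):
    """Toy query engine: returns the retrieved context.

    A real query engine would pass this context to an LLM to write an answer.
    """
    return " ".join(node.text for node in index.retrieve(question, top_k=top_k))


example_nodes = [
    ToyNode("Agents use tools to act.", {"page_label": "1"}),
    ToyNode("An index enables efficient retrieval.", {"page_label": "2"}),
    ToyNode("Retrievers return relevant nodes.", {"page_label": "3"}),
]
example_index = ToyKeywordIndex(example_nodes)
example_hits = example_index.retrieve("How does retrieval use an index?")
print([hit.metadata["page_label"] for hit in example_hits])  # → ['2', '1']
```

The point of the sketch is only the division of labor: nodes carry text and metadata, the index organizes them, the retriever ranks them, and the query engine turns retrieved context into an answer.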

---

## Install Dependencies

Before diving into examples, we need to install and configure LlamaIndex. The following commands install LlamaIndex together with the Azure OpenAI LLM and embedding integrations used in this notebook.

In [None]:
!pip install llama-index
!pip install llama-index-llms-azure-openai
!pip install llama-index-embeddings-azure-openai



## Setup API Keys

In [None]:
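import os
from google.colab import userdata

# Read the API key from Colab's secret store rather than hard-coding it
os.environ["AZURE_OPENAI_API_KEY"] = userdata.get('AZURE_OPENAI_API_KEY')
# Replace with the endpoint of your own Azure OpenAI resource
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://saturn-poc1.openai.azure.com/"
os.environ["OPENAI_API_VERSION"] = "2024-05-01-preview"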

## Setting Up the LLM
We instantiate an Azure OpenAI chat model and an embedding model, then register both as global defaults through `Settings`:

In [None]:
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.core import Settings

llm = AzureOpenAI(
    model="gpt-4o-mini",
    deployment_name="gpt-4o-mini",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version=os.environ["OPENAI_API_VERSION"]
)

# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-3-small",
    deployment_name="text-embedding-3-small",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_version=os.environ["OPENAI_API_VERSION"]
)

Settings.llm = llm
Settings.embed_model = embed_model

# Example 1: Single agent with Math Tools

### Description
This example demonstrates how to create a single agent that can perform basic mathematical operations (addition, subtraction, multiplication) using function tools. The agent processes a given query, performs calculations, and returns the final result.

In [None]:
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import (
    FunctionCallingAgentWorker,
    ReActAgent,
)

from IPython.display import display, HTML

### Tools definition

In [None]:
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract b from a and return the result."""
    return a - b


multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)
subtract_tool = FunctionTool.from_defaults(fn=subtract)

### Define an agent

In [None]:
agent = ReActAgent.from_tools(
    [multiply_tool, add_tool, subtract_tool], llm=llm, verbose=True
)

In [None]:
response = agent.chat("What is (26 * 2) + 2024?")
#display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

> Running step c72ed023-c354-4cec-a510-beeb97bf8d2a. Step input: What is (26 * 2) + 2024?
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: multiply
Action Input: {'a': 26, 'b': 2}
Observation: 52
> Running step b34b7bd5-c658-46a3-9f6a-0c9b0c1a0b0d. Step input: None
Thought: Now I have the result of (26 * 2), which is 52. I will add this to 2024.
Action: add
Action Input: {'a': 52, 'b': 2024}
Observation: 2076
> Running step 90bdf331-9d99-4dbe-862c-a2e712bd52bd. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer.
Answer: The result of (26 * 2) + 2024 is 2076.

### Summary
This example shows how to wrap plain Python functions as `FunctionTool` objects and hand them to a `ReActAgent`. As the trace shows, the agent breaks the expression into two tool calls (`multiply`, then `add`) and returns the correct result, 2076.
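The chaining of tool calls in the trace can be pictured with a small pure-Python sketch. This is not how `ReActAgent` is implemented: in the real agent the LLM picks each action after reading the previous observation, whereas here the plan is hard-coded. The `run_plan` helper and the `"$prev"` placeholder are hypothetical, introduced only for illustration.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b


# Tool registry: the agent refers to tools by name.
TOOLS = {"multiply": multiply, "add": add}


def run_plan(plan, tools):
    """Execute (tool_name, kwargs) steps in order.

    The string "$prev" in kwargs is replaced by the previous observation,
    mimicking how the agent feeds one tool's output into the next call.
    """
    observation = None
    trace = []
    for tool_name, kwargs in plan:
        resolved = {
            key: (observation if value == "$prev" else value)
            for key, value in kwargs.items()
        }
        observation = tools[tool_name](**resolved)
        trace.append((tool_name, resolved, observation))
    return observation, trace


# Hard-coded plan for "(26 * 2) + 2024", matching the two actions in the trace.
plan = [
    ("multiply", {"a": 26, "b": 2}),
    ("add", {"a": "$prev", "b": 2024}),
]
result, trace = run_plan(plan, TOOLS)
print(result)  # → 2076
```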

# Example 2: Building an Agent Reasoning Loop


### Description
This example demonstrates a more advanced agent with reasoning capabilities. The agent is designed to process questions related to the MetaGPT framework and retrieve relevant information from a given document.


In [None]:
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, SummaryIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.core.vector_stores import MetadataFilters, FilterCondition
from typing import List, Optional

def get_doc_tools(
    file_path: str,
    name: str,
) -> tuple[FunctionTool, QueryEngineTool]:
    """Get vector query and summary query tools from a document."""

    # Load documents
    documents = SimpleDirectoryReader(input_files=[file_path]).load_data()
    splitter = SentenceSplitter(chunk_size=1024)
    nodes = splitter.get_nodes_from_documents(documents)

    # Define Azure OpenAI embedding model
    embed_model = AzureOpenAIEmbedding(
      model="text-embedding-3-small",
      deployment_name="text-embedding-3-small",
      api_key=os.environ["AZURE_OPENAI_API_KEY"],
      azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
      api_version=os.environ["OPENAI_API_VERSION"]
    )

    # Use Azure embedding model in VectorStoreIndex
    vector_index = VectorStoreIndex(nodes, embed_model=embed_model)

    def vector_query(
        query: str,
        page_numbers: Optional[List[str]] = None
    ) -> str:
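        """Use to answer questions over a given paper.

        Leave page_numbers as None to search all pages; pass page labels
        (as strings, e.g. ["2", "3"]) to restrict the search to those pages.
        """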

        page_numbers = page_numbers or []
        metadata_dicts = [
            {"key": "page_label", "value": p} for p in page_numbers
        ]

        query_engine = vector_index.as_query_engine(
            similarity_top_k=2,
            filters=MetadataFilters.from_dicts(
                metadata_dicts,
                condition=FilterCondition.OR
            )
        )
        response = query_engine.query(query)
        return response

    vector_query_tool = FunctionTool.from_defaults(
        name=f"vector_tool_{name}",
        fn=vector_query
    )

    summary_index = SummaryIndex(nodes)
    # Summarize with the Azure OpenAI LLM configured above
    summary_query_engine = summary_index.as_query_engine(
        llm=llm,
        response_mode="tree_summarize",
        use_async=True,
    )
    summary_tool = QueryEngineTool.from_defaults(
        name=f"summary_tool_{name}",
        query_engine=summary_query_engine,
        description=(
            f"Useful for summarization questions related to {name}"
        ),
    )

    return vector_query_tool, summary_tool
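The `page_numbers` argument of `vector_query` builds one `page_label` filter per requested page and combines them with `FilterCondition.OR`. The intended effect can be sketched in plain Python as below. This is an illustration of the filtering logic only, not the LlamaIndex implementation; `matches_any_page` and `filter_nodes` are hypothetical helpers, and treating an empty page list as "no restriction" is an assumption about how the empty filter behaves.

```python
def matches_any_page(metadata, page_numbers):
    """Return True if the node passes the OR-combined page filter.

    Assumption: an empty or missing page list means "search all pages".
    """
    if not page_numbers:
        return True
    return metadata.get("page_label") in set(page_numbers)


def filter_nodes(nodes, page_numbers=None):
    """Keep only the nodes whose page_label is one of the requested pages."""
    return [n for n in nodes if matches_any_page(n["metadata"], page_numbers)]


# Page labels are strings, as in the metadata printed by the notebook.
toy_nodes = [
    {"text": "intro", "metadata": {"page_label": "1"}},
    {"text": "roles", "metadata": {"page_label": "2"}},
    {"text": "results", "metadata": {"page_label": "7"}},
]
print([n["text"] for n in filter_nodes(toy_nodes, ["2", "7"])])  # → ['roles', 'results']
```

Because the labels are strings, passing integers such as `[2, 7]` would match nothing in this sketch, which is why the tool signature asks for `List[str]`.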

In [None]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

vector_tool, summary_tool = get_doc_tools("metagpt.pdf", "metagpt")

agent_worker = FunctionCallingAgentWorker.from_tools(
    [vector_tool, summary_tool],
    llm=llm,
    verbose=True
)
agent = AgentRunner(agent_worker)

In [None]:
response = agent.query(
    "Tell me about the agent roles in MetaGPT, "
    "and then how they communicate with each other."
)

Added user message to memory: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
=== Calling Function ===
Calling function: vector_tool_metagpt with args: {"query": "agent roles in MetaGPT"}
=== Function Output ===
In MetaGPT, agent roles are specialized positions that contribute to the software development process, each with distinct responsibilities. These roles include Product Manager, Architect, Engineer, and Project Manager, among others. The framework emphasizes the importance of these roles in enhancing collaboration and efficiency by following established standards and workflows. The presence of multiple roles leads to improved code generation, reduced human revision costs, and higher executability of the generated code. The effectiveness of these roles is demonstrated through experiments showing that their inclusion consistently enhances performance metrics, such as revisions and executability.
=== Calling Function ===
Calling function: ve

In [None]:
print(response.source_nodes[0].get_content(metadata_mode="all"))

page_label: 2
file_name: metagpt.pdf
file_path: metagpt.pdf
file_type: application/pdf
file_size: 16911937
creation_date: 2025-03-14
last_modified_date: 2025-03-14

Preprint
Figure 1: The software development SOPs between MetaGPT and real-world human teams.
In software engineering, SOPs promote collaboration among various roles. MetaGPT showcases
its ability to decompose complex tasks into specific actionable procedures assigned to various roles
(e.g., Product Manager, Architect, Engineer, etc.).
documents, design artifacts, flowcharts, and interface specifications. The use of intermediate struc-
tured outputs significantly increases the success rate of target code generation. Because it helps
maintain consistency in communication, minimizing ambiguities and errors during collaboration.
More graphically, in a company simulated by MetaGPT, all employees follow a strict and stream-
lined workflow, and all their handovers must comply with certain established standards. This reduces
the ri

In [None]:
response = agent.chat("Tell me the results over one of the above datasets.")

Added user message to memory: Tell me the results over one of the above datasets.
=== Calling Function ===
Calling function: vector_tool_metagpt with args: {"query": "results over one of the datasets in MetaGPT"}
=== Function Output ===
MetaGPT demonstrates superior performance on the MBPP and HumanEval benchmarks, achieving pass rates of 85.9% and 87.7%, respectively. This performance surpasses all previous approaches, indicating its effectiveness in generating executable code. Additionally, when paired with GPT-4, MetaGPT significantly enhances the Pass @k metrics in the HumanEval benchmark.
=== LLM Response ===
MetaGPT shows impressive results on the MBPP and HumanEval benchmarks, achieving pass rates of 85.9% and 87.7%, respectively. These results surpass all previous approaches, highlighting MetaGPT's effectiveness in generating executable code. Furthermore, when combined with GPT-4, MetaGPT significantly improves the Pass @k metrics in the HumanEval benchmark, demonstrating enhan

## Lower-Level: Debuggability and control

Instead of calling `agent.query` in one shot, we can create a task and run it step by step. This lets us inspect completed and upcoming steps and inject new input while the task is still running.

In [None]:
agent_worker = FunctionCallingAgentWorker.from_tools(
    [vector_tool, summary_tool],
    llm=llm,
    verbose=True
)
agent = AgentRunner(agent_worker)

In [None]:
task = agent.create_task(
    "Tell me about the agent roles in MetaGPT, "
    "and then how they communicate with each other."
)

In [None]:
step_output = agent.run_step(task.task_id)

Added user message to memory: Tell me about the agent roles in MetaGPT, and then how they communicate with each other.
=== Calling Function ===
Calling function: vector_tool_metagpt with args: {"query": "agent roles in MetaGPT"}
=== Function Output ===
In MetaGPT, agent roles are specialized positions that contribute to the software development process. Each role, such as Product Manager, Architect, and Engineer, has specific expertise and responsibilities, allowing for a structured workflow. This role-based task management enhances collaboration and efficiency, as agents follow established standards and procedures. The inclusion of multiple roles has been shown to improve code quality, reduce human revision costs, and increase the overall executability of the generated code. The effectiveness of these roles is demonstrated through experiments, where the addition of different roles consistently leads to better performance outcomes.
=== Calling Function ===
Calling function: vector_tool

In [None]:
completed_steps = agent.get_completed_steps(task.task_id)
print(f"Num completed for task {task.task_id}: {len(completed_steps)}")
print(completed_steps[0].output.sources[0].raw_output)

Num completed for task 51137ae9-2bc1-4af7-ae34-92789c6aedf7: 1
In MetaGPT, agent roles are specialized positions that contribute to the software development process. Each role, such as Product Manager, Architect, and Engineer, has specific expertise and responsibilities, allowing for a structured workflow. This role-based task management enhances collaboration and efficiency, as agents follow established standards and procedures. The inclusion of multiple roles has been shown to improve code quality, reduce human revision costs, and increase the overall executability of the generated code. The effectiveness of these roles is demonstrated through experiments, where the addition of different roles consistently leads to better performance outcomes.


In [None]:
upcoming_steps = agent.get_upcoming_steps(task.task_id)
print(f"Num upcoming steps for task {task.task_id}: {len(upcoming_steps)}")
upcoming_steps[0]

Num upcoming steps for task 51137ae9-2bc1-4af7-ae34-92789c6aedf7: 1


TaskStep(task_id='51137ae9-2bc1-4af7-ae34-92789c6aedf7', step_id='40a2a96c-6997-4aa0-9d27-4bf8e2e9622f', input=None, step_state={}, next_steps={}, prev_steps={}, is_ready=True)

In [None]:
step_output = agent.run_step(
    task.task_id, input="What about how agents share information?"
)

Added user message to memory: What about how agents share information?
=== Calling Function ===
Calling function: vector_tool_metagpt with args: {"query": "how agents share information in MetaGPT"}
=== Function Output ===
Agents within the framework collaborate by adhering to defined development and communication protocols. This allows them to share information effectively while working together on complex tasks or projects.


In [None]:
step_output = agent.run_step(task.task_id)
print(step_output.is_last)

=== LLM Response ===
In MetaGPT, agents share information by adhering to defined development and communication protocols. This structured approach facilitates effective collaboration, enabling agents to work together on complex tasks or projects efficiently. They utilize a shared message pool to publish and subscribe to relevant messages, ensuring that information is exchanged in an organized manner.
True


In [None]:
response = agent.finalize_response(task.task_id)
print(str(response))

In MetaGPT, agents share information by adhering to defined development and communication protocols. This structured approach facilitates effective collaboration, enabling agents to work together on complex tasks or projects efficiently. They utilize a shared message pool to publish and subscribe to relevant messages, ensuring that information is exchanged in an organized manner.


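### Summary
This example shows how an agent can answer questions about a document by choosing between a vector query tool and a summary tool built with LlamaIndex. The lower-level API (`create_task`, `run_step`, `finalize_response`) additionally lets you run the reasoning loop one step at a time, inspect intermediate outputs, and inject new input mid-task.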
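The step-wise control pattern can be pictured as a queue of pending steps that the caller drains one at a time. The sketch below is a toy stand-in, not the LlamaIndex `AgentRunner`: `ToyStepRunner` is a hypothetical class, and unlike the real agent (where the LLM decides the next step after each observation) its steps are known up front.

```python
from collections import deque


class ToyStepRunner:
    """Runs a task one step at a time (toy stand-in for step-wise agents)."""

    def __init__(self):
        self._tasks = {}
        self._next_id = 0

    def create_task(self, steps):
        """Register a list of zero-argument callables and return a task id."""
        task_id = f"task-{self._next_id}"
        self._next_id += 1
        self._tasks[task_id] = {"upcoming": deque(steps), "completed": []}
        return task_id

    def run_step(self, task_id):
        """Run the next pending step and return (output, is_last)."""
        task = self._tasks[task_id]
        step = task["upcoming"].popleft()
        output = step()
        task["completed"].append(output)
        return output, not task["upcoming"]

    def get_completed_steps(self, task_id):
        """Outputs of the steps that have already run."""
        return list(self._tasks[task_id]["completed"])

    def get_upcoming_steps(self, task_id):
        """Steps that are still waiting to run."""
        return list(self._tasks[task_id]["upcoming"])


runner = ToyStepRunner()
toy_task_id = runner.create_task([
    lambda: "looked up agent roles",
    lambda: "looked up how agents communicate",
])
first_output, first_is_last = runner.run_step(toy_task_id)
print(first_output, first_is_last)  # → looked up agent roles False
print(len(runner.get_upcoming_steps(toy_task_id)))  # → 1
```

Pausing between steps is what makes it possible to inspect intermediate results or change course before the task finishes.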

# Example 3: Multi-Document Agent

### Description
This example expands on the previous ones by introducing an agent capable of handling multiple documents. The agent downloads, indexes, and queries multiple research papers for analysis.

In [None]:
import requests
from pathlib import Path


def download_papers(urls, papers):
    """Download each paper from its URL unless the file already exists."""
    for url, paper in zip(urls, papers):
        if not Path(paper).exists():
            print(f"Downloading {paper}...")
            response = requests.get(url, timeout=60)
            response.raise_for_status()  # fail loudly instead of saving an error page
            with open(paper, "wb") as f:
                f.write(response.content)
            print(f"Downloaded {paper} successfully.")
        else:
            print(f"{paper} already exists. Skipping download.")

In [None]:
import requests
from pathlib import Path

urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=hSyW5go0v8",
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "selfrag.pdf",
]

download_papers(urls, papers)

metagpt.pdf already exists. Skipping download.
longlora.pdf already exists. Skipping download.
selfrag.pdf already exists. Skipping download.


In [None]:
paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]


initial_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]
initial_tools

Getting tools for paper: metagpt.pdf
Getting tools for paper: longlora.pdf
Getting tools for paper: selfrag.pdf


[<llama_index.core.tools.function_tool.FunctionTool at 0x783c85cd51d0>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x783c87a17550>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x783c85abff90>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x783c85a96210>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x783c85b3b890>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x783c85b3fed0>]

In [None]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    initial_tools,
    llm=llm,
    verbose=True
)
agent = AgentRunner(agent_worker)

In [None]:
response = agent.query(
    "Tell me about the evaluation dataset used in LongLoRA, "
    "and then tell me about the evaluation results"
)
print(response)

Added user message to memory: Tell me about the evaluation dataset used in LongLoRA, and then tell me about the evaluation results
=== Calling Function ===
Calling function: vector_tool_longlora with args: {"query": "evaluation dataset"}
=== Function Output ===
The evaluation dataset is not explicitly mentioned in the provided context. However, it discusses the performance of the proposed framework on cross-dataset and cross-manipulation evaluations, indicating that various datasets may have been used for evaluation purposes.
=== Calling Function ===
Calling function: vector_tool_longlora with args: {"query": "evaluation results"}
=== Function Output ===
The evaluation results indicate that the proposed model, fine-tuned on a context length of 16,384, demonstrates comparable or superior performance to other long-context models, including GPT-3.5-Turbo and various Llama2-based models. Specifically, in the LongBench benchmark, the model achieved an average score of 36.8 across different 

In [None]:
response = agent.query("Give me a summary of both Self-RAG and LongLoRA")
print(str(response))

Added user message to memory: Give me a summary of both Self-RAG and LongLoRA
=== Calling Function ===
Calling function: summary_tool_selfrag with args: {"input": "Self-RAG (Self Retrieval-Augmented Generation) is a framework that enhances the capabilities of language models by integrating retrieval mechanisms directly into the generation process. It allows models to access external knowledge dynamically during inference, improving their ability to generate accurate and contextually relevant responses. Self-RAG operates by retrieving relevant documents or information from a knowledge base based on the input query and then using this information to inform the generation of responses. This approach helps in reducing hallucinations and improving the factual accuracy of the generated content."}
=== Function Output ===
Encountered error: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-05-01-preview 

In [None]:
urls = [
    "https://openreview.net/pdf?id=VtmBAGCN7o",
    "https://openreview.net/pdf?id=6PmJoRfdaK",
    "https://openreview.net/pdf?id=LzPWWPAdY4",
    "https://openreview.net/pdf?id=VTF8yNQM66",
    "https://openreview.net/pdf?id=hSyW5go0v8",
    "https://openreview.net/pdf?id=9WD9KwssyT",
    "https://openreview.net/pdf?id=yV6fD7LYkF",
    "https://openreview.net/pdf?id=hnrB5YHoYu",
    "https://openreview.net/pdf?id=WbWtOYIzIK",
    "https://openreview.net/pdf?id=c5pwL0Soay"
]

papers = [
    "metagpt.pdf",
    "longlora.pdf",
    "loftq.pdf",
    "swebench.pdf",
    "selfrag.pdf",
    "zipformer.pdf",
    "values.pdf",
    "finetune_fair_diffusion.pdf",
    "knowledge_card.pdf",
    "metra.pdf"
]

In [None]:
download_papers(urls, papers)

metagpt.pdf already exists. Skipping download.
longlora.pdf already exists. Skipping download.
loftq.pdf already exists. Skipping download.
swebench.pdf already exists. Skipping download.
selfrag.pdf already exists. Skipping download.
zipformer.pdf already exists. Skipping download.
values.pdf already exists. Skipping download.
finetune_fair_diffusion.pdf already exists. Skipping download.
knowledge_card.pdf already exists. Skipping download.
metra.pdf already exists. Skipping download.


In [None]:
paper_to_tools_dict = {}
for paper in papers:
    print(f"Getting tools for paper: {paper}")
    vector_tool, summary_tool = get_doc_tools(paper, Path(paper).stem)
    paper_to_tools_dict[paper] = [vector_tool, summary_tool]

all_tools = [t for paper in papers for t in paper_to_tools_dict[paper]]

Getting tools for paper: metagpt.pdf
Getting tools for paper: longlora.pdf
Getting tools for paper: loftq.pdf
Getting tools for paper: swebench.pdf
Getting tools for paper: selfrag.pdf
Getting tools for paper: zipformer.pdf
Getting tools for paper: values.pdf
Getting tools for paper: finetune_fair_diffusion.pdf
Getting tools for paper: knowledge_card.pdf
Getting tools for paper: metra.pdf


In [None]:
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

In [None]:
obj_retriever = obj_index.as_retriever(similarity_top_k=3)
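With ten papers there are twenty tools, so rather than passing all of them to the LLM, the object retriever selects the few most relevant ones per query. The idea can be sketched in plain Python as follows. The real `ObjectIndex` is built on a `VectorStoreIndex` and ranks by embedding similarity; this toy uses word overlap instead, and `overlap_score` and `retrieve_tools` are hypothetical helpers. The descriptions follow the `summary_tool` description template from `get_doc_tools`.

```python
def overlap_score(query, description):
    """Count distinct lowercase words shared by the query and a description."""
    return len(set(query.lower().split()) & set(description.lower().split()))


def retrieve_tools(query, tool_descriptions, top_k=3):
    """Return the names of the top_k tools whose descriptions best match the query.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(
        tool_descriptions,
        key=lambda name: (-overlap_score(query, tool_descriptions[name]), name),
    )
    return ranked[:top_k]


toy_tool_descriptions = {
    f"summary_tool_{name}": f"Useful for summarization questions related to {name}"
    for name in ["metagpt", "longlora", "swebench", "selfrag"]
}
selected = retrieve_tools(
    "summarization of metagpt and swebench", toy_tool_descriptions, top_k=2
)
print(selected)  # → ['summary_tool_metagpt', 'summary_tool_swebench']
```

Only the selected tools would then be offered to the agent for that query, which keeps the prompt small as the number of documents grows.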

In [None]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    tool_retriever=obj_retriever,
    llm=llm,
    system_prompt=""" \
You are an agent designed to answer queries over a set of given papers.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.\
""",
    verbose=True
)

In [None]:
agent = AgentRunner(agent_worker)

response = agent.query(
    "Tell me about the evaluation dataset used "
    "in MetaGPT and compare it against SWE-Bench"
)
print(response)

Added user message to memory: Tell me about the evaluation dataset used in MetaGPT and compare it against SWE-Bench
=== Calling Function ===
Calling function: summary_tool_metagpt with args: {"input": "evaluation dataset"}
=== Function Output ===
Encountered error: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-05-01-preview have exceeded token rate limit of your current OpenAI S0 pricing tier. Please retry after 60 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}
=== Calling Function ===
Calling function: summary_tool_swebench with args: {"input": "evaluation dataset"}
=== Function Output ===
Encountered error: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-05-01-preview have exceeded token rate limit of your current OpenAI S0 
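
The 429 responses above are rate-limit errors from the Azure OpenAI endpoint, and the message itself suggests retrying after 60 seconds. One common way to make such calls more robust is retry with exponential backoff. The sketch below uses only the standard library; `RateLimitError` and `call_with_backoff` are hypothetical names, not part of LlamaIndex or the Azure SDK. Where to apply such a wrapper in this notebook is an open question, since in the run above the error surfaces as tool output rather than as an exception.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a 429 error raised by an API client."""


def call_with_backoff(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on RateLimitError wait base_delay * 2**attempt, then retry.

    After max_retries failed retries the last error is re-raised.
    The sleep function is injectable so the logic can be tested without waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo with a fake call that is rate-limited twice before succeeding.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429: rate limit exceeded")
    return "ok"


recorded_delays = []
demo_result = call_with_backoff(flaky_call, sleep=recorded_delays.append)
print(demo_result, recorded_delays)  # → ok [1.0, 2.0]
```

For the limit reported above, a `base_delay` closer to the suggested 60 seconds would likely be more appropriate than the 1 second used in the demo.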

In [None]:
response = agent.query(
    "Compare and contrast the LoRA papers (LongLoRA, LoftQ). "
    "Analyze the approach in each paper first. "
)

Added user message to memory: Compare and contrast the LoRA papers (LongLoRA, LoftQ). Analyze the approach in each paper first. 
=== Calling Function ===
Calling function: vector_tool_longlora with args: {"query": "Analyze the approach of the LongLoRA paper."}
=== Function Output ===
The LongLoRA paper introduces an efficient fine-tuning method aimed at extending the context sizes of pre-trained large language models (LLMs) while minimizing computational costs. The approach leverages two main strategies: the use of sparse local attention during training and the integration of improved parameter-efficient fine-tuning techniques.

One of the key innovations is the shifted sparse attention (S2-Attn), which allows for effective context extension with significant computational savings compared to traditional dense global attention. This method can be implemented with minimal code changes during training and remains optional during inference, ensuring that the original architecture is preser

### Summary
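This example scales the approach to multiple documents: each paper gets its own vector and summary tools, and with ten papers an `ObjectIndex` retrieves only the three most relevant tools per query, so the agent is not handed all twenty tools at once. Note that several runs above hit Azure OpenAI rate limits (HTTP 429), so some of the outputs are incomplete.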


# Call to Action!

Now that you've seen how LlamaIndex can be used for knowledge retrieval and agentic workflows, it's time to try it yourself!

1. **Choose a Document** – Pick a document (e.g., a research paper, report, or dataset) that you’d like to analyze.
2. **Index and Process** – Use the `SimpleDirectoryReader`, `VectorStoreIndex`, and `SummaryIndex` to structure your data.
3. **Ask Questions** – Create an agent to query your document and extract meaningful insights.

Experiment with different configurations and retrieval methods to enhance your AI’s understanding of the content. Happy coding! 🚀
