# Build an Agentic RAG Service

Set up an agent service that can interact with a tool service (containing RAG tools over Google's quarterly 10-Q reports).

In this notebook, we:
- Set up our indexes and query engine tools.
- Define our multi-agent framework:
  - A message queue.
  - An agentic orchestrator.
  - A tools service containing our query engine tools. This acts as a remote executor for tools.
  - Meta-tools for our agents. These call the tools service instead of executing the tools directly.
  - Our agent services. These wrap existing llama-index agents.
- Put all of this into a local launcher to simulate one task passing through the system at a time.
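To make the moving parts concrete before we use the real library, here is a toy, in-memory sketch of the same idea: components register under a name on a shared queue and talk only by publishing messages to each other. Everything in it (`ToyMessageQueue`, `Message`, and all action names other than `new_task`) is hypothetical and only illustrates the routing pattern; it is not how llama-agents is implemented.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Dict, List


@dataclass
class Message:
    """A unit of work addressed to a named consumer (hypothetical shape)."""
    recipient: str
    action: str
    data: Dict[str, Any] = field(default_factory=dict)


Handler = Callable[[Message], Awaitable[None]]


class ToyMessageQueue:
    """Consumers register under a name; publishers address that name."""

    def __init__(self) -> None:
        self._consumers: Dict[str, Handler] = {}
        self.log: List[str] = []

    def register(self, name: str, handler: Handler) -> None:
        self._consumers[name] = handler
        self.log.append(f"registered {name}")

    async def publish(self, message: Message) -> None:
        self.log.append(f"publish to {message.recipient!r}: {message.action!r}")
        await self._consumers[message.recipient](message)


async def run_demo() -> str:
    queue = ToyMessageQueue()
    results: Dict[str, str] = {}

    async def control_plane(msg: Message) -> None:
        # A real orchestrator would choose an agent with an LLM; we have one agent.
        await queue.publish(Message("agent", "new_task", msg.data))

    async def agent(msg: Message) -> None:
        if msg.action == "new_task":
            # The agent does not run the tool itself; it asks the tool service.
            request = {"tool": "goog_q1_24", "query": msg.data["input"]}
            await queue.publish(Message("tool_service", "tool_call", request))
        elif msg.action == "tool_result":
            await queue.publish(Message("human", "completed_task", msg.data))

    async def tool_service(msg: Message) -> None:
        # Stand-in for executing a query engine tool.
        answer = f"[{msg.data['tool']}] answer to: {msg.data['query']}"
        await queue.publish(Message("agent", "tool_result", {"answer": answer}))

    async def human(msg: Message) -> None:
        results["final"] = msg.data["answer"]

    for name, handler in [
        ("control_plane", control_plane),
        ("agent", agent),
        ("tool_service", tool_service),
        ("human", human),
    ]:
        queue.register(name, handler)

    await queue.publish(Message("control_plane", "new_task", {"input": "What was revenue?"}))
    return results["final"]
```

Calling `asyncio.run(run_demo())` sends one task through the control plane, the agent, the tool service, and back to the human consumer, which is the path the local launcher simulates later in the notebook.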

In [1]:
import nest_asyncio

nest_asyncio.apply()

import os
from dotenv import load_dotenv
load_dotenv()

OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')

## Load Data

In [2]:
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
    Settings
)

from llama_index.core.tools import QueryEngineTool, ToolMetadata

In [3]:
try:
    storage_context = StorageContext.from_defaults(persist_dir="./storage/q1-23")
    q1_23_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(persist_dir="./storage/q1-24")
    q1_24_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:  # no persisted index yet, or it could not be read
    index_loaded = False

In [4]:
if not index_loaded:
    # load data
    q1_23_docs = SimpleDirectoryReader(
        input_files=["./data/GOOG-10-Q-Q1-2023.pdf"]
    ).load_data()
    q1_24_docs = SimpleDirectoryReader(
        input_files=["./data/goog-10-q-q1-2024.pdf"]
    ).load_data()

    # build index
    q1_23_index = VectorStoreIndex.from_documents(q1_23_docs)
    q1_24_index = VectorStoreIndex.from_documents(q1_24_docs)

    # persist index
    q1_23_index.storage_context.persist(persist_dir="./storage/q1-23")
    q1_24_index.storage_context.persist(persist_dir="./storage/q1-24")
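The two cells above follow a load-or-build pattern: try to read a persisted artifact, and only rebuild (and persist) it when nothing is on disk. A library-free sketch of that pattern is shown below; `load_or_build`, `artifact.json`, and `expensive_build` are hypothetical names used only for illustration, and JSON stands in for the real index storage.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def load_or_build(persist_dir: Path, build: Callable[[], Dict]) -> Dict:
    """Return the persisted artifact if present; otherwise build and persist it."""
    cache_file = persist_dir / "artifact.json"  # hypothetical file name
    try:
        return json.loads(cache_file.read_text())
    except FileNotFoundError:
        # Nothing persisted yet: do the expensive work once and save the result.
        artifact = build()
        persist_dir.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(json.dumps(artifact))
        return artifact


build_calls: List[int] = []


def expensive_build() -> Dict:
    """Stand-in for parsing a PDF and embedding its chunks."""
    build_calls.append(1)
    return {"chunks": ["a", "b"]}


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_build(Path(tmp) / "q1-23", expensive_build)   # builds
    second = load_or_build(Path(tmp) / "q1-23", expensive_build)  # loads from disk
```

Catching only the "not found" case keeps unrelated failures (for example, a corrupted file) visible instead of silently triggering a rebuild.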

In [5]:
q1_23_engine = q1_23_index.as_query_engine(similarity_top_k=3)
q1_24_engine = q1_24_index.as_query_engine(similarity_top_k=3)
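Setting `similarity_top_k=3` asks the retriever to pass the three chunks most similar to the query on to the LLM. As a rough sketch of what "top k by similarity" means, the toy example below ranks hand-written 2-D vectors by cosine similarity. The vectors and chunk labels are made up; a real index uses embeddings from an embedding model and its own vector store.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          chunks: List[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the text of the k chunks whose vectors are closest to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy "embeddings": axis 0 is roughly revenue, axis 1 is roughly risk.
toy_chunks = [
    ("revenue", [1.0, 0.0]),
    ("risk", [0.0, 1.0]),
    ("cloud revenue", [0.9, 0.1]),
    ("legal", [0.1, 0.9]),
]
nearest = top_k([1.0, 0.0], toy_chunks, k=3)
```

A larger `k` gives the LLM more context at the cost of longer prompts; a smaller `k` risks missing the relevant passage.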

In [6]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=q1_23_engine,
        metadata=ToolMetadata(
            name="goog_q1_23",
            description=(
                "Provides information about Google financials for first quarter of year 2023. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=q1_24_engine,
        metadata=ToolMetadata(
            name="goog_q1_24",
            description=(
                "Provides information about Google financials for first quarter of year 2024. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]
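Tool names end up as function names in the LLM's function-calling request, so it helps to keep them unique and restricted to letters, digits, underscores, and hyphens (OpenAI limits function names to that character set and 64 characters). The helper below is a small sanity check we could run over a list of names before building agents; `check_tool_names` is a hypothetical helper, not part of llama-index.

```python
import re
from typing import Iterable, List

# Character set and length limit OpenAI applies to function names.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9_-]{1,64}")


def check_tool_names(names: Iterable[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all names pass."""
    problems: List[str] = []
    seen = set()
    for name in names:
        if not NAME_PATTERN.fullmatch(name):
            problems.append(f"{name!r}: invalid characters or length")
        if name in seen:
            problems.append(f"{name!r}: duplicate name")
        seen.add(name)
    return problems


issues = check_tool_names(["goog_q1_24", "goog q1 24", "goog_q1_24"])
```

With the tools defined above, this could be called as `check_tool_names(t.metadata.name for t in query_engine_tools)`.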

## Setup Agents

Now that we've defined the query tools, we can wrap these under a `ToolService`.

In [7]:
from llama_agents import (
    AgentService,
    ToolService,
    LocalLauncher,
    MetaServiceTool,
    ControlPlaneServer,
    SimpleMessageQueue,
    AgentOrchestrator,
)

from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.llms.openai import OpenAI


# create our multi-agent framework components
message_queue = SimpleMessageQueue()
control_plane = ControlPlaneServer(
    message_queue=message_queue,
    orchestrator=AgentOrchestrator(llm=OpenAI(model="gpt-4o")),
)

# define Tool Service
tool_service = ToolService(
    message_queue=message_queue,
    tools=query_engine_tools,
    running=True,
    step_interval=0.5,
)

# define meta-tools here
meta_tools = [
    await MetaServiceTool.from_tool_service(
        t.metadata.name,
        message_queue=message_queue,
        tool_service=tool_service,
    )
    for t in query_engine_tools
]


# define Agent and agent service
worker1 = FunctionCallingAgentWorker.from_tools(
    meta_tools,
    llm=OpenAI(),
)
agent1 = worker1.as_agent()
agent_server_1 = AgentService(
    agent=agent1,
    message_queue=message_queue,
    description="Used to answer questions about Google's Q1 2023 and Q1 2024 quarterly financial reports",
    service_name="goog_q1_23_q1_24_analyst_agent",
)
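The meta-tools are what let the agent treat a remote tool as if it were local: the agent calls the meta-tool, the meta-tool forwards the request to the tool service, and the result comes back asynchronously. The sketch below imitates that proxy pattern with an `asyncio.Queue` and a future per request. `ToyToolService` and `ToyMetaTool` are hypothetical classes written for this illustration and do not mirror the llama-agents internals.

```python
import asyncio
from typing import Callable, Dict


class ToyToolService:
    """Owns the real tools and executes requests taken from its inbox."""

    def __init__(self, tools: Dict[str, Callable[[str], str]]) -> None:
        self.tools = tools
        self.inbox: asyncio.Queue = asyncio.Queue()

    async def run(self) -> None:
        while True:
            name, query, reply = await self.inbox.get()
            # Execute the tool here, on the service side, and resolve the caller's future.
            reply.set_result(self.tools[name](query))


class ToyMetaTool:
    """Looks like a tool to the agent but only forwards calls to the service."""

    def __init__(self, name: str, service: ToyToolService) -> None:
        self.name = name
        self.service = service

    async def acall(self, query: str) -> str:
        reply = asyncio.get_running_loop().create_future()
        await self.service.inbox.put((self.name, query, reply))
        return await reply  # suspends until the service has produced a result


async def demo() -> str:
    service = ToyToolService({"goog_q1_24": lambda q: f"Q1 2024 answer to: {q}"})
    server = asyncio.create_task(service.run())
    tool = ToyMetaTool("goog_q1_24", service)
    try:
        return await tool.acall("What are the risk factors?")
    finally:
        server.cancel()  # stop the service loop once the demo call is done
```

Because the agent only holds the proxy, the actual query engines can live in a separate service (or process) without changing the agent code.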

## Launch Agent

With our services, orchestrator, control plane, and message queue defined, we can test our llama-agents network by passing in single messages, and observing the results.

This is an excellent way to test, iterate, and debug your llama-agents system.

In [8]:
import logging

# change logging level to enable or disable more verbose logging
logging.getLogger("llama_agents").setLevel(logging.INFO)
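Setting the level on the package-level logger works because Python loggers are hierarchical: a child logger such as `llama_agents.message_queues.simple` with no level of its own inherits the effective level of its nearest configured ancestor. The sketch below shows this with a made-up package name (`toy_agents`) and an in-memory handler, so it does not depend on llama-agents at all.

```python
import io
import logging

# Capture log output in memory so we can inspect it.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s - %(message)s"))

parent = logging.getLogger("toy_agents")  # hypothetical package name
child = logging.getLogger("toy_agents.message_queues.simple")
parent.addHandler(handler)
parent.propagate = False  # keep the demo output out of the root logger

# At INFO, the child's debug messages are filtered out.
parent.setLevel(logging.INFO)
child.info("Successfully published message")
child.debug("hidden at INFO level")

# Lowering the parent's level to DEBUG makes the child more verbose too.
parent.setLevel(logging.DEBUG)
child.debug("visible at DEBUG level")

output_lines = stream.getvalue().splitlines()
```

In the notebook, switching `logging.INFO` to `logging.DEBUG` or `logging.WARNING` on the `llama_agents` logger adjusts verbosity for every module under that package in the same way.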

In [9]:
# define launcher
launcher = LocalLauncher(
    [agent_server_1, tool_service],
    control_plane,
    message_queue,
)

In [10]:
# query_str = "What was Lyft's revenue growth in 2021?"
# gets stuck in a loop, should mostly be called once
query_str = "What are the risk factors for Google?"
result = launcher.launch_single(query_str)

INFO:llama_agents.message_queues.simple - Consumer AgentService-ff09d4da-2e19-482f-9a20-7f2ecd736d73: goog_q1_23_q1_24_analyst_agent has been registered.
INFO:llama_agents.message_queues.simple - Consumer ToolService-983c62e5-79b7-4f1e-b93f-e9cbadb4fc68: default_tool_service has been registered.
INFO:llama_agents.message_queues.simple - Consumer 1c7a7c0a-65cf-43b3-9fb8-6045bc9320f0: human has been registered.
INFO:llama_agents.message_queues.simple - Consumer ControlPlaneServer-4d05cde9-f47c-44e3-b55b-6749eb4fe1fc: control_plane has been registered.
INFO:llama_agents.services.agent - goog_q1_23_q1_24_analyst_agent launch_local
INFO:llama_agents.message_queues.base - Publishing message to 'control_plane' with action 'new_task'
INFO:llama_agents.message_queues.simple - Launching message queue locally
INFO:llama_agents.message_queues.base - Publishing message to 'goog_q1_23_q1_24_analyst_agent' with action 'new_task'
INFO:llama_agents.message_queues.simple - Successfully published message

In [11]:
print(result)

The risk factors for Google in the first quarter of 2023 include fluctuations in revenues and margins, changes in monetization trends, variability in foreign exchange rates, fluctuations in capital expenditures, potential increases in expenses, uncertainties related to compensation expenses, fluctuations in other income (expense), variations in the effective tax rate, seasonal fluctuations in internet usage and advertiser expenditures, and exposure to regulatory scrutiny and legal proceedings.

In the first quarter of 2024, the risk factors include fluctuations in revenues due to changes in foreign currency exchange rates, pricing adjustments, general economic conditions, geopolitical events, regulations, new product and service launches, seasonality, and other external dynamics impacting advertiser, consumer, and enterprise spending.


In [12]:
query_str = "What was Google's revenue growth in q1 2024?"
result = launcher.launch_single(query_str)

INFO:llama_agents.message_queues.simple - Consumer AgentService-ff09d4da-2e19-482f-9a20-7f2ecd736d73: goog_q1_23_q1_24_analyst_agent has been registered.
INFO:llama_agents.message_queues.simple - Consumer ToolService-983c62e5-79b7-4f1e-b93f-e9cbadb4fc68: default_tool_service has been registered.
INFO:llama_agents.message_queues.simple - Consumer 2f1e87c7-2553-4903-b174-8ff1e8562d13: human has been registered.
INFO:llama_agents.message_queues.simple - Consumer ControlPlaneServer-4d05cde9-f47c-44e3-b55b-6749eb4fe1fc: control_plane has been registered.
INFO:llama_agents.services.agent - goog_q1_23_q1_24_analyst_agent launch_local
INFO:llama_agents.message_queues.base - Publishing message to 'control_plane' with action 'new_task'
INFO:llama_agents.message_queues.simple - Launching message queue locally
INFO:llama_agents.message_queues.base - Publishing message to 'goog_q1_23_q1_24_analyst_agent' with action 'new_task'
INFO:llama_agents.message_queues.simple - Successfully published message

In [13]:
print(result)

The revenue growth for Google in the first quarter of 2024 was $10,752 million.
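Note that the agent answered with an absolute change in dollars, not a growth rate. Turning it into a percentage needs the prior-year base. The sketch below assumes Alphabet's reported Q1 revenues of $69,787 million (2023) and $80,539 million (2024), whose difference matches the $10,752 million above; treat both figures as inputs to verify against the two filings rather than as output of this notebook. The `growth` helper is hypothetical.

```python
from typing import Tuple


def growth(previous: float, current: float) -> Tuple[float, float]:
    """Return (absolute change, percentage change) from previous to current."""
    absolute = current - previous
    return absolute, absolute / previous * 100


# Revenues in millions of USD (assumed inputs; check against the 10-Q filings).
q1_2023_revenue = 69_787
q1_2024_revenue = 80_539

absolute_change, percent_change = growth(q1_2023_revenue, q1_2024_revenue)
print(f"{absolute_change:,} million, {percent_change:.1f}%")
```

Asking the agent explicitly for "year-over-year revenue growth as a percentage" is one way to steer it toward the rate instead of the dollar difference.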
