## Setup

Set your OpenAI and LlamaCloud API keys, then apply `nest_asyncio` so that async code can run inside the notebook's already-running event loop.

In [1]:
import os
os.environ["OPENAI_API_KEY"]="openai_api_key"
os.environ["LLAMA_INDEX_API_KEY"]="llama_index_api_key"

In [2]:
import nest_asyncio

nest_asyncio.apply()
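
The key strings in the first cell are placeholders. As a minimal sketch, assuming you would rather not hardcode real keys in the notebook, a small helper can reuse a key that is already in the environment and prompt for it only when it is missing. The name `ensure_key` is hypothetical and not part of any library.

```python
import os
from getpass import getpass
from typing import Callable, MutableMapping


def ensure_key(
    name: str,
    env: MutableMapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return env[name], prompting for it once if it is missing or empty."""
    value = env.get(name, "").strip()
    if not value:
        # Hidden prompt so the key never appears in the notebook output.
        value = prompt(f"Enter {name}: ").strip()
        if not value:
            raise ValueError(f"{name} must not be empty")
        env[name] = value
    return value


# Usage in a notebook cell (uncomment to be prompted):
# ensure_key("OPENAI_API_KEY")
# ensure_key("LLAMA_INDEX_API_KEY")
```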

## SQL Database

The SQL database in this example is created in memory and contains a single table, `city_stats`, with three columns: the city name, the city's population, and the state the city is located in. The snippets below create the table and insert one row per city.

In [3]:
!pip install llama-index




In [4]:
!pip install llama-index-llms-anthropic
!pip install llama-index-core



In [5]:
from llama_index.core import SQLDatabase, Settings
from llama_index.llms.openai import OpenAI
#from llama_index.llms.anthropic import Anthropic
from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
)

Settings.llm = OpenAI("gpt-3.5-turbo")


engine = create_engine("sqlite:///:memory:", future=True)
metadata_obj = MetaData()

# create city SQL table
table_name = "city_stats"
city_stats_table = Table(
    table_name,
    metadata_obj,
    Column("city_name", String(16), primary_key=True),
    Column("population", Integer),
    Column("state", String(16), nullable=False),
)

metadata_obj.create_all(engine)

In [6]:
from sqlalchemy import insert

rows = [
    {"city_name": "New York City", "population": 8336000, "state": "New York"},
    {"city_name": "Los Angeles", "population": 3822000, "state": "California"},
    {"city_name": "Chicago", "population": 2665000, "state": "Illinois"},
    {"city_name": "Houston", "population": 2303000, "state": "Texas"},
    {"city_name": "Miami", "population": 449514, "state": "Florida"},
    {"city_name": "Seattle", "population": 749256, "state": "Washington"},
]
for row in rows:
    stmt = insert(city_stats_table).values(**row)
    with engine.begin() as connection:
        cursor = connection.execute(stmt)

with engine.connect() as connection:
    cursor = connection.exec_driver_sql("SELECT * FROM city_stats")
    print(cursor.fetchall())

[('New York City', 8336000, 'New York'), ('Los Angeles', 3822000, 'California'), ('Chicago', 2665000, 'Illinois'), ('Houston', 2303000, 'Texas'), ('Miami', 449514, 'Florida'), ('Seattle', 749256, 'Washington')]


Create a query engine on top of the SQL database.

In [7]:
from llama_index.core.query_engine import NLSQLTableQueryEngine

sql_database = SQLDatabase(engine, include_tables=["city_stats"])
sql_query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["city_stats"]
)
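
To make the text-to-SQL idea concrete, here is a minimal sketch that uses only the standard library's `sqlite3` module. It rebuilds the same `city_stats` table and runs one plausible SQL translation of the question "Which city has the highest population?". The SQL string is an assumption for illustration; the statement the LLM actually generates can differ from run to run.

```python
import sqlite3

rows = [
    ("New York City", 8336000, "New York"),
    ("Los Angeles", 3822000, "California"),
    ("Chicago", 2665000, "Illinois"),
    ("Houston", 2303000, "Texas"),
    ("Miami", 449514, "Florida"),
    ("Seattle", 749256, "Washington"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_stats ("
    "city_name VARCHAR(16) PRIMARY KEY, "
    "population INTEGER, "
    "state VARCHAR(16) NOT NULL)"
)
conn.executemany("INSERT INTO city_stats VALUES (?, ?, ?)", rows)

# Assumed translation of "Which city has the highest population?"
sql = "SELECT city_name FROM city_stats ORDER BY population DESC LIMIT 1"
top_city = conn.execute(sql).fetchone()[0]
print(top_city)
conn.close()
```

The query engine adds the natural-language layer on top of this: it asks the LLM to write the SQL, executes it, and then phrases the rows as an answer.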

In [8]:
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]
response = openai.embeddings.create(
    input="This is a test sentence.",
    model="text-embedding-ada-002"
)

print(response)

CreateEmbeddingResponse(data=[Embedding(embedding=[-0.0011391325388103724, -0.003206387162208557, 0.002380132209509611, -0.004501554183661938, -0.010328996926546097, 0.012922565452754498, -0.005491119809448719, -0.0029864837415516376, -0.007327961269766092, -0.03365817293524742, 0.014099695719778538, 0.03262333199381828, -0.010037948377430439, -0.0006289887242019176, 0.0022782650776207447, -0.003689851611852646, 0.03202829882502556, -0.0006471792585216463, 0.024538645520806313, -0.006260782480239868, 0.005086885765194893, 0.0016549356514587998, -0.009313560090959072, 0.013245952315628529, -0.02629787288606167, -0.015548471361398697, 0.009863318875432014, -0.022675933316349983, 0.01716540940105915, 0.00252565648406744, 0.0061217257753014565, -0.014423083513975143, -0.004886385519057512, -0.016428085044026375, 0.007237412501126528, 0.0007579394732601941, 0.0025353580713272095, -0.01888583041727543, 0.012728532776236534, -0.017385313287377357, 0.019066927954554558, 0.007243880536407232, 0

## LlamaCloud Index

Create an index and a query engine around the index you've created.

In [9]:
!pip uninstall -y llama-cloud
!pip install llama-cloud




In [10]:
from llama_index.indices.managed.llama_cloud import LlamaCloudIndex

index = LlamaCloudIndex(
    name="dailydoseofds-2025-03-12",
    project_name="Default",
    organization_id="32a9b69b-a901-41a8-8ff1-6807d15dcfed",
    api_key=os.environ["LLAMA_INDEX_API_KEY"]
)



llama_cloud_query_engine = index.as_query_engine()

Wrap each query engine in a query engine tool so the agent can call it.

In [11]:
from llama_index.core.tools import QueryEngineTool

sql_tool = QueryEngineTool.from_defaults(
    query_engine=sql_query_engine,
    description=(
        "Useful for translating a natural language query into a SQL query over"
        " a table containing: city_stats, containing the population/state of"
        " each city located in the USA."
    ),
    name="sql_tool"
)

cities = ["New York City", "Los Angeles", "Chicago", "Houston", "Miami", "Seattle"]
llama_cloud_tool = QueryEngineTool.from_defaults(
    query_engine=llama_cloud_query_engine,
    description=(
        "Useful for answering semantic questions about certain cities in the US."
    ),
    name="llama_cloud_tool"
)

## Creating an Agent Around the Query Engines

We'll create a workflow that acts as an agent around the two query engines. The workflow uses four custom events, in addition to the built-in `StartEvent` and `StopEvent`:
1. `InputEvent`: Signals that the chat history is ready to be sent to the LLM.
2. `GatherToolsEvent`: Carries all tool calls that the LLM has requested.
3. `ToolCallEvent`: An individual tool call. Several of these events can be in flight at the same time.
4. `ToolCallEventResult`: Carries the result of a single tool call.

The workflow consists of the following steps:
1. `prepare_chat()`: Appends the user message to the chat history and returns an `InputEvent`.
2. `chat()`: Sends the chat history and the available tools to the LLM, which decides which tools to call. Returns a `GatherToolsEvent` if there are tool calls; otherwise returns a `StopEvent` with the LLM's answer.
3. `dispatch_calls()`: Stores the number of tool calls in the context, then triggers a `ToolCallEvent` for each tool call in the `GatherToolsEvent` using `send_event()`.
4. `call_tool()`: Calls an individual tool. This step runs once per tool call. It wraps the tool output in a chat message and returns it in a `ToolCallEventResult`.
5. `gather()`: Collects the results of all tool calls using `collect_events()`. Once every tool call has finished, it appends the tool messages to the chat history and returns an `InputEvent`, which sends the workflow back to `chat()`.
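
The fan-out and fan-in part of these steps (`dispatch_calls()`, `call_tool()`, `gather()`) can be sketched with plain `asyncio`, without any workflow machinery. This is only an illustration of the pattern: `fake_sql_tool`, `fake_cloud_tool`, and `dispatch_and_gather` are hypothetical stand-ins, and the dictionaries merely imitate the shape of a tool chat message.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple


# Hypothetical stand-ins for the two query engine tools.
async def fake_sql_tool(input: str) -> str:
    await asyncio.sleep(0.01)  # simulate I/O latency
    return f"sql result for: {input}"


async def fake_cloud_tool(input: str) -> str:
    await asyncio.sleep(0.01)
    return f"cloud result for: {input}"


TOOLS: Dict[str, Callable[..., Awaitable[str]]] = {
    "sql_tool": fake_sql_tool,
    "llama_cloud_tool": fake_cloud_tool,
}


async def call_tool(name: str, kwargs: dict) -> dict:
    """Run one tool call and wrap the output as a tool message."""
    output = await TOOLS[name](**kwargs)
    return {"role": "tool", "name": name, "content": output}


async def dispatch_and_gather(tool_calls: List[Tuple[str, dict]]) -> List[dict]:
    """Fan out one coroutine per tool call, then wait for all of them."""
    pending = [call_tool(name, kwargs) for name, kwargs in tool_calls]
    # asyncio.gather runs the calls concurrently and keeps the input order.
    return await asyncio.gather(*pending)


messages = asyncio.run(
    dispatch_and_gather(
        [
            ("sql_tool", {"input": "population of Chicago"}),
            ("llama_cloud_tool", {"input": "history of Chicago"}),
        ]
    )
)
for message in messages:
    print(message["name"], "->", message["content"])
```

In the workflow, the same idea is expressed with events rather than a single `gather` call, which lets each tool call run as its own step.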

In [12]:
from typing import Dict, List, Any, Optional
from llama_index.core.workflow import Context


from llama_index.core.tools import BaseTool
from llama_index.core.llms import ChatMessage
from llama_index.core.llms.llm import ToolSelection, LLM
from llama_index.core.workflow import (
    Workflow,
    Event,
    StartEvent,
    StopEvent,
    step,
)
from llama_index.core.base.response.schema import Response
from llama_index.core.tools import FunctionTool


class InputEvent(Event):
    """Input event."""

class GatherToolsEvent(Event):
    """Gather Tools Event"""

    tool_calls: Any

class ToolCallEvent(Event):
    """Tool Call event"""

    tool_call: ToolSelection

class ToolCallEventResult(Event):
    """Tool call event result."""

    msg: ChatMessage

class RouterOutputAgentWorkflow(Workflow):
    """Custom router output agent workflow."""

    def __init__(self,
        tools: List[BaseTool],
        timeout: Optional[float] = 10.0,
        disable_validation: bool = False,
        verbose: bool = False,
        llm: Optional[LLM] = None,
        chat_history: Optional[List[ChatMessage]] = None,
    ):
        """Constructor."""

        super().__init__(timeout=timeout, disable_validation=disable_validation, verbose=verbose)

        self.tools: List[BaseTool] = tools
        self.tools_dict: Optional[Dict[str, BaseTool]] = {tool.metadata.name: tool for tool in self.tools}
        self.llm: LLM = llm or OpenAI(temperature=0, model="gpt-3.5-turbo")
        self.chat_history: List[ChatMessage] = chat_history or []


    def reset(self) -> None:
        """Resets Chat History"""

        self.chat_history = []

    @step()
    async def prepare_chat(self, ev: StartEvent) -> InputEvent:
        message = ev.get("message")
        if message is None:
            raise ValueError("'message' field is required.")

        # add msg to chat history
        chat_history = self.chat_history
        chat_history.append(ChatMessage(role="user", content=message))
        return InputEvent()

    @step()
    async def chat(self, ev: InputEvent) -> GatherToolsEvent | StopEvent:
        """Appends msg to chat history, then gets tool calls."""

        # Put msg into LLM with tools included
        chat_res = await self.llm.achat_with_tools(
            self.tools,
            chat_history=self.chat_history,
            verbose=self._verbose,
            allow_parallel_tool_calls=True
        )
        tool_calls = self.llm.get_tool_calls_from_response(chat_res, error_on_no_tool_call=False)

        ai_message = chat_res.message
        self.chat_history.append(ai_message)
        if self._verbose:
            print(f"Chat message: {ai_message.content}")

        # no tool calls, return chat message.
        if not tool_calls:
            return StopEvent(result=ai_message.content)

        return GatherToolsEvent(tool_calls=tool_calls)

    @step(pass_context=True)
    async def dispatch_calls(self, ctx: Context, ev: GatherToolsEvent) -> ToolCallEvent:
        """Dispatches calls."""

        tool_calls = ev.tool_calls
        await ctx.set("num_tool_calls", len(tool_calls))

        # trigger tool call events
        for tool_call in tool_calls:
            ctx.send_event(ToolCallEvent(tool_call=tool_call))

        return None

    @step()
    async def call_tool(self, ev: ToolCallEvent) -> ToolCallEventResult:
        """Calls tool."""

        tool_call = ev.tool_call

        # get tool ID and function call
        id_ = tool_call.tool_id

        if self._verbose:
            print(f"Calling function {tool_call.tool_name} with msg {tool_call.tool_kwargs}")

        # call function and put result into a chat message
        tool = self.tools_dict[tool_call.tool_name]
        output = await tool.acall(**tool_call.tool_kwargs)
        msg = ChatMessage(
            name=tool_call.tool_name,
            content=str(output),
            role="tool",
            additional_kwargs={
                "tool_call_id": id_,
                "name": tool_call.tool_name
            }
        )

        return ToolCallEventResult(msg=msg)

    @step(pass_context=True)
    async def gather(self, ctx: Context, ev: ToolCallEventResult) -> InputEvent | None:
        """Gathers tool calls."""
        # wait for all tool call events to finish.
        tool_events = ctx.collect_events(ev, [ToolCallEventResult] * await ctx.get("num_tool_calls"))
        if not tool_events:
            return None

        for tool_event in tool_events:
            # append tool call chat messages to history
            self.chat_history.append(tool_event.msg)

        # after all tool calls finish, pass input event back, restart agent loop
        return InputEvent()
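
The `gather()` step runs once per `ToolCallEventResult`, yet it only proceeds when all expected results have arrived. The sketch below imitates that buffering behavior in plain Python. `EventBuffer` and `ToolResult` are hypothetical names, and this is a simplified stand-in for the idea, not LlamaIndex's actual implementation of `collect_events()`.

```python
from typing import List, Optional, Type


class ToolResult:
    """Hypothetical stand-in for ToolCallEventResult."""

    def __init__(self, msg: str) -> None:
        self.msg = msg


class EventBuffer:
    """Buffers events until one event per expected type is available."""

    def __init__(self) -> None:
        self._buffer: List[object] = []

    def collect(self, ev: object, expected: List[Type]) -> Optional[List[object]]:
        self._buffer.append(ev)
        remaining = list(self._buffer)
        matched: List[object] = []
        for typ in expected:
            hit = next((e for e in remaining if isinstance(e, typ)), None)
            if hit is None:
                return None  # still waiting; keep everything buffered
            remaining.remove(hit)
            matched.append(hit)
        self._buffer = remaining  # consume the matched events
        return matched


buf = EventBuffer()
first, second = ToolResult("a"), ToolResult("b")
print(buf.collect(first, [ToolResult] * 2))  # None: only one of two arrived
done = buf.collect(second, [ToolResult] * 2)
print([ev.msg for ev in done])
```

This is why `gather()` returns `None` on early invocations: returning `None` from a step produces no event, so the workflow simply waits for the next result.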

Create the workflow instance.

In [13]:
wf = RouterOutputAgentWorkflow(tools=[sql_tool, llama_cloud_tool], verbose=True, timeout=120)

#### Visualize Workflow

In [14]:
!pip install llama-index-utils-workflow




In [15]:
#from llama_index.core.workflow import draw_all_possible_flows
from llama_index.utils.workflow import draw_all_possible_flows
draw_all_possible_flows(RouterOutputAgentWorkflow)

<class 'NoneType'>
<class '__main__.ToolCallEventResult'>
<class '__main__.GatherToolsEvent'>
<class 'llama_index.core.workflow.events.StopEvent'>
<class '__main__.ToolCallEvent'>
<class 'llama_index.core.workflow.events.StopEvent'>
<class '__main__.InputEvent'>
workflow_all_flows.html


In [16]:
# Download the workflow HTML file (Google Colab only)
import os
print(os.getcwd())
from google.colab import files
files.download("/content/workflow_all_flows.html")


/content



## Example Queries

In [17]:
from IPython.display import display, Markdown

result = await wf.run(message="Which city has the highest population?")
display(Markdown(result))

Running step prepare_chat
Step prepare_chat produced event InputEvent
Running step chat
Chat message: None
Step chat produced event GatherToolsEvent
Running step dispatch_calls
Step dispatch_calls produced no event
Running step call_tool
Calling function sql_tool with msg {'input': 'SELECT city FROM city_stats ORDER BY population DESC LIMIT 1'}
Step call_tool produced event ToolCallEventResult
Running step gather
Step gather produced event InputEvent
Running step chat
Chat message: New York City has the highest population among all cities in the USA.
Step chat produced event StopEvent


New York City has the highest population among all cities in the USA.

In [18]:
result = await wf.run(message="What state is Houston located in?")
display(Markdown(result))

Running step prepare_chat
Step prepare_chat produced event InputEvent
Running step chat
Chat message: None
Step chat produced event GatherToolsEvent
Running step dispatch_calls
Step dispatch_calls produced no event
Running step call_tool
Calling function llama_cloud_tool with msg {'input': 'What state is Houston located in?'}
Step call_tool produced event ToolCallEventResult
Running step gather
Step gather produced event InputEvent
Running step chat
Chat message: Houston is located in the state of Texas.
Step chat produced event StopEvent


Houston is located in the state of Texas.

In [19]:
result = await wf.run(message="Where is the Space Needle located?")
display(Markdown(result))

Running step prepare_chat
Step prepare_chat produced event InputEvent
Running step chat
Chat message: None
Step chat produced event GatherToolsEvent
Running step dispatch_calls
Step dispatch_calls produced no event
Running step call_tool
Calling function llama_cloud_tool with msg {'input': 'Where is the Space Needle located?'}
Step call_tool produced event ToolCallEventResult
Running step gather
Step gather produced event InputEvent
Running step chat
Chat message: The Space Needle is located in Seattle.
Step chat produced event StopEvent


The Space Needle is located in Seattle.

In [20]:
result = await wf.run(message="List all of the places to visit in Miami.")
display(Markdown(result))

Running step prepare_chat
Step prepare_chat produced event InputEvent
Running step chat
Chat message: None
Step chat produced event GatherToolsEvent
Running step dispatch_calls
Step dispatch_calls produced no event
Running step call_tool
Calling function llama_cloud_tool with msg {'input': 'List all of the places to visit in Miami'}
Step call_tool produced event ToolCallEventResult
Running step gather
Step gather produced event InputEvent
Running step chat
Chat message: Here is a list of places to visit in Miami:
1. South Beach
2. Lincoln Road
3. Bayside Marketplace
4. Downtown Miami
5. Brickell City Centre
6. Art Deco District in Miami Beach
7. Bayfront Park
8. Museum Park
9. Tropical Park
10. Peacock Park
11. Virginia Key
12. Watson Island
13. Zoo Miami
14. Jungle Island
15. Miami Seaquarium
16. Monkey Jungle
17. Coral Castle
18. Charles Deering Estate
19. Fairchild Tropical Botanic Garden
20. Key Biscayne
Step chat produced event StopEvent


Here is a list of places to visit in Miami:
1. South Beach
2. Lincoln Road
3. Bayside Marketplace
4. Downtown Miami
5. Brickell City Centre
6. Art Deco District in Miami Beach
7. Bayfront Park
8. Museum Park
9. Tropical Park
10. Peacock Park
11. Virginia Key
12. Watson Island
13. Zoo Miami
14. Jungle Island
15. Miami Seaquarium
16. Monkey Jungle
17. Coral Castle
18. Charles Deering Estate
19. Fairchild Tropical Botanic Garden
20. Key Biscayne

In [21]:
result = await wf.run(message="How do people in Chicago get around?")
display(Markdown(result))

Running step prepare_chat
Step prepare_chat produced event InputEvent
Running step chat
Chat message: None
Step chat produced event GatherToolsEvent
Running step dispatch_calls
Step dispatch_calls produced no event
Running step call_tool
Calling function llama_cloud_tool with msg {'input': 'How do people in Chicago get around?'}
Step call_tool produced event ToolCallEventResult
Running step gather
Step gather produced event InputEvent
Running step chat
Chat message: People in Chicago get around using various transportation options such as public transit systems like the Chicago Transit Authority (CTA) buses and trains, Metra commuter rail, and Pace buses. Additionally, residents can use bike-sharing systems like Divvy and electric scooter-sharing programs. Chicago also has a well-connected network of expressways and highways for commuters who prefer to drive. The city is a major transportation hub with access to passenger and freight rail services, as well as two major airports for air travel.

People in Chicago get around using various transportation options such as public transit systems like the Chicago Transit Authority (CTA) buses and trains, Metra commuter rail, and Pace buses. Additionally, residents can use bike-sharing systems like Divvy and electric scooter-sharing programs. Chicago also has a well-connected network of expressways and highways for commuters who prefer to drive. The city is a major transportation hub with access to passenger and freight rail services, as well as two major airports for air travel.

In [22]:
result = await wf.run(message="What is the historical name of Los Angeles?")
display(Markdown(result))

Running step prepare_chat
Step prepare_chat produced event InputEvent
Running step chat
Chat message: None
Step chat produced event GatherToolsEvent
Running step dispatch_calls
Step dispatch_calls produced no event
Running step call_tool
Calling function llama_cloud_tool with msg {'input': 'What is the historical name of Los Angeles?'}
Step call_tool produced event ToolCallEventResult
Running step gather
Step gather produced event InputEvent
Running step chat
Chat message: The historical name of Los Angeles is El Pueblo de Nuestra Señora la Reina de los Ángeles.
Step chat produced event StopEvent


The historical name of Los Angeles is El Pueblo de Nuestra Señora la Reina de los Ángeles.
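
The query cells above use top-level `await`, which works in a notebook. In a plain Python script, one option is to wrap the calls in a coroutine and start it with `asyncio.run()`. The sketch below assumes only that the workflow exposes an async `run(message=...)` method, as used above; `EchoWorkflow` is a hypothetical stand-in so the example runs without API keys.

```python
import asyncio
from typing import List


class EchoWorkflow:
    """Hypothetical stand-in with the same async run(message=...) shape."""

    async def run(self, message: str) -> str:
        await asyncio.sleep(0)  # yield to the event loop, like a real call
        return f"echo: {message}"


async def main() -> List[str]:
    wf = EchoWorkflow()
    questions = [
        "Which city has the highest population?",
        "Where is the Space Needle located?",
    ]
    answers = []
    # Run the questions one after another; the workflow in this notebook
    # keeps a shared chat history, so sequential runs keep it ordered.
    for question in questions:
        answers.append(await wf.run(message=question))
    return answers


answers = asyncio.run(main())
for answer in answers:
    print(answer)
```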