# The LlamaIndex Framework

## Introduction to LlamaIndex

LlamaIndex is a complete toolkit for creating LLM-powered agents over your data using indexes and workflows.

LlamaIndex has some key benefits over smolagents:
- **Clear Workflow System**: Workflows help break down how agents should make decisions step by step using an event-driven and async-first syntax. This helps us clearly compose and organize our logic.
- **Advanced Document Parsing with LlamaParse**: LlamaParse is LlamaIndex's official tool for PDF parsing, available as a managed API.
- **Many Ready-to-Use Components**: LlamaIndex has been around for a while, so it works with lots of other frameworks.
- **LlamaHub**: A registry of hundreds of these components, agents, and tools that we can use within LlamaIndex.

## Introduction to LlamaHub

LlamaHub is a registry of hundreds of integrations, agents, and tools that we can use within LlamaIndex.

In [None]:
!pip install llama-index-llms-huggingface-api llama-index-embeddings-huggingface

Once installed, we can use the HuggingFace Inference API.

In [None]:
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
import os
from dotenv import load_dotenv

load_dotenv()

hf_token = os.getenv("HF_TOKEN")

llm = HuggingFaceInferenceAPI(
    model_name="Qwen/Qwen2.5-Coder-32B-Instruct",
    temperature=0.7,
    max_tokens=100,
    token=hf_token
)

In [None]:
response = llm.complete("Hello, how are you?")
print(response)

## Components in LlamaIndex

Of all the components in LlamaIndex, we will focus on the `QueryEngine` component, which can be used as a RAG tool for an agent.

In [None]:
!pip install -qU llama-index datasets llama-index-callbacks-arize-phoenix arize-phoenix llama-index-vector-stores-chroma llama-index-llms-huggingface-api llama-index-embeddings-huggingface

### Creating a RAG Pipeline using components

There are five key stages within RAG:
- **Loading**: this refers to getting our data from where it lives - whether it's text files, PDFs, another website, a database, or an API - into our workflow.
- **Indexing**: this means creating a data structure that allows for querying the data. For LLMs, this nearly always means creating vector embeddings. Indexing can also refer to numerous other metadata strategies that make it easy to accurately find contextually relevant data based on properties.
- **Storing**: once our data is indexed, we will want to store our index, as well as other metadata, to avoid having to re-index it.
- **Querying**: for any given indexing strategy, there are many ways we can utilize LLMs and LlamaIndex data structures to query, including sub-queries, multi-step queries and hybrid strategies.
- **Evaluation**: a critical step in any flow is checking how effective it is relative to other strategies, or when we make changes.

#### Setting up the persona database

We will be using personas from the [dvilasuero/finepersonas-v0.1-tiny](https://huggingface.co/datasets/dvilasuero/finepersonas-v0.1-tiny) dataset. This dataset contains 5K personas that will be attending the party!

In [None]:
from datasets import load_dataset
from pathlib import Path

dataset = load_dataset("dvilasuero/finepersonas-v0.1-tiny", split="train")

Path('data').mkdir(parents=True, exist_ok=True)
for i, persona in enumerate(dataset):
    with open(Path('data') / f"persona_{i}.txt", 'w') as f:
        f.write(persona['persona'])

Now we have a local directory with all the personas to use.

#### Loading and embedding documents

There are three ways to load data into LlamaIndex:
- `SimpleDirectoryReader`: A built-in loader for various file types from a local directory.
- `LlamaParse`: LlamaIndex's official tool for PDF parsing, available as a managed API.
- `LlamaHub`: A registry of hundreds of data-loading libraries to ingest data from any source.

The simplest way to load data is with `SimpleDirectoryReader`.

In [None]:
from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader('data')
documents = reader.load_data()
len(documents)

Now we have a list of `Document` objects. After loading the documents, we need to break them into smaller pieces called `Node` objects. A `Node` is a chunk of text from the original document that is easier for LLMs to work with, while it still has references to the original `Document` object.

Then, we can use the `IngestionPipeline` to create nodes from the documents and prepare them for the `QueryEngine`. We will use the `SentenceSplitter` to split the documents into smaller chunks and the `HuggingFaceEmbedding` to embed the chunks.

In [None]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.ingestion import IngestionPipeline

# Create the pipeline with transformations
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),
        HuggingFaceEmbedding(model_name='BAAI/bge-small-en-v1.5')
    ]
)

# Run the pipeline sync or async
nodes = await pipeline.arun(documents=documents[:10])
len(nodes)

We have created a list of `Node` objects.
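
To make the idea of a `Node` more concrete, here is a minimal standard-library sketch of sentence-based chunking. It is not how `SentenceSplitter` is implemented: `ToyDocument`, `ToyNode`, `split_into_nodes`, and the character budget `max_chars` are hypothetical names chosen for illustration. The point it shows is that each chunk keeps a reference back to the document it came from.

```python
import re
from dataclasses import dataclass


@dataclass
class ToyDocument:  # hypothetical stand-in for a loaded document
    doc_id: str
    text: str


@dataclass
class ToyNode:  # hypothetical stand-in for a chunk of a document
    text: str
    ref_doc_id: str  # link back to the source document


def split_into_nodes(doc: ToyDocument, max_chars: int = 80) -> list[ToyNode]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", doc.text.strip()) if s]
    nodes, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # The current chunk is full: emit it and start a new one.
            nodes.append(ToyNode(text=current, ref_doc_id=doc.doc_id))
            current = sentence
        else:
            current = candidate
    if current:
        nodes.append(ToyNode(text=current, ref_doc_id=doc.doc_id))
    return nodes


toy_doc = ToyDocument(
    doc_id="persona_0",
    text="A historian who studies trade routes. She travels often. She writes about her trips.",
)
toy_nodes = split_into_nodes(toy_doc, max_chars=40)
for node in toy_nodes:
    print(node.ref_doc_id, "->", node.text)
```

Splitting on sentence boundaries rather than at a fixed character offset keeps each chunk readable on its own, which is the motivation for a sentence-aware splitter.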

#### Storing and indexing documents

After creating our `Node` objects we need to index them to make them searchable, but before that, we need a place to store our data. Since we are using an ingestion pipeline, we can directly attach a vector store to the pipeline to populate it. In this case, we will use `Chroma` to store our documents.

We will run the pipeline again with the vector store attached. The `IngestionPipeline` caches the operations so this should be fast.

In [None]:
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore

# Create a vector store
db = chromadb.PersistentClient(path='./chroma_db')
chroma_collection = db.get_or_create_collection('agent')
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Attach the vector store to the pipeline
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(),
        HuggingFaceEmbedding(model_name='BAAI/bge-small-en-v1.5')
    ],
    vector_store=vector_store
)

nodes = await pipeline.arun(documents=documents[:10])
len(nodes)

Next, we can create a `VectorStoreIndex` by passing the vector store and the embedding model to the `from_vector_store()` method, and then use the index to query the documents. We must use the same embedding model that was used during ingestion so that query and document embeddings are consistent.

In [None]:
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(model_name='BAAI/bge-small-en-v1.5')
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    embed_model=embed_model
)

All information is automatically persisted within the `ChromaVectorStore` object and the passed directory path.

#### Querying a VectorStoreIndex with prompts and LLMs

Before we can query our index, we need to convert it to a query interface. The most common conversion options are:
- `as_retriever`: For basic document retrieval, returning a list of `NodeWithScore` objects with similarity scores
- `as_query_engine`: For single question-answer interactions, returning a written response
- `as_chat_engine`: For conversational interactions that maintain memory across multiple messages, returning a written response using chat history and indexed context
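
The retrieval step behind these interfaces can be sketched with the standard library alone. The example below is an illustration under stated assumptions, not LlamaIndex code: `toy_embed` replaces a real embedding model with a bag-of-words count vector, and `toy_retrieve` is a hypothetical helper that returns chunks paired with cosine-similarity scores, loosely mirroring the scored results a retriever hands back.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[tuple[str, float]]:
    """Return the top_k chunks with their similarity scores, best first."""
    query_vec = toy_embed(query)
    scored = [(chunk, cosine(query_vec, toy_embed(chunk))) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


chunks = [
    "a travel writer who documents journeys across asia",
    "a software engineer interested in ai and technology",
    "a chef who teaches regional cooking classes",
]
hits = toy_retrieve("who is interested in ai and technology", chunks, top_k=2)
for text, score in hits:
    print(f"{score:.2f}  {text}")
```

A query engine adds one more step on top of this: the retrieved chunks are handed to an LLM, which writes the final answer.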


We will focus on the query engine since it is more common for agent-like interactions. We also pass in an LLM to the query engine to use for the response.

In [None]:
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
import nest_asyncio

# required to run the query engine
nest_asyncio.apply()

llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
query_engine = index.as_query_engine(
    llm=llm,
    response_mode='tree_summarize'
)

In [None]:
response = query_engine.query(
    "Respond using a persona that describes author and travel experiences?"
)
response

#### Response processing

Under the hood, the query engine doesn't only use the LLM to answer the question but also uses a `ResponseSynthesizer` as a strategy to process the response. This is customizable and there are three main options for the `response_mode`:
- `refine` - create and refine an answer by sequentially going through each retrieved text chunk. This makes a separate LLM call per Node/retrieved chunk.
- `compact` - (default) similar to refining but concatenating the chunks beforehand, resulting in fewer LLM calls.
- `tree_summarize` - create a detailed answer by going through each retrieved text chunk and creating a tree structure of the answer.
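
The difference between the three modes is mostly a question of how many LLM calls are made and what each call sees. The sketch below imitates that control flow with a stub; it is a simplification rather than the library's implementation. `CountingLLM`, the character budget `max_chars`, and `fan_out` are assumptions introduced for illustration, with the character budget standing in for whatever prompt-size limit a real synthesizer applies.

```python
class CountingLLM:
    """Hypothetical stand-in for an LLM that only records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return f"answer#{self.calls}"


def refine(llm, question, chunks):
    """One call per chunk; each call also sees the previous answer."""
    answer = ""
    for chunk in chunks:
        answer = llm.complete(f"Q: {question}\nContext: {chunk}\nPrevious answer: {answer}")
    return answer


def compact(llm, question, chunks, max_chars=60):
    """Pack chunks into as few prompts as fit the budget, then refine over those."""
    packed, current = [], ""
    for chunk in chunks:
        candidate = f"{current}\n{chunk}".strip()
        if current and len(candidate) > max_chars:
            packed.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        packed.append(current)
    return refine(llm, question, packed)


def tree_summarize(llm, question, chunks, fan_out=2):
    """Summarize groups of chunks, then summarize the summaries until one remains."""
    if not chunks:
        return ""
    level = list(chunks)
    while True:
        level = [
            llm.complete(f"Q: {question}\nContext: " + "\n".join(level[i:i + fan_out]))
            for i in range(0, len(level), fan_out)
        ]
        if len(level) == 1:
            return level[0]


toy_chunks = [
    "chunk A about travel",
    "chunk B about novels",
    "chunk C about trains",
    "chunk D about hiking",
]
for mode in (refine, compact, tree_summarize):
    stub = CountingLLM()
    mode(stub, "Who travels?", toy_chunks)
    print(f"{mode.__name__}: {stub.calls} LLM calls")
```

With these four 20-character chunks, the stub is called four times for `refine`, twice for `compact`, and three times for `tree_summarize`, which makes the cost trade-off between the modes visible.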

#### Evaluation and observability

LlamaIndex provides **built-in evaluation tools to assess response quality**. These evaluators leverage LLMs to analyze responses across different dimensions:
- `FaithfulnessEvaluator` - evaluates the faithfulness of the answer by checking if the answer is supported by the context.
- `AnswerRelevancyEvaluator` - evaluates the relevance of the answer by checking if the answer is relevant to the question.
- `CorrectnessEvaluator` - evaluates the correctness of the answer by checking if the answer is correct.

In [None]:
from llama_index.core.evaluation import FaithfulnessEvaluator

# query index
evaluator = FaithfulnessEvaluator(llm=llm)
eval_result = evaluator.evaluate_response(response=response)
eval_result.passing

Even without direct evaluation, we can gain insights into how our system is performing through observability. This is useful when we build more complex workflows and want to understand how each component is performing.

If one of these LLM-based evaluators does not give enough context, we can check the response using the Arize Phoenix tool, after creating an account at [LlamaTrace](https://llamatrace.com/login) and generating an API key:

In [None]:
import llama_index
import os
from google.colab import userdata

# The secret name must be passed as a string
os.environ['OTEL_EXPORTER_OTLP_HEADERS'] = f"api_key={userdata.get('PHOENIX_API_KEY')}"

llama_index.set_global_handler(
    'arize_phoenix',
    endpoint='https://llamatrace.com/v1/traces'
)

Now we can query the index and see the response in the Arize Phoenix tool:

In [None]:
response = query_engine.query(
    "What is the name of someone who is interested in AI and technology?"
)
response

## Using Tools in LlamaIndex

Clear tool interfaces are easier for LLMs to use.

There are four main types of tools in LlamaIndex:
- `FunctionTool` - convert any Python function into a tool that an agent can use.
- `QueryEngineTool` - let agents use query engines. Since agents are built on query engines, they can also use other agents as tools.
- `Toolspecs` - sets of tools created by the community, which often include tools for specific services like Gmail.
- `UtilityTools` - special tools that help handle large amounts of data from other tools.

In [None]:
!pip install -qU llama-index llama-index-vector-stores-chroma llama-index-llms-huggingface-api llama-index-embeddings-huggingface llama-index-tools-google

### Creating a FunctionTool

A `FunctionTool` provides a simple way to wrap any Python function and make it available to an agent. We can pass either a synchronous or asynchronous function to the tool, along with optional `name` and `description` parameters. The `name` and `description` are particularly important as they help the agent understand when and how to use the tool effectively.

In [None]:
from llama_index.core.tools import FunctionTool

def get_weather(location: str) -> str:
    """Useful for getting the weather for a given location."""
    print(f"Getting weather for {location}")
    return f"The weather in {location} is sunny."


tool = FunctionTool.from_defaults(
    get_weather,
    name="my_weather_tool",
    description="Useful for getting the weather for a given location."
)

tool.call("Houston")
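
To see why the `name`, `description`, and type annotations matter, it helps to look at what can be read off a plain function. The sketch below uses only `inspect` from the standard library; `describe_function` and `get_forecast` are hypothetical names, and the resulting dictionary is an illustrative shape rather than the schema `FunctionTool` actually produces.

```python
import inspect


def describe_function(fn, name=None, description=None) -> dict:
    """Build a minimal tool description from a function's signature and docstring."""
    signature = inspect.signature(fn)
    parameters = {
        # Use the annotation's class name when available, else its string form.
        param_name: getattr(param.annotation, "__name__", str(param.annotation))
        for param_name, param in signature.parameters.items()
    }
    return {
        "name": name or fn.__name__,
        "description": description or inspect.getdoc(fn) or "",
        "parameters": parameters,
    }


def get_forecast(location: str, days: int = 1) -> str:
    """Useful for getting a weather forecast for a given location."""
    return f"The forecast for {location} over {days} day(s) is sunny."


print(describe_function(get_forecast))
print(describe_function(get_forecast, name="my_forecast_tool"))
```

When no explicit `name` or `description` is given, the function name and docstring are the only hints available, so a vague docstring leaves the agent with little to go on when choosing a tool.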

### Creating a QueryEngineTool

The `QueryEngine` we defined in the previous section can be easily transformed into a tool using the `QueryEngineTool` class.

In [None]:
import chromadb

from llama_index.core import VectorStoreIndex
from llama_index.core.tools import QueryEngineTool
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.chroma import ChromaVectorStore


# Load the chroma db created in previous section
db = chromadb.PersistentClient(path='./chroma_db')
chroma_collection = db.get_or_create_collection('agent')
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

embed_model = HuggingFaceEmbedding(model_name='BAAI/bge-small-en-v1.5')
llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")

index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    embed_model=embed_model
)

query_engine = index.as_query_engine(llm=llm)

tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name='personas',
    description="Descriptions for various types of personas",
)

In [None]:
await tool.acall(
    "Responds about research on the impact of AI on the future of work and society?"
)

### Creating Toolspecs

A `ToolSpec` is a collection of tools that work together, combining related tools for a specific purpose.

For example, we can load the `ToolSpec` from Google:

In [None]:
from llama_index.tools.google import GmailToolSpec

tool_spec = GmailToolSpec()
tool_spec_list = tool_spec.to_tool_list()
tool_spec_list

To get a more detailed view of the tools, we can take a look at the `metadata` of each tool:

In [None]:
[(tool.metadata.name, tool.metadata.description) for tool in tool_spec_list]

#### Model Context Protocol (MCP) in LlamaIndex

LlamaIndex also allows using MCP tools through a `ToolSpec` on the LlamaHub:

In [None]:
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec

# Assume there is a mcp server running on 127.0.0.1:8000
mcp_client = BasicMCPClient("http://127.0.0.1:8000/sse")
mcp_tool_spec = McpToolSpec(client=mcp_client)

# async
tools = await mcp_tool_spec.to_tool_list_async()

In [None]:
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI

agent = FunctionAgent(
    name="Agent",
    description="Some description",
    llm=OpenAI(model="gpt-4o"),
    tools=tools,
    system_prompt="You are a helpful assistant.",
)

resp = await agent.run("What is the weather in Tokyo?")

## Using Agents in LlamaIndex

LlamaIndex supports three main types of reasoning agents:
- **Function Calling Agents** - these work with AI models that can call specific functions.
- **ReAct Agents** - these work with any LLM that exposes a chat or text completion endpoint and can deal with complex reasoning tasks.
- **Advanced Custom Agents** - these use more complex methods to deal with more complex tasks and workflows.

In [None]:
!pip install llama-index llama-index-vector-stores-chroma llama-index-llms-huggingface-api llama-index-embeddings-huggingface

### Initializing Agents

To create an agent, we start by providing it with a set of functions/tools that define its capabilities.

ReAct agents are also good at complex reasoning tasks and can work with any LLM that has chat or text completion capabilities. They are more verbose, and show the reasoning behind certain actions that they take.

We start by initializing an agent and using the basic `AgentWorkflow` class to create an agent.

In [None]:
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.core.agent.workflow import AgentWorkflow, ToolCallResult, AgentStream

# define sample tools
# type annotations, function names, and docstrings, are all included in parsed schemas
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

def subtract(a: int, b: int) -> int:
    """Subtract two numbers"""
    return a - b


def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b


def divide(a: int, b: int) -> float:
    """Divide two numbers"""
    return a / b

# initialize llm
llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")

# initialize agent
agent = AgentWorkflow.from_tools_or_functions(
    llm=llm,
    tools_or_functions=[add, subtract, multiply, divide],
    system_prompt="You are a math agent that can add, subtract, multiply, and divide numbers using provided tools.",
)

Then, we can run the agent and get the response and reasoning behind the tool calls.

In [None]:
handler = agent.run("What is (2 + 2) * 2?")

async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print()
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):
        # Showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp

**Agents are stateless by default**, and remembering past interactions is opt-in using a `Context` object.

This may be useful if we want to use an agent that needs to remember previous interactions, like a chatbot that maintains context across multiple messages or a task manager that needs to track progress over time.

In [None]:
# remembering state
from llama_index.core.workflow import Context

ctx = Context(agent)

response = await agent.run("My name is Bin.", ctx=ctx)
response = await agent.run("What is my name?", ctx=ctx)
response

### Creating RAG Agents with QueryEngineTools

**Agentic RAG is a powerful way to use agents to answer questions about our data**. We can pass various tools to an agent to help it answer questions. However, instead of automatically answering the question from the documents, the agent can decide to use any other tool to answer it.

We will wrap the `QueryEngine` as a tool for an agent. When doing so, we need to define a name and a description. The LLM will use this information to use the tool correctly.

In [None]:
import chromadb

from llama_index.core import VectorStoreIndex
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core.tools import QueryEngineTool
from llama_index.core.agent.workflow import AgentWorkflow
from llama_index.vector_stores.chroma import ChromaVectorStore

# Create a vector store
db = chromadb.PersistentClient(path='./chroma_db')
chroma_collection = db.get_or_create_collection('agent')
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

embed_model = HuggingFaceEmbedding(model_name='BAAI/bge-small-en-v1.5')
llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    embed_model=embed_model
)

# Create a query engine tool
query_engine = index.as_query_engine(llm=llm)
query_engine_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name='personas',
    description="Descriptions for various types of personas",
    return_direct=False
)


# Create a RAG agent
query_engine_agent = AgentWorkflow.from_tools_or_functions(
    llm=llm,
    tools_or_functions=[query_engine_tool],
    system_prompt="You are a helpful assistant that has access to a database containing persona descriptions."
)

Then we can get the response and reasoning behind the tool calls.

In [None]:
handler = query_engine_agent.run(
    "Search the database for 'science fiction' and return some persona descriptions."
)

async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print()
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):
        # Showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp

### Creating Multi-Agent Systems

The `AgentWorkflow` class also directly supports multi-agent systems. By giving each agent a name and description, the system maintains a single active speaker, with each agent having the ability to hand off to another agent.

By narrowing the scope of each agent, we can help increase their general accuracy when responding to user messages.

**Agents in LlamaIndex can also directly be used as tools** for other agents, for more complex and custom scenarios.

In [None]:
from llama_index.core.agent.workflow import AgentWorkflow, ReActAgent

# Define some tools
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b

In [None]:
# Create agent configs
# NOTE: we can use FunctionAgent or ReActAgent here.
# FunctionAgent works for LLMs with a function calling API.
# ReActAgent works for any LLM.
calculator_agent = ReActAgent(
    name="calculator",
    description="Performs basic arithmetic operations",
    system_prompt="You are a calculator assistant. Use your tools for any math operation.",
    tools=[add, subtract],
    llm=llm,
)

query_agent = ReActAgent(
    name="info_lookup",
    description="Looks up information about XYZ",
    system_prompt="Use your tool to query a RAG system to answer information about XYZ",
    tools=[query_engine_tool],
    llm=llm,
)

# Create and run the workflow
agent = AgentWorkflow(
    agents=[calculator_agent, query_agent],
    root_agent='calculator'
)

In [None]:
# Keep the handler (do not await yet) so the events can be streamed in the next cell
handler = agent.run(user_msg="Can you add 5 and 3?")

In [None]:
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp

## Creating Agentic Workflows in LlamaIndex

A workflow in LlamaIndex provides a structured way to organize our code into sequential and manageable steps.

Such a workflow is created by defining `Steps` which are triggered by `Events`, and themselves emit `Events` to trigger further steps.

Benefits of workflows:
- clear organization of code into discrete steps
- event-driven architecture for flexible control flow
- type-safe communication between steps
- built-in state management
- support for both simple and complex agent interactions

In [None]:
!pip install -qU llama-index llama-index-vector-stores-chroma llama-index-utils-workflow llama-index-llms-huggingface-api pyvis

### Basic Workflow Creation

We can create a single-step workflow by defining a class that inherits from `Workflow` and decorating our functions with `@step`. We will also need to add `StartEvent` and `StopEvent`, which are special events that are used to indicate the start and end of the workflow.

In [None]:
from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step

class MyWorkflow(Workflow):
    @step
    async def my_step(self, ev: StartEvent) -> StopEvent:
        # do something here
        return StopEvent(result="Hello, world!")


w = MyWorkflow(timeout=10, verbose=False)
result = await w.run()
result

### Connecting Multiple Steps

To connect multiple steps, we **create custom events that carry data between steps**. To do so, we need to add an `Event` that is passed between the steps and transfers the output of the first step to the second step.

In [None]:
from llama_index.core.workflow import Event

class ProcessingEvent(Event):
    intermediate_result: str


class MultiStepWorkflow(Workflow):
    @step
    async def step_one(self, ev: StartEvent) -> ProcessingEvent:
        # Process initial data
        return ProcessingEvent(intermediate_result="Step 1 complete")

    @step
    async def step_two(self, ev: ProcessingEvent) -> StopEvent:
        # Use the intermediate result
        final_result = f"Finished processing: {ev.intermediate_result}"
        return StopEvent(result=final_result)


w = MultiStepWorkflow(timeout=10, verbose=False)
result = await w.run()
result

The type hinting is important here, as it ensures that the workflow is executed correctly.
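
One way to picture why the annotations matter is to route events purely by the type each step declares for its `ev` argument. The following is a toy dispatcher written with the standard library, not the `Workflow` engine itself: the event classes `Start`, `Processed`, and `Stop` and the helpers `build_routes` and `run_toy_workflow` are hypothetical names.

```python
import asyncio
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Start:
    topic: str


@dataclass
class Processed:
    text: str


@dataclass
class Stop:
    result: str


async def step_one(ev: Start) -> Processed:
    return Processed(text=f"notes on {ev.topic}")


async def step_two(ev: Processed) -> Stop:
    return Stop(result=f"Finished processing: {ev.text}")


def build_routes(steps) -> dict:
    """Map each accepted event type to the step whose `ev` annotation names it."""
    return {get_type_hints(fn)["ev"]: fn for fn in steps}


async def run_toy_workflow(steps, start_event):
    """Keep dispatching the latest event to its step until a Stop event appears."""
    routes = build_routes(steps)
    event = start_event
    while not isinstance(event, Stop):
        event = await routes[type(event)](event)
    return event.result


# In a notebook cell, use `await run_toy_workflow(...)` instead of asyncio.run.
toy_result = asyncio.run(run_toy_workflow([step_one, step_two], Start(topic="llamas")))
print(toy_result)
```

In this sketch, removing or mistyping an annotation leaves an event with no step to receive it, which is the kind of wiring error the type hints guard against.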

### Loops and Branches

The type hinting allows us to create branches, loops, and joins to facilitate more complex workflows. We can create a loop by using the union operator `|`.

In [None]:
from llama_index.core.workflow import Event
import random


class ProcessingEvent(Event):
    intermediate_result: str

class LoopEvent(Event):
    loop_output: str


class MultiStepWorkflow(Workflow):
    @step
    async def step_one(self, ev: StartEvent | LoopEvent) -> ProcessingEvent | LoopEvent:
        if random.randint(0, 1) == 0:
            print("Bad thing happened")
            return LoopEvent(loop_output="Back to step one.")
        else:
            print("Good thing happened")
            return ProcessingEvent(intermediate_result="First step complete.")

    @step
    async def step_two(self, ev: ProcessingEvent) -> StopEvent:
        # Use the intermediate result
        final_result = f"Finished processing: {ev.intermediate_result}"
        return StopEvent(result=final_result)


w = MultiStepWorkflow(verbose=False)
result = await w.run()
result

### Drawing Workflows

In [None]:
from llama_index.utils.workflow import draw_all_possible_flows

draw_all_possible_flows(w)

### State Management

State management is useful when we want to keep track of the state of the workflow, so that every step has access to the same state. Instead of passing the event information between steps, we can use the `Context` type hint to pass information between steps. This may be useful for long running workflows, where we want to store information between steps.

In [None]:
from llama_index.core.workflow import Context, Event, StartEvent, StopEvent, Workflow, step

class ProcessingEvent(Event):
    intermediate_result: str


class MultiStepWorkflow(Workflow):
    @step
    async def step_one(self, ev: StartEvent, ctx: Context) -> ProcessingEvent:
        # Process initial data
        await ctx.set("query", "What is the capital of France?")
        return ProcessingEvent(intermediate_result="Step 1 complete")

    @step
    async def step_two(self, ev: ProcessingEvent, ctx: Context) -> StopEvent:
        # Use the intermediate result
        query = await ctx.get("query")
        print(f"Query: {query}")

        final_result = f"Finished processing: {ev.intermediate_result}"
        return StopEvent(result=final_result)


w = MultiStepWorkflow(timeout=10, verbose=False)
result = await w.run()
result
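
Conceptually, the shared state behaves like an asynchronous key-value store that every step can reach. The sketch below shows that idea with the standard library only. `ToyContext`, `first_step`, `second_step`, and `run_state_demo` are hypothetical names, and the lock-guarded dictionary is an assumption about one reasonable design, not a description of how `Context` is built.

```python
import asyncio


class ToyContext:
    """Hypothetical shared store: an async key-value map guarded by a lock."""

    def __init__(self) -> None:
        self._data: dict = {}
        self._lock = asyncio.Lock()

    async def set(self, key: str, value) -> None:
        async with self._lock:
            self._data[key] = value

    async def get(self, key: str, default=None):
        async with self._lock:
            return self._data.get(key, default)


async def first_step(ctx: ToyContext) -> str:
    # Store a value that is not part of the event passed to the next step.
    await ctx.set("query", "What is the capital of France?")
    return "Step 1 complete"


async def second_step(ctx: ToyContext, intermediate: str) -> str:
    # Read the value back from the shared store.
    query = await ctx.get("query")
    return f"{intermediate} | query seen by step 2: {query}"


async def run_state_demo() -> str:
    ctx = ToyContext()
    return await second_step(ctx, await first_step(ctx))


# In a notebook cell, use `await run_state_demo()` instead of asyncio.run.
state_result = asyncio.run(run_state_demo())
print(state_result)
```

Keeping such values in a shared store rather than in every event keeps the event classes small, at the cost of steps depending on keys that an earlier step must have set.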

### Automating Workflows with Multi-Agent Workflows

Instead of manual workflow creation, we can use the `AgentWorkflow` class to create **a multi-agent workflow**.

The `AgentWorkflow` uses Workflow Agents to allow us to create a system of one or more agents that can collaborate and hand off tasks to each other based on their specialized capabilities. This enables building complex agent systems where different agents handle different aspects of a task.

In [None]:
from llama_index.core.agent.workflow import AgentWorkflow, ReActAgent
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI

# Define some tools
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")

In [None]:
# we can pass functions directly without FunctionTool -- the fn/docstring are parsed for the name/description
multiply_agent = ReActAgent(
    name='multiply_agent',
    description='Is able to multiply two integers',
    system_prompt='A helpful assistant that can use a tool to multiply numbers.',
    tools=[multiply],
    llm=llm,
)

addition_agent = ReActAgent(
    name='addition_agent',
    description='Is able to add two integers',
    system_prompt='A helpful assistant that can use a tool to add numbers.',
    tools=[add],
    llm=llm,
)

# Create the workflow
workflow = AgentWorkflow(
    agents=[multiply_agent, addition_agent],
    root_agent='multiply_agent',
)

In [None]:
response = await workflow.run(user_msg="Can you add 5 and 3?")
response