# Agents in LlamaIndex

This notebook is part of the [Hugging Face Agents Course](https://www.hf.co/learn/agents-course), a free course that takes you from beginner to expert in building agents.

![Agents course share](https://huggingface.co/datasets/agents-course/course-images/resolve/main/en/communication/share.png)

## Let's install the dependencies

The cell below installs the packages used in this unit.

In [3]:
!pip install -U \
    llama-index \
    datasets \
    llama-index-callbacks-arize-phoenix \
    llama-index-vector-stores-chroma \
    llama-index-embeddings-ollama \
    llama-index-llms-ollama
# llama-index-llms-huggingface-api
# llama-index==0.10.38

Defaulting to user installation because normal site-packages is not writeable


## Initialising agents

Let's start by initialising an agent. We will use the basic `AgentWorkflow` class to create an agent.

In [4]:
# from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
from llama_index.core.agent.workflow import AgentWorkflow, ToolCallResult, AgentStream
from llama_index.llms.ollama import Ollama 

def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract two numbers"""
    return a - b


def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b


def divide(a: int, b: int) -> float:
    """Divide two numbers"""
    return a / b


# llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
llm = Ollama(model="myaniu/qwen2.5-1m:7b")


agent = AgentWorkflow.from_tools_or_functions(
    tools_or_functions=[subtract, multiply, divide, add],
    llm=llm,
    system_prompt="You are a math agent that can add, subtract, multiply, and divide numbers using provided tools.",
)
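
The functions above are passed to the agent as plain Python callables, so their names, type hints and docstrings are the information available for describing each tool to the model. The sketch below illustrates that idea with the standard library only. It is not LlamaIndex's implementation, and `describe_tool` is a hypothetical helper introduced just for this example.

```python
import inspect
from typing import get_type_hints


def describe_tool(fn):
    """Collect the metadata a framework could show an LLM for a plain function.

    Hypothetical helper for illustration; not part of LlamaIndex.
    """
    hints = get_type_hints(fn)
    return_type = hints.pop("return", None)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {name: tp.__name__ for name, tp in hints.items()},
        "returns": return_type.__name__ if return_type is not None else None,
    }


def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b


spec = describe_tool(multiply)
print(spec)
```

A clear docstring and accurate type hints therefore matter: they are likely the only description of the tool that the model sees.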

Then, we can run the agent and get the response and reasoning behind the tool calls.

In [5]:
handler = agent.run("What is (2 + 2) * 2?")
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp


Called tool:  add {'a': 2, 'b': 2} => 4

Called tool:  multiply {'a': 4, 'b': 2} => 8
The result of (2 + 2) * 2 is 8.

AgentOutput(response=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={'tool_calls': []}, blocks=[TextBlock(block_type='text', text='The result of (2 + 2) * 2 is 8.')]), tool_calls=[ToolCallResult(tool_name='add', tool_kwargs={'a': 2, 'b': 2}, tool_id='add', tool_output=ToolOutput(content='4', tool_name='add', raw_input={'args': (), 'kwargs': {'a': 2, 'b': 2}}, raw_output=4, is_error=False), return_direct=False), ToolCallResult(tool_name='multiply', tool_kwargs={'a': 4, 'b': 2}, tool_id='multiply', tool_output=ToolOutput(content='8', tool_name='multiply', raw_input={'args': (), 'kwargs': {'a': 4, 'b': 2}}, raw_output=8, is_error=False), return_direct=False)], raw={'model': 'myaniu/qwen2.5-1m:7b', 'created_at': '2025-03-13T13:49:36.612066566Z', 'done': True, 'done_reason': 'stop', 'total_duration': 424664877, 'load_duration': 10738624, 'prompt_eval_count': 473, 'prompt_eval_duration': 16000000, 'eval_count': 17, 'eval_duration': 382000000, 'message': Message(role
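
The cell above uses top-level `async for` and `await`, which work in a notebook but not in a plain Python script. The sketch below shows the same consume-and-dispatch pattern wrapped in a coroutine and started with `asyncio.run`. The `ToolEvent` and `TextEvent` classes and the `fake_stream` generator are hypothetical stand-ins for `ToolCallResult`, `AgentStream` and `handler.stream_events()`, so that the example runs without an LLM.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToolEvent:
    """Hypothetical stand-in for ToolCallResult."""
    tool_name: str
    tool_output: str


@dataclass
class TextEvent:
    """Hypothetical stand-in for AgentStream."""
    delta: str


async def fake_stream():
    """Yield a fixed sequence of events, mimicking a streaming handler."""
    yield ToolEvent("add", "4")
    yield ToolEvent("multiply", "8")
    for word in ["The ", "result ", "is ", "8."]:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield TextEvent(word)


async def main():
    tool_calls, text = [], ""
    async for ev in fake_stream():
        if isinstance(ev, ToolEvent):
            tool_calls.append(ev.tool_name)  # record which tool was called
        elif isinstance(ev, TextEvent):
            text += ev.delta  # accumulate the streamed text
    return tool_calls, text


# In a notebook you could write `await main()`; a script needs asyncio.run.
tool_calls, text = asyncio.run(main())
print(tool_calls, text)
```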

By default, the agent is stateless between runs. To make it remember earlier messages, we can pass a `Context` object to each run.


In [6]:
from llama_index.core.workflow import Context

ctx = Context(agent)

response = await agent.run("My name is Bob.", ctx=ctx)
response = await agent.run("What was my name again?", ctx=ctx)
response

AgentOutput(response=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={'tool_calls': []}, blocks=[TextBlock(block_type='text', text="You mentioned your name is Bob. Is there something specific you'd like to know or do related to that name or any numbers associated with it?")]), tool_calls=[], raw={'model': 'myaniu/qwen2.5-1m:7b', 'created_at': '2025-03-13T13:49:37.802685571Z', 'done': True, 'done_reason': 'stop', 'total_duration': 723787947, 'load_duration': 14653674, 'prompt_eval_count': 439, 'prompt_eval_duration': 9000000, 'eval_count': 30, 'eval_duration': 688000000, 'message': Message(role='assistant', content="You mentioned your name is Bob. Is there something specific you'd like to know or do related to that name or any numbers associated with it?", images=None, tool_calls=None), 'usage': {'prompt_tokens': 439, 'completion_tokens': 30, 'total_tokens': 469}}, current_agent_name='Agent')
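
The agent can recall the name only because both runs share the same `ctx`. The toy sketch below illustrates that idea without any LLM. `ToyContext` and `toy_run` are hypothetical names for this example and say nothing about how `Context` is implemented internally.

```python
class ToyContext:
    """Hypothetical stand-in for a context object that stores chat history."""

    def __init__(self):
        self.history = []


def toy_run(message, ctx=None):
    """Append the message to the context and return everything seen so far."""
    ctx = ctx if ctx is not None else ToyContext()  # no ctx: start from scratch
    ctx.history.append(message)
    return list(ctx.history)


shared = ToyContext()
toy_run("My name is Bob.", ctx=shared)
with_ctx = toy_run("What was my name again?", ctx=shared)
without_ctx = toy_run("What was my name again?")

print(with_ctx)     # both messages are visible
print(without_ctx)  # only the latest message is visible
```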

## Creating RAG Agents with QueryEngineTools

Let's now re-use the `QueryEngine` we defined in the [previous unit on tools](/tools.ipynb) and convert it into a `QueryEngineTool`. We will pass it to the `AgentWorkflow` class to create a RAG agent.

In [7]:
import chromadb

from llama_index.core import VectorStoreIndex
# from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
# from llama_index.embeddings.huggingface_api import HuggingFaceInferenceAPIEmbedding
from llama_index.embeddings.ollama import OllamaEmbedding

from llama_index.core.tools import QueryEngineTool
from llama_index.vector_stores.chroma import ChromaVectorStore

# Create a vector store
db = chromadb.PersistentClient(path="./alfred_chroma_db")
chroma_collection = db.get_or_create_collection("alfred")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Create a query engine
# embed_model = HuggingFaceInferenceAPIEmbedding(model_name="BAAI/bge-small-en-v1.5")
embed_model = OllamaEmbedding(model_name="qllama/bge-small-en-v1.5:f16")

# llm = HuggingFaceInferenceAPI(model_name="Qwen/Qwen2.5-Coder-32B-Instruct")
llm = Ollama(model="myaniu/qwen2.5-1m:7b")

index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, embed_model=embed_model
)
query_engine = index.as_query_engine(llm=llm)
query_engine_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="personas",
    description="descriptions for various types of personas",
    return_direct=False,
)

# Create a RAG agent
query_engine_agent = AgentWorkflow.from_tools_or_functions(
    tools_or_functions=[query_engine_tool],
    llm=llm,
    system_prompt="You are a helpful assistant that has access to a database containing persona descriptions.",
)

Once more, we can stream the response and the reasoning behind the tool calls.

In [8]:
handler = query_engine_agent.run(
    "Search the database for 'science fiction' and return some persona descriptions."
)
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp


Called tool:  personas {'input': 'science fiction'} => The provided context focuses on an anthropologist or cultural expert specializing in Cypriot culture. There is no indication or mention of any connection to science fiction within this context. Therefore, based on the given information, it's not possible to link the described person directly to science fiction topics.
It appears that our database does not contain a persona specifically tailored for 'science fiction'. The provided description is related to an anthropologist specializing in Cypriot culture. If you're interested in personas or descriptions connected to science fiction, please provide more details so I can search accordingly. Alternatively, we could create a custom persona based on your requirements. Would you like to proceed with this option?

AgentOutput(response=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={'tool_calls': []}, blocks=[TextBlock(block_type='text', text="It appears that our database does not contain a persona specifically tailored for 'science fiction'. The provided description is related to an anthropologist specializing in Cypriot culture. If you're interested in personas or descriptions connected to science fiction, please provide more details so I can search accordingly. Alternatively, we could create a custom persona based on your requirements. Would you like to proceed with this option?")]), tool_calls=[ToolCallResult(tool_name='personas', tool_kwargs={'input': 'science fiction'}, tool_id='personas', tool_output=ToolOutput(content="The provided context focuses on an anthropologist or cultural expert specializing in Cypriot culture. There is no indication or mention of any connection to science fiction within this context. Therefore, based on the given information, it's not po
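
The tool can only answer from whatever the retriever ranks as most similar to the query, which is why the quality of the answer depends on what is stored in the collection. The sketch below shows similarity ranking in its simplest form. It uses word counts and cosine similarity rather than the embedding model from the cell above, and the three persona strings are invented for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: cosine(q, Counter(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]


# Invented example documents, not taken from the real dataset.
docs = [
    "an anthropologist studying cypriot culture",
    "a writer of science fiction novels",
    "a chef focused on regional cooking",
]
best = top_k("science fiction", docs)
print(best)
```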

## Creating multi-agent systems

We can also create multi-agent systems by passing multiple agents to the `AgentWorkflow` class. The `root_agent` argument names the agent that receives the user message first.

In [9]:
from llama_index.core.agent.workflow import (
    AgentWorkflow,
    ReActAgent,
)


# Define some tools
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b


# Create agent configs
# NOTE: we can use FunctionAgent or ReActAgent here.
# FunctionAgent works for LLMs with a function calling API.
# ReActAgent works for any LLM.
calculator_agent = ReActAgent(
    name="calculator",
    description="Performs basic arithmetic operations",
    system_prompt="You are a calculator assistant. Use your tools for any math operation.",
    tools=[add, subtract],
    llm=llm,
)

query_agent = ReActAgent(
    name="info_lookup",
    description="Looks up information about XYZ",
    system_prompt="Use your tool to query a RAG system to answer information about XYZ",
    tools=[query_engine_tool],
    llm=llm,
)

# Create the workflow, starting from the calculator agent
agent = AgentWorkflow(agents=[calculator_agent, query_agent], root_agent="calculator")

# Run the system
handler = agent.run(user_msg="Can you add 5 and 3?")

In [10]:
async for ev in handler.stream_events():
    if isinstance(ev, ToolCallResult):
        print("")
        print("Called tool: ", ev.tool_name, ev.tool_kwargs, "=>", ev.tool_output)
    elif isinstance(ev, AgentStream):  # showing the thought process
        print(ev.delta, end="", flush=True)

resp = await handler
resp

Thought: The user wants to know the sum of 5 and 3. I will use the 'add' tool for this purpose.
Action: add
Action Input: {"a": 5, "b": 3}
Called tool:  add {'a': 5, 'b': 3} => 8
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: The sum of 5 and 3 is 8.

AgentOutput(response=ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={'tool_calls': []}, blocks=[TextBlock(block_type='text', text='The sum of 5 and 3 is 8.')]), tool_calls=[ToolCallResult(tool_name='add', tool_kwargs={'a': 5, 'b': 3}, tool_id='1b8923d3-ff1c-426e-8296-66a4709f88f1', tool_output=ToolOutput(content='8', tool_name='add', raw_input={'args': (), 'kwargs': {'a': 5, 'b': 3}}, raw_output=8, is_error=False), return_direct=False)], raw={'model': 'myaniu/qwen2.5-1m:7b', 'created_at': '2025-03-13T13:49:45.561173895Z', 'done': True, 'done_reason': 'stop', 'total_duration': 852743669, 'load_duration': 11866284, 'prompt_eval_count': 791, 'prompt_eval_duration': 12000000, 'eval_count': 36, 'eval_duration': 815000000, 'message': Message(role='assistant', content='', images=None, tool_calls=None), 'usage': {'prompt_tokens': 791, 'completion_tokens': 36, 'total_tokens': 827}}, current_agent_name='calculator')
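
In the run above, the root agent handled the request itself, so no other agent was involved. The toy sketch below illustrates the routing idea: start at the root agent and hand off only when another agent is a better fit. In the real workflow the LLM makes that decision. Here a keyword rule stands in for it, and `ToyAgent` and `route` are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyAgent:
    """Hypothetical agent: a name plus a rule saying what it can handle."""
    name: str
    description: str
    can_handle: Callable[[str], bool]


def route(message: str, agents: list[ToyAgent], root_agent: str) -> list[str]:
    """Return the agents visited: the root first, then at most one handoff."""
    by_name = {agent.name: agent for agent in agents}
    current = by_name[root_agent]
    path = [current.name]
    if not current.can_handle(message):
        for other in agents:
            if other.name != current.name and other.can_handle(message):
                path.append(other.name)  # hand off to the first suitable agent
                break
    return path


agents = [
    ToyAgent(
        "calculator",
        "Performs basic arithmetic operations",
        lambda m: any(w in m.lower() for w in ("add", "subtract")),
    ),
    ToyAgent(
        "info_lookup",
        "Looks up information",
        lambda m: "persona" in m.lower(),
    ),
]

print(route("Can you add 5 and 3?", agents, root_agent="calculator"))
print(route("Find a persona for me", agents, root_agent="calculator"))
```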