[Referenece YouTube Video by LlamaIndex](https://www.youtube.com/watch?v=T0bgevj0vto)

### Gemini Summary

Four concepts of agents:
- **Level 1: Tool Use** - This is the most simple level where the agent can decide what tool to pick and the arguments to use with that tool.
- **Level 2: Reasoning Loop with Conversation Memory** - This level includes some sort of reasoning loop with conversation memory. This means the agent can maintain the state of earlier conversation and use it to inform future responses.
- **Level 3: Combine Tool Use with Reasoning Loop and Memory** - This level combines tool use with a reasoning loop and memory. The agent can select a set of tools to use and then loop back to see if the task has been solved.
- **Level 4: Fancy Reasoning Loop, Memory, and Tool Use** - This is the most advanced level where the agent can not only plan the next step but also plan out an entire query plan to achieve the task at hand.

Also defined the different modules and linking them together. The modules include:
- Agent Input
- React Prompt to LLM
- React Output Parser
- Run Tool
- Process Agent Response


# Building an Agent around a Query Pipeline


In this cookbook we show you how to build an agent around a query pipeline.

Agents offer the ability to do complex, sequential reasoning on top of any query DAG that you have setup. Conceptually this is also one of the ways you can add a "loop" to the graph.

In this tutorial we show you how to build a full ReAct agent that can do tool picking from scratch.

We will be using LlamaIndex v0.10 - https://blog.llamaindex.ai/llamaindex-v0-10-838e735948f8

In [13]:
# pip install --upgrade llama-index

In [7]:
# pip install llama-index-core

In [2]:
pip show llama-index

Name: llama-index
Version: 0.10.24
Summary: Interface between LLMs and your data
Home-page: https://llamaindex.ai
Author: Jerry Liu
Author-email: jerry@llamaindex.ai
License: MIT
Location: c:\Users\vibud\miniconda3\Lib\site-packages
Requires: llama-index-agent-openai, llama-index-cli, llama-index-core, llama-index-embeddings-openai, llama-index-indices-managed-llama-cloud, llama-index-legacy, llama-index-llms-openai, llama-index-multi-modal-llms-openai, llama-index-program-openai, llama-index-question-gen-openai, llama-index-readers-file, llama-index-readers-llama-parse
Required-by: llama-hub
Note: you may need to restart the kernel to use updated packages.


In [3]:
pip show llama-index-core

Name: llama-index-core
Version: 0.10.24.post1
Summary: Interface between LLMs and your data
Home-page: https://llamaindex.ai
Author: Jerry Liu
Author-email: jerry@llamaindex.ai
License: MIT
Location: c:\Users\vibud\miniconda3\Lib\site-packages
Requires: aiohttp, dataclasses-json, deprecated, dirtyjson, fsspec, httpx, llamaindex-py-client, nest-asyncio, networkx, nltk, numpy, openai, pandas, pillow, PyYAML, requests, SQLAlchemy, tenacity, tiktoken, tqdm, typing-extensions, typing-inspect
Required-by: llama-index, llama-index-agent-openai, llama-index-cli, llama-index-embeddings-huggingface, llama-index-embeddings-openai, llama-index-indices-managed-llama-cloud, llama-index-llms-openai, llama-index-multi-modal-llms-openai, llama-index-program-openai, llama-index-question-gen-openai, llama-index-readers-file, llama-index-readers-llama-parse, llama-index-vector-stores-chroma, llama-index-vector-stores-faiss, llama-parse
Note: you may need to restart the kernel to use updated packages.


In [5]:
from llama_index.core.query_pipeline import QueryPipeline

## Setup

### Setup Data

We use the chinook database as sample data. [Source](https://www.sqlitetutorial.net/sqlite-sample-database/).

In [8]:
%pip install llama-index-llms-openai

Note: you may need to restart the kernel to use updated packages.


In [9]:
!curl "https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip" -O ./chinook.zip
!unzip -o ./chinook.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  298k  100  298k    0     0   339k      0 --:--:-- --:--:-- --:--:--  339k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: .
'unzip' is not recognized as an internal or external command,
operable program or batch file.


In [11]:
from llama_index.core import SQLDatabase
from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    select,
    column,
)

engine = create_engine("sqlite:///chinook.db")
sql_database = SQLDatabase(engine)

### Setup Calback Manager

We setup a global callback manager (helps in case you want to plug in downstream observability integrations).

In [12]:
# define global callback setting
from llama_index.core.settings import Settings
from llama_index.core.callbacks import CallbackManager

callback_manager = CallbackManager()
Settings.callback_manager = callback_manager

## Setup Text-to-SQL Query Engine / Tool

Now we setup a simple text-to-SQL tool: given a query, translate text to SQL, execute against database, and get back a result.

In [13]:
from llama_index.core.query_engine import NLSQLTableQueryEngine
from llama_index.core.tools import QueryEngineTool

sql_query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["albums", "tracks", "artists"],
    verbose=True,
)
sql_tool = QueryEngineTool.from_defaults(
    query_engine=sql_query_engine,
    name="sql_tool",
    description=(
        "Useful for translating a natural language query into a SQL query"
    ),
)

ValueError: 
******
Could not load OpenAI model. If you intended to use OpenAI, please check your OPENAI_API_KEY.
Original error:
No API key found for OpenAI.
Please set either the OPENAI_API_KEY environment variable or openai.api_key prior to initialization.
API keys can be found or created at https://platform.openai.com/account/api-keys

To disable the LLM entirely, set llm=None.
******

## Setup ReAct Agent Pipeline

We now setup a ReAct pipeline for a single step using our Query Pipeline syntax. This is a multi-part process that does the following:
1. Takes in agent inputs
2. Calls ReAct prompt using LLM to generate next action/tool (or returns a response).
3. If tool/action is selected, call tool pipeline to execute tool + collect response.
4. If response is generated, get response.

Throughout this we'll use a variety of agent-specific query components. Unlike normal query pipelines, these are specifically designed for query pipelines that are used in a `QueryPipelineAgentWorker`:
- An `AgentInputComponent` that allows you to convert the agent inputs (Task, state dictionary) into a set of inputs for the query pipeline.
- An `AgentFnComponent`: a general processor that allows you to take in the current Task, state, as well as any arbitrary inputs, and returns an output. In this cookbook we define a function component to format the ReAct prompt. However, you can put this anywhere.
- [Not used in this notebook] An `CustomAgentComponent`: similar to `AgentFnComponent`, you can implement `_run_component` to define your own logic, with access to Task and state. It is more verbose but more flexible than `AgentFnComponent` (e.g. you can define init variables, and callbacks are in the base class).

Note that any function passed into `AgentFnComponent` and `AgentInputComponent` MUST include `task` and `state` as input variables, as these are inputs passed from the agent.

Note that the output of an agentic query pipeline MUST be `Tuple[AgentChatResponse, bool]`. You'll see this below.

In [None]:
from llama_index.core.query_pipeline import QueryPipeline as QP

qp = QP(verbose=True)

### Define Agent Input Component

Here we define the agent input component, called at the beginning of every agent step. Besides passing along the input, we also do initialization/state modification.

In [None]:
from llama_index.core.agent.react.types import (
    ActionReasoningStep,
    ObservationReasoningStep,
    ResponseReasoningStep,
)
from llama_index.core.agent import Task, AgentChatResponse
from llama_index.core.query_pipeline import (
    AgentInputComponent,
    AgentFnComponent,
    CustomAgentComponent,
    QueryComponent,
    ToolRunnerComponent,
)
from llama_index.core.llms import MessageRole
from typing import Dict, Any, Optional, Tuple, List, cast


## Agent Input Component
## This is the component that produces agent inputs to the rest of the components
## Can also put initialization logic here.
def agent_input_fn(task: Task, state: Dict[str, Any]) -> Dict[str, Any]:
    """Agent input function.

    Returns:
        A Dictionary of output keys and values. If you are specifying
        src_key when defining links between this component and other
        components, make sure the src_key matches the specified output_key.

    """
    # initialize current_reasoning
    if "current_reasoning" not in state:
        state["current_reasoning"] = []
    reasoning_step = ObservationReasoningStep(observation=task.input)
    state["current_reasoning"].append(reasoning_step)
    return {"input": task.input}


agent_input_component = AgentInputComponent(fn=agent_input_fn)

### Define Agent Prompt

Here we define the agent component that generates a ReAct prompt, and after the output is generated from the LLM, parses into a structured object.

In [None]:
from llama_index.core.agent import ReActChatFormatter
from llama_index.core.query_pipeline import InputComponent, Link
from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool


## define prompt function
def react_prompt_fn(
    task: Task, state: Dict[str, Any], input: str, tools: List[BaseTool]
) -> List[ChatMessage]:
    # Add input to reasoning
    chat_formatter = ReActChatFormatter()
    return chat_formatter.format(
        tools,
        chat_history=task.memory.get() + state["memory"].get_all(),
        current_reasoning=state["current_reasoning"],
    )


react_prompt_component = AgentFnComponent(
    fn=react_prompt_fn, partial_dict={"tools": [sql_tool]}
)

In [None]:
from llama_index.core.llms.generic_utils import messages_to_prompt

chat_formatter = ReActChatFormatter()
msgs = chat_formatter.format(
    [sql_tool],
    chat_history=[],
    current_reasoning=[]
)
print(messages_to_prompt(msgs))

system: 
You are designed to help with a variety of tasks, from answering questions     to providing summaries to other types of analyses.

## Tools
You have access to a wide variety of tools. You are responsible for using
the tools in any sequence you deem appropriate to complete the task at hand.
This may require breaking the task into subtasks and using different tools
to complete each subtask.

You have access to the following tools:
> Tool Name: sql_tool
Tool Description: Useful for translating a natural language query into a SQL query
Tool Args: {"type": "object", "properties": {"input": {"title": "Input", "type": "string"}}, "required": ["input"]}


## Output Format
To answer the question, please use the following format.

```
Thought: I need to use a tool to help me answer the question.
Action: tool name (one of sql_tool) if using a tool.
Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {"input": "hello world", "num_beams": 5})
```

Please ALW

### Define Agent Output Parser + Tool Pipeline

Once the LLM gives an output, we have a decision tree:
1. If an answer is given, then we're done. Process the output
2. If an action is given, we need to execute the specified tool with the specified args, and then process the output.

Tool calling can be done via the `ToolRunnerComponent` module. This is a simple wrapper module that takes in a list of tools, and can be "executed" with the specified tool name (every tool has a name) and tool action.

We implement this overall module `OutputAgentComponent` that subclasses `CustomAgentComponent`.

Note: we also implement `sub_query_components` to pass through higher-level callback managers to the tool runner submodule.

In [None]:
from typing import Set, Optional
from llama_index.core.agent.react.output_parser import ReActOutputParser
from llama_index.core.llms import ChatResponse
from llama_index.core.agent.types import Task


def parse_react_output_fn(
    task: Task, state: Dict[str, Any], chat_response: ChatResponse
):
    """Parse ReAct output into a reasoning step."""
    output_parser = ReActOutputParser()
    reasoning_step = output_parser.parse(chat_response.message.content)
    return {"done": reasoning_step.is_done, "reasoning_step": reasoning_step}


parse_react_output = AgentFnComponent(fn=parse_react_output_fn)


def run_tool_fn(
    task: Task, state: Dict[str, Any], reasoning_step: ActionReasoningStep
):
    """Run tool and process tool output."""
    tool_runner_component = ToolRunnerComponent(
        [sql_tool], callback_manager=task.callback_manager
    )
    tool_output = tool_runner_component.run_component(
        tool_name=reasoning_step.action,
        tool_input=reasoning_step.action_input,
    )
    observation_step = ObservationReasoningStep(observation=str(tool_output))
    state["current_reasoning"].append(observation_step)
    # TODO: get output

    return {"response_str": observation_step.get_content(), "is_done": False}


run_tool = AgentFnComponent(fn=run_tool_fn)


def process_response_fn(
    task: Task, state: Dict[str, Any], response_step: ResponseReasoningStep
):
    """Process response."""
    state["current_reasoning"].append(response_step)
    response_str = response_step.response
    # Now that we're done with this step, put into memory
    state["memory"].put(ChatMessage(content=task.input, role=MessageRole.USER))
    state["memory"].put(
        ChatMessage(content=response_str, role=MessageRole.ASSISTANT)
    )

    return {"response_str": response_str, "is_done": True}


process_response = AgentFnComponent(fn=process_response_fn)


def process_agent_response_fn(
    task: Task, state: Dict[str, Any], response_dict: dict
):
    """Process agent response."""
    return (
        AgentChatResponse(response_dict["response_str"]),
        response_dict["is_done"],
    )


process_agent_response = AgentFnComponent(fn=process_agent_response_fn)

### Stitch together Agent Query Pipeline

We can now stitch together the top-level agent pipeline: agent_input -> react_prompt -> llm -> react_output.

The last component is the if-else component that calls sub-components.

In [None]:
from llama_index.core.query_pipeline import QueryPipeline as QP
from llama_index.llms.openai import OpenAI

qp.add_modules(
    {
        "agent_input": agent_input_component,
        "react_prompt": react_prompt_component,
        "llm": OpenAI(model="gpt-4-1106-preview"),
        "react_output_parser": parse_react_output,
        "run_tool": run_tool,
        "process_response": process_response,
        "process_agent_response": process_agent_response,
    }
)

In [None]:
# link input to react prompt to parsed out response (either tool action/input or observation)
qp.add_chain(["agent_input", "react_prompt", "llm", "react_output_parser"])

# add conditional link from react output to tool call (if not done)
qp.add_link(
    "react_output_parser",
    "run_tool",
    condition_fn=lambda x: not x["done"],
    input_fn=lambda x: x["reasoning_step"],
)
# add conditional link from react output to final response processing (if done)
qp.add_link(
    "react_output_parser",
    "process_response",
    condition_fn=lambda x: x["done"],
    input_fn=lambda x: x["reasoning_step"],
)

# whether response processing or tool output processing, add link to final agent response
qp.add_link("process_response", "process_agent_response")
qp.add_link("run_tool", "process_agent_response")

### Visualize Query Pipeline

In [None]:
from pyvis.network import Network

net = Network(notebook=True, cdn_resources="in_line", directed=True)
net.from_nx(qp.clean_dag)
net.show("agent_dag.html")

agent_dag.html


### Setup Agent Worker around Text-to-SQL Query Pipeline

This is our way to setup an agent around a text-to-SQL Query Pipeline

In [None]:
from llama_index.core.agent import QueryPipelineAgentWorker, AgentRunner
from llama_index.core.callbacks import CallbackManager

agent_worker = QueryPipelineAgentWorker(qp)
agent = AgentRunner(
    agent_worker, callback_manager=CallbackManager([]), verbose=True
)

### Run the Agent

Let's try the agent on some sample queries.

In [None]:
# start task
task = agent.create_task(
    "What are some tracks from the artist AC/DC? Limit it to 3"
)

In [None]:
step_output = agent.run_step(task.task_id)

> Running step e9a672f8-4db0-4ee6-8db8-c39a3545a0f7. Step input: What are some tracks from the artist AC/DC? Limit it to 3
[1;3;38;2;155;135;227m> Running module agent_input with input: 
state: {'sources': [], 'memory': ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'), chat_store=SimpleChatSto...
task: task_id='ed2d3d39-5611-43d4-ab31-9f12b8ea0972' input='What are some tracks from the artist AC/DC? Limit it to 3' memory=ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method ...

[0m[1;3;38;2;155;135;227m> Running module react_prompt with input: 
input: What are some tracks from the artist AC/DC? Limit it to 3

[0m[1;3;38;2;155;135;227m> Running module llm with input: 
messages: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='\nYou are designed to help with a variety of tasks, from answering questions     to providing summaries to other types of an

In [None]:
step_output.is_last

False

In [None]:
step_output = agent.run_step(task.task_id)

> Running step 103a02cd-a729-45c9-92d4-32091c215b4b. Step input: None
[1;3;38;2;155;135;227m> Running module agent_input with input: 
state: {'sources': [], 'memory': ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'), chat_store=SimpleChatSto...
task: task_id='ed2d3d39-5611-43d4-ab31-9f12b8ea0972' input='What are some tracks from the artist AC/DC? Limit it to 3' memory=ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method ...

[0m[1;3;38;2;155;135;227m> Running module react_prompt with input: 
input: What are some tracks from the artist AC/DC? Limit it to 3

[0m[1;3;38;2;155;135;227m> Running module llm with input: 
messages: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='\nYou are designed to help with a variety of tasks, from answering questions     to providing summaries to other types of analyses.\n\n## Too...

[0m[1;3;38;2;155;135;227m> Ru

In [None]:
step_output.is_last

True

In [None]:
response = agent.finalize_response(task.task_id)

In [None]:
print(str(response))

The top 3 tracks by AC/DC are "For Those About To Rock (We Salute You)", "Put The Finger On You", and "Let's Get It Up".


In [None]:
# run this e2e
agent.reset()
response = agent.chat(
    "What are some tracks from the artist AC/DC? Limit it to 3"
)

> Running step 3ec42899-be2d-4bea-88ed-acc50867ccb7. Step input: What are some tracks from the artist AC/DC? Limit it to 3
[1;3;38;2;155;135;227m> Running module agent_input with input: 
state: {'sources': [], 'memory': ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'), chat_store=SimpleChatSto...
task: task_id='c4276071-d7f3-4be4-8604-556819744ef9' input='What are some tracks from the artist AC/DC? Limit it to 3' memory=ChatMemoryBuffer(token_limit=3000, tokenizer_fn=functools.partial(<bound method ...

[0m[1;3;38;2;155;135;227m> Running module react_prompt with input: 
input: What are some tracks from the artist AC/DC? Limit it to 3

[0m[1;3;38;2;155;135;227m> Running module llm with input: 
messages: [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='\nYou are designed to help with a variety of tasks, from answering questions     to providing summaries to other types of an

In [None]:
print(str(response))

The top 3 tracks by AC/DC are "For Those About To Rock (We Salute You)", "Put The Finger On You", and "Let's Get It Up".
