# Agents with LlamaIndex II

Sources [1](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/), [2](https://docs.llamaindex.ai/en/stable/understanding/putting_it_all_together/agents/), [3](https://docs.llamaindex.ai/en/stable/examples/agent/custom_agent/), [4](https://docs.llamaindex.ai/en/stable/examples/agent/openai_agent/), [5](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/agent_runner/), [6](https://medium.com/llamaindex-blog/data-agents-eed797d7972f), [LlamaHub Tools](https://llamahub.ai/?tab=tools)  
    
--- 
### Agents

An "agent" is an automated reasoning and decision engine. It takes in a user input/query and can make internal decisions for executing that query in order to return the correct result. The key agent components can include, but are not limited to:

+ Breaking down a complex question into smaller ones
+ Choosing an external Tool to use and coming up with parameters for calling the Tool
+ Planning out a set of tasks
+ Storing previously completed tasks in a memory module
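
The components listed above can be sketched in a few lines of plain Python. This is only a toy illustration, not how LlamaIndex implements agents: the `ToyAgent` class, its `plan` method, and the `"a op b; c op d"` query format are all hypothetical, and a hard-coded parser stands in for the LLM's reasoning.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyAgent:
    """Toy agent: a hard-coded planner stands in for the LLM (hypothetical)."""

    tools: Dict[str, Callable[[int, int], int]]
    memory: List[Tuple[str, int]] = field(default_factory=list)

    def plan(self, query: str) -> List[Tuple[str, int, int]]:
        # Break the query "a op b; c op d" into one task per sub-question.
        tasks = []
        for part in query.split(";"):
            x, op, y = part.split()
            tasks.append((op, int(x), int(y)))
        return tasks

    def run(self, query: str) -> List[int]:
        results = []
        for op, x, y in self.plan(query):
            result = self.tools[op](x, y)  # choose a tool and call it with parameters
            self.memory.append((f"{x} {op} {y}", result))  # store the completed task
            results.append(result)
        return results


agent = ToyAgent(tools={"add": lambda x, y: x + y, "mul": lambda x, y: x * y})
answers = agent.run("2 add 3; 4 mul 5")
print(answers)
print(agent.memory)
```

In a real agent, the planning and tool-selection steps are delegated to the LLM, while the loop and the memory stay in ordinary code.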

LlamaIndex provides a comprehensive framework for building agents. This includes the following components:

+ Using agents with tools at a high level to build agentic RAG and workflow automation use cases **(Previous Notebook)**
+ Low-level components for building and debugging agents
+ Core agent ingredients that can be used as standalone modules: query planning, tool use, and more **(Previous Notebook)**

### Tools

Having proper tool abstractions is at the core of building data agents. Defining a set of Tools is similar to defining any API interface, with the exception that these Tools are meant for agent rather than human use. We allow users to define both a Tool as well as a ToolSpec containing a series of functions under the hood.

When using an agent or an LLM with function calling, the tool selected (and the arguments written for that tool) depends strongly on the tool name and on the description of the tool's purpose and arguments. Spending time tuning these parameters can result in large changes in how the LLM calls these tools.

A Tool implements a very generic interface: simply define `__call__` and return some basic metadata (name, description, function schema).
LlamaIndex offers a few different types of Tools:

+ FunctionTool: A function tool allows users to easily convert any user-defined function into a Tool. It can also auto-infer the function schema. **(Previous Notebook)**
+ QueryEngineTool: A tool that wraps an existing query engine. Note: since our agent abstractions inherit from BaseQueryEngine, these tools can also wrap other agents. **(Previous Notebook)**
+ Community contributed ToolSpecs that define one or more tools around a single service (like Gmail): see [LlamaHub](https://llamahub.ai/)
+ Utility tools for wrapping other tools to handle returning large amounts of data from a tool: see [OnDemandLoaderTool](https://docs.llamaindex.ai/en/stable/examples/tools/OnDemandLoaderTool/)
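
To make the "basic metadata" idea concrete, here is a minimal sketch of how a name, description, and parameter schema can be derived from a plain function with the standard library. The helper `describe_tool` is hypothetical and is not the LlamaIndex implementation; it only illustrates why a clear function name, docstring, and type hints matter.

```python
import inspect
from typing import Any, Callable, Dict


def describe_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build name/description/parameter metadata from a plain function."""
    signature = inspect.signature(fn)
    # Map each parameter name to the name of its annotated type.
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in signature.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


metadata = describe_tool(add)
print(metadata)
```

Everything the LLM knows about the tool comes from this kind of metadata, so a vague docstring translates directly into vague tool selection.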

### Use Cases

The scope of possible use cases for agents is vast and ever-expanding. That said, here are some practical use cases that can deliver immediate value.

+ Agentic RAG: Build a context-augmented research assistant over your data that answers not only simple questions but also complex research tasks. (Previous Notebook)
+ SQL Agent: A subset of the above is a "text-to-SQL assistant" that can interact with a structured database. 
+ Workflow Assistant: Build an agent that can operate over common workflow tools such as email and calendar.
+ Coding Assistant: Build an agent that can operate over code.

## Installing Packages

In [None]:
%pip install -qU pip

%pip install -qU openai
%pip install -qU llama-index
%pip install -qU llama_hub
%pip install -qU llama-index-experimental
%pip install llama-index-agent-openai
%pip install llama-index-readers-web
%pip install llama-index-tools-google

#%pip install -q pypdf
#%pip install -q docx2txt

## Importing Packages

In [None]:
import os
import openai

#os.environ["OPENAI_API_KEY"] = "<the key>"
openai.api_key = os.environ["OPENAI_API_KEY"]

import sys
import shutil
import glob
import logging
from pathlib import Path
from IPython.display import Image

import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import llama_index
from llama_index.llms.openai import OpenAI

## Llamaindex readers
from llama_index.core import SimpleDirectoryReader

## LlamaIndex Index Types
from llama_index.core import ListIndex
from llama_index.core import VectorStoreIndex
from llama_index.core import TreeIndex
from llama_index.core import KeywordTableIndex
from llama_index.core import SimpleKeywordTableIndex
from llama_index.core import DocumentSummaryIndex
from llama_index.core import SummaryIndex
from llama_index.core import KnowledgeGraphIndex
from llama_index.experimental.query_engine import PandasQueryEngine

## LlamaIndex Context Managers
from llama_index.core import StorageContext
from llama_index.core import load_index_from_storage
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.core.response_synthesizers import ResponseMode
from llama_index.core.schema import Node

## LlamaIndex Templates
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.base.llms.types import ChatMessage, MessageRole

## LlamaIndex Agents
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent

## LlamaIndex Callbacks
from llama_index.core.callbacks import CallbackManager
from llama_index.core.callbacks import LlamaDebugHandler

import nest_asyncio
nest_asyncio.apply()

In [None]:
import logging

#logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

#### Defining Models

In [None]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

model="gpt-4o"
#model="gpt-4o-mini"

Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
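Settings.llm = OpenAI(temperature=0, 
                      model=model, 
                      #max_tokens=512,
                      # extra sampling parameters are passed to the OpenAI API via additional_kwargs
                      additional_kwargs={"presence_penalty": -2, "top_p": 1},
                     )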

llm = Settings.llm

---

## 1- Function Tool  
A function tool is a simple wrapper around any existing function (both sync and async are supported!).  

In [None]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def subtract(x: int, y: int) -> int: 
    """Subtract the second number from the first number."""
    return (x - y)

def multiply(x: int, y: int) -> int: 
    """Multiply one number by the other number."""
    return (x * y)

def uppercase(x: str) -> str: 
    """Return the input string in uppercase."""
    return (x.upper())

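def square(x: int) -> int: 
    """Return the square of the given number."""
    # Returns x + x, not x * x: the docstring and the body disagree,
    # which the "power of two" query further down exposes.
    return (x + x)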

add_tool = FunctionTool.from_defaults(fn=add)
subtract_tool = FunctionTool.from_defaults(fn=subtract)
multiply_tool = FunctionTool.from_defaults(fn=multiply)
uppercase_tool = FunctionTool.from_defaults(fn=uppercase)
square_tool = FunctionTool.from_defaults(fn=square)

In [None]:
response = llm.predict_and_call(
    [add_tool, 
     subtract_tool, 
     multiply_tool, 
     uppercase_tool,
     square_tool,
    ], 
    "Tell me the output of 3 - 12 ", 
    verbose=True
)
print(str(response))

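### The LLM does not validate what a tool does; it selects tools from their names and docstrings

Note that `square` is documented as squaring its input but returns `x + x`. If the LLM picks it for the query below, the answer is 6 rather than 9, and nothing flags the mismatch.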
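
Because the LLM never inspects a tool's body, a quick sanity check before registering tools can catch a mismatch between docstring and behaviour. A minimal sketch, assuming a hypothetical helper `check_tool` (not part of LlamaIndex); `square` is copied from the cell above:

```python
from typing import Callable, List, Tuple


def square(x: int) -> int:
    """Return the square of the given number."""
    return x + x  # same body as the notebook's square tool


def check_tool(
    fn: Callable[[int], int], cases: List[Tuple[int, int]]
) -> List[Tuple[int, int, int]]:
    """Return (input, expected, actual) for every case the function gets wrong."""
    failures = []
    for arg, expected in cases:
        actual = fn(arg)
        if actual != expected:
            failures.append((arg, expected, actual))
    return failures


failures = check_tool(square, [(2, 4), (3, 9)])
print(failures)
```

The input 2 passes by coincidence (2 + 2 equals 2 * 2), which is why more than one test case is worth the effort.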

In [None]:
response = llm.predict_and_call(
    [add_tool, 
     subtract_tool, 
     multiply_tool, 
     uppercase_tool,
     square_tool,
    ], 
    "Tell me how much is three to the power of two", 
    verbose=True
)
print(str(response))

---

## 2- [Coding Assistant and Interpreter](https://llamahub.ai/l/tools/llama-index-tools-code-interpreter?from=)  
[Github](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-code-interpreter/examples/code_interpreter.ipynb)

In [None]:
%pip install -q llama-index-tools-code-interpreter

In [None]:
from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.code_interpreter import CodeInterpreterToolSpec

code_spec = CodeInterpreterToolSpec()
tools = code_spec.to_tool_list()
agent = OpenAIAgent.from_tools(tools, verbose=True)

In [None]:
print(agent.chat("Can you help me write some python code to pass to the code_interpreter tool?"))

In [None]:
print(agent.chat("Can you check if there is a file named ../../Data/california_housing_train.csv?"))

In [None]:
print(agent.chat("Please do"))

In [None]:
print(agent.chat(
        """Can you write and execute code to tell me its columns?"""
    )
)

In [None]:
print(agent.chat("In which country are these latitudes and longitudes?"))

In [None]:
print(agent.chat("Can you make a graph of the median income per total rooms?"))

In [None]:
print(agent.chat("I can't see the plot - can you save it locally with file name `output.png`?"))

---

## 3- [SQL Agent](https://llamahub.ai/l/tools/llama-index-tools-database?from=)  
[GitHub](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-database/examples/database.ipynb), [article](https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo/)

In [None]:
%pip install -q llama-index-tools-database
%pip install -q sqlalchemy

In [None]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.database import DatabaseToolSpec
from llama_index.core import SQLDatabase
from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    select,
    insert,
    text,
)

In [None]:
engine = create_engine("sqlite:///:memory:")
metadata_obj = MetaData()

In [None]:
# create city SQL table
table_name = "city_stats"
city_stats_table = Table(
    table_name,
    metadata_obj,
    Column("city_name", String(16), primary_key=True),
    Column("population", Integer),
    Column("country", String(16), nullable=False),
)
metadata_obj.create_all(engine)

#### Define SQL Database   

We first define our SQLDatabase abstraction (a light wrapper around SQLAlchemy).

In [None]:
sql_database = SQLDatabase(engine, include_tables=["city_stats"])

#### We add some testing data to our SQL database

In [None]:
rows = [
    {"city_name": "Toronto", "population": 2930000, "country": "Canada"},
    {"city_name": "Tokyo", "population": 13960000, "country": "Japan"},
    {"city_name": "Chicago", "population": 2679000, "country": "United States",},
    {"city_name": "Seoul", "population": 9776000, "country": "South Korea"},
]
for row in rows:
    stmt = insert(city_stats_table).values(**row)
    with engine.begin() as connection:
        cursor = connection.execute(stmt)

In [None]:
# view current table
stmt = select(
    city_stats_table.c.city_name,
    city_stats_table.c.population,
    city_stats_table.c.country,
).select_from(city_stats_table)

with engine.connect() as connection:
    results = connection.execute(stmt).fetchall()
    print(results)

#### Query Index  
We first show how we can execute a raw SQL query, which directly executes over the table.  

In [None]:
with engine.connect() as con:
    rows = con.execute(text("SELECT city_name from city_stats"))
    for row in rows:
        print(row)

#### Part 1: Text-to-SQL Query Engine

Once we have constructed our SQL database, we can use the NLSQLTableQueryEngine to construct natural language queries that are synthesized into SQL queries.

Note that we need to specify the tables we want to use with this query engine. If we don't, the query engine will pull all the schema context, which could overflow the context window of the LLM.

This query engine should be used in any case where you can specify the tables you want to query over beforehand, or the total size of all the table schema plus the rest of the prompt fits your context window.
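
To get a feel for how much schema context a given set of tables contributes to the prompt, the sketch below collects the `CREATE TABLE` statements for selected tables only. It uses the standard-library `sqlite3` module rather than SQLAlchemy, and the helper `schema_context` and the `unrelated` table are hypothetical; the actual prompt built by the query engine may look different.

```python
import sqlite3
from typing import List


def schema_context(conn: sqlite3.Connection, tables: List[str]) -> str:
    """Concatenate the CREATE TABLE statements of the selected tables."""
    statements = []
    for table in tables:
        row = conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if row is not None:
            statements.append(row[0])
    return "\n".join(statements)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_stats ("
    "city_name TEXT PRIMARY KEY, population INTEGER, country TEXT NOT NULL)"
)
conn.execute("CREATE TABLE unrelated (id INTEGER)")

# Only the schema of the requested table ends up in the context string.
context = schema_context(conn, ["city_stats"])
print(context)
print(len(context), "characters of schema context")
```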

In [None]:
from llama_index.core.query_engine import NLSQLTableQueryEngine

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database, tables=["city_stats"], llm=llm
)
query_str = "Which city has the highest population?"
response = query_engine.query(query_str)

In [None]:
from IPython.display import Markdown
display(Markdown(f"<b>{response}</b>"))

#### Part 2: Query-Time Retrieval of Tables for Text-to-SQL

If we don't know ahead of time which table we would like to use, and the total size of the table schemas overflows the context window, we should store the table schemas in an index so that at query time we can retrieve the right one.

The way we can do this is using the SQLTableNodeMapping object, which takes in a SQLDatabase and produces a Node object for each SQLTableSchema object passed into the ObjectIndex constructor.  
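
The idea of retrieving the right schema at query time can be illustrated without any embedding model. The sketch below swaps vector similarity for a plain word-overlap score, so it is a stand-in for the technique rather than what `ObjectIndex` does; the function `top_k_tables`, the `orders` table, and the table descriptions are hypothetical.

```python
import re
from typing import Dict, List


def words(text: str) -> set:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def top_k_tables(query: str, table_descriptions: Dict[str, str], k: int = 1) -> List[str]:
    """Rank tables by how many query words appear in their description."""
    query_words = words(query)
    scores = {
        name: len(query_words & words(text))
        for name, text in table_descriptions.items()
    }
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:k]


descriptions = {
    "city_stats": "city name population country",
    "orders": "order id customer product price",
}
selected = top_k_tables("Which city has the highest population?", descriptions)
print(selected)
```

Only the schemas of the selected tables would then be placed in the text-to-SQL prompt, which keeps the prompt size bounded regardless of how many tables the database holds.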

In [None]:
from llama_index.core.indices.struct_store.sql_query import (
    SQLTableRetrieverQueryEngine,
)
from llama_index.core.objects import (
    SQLTableNodeMapping,
    ObjectIndex,
    SQLTableSchema,
)
#from llama_index.core import VectorStoreIndex

# set Logging to DEBUG for more detailed outputs
table_node_mapping = SQLTableNodeMapping(sql_database)
table_schema_objs = [
    (SQLTableSchema(table_name="city_stats"))
]  # add a SQLTableSchema for each table

obj_index = ObjectIndex.from_objects(
    table_schema_objs,
    table_node_mapping,
    VectorStoreIndex,
)
query_engine = SQLTableRetrieverQueryEngine(
    sql_database, obj_index.as_retriever(similarity_top_k=1)
)

#### Now we can take our SQLTableRetrieverQueryEngine and query it for our response.

In [None]:
response = query_engine.query("Which city has the highest population?")
display(Markdown(f"<b>{response}</b>"))

In [None]:
# you can also fetch the raw result from SQLAlchemy!
response.metadata["result"]

#### You could also add additional context information for each table schema you define.

In [None]:
# manually set context text
city_stats_text = (
    "This table gives information regarding the population and country of a"
    " given city.\nThe user will query with codewords, where 'foo' corresponds"
    " to population and 'bar' corresponds to city."
)

table_node_mapping = SQLTableNodeMapping(sql_database)
table_schema_objs = [
    (SQLTableSchema(table_name="city_stats", context_str=city_stats_text))
]

#### Part 3: Text-to-SQL Retriever

So far our text-to-SQL capability is packaged in a query engine and consists of both retrieval and synthesis.

You can use the SQL retriever on its own. We show you some different parameters you can try, and also show how to plug it into our RetrieverQueryEngine to get roughly the same results.


In [None]:
from llama_index.core.retrievers import NLSQLRetriever

# default retrieval (return_raw=True)
nl_sql_retriever = NLSQLRetriever(
    sql_database, tables=["city_stats"], return_raw=True
)

In [None]:
results = nl_sql_retriever.retrieve(
    "Return the top 5 cities (along with their populations) with the highest population."
)

In [None]:
from llama_index.core.response.notebook_utils import display_source_node

for n in results:
    display_source_node(n)

In [None]:
# retrieval with return_raw=False (the default is return_raw=True)
nl_sql_retriever = NLSQLRetriever(
    sql_database, tables=["city_stats"], return_raw=False
)

In [None]:
results = nl_sql_retriever.retrieve(
    "Return the top 5 cities (along with their populations) with the highest population."
)

In [None]:
# NOTE: all the content is in the metadata
for n in results:
    display_source_node(n, show_source_metadata=True)

#### Plug into our RetrieverQueryEngine

We compose our SQL Retriever with our standard RetrieverQueryEngine to synthesize a response. The result is roughly similar to our packaged Text-to-SQL query engines.

In [None]:
from llama_index.core.query_engine import RetrieverQueryEngine

query_engine = RetrieverQueryEngine.from_args(nl_sql_retriever)

In [None]:
response = query_engine.query(
    "Return the top 5 cities (along with their populations) with the highest population."
)

In [None]:
print(str(response))

#### Using Database Tool

This tool connects to a database (using SQLAlchemy under the hood) and allows an Agent to query the database and get information about the tables.
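
The kind of operations such a tool gives an agent (listing tables, describing a table, running a query) can be sketched with the standard-library `sqlite3` module. The three function names below are hypothetical and are not claimed to match the methods of `DatabaseToolSpec`; the single-row table is made up for the example.

```python
import sqlite3
from typing import Any, List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_stats (city_name TEXT PRIMARY KEY, population INTEGER, country TEXT)"
)
conn.execute("INSERT INTO city_stats VALUES ('Tokyo', 13960000, 'Japan')")


def list_tables() -> List[str]:
    """Names of all tables in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def describe_table(table: str) -> List[Tuple[str, str]]:
    """(column name, declared type) pairs for one table."""
    if table not in list_tables():  # guard: PRAGMA cannot take bound parameters
        raise ValueError(f"unknown table: {table}")
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]


def run_query(sql: str) -> List[Tuple[Any, ...]]:
    """Execute a SQL statement and return all rows."""
    return conn.execute(sql).fetchall()


tables = list_tables()
columns = describe_table("city_stats")
first_row = run_query("SELECT * FROM city_stats LIMIT 1")
print(tables, columns, first_row)
```

Each of these functions has a clear name, typed arguments, and a docstring, which is what an agent needs in order to decide when to call them.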

In [None]:
db_tools = DatabaseToolSpec(sql_database)
agent = OpenAIAgent.from_tools(db_tools.to_tool_list())

agent.chat("What tables does this database contain").response

In [None]:
print(agent.chat("Describe the first table").response)

In [None]:
print(agent.chat("Retrieve the first row of that table").response)

---

## 4- Community Tools

### A) [Wikipedia Tool](https://llamahub.ai/l/tools/llama-index-tools-wikipedia?from=tools)  
[Github](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-wikipedia/examples/wikipedia.ipynb)

In [None]:
%pip install -q llama-index-tools-wikipedia

In [None]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.wikipedia import WikipediaToolSpec

tool_spec = WikipediaToolSpec()

agent = OpenAIAgent.from_tools(tool_spec.to_tool_list()) #, verbose=True,)

response = agent.chat("Who is Ben Affleck's spouse?")

#print(response.sources)
print(response.response)

### B) [Arxiv Tool](https://llamahub.ai/l/tools/llama-index-tools-arxiv?from=tools)  
[Github](https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/tools/llama-index-tools-arxiv)  

In [None]:
%pip install -q llama-index-tools-arxiv

In [None]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.arxiv.base import ArxivToolSpec

arxiv_tool = ArxivToolSpec()

agent = OpenAIAgent.from_tools(arxiv_tool.to_tool_list(), verbose=True,)

print(agent.chat("What are the newest discoveries on Prompt Engineering?"))

### C) [DuckDuckGoSearch Tool](https://llamahub.ai/l/tools/llama-index-tools-duckduckgo?from=tools)  
[Github](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-duckduckgo/examples/duckduckgo_search.ipynb)  

In [None]:
%pip install -q llama-index-tools-duckduckgo

In [None]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.duckduckgo import DuckDuckGoSearchToolSpec

tool_spec = DuckDuckGoSearchToolSpec()

# reuse the tool spec created above instead of instantiating a second one
agent = OpenAIAgent.from_tools(tool_spec.to_tool_list(), verbose=True,)
response = agent.chat("what are the latest developments in machine learning")

In [None]:
print(response.response)

In [None]:
agent.reset()  # clear the conversation memory before asking an unrelated question
print(agent.chat("Who is Goethe?"))

### D) [Yahoo Finance Tool](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/tools/)  
[Github](https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/tools/llama-index-tools-yahoo-finance)  

In [None]:
%pip install -q llama-index-tools-yahoo-finance

In [None]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.yahoo_finance import YahooFinanceToolSpec

tool_spec = YahooFinanceToolSpec()
agent = OpenAIAgent.from_tools(tool_spec.to_tool_list())

print(agent.chat("What is the price of Apple stock?").response)

In [None]:
print(agent.chat("What is the latest news about Apple?").response)

### E) [Open Weather Map Tool](https://llamahub.ai/l/tools/llama-index-tools-weather?from=tools)  
[Github](https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/tools/llama-index-tools-weather)  

In [None]:
%pip install -q llama-index-tools-weather
%pip install -q pyowm

In [None]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.weather import OpenWeatherMapToolSpec

os.environ["OPENWEATHER_API_KEY"]="your key"

## https://openweathermap.org/api
tool_spec = OpenWeatherMapToolSpec(key=os.environ["OPENWEATHER_API_KEY"])

agent = OpenAIAgent.from_tools(tool_spec.to_tool_list())

agent.chat("What is the temperature like in Paris?")

In [None]:
agent.chat("What is the temperature in Vienna today?")

## 5- [OnDemandLoaderTool](https://docs.llamaindex.ai/en/stable/examples/tools/OnDemandLoaderTool/)  

OnDemandLoaderTool is a powerful agent tool that allows for "on-demand" data querying from any data source on LlamaHub.  

This tool takes in a BaseReader data loader, and when called will 1) load data, 2) index data, and 3) query the data.  
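
The load, index, query sequence can be sketched in plain Python. This is a toy stand-in, not the `OnDemandLoaderTool` implementation: the class `ToyOnDemandTool`, the `fake_loader` function with its canned text, and the word-overlap "index" (used here instead of a vector index) are all assumptions made for illustration.

```python
import re
from typing import Callable, Dict, List


class ToyOnDemandTool:
    """Hypothetical stand-in: load, index and query in a single call."""

    def __init__(self, loader: Callable[[List[str]], List[str]]) -> None:
        self.loader = loader

    def __call__(self, pages: List[str], query_str: str) -> str:
        documents = self.loader(pages)  # 1) load data
        index: Dict[str, set] = {  # 2) index data: word set per sentence
            sentence: set(re.findall(r"\w+", sentence.lower()))
            for doc in documents
            for sentence in doc.split(". ")
        }
        query_words = set(re.findall(r"\w+", query_str.lower()))
        # 3) query: return the sentence sharing the most words with the query
        return max(index, key=lambda s: len(index[s] & query_words))


def fake_loader(pages: List[str]) -> List[str]:
    # Hypothetical loader returning canned text instead of calling a network API.
    corpus = {
        "Berlin": "Berlin is the capital of Germany. Berlin has a lively arts and culture scene"
    }
    return [corpus[p] for p in pages]


toy_tool = ToyOnDemandTool(fake_loader)
answer = toy_tool(["Berlin"], query_str="What's the arts and culture scene in Berlin?")
print(answer)
```

Nothing is loaded or indexed until the tool is called, which is the point of the "on-demand" design: the agent pays the loading cost only for the pages it actually asks about.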

In this walkthrough, we show how to use the OnDemandLoaderTool to convert our Wikipedia data loader into a search tool that an agent can call.

In [None]:
%pip install -q llama-index-readers-wikipedia

In [None]:
from llama_index.core.tools.ondemand_loader_tool import OnDemandLoaderTool
from llama_index.readers.wikipedia import WikipediaReader
from typing import List

from pydantic import BaseModel

#### Define Tool

First we define the WikipediaReader. Note that the load_data interface of WikipediaReader takes in a list of pages.   
By default, it queries the Wikipedia search endpoint, which auto-suggests the relevant pages.  
We then wrap the reader in our OnDemandLoaderTool.  
Since we don't specify index_cls, a simple vector store index is initialized by default.  

In [None]:
reader = WikipediaReader()

tool = OnDemandLoaderTool.from_defaults(
    reader,
    name="Wikipedia Tool",
    description="A tool for loading and querying articles from Wikipedia",
)

# run tool by itself
tool(["Berlin"], query_str="What's the arts and culture scene in Berlin?")

In [None]:
#from llama_index.indices.vector_store import VectorStoreIndex

reader = WikipediaReader()
tool = OnDemandLoaderTool.from_defaults(
    reader=reader,
    index_cls=VectorStoreIndex,
    name="Wikipedia",
    description="A tool for querying Wikipedia articles.",
)

In [None]:
response = tool(["City of Vienna"], query_str="What are the best universities in Vienna?")

In [None]:
response.content

## Also check [These Notebooks](https://github.com/run-llama/llama-hub/tree/main/llama_hub/tools/notebooks)