# Agents with LlamaIndex II

Sources [1](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/), [2](https://docs.llamaindex.ai/en/stable/understanding/putting_it_all_together/agents/), [3](https://docs.llamaindex.ai/en/stable/examples/agent/custom_agent/), [4](https://docs.llamaindex.ai/en/stable/examples/agent/openai_agent/), [5](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/agent_runner/), [6](https://medium.com/llamaindex-blog/data-agents-eed797d7972f), [LlamaHub Tools](https://llamahub.ai/?tab=tools)  
    
--- 
### Agents

An "agent" is an automated reasoning and decision engine. It takes in a user input/query and can make internal decisions for executing that query in order to return the correct result. The key agent components can include, but are not limited to:

+ Breaking down a complex question into smaller ones
+ Choosing an external Tool to use and coming up with parameters for calling the Tool
+ Planning out a set of tasks
+ Storing previously completed tasks in a memory module

LlamaIndex provides a comprehensive framework for building agents. This includes the following components:

+ Using agents with tools at a high level to build agentic RAG and workflow automation use cases **(Previous Notebook)**
+ Low-level components for building and debugging agents
+ Core agent ingredients that can be used as standalone modules: query planning, tool use, and more **(Previous Notebook)**

### Tools

Having proper tool abstractions is at the core of building data agents. Defining a set of Tools is similar to defining any API interface, with the exception that these Tools are meant for agent rather than human use. We allow users to define both a Tool as well as a ToolSpec containing a series of functions under the hood.

When using an agent or LLM with function calling, the tool selected (and the arguments written for that tool) relies strongly on the tool name and on the description of the tool's purpose and arguments. Spending time tuning these parameters can result in large changes in how the LLM calls these tools.

A Tool implements a very generic interface: simply define `__call__` and return some basic metadata (name, description, function schema).

LlamaIndex offers a few different types of Tools:

+ FunctionTool: A function tool allows users to easily convert any user-defined function into a Tool. It can also auto-infer the function schema. **(Previous Notebook)**
+ QueryEngineTool: A tool that wraps an existing query engine. Note: since our agent abstractions inherit from BaseQueryEngine, these tools can also wrap other agents. **(Previous Notebook)**
+ Community contributed ToolSpecs that define one or more tools around a single service (like Gmail): see [LlamaHub](https://llamahub.ai/)
+ Utility tools for wrapping other tools to handle returning large amounts of data from a tool: see [OnDemandLoaderTool](https://docs.llamaindex.ai/en/stable/examples/tools/OnDemandLoaderTool/)
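
To make the "generic interface" idea concrete, here is a minimal standard-library sketch of a tool-like wrapper. It is not the LlamaIndex implementation: `SketchTool` and `SketchToolMetadata` are hypothetical names used only for illustration. It shows how a name, a description, and a simple parameter schema can be inferred from a function's signature and docstring, which is the information an LLM relies on when choosing a tool.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class SketchToolMetadata:
    """Hypothetical metadata container: name, description, parameter types."""
    name: str
    description: str
    parameters: Dict[str, str]


class SketchTool:
    """Hypothetical tool wrapper (an illustration, not the LlamaIndex class)."""

    def __init__(self, fn: Callable[..., Any]) -> None:
        self._fn = fn
        signature = inspect.signature(fn)
        # Infer the metadata from the function itself.
        self.metadata = SketchToolMetadata(
            name=fn.__name__,
            description=inspect.getdoc(fn) or "",
            parameters={
                name: getattr(param.annotation, "__name__", str(param.annotation))
                for name, param in signature.parameters.items()
            },
        )

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Calling the tool simply forwards to the wrapped function.
        return self._fn(*args, **kwargs)


def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y


sketch_tool = SketchTool(add)
print(sketch_tool.metadata)
print(sketch_tool(2, 3))
```

Because the description comes straight from the docstring, an unclear or misleading docstring directly affects which tool gets picked.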

### Use Cases

The scope of possible use cases for agents is vast and ever-expanding. That said, here are some practical use cases that can deliver immediate value.

+ Agentic RAG: Build a context-augmented research assistant over your data that answers not only simple questions but also complex research tasks. (Previous Notebook)
+ SQL Agent: A subset of the above is a "text-to-SQL assistant" that can interact with a structured database. 
+ Workflow Assistant: Build an agent that can operate over common workflow tools like email and calendar.
+ Coding Assistant: Build an agent that can operate over code.

## Installing Packages

In [5]:
#%pip install -qU openai
#%pip install -qU llama-index
#%pip install -qU llama_hub
#%pip install -qU llama-index-experimental
#%pip install llama-index-agent-openai
#%pip install llama-index-readers-web
#%pip install llama-index-tools-google


#!pip install -q pypdf
#!pip install -q docx2txt

## Importing Packages

In [6]:
import os
import openai

#os.environ["OPENAI_API_KEY"] = "<the key>"
openai.api_key = os.environ["OPENAI_API_KEY"]

import sys
import shutil
import glob
import logging
from pathlib import Path
from IPython.display import Image

import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import llama_index
from llama_index.llms.openai import OpenAI

## Llamaindex readers
from llama_index.core import SimpleDirectoryReader

## LlamaIndex Index Types
from llama_index.core import ListIndex
from llama_index.core import VectorStoreIndex
from llama_index.core import TreeIndex
from llama_index.core import KeywordTableIndex
from llama_index.core import SimpleKeywordTableIndex
from llama_index.core import DocumentSummaryIndex
from llama_index.core import SummaryIndex
from llama_index.core import KnowledgeGraphIndex
from llama_index.experimental.query_engine import PandasQueryEngine

## LlamaIndex Context Managers
from llama_index.core import StorageContext
from llama_index.core import load_index_from_storage
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.core.response_synthesizers import ResponseMode
from llama_index.core.schema import Node

## LlamaIndex Templates
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts import ChatPromptTemplate
from llama_index.core.base.llms.types import ChatMessage, MessageRole

## LlamaIndex Agents
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import ReActAgent

## LlamaIndex Callbacks
from llama_index.core.callbacks import CallbackManager
from llama_index.core.callbacks import LlamaDebugHandler

import nest_asyncio
nest_asyncio.apply()

In [7]:
import logging

#logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
#logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

#### Defining Models

In [24]:
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

model="gpt-4o"
#model="gpt-4o-mini"

Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
Settings.llm = OpenAI(temperature=0, 
                      model=model, 
                      #max_tokens=512
                      PRESENCE_PENALTY=-2,
                      TOP_P=1,
                     )

llm = Settings.llm

---

## 1- Function Tool  
A function tool is a simple wrapper around any existing function (both sync and async are supported!).  

In [25]:
from llama_index.core.tools import FunctionTool

def add(x: int, y: int) -> int:
    """Adds two integers together."""
    return x + y

def subtract(x: int, y: int) -> int: 
    """Subtract the second number from the first number."""
    return (x - y)

def multiply(x: int, y: int) -> int: 
    """Multiply one number by the other number."""
    return (x * y)

def uppercase(x: str) -> str: 
    """Return the input string in uppercase."""
    return (x.upper())

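def square(x: int) -> int: 
    """Return the square of the given number."""
    # NOTE: deliberately wrong (returns 2*x, not x*x). It is used below to show
    # that the LLM trusts the docstring and does not check the implementation.
    return (x + x)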

add_tool = FunctionTool.from_defaults(fn=add)
subtract_tool = FunctionTool.from_defaults(fn=subtract)
multiply_tool = FunctionTool.from_defaults(fn=multiply)
uppercase_tool = FunctionTool.from_defaults(fn=uppercase)
square_tool = FunctionTool.from_defaults(fn=square)

In [26]:
response = llm.predict_and_call(
    [add_tool, 
     subtract_tool, 
     multiply_tool, 
     uppercase_tool,
     square_tool,
    ], 
    "Tell me the output of 3 - 12 ", 
    verbose=True
)
print(str(response))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: subtract with args: {"x": 3, "y": 12}
=== Function Output ===
-9
-9


### The LLM does not verify what a tool actually does; it picks tools based on their names and docstrings

In [27]:
response = llm.predict_and_call(
    [add_tool, 
     subtract_tool, 
     multiply_tool, 
     uppercase_tool,
     square_tool,
    ], 
    "Tell me how much is three to the power of two", 
    verbose=True
)
print(str(response))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: square with args: {"x": 3}
=== Function Output ===
6
6
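
Since the LLM will not notice a faulty tool, one option is to check each function against a few known input/output pairs before registering it. The following is a minimal sketch of that idea; `check_tool` is a hypothetical helper (not part of LlamaIndex), and `square` repeats the faulty definition from above.

```python
from typing import Any, Callable, List, Tuple


def square(x: int) -> int:
    """Return the square of the given number."""
    return x + x  # same faulty body as in the cell above


def check_tool(
    fn: Callable[..., Any],
    cases: List[Tuple[tuple, Any]],
) -> List[Tuple[tuple, Any, Any]]:
    """Run fn on each (args, expected) pair and collect the failing cases."""
    failures = []
    for args, expected in cases:
        actual = fn(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


# x=2 passes by coincidence (2 + 2 == 2 * 2); x=3 exposes the bug.
failures = check_tool(square, [((2,), 4), ((3,), 9)])
print(failures)
```

Note that a single test case would not have been enough here, which is a reminder to pick more than one check per tool.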


---

## 2- [Coding Assistant and Interpreter](https://llamahub.ai/l/tools/llama-index-tools-code-interpreter?from=)  
[Github](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-code-interpreter/examples/code_interpreter.ipynb)

In [28]:
%pip install -q llama-index-tools-code_interpreter

Note: you may need to restart the kernel to use updated packages.


In [29]:
from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.code_interpreter import CodeInterpreterToolSpec

code_spec = CodeInterpreterToolSpec()
tools = code_spec.to_tool_list()
agent = OpenAIAgent.from_tools(tools, verbose=True)

In [30]:
print(agent.chat("Can you help me write some python code to pass to the code_interpreter tool"))

Added user message to memory: Can you help me write some python code to pass to the code_interpreter tool
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Of course! What would you like the Python code to do? If you have a specific task or problem in mind, please let me know, and I can help you write the code for it.


In [31]:
print(agent.chat("Can you check if there is a file named ../../Data/california_housing_train.csv?"))

Added user message to memory: Can you check if there is a file named ../../Data/california_housing_train.csv?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Certainly! Here's a Python code snippet that checks if a file named `california_housing_train.csv` exists in the `../../Data/` directory:

```python
import os

# Define the file path
file_path = '../../Data/california_housing_train.csv'

# Check if the file exists
file_exists = os.path.isfile(file_path)

# Output the result
file_exists
```

Would you like me to run this code to check for the file's existence?


In [32]:
print(agent.chat("Please do"))

Added user message to memory: Please do
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: code_interpreter with args: {"code":"import os\n\n# Define the file path\nfile_path = '../../Data/california_housing_train.csv'\n\n# Check if the file exists\nfile_exists = os.path.isfile(file_path)\n\n# Output the result\nfile_exists"}
Got output: StdOut:
b''
StdErr:
b''

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: code_interpreter with args: {"code":"import os\n\n# Define the file path\nfile_path = '../../Data/california_housing_train.csv'\n\n# Check if the file exists\nfile_exists = os.path.isfile(file_path)\n\n# Print the result\nif file_exists:\n    print('The file

In [33]:
print(agent.chat(
        """Can you write and execute code to tell me its columns?"""
    )
)

Added user message to memory: Can you write and execute code to tell me its columns?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: code_interpreter with args: {"code":"import pandas as pd\n\n# Load the CSV file\nfile_path = '../../Data/california_housing_train.csv'\ndata = pd.read_csv(file_path)\n\n# Get the columns of the DataFrame\ncolumns = data.columns.tolist()\n\n# Print the columns\ncolumns"}
Got output: StdOut:
b''
StdErr:
b''

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: code_interpreter with args: {"code":"import pandas as pd\n\n# Load the CSV file\nfile_path = '../../Data/california_housing_train.csv'\ndata = pd.read_csv(file_path)\n\n# Get the 

In [34]:
print(agent.chat("In which country are these latitudes and longitudes?"))

Added user message to memory: In which country are these latitudes and longitudes?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The latitudes and longitudes in the `california_housing_train.csv` file are likely to be in the United States, specifically in the state of California. This is inferred from the context of the dataset name, which suggests it pertains to housing data in California.


In [35]:
print(agent.chat("Can you make a graph of the median income per total rooms?"))

Added user message to memory: Can you make a graph of the median income per total rooms?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: code_interpreter with args: {"code":"import matplotlib.pyplot as plt\n\n# Plot median income vs total rooms\nplt.figure(figsize=(10, 6))\nplt.scatter(data['total_rooms'], data['median_income'], alpha=0.5)\nplt.title('Median Income vs Total Rooms')\nplt.xlabel('Total Rooms')\nplt.ylabel('Median Income')\nplt.grid(True)\nplt.show()"}
Got output: StdOut:
b''
StdErr:
b'Traceback (most recent call last):\n  File "<string>", line 5, in <module>\nNameError: name \'data\' is not defined\n'

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling funct

In [36]:
print(agent.chat("I cant see the plot - can you save it locally with file name `output.png`?"))

Added user message to memory: I cant see the plot - can you save it locally with file name `output.png`?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: code_interpreter with args: {"code":"# Save the plot to a file\nplt.figure(figsize=(10, 6))\nplt.scatter(data['total_rooms'], data['median_income'], alpha=0.5)\nplt.title('Median Income vs Total Rooms')\nplt.xlabel('Total Rooms')\nplt.ylabel('Median Income')\nplt.grid(True)\nplt.savefig('output.png')"}
Got output: StdOut:
b''
StdErr:
b'Traceback (most recent call last):\n  File "<string>", line 2, in <module>\nNameError: name \'plt\' is not defined\n'

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: code_inter
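
The empty `StdOut` values and the `NameError` messages in the transcript above are consistent with each snippet being executed in a fresh Python process: a bare expression prints nothing outside an interactive session, and variables from an earlier call are gone in the next one. The sketch below reproduces both effects with the standard library only. It assumes the tool behaves like a plain `python -c` subprocess call, and `run_snippet` is a hypothetical helper.

```python
import subprocess
import sys
from typing import Tuple


def run_snippet(code: str) -> Tuple[str, str]:
    """Run code in a fresh Python process and return (stdout, stderr) as text."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.stdout, result.stderr


# 1) A bare expression produces no output when run as a script.
out_expr, err_expr = run_snippet("value = 41 + 1\nvalue")

# 2) State does not carry over: 'value' is undefined in a new process.
out_state, err_state = run_snippet("print(value)")

# 3) A self-contained snippet that prints explicitly works as expected.
out_ok, err_ok = run_snippet("value = 41 + 1\nprint(value)")

print(repr(out_expr), repr(out_ok))
print(err_state.strip().splitlines()[-1])
```

Under that assumption, prompts that ask the agent to write self-contained code with explicit `print` calls should need fewer retries.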

---

## 3- [SQL Agent](https://llamahub.ai/l/tools/llama-index-tools-database?from=)  
[GitHub](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-database/examples/database.ipynb), [article](https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo/)

In [37]:
%pip install -q llama-index-tools-database
%pip install -q sqlalchemy

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


In [38]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.database import DatabaseToolSpec
from llama_index.core import SQLDatabase
from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    select,
    insert,
    text,
)

In [39]:
engine = create_engine("sqlite:///:memory:")
metadata_obj = MetaData()

In [40]:
# create city SQL table
table_name = "city_stats"
city_stats_table = Table(
    table_name,
    metadata_obj,
    Column("city_name", String(16), primary_key=True),
    Column("population", Integer),
    Column("country", String(16), nullable=False),
)
metadata_obj.create_all(engine)

#### Define SQL Database   

We first define our SQLDatabase abstraction (a light wrapper around SQLAlchemy).

In [41]:
sql_database = SQLDatabase(engine, include_tables=["city_stats"])

#### We add some testing data to our SQL database

In [42]:
rows = [
    {"city_name": "Toronto", "population": 2930000, "country": "Canada"},
    {"city_name": "Tokyo", "population": 13960000, "country": "Japan"},
    {"city_name": "Chicago", "population": 2679000, "country": "United States",},
    {"city_name": "Seoul", "population": 9776000, "country": "South Korea"},
]
for row in rows:
    stmt = insert(city_stats_table).values(**row)
    with engine.begin() as connection:
        cursor = connection.execute(stmt)

In [43]:
# view current table
stmt = select(
    city_stats_table.c.city_name,
    city_stats_table.c.population,
    city_stats_table.c.country,
).select_from(city_stats_table)

with engine.connect() as connection:
    results = connection.execute(stmt).fetchall()
    print(results)

[('Toronto', 2930000, 'Canada'), ('Tokyo', 13960000, 'Japan'), ('Chicago', 2679000, 'United States'), ('Seoul', 9776000, 'South Korea')]


#### Query Index  
We first show how we can execute a raw SQL query, which directly executes over the table.  

In [44]:
with engine.connect() as con:
    rows = con.execute(text("SELECT city_name from city_stats"))
    for row in rows:
        print(row)

('Chicago',)
('Seoul',)
('Tokyo',)
('Toronto',)


#### Part 1: Text-to-SQL Query Engine

Once we have constructed our SQL database, we can use the NLSQLTableQueryEngine to construct natural language queries that are synthesized into SQL queries.

Note that we need to specify the tables we want to use with this query engine. If we don't, the query engine will pull all the schema context, which could overflow the context window of the LLM.

This query engine should be used in any case where you can specify the tables you want to query over beforehand, or the total size of all the table schema plus the rest of the prompt fits your context window.
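
To illustrate why restricting the tables matters, the sketch below builds the schema text that would be placed in a prompt, using only `sqlite3`. It is a simplified stand-in rather than the query engine's actual code: `schema_context` and the `unrelated_logs` table are hypothetical. Only the tables passed in contribute to the prompt, so its size stays bounded.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_stats ("
    "city_name VARCHAR(16) PRIMARY KEY, "
    "population INTEGER, "
    "country VARCHAR(16) NOT NULL)"
)
# Hypothetical extra table that the question does not need.
conn.execute("CREATE TABLE unrelated_logs (id INTEGER PRIMARY KEY, message TEXT)")


def schema_context(connection: sqlite3.Connection, tables: List[str]) -> str:
    """Describe only the requested tables, one line per table."""
    lines = []
    for table in tables:
        # Table names are trusted constants in this sketch, so string
        # interpolation into PRAGMA is acceptable here.
        columns = connection.execute(f"PRAGMA table_info({table})").fetchall()
        described = ", ".join(
            f"{name} ({col_type})" for _cid, name, col_type, *_rest in columns
        )
        lines.append(f"Table '{table}' has columns: {described}.")
    return "\n".join(lines)


prompt_context = schema_context(conn, ["city_stats"])
print(prompt_context)
```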

In [45]:
from llama_index.core.query_engine import NLSQLTableQueryEngine

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database, tables=["city_stats"], llm=llm
)
query_str = "Which city has the highest population?"
response = query_engine.query(query_str)

INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


In [46]:
from IPython.display import Markdown
display(Markdown(f"<b>{response}</b>"))

<b>The city with the highest population is Tokyo, with a population of 13,960,000.</b>

#### Part 2: Query-Time Retrieval of Tables for Text-to-SQL

If we don't know ahead of time which table we would like to use, and the total size of the table schema overflows your context window size, we should store the table schema in an index so that during query time we can retrieve the right schema.

The way we can do this is using the SQLTableNodeMapping object, which takes in a SQLDatabase and produces a Node object for each SQLTableSchema object passed into the ObjectIndex constructor.  
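
The retrieval step can be pictured as follows: keep a short description per table, score each description against the question, and pass only the best-matching schema to the LLM. This sketch uses a crude word-overlap score as a stand-in for embedding similarity; the table descriptions and the `retrieve_tables` helper are hypothetical and only illustrate the idea.

```python
import re
from typing import Dict, List, Set

# Hypothetical table descriptions; only city_stats exists in this notebook.
table_descriptions: Dict[str, str] = {
    "city_stats": "population and country of each city",
    "order_items": "products and quantities in each customer order",
    "employee_payroll": "salary and pay date of each employee",
}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve_tables(query: str, descriptions: Dict[str, str], top_k: int = 1) -> List[str]:
    """Return the top_k table names whose name and description overlap most with the query."""
    query_tokens = tokenize(query)

    def score(table: str) -> int:
        table_tokens = tokenize(table.replace("_", " ") + " " + descriptions[table])
        return len(query_tokens & table_tokens)

    return sorted(descriptions, key=score, reverse=True)[:top_k]


selected = retrieve_tables("Which city has the highest population?", table_descriptions)
print(selected)
```

A real setup would replace the overlap score with vector similarity over embedded schema text, but the flow (index the schemas, retrieve at query time, then generate SQL) stays the same.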

In [47]:
from llama_index.core.indices.struct_store.sql_query import (
    SQLTableRetrieverQueryEngine,
)
from llama_index.core.objects import (
    SQLTableNodeMapping,
    ObjectIndex,
    SQLTableSchema,
)
#from llama_index.core import VectorStoreIndex

# set Logging to DEBUG for more detailed outputs
table_node_mapping = SQLTableNodeMapping(sql_database)
table_schema_objs = [
    (SQLTableSchema(table_name="city_stats"))
]  # add a SQLTableSchema for each table

obj_index = ObjectIndex.from_objects(
    table_schema_objs,
    table_node_mapping,
    VectorStoreIndex,
)
query_engine = SQLTableRetrieverQueryEngine(
    sql_database, obj_index.as_retriever(similarity_top_k=1)
)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


#### Now we can take our SQLTableRetrieverQueryEngine and query it for our response.

In [48]:
response = query_engine.query("Which city has the highest population?")
display(Markdown(f"<b>{response}</b>"))

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


<b>The city with the highest population is Tokyo, with a population of 13,960,000.</b>

In [49]:
# you can also fetch the raw result from SQLAlchemy!
response.metadata["result"]

[('Tokyo', 13960000)]

#### You could also add additional context information for each table schema you define.

In [50]:
# manually set context text
city_stats_text = (
    "This table gives information regarding the population and country of a"
    " given city.\nThe user will query with codewords, where 'foo' corresponds"
    " to population and 'bar' corresponds to city."
)

table_node_mapping = SQLTableNodeMapping(sql_database)
table_schema_objs = [
    (SQLTableSchema(table_name="city_stats", context_str=city_stats_text))
]

#### Part 3: Text-to-SQL Retriever

So far our text-to-SQL capability is packaged in a query engine and consists of both retrieval and synthesis.

You can use the SQL retriever on its own. We show you some different parameters you can try, and also show how to plug it into our RetrieverQueryEngine to get roughly the same results.


In [51]:
from llama_index.core.retrievers import NLSQLRetriever

# default retrieval (return_raw=True)
nl_sql_retriever = NLSQLRetriever(
    sql_database, tables=["city_stats"], return_raw=True
)

In [52]:
results = nl_sql_retriever.retrieve(
    "Return the top 5 cities (along with their populations) with the highest population."
)

INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


In [53]:
from llama_index.core.response.notebook_utils import display_source_node

for n in results:
    display_source_node(n)

**Node ID:** 5a845067-09e9-4d13-8379-e3a7eca90ed3<br>**Similarity:** None<br>**Text:** [('Tokyo', 13960000), ('Seoul', 9776000), ('Toronto', 2930000), ('Chicago', 2679000)]<br>

In [54]:
# non-default retrieval (return_raw=False): one node per result row
nl_sql_retriever = NLSQLRetriever(
    sql_database, tables=["city_stats"], return_raw=False
)

In [55]:
results = nl_sql_retriever.retrieve(
    "Return the top 5 cities (along with their populations) with the highest population."
)

INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


In [56]:
# NOTE: all the content is in the metadata
for n in results:
    display_source_node(n, show_source_metadata=True)

**Node ID:** 8bd08b58-e4ef-49a2-a603-3a9efa3faf8a<br>**Similarity:** None<br>**Text:** <br>**Metadata:** {'city_name': 'Tokyo', 'population': 13960000}<br>

**Node ID:** b7142c9a-5030-4178-80d8-e8eaf51776e1<br>**Similarity:** None<br>**Text:** <br>**Metadata:** {'city_name': 'Seoul', 'population': 9776000}<br>

**Node ID:** 570e1535-a8be-4f17-b79c-cf278c276864<br>**Similarity:** None<br>**Text:** <br>**Metadata:** {'city_name': 'Toronto', 'population': 2930000}<br>

**Node ID:** 5c70d1d8-788b-44e7-acd1-be1ddb5b3efb<br>**Similarity:** None<br>**Text:** <br>**Metadata:** {'city_name': 'Chicago', 'population': 2679000}<br>
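
The two outputs above differ only in how the SQL result is packaged. The sketch below mimics that difference with `sqlite3` and plain dictionaries. It is a simplified illustration, not the retriever's implementation: `run_query` is a hypothetical helper, and real nodes carry more fields than shown here.

```python
import sqlite3
from typing import Any, Dict, List

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_stats (city_name TEXT PRIMARY KEY, population INTEGER, country TEXT)"
)
conn.executemany(
    "INSERT INTO city_stats VALUES (?, ?, ?)",
    [
        ("Toronto", 2930000, "Canada"),
        ("Tokyo", 13960000, "Japan"),
        ("Chicago", 2679000, "United States"),
        ("Seoul", 9776000, "South Korea"),
    ],
)


def run_query(connection: sqlite3.Connection, sql: str, return_raw: bool) -> List[Dict[str, Any]]:
    """Package a SQL result either as one text blob or as one record per row."""
    cursor = connection.execute(sql)
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    if return_raw:
        # One node whose text is the stringified list of rows.
        return [{"text": str(rows), "metadata": {}}]
    # One node per row, with the values stored as metadata.
    return [{"text": "", "metadata": dict(zip(columns, row))} for row in rows]


sql = "SELECT city_name, population FROM city_stats ORDER BY population DESC LIMIT 5"
raw_nodes = run_query(conn, sql, return_raw=True)
row_nodes = run_query(conn, sql, return_raw=False)
print(raw_nodes[0]["text"])
print(row_nodes[0]["metadata"])
```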

#### Plug into our RetrieverQueryEngine

We compose our SQL Retriever with our standard RetrieverQueryEngine to synthesize a response. The result is roughly similar to our packaged Text-to-SQL query engines.

In [57]:
from llama_index.core.query_engine import RetrieverQueryEngine

query_engine = RetrieverQueryEngine.from_args(nl_sql_retriever)

In [58]:
response = query_engine.query(
    "Return the top 5 cities (along with their populations) with the highest population."
)

INFO:llama_index.core.indices.struct_store.sql_retriever:> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
> Table desc str: Table 'city_stats' has columns: city_name (VARCHAR(16)), population (INTEGER), country (VARCHAR(16)), .
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


In [59]:
print(str(response))

1. Tokyo - 13,960,000  
2. Seoul - 9,776,000  
3. Toronto - 2,930,000  
4. Chicago - 2,679,000  


#### Using Database Tool

This tool connects to a database (using SQLAlchemy under the hood) and allows an Agent to query the database and get information about the tables.

In [60]:
db_tools = DatabaseToolSpec(sql_database)
agent = OpenAIAgent.from_tools(db_tools.to_tool_list())

agent.chat("What tables does this database contain").response

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


'The database contains a table named `city_stats`.'

In [61]:
print(agent.chat("Describe the first table").response)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The `city_stats` table has the following structure:

- `city_name` (VARCHAR(16), NOT NULL): The name of the city, serving as the primary key.
- `population` (INTEGER): The population of the city.
- `country` (VARCHAR(16), NOT NULL): The country where the city is located.


In [62]:
print(agent.chat("Retrieve the first row of that table").response)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The first row of the `city_stats` table contains the following data:

- City Name: Toronto
- Population: 2,930,000
- Country: Canada
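
The three questions above map naturally onto three small database operations: list the tables, describe one table, and run a query. The sketch below writes them as plain functions over `sqlite3` to show what kind of callables an agent could be given. The function names are hypothetical and this is not the `DatabaseToolSpec` source.

```python
import sqlite3
from typing import List, Optional, Tuple

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city_stats ("
    "city_name VARCHAR(16) PRIMARY KEY, "
    "population INTEGER, "
    "country VARCHAR(16) NOT NULL)"
)
conn.execute("INSERT INTO city_stats VALUES ('Toronto', 2930000, 'Canada')")
conn.execute("INSERT INTO city_stats VALUES ('Tokyo', 13960000, 'Japan')")


def list_tables(connection: sqlite3.Connection) -> List[str]:
    """Return the names of all user tables in the database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def describe_table(connection: sqlite3.Connection, table: str) -> Optional[str]:
    """Return the CREATE TABLE statement for a table, or None if it does not exist."""
    row = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row[0] if row else None


def first_row(connection: sqlite3.Connection, table: str) -> Optional[Tuple]:
    """Return the first inserted row of a table, after validating the table name."""
    if table not in list_tables(connection):
        raise ValueError(f"Unknown table: {table}")
    return connection.execute(
        f'SELECT * FROM "{table}" ORDER BY rowid LIMIT 1'
    ).fetchone()


tables = list_tables(conn)
description = describe_table(conn, "city_stats")
row = first_row(conn, "city_stats")
print(tables, row)
```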


---

## 4- Community Tools

### A) [Wikipedia Tool](https://llamahub.ai/l/tools/llama-index-tools-wikipedia?from=tools)  
[Github](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-wikipedia/examples/wikipedia.ipynb)

In [63]:
%pip install -q llama-index-tools-wikipedia

Note: you may need to restart the kernel to use updated packages.


In [64]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.wikipedia import WikipediaToolSpec

tool_spec = WikipediaToolSpec()

agent = OpenAIAgent.from_tools(tool_spec.to_tool_list()) #, verbose=True,)

response = agent.chat("Who is Ben Afflecks spouse?")

#print(response.sources)
print(response.response)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Ben Affleck's spouse is Jennifer Lopez. They got married in July 2022.


### B) [Arxiv Tool](https://llamahub.ai/l/tools/llama-index-tools-arxiv?from=tools)  
[Github](https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/tools/llama-index-tools-arxiv)  

In [65]:
%pip install -q llama-index-tools-arxiv

Note: you may need to restart the kernel to use updated packages.


In [66]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.arxiv.base import ArxivToolSpec

arxiv_tool = ArxivToolSpec()

agent = OpenAIAgent.from_tools(arxiv_tool.to_tool_list(), verbose=True,)

print(agent.chat("Whats are the newest discoveries on Prompt Engineering?"))

Added user message to memory: Whats are the newest discoveries on Prompt Engineering?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: arxiv_query with args: {"query":"Prompt Engineering","sort_by":"recent"}
INFO:arxiv:Requesting page (first: True, try: 0): https://export.arxiv.org/api/query?search_query=Prompt+Engineering&id_list=&sortBy=submittedDate&sortOrder=descending&start=0&max_results=100
Requesting page (first: True, try: 0): https://export.arxiv.org/api/query?search_query=Prompt+Engineering&id_list=&sortBy=submittedDate&sortOrder=descending&start=0&max_results=100
INFO:arxiv:Got first page: 100 of 98739 total results
Got first page: 100 of 98739 total results
Got output: [Document(id_='9f64bfde-5ffb-4a50-9e16-09b76991b20c', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[]

### C) [DuckDuckGoSearch Tool](https://llamahub.ai/l/tools/llama-index-tools-duckduckgo?from=tools)  
[Github](https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/tools/llama-index-tools-duckduckgo/examples/duckduckgo_search.ipynb)  

In [67]:
%pip install -q llama-index-tools-duckduckgo

Note: you may need to restart the kernel to use updated packages.


In [68]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.duckduckgo import DuckDuckGoSearchToolSpec

tool_spec = DuckDuckGoSearchToolSpec()

# Reuse the spec created above instead of instantiating a second one
agent = OpenAIAgent.from_tools(tool_spec.to_tool_list(), verbose=True)
response = agent.chat("what are the latest developments in machine learning")

Added user message to memory: what are the latest developments in machine learning
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: duckduckgo_full_search with args: {"query":"latest developments in machine learning","max_results":5}
INFO:primp:response: https://duckduckgo.com/?q=latest+developments+in+machine+learning 200 18991
response: https://duckduckgo.com/?q=latest+developments+in+machine+learning 200 18991
INFO:primp:response: https://links.duckduckgo.com/d.js?q=latest+developments+in+machine+learning&kl=wt-wt&l=wt-wt&p=&s=0&df=&vqd=4-146566751269094499179007859444480573369&bing_market=wt-WT&ex=-1 200 11621
response: https://links.duckduckgo.com/d.js?q=latest+developments+in+machine+learning&kl=wt-wt&l=wt-wt&p=&s=0&df=&vqd=4-146566751269094499179007859444480573369&bing_market=wt-WT&ex=-1 200 11621
Got output: [{'titl

In [69]:
print(response.response)

Here are some of the latest developments in machine learning:

1. **Generative AI**: Models like DALL-E 3, Stable Diffusion, and Google's Imagen are producing highly realistic synthetic images and videos. This has significant implications for content creation, entertainment, and art. [Read more](https://waleed.dev/Blog/latest-developments-in-machine-learning-2024-update).

2. **AI and Machine Learning Trends for 2024**: The trends reflect a deepening sophistication and caution in AI development, focusing on ethics, safety, and the evolving regulatory landscape. One of the key trends is Multimodal AI. [Read more](https://www.techtarget.com/searchenterpriseai/tip/9-top-AI-and-machine-learning-trends).

3. **Trending Machine Learning Research**: Papers With Code highlights trending machine learning research and provides the code to implement it. This includes the latest research developments, libraries, methods, and datasets. [Explore more](https://paperswithcode.com/).

4. **MIT News on 

In [70]:
# Reset the agent's memory so the next question starts a fresh conversation
agent.reset()
print(agent.chat("Who is Goethe?"))

Added user message to memory: Who is Goethe?
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
=== Calling Function ===
Calling function: duckduckgo_instant_search with args: {"query":"Goethe"}
INFO:primp:response: https://api.duckduckgo.com/?q=what+is+Goethe&format=json 200 19169
response: https://api.duckduckgo.com/?q=what+is+Goethe&format=json 200 19169
INFO:primp:response: https://api.duckduckgo.com/?q=Goethe&format=json 200 6629
response: https://api.duckduckgo.com/?q=Goethe&format=json 200 6629
Got output: [{'icon': None, 'text': "Johann Wolfgang von Goethe was a German polymath, who is widely regarded as the greatest and most influential writer in the German language. His work has had a profound and wide-ranging influence on Western literary, political, and philosophical thought from the late 18th century to the present day. A poet, playwright, novelist, scien

### D) [Yahoo Finance Tool](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/tools/)  
[Github](https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/tools/llama-index-tools-yahoo-finance)  

In [71]:
%pip install -q llama-index-tools-yahoo-finance

Note: you may need to restart the kernel to use updated packages.


In [72]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.yahoo_finance import YahooFinanceToolSpec

tool_spec = YahooFinanceToolSpec()
agent = OpenAIAgent.from_tools(tool_spec.to_tool_list())

print(agent.chat("What is the price of Apple stock?").response)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
The current price of Apple Inc. (AAPL) stock is $242.84.


In [73]:
print(agent.chat("What is the latest news about Apple?").response)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
Here are the latest news headlines about Apple Inc. (AAPL):

1. **Apple Inc. (AAPL) to Launch In-House Modem and AI-Powered iPhone SE in 2024, Reducing Reliance on Qualcomm** - Apple plans to introduce its own modem and an AI-powered iPhone SE next year.

2. **Apple co-founder Steve Jobs reminded Gap's former CEO why micromanaging is OK** - A reflection on Steve Jobs' management style and its impact.

3. **Dow Jones Futures: Nasdaq Near 20,000 But Risks Mount; Nvidia, Meta In Buy Zones** - A broader market update mentioning Apple in the context of Nasdaq's performance.

4. **Prediction: This Artificial Intelligence (AI) Company Will Be Acquired in 20

### E) [Open Weather Map Tool](https://llamahub.ai/l/tools/llama-index-tools-weather?from=tools)  
[Github](https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/tools/llama-index-tools-weather)  

In [79]:
%pip install -q llama-index-tools-weather
%pip install -q pyowm

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


In [80]:
#from llama_index.agent.openai import OpenAIAgent
from llama_index.tools.weather import OpenWeatherMapToolSpec

import os

# Get an API key from https://openweathermap.org/api and replace the placeholder
os.environ["OPENWEATHER_API_KEY"] = "your key"

tool_spec = OpenWeatherMapToolSpec(key=os.environ["OPENWEATHER_API_KEY"])

agent = OpenAIAgent.from_tools(tool_spec.to_tool_list())

agent.chat("What is the temperature like in Paris?")

AgentChatResponse(response="In Paris, the current weather is as follows:\n- Detailed status: moderate rain\n- Wind speed: 7.2 m/s, direction: 290°\n- Humidity: 77%\n- Temperature: \n  - Current: 14.88°C\n  - High: 16.64°C\n  - Low: 12.78°C\n  - Feels like: 14.43°C\n- Rain: {'1h': 0.12}\n- Heat index: None\n- Cloud cover: 75%", sources=[ToolOutput(content='[Document(id_=\'09ac82fa-8c96-472c-bac3-31727d21cc3b\', embedding=None, metadata={\'weather from\': \'Paris\'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="In Paris, the current weather is as follows:\\nDetailed status: moderate rain\\nWind speed: 7.2 m/s, direction: 290°\\nHumidity: 77%\\nTemperature: \\n  - Current: 14.88°C\\n  - High: 16.64°C\\n  - Low: 12.78°C\\n  - Feels like: 14.43°C\\nRain: {\'1h\': 0.12}\\nHeat index: None\\nCloud cover: 75%", start_char_idx=None, end_char_idx=None, text_template=\'{metadata_str}\\n\\n{content}\', metadata_template=\'{key}: {value}\', metadata_sepera

In [85]:
agent.chat("What is the temperature in Vienna today?")

AgentChatResponse(response="In Vienna, the current weather is as follows:\n- Detailed status: light rain\n- Wind speed: 2.57 m/s, direction: 210°\n- Humidity: 86%\n- Temperature: \n  - Current: 17.13°C\n  - High: 18.93°C\n  - Low: 15.64°C\n  - Feels like: 17.14°C\n- Rain: {'1h': 0.23}\n- Heat index: None\n- Cloud cover: 20%", sources=[ToolOutput(content='[Document(id_=\'5153a3c3-5dad-4193-b3e5-15d3e0bc10a6\', embedding=None, metadata={\'weather from\': \'Vienna\'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="In Vienna, the current weather is as follows:\\nDetailed status: light rain\\nWind speed: 2.57 m/s, direction: 210°\\nHumidity: 86%\\nTemperature: \\n  - Current: 17.13°C\\n  - High: 18.93°C\\n  - Low: 15.64°C\\n  - Feels like: 17.14°C\\nRain: {\'1h\': 0.23}\\nHeat index: None\\nCloud cover: 20%", start_char_idx=None, end_char_idx=None, text_template=\'{metadata_str}\\n\\n{content}\', metadata_template=\'{key}: {value}\', metadata_seperat

## 5- [OnDemandLoaderTool](https://docs.llamaindex.ai/en/stable/examples/tools/OnDemandLoaderTool/)  

OnDemandLoaderTool is a powerful agent tool that allows for "on-demand" data querying from any data source on LlamaHub.  

This tool takes in a BaseReader data loader, and when called will 1) load data, 2) index data, and 3) query the data.  

In this walkthrough, we show how to use the OnDemandLoaderTool to convert our Wikipedia data loader into an accessible search tool for a LangChain agent.
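The load, index, and query steps described above can be sketched in plain Python. This is only an illustration of the flow, not the library's implementation: `ToyReader`, `ToyIndex`, and `ToyOnDemandTool` are hypothetical stand-ins, and the keyword-overlap "index" is an assumption swapped in for a real vector store index.

```python
from typing import Dict, List


class ToyReader:
    """Hypothetical stand-in for a data loader: maps page names to text."""

    def __init__(self, corpus: Dict[str, str]) -> None:
        self._corpus = corpus

    def load_data(self, pages: List[str]) -> List[str]:
        # Return the text of every requested page that exists in the corpus
        return [self._corpus[p] for p in pages if p in self._corpus]


class ToyIndex:
    """Hypothetical keyword index (assumption: replaces a vector store index)."""

    def __init__(self, docs: List[str]) -> None:
        # Split every document into sentences so a query can pick one of them
        self._sentences = [
            s.strip() for d in docs for s in d.split(".") if s.strip()
        ]

    def query(self, query_str: str) -> str:
        if not self._sentences:
            return ""
        words = set(query_str.lower().split())
        # Pick the sentence that shares the most words with the query
        return max(
            self._sentences,
            key=lambda s: len(words & set(s.lower().split())),
        )


class ToyOnDemandTool:
    """Loads, indexes, and queries on every call, keeping no state in between."""

    def __init__(self, reader: ToyReader, query_str_key: str = "query_str") -> None:
        self._reader = reader
        self._query_str_key = query_str_key

    def __call__(self, *args, **kwargs) -> str:
        # The query is pulled out; all remaining arguments go to the loader
        query_str = kwargs.pop(self._query_str_key)
        docs = self._reader.load_data(*args, **kwargs)  # 1) load
        index = ToyIndex(docs)                          # 2) index
        return index.query(query_str)                   # 3) query


corpus = {
    "Berlin": "Berlin has three major opera houses. Berlin is the capital of Germany."
}
tool = ToyOnDemandTool(ToyReader(corpus))
answer = tool(["Berlin"], query_str="How many opera houses are there?")
print(answer)
```

The point of the sketch is the call signature: the loader arguments and the query string travel together in one call, which is what lets an agent decide at run time both what to load and what to ask.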

In [86]:
%pip install -q llama-index-readers-wikipedia

Note: you may need to restart the kernel to use updated packages.


In [87]:
from llama_index.core.tools.ondemand_loader_tool import OnDemandLoaderTool
from llama_index.readers.wikipedia import WikipediaReader
from typing import List

from pydantic import BaseModel

#### Define Tool

First we define the WikipediaReader. Note that its load_data interface takes a list of pages.  
By default, it queries the Wikipedia search endpoint, which auto-suggests the relevant pages.  
We then wrap the reader in an OnDemandLoaderTool.  
Since we don't specify index_cls, a simple vector store index is initialized by default.  

In [88]:
reader = WikipediaReader()

tool = OnDemandLoaderTool.from_defaults(
    reader,
    name="Wikipedia Tool",
    description="A tool for loading and querying articles from Wikipedia",
)

# run tool by itself
tool(["Berlin"], query_str="What's the arts and culture scene in Berlin?")

ToolOutput(content="Berlin has a rich and diverse arts and culture scene. The city is known for its vibrant nightlife, with a history of being a hub for punk music and culture, and a premier nightlife venue for people from all over the world. It has many nightclubs, and is known for its uninhibited parties. Berlin also has a long history of gay culture and hosts a number of queer clubs and festivals.\n\nThe city is home to 44 theaters and stages, including the Deutsches Theater, the Volksbühne, and the Berliner Ensemble. It also has three major opera houses: the Deutsche Oper, the Berlin State Opera, and the Komische Oper. The city's main venue for musical theater performances are the Theater am Potsdamer Platz and Theater des Westens. Contemporary dance can be seen at the Radialsystem V.\n\nBerlin also hosts the Berlin International Film Festival, which is considered to be the largest publicly attended film festival in the world. Other cultural festivals include the Karneval der Kultu
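The tool above is named "Wikipedia Tool", which works when the tool is called directly. If the same tool is later passed to an OpenAI function-calling agent, the space may be a problem, because OpenAI restricts function names to letters, digits, underscores, and hyphens (at most 64 characters). A minimal sketch of a sanitizing helper follows; `to_function_name` is a hypothetical name, not part of LlamaIndex.

```python
import re

# Characters outside this set are not accepted in OpenAI function names
_DISALLOWED = re.compile(r"[^a-zA-Z0-9_-]")


def to_function_name(name: str, max_len: int = 64) -> str:
    """Replace disallowed characters with underscores and trim to max_len."""
    cleaned = _DISALLOWED.sub("_", name.strip())
    return cleaned[:max_len]


safe_name = to_function_name("Wikipedia Tool")
print(safe_name)
```

The sanitized value could then be passed as the `name` argument when building the tool.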

In [89]:
# VectorStoreIndex is needed for index_cls below
from llama_index.core import VectorStoreIndex

reader = WikipediaReader()
tool = OnDemandLoaderTool.from_defaults(
    reader=reader,
    index_cls=VectorStoreIndex,
    index_kwargs={"dim": 768},
    name="Wikipedia",
    description="A tool for querying Wikipedia articles.",
)

In [90]:
response = tool(["City of Vienna"], query_str="What are the best universities in Vienna?")

In [91]:
response.content

'The context information lists several universities in Vienna, including the Academy of Fine Arts Vienna, Central European University, Diplomatic Academy of Vienna, Medical University of Vienna, PEF Private University of Management Vienna, University of Applied Arts Vienna, University of Applied Sciences Campus Vienna, University of Music and Performing Arts Vienna, University of Veterinary Medicine Vienna, University of Vienna, Vienna University of Economics and Business, University of Natural Resources and Life Sciences, Vienna, University of Applied Sciences Technikum Wien, TU Wien, Webster Vienna Private University, and Sigmund Freud Private University. However, it does not rank them or specify which ones are considered the best.'

## Also check [These Notebooks](https://github.com/run-llama/llama-hub/tree/main/llama_hub/tools/notebooks)