<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/agent/openai_agent_query_cookbook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# OpenAI Assistant Advanced Retrieval Cookbook


In this notebook, we try out OpenAI Assistant API for advanced retrieval tasks, by plugging in a variety of query engine tools and datasets. The wrapper abstraction we use is our `OpenAIAssistantAgent` class, which allows us to plug in custom tools. We explore how `OpenAIAssistant` can complement/replace existing workflows solved by our retrievers/query engines through its agent execution + function calling loop.

- Joint QA + Summarization
- Auto retrieval 
- Joint SQL and vector search

In [None]:
!pip install llama-index

In [1]:
import nest_asyncio

nest_asyncio.apply()

## Joint QA and Summarization

In this section we show how we can get the Assistant agent to both answer fact-based questions and summarization questions. This is something that the in-house retrieval tool struggles to accomplish.

### Load Data

In [1]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

--2023-11-11 09:40:13--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 2606:50c0:8002::154, 2606:50c0:8003::154, 2606:50c0:8000::154, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|2606:50c0:8002::154|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’


2023-11-11 09:40:14 (8.24 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]



In [2]:
from llama_index import SimpleDirectoryReader

# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

### Setup Vector + Summary Indexes/Query Engines/Tools

In [3]:
from llama_index.llms import OpenAI
from llama_index import ServiceContext, StorageContext, SummaryIndex, VectorStoreIndex

# initialize service context (set chunk size)
llm = OpenAI()
service_context = ServiceContext.from_defaults(chunk_size=1024, llm=llm)
nodes = service_context.node_parser.get_nodes_from_documents(documents)

# initialize storage context (by default it's in-memory)
storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

# Define Summary Index and Vector Index over Same Data
summary_index = SummaryIndex(nodes, storage_context=storage_context)
vector_index = VectorStoreIndex(nodes, storage_context=storage_context)

# define query engines
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
vector_query_engine = vector_index.as_query_engine()

In [8]:
from llama_index.tools.query_engine import QueryEngineTool


summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    name="summary_tool",
    description=(
        "Useful for summarization questions related to the author's life"
    ),
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    name="vector_tool",
    description=(
        "Useful for retrieving specific context to answer specific questions about the author's life"
    ),
)

### Define Assistant Agent

In [9]:
from llama_index.agent import OpenAIAssistantAgent

agent = OpenAIAssistantAgent.from_new(
    name="QA bot",
    instructions="You are a bot designed to answer questions given a bio of an author",
    openai_tools=[],
    tools=[summary_tool, vector_tool],
    verbose=True,
    run_retrieve_sleep_time=1.0,
)

In [10]:
response = agent.chat("What is the summary of the author's life?")
print(str(response))

=== Calling Function ===
Calling function: summary_tool with args: {"input":"The author's life"}
Got output: The author's life involved a diverse range of experiences and pursuits. They started their journey with an interest in writing and programming, which began before college. They explored different fields such as philosophy and AI during their academic years. Eventually, they became fascinated with Lisp and focused on Lisp hacking. Later on, they pursued art and attended art school. They also worked at a technology company and gained valuable insights. The author's life took a turn when they became interested in the World Wide Web and started a company called Viaweb, which eventually led to the creation of Y Combinator. They also engaged in various projects, including building a new dialect of Lisp called Arc, publishing essays online, and creating a programming language called Bel. Throughout their journey, they experienced personal challenges and losses, but found their work eng

In [11]:
response = agent.query("What did the author do after RICS?")
print(str(response))

=== Calling Function ===
Calling function: vector_tool with args: {"input":"What did the author do after RICS?"}
Got output: After RISD, the author did freelance work for a group that did projects for customers.
[{'tool_call_id': 'call_cSbd7c5ci4W0tvZznkHXjmRd', 'output': 'After RISD, the author did freelance work for a group that did projects for customers.'}]
After RISD (Rhode Island School of Design), the author engaged in freelance work for a group that carried out projects for various customers.


## AutoRetrieval from a Vector Database

Our existing "auto-retrieval" capabilities (in `VectorIndexAutoRetriever`) allow an LLM to infer the right query parameters for a vector database - including both the query string and metadata filter.

Since the Assistant API can call functions + infer function parameters, we explore its capabilities in performing auto-retrieval here.

If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.

In [12]:
import pinecone
import os

api_key = os.environ["PINECONE_API_KEY"]
pinecone.init(api_key=api_key, environment="us-west1-gcp")

  from tqdm.autonotebook import tqdm


In [13]:
# dimensions are for text-embedding-ada-002
try:
    pinecone.create_index(
        "quickstart", dimension=1536, metric="euclidean", pod_type="p1"
    )
except Exception:
    # most likely index already exists
    pass

In [14]:
pinecone_index = pinecone.Index("quickstart")

In [15]:
# Optional: delete data in your pinecone index
pinecone_index.delete(deleteAll=True, namespace="test")

{}

In [16]:
from llama_index import VectorStoreIndex, StorageContext
from llama_index.vector_stores import PineconeVectorStore

In [17]:
from llama_index.schema import TextNode

nodes = [
    TextNode(
        text=(
            "Michael Jordan is a retired professional basketball player,"
            " widely regarded as one of the greatest basketball players of all"
            " time."
        ),
        metadata={
            "category": "Sports",
            "country": "United States",
        },
    ),
    TextNode(
        text=(
            "Angelina Jolie is an American actress, filmmaker, and"
            " humanitarian. She has received numerous awards for her acting"
            " and is known for her philanthropic work."
        ),
        metadata={
            "category": "Entertainment",
            "country": "United States",
        },
    ),
    TextNode(
        text=(
            "Elon Musk is a business magnate, industrial designer, and"
            " engineer. He is the founder, CEO, and lead designer of SpaceX,"
            " Tesla, Inc., Neuralink, and The Boring Company."
        ),
        metadata={
            "category": "Business",
            "country": "United States",
        },
    ),
    TextNode(
        text=(
            "Rihanna is a Barbadian singer, actress, and businesswoman. She"
            " has achieved significant success in the music industry and is"
            " known for her versatile musical style."
        ),
        metadata={
            "category": "Music",
            "country": "Barbados",
        },
    ),
    TextNode(
        text=(
            "Cristiano Ronaldo is a Portuguese professional footballer who is"
            " considered one of the greatest football players of all time. He"
            " has won numerous awards and set multiple records during his"
            " career."
        ),
        metadata={
            "category": "Sports",
            "country": "Portugal",
        },
    ),
]

In [18]:
vector_store = PineconeVectorStore(
    pinecone_index=pinecone_index, namespace="test"
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

In [19]:
index = VectorStoreIndex(nodes, storage_context=storage_context)

Upserted vectors:   0%|          | 0/5 [00:00<?, ?it/s]

#### Define Function Tool

Here we define the function interface, which is passed to OpenAI to perform auto-retrieval.

We were not able to get OpenAI to work with nested pydantic objects or tuples as arguments,
so we converted the metadata filter keys and values into lists for the function API to work with.

In [30]:
# define function tool
from llama_index.tools import FunctionTool
from llama_index.vector_stores.types import (
    VectorStoreInfo,
    MetadataInfo,
    ExactMatchFilter,
    MetadataFilters,
)
from llama_index.retrievers import VectorIndexRetriever
from llama_index.query_engine import RetrieverQueryEngine

from typing import List, Tuple, Any
from pydantic import BaseModel, Field

# hardcode top k for now
top_k = 3

# define vector store info describing schema of vector store
vector_store_info = VectorStoreInfo(
    content_info="brief biography of celebrities",
    metadata_info=[
        MetadataInfo(
            name="category",
            type="str",
            description=(
                "Category of the celebrity, one of [Sports, Entertainment,"
                " Business, Music]"
            ),
        ),
        MetadataInfo(
            name="country",
            type="str",
            description=(
                "Country of the celebrity, one of [United States, Barbados,"
                " Portugal]"
            ),
        ),
    ],
)


# define pydantic model for auto-retrieval function
class AutoRetrieveModel(BaseModel):
    query: str = Field(..., description="natural language query string")
    filter_key_list: List[str] = Field(
        ..., description="List of metadata filter field names"
    )
    filter_value_list: List[str] = Field(
        ...,
        description=(
            "List of metadata filter field values (corresponding to names"
            " specified in filter_key_list)"
        ),
    )


def auto_retrieve_fn(
    query: str, filter_key_list: List[str], filter_value_list: List[str]
):
    """Auto retrieval function.

    Performs auto-retrieval from a vector database, and then applies a set of filters.

    """
    query = query or "Query"

    exact_match_filters = [
        ExactMatchFilter(key=k, value=v)
        for k, v in zip(filter_key_list, filter_value_list)
    ]
    retriever = VectorIndexRetriever(
        index,
        filters=MetadataFilters(filters=exact_match_filters),
        top_k=top_k,
    )
    results = retriever.retrieve(query)
    return [r.get_content() for r in results]
    # print([t.get_content() for t in tmp])
    # raise Exception
    
    # query_engine = RetrieverQueryEngine.from_args(retriever)

    # response = query_engine.query(query)
    # return str(response)


description = f"""\
Use this tool to look up biographical information about celebrities.
The vector database schema is given below:
{vector_store_info.json()}
"""

auto_retrieve_tool = FunctionTool.from_defaults(
    fn=auto_retrieve_fn,
    name="celebrity_bios",
    description=description,
    fn_schema=AutoRetrieveModel,
)

In [31]:
auto_retrieve_fn("celebrity from the United States", filter_key_list=["country"], filter_value_list=["United States"])

['Angelina Jolie is an American actress, filmmaker, and humanitarian. She has received numerous awards for her acting and is known for her philanthropic work.',
 'Michael Jordan is a retired professional basketball player, widely regarded as one of the greatest basketball players of all time.']

#### Initialize Agent

In [32]:
from llama_index.agent import OpenAIAssistantAgent

agent = OpenAIAssistantAgent.from_new(
    name="Celebrity bot",
    instructions="You are a bot designed to answer questions about celebrities.",
    tools=[auto_retrieve_tool],
    verbose=True
)

In [33]:
response = agent.chat("Tell me about two celebrities from the United States. ")
print(str(response))

=== Calling Function ===
Calling function: celebrity_bios with args: {"query": "Tell me about a celebrity from the United States", "filter_key_list": ["country"], "filter_value_list": ["United States"]}
Got output: ['Angelina Jolie is an American actress, filmmaker, and humanitarian. She has received numerous awards for her acting and is known for her philanthropic work.', 'Michael Jordan is a retired professional basketball player, widely regarded as one of the greatest basketball players of all time.']
=== Calling Function ===
Calling function: celebrity_bios with args: {"query": "Tell me about a celebrity from the United States", "filter_key_list": ["country"], "filter_value_list": ["United States"]}
Got output: ['Angelina Jolie is an American actress, filmmaker, and humanitarian. She has received numerous awards for her acting and is known for her philanthropic work.', 'Michael Jordan is a retired professional basketball player, widely regarded as one of the greatest basketball pla

## Joint Text-to-SQL and Semantic Search

This is currenty handled by our `SQLAutoVectorQueryEngine`.

Let's try implementing this by giving our `OpenAIAssistantAgent` access to two query tools: SQL and Vector search.

#### Load and Index Structured Data

We load sample structured datapoints into a SQL db and index it.

In [34]:
from sqlalchemy import (
    create_engine,
    MetaData,
    Table,
    Column,
    String,
    Integer,
    select,
    column,
)
from llama_index import SQLDatabase, SQLStructStoreIndex

engine = create_engine("sqlite:///:memory:", future=True)
metadata_obj = MetaData()

In [35]:
# create city SQL table
table_name = "city_stats"
city_stats_table = Table(
    table_name,
    metadata_obj,
    Column("city_name", String(16), primary_key=True),
    Column("population", Integer),
    Column("country", String(16), nullable=False),
)

metadata_obj.create_all(engine)

In [36]:
# print tables
metadata_obj.tables.keys()

dict_keys(['city_stats'])

In [37]:
from sqlalchemy import insert

rows = [
    {"city_name": "Toronto", "population": 2930000, "country": "Canada"},
    {"city_name": "Tokyo", "population": 13960000, "country": "Japan"},
    {"city_name": "Berlin", "population": 3645000, "country": "Germany"},
]
for row in rows:
    stmt = insert(city_stats_table).values(**row)
    with engine.begin() as connection:
        cursor = connection.execute(stmt)

In [38]:
with engine.connect() as connection:
    cursor = connection.exec_driver_sql("SELECT * FROM city_stats")
    print(cursor.fetchall())

[('Toronto', 2930000, 'Canada'), ('Tokyo', 13960000, 'Japan'), ('Berlin', 3645000, 'Germany')]


In [39]:
sql_database = SQLDatabase(engine, include_tables=["city_stats"])

In [40]:
from llama_index.indices.struct_store.sql_query import NLSQLTableQueryEngine

In [41]:
query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["city_stats"],
)

#### Load and Index Unstructured Data

We load unstructured data into a vector index backed by Pinecone

In [None]:
# install wikipedia python package
!pip install wikipedia


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.3.1[0m[39;49m -> [0m[32;49m23.1.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [42]:
from llama_index import (
    WikipediaReader,
    SimpleDirectoryReader,
    VectorStoreIndex,
)

In [43]:
cities = ["Toronto", "Berlin", "Tokyo"]
wiki_docs = WikipediaReader().load_data(pages=cities)

In [44]:
from llama_index.node_parser import SimpleNodeParser
from llama_index import ServiceContext
from llama_index.storage import StorageContext
from llama_index.text_splitter import TokenTextSplitter
from llama_index.llms import OpenAI

# define node parser and LLM
chunk_size = 1024
llm = OpenAI(temperature=0, model="gpt-4")
service_context = ServiceContext.from_defaults(chunk_size=chunk_size, llm=llm)
text_splitter = TokenTextSplitter(chunk_size=chunk_size)
node_parser = SimpleNodeParser.from_defaults(text_splitter=text_splitter)

# use default in-memory store
storage_context = StorageContext.from_defaults()
vector_index = VectorStoreIndex([], storage_context=storage_context)

In [45]:
# Insert documents into vector index
# Each document has metadata of the city attached
for city, wiki_doc in zip(cities, wiki_docs):
    nodes = node_parser.get_nodes_from_documents([wiki_doc])
    # add metadata to each node
    for node in nodes:
        node.metadata = {"title": city}
    vector_index.insert_nodes(nodes)

#### Define Query Engines / Tools

In [46]:
from llama_index.tools.query_engine import QueryEngineTool

In [47]:
sql_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="sql_tool",
    description=(
        "Useful for translating a natural language query into a SQL query over"
        " a table containing: city_stats, containing the population/country of"
        " each city"
    ),
)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_index.as_query_engine(similarity_top_k=2),
    name="vector_tool",
    description=(
        f"Useful for answering semantic questions about different cities"
    ),
)

#### Initialize Agent

In [49]:
from llama_index.agent import OpenAIAssistantAgent

agent = OpenAIAssistantAgent.from_new(
    name="City bot",
    instructions="You are a bot designed to answer questions about cities (both unstructured and structured data)",
    tools=[sql_tool, vector_tool],
    verbose=True
)

In [50]:
# NOTE: gpt-3.5 gives the wrong answer, but gpt-4 is able to reason over both loops
response = agent.chat(
    "Tell me about the arts and culture of the city with the highest"
    " population"
)
print(str(response))

=== Calling Function ===
Calling function: sql_tool with args: {"input":"SELECT city_name FROM city_stats ORDER BY population DESC LIMIT 1"}
Got output: The city with the highest population is Tokyo.
[{'tool_call_id': 'call_nS2NGJRsEiSn3zeIu0TqSiIc', 'output': 'The city with the highest population is Tokyo.'}]
=== Calling Function ===
Calling function: vector_tool with args: {"input":"Describe the arts and culture of Tokyo."}
Got output: Tokyo has a vibrant arts and culture scene. The city is home to numerous museums, including the Tokyo National Museum, which specializes in traditional Japanese art, the National Museum of Western Art, and the Edo-Tokyo Museum. There are also theaters for traditional forms of Japanese drama, such as the National Noh Theatre and the Kabuki-za. Tokyo hosts a variety of musical performances, ranging from symphony orchestras to modern pop and rock concerts. The city is known for its festivals, including the Sannō, Sanja, and Kanda Festivals, which feature 

In [51]:
response = agent.chat("Tell me about the history of Berlin")
print(str(response))

=== Calling Function ===
Calling function: vector_tool with args: {"input":"Describe the history of Berlin."}
Got output: Berlin has a rich and diverse history. It was first documented in the 13th century and became the capital of the Margraviate of Brandenburg, Kingdom of Prussia, German Empire, Weimar Republic, and Nazi Germany. After World War II, the city was divided, with West Berlin becoming a part of West Germany and East Berlin becoming the capital of East Germany. Following German reunification in 1990, Berlin once again became the capital of all of Germany. Throughout its history, Berlin has been a center of culture, politics, media, and science. It has experienced periods of economic boom, such as during the Gründerzeit era, and has been a hub for artistic and intellectual movements. Today, Berlin is known for its vibrant cultural scene, high-tech industries, and world-renowned universities. It is also a popular tourist destination, with numerous landmarks, museums, and fest

In [52]:
response = agent.chat(
    "Can you give me the country corresponding to each city?"
)
print(str(response))

=== Calling Function ===
Calling function: sql_tool with args: {"input":"SELECT city_name, country FROM city_stats"}
Got output: The cities in the query results are Toronto, Tokyo, and Berlin. The corresponding countries are Canada, Japan, and Germany respectively.
[{'tool_call_id': 'call_szXcIbO7xtvYhW7s6ylTVDI4', 'output': 'The cities in the query results are Toronto, Tokyo, and Berlin. The corresponding countries are Canada, Japan, and Germany respectively.'}]
For the cities mentioned:

- Toronto is in Canada.
- Tokyo is in Japan.
- Berlin is in Germany.
