This notebook was inspired by [this LlamaIndex notebook](https://colab.research.google.com/drive/1GyPRMiwxS7rKxKpRt4r-ckYfmAw2GxdQ?usp=sharing).

I made some changes to it purely to try out ideas and learn.

Note that I assume the relevant API keys are set as environment variables. The cells below also read `DATA_DIR` and `PERSIST_DIR` from the environment.
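
Because the cells below index `os.environ` directly, a missing variable surfaces as a bare `KeyError`. A quick check up front can make that easier to diagnose. This is only a minimal sketch: the helper name `missing_env_vars` and the exact list of variables are assumptions (`DATA_DIR` and `PERSIST_DIR` appear in this notebook; `OPENAI_API_KEY` is the variable the OpenAI client reads by default).

```python
import os

# Assumed list: DATA_DIR and PERSIST_DIR are read by the cells below;
# OPENAI_API_KEY is what the OpenAI client reads by default.
REQUIRED_ENV_VARS = ["OPENAI_API_KEY", "DATA_DIR", "PERSIST_DIR"]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```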

In [7]:
import os

from bubls.utils.data.download import download_file_from_url
from bubls.utils.indexing import create_index_from_path
from llama_index.core.agent import AgentRunner, FunctionCallingAgentWorker
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.llms.openai import OpenAI

## Defining global variables

In [2]:
METADATA = {
    "lyft_10k": {
        "source_url": "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf",
        "file_name": "lyft_10k_2021.pdf",
        "save_data_to": os.path.join(os.environ["DATA_DIR"], "lyft_10k"),
        "persist_index_to": os.path.join(os.environ["PERSIST_DIR"], "lyft_10k"),
        },
    "uber_10k": {
        "source_url": "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf",
        "file_name": "uber_10k_2021.pdf",
        "save_data_to": os.path.join(os.environ["DATA_DIR"], "uber_10k"),
        "persist_index_to": os.path.join(os.environ["PERSIST_DIR"], "uber_10k"),
        },
}

## Ingest Data
- Download Information
- Create & Persist or Load Index
- Create Query Engine
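
`download_file_from_url` comes from the `bubls` package and its source is not shown in this notebook. As a rough idea of what the download step could look like, here is a hypothetical stand-in, `download_if_missing`, built only on the standard library. It mirrors the argument order used in this notebook (URL, file name, target directory), but skipping the download when the file is already present is an assumption.

```python
import os
import urllib.request


def download_if_missing(url, file_name, save_dir):
    """Download `url` to `save_dir/file_name` unless that file already exists.

    Hypothetical stand-in for bubls' download_file_from_url; returns the local path.
    """
    os.makedirs(save_dir, exist_ok=True)
    path = os.path.join(save_dir, file_name)
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path
```

Skipping existing files keeps repeated runs of the notebook from fetching the same PDFs again.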

In [3]:
# Optional (disabled): parse the PDFs to markdown with LlamaParse.
# parser = LlamaParse(result_type="markdown")

In [4]:
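# Build one query engine per filing, keyed by the METADATA entry name.
query_engine_dict = {}
for k, md in METADATA.items():
    # Download the PDF into the entry's data directory.
    download_file_from_url(md["source_url"], md["file_name"], md["save_data_to"])
    # Create and persist the index, or load it if it was persisted before.
    index = create_index_from_path(
        md["persist_index_to"],
        md["save_data_to"],
        # {".pdf": parser}
    )
    # Retrieve the 3 most similar chunks for each query.
    query_engine_dict[k] = index.as_query_engine(similarity_top_k=3)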

Loading Index


Loading Index
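
The `Loading Index` output suggests that `create_index_from_path` found a persisted index for each filing and loaded it rather than rebuilding it. Its source is not shown here, so the following is only a sketch of the usual LlamaIndex create-or-load pattern. The function names `has_persisted_index` and `create_or_load_index`, the non-empty-directory check, and the printed messages are all assumptions.

```python
import os


def has_persisted_index(persist_dir):
    """True if `persist_dir` exists and is non-empty (assumed check)."""
    return os.path.isdir(persist_dir) and len(os.listdir(persist_dir)) > 0


def create_or_load_index(persist_dir, data_dir):
    """Load a persisted vector index, or build it from `data_dir` and persist it."""
    # Imported lazily so the helper above works without llama_index installed.
    from llama_index.core import (
        SimpleDirectoryReader,
        StorageContext,
        VectorStoreIndex,
        load_index_from_storage,
    )

    if has_persisted_index(persist_dir):
        print("Loading Index")
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context)

    print("Creating Index")
    documents = SimpleDirectoryReader(data_dir).load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persist_dir)
    return index
```

Persisting the index avoids re-embedding the PDFs on every run, which is the slow and billable part of the ingest step.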


## Define tool

In [5]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=query_engine_dict[k],
        metadata=ToolMetadata(
            name=k,
            description=(
                f"Provides information about {k} financials for year 2021. "
            ),
        ),
    )
    for k in METADATA.keys()
]

## Agent

In [6]:
llm_gpt4 = OpenAI(model="gpt-4")
gpt4_agent_worker = FunctionCallingAgentWorker.from_tools(
    query_engine_tools, llm=llm_gpt4, verbose=True, allow_parallel_tool_calls=False
)
gpt4_agent = AgentRunner(gpt4_agent_worker)
response = gpt4_agent.chat("Compare the revenue growth of Uber and Lyft in 2021.")

Added user message to memory: Compare the revenue growth of Uber and Lyft in 2021.


=== Calling Function ===
Calling function: uber_10k with args: {"input": "What was Uber's revenue growth in 2021?"}
=== Function Output ===
Uber's revenue grew by 57% in 2021.
=== Calling Function ===
Calling function: lyft_10k with args: {"input": "What was Lyft's revenue growth in 2021?"}
=== Function Output ===
Lyft's revenue increased by 36% in 2021 compared to the prior year.
=== LLM Response ===
In 2021, Uber's revenue grew by 57%, while Lyft's revenue increased by 36%. Therefore, Uber had a higher revenue growth compared to Lyft in the same year.
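
To see which tool produced which part of the answer without reading the verbose log, the response object can be inspected. In LlamaIndex the chat response exposes a `sources` list whose items carry `tool_name`, `raw_input` and `content`. The helper name `summarize_sources` is hypothetical, and the sketch relies only on those three attributes.

```python
def summarize_sources(response):
    """Return (tool_name, tool_input, output_text) for each tool call in `response`."""
    return [
        (source.tool_name, source.raw_input, str(source.content))
        for source in response.sources
    ]
```

Calling `summarize_sources(response)` after the chat above should give one entry per tool call, in the order the agent made them.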
