This notebook was inspired by this LlamaIndex notebook:

https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/agent/coa_agent.ipynb

I made some changes to it purely to try out ideas and learn.

Note that I assume the relevant API keys are set as environment variables, along with `DATA_DIR` and `PERSIST_DIR`, which the cells below read.
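A quick check like the one below can catch a missing variable before any cell fails halfway through. This is only a sketch: `OPENAI_API_KEY` is assumed to be the key the OpenAI client reads, and `missing_env_vars` is a hypothetical helper, not part of LlamaIndex.

```python
import os

# Assumed names: OPENAI_API_KEY for the OpenAI client; DATA_DIR and
# PERSIST_DIR are the variables this notebook reads further down.
REQUIRED_ENV_VARS = ["OPENAI_API_KEY", "DATA_DIR", "PERSIST_DIR"]


def missing_env_vars(names, environ=os.environ):
    """Return the names that are unset or empty in the given environment."""
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```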

In [2]:
%pip install llama-index-packs-agents-coa

In [9]:
import os
import subprocess

import nest_asyncio
from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.agent import AgentRunner
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.packs.agents_coa import CoAAgentWorker

# Allow nested event loops so async code can run inside the notebook.
nest_asyncio.apply()


## LlamaIndex Settings

In [2]:
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small", embed_batch_size=256
)
Settings.llm = OpenAI(model="gpt-4-turbo", temperature=0.1)


## Defining global variables

In [3]:
COMPANIES = ["uber", "lyft"]
DATA_DIRS = {}
PERSIST_DIRS = {}
for c in COMPANIES:
    DATA_DIRS[c] = os.path.join(os.environ["DATA_DIR"], f"{c}")
    PERSIST_DIRS[c] = os.path.join(os.environ["PERSIST_DIR"], f"{c}")


## Download data
You can access more files (quarterly 10-Q reports) by using the following URL format:

`https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/{COMPANY_NAME}_10q_{MONTH}_{YEAR}.pdf`
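As a small illustration of that format, the helper below fills in the three placeholders. `build_10q_url` is a hypothetical name, and the example values are illustrative only; check the repository to see which files actually exist.

```python
BASE_URL = (
    "https://raw.githubusercontent.com/run-llama/llama_index/main/"
    "docs/docs/examples/data/10q"
)


def build_10q_url(company: str, month: str, year: int) -> str:
    """Build a 10-Q download URL following the format described above."""
    return f"{BASE_URL}/{company}_10q_{month}_{year}.pdf"


# Illustrative values only; the file may or may not exist in the repository.
print(build_10q_url("uber", "march", 2022))
```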

In [4]:
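for c in COMPANIES:
    if not os.path.exists(DATA_DIRS[c]):
        # makedirs also creates any missing parent directories
        os.makedirs(DATA_DIRS[c])
        url = f"https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/{c}_2021.pdf"
        # pass the arguments as a list so paths with spaces or quotes are handled safely
        subprocess.run(["wget", url, "-O", os.path.join(DATA_DIRS[c], f"{c}_2021.pdf")], check=True)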

## Create an index and query engine for each company

In [5]:

# file_extractor = {
#     ".pdf": LlamaParse(
#         result_type="markdown",
#         api_key="llx-...",
#     )
# }

query_engine_dict = {}
for c in COMPANIES:
    if not os.path.exists(PERSIST_DIRS[c]):
        print("Creating Index")
        # load the documents and create the index
        # documents = SimpleDirectoryReader(DATA_DIRS[c], file_extractor=file_extractor).load_data()
        documents = SimpleDirectoryReader(DATA_DIRS[c]).load_data()
        index = VectorStoreIndex.from_documents(documents)
        # store it for later
        index.storage_context.persist(persist_dir=PERSIST_DIRS[c])
    else:
        print("Loading Index")
        # load the existing index
        storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIRS[c])
        index = load_index_from_storage(storage_context)
    
    query_engine_dict[c] = index.as_query_engine(similarity_top_k=3)
    

Loading Index
Loading Index


## Define tools

In [6]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=query_engine_dict[c],
        metadata=ToolMetadata(
            name=f"{c}_10k",
            description=(
                f"Provides information about {c} financials for year 2021. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    )
    for c in COMPANIES
]

## Agent

Steps
- The tools get parsed into python-like definitions
- The agent is prompted to generate a CoA plan
- The function calls are parsed out of the plan and executed
- The values in the plan are filled in
- The agent generates a final response
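The parse, execute, and fill-in steps above can be sketched with a toy example. This is not the pack's implementation: the `TOOLS` table, the regular expression, and `run_plan` are assumptions made for illustration, and only integer arguments and earlier placeholders are handled.

```python
import re

# Toy tools standing in for the query engine tools (hypothetical).
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

# Matches calls such as [FUNC add(3, 2) = y1].
FUNC_PATTERN = re.compile(r"\[FUNC (\w+)\((.*?)\) = (\w+)\]")


def run_plan(plan: str) -> str:
    """Execute each [FUNC ...] call in order and fill in its placeholder."""
    results = {}

    def execute(match):
        name, raw_args, placeholder = match.groups()
        args = []
        for arg in raw_args.split(","):
            arg = arg.strip()
            # Earlier placeholders (y1, y2, ...) resolve to stored results.
            args.append(results[arg] if arg in results else int(arg))
        results[placeholder] = TOOLS[name](*args)
        shown_args = ", ".join(str(a) for a in args)
        return f"[FUNC {name}({shown_args}) = {results[placeholder]}]"

    # re.sub visits matches left to right, so y1 is known before y2 needs it.
    return FUNC_PATTERN.sub(execute, plan)


plan = (
    "After buying the apples, Sally has [FUNC add(3, 2) = y1] apples. "
    "Then the spell results in [FUNC multiply(y1, 3) = y2] apples."
)
print(run_plan(plan))
```

The filled-in text is what a final LLM call would then turn into the response.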

In [10]:
worker = CoAAgentWorker.from_tools(
    tools=query_engine_tools,
    llm=Settings.llm,
    verbose=True,
)
agent = AgentRunner(worker)
response = agent.chat("How did Ubers revenue growth compare to Lyfts in 2021?")

==== Available Parsed Functions ====
def uber_10k(input: string):
   """Provides information about uber financials for year 2021. Use a detailed plain text question as input to the tool."""
    ...
def lyft_10k(input: string):
   """Provides information about lyft financials for year 2021. Use a detailed plain text question as input to the tool."""
    ...


==== Generated Chain of Abstraction ====
To compare Uber's revenue growth to Lyft's in 2021, we need to obtain the revenue growth figures for both companies for that year.

1. Retrieve Uber's revenue growth for 2021 by querying the Uber financial tool with a specific question about revenue growth. This can be done using the function call:
   [FUNC uber_10k("What was Uber's revenue growth in 2021?") = y1]

2. Similarly, retrieve Lyft's revenue growth for 2021 by querying the Lyft financial tool with a specific question about revenue growth. This can be done using the function call:
   [FUNC lyft_10k("What was Lyft's revenue growth in 2021?") = y2]

3. After obtaining both y1 and y2, compare the values to determine which company had higher revenue growth in 2021. This comparison does not require a function call but involves analyzing the outputs y1 and y2 to see which is greater.
==== Executing uber_10k with inputs ["What was Uber's revenue growth in 2021?"] ====
==== Executing lyft_10k 

In [11]:
print(str(response))

In 2021, Uber's revenue growth was higher than Lyft's. Uber's revenue grew by 57%, increasing from $11,139 million in 2020 to $17,455 million. In contrast, Lyft's revenue increased by 36% compared to the prior year.


## How the prompting of Chain of Abstraction works

Under the hood, we prompt the LLM to first output the CoA plan, then parse that plan and run the functions it references, and finally refine everything into a final response.

First, we parse the tools into python-like function definitions by parsing `tool.metadata.fn_schema_str`, along with the tool name and description.
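A rough sketch of that rendering step follows. It does not parse `fn_schema_str`; instead it assumes the parameter names and types are already available in a plain dict. `ToolInfo` and `to_function_def` are hypothetical names, not LlamaIndex APIs.

```python
from dataclasses import dataclass


@dataclass
class ToolInfo:
    """Hypothetical stand-in for the fields read from a tool's metadata."""

    name: str
    description: str
    params: dict  # parameter name -> type name, e.g. {"input": "string"}


def to_function_def(tool: ToolInfo) -> str:
    """Render a tool as a python-like function stub for the prompt."""
    signature = ", ".join(f"{p}: {t}" for p, t in tool.params.items())
    return f'def {tool.name}({signature}):\n    """{tool.description}"""\n    ...'


tool = ToolInfo(
    name="uber_10k",
    description="Provides information about uber financials for year 2021.",
    params={"input": "string"},
)
print(to_function_def(tool))
```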

You can find that code in the utils of the `llama-index-packs-agents-coa` package.

The resulting prompt looks like this:

REASONING_PROMPT_TEMPALTE = """Generate an abstract plan of reasoning using placeholders for the specific values and function calls needed.
The placeholders should be labeled y1, y2, etc.
Function calls should be represented as inline strings like [FUNC {{function_name}}({{input1}}, {{input2}}, ...) = {{output_placeholder}}].
Assume someone will read the plan after the functions have been executed in order to make a final response.
Not every question will require function calls to answer.
If you do invoke a function, only use the available functions, do not make up functions.

Example:
-----------
Available functions:
\`\`\`python
def add(a: int, b: int) -> int:
    \"\"\"Add two numbers together.\"\"\"
    ...

def multiply(a: int, b: int) -> int:
    \"\"\"Multiply two numbers together.\"\"\"
    ...
\`\`\`

Question:
Sally has 3 apples and buys 2 more. Then magically, a wizard casts a spell that multiplies the number of apples by 3. How many apples does Sally have now?

Abstract plan of reasoning:
After buying the apples, Sally has [FUNC add(3, 2) = y1] apples. Then, the wizard casts a spell to multiply the number of apples by 3, resulting in [FUNC multiply(y1, 3) = y2] apples.

Your Turn:
-----------
Available functions:
\`\`\`python
{functions}
\`\`\`

Question:
{question}

Abstract plan of reasoning:
"""

This will generate the chain-of-abstraction reasoning.
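To see how the placeholders behave, here is a shortened stand-in for the template, filled with plain `str.format`. That the pack formats it this way is an assumption; it may use its own prompt classes. The doubled braces survive formatting as literal single braces, while `{functions}` and `{question}` are substituted.

```python
# Shortened stand-in for the template above; only the placeholders matter here.
TEMPLATE = (
    "Function calls look like "
    "[FUNC {{function_name}}({{input1}}) = {{output_placeholder}}].\n"
    "Available functions:\n{functions}\n\n"
    "Question:\n{question}\n\n"
    "Abstract plan of reasoning:\n"
)

prompt = TEMPLATE.format(
    functions='def uber_10k(input: string):\n    """..."""\n    ...',
    question="How did Uber's revenue growth compare to Lyft's in 2021?",
)
print(prompt)
```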

Then, the reasoning is parsed using the output parser from the same package.

After calling the functions and filling in values, we give the LLM a chance to refine the response, using this prompt:

REFINE_REASONING_PROMPT_TEMPALTE = """Generate a response to a question by using a previous abstract plan of reasoning. Use the previous reasoning as context to write a response to the question.

Example:
-----------
Question: 
Sally has 3 apples and buys 2 more. Then magically, a wizard casts a spell that multiplies the number of apples by 3. How many apples does Sally have now?

Previous reasoning:
After buying the apples, Sally has [FUNC add(3, 2) = 5] apples. Then, the wizard casts a spell to multiply the number of apples by 3, resulting in [FUNC multiply(5, 3) = 15] apples.

Response:
After the wizard casts the spell, Sally has 15 apples.

Your Turn:
-----------
Question:
{question}

Previous reasoning:
{prev_reasoning}

Response:
"""

