<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/openai_agent_context_retrieval.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


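# Context-Augmented OpenAI Agent

This example shows an OpenAI agent augmented with retrieved context, and how that context can improve the agent's tool selection and answers.


In this tutorial, we show you how to use our `ContextRetrieverOpenAIAgent` implementation to build an agent on top of the OpenAI function API and store/index an arbitrary number of tools. Our indexing/retrieval modules help remove the complexity of having too many functions to fit in the prompt.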


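## Initial Setup


Here we set up a `ContextRetrieverOpenAIAgent`. This agent performs retrieval before calling any tool, which can help ground its tool selection and answers in context.


If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.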


In [None]:
%pip install llama-index-agent-openai-legacy

In [None]:
!pip install llama-index

In [None]:
import json
from typing import Sequence

from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.tools import QueryEngineTool, ToolMetadata

In [None]:
try:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/march"
    )
    march_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/june"
    )
    june_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/sept"
    )
    sept_index = load_index_from_storage(storage_context)

    index_loaded = True
except Exception:
    index_loaded = False

# Download Data


In [None]:
!mkdir -p 'data/10q/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_march_2022.pdf' -O 'data/10q/uber_10q_march_2022.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_june_2022.pdf' -O 'data/10q/uber_10q_june_2022.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_sept_2022.pdf' -O 'data/10q/uber_10q_sept_2022.pdf'

In [None]:
# Build indices across the three data sources
if not index_loaded:
    # Load data
    march_docs = SimpleDirectoryReader(
        input_files=["./data/10q/uber_10q_march_2022.pdf"]
    ).load_data()
    june_docs = SimpleDirectoryReader(
        input_files=["./data/10q/uber_10q_june_2022.pdf"]
    ).load_data()
    sept_docs = SimpleDirectoryReader(
        input_files=["./data/10q/uber_10q_sept_2022.pdf"]
    ).load_data()

    # Build indices
    march_index = VectorStoreIndex.from_documents(march_docs)
    june_index = VectorStoreIndex.from_documents(june_docs)
    sept_index = VectorStoreIndex.from_documents(sept_docs)

    # Persist indices
    march_index.storage_context.persist(persist_dir="./storage/march")
    june_index.storage_context.persist(persist_dir="./storage/june")
    sept_index.storage_context.persist(persist_dir="./storage/sept")

In [None]:
march_engine = march_index.as_query_engine(similarity_top_k=3)
june_engine = june_index.as_query_engine(similarity_top_k=3)
sept_engine = sept_index.as_query_engine(similarity_top_k=3)

In [None]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=march_engine,
        metadata=ToolMetadata(
            name="uber_march_10q",
            description=(
                "Provides information about Uber 10Q filings for March 2022. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=june_engine,
        metadata=ToolMetadata(
            name="uber_june_10q",
            description=(
                "Provides information about Uber financials for June 2022. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=sept_engine,
        metadata=ToolMetadata(
            name="uber_sept_10q",
            description=(
                "Provides information about Uber financials for Sept 2022. "
                "Use a detailed plain text question as input to the tool."
            ),
        ),
    ),
]

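### Try Context-Augmented Agent

Here we augment our agent with context in different settings:
- Toy context: we define some abbreviations that map to financial terms (e.g. X = Revenue) and supply them to the agent as context.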
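
To make the role of the context concrete, here is a rough sketch of how a retrieved abbreviation could be merged into the user message before the model picks a tool. The template text mirrors the verbose output printed further down in this notebook. `toy_retrieve` and `build_augmented_message` are hypothetical helpers written for illustration, and the word-overlap ranking is a naive stand-in for vector retrieval. This is not the LlamaIndex implementation.

```python
# Illustrative sketch only: hypothetical helpers, not LlamaIndex internals.

# Template text copied from the agent's verbose output shown in this notebook.
CONTEXT_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "either pick the corresponding tool or answer the function: {query}\n"
)


def toy_retrieve(query: str, snippets: list[str], top_k: int = 1) -> list[str]:
    """Rank snippets by word overlap with the query (stand-in for vector search)."""
    query_words = set(query.lower().replace("?", "").split())
    ranked = sorted(
        snippets,
        key=lambda s: len(query_words & set(s.lower().replace(":", "").split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_augmented_message(query: str, snippets: list[str], top_k: int = 1) -> str:
    """Prepend the retrieved context to the user query using the template."""
    context = "\n\n".join(toy_retrieve(query, snippets, top_k))
    return CONTEXT_TEMPLATE.format(context=context, query=query)


snippets = [
    "Abbreviation: X = Revenue",
    "Abbreviation: YZ = Risk Factors",
    "Abbreviation: Z = Costs",
]
message = build_augmented_message("What is the YZ of March 2022?", snippets)
print(message)
```

With `top_k=1`, only the `YZ` snippet shares a word with the query, so it is the only context placed in the message. The model can then translate "YZ" into "Risk Factors" when it forms the tool call.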


In [None]:
from llama_index.core import Document
from llama_index.agent.openai_legacy import ContextRetrieverOpenAIAgent

In [None]:
# Toy index - stores a list of abbreviations
texts = [
    "Abbreviation: X = Revenue",
    "Abbreviation: YZ = Risk Factors",
    "Abbreviation: Z = Costs",
]
docs = [Document(text=t) for t in texts]
context_index = VectorStoreIndex.from_documents(docs)

In [None]:
context_agent = ContextRetrieverOpenAIAgent.from_tools_and_retriever(
    query_engine_tools,
    context_index.as_retriever(similarity_top_k=1),
    verbose=True,
)

In [None]:
response = context_agent.chat("What is the YZ of March 2022?")

[33;1m[1;3mContext information is below.
---------------------
Abbreviation: YZ = Risk Factors
---------------------
Given the context information and not prior knowledge, either pick the corresponding tool or answer the function: What is the YZ of March 2022?

=== Calling Function ===
Calling function: uber_march_10q with args: {
  "input": "Risk Factors"
}
Got output: 
•The COVID-19 pandemic and the impact of actions to mitigate the pandemic have adversely affected and may continue to adversely affect parts of our business.
•Our business would be adversely affected if Drivers were classified as employees, workers or quasi-employees instead of independent contractors.
•The mobility, delivery, and logistics industries are highly competitive, with well-established and low-cost alternatives that have been available for decades, low barriers to entry, low switching costs, and well-capitalized competitors in nearly every major geographic region.
•To remain competitive in certain mark

In [None]:
print(str(response))

The risk factors for Uber in March 2022 include:

1. The adverse impact of the COVID-19 pandemic and actions taken to mitigate it on Uber's business.
2. The potential adverse effect on Uber's business if drivers are classified as employees instead of independent contractors.
3. Intense competition in the mobility, delivery, and logistics industries, with low-cost alternatives and well-capitalized competitors.
4. The need to lower fares, offer driver incentives, and provide consumer discounts and promotions to remain competitive in certain markets.
5. Uber's history of significant losses and the expectation of increased operating expenses in the future, which may affect profitability.
6. The importance of attracting and maintaining a critical mass of drivers, consumers, merchants, shippers, and carriers to keep the platform appealing.
7. The significance of maintaining and enhancing Uber's brand and reputation, as negative publicity could harm the business.
8. The potential impact of ec

In [None]:
context_agent.chat("What is the X and Z in September 2022?")

### Use Uber 10-Q as context, use Calculator as Tool


In [None]:
from llama_index.core.tools import BaseTool, FunctionTool


def magic_formula(revenue: int, cost: int) -> int:
    """Runs MAGIC_FORMULA on revenue and cost."""
    return revenue - cost


magic_tool = FunctionTool.from_defaults(fn=magic_formula, name="magic_formula")
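
Before handing the function to the agent, it can help to sanity-check it on its own. The snippet below redefines `magic_formula` so it runs standalone. The inputs are made-up numbers for illustration, not figures from the 10-Q.

```python
def magic_formula(revenue: int, cost: int) -> int:
    """Runs MAGIC_FORMULA on revenue and cost."""
    return revenue - cost


# Made-up inputs, purely illustrative (not taken from the Uber filing).
result = magic_formula(revenue=100, cost=130)
print(result)  # -30
```

The result is negative whenever cost exceeds revenue. When the agent calls the tool, it has to extract both arguments from the retrieved context itself.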

In [None]:
context_agent = ContextRetrieverOpenAIAgent.from_tools_and_retriever(
    [magic_tool], sept_index.as_retriever(similarity_top_k=3), verbose=True
)

In [None]:
response = context_agent.chat(
    "Can you run MAGIC_FORMULA on Uber's revenue and cost?"
)

[33;1m[1;3mContext information is below.
---------------------
Three Months Ended September 30, Nine Months Ended September 30,
2021 2022 2021 2022
Revenue 100 % 100 % 100 % 100 %
Costs and expenses
Cost of revenue, exclusive of depreciation and amortization shown separately
below 50 % 62 % 53 % 62 %
Operations and support 10 % 7 % 11 % 8 %
Sales and marketing 24 % 14 % 30 % 16 %
Research and development 10 % 9 % 13 % 9 %
General and administrative 13 % 11 % 15 % 10 %
Depreciation and amortization 4 % 3 % 6 % 3 %
Total costs and expenses 112 % 106 % 128 % 107 %
Loss from operations (12)% (6)% (28)% (7)%
Interest expense (3)% (2)% (3)% (2)%
Other income (expense), net (38)% (6)% 16 % (34)%
Loss before income taxes and income (loss) from equity method
investments (52)% (14)% (16)% (43)%
Provision for (benefit from) income taxes (2)% 1 % (3)% — %
Income (loss) from equity method investments — % — % — % — %
Net loss including non-controlling interests (50)% (14)% (12)% (42)%
Less: net in

In [None]:
print(response)

The result of running MAGIC_FORMULA on Uber's revenue and cost is -1690.
