# Language Agent Tree Search

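[LATS (Language Agent Tree Search)](https://arxiv.org/pdf/2310.04406.pdf), proposed by Zhou et al., combines LLM capabilities in planning, acting, and reasoning within a Monte Carlo tree search framework, enabling deliberate and adaptive problem solving guided by external feedback and self-reflection.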

We have implemented this agent as a LlamaPack: you can pip install it to run it right away, or call `download_llama_pack` to load the pack.


## Setup


In [None]:
%pip install llama-index-agent-lats
%pip install llama-index-program-openai
%pip install llama-index-llms-openai
%pip install llama-index-embeddings-openai
%pip install llama-index-core llama-index-readers-file

### Define Global Settings


In [None]:
import os

os.environ["OPENAI_API_KEY"] = "sk-..."

import nest_asyncio

nest_asyncio.apply()

In [None]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

# NOTE: a higher temperature helps make the tree expansion more diverse
llm = OpenAI(model="gpt-4-turbo", temperature=0.6)
embed_model = OpenAIEmbedding(model="text-embedding-3-small")

Settings.llm = llm
Settings.embed_model = embed_model

### Download Data


In [None]:
!mkdir -p 'data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'

In [None]:
import os
from llama_index.core import (
    SimpleDirectoryReader,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.storage import StorageContext


if not os.path.exists("./storage/lyft"):
    # load data
    lyft_docs = SimpleDirectoryReader(
        input_files=["./data/10k/lyft_2021.pdf"]
    ).load_data()
    uber_docs = SimpleDirectoryReader(
        input_files=["./data/10k/uber_2021.pdf"]
    ).load_data()

    # build index
    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
    uber_index = VectorStoreIndex.from_documents(uber_docs)

    # persist index
    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
    uber_index.storage_context.persist(persist_dir="./storage/uber")
else:
    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/lyft"
    )
    lyft_index = load_index_from_storage(storage_context)

    storage_context = StorageContext.from_defaults(
        persist_dir="./storage/uber"
    )
    uber_index = load_index_from_storage(storage_context)

### Setup Tools


In [None]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
uber_engine = uber_index.as_query_engine(similarity_top_k=3)

In [None]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021. "
                "Use a detailed plain text question as input to the tool. "
                "The input is used to power a semantic search engine."
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021. "
                "Use a detailed plain text question as input to the tool. "
                "The input is used to power a semantic search engine."
            ),
        ),
    ),
]

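## Setup Agent

Now we can set up the LATS agent.

Here, `num_expansions` is the number of possible sub-actions to explore under each node. `num_expansions=2` means we explore two possible next actions for every parent action.

`max_rollouts` refers to how deep each exploration of the search space goes. `max_rollouts=3`, the value used below, means the tree is explored to a maximum depth of 3.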
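To get a feel for how these two settings interact, the sketch below computes a worst-case node count. It assumes, following the description above, that every node is expanded into `num_expansions` children and that the tree is at most `max_rollouts` levels deep. `max_tree_nodes` is a hypothetical helper, not part of `llama-index-agent-lats`, and an actual search may visit far fewer nodes.

```python
def max_tree_nodes(num_expansions: int, max_depth: int) -> int:
    """Upper bound on node count when every node has `num_expansions`
    children, down to `max_depth` levels below the root (assumption)."""
    if num_expansions < 1 or max_depth < 0:
        raise ValueError("expects num_expansions >= 1 and max_depth >= 0")
    # Level k holds at most num_expansions**k nodes (level 0 is the root).
    return sum(num_expansions**level for level in range(max_depth + 1))


# Values used in the agent setup cell: 1 + 2 + 4 + 8 = 15
print(max_tree_nodes(num_expansions=2, max_depth=3))
```

Because the bound grows geometrically, small increases in either setting can multiply the number of LLM and tool calls.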


In [None]:

from llama_index.agent.lats import LATSAgentWorker

agent_worker = LATSAgentWorker.from_tools(
    query_engine_tools,
    llm=llm,
    num_expansions=2,
    max_rollouts=3,  # use -1 for unlimited rollouts
    verbose=True,
)
# Wrap the worker in an agent that exposes create_task / run_step / chat
agent = agent_worker.as_agent()

## Run Some Queries

First, let's create and execute a task using step-wise execution and the lower-level API.


In [None]:
task = agent.create_task(
    "Given the risk factors of Uber and Lyft described in their 10K files, "
    "which company is performing better? Please use concrete numbers to inform your decision."
)

In [None]:
# run the initial step
step_output = agent.run_step(task.task_id)

> Selecting node to expand: Observation: Given the risk factors of Uber and Lyft described in their 10K files, which company is performing better? Please use concrete numbers to inform your decision.
> Got candidates: ['Review the 10K files of Uber and Lyft to extract relevant financial data and risk factors.', 'Compare key financial metrics from the 10K files of Uber and Lyft, such as revenue, profit margins, and growth rates.']
=== Calling Function ===
Calling function: uber_10k with args: {"input": "Provide key financial metrics from Uber's 10K for 2021, including revenue, profit margins, and growth rates."}
=== Calling Function ===
Calling function: uber_10k with args: {"input": "What are the key financial figures and risk factors for Uber in 2021?"}
=== Function Output ===
In 2021, Uber Technologies, Inc. reported a revenue of $17,455 million. The company experienced a significant revenue growth rate of 57% compared to the previous year. The total costs a

From the step output, we can inspect the state of the task.


In [None]:
for step in (
    step_output.task_step.step_state["root_node"].children[0].current_reasoning
):
    print(step)
    print("---------")

observation='Given the risk factors of Uber and Lyft described in their 10K files, which company is performing better? Please use concrete numbers to inform your decision.' return_direct=False
---------
observation='Review the 10K files of Uber and Lyft to extract relevant financial data and risk factors.' return_direct=False
---------


In [None]:
for step in (
    step_output.task_step.step_state["root_node"]
    .children[0]
    .children[0]
    .current_reasoning
):
    print(step)
    print("---------")

observation='Given the risk factors of Uber and Lyft described in their 10K files, which company is performing better? Please use concrete numbers to inform your decision.' return_direct=False
---------
observation='Review the 10K files of Uber and Lyft to extract relevant financial data and risk factors.' return_direct=False
---------
thought='To compare the performance of Uber and Lyft using their 10K files, I need to gather financial data and risk factors from both companies. I will start by querying the Uber 10K tool.' action='uber_10k' action_input={'input': 'What are the key financial figures and risk factors for Uber in 2021?'}
---------
observation="Uber Technologies, Inc., in its 2021 financial statements, consolidates its wholly-owned and majority-owned subsidiaries, as well as variable interest entities where it is the primary beneficiary. The financial statements are prepared in accordance with GAAP, and management uses estimates and assumptions that affect reported financi
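Rather than indexing into `.children[0]` by hand at each level, the same inspection can be done recursively. The following is a minimal sketch that relies only on the `children` and `current_reasoning` attributes used in the cells above; `iter_reasoning` is a hypothetical helper, and the `SimpleNamespace` tree is a stand-in so the snippet runs without an agent.

```python
from types import SimpleNamespace


def iter_reasoning(node, depth=0):
    """Yield (depth, reasoning_step) pairs for `node` and all its descendants."""
    for step in node.current_reasoning:
        yield depth, step
    for child in node.children:
        yield from iter_reasoning(child, depth + 1)


# Stand-in tree for illustration only; with a real run, pass
# step_output.task_step.step_state["root_node"] instead.
leaf = SimpleNamespace(
    current_reasoning=["question", "plan", "tool call"], children=[]
)
root = SimpleNamespace(current_reasoning=["question"], children=[leaf])

for depth, step in iter_reasoning(root):
    print("  " * depth + str(step))
```

As the outputs above show, each child's reasoning starts with its parent's steps, so deeper nodes repeat the earlier entries.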

Loop until the task is finished.


In [None]:
# repeat until the last step is reached
while not step_output.is_last:
    step_output = agent.run_step(task.task_id)

response = agent.finalize_response(task.task_id)

> Selecting node to expand: Observation: Uber Technologies, Inc., in its 2021 financial statements, consolidates its wholly-owned and majority-owned subsidiaries, as well as variable interest entities where it is the primary beneficiary. The financial statements are prepared in accordance with GAAP, and management uses estimates and assumptions that affect reported financial figures, such as the fair values of investments, useful lives of assets, and reserves for income taxes and insurance, among others. These estimates consider the impact of the COVID-19 pandemic on market data and investment recoverability.

Key financial risks for Uber include concentration of credit risk, where cash and other receivables are potentially subject to credit risk concentration. The company's cash, cash equivalents, and securities consist largely of high-credit-quality money market funds, U.S. government and agency securities, and corporate debt securities. Despite exceeding insured limits, the

In [None]:
print(str(response))

Based on the information provided from the 10K files of Uber and Lyft for 2021, here is a comparative analysis of their financial data and risk factors:

**Financial Data:**
1. **Uber** did not disclose specific revenue figures in the provided information, focusing more on their financial management practices and the quality of their financial instruments. They emphasized the high credit quality of their cash equivalents and securities, and the use of GAAP-compliant financial statements influenced by estimates considering the COVID-19 impact.
   
2. **Lyft** reported total revenue of $3,208,323,000 for 2021, with a significant drop due to the COVID-19 pandemic and a partial recovery, still below pre-pandemic levels. This gives a clear numeric insight into their financial status during the year.

**Risk Factors:**
1. **Uber's** key risk factors include concentration of credit risk, with their financial assets placed in high-credit-quality institutions, and various uncertainties that cou

By calling `.chat()` directly, we can let the agent run automatically until it finishes.


In [None]:
agent.reset()

response = agent.chat(
    "Given the revenue growth and risk factors of Uber and Lyft, "
    "which company is performing better? Please use concrete numbers to inform your decision."
)

> Selecting node to expand: Observation: Given the revenue growth and risk factors of Uber and Lyft, which company is performing better? Please use concrete numbers to inform your decision.
> Got candidates: ['Review financial reports of Uber and Lyft for the latest fiscal year to compare revenue growth.', 'Analyze risk factors mentioned in the latest quarterly reports of Uber and Lyft.']
=== Calling Function ===
Calling function: uber_10k with args: {"input": "What was Uber's revenue growth for the fiscal year 2021?"}
=== Function Output ===
Uber's revenue for the fiscal year 2021 was $17,455 million, which represents a growth of 57% compared to the revenue of $11,139 million in 2020.
> Generated new reasoning step: Thought: The current language of the user is English. I need to use tools to gather data on the revenue growth of Uber and Lyft for the latest fiscal year, which is 2021.
Action: uber_10k
Action Input: {'input': "What was Uber's revenue g

In [None]:
print(str(response))

Uber's revenue growth for the fiscal year 2021 was 57%, increasing from $11,139 million in 2020 to $17,455 million in 2021. Lyft's revenue growth for the same period was 36%. Comparing these figures, Uber has shown a higher revenue growth percentage than Lyft for the fiscal year 2021.
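
The Uber growth figure in the response can be cross-checked against the revenue numbers returned by the tool. The snippet below is a simple sanity check on the arithmetic only. The Lyft figure is not checked because its underlying revenue numbers do not appear in the output shown.

```python
revenue_2020 = 11_139  # USD millions, from the tool output above
revenue_2021 = 17_455  # USD millions, from the tool output above

# Year-over-year growth as a percentage of the 2020 figure
growth_pct = (revenue_2021 - revenue_2020) / revenue_2020 * 100
print(f"{growth_pct:.1f}%")  # 56.7%, which rounds to the reported 57%
```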
