<a href="https://colab.research.google.com/github/franlin1860/llm/blob/main/structured_planner_v20240914.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Structured Planning Agent

A key pattern in agents is the ability to plan. ReAct, for example, uses a structured approach to decompose an input into a set of function calls and thoughts in order to reason toward a final response.

However, breaking down the initial input/task into several sub-tasks can make the ReAct loop (or other reasoning loops) easier to execute.

The `StructuredPlannerAgent` in LlamaIndex wraps any agent worker (ReAct, Function Calling, Chain-of-Abstraction, etc.) and decomposes an initial input into several sub-tasks. Each sub-task is represented by an input, an expected outcome, and any dependent sub-tasks that should be completed first.
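
To make the sub-task structure concrete, here is a minimal sketch in plain Python. `SubTaskSketch` and `ready_sub_tasks` are hypothetical names invented for illustration, not LlamaIndex classes; they only mirror the three fields described above and show how dependencies determine which sub-tasks can run next.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SubTaskSketch:
    """Hypothetical stand-in for a planner sub-task (not the LlamaIndex class)."""

    name: str
    input: str
    expected_output: str
    dependencies: List[str] = field(default_factory=list)


def ready_sub_tasks(plan: List[SubTaskSketch], completed: Set[str]) -> List[SubTaskSketch]:
    """Return sub-tasks that are not done yet and whose dependencies are all done."""
    return [
        task
        for task in plan
        if task.name not in completed
        and all(dep in completed for dep in task.dependencies)
    ]


plan = [
    SubTaskSketch("lyft_risks", "List Lyft's key risk factors.", "Lyft risk list"),
    SubTaskSketch("uber_risks", "List Uber's key risk factors.", "Uber risk list"),
    SubTaskSketch(
        "summary",
        "Summarize both lists.",
        "Combined summary",
        dependencies=["lyft_risks", "uber_risks"],
    ),
]

# Nothing is completed yet, so only the two independent sub-tasks are ready.
first_wave = [t.name for t in ready_sub_tasks(plan, completed=set())]
# Once both are done, the dependent summary sub-task becomes ready.
second_wave = [t.name for t in ready_sub_tasks(plan, completed={"lyft_risks", "uber_risks"})]
print(first_wave)
print(second_wave)
```

The sub-tasks without dependencies can run first (and independently of each other), while the summary waits for both.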

This notebook walks through both the high-level and low-level usage of this agent.

**NOTE:** This agent leverages both structured outputs and agentic reasoning. Because of this, we recommend a capable LLM (OpenAI, Anthropic, etc.); open-source LLMs may struggle to plan without prompt engineering or fine-tuning.

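`StructuredPlannerAgent` is a LlamaIndex agent that decomposes a complex task into sub-tasks and uses tools to complete them, thereby completing the overall task.

**How it works:**

1. **Receive the task:** The agent receives a complex task to complete.
2. **Create a plan:** The agent uses an LLM to create a plan from the task and the available tools. The plan is a series of sub-tasks, each with an expected output and dependencies.
3. **Execute the sub-tasks:** The agent executes the sub-tasks in the order given by the plan. For each sub-task it selects a suitable tool. A tool can be any object that performs a specific function, such as a query engine, a database connector, or an API call.
4. **Refine the plan:** While executing, the agent uses the LLM to refine the remaining plan based on the completed sub-tasks and their outputs, so that the final goal stays achievable.
5. **Return the result:** Once all sub-tasks are complete, the agent returns the final result.

**Features:**

- **Structured planning:** Decomposing a complex task into sub-tasks makes the problem easier to solve.
- **Tool use:** Tools carry out the sub-tasks, which extends what the agent can do.
- **Plan refinement:** The plan is adjusted dynamically as circumstances change and new information arrives.
- **Interpretability:** The agent's reasoning and the basis for its decisions are clearly visible.

**Advantages:**

- **Efficiency:** Complex tasks are automated, with less human intervention.
- **Fewer errors:** Structured planning and tool use lower the chance of mistakes.
- **Greater capability:** The agent can handle more complex tasks than a conventional agent.

**Use cases:**

`StructuredPlannerAgent` suits scenarios that need multi-step reasoning and tool use, such as financial analysis, customer service, and data analysis. In short, it helps automate complex tasks, improving efficiency and reducing human error.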
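
The plan, execute, refine cycle described above can be sketched without any LLM. This is an illustrative loop under stated assumptions, not LlamaIndex's implementation: `run_plan`, `execute_stub`, and `refine_stub` are hypothetical names, a plan is just a list of dicts, and the two stubs stand in for the places where a real agent would call tools and the LLM.

```python
from typing import Any, Callable, Dict, List

Task = Dict[str, Any]  # hypothetical shape: {"name": str, "input": str, "deps": [str]}


def run_plan(
    plan: List[Task],
    execute: Callable[[Task, Dict[str, str]], str],
    refine: Callable[[List[Task], Dict[str, str]], List[Task]],
) -> Dict[str, str]:
    """Run sub-tasks wave by wave, refining the plan after each wave."""
    outputs: Dict[str, str] = {}
    while True:
        # A sub-task is ready when it has not run and all its dependencies have.
        ready = [
            t
            for t in plan
            if t["name"] not in outputs and all(d in outputs for d in t["deps"])
        ]
        if not ready:
            break
        for task in ready:
            outputs[task["name"]] = execute(task, outputs)
        plan = refine(plan, outputs)
    return outputs


def execute_stub(task: Task, outputs: Dict[str, str]) -> str:
    # A real worker would call a tool here; this stub only records what it used.
    used = ", ".join(task["deps"]) or "nothing"
    return f"{task['name']} done (used: {used})"


def refine_stub(plan: List[Task], outputs: Dict[str, str]) -> List[Task]:
    # A real planner would ask the LLM whether the remaining sub-tasks still fit.
    return plan


plan = [
    {"name": "lyft_risks", "input": "List Lyft's risks.", "deps": []},
    {"name": "uber_risks", "input": "List Uber's risks.", "deps": []},
    {"name": "summary", "input": "Summarize both.", "deps": ["lyft_risks", "uber_risks"]},
]
outputs = run_plan(plan, execute_stub, refine_stub)
print(outputs["summary"])
```

The loop stops when no sub-task is ready, which covers both the finished case and a plan whose remaining dependencies can never be met.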

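`FunctionCallingAgentWorker` is a LlamaIndex agent worker that lets an agent interact with external tools through function calls in order to complete a task.

**How it works:**

1. **Receive the task:** The agent hands the task to the `FunctionCallingAgentWorker`.
2. **Select a tool:** The worker picks the most suitable tool based on the task description and the available tools.
3. **Generate the function call:** The worker uses the LLM to turn the task into a function call on the selected tool, including the function name and arguments.
4. **Execute the function call:** The agent executes the generated call and collects the tool's output.
5. **Return the result:** The worker returns the tool's output to the agent.

**Features:**

- **Function calling:** Tools are invoked through function calls, which is flexible and easy to control.
- **Tool selection:** The most suitable tool is chosen automatically from the task and the tool descriptions.
- **Call generation:** The LLM generates the function call, so no call code has to be written by hand.

**Advantages:**

- **Simpler tool integration:** Function calls make it easy to plug many kinds of tools into an agent.
- **Efficiency:** Tool selection and call generation are automated.
- **Flexibility:** Different function calls can be used for different tasks.

**How it differs from other workers:**

- `ReActAgentWorker`: reasons with the ReAct framework, completing a task through a series of "thought" and "action" steps.
- `ChainOfAbstractionAgentWorker`: uses a chain of abstraction to break a complex task into several simpler sub-tasks.

**Use cases:**

`FunctionCallingAgentWorker` suits tools that can be reached through function calls, for example:

- Query engines
- Databases
- APIs
- Other Python libraries

In short, `FunctionCallingAgentWorker` lets an agent use a wide range of tools through function calls and thereby complete more complex tasks.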
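
The select, generate, execute steps above boil down to a dispatch on a structured call. The sketch below is a simplified illustration, not the worker's real code: `fake_llm_tool_call` is a hypothetical stand-in for the model, the two tools return placeholder values, and the `{"name", "arguments"}` JSON shape is an assumption chosen for readability.

```python
import json
from typing import Any, Callable, Dict


def lookup_revenue(company: str, year: int) -> str:
    """Hypothetical tool: returns a placeholder string, not real data."""
    return f"<revenue of {company} in {year}>"


def word_count(text: str) -> int:
    """Hypothetical tool: counts whitespace-separated words."""
    return len(text.split())


# Registry mapping tool names to callables.
TOOLS: Dict[str, Callable[..., Any]] = {
    "lookup_revenue": lookup_revenue,
    "word_count": word_count,
}


def fake_llm_tool_call(task: str) -> str:
    """Stand-in for the LLM: emits a JSON function call for the task."""
    if "revenue" in task:
        call = {"name": "lookup_revenue", "arguments": {"company": "Lyft", "year": 2021}}
    else:
        call = {"name": "word_count", "arguments": {"text": task}}
    return json.dumps(call)


def run_tool_call(task: str) -> Any:
    """Parse the generated call, look up the tool, and execute it."""
    call = json.loads(fake_llm_tool_call(task))
    tool = TOOLS[call["name"]]
    return tool(**call["arguments"])


result = run_tool_call("What was Lyft's revenue in 2021?")
print(result)
```

Because the model only emits a name and arguments, adding a tool means registering one more callable rather than changing the dispatch logic.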

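`ReActAgentWorker` is a LlamaIndex agent worker that uses the ReAct framework for reasoning and decision-making in order to complete a task. ReAct (Reason + Act) is an agent framework that combines reasoning with acting, so the agent can think and plan before it takes an action.

**How it works:**

1. **Receive the task:** The agent hands the task to the `ReActAgentWorker`.
2. **Initialize state:** The worker initializes the agent's state, including the current observation, thought, and action.
3. **Loop over reasoning and acting:** The agent enters a loop with the following steps:
   - **Observe:** The agent observes the current environment and the available information.
   - **Think:** The agent reasons about the observation, for example by analyzing the problem, making a plan, or assessing risk.
   - **Act:** The agent selects and executes an action based on its reasoning.
   - **Update state:** The agent updates its state, including the history of observations, thoughts, and actions.
4. **Repeat:** The agent repeats the loop until the task is complete or a stop condition is reached.
5. **Return the result:** The agent returns the final result or the sequence of actions.

**Features:**

- **Reasoning combined with acting:** The agent reasons before each action instead of acting blindly.
- **Iterative improvement:** By looping over reasoning and acting, the agent can improve its solution step by step.
- **Interpretability:** The agent's thoughts and action choices are transparent and can be inspected and analyzed.

**Advantages:**

- **Better decisions:** Reasoning lets the agent make more informed choices.
- **Higher task completion rate:** Iterative improvement helps the agent find better solutions.
- **Better controllability:** The agent's behavior can be steered by adjusting how it thinks and acts.

**How it differs from other workers:**

- `FunctionCallingAgentWorker`: focuses on interacting with tools through function calls.
- `ChainOfAbstractionAgentWorker`: uses a chain of abstraction to decompose tasks.

**Use cases:**

`ReActAgentWorker` suits tasks that need complex reasoning and decision-making, for example:

- Question answering
- Code generation
- Text summarization
- Dialogue

In short, `ReActAgentWorker` uses the ReAct framework for reasoning and decision-making, which helps an agent work through complex tasks.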
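
The thought, action, observation loop can be sketched with a scripted model in place of an LLM. Everything here is hypothetical and for illustration only: `scripted_model` hard-codes two turns, `multiply` is a toy tool, and the `max_steps` limit plays the role of the stop condition mentioned above. It is not how `ReActAgentWorker` is implemented.

```python
from typing import Any, Callable, Dict, List, Tuple


def multiply(a: int, b: int) -> int:
    """Toy tool used by the scripted agent."""
    return a * b


TOOLS: Dict[str, Callable[..., Any]] = {"multiply": multiply}


def scripted_model(history: List[str]) -> Tuple[str, str, Dict[str, Any]]:
    """Stand-in for the LLM: returns (thought, action, action_input)."""
    observations = [line for line in history if line.startswith("Observation: ")]
    if not observations:
        return ("I need to multiply the two numbers.", "multiply", {"a": 6, "b": 7})
    # An observation exists, so the scripted model decides it can answer.
    value = observations[-1].split(": ", 1)[1]
    return ("I now know the answer.", "finish", {"answer": value})


def react_loop(question: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Alternate thinking and acting until the model finishes or steps run out."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, action_input = scripted_model(history)
        history.append(f"Thought: {thought}")
        if action == "finish":
            return action_input["answer"], history
        observation = TOOLS[action](**action_input)
        history.append(f"Action: {action}")
        history.append(f"Observation: {observation}")
    return "stopped: step limit reached", history


answer, trace = react_loop("What is 6 times 7?")
print(answer)
print("\n".join(trace))
```

The `trace` list is what makes the approach inspectable: every thought, action, and observation is recorded in order.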

# Prevent Disconnection

In [2]:
#@markdown <h3>← Run this cell to prevent disconnection</h3>
import IPython
from google.colab import output

display(IPython.display.Javascript('''
 function ClickConnect(){
   btn = document.querySelector("colab-connect-button")
   if (btn != null){
     console.log("Click colab-connect-button");
     btn.click()
     }

   btn = document.getElementById('ok')
   if (btn != null){
     console.log("Click reconnect");
     btn.click()
     }
  }

setInterval(ClickConnect,60000)
'''))

print("Done.")

<IPython.core.display.Javascript object>

Done.


In [None]:
function ConnectButton(){
    console.log("Connect pushed");
    document.querySelector("#connect").click()
}
setInterval(ConnectButton,60000);

## Setup

To create plans, the agent needs a set of tools to plan over. Here, we use two classic 10-K examples: the 2021 filings from Uber and Lyft.

In [1]:
!mkdir -p 'data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'

--2024-09-14 02:06:08--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1880483 (1.8M) [application/octet-stream]
Saving to: ‘data/10k/uber_2021.pdf’


2024-09-14 02:06:09 (48.9 MB/s) - ‘data/10k/uber_2021.pdf’ saved [1880483/1880483]

--2024-09-14 02:06:09--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1440303 (1.4M) [appl

# LLM Setup

In [3]:
!pip install llama-index
!pip install llama_index-llms-openai_like
!pip install llama_index-embeddings-huggingface

Collecting llama-index
  Downloading llama_index-0.11.9-py3-none-any.whl.metadata (11 kB)
Collecting llama-index-agent-openai<0.4.0,>=0.3.1 (from llama-index)
  Downloading llama_index_agent_openai-0.3.1-py3-none-any.whl.metadata (677 bytes)
Collecting llama-index-cli<0.4.0,>=0.3.1 (from llama-index)
  Downloading llama_index_cli-0.3.1-py3-none-any.whl.metadata (1.5 kB)
Collecting llama-index-core<0.12.0,>=0.11.9 (from llama-index)
  Downloading llama_index_core-0.11.9-py3-none-any.whl.metadata (2.4 kB)
Collecting llama-index-embeddings-openai<0.3.0,>=0.2.4 (from llama-index)
  Downloading llama_index_embeddings_openai-0.2.5-py3-none-any.whl.metadata (686 bytes)
Collecting llama-index-indices-managed-llama-cloud>=0.3.0 (from llama-index)
  Downloading llama_index_indices_managed_llama_cloud-0.3.0-py3-none-any.whl.metadata (3.8 kB)
Collecting llama-index-legacy<0.10.0,>=0.9.48 (from llama-index)
  Downloading llama_index_legacy-0.9.48.post3-py3-none-any.whl.metadata (8.5 kB)
Collecting 

In [4]:
import os
os.environ["ZHIPU_API_KEY"] = ""

In [14]:
import os
import logging
import sys
from llama_index.llms.openai_like import OpenAILike
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Configure logging
logging.basicConfig(stream=sys.stdout, level=logging.INFO)

# Define the LLM: Zhipu's glm-4-flash, reached through an OpenAI-compatible endpoint
llm = OpenAILike(model="glm-4-flash",
                 api_base="https://open.bigmodel.cn/api/paas/v4/",
                 api_key=os.environ["ZHIPU_API_KEY"],
                 temperature=0.1,
                 is_function_calling_model=True,
                 is_chat_model=True)

# Register the LLM as the global default
Settings.llm = llm

# Set the embedding model and chunk size
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-zh-v1.5")
Settings.embed_model = embed_model
Settings.chunk_size = 256

# Fail early if the LLM is not flagged as supporting function calling
if not llm.metadata.is_function_calling_model:
    raise ValueError(
        f"Model name {llm.metadata.model_name} does not support function calling API."
    )

In [7]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.tools import QueryEngineTool

# Load documents, create tools
lyft_documents = SimpleDirectoryReader(
    input_files=["./data/10k/lyft_2021.pdf"]
).load_data()
uber_documents = SimpleDirectoryReader(
    input_files=["./data/10k/uber_2021.pdf"]
).load_data()

lyft_index = VectorStoreIndex.from_documents(lyft_documents)
uber_index = VectorStoreIndex.from_documents(uber_documents)

lyft_tool = QueryEngineTool.from_defaults(
    lyft_index.as_query_engine(),
    name="lyft_2021",
    description="Useful for asking questions about Lyft's 2021 10-K filing.",
)

uber_tool = QueryEngineTool.from_defaults(
    uber_index.as_query_engine(),
    name="uber_2021",
    description="Useful for asking questions about Uber's 2021 10-K filing.",
)

## High Level API

In this section, we cover the high-level API for creating and chatting with a structured planning agent.

### Create the Agent

In [15]:
from llama_index.core.agent import (
    StructuredPlannerAgent,
    FunctionCallingAgentWorker,
    ReActAgentWorker,
)

# create the function calling worker for reasoning
worker = FunctionCallingAgentWorker.from_tools(
    [lyft_tool, uber_tool], verbose=True
)

# wrap the worker in the top-level planner
agent = StructuredPlannerAgent(
    worker, tools=[lyft_tool, uber_tool], verbose=True
)

### Give the agent a complex task

In [16]:
import nest_asyncio

nest_asyncio.apply()

In [17]:
response = agent.chat(
    "Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings."
)

No complex plan predicted. Defaulting to a single task plan.
=== Initial plan ===
default:
Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings. -> 
deps: []


> Running step 74520a7b-ceee-4c36-90cd-90a4199157e7. Step input: Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.
Added user message to memory: Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.
=== LLM Response ===
As of my last update, I don't have direct access to the specific 10-K filings for Lyft and Uber from 2021. However, I can provide a general summary of the types of risk factors that are typically included in such filings for ride-sharing companies like Lyft and Uber. These risk factors often include:

**Lyft's Key Risk Factors (2021):**

1. **Regulatory Changes:** Changes in regulations related to transportation services, labor laws, and data privacy could impact operations and profitability.
2. **Competition:** Intense competition from othe

In [18]:
print(str(response))

As of my last update, I don't have direct access to the specific 10-K filings for Lyft and Uber from 2021. However, I can provide a general summary of the types of risk factors that are typically included in such filings for ride-sharing companies like Lyft and Uber. These risk factors often include:

**Lyft's Key Risk Factors (2021):**

1. **Regulatory Changes:** Changes in regulations related to transportation services, labor laws, and data privacy could impact operations and profitability.
2. **Competition:** Intense competition from other ride-sharing and transportation companies could lead to reduced market share and increased costs.
3. **Driver Relations:** Issues with driver satisfaction and retention could affect service quality and availability.
4. **COVID-19 Impact:** The pandemic's impact on consumer travel patterns and economic conditions could lead to reduced demand for rides.
5. **Financial Performance:** Fluctuations in revenue and profitability due to changes in pricing

## Changing Prompts

The `StructuredPlannerAgent` has two key prompts:
1. The initial planning prompt
2. The plan refinement prompt

Below, we show how to configure these prompts, using the defaults as an example.

In [19]:
DEFAULT_INITIAL_PLAN_PROMPT = """\
Think step-by-step. Given a task and a set of tools, create a comprehensive, end-to-end plan to accomplish the task.
Keep in mind not every task needs to be decomposed into multiple sub-tasks if it is simple enough.
The plan should end with a sub-task that satisfies the overall task.

The tools available are:
{tools_str}

Overall Task: {task}
"""

DEFAULT_PLAN_REFINE_PROMPT = """\
Think step-by-step. Given an overall task, a set of tools, and completed sub-tasks, update (if needed) the remaining sub-tasks so that the overall task can still be completed.
The plan should end with a sub-task that satisfies the overall task.
If the remaining sub-tasks are sufficient, you can skip this step.

The tools available are:
{tools_str}

Overall Task:
{task}

Completed Sub-Tasks + Outputs:
{completed_outputs}

Remaining Sub-Tasks:
{remaining_sub_tasks}
"""

In [36]:
agent1 = StructuredPlannerAgent(
    worker,
    tools=[lyft_tool, uber_tool],
    initial_plan_prompt=DEFAULT_INITIAL_PLAN_PROMPT,
    plan_refine_prompt=DEFAULT_PLAN_REFINE_PROMPT,
    verbose=True,
)
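
The `{tools_str}` and `{task}` placeholders in the prompts above are filled in before the prompt reaches the LLM. The sketch below shows the idea with plain `str.format`; the template text, the `name: description` layout of `tools_str`, and the variable names are assumptions for illustration, and the library may render the tool list differently.

```python
# A hypothetical custom planning prompt that keeps the same two placeholders.
CUSTOM_INITIAL_PLAN_PROMPT = """\
Think step-by-step. Create a plan that queries each relevant tool separately
before combining the results.

The tools available are:
{tools_str}

Overall Task: {task}
"""

# Assumed tool listing: one "name: description" line per tool.
tools = {
    "lyft_2021": "Useful for asking questions about Lyft's 2021 10-K filing.",
    "uber_2021": "Useful for asking questions about Uber's 2021 10-K filing.",
}
tools_str = "\n".join(f"{name}: {description}" for name, description in tools.items())

rendered = CUSTOM_INITIAL_PLAN_PROMPT.format(
    tools_str=tools_str,
    task="Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.",
)
print(rendered)
```

A custom prompt should keep every placeholder the default uses, otherwise the tools or the task never reach the model.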

In [37]:
response1 = agent1.chat(
    "Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings."
)

No complex plan predicted. Defaulting to a single task plan.
=== Initial plan ===
default:
Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings. -> 
deps: []


> Running step e49badb7-0e54-45af-9124-762c988019eb. Step input: Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.
Added user message to memory: Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.
=== LLM Response ===
As of my last update, I can provide a general summary of the key risk factors that Lyft and Uber highlighted in their 2021 10-K filings. Please note that the specific details may vary slightly between the two companies, and for the most accurate and up-to-date information, you should refer to their actual filings.

**Lyft's Key Risk Factors (2021 10-K):**

1. **Competition:** Lyft faces intense competition from other ride-sharing and transportation companies, including Uber, which has a larger market share and more resources.

2. **Regulator

In [38]:
print(str(response1))

As of my last update, I can provide a general summary of the key risk factors that Lyft and Uber highlighted in their 2021 10-K filings. Please note that the specific details may vary slightly between the two companies, and for the most accurate and up-to-date information, you should refer to their actual filings.

**Lyft's Key Risk Factors (2021 10-K):**

1. **Competition:** Lyft faces intense competition from other ride-sharing and transportation companies, including Uber, which has a larger market share and more resources.

2. **Regulatory Changes:** Changes in regulations related to transportation services, labor laws, and data privacy could impact Lyft's operations and profitability.

3. **Driver Relations:** Maintaining good relationships with drivers is crucial for Lyft's business model, and any issues could affect service quality and availability.

4. **COVID-19 Impact:** The pandemic has caused fluctuations in demand for ride-sharing services and could continue to affect the c

## Low-level API [Advanced]

In this section, we use the same agent, but expose the lower-level steps that are happening under the hood.

This is useful for when you want to expose the underlying plan, tasks, etc. to a human to modify them on the fly, or for debugging and running things step-by-step.

### Create the Agent

In [21]:
from llama_index.core.agent import (
    StructuredPlannerAgent,
    FunctionCallingAgentWorker,
    ReActAgentWorker,
)

# create the function calling worker for reasoning
worker = FunctionCallingAgentWorker.from_tools(
    [lyft_tool, uber_tool], verbose=True
)

# wrap the worker in the top-level planner
agent = StructuredPlannerAgent(
    worker, tools=[lyft_tool, uber_tool], verbose=True
)

### Create the initial tasks and plan

In [22]:
plan_id = agent.create_plan(
    "Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings."
)

No complex plan predicted. Defaulting to a single task plan.
=== Initial plan ===
default:
Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings. -> 
deps: []




### Inspect the initial tasks and plan

In [23]:
plan = agent.state.plan_dict[plan_id]

for sub_task in plan.sub_tasks:
    print(f"===== Sub Task {sub_task.name} =====")
    print("Expected output: ", sub_task.expected_output)
    print("Dependencies: ", sub_task.dependencies)

===== Sub Task default =====
Expected output:  
Dependencies:  []
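
The single `default` sub-task with an empty expected output matches the "No complex plan predicted. Defaulting to a single task plan." message printed earlier: when no multi-step plan is obtained from the LLM, the whole task is wrapped in one sub-task. The sketch below illustrates that kind of fallback. It is an assumption about the general pattern, not the library's code, and `parse_plan_or_default` and the JSON layout are hypothetical.

```python
import json
from typing import Any, Dict, List


def parse_plan_or_default(raw_llm_output: str, task: str) -> List[Dict[str, Any]]:
    """Parse a JSON plan; fall back to one 'default' sub-task if it is unusable."""
    try:
        sub_tasks = json.loads(raw_llm_output)["sub_tasks"]
        if not sub_tasks:
            raise ValueError("empty plan")
        for sub_task in sub_tasks:
            # Touch the required keys so a malformed sub-task triggers the fallback.
            sub_task["name"], sub_task["input"], sub_task["dependencies"]
        return sub_tasks
    except (ValueError, KeyError, TypeError):
        # json.JSONDecodeError is a ValueError, so unparseable prose lands here too.
        return [
            {"name": "default", "input": task, "expected_output": "", "dependencies": []}
        ]


task = "Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings."

# Free-form prose instead of structured output: the fallback is used.
fallback_plan = parse_plan_or_default("Sure! Here is my plan: first I will...", task)

# Well-formed structured output: the plan is used as-is.
good_output = json.dumps(
    {
        "sub_tasks": [
            {"name": "lyft", "input": "Lyft risks", "dependencies": []},
            {"name": "uber", "input": "Uber risks", "dependencies": []},
        ]
    }
)
parsed_plan = parse_plan_or_default(good_output, task)
print([t["name"] for t in fallback_plan], [t["name"] for t in parsed_plan])
```

With a single-task plan the planner adds little over calling the worker directly, which is one reason a model with reliable structured output matters here.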


### Execute the first set of tasks

Here, we execute the first set of tasks with their dependencies met.

In [24]:
next_tasks = agent.state.get_next_sub_tasks(plan_id)

for sub_task in next_tasks:
    print(f"===== Sub Task {sub_task.name} =====")
    print("Expected output: ", sub_task.expected_output)
    print("Dependencies: ", sub_task.dependencies)


for sub_task in next_tasks:
    response = agent.run_task(sub_task.name)
    agent.mark_task_complete(plan_id, sub_task.name)

===== Sub Task default =====
Expected output:  
Dependencies:  []
> Running step 736be661-c2fc-4532-b52c-72f396f44f66. Step input: Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.
Added user message to memory: Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.
=== LLM Response ===
As of my last update, I can provide a general summary of the key risk factors identified by Lyft and Uber in their 2021 10-K filings. Please note that the specific details may vary, and for the most accurate and up-to-date information, you should refer to the companies' actual filings.

**Lyft's Key Risk Factors (2021 10-K):**

1. **Competition:** Lyft faces intense competition from established players like Uber and new entrants in the ride-sharing and gig economy space. This competition could lead to reduced market share and increased costs.

2. **Regulatory Changes:** Changes in regulations, including those related to labor laws, could impact Lyft's busi

If we wanted to, we could even execute each task in a step-wise fashion. It would look something like this:

```python
# Step-wise execution per task

for sub_task in next_tasks:
    # get the task from the state
    task = agent.state.get_task(sub_task.name)

    # run initial reasoning step
    step_output = agent.run_step(task.task_id)

    # loop until the last step is reached
    while not step_output.is_last:
        step_output = agent.run_step(task.task_id)

    # finalize the response and commit to memory
    agent.finalize_response(task.task_id, step_output=step_output)
```

### Check if we are done

If there are no remaining tasks, then we can stop. Otherwise, we can refine the current plan and continue.

In [25]:
next_tasks = agent.get_next_tasks(plan_id)
print(len(next_tasks))

0


In [26]:
for sub_task in next_tasks:
    print(f"===== Sub Task {sub_task} =====")

### Refine the plan

If tasks remain, we refine the plan to make sure we are still on track. In this run no tasks remain (the count above is 0), so the plan printed after refinement is unchanged.

In [27]:
# refine the plan
agent.refine_plan(
    "Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.",
    plan_id,
)

In [28]:
plan = agent.state.plan_dict[plan_id]

for sub_task in plan.sub_tasks:
    print(f"===== Sub Task {sub_task.name} =====")
    print("Expected output: ", sub_task.expected_output)
    print("Dependencies: ", sub_task.dependencies)

===== Sub Task default =====
Expected output:  
Dependencies:  []


### Loop until done

With our plan refined, we can repeat this process until we have no more tasks to run.

In [35]:
import asyncio

# Initialize responses as an empty list to ensure it's defined even if the loop doesn't run
responses = []

while True:
    # are we done?
    next_tasks = agent.get_next_tasks(plan_id)
    if len(next_tasks) == 0:
        break

    # run concurrently for better performance (arun_task is the async variant of run_task)
    responses = await asyncio.gather(
        *[agent.arun_task(task_id) for task_id in next_tasks]
    )
    for task_id in next_tasks:
        agent.mark_task_complete(plan_id, task_id)

    # refine the plan (arefine_plan is the async variant of refine_plan)
    await agent.arefine_plan(
        "Summarize the key risk factors for Lyft and Uber in their 2021 10-K filings.",
        plan_id,
    )

# Check if responses is not empty before accessing elements
if responses:
    print(str(responses[-1]))
else:
    print("No responses found.")

No responses found.


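By the end, we should have a single response, which is our final response. In this run the only sub-task had already been executed and marked complete, so the loop exits immediately and prints "No responses found."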
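
Running independent sub-tasks concurrently relies on `asyncio.gather`, which awaits several coroutines at once and returns their results in the order they were passed in. The sketch below isolates that pattern with a stub: `run_task_stub` and `run_wave` are hypothetical names, and the short sleep stands in for a slow LLM or tool call.

```python
import asyncio
from typing import List


async def run_task_stub(task_id: str) -> str:
    """Stand-in for an async task runner; sleeps briefly instead of calling an LLM."""
    await asyncio.sleep(0.01)
    return f"result for {task_id}"


async def run_wave(task_ids: List[str]) -> List[str]:
    """Run all ready tasks concurrently; results keep the input order."""
    return await asyncio.gather(*[run_task_stub(task_id) for task_id in task_ids])


# In a script, asyncio.run starts the event loop.
# In a notebook cell, `await run_wave([...])` can be used directly instead.
results = asyncio.run(run_wave(["lyft_risks", "uber_risks"]))
print(results)
```

`asyncio.gather` only accepts awaitables, so the functions passed to it must be coroutine functions rather than their synchronous counterparts.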