# Introduction

## What is an agent

Consider a simple factual question: “What was X corporation’s total revenue for FY 2022?”

Now consider a question like, “What were the three takeaways from the Q2 earnings call from FY 23? Focus on the technological moats that the company is building.”

This inquiry requires planning, tailored focus, memory, the use of different tools, and breaking a complex question down into simpler sub-parts. Assembled together, these concepts are essentially what we have come to call an LLM agent.



LLM agents are advanced AI systems designed to produce complex text that requires sequential reasoning. They can think ahead, remember past conversations, and use different tools to adjust their responses to the situation and style required.

They combine thorough data analysis, strategic planning, data retrieval, and the ability to learn from past actions to solve complex problems.

To complete the subtasks that a complex question breaks down into, an LLM agent needs a structured plan, a reliable memory to track progress, and access to the necessary tools. These components form the backbone of an LLM agent’s workflow.


## LLM agent components


![](agent-overview.png)


LLM agents generally consist of four components:

### Agent/brain

The agent core (the "brain") is the central module that coordinates the other components. Its prompt typically defines:

- General goals: the overall goals and objectives of the agent.
- Tools for execution: essentially a short list, or "user manual", of all the tools the agent has access to.
- How to use the planning modules: what each planning module is good for and which one to use in which situation.
- Relevant memory: a dynamic section that is filled at inference time with the most relevant memory items from past conversations with the user. Relevance is determined from the question the user asks.
- Persona of the agent (optional): a persona description, typically used either to bias the model toward certain types of tools or to give the agent's final response characteristic idiosyncrasies.



```
template = """GENERAL INSTRUCTIONS

Your task is to answer questions. If you cannot answer the question, request a
helper or use a tool. Fill with Nil where no tool or helper is required.

AVAILABLE TOOLS
Search Tool
Math Tool

AVAILABLE HELPERS
Decomposition: Breaks complex questions down into simple subparts

CONTEXTUAL INFORMATION
<No previous questions asked>

QUESTION
How much did the revenue grow between Q1 of 2024 and Q2 of 2024?

ANSWER FORMAT
{"Tool_Request": "<Fill>", "Helper_Request": "<Fill>"}
"""
```
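
Because the memory section is filled in at inference time, a prompt like the one above is usually assembled from its parts rather than written by hand. The following is a minimal Python sketch of that assembly. The function name `build_agent_prompt` and its parameters are hypothetical and chosen only for illustration; the section titles mirror the template.

```python
def build_agent_prompt(question, tools, helpers, memory_items):
    """Assemble an agent-core prompt from its sections (illustrative only)."""
    # Relevant memory is the dynamic part: it is filled in at inference time.
    context = "\n".join(memory_items) if memory_items else "<No previous questions asked>"
    helper_lines = "\n".join(f"{name}: {desc}" for name, desc in helpers.items())
    sections = [
        ("GENERAL INSTRUCTIONS",
         "Your task is to answer questions. If you cannot answer the question, "
         "request a helper or use a tool. Fill with Nil where no tool or helper "
         "is required."),
        ("AVAILABLE TOOLS", "\n".join(tools)),
        ("AVAILABLE HELPERS", helper_lines),
        ("CONTEXTUAL INFORMATION", context),
        ("QUESTION", question),
        ("ANSWER FORMAT", '{"Tool_Request": "<Fill>", "Helper_Request": "<Fill>"}'),
    ]
    # Each section is a title line followed by its body, separated by blank lines.
    return "\n\n".join(f"{title}\n{body}" for title, body in sections)


prompt = build_agent_prompt(
    question="How much did the revenue grow between Q1 of 2024 and Q2 of 2024?",
    tools=["Search Tool", "Math Tool"],
    helpers={"Decomposition": "Breaks complex questions down into simple subparts"},
    memory_items=[],
)
print(prompt)
```

Passing a non-empty `memory_items` list replaces the placeholder line with the retrieved memory items, which is the only part of the prompt that changes from one question to the next in this sketch.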

### Planning


- Planning covers plan formulation and plan reflection. Two effective methods for incorporating feedback in planning are ReAct and Reflexion.
- Common techniques are task decomposition (Chain of Thought (CoT), Tree of Thoughts (ToT), LLM+P) and self-reflection (ReAct, Reflexion, Chain of Hindsight, Algorithm Distillation).



#### Without feedback

The planning module breaks the user request down into the steps or subtasks that the agent will solve individually.


![](task-decomposition.webp)
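
A minimal sketch of planning without feedback, assuming the plan is produced once up front and then executed step by step with no revision. The `decompose` function stands in for an LLM call and is hard-coded so the example runs without a model. All names here are hypothetical.

```python
def decompose(question):
    """Stand-in for an LLM call that splits a question into subtasks.

    Hard-coded here so the sketch runs without a model.
    """
    return [
        "Find the Q1 2024 revenue",
        "Find the Q2 2024 revenue",
        "Compute the growth between the two figures",
    ]


def run_plan(question, solve_subtask):
    """Formulate the plan once, then solve each subtask in order."""
    plan = decompose(question)
    results = []
    for step in plan:
        # No feedback: the plan is never revised, whatever a step returns.
        results.append((step, solve_subtask(step)))
    return results


# A toy solver that just echoes the subtask it was given.
results = run_plan(
    "How much did the revenue grow between Q1 of 2024 and Q2 of 2024?",
    solve_subtask=lambda step: f"done: {step}",
)
for step, outcome in results:
    print(step, "->", outcome)
```

The weakness of this design is visible in the loop: if an early subtask fails or returns something unexpected, the later steps still run against the original plan.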

#### With feedback

Planning with feedback enables the model to iteratively reflect on and refine the execution plan based on past actions and observations.

The goal is to correct past mistakes, which improves the quality of the final results.

This is particularly important in complex real-world environments and tasks, where trial and error is key to completing the task.



##### ReAct

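ReAct integrates reasoning and acting within an LLM by extending the action space to be a combination of task-specific discrete actions and the language space. The discrete actions let the LLM interact with its environment (for example, by calling a search API), while the language space lets it generate reasoning traces in natural language.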


![](react.png)
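
The Thought / Action / Observation cycle can be sketched as a small loop. This is a toy illustration under stated assumptions, not the reference ReAct implementation: `scripted_llm` is a hard-coded stand-in for a model, the `Action: name[input]` text format and the `subtract` tool are hypothetical, and the numbers are made up.

```python
import re


def scripted_llm(transcript):
    """Stand-in for an LLM: emits the next Thought and Action for the transcript."""
    if "Observation: " not in transcript:
        return ("Thought: I need the difference between the two figures.\n"
                "Action: subtract[120, 100]")
    observation = transcript.rsplit("Observation: ", 1)[1]
    return f"Thought: I now have the difference.\nAction: finish[{observation}]"


def subtract(args):
    """Toy task-specific action: subtracts the second number from the first."""
    a, b = (float(x) for x in args.split(","))
    return str(a - b)


TOOLS = {"subtract": subtract}


def react(question, llm, max_steps=5):
    """Alternate reasoning (Thought) and acting (Action) until 'finish' is called."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += "\n" + step
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match is None:
            break  # the model produced no parsable action
        name, arg = match.groups()
        if name == "finish":
            return arg, transcript
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool: {name}"
        # The observation is fed back so the next thought can build on it.
        transcript += f"\nObservation: {observation}"
    return None, transcript


answer, trace = react("What is the difference between 120 and 100?", scripted_llm)
print(trace)
```

The transcript is the key design choice: thoughts, actions, and observations all accumulate in one text, so each new reasoning step sees the results of earlier actions.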


##### Reflexion


Reflexion uses a standard RL setup in which the reward model provides a simple binary reward. The action space follows the setup in ReAct, where the task-specific action space is augmented with language to enable complex reasoning steps.


![](reflexion.png)
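
A minimal sketch of the trial loop, assuming a binary reward and verbal reflections that are carried into later trials. The `actor`, `evaluator`, and `reflect` functions are hard-coded stand-ins for model calls, and every name here is hypothetical.

```python
def actor(task, reflections):
    """Stand-in for the LLM actor: its answer improves once a reflection exists."""
    return "4" if reflections else "5"


def evaluator(task, answer):
    """Binary reward: 1 if the answer is correct, else 0."""
    return 1 if answer == "4" else 0


def reflect(task, answer):
    """Stand-in for the self-reflection step: verbal feedback on a failed trial."""
    return f"Answer {answer!r} to {task!r} was wrong; recheck the arithmetic."


def reflexion(task, max_trials=3):
    """Retry the task, feeding reflections from failed trials into the next one."""
    reflections = []
    for trial in range(1, max_trials + 1):
        answer = actor(task, reflections)
        if evaluator(task, answer) == 1:
            return answer, trial, reflections
        # Failure: store a verbal reflection instead of updating model weights.
        reflections.append(reflect(task, answer))
    return None, max_trials, reflections


answer, trials, notes = reflexion("What is 2 + 2?")
print(answer, trials, notes)
```

In this toy run the first trial fails, one reflection is stored, and the second trial succeeds because the actor sees that reflection.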

### Memory


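- Sensory memory: the raw input itself (text, images, or other modalities), often mapped to learned embedding representations.
- Short-term memory: in-context information, limited by the model's finite context window.
- Long-term memory: an external store, commonly a vector store, that the agent can query at inference time.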
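
One way to picture the split between short-term and long-term memory is the sketch below. It is only an illustration under simplifying assumptions: the class name `AgentMemory` is hypothetical, short-term memory is a bounded window of recent turns, and long-term recall uses plain word overlap where a real system would more likely use embeddings and a vector store.

```python
from collections import deque


class AgentMemory:
    """Toy memory: a bounded short-term window plus a searchable long-term list."""

    def __init__(self, short_term_size=3):
        self.short_term = deque(maxlen=short_term_size)
        self.long_term = []

    def remember(self, text):
        # When the window is full, archive the oldest turn before it is evicted.
        if len(self.short_term) == self.short_term.maxlen:
            self.long_term.append(self.short_term[0])
        self.short_term.append(text)

    def recall(self, query, k=1):
        """Return up to k long-term items that share the most words with the query."""
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(item.lower().split())), item)
            for item in self.long_term
        ]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [item for score, item in ranked[:k] if score > 0]


memory = AgentMemory(short_term_size=2)
for turn in [
    "user asked about Q1 revenue",
    "user prefers short answers",
    "user asked about Q2 revenue",
]:
    memory.remember(turn)

print(list(memory.short_term))
print(memory.recall("Q1 revenue"))
```

After the third turn, the oldest turn has moved out of the short-term window into long-term storage, and `recall` can bring it back when a later question mentions it.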


### Tool use

Tools let the agent connect with external environments to perform specific tasks.
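
A minimal sketch of tool dispatch, assuming the model answers in the JSON format from the agent core template (`Tool_Request`, with `Nil` meaning no tool). The tool functions, the `TOOLS` registry, and the separate `tool_input` argument are hypothetical. The search tool is a placeholder that calls no real service.

```python
import json
import operator


def search_tool(query):
    """Placeholder: a real tool would call an external search service."""
    return f"(placeholder search results for {query!r})"


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def math_tool(expression):
    """Evaluate a simple 'a op b' expression such as '120 - 100'."""
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))


# Registry mapping the tool names listed in the prompt to callables.
TOOLS = {"Search Tool": search_tool, "Math Tool": math_tool}


def dispatch(model_output, tool_input):
    """Run the tool named in the model's JSON answer, or return None for Nil."""
    request = json.loads(model_output)["Tool_Request"]
    if request == "Nil":
        return None
    if request not in TOOLS:
        raise KeyError(f"unknown tool: {request}")
    return TOOLS[request](tool_input)


result = dispatch('{"Tool_Request": "Math Tool", "Helper_Request": "Nil"}', "120 - 100")
print(result)
```

Keeping the registry keys identical to the names in the prompt's tool list means the model's request can be looked up directly, and an unknown name fails loudly instead of silently doing nothing.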

## What can LLM agents do?

- Advanced problem solving
- Self-reflection and improvement: https://blog.langchain.dev/reflection-agents/
- Tool use
- Multi-agent framework: https://blog.langchain.dev/reflection-agents/


## LLM agent challenges

- Limited context
- Difficulty with long-term planning
- Inconsistent outputs (unreliable, incorrect, or not following instructions)
- Adapting to specific roles
- Prompt dependence (results hinge on prompt wording, so prompts need trial and error)
- Managing knowledge
- Cost and efficiency

## Homework

- How do you think an agent's long-term and short-term memory should be designed?
- Think about how Human-in-the-Loop could be applied in your product or design.

## Reference

- https://www.superannotate.com/blog/llm-agents
- https://lilianweng.github.io/posts/2023-06-23-agent/
- https://developer.nvidia.com/blog/introduction-to-llm-agents/
- https://www.promptingguide.ai/research/llm-agents