# Memory Agent

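## Review

We created a chatbot that saves semantic memories to a single [user profile](https://langchain-ai.github.io/langgraph/concepts/memory/#profile) or a [collection](https://langchain-ai.github.io/langgraph/concepts/memory/#collection).

We introduced [Trustcall](https://github.com/hinthornw/trustcall) as a way to update either schema.

## Goals

Now we will pull together everything we have learned to build an [agent](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/) with long-term memory.

Our agent, `task_mAIstro`, will help us manage a ToDo list!

The chatbots we built previously *always* reflected on the conversation and saved memories.

`task_mAIstro` will decide *when* to save memories (items added to our ToDo list).

The chatbots we built previously always saved one type of memory, a profile or a collection.

`task_mAIstro` can decide whether to save to a user profile or to a collection of ToDo items.

In addition to semantic memory, `task_mAIstro` will also manage procedural memory.

This lets the user update their preferences for how ToDo items are created.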

In [19]:
%%capture --no-stderr
%pip install -U langchain_openai langgraph trustcall langchain_core

In [None]:
import os, getpass

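def _set_env(var: str):
    """
    Helper - set an environment variable, prompting for it if it is missing

    Python notes:
    - os.environ: mapping of the process's environment variables
    - getpass.getpass(): prompts for input without echoing it to the terminal
    """
    # Check whether the variable is already set in the OS environment
    env_value = os.environ.get(var)
    if not env_value:
        # Not set: prompt the user for it
        env_value = getpass.getpass(f"{var}: ")

    # Set the variable for the current process
    os.environ[var] = env_value

# ================== LangSmith configuration ==================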
_set_env("LANGSMITH_API_KEY")
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "langchain-academy"

In [None]:
_set_env("OPENAI_API_KEY")

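## Visibility into Trustcall updates

Trustcall creates and updates JSON documents that conform to a schema.

What if we want visibility into the *specific changes* made by Trustcall?

For example, we saw before that Trustcall has some of its own tools to:

* Self-correct from validation failures -- [see trace example](https://smith.langchain.com/public/5cd23009-3e05-4b00-99f0-c66ee3edd06e/r/9684db76-2003-443b-9aa2-9a9dbc5498b7)
* Update existing documents -- [see trace example](https://smith.langchain.com/public/f45bdaf0-6963-4c19-8ec9-f4b7fe0f68ad/r/760f90e1-a5dc-48f1-8c34-79d6a3414ac3)

Visibility into these tools is useful for the agent we are going to build.

Below, we show how to get it!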

In [None]:
from pydantic import BaseModel, Field

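# ================== Define Memory schemas (for the Spy demo) ==================

# These schemas are used to demonstrate the Trustcall Spy
# The agent implementation later on uses richer schemas

class Memory(BaseModel):
    """
    A single memory entry - used for the Trustcall Spy demo

    A deliberately simple schema for showing:
    - how Trustcall creates a memory
    - how Trustcall updates a memory
    - how the Spy captures those operations
    """
    content: str = Field(
        description="The main content of the memory. For example: User expressed interest in learning about French."
    )

class MemoryCollection(BaseModel):
    """
    A collection of memories - holds multiple Memory entries

    Uses:
    - extracting several memories at once with with_structured_output
    - demonstrating how multiple memory entries are handled
    """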
    memories: list[Memory] = Field(
        description="A list of memories about the user."
    )

We can add a [listener](https://python.langchain.com/docs/how_to/lcel_cheatsheet/#add-lifecycle-listeners) to the Trustcall extractor.

This passes the runs from the extractor's execution to a class, `Spy`, that we will define.

Our `Spy` class will extract information about the tool calls that Trustcall made.
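To make the idea of walking a run tree concrete, here is a minimal standard-library sketch. The `make_run` helper and the `SimpleNamespace` objects are hypothetical stand-ins for traced runs, not the real run objects a listener receives; only the attribute names `run_type` and `child_runs` mirror the ones the `Spy` class reads. It also shows that popping from the end of the work list gives depth-first order, while popping from the front gives breadth-first order.

```python
from collections import deque
from types import SimpleNamespace


def make_run(run_type, tool_calls=None, children=()):
    """Build a hypothetical stand-in for a traced run (not a real Run object)."""
    return SimpleNamespace(
        run_type=run_type,
        tool_calls=tool_calls or [],
        child_runs=list(children),
    )


def collect_tool_calls(root, breadth_first=False):
    """Walk the run tree and gather the tool calls of every chat_model run."""
    pending = deque([root])
    found = []
    while pending:
        # popleft() visits runs breadth-first; pop() visits them depth-first
        run = pending.popleft() if breadth_first else pending.pop()
        pending.extend(run.child_runs)
        if run.run_type == "chat_model":
            found.append(run.tool_calls)
    return found


# A small tree: a root chain with a nested chain and two chat_model runs
tree = make_run("chain", children=[
    make_run("chain", children=[make_run("chat_model", [{"name": "PatchDoc"}])]),
    make_run("chat_model", [{"name": "Memory"}]),
])

depth_first = collect_tool_calls(tree)
breadth_first = collect_tool_calls(tree, breadth_first=True)
```

Either order finds every `chat_model` run; only the order of the collected groups can differ, which matters only if downstream code depends on it.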

In [None]:
from trustcall import create_extractor
from langchain_openai import ChatOpenAI

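# ================== Spy class - listens for Trustcall's tool calls ==================

class Spy:
    """
    Spy acts as a listener that captures every tool call made while Trustcall runs

    LangChain/LCEL notes:
    - Listeners are a feature of LCEL (LangChain Expression Language)
    - Callbacks can be attached to a Runnable's lifecycle events (on_start, on_end, on_error)
    - Useful for monitoring, logging, and debugging

    Trustcall notes:
    - Trustcall uses several tool calls internally to create and update documents
    - These tool calls include:
      1. Memory/ToDo/Profile: create a new memory
      2. PatchDoc: update an existing memory (using JSON Patch)
      3. Validation and error-correction tools

    Why we need Spy:
    - To see exactly what Trustcall did
    - To extract the update plan and the changed content
    - To show the user what changed
    - To let the agent report memory updates to the user
    """

    def __init__(self):
        """
        Initialize the Spy

        called_tools: stores every captured tool call
        - Each Trustcall run appends its tool calls here (the list is never reset)
        - Shape: List[List[Dict]] - one outer entry per chat_model call, each holding that call's tool calls
        """
        self.called_tools = []

    def __call__(self, run):
        """
        Callback - invoked when the Trustcall run ends

        Args:
        - run: Run object with the full execution record
          - run.child_runs: list of child runs
          - run.run_type: type of the run (e.g. "chat_model", "tool", "chain")
          - run.outputs: outputs of the run

        Algorithm: iterative depth-first traversal
        - Goal: walk the whole run tree and find every chat_model run
        - Trustcall's execution forms a tree; any full traversal visits every node,
          and a list used as a stack is the simplest way to do it
        """

        # Python data-structure notes:
        # - The list is used as a stack
        # - q.pop(): removes the LAST element, so the most recently added run is visited next
        #   (q.pop(0) or collections.deque.popleft() would give breadth-first order instead)
        # - q.extend(): appends several elements at once

        # Start with the root run
        q = [run]

        # Traverse the tree
        while q:
            # Take the next run off the stack
            r = q.pop()

            # Push any child runs so the whole execution tree is visited
            if r.child_runs:
                q.extend(r.child_runs)

            # For chat_model runs, extract the tool calls
            if r.run_type == "chat_model":
                # Path: outputs -> generations -> [0] -> [0] -> message -> kwargs -> tool_calls
                #
                # Structure:
                # - outputs: the generation results
                # - generations: list of generated messages
                # - [0][0]: first candidate of the first generation
                # - message: the serialized AIMessage
                # - kwargs: the message's keyword arguments
                # - tool_calls: list of tool calls
                self.called_tools.append(
                    r.outputs["generations"][0][0]["message"]["kwargs"]["tool_calls"]
                )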

# ================== Initialize components ==================

# Initialize the spy
spy = Spy()

# Initialize the model
model = ChatOpenAI(model="gpt-4o", temperature=0)

# ================== Create the Trustcall extractor ==================

# Create the base extractor
trustcall_extractor = create_extractor(
    model,
    tools=[Memory],  # use the Memory schema
    tool_choice="Memory",  # force the Memory tool
    enable_inserts=True,  # allow new memories to be inserted
)

# ================== Add the listener ==================

# LangChain notes:
# - with_listeners(): Runnable method for attaching lifecycle listeners
# - on_end: called when the Runnable finishes
# - on_start: called when the Runnable starts
# - on_error: called when the Runnable raises an error

# Attach the spy as a listener on the extractor
# - when the extractor finishes, spy.__call__() is invoked
# - the spy captures all tool calls and stores them in spy.called_tools
trustcall_extractor_see_all_tool_calls = trustcall_extractor.with_listeners(on_end=spy)

# Use case:
# - in the agent, we need to know what Trustcall changed in the ToDo list
# - spy.called_tools lets us extract that information and show it to the user
# - e.g. "Document 0 updated: Added deadline 2024-11-30"

In [4]:
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

# Instruction
instruction = """Extract memories from the following conversation:"""

# Conversation
conversation = [HumanMessage(content="Hi, I'm Lance."), 
                AIMessage(content="Nice to meet you, Lance."), 
                HumanMessage(content="This morning I had a nice bike ride in San Francisco.")]

# Invoke the extractor
result = trustcall_extractor.invoke({"messages": [SystemMessage(content=instruction)] + conversation})

In [5]:
# Messages contain the tool calls
for m in result["messages"]:
    m.pretty_print()

Tool Calls:
  Memory (call_NkjwwJGjrgxHzTb7KwD8lTaH)
 Call ID: call_NkjwwJGjrgxHzTb7KwD8lTaH
  Args:
    content: Lance had a nice bike ride in San Francisco this morning.


In [6]:
# Responses contain the memories that adhere to the schema
for m in result["responses"]: 
    print(m)

content='Lance had a nice bike ride in San Francisco this morning.'


In [7]:
# Metadata contains the tool call  
for m in result["response_metadata"]: 
    print(m)

{'id': 'call_NkjwwJGjrgxHzTb7KwD8lTaH'}


In [8]:
# Update the conversation
updated_conversation = [AIMessage(content="That's great, what did you do after?"), 
                        HumanMessage(content="I went to Tartine and ate a croissant."),                        
                        AIMessage(content="What else is on your mind?"),
                        HumanMessage(content="I was thinking about my trip to Japan, and going back this winter!"),]

# Update the instruction
system_msg = """Update existing memories and create new ones based on the following conversation:"""

# We'll save existing memories, giving them an ID, key (tool name), and value
tool_name = "Memory"
existing_memories = [(str(i), tool_name, memory.model_dump()) for i, memory in enumerate(result["responses"])] if result["responses"] else None
existing_memories

[('0',
  'Memory',
  {'content': 'Lance had a nice bike ride in San Francisco this morning.'})]

In [9]:
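# Invoke the extractor with the update instruction, our updated conversation, and existing memories
# (system_msg was defined above; it is prepended here so that it is actually used)
result = trustcall_extractor_see_all_tool_calls.invoke({"messages": [SystemMessage(content=system_msg)] + updated_conversation, 
                                                        "existing": existing_memories})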

In [14]:
# Metadata contains the tool call  
for m in result["response_metadata"]: 
    print(m)

{'id': 'call_bF0w0hE4YZmGyDbuJVe1mh5H', 'json_doc_id': '0'}
{'id': 'call_fQAxxRypV914Xev6nJ9VKw3X'}
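The metadata above is what separates an update from an insert: the first entry carries a `json_doc_id` (the ID we assigned to the existing memory), while the second does not. A minimal sketch of how a storage key could be chosen from it follows; `storage_key` is a hypothetical helper name, and the metadata list is copied from the output above.

```python
import uuid

# Metadata entries shaped like the output above
response_metadata = [
    {"id": "call_bF0w0hE4YZmGyDbuJVe1mh5H", "json_doc_id": "0"},
    {"id": "call_fQAxxRypV914Xev6nJ9VKw3X"},
]


def storage_key(meta: dict) -> str:
    """Reuse the existing document ID for updates; mint a new UUID for inserts."""
    return meta.get("json_doc_id") or str(uuid.uuid4())


keys = [storage_key(meta) for meta in response_metadata]
is_update = ["json_doc_id" in meta for meta in response_metadata]
```

Reusing the ID for updates means the new value overwrites the old document rather than being stored next to it.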


In [10]:
# Messages contain the tool calls
for m in result["messages"]:
    m.pretty_print()

Tool Calls:
  Memory (call_bF0w0hE4YZmGyDbuJVe1mh5H)
 Call ID: call_bF0w0hE4YZmGyDbuJVe1mh5H
  Args:
    content: Lance had a nice bike ride in San Francisco this morning. Afterward, he went to Tartine and ate a croissant. He was also thinking about his trip to Japan and going back this winter.
  Memory (call_fQAxxRypV914Xev6nJ9VKw3X)
 Call ID: call_fQAxxRypV914Xev6nJ9VKw3X
  Args:
    content: Lance went to Tartine and ate a croissant. He was also thinking about his trip to Japan and going back this winter.


In [18]:
# Parsed responses
for m in result["responses"]:
    print(m)

content='Lance had a nice bike ride in San Francisco this morning. Afterward, he went to Tartine and ate a croissant. He was also thinking about his trip to Japan and going back this winter.'
content='Lance went to Tartine and ate a croissant. He was also thinking about his trip to Japan and going back this winter.'


In [12]:
# Inspect the tool calls made by Trustcall
spy.called_tools

[[{'name': 'PatchDoc',
   'args': {'json_doc_id': '0',
    'planned_edits': '1. Replace the existing content with the updated memory that includes the new activities: going to Tartine for a croissant and thinking about going back to Japan this winter.',
    'patches': [{'op': 'replace',
      'path': '/content',
      'value': 'Lance had a nice bike ride in San Francisco this morning. Afterward, he went to Tartine and ate a croissant. He was also thinking about his trip to Japan and going back this winter.'}]},
   'id': 'call_bF0w0hE4YZmGyDbuJVe1mh5H',
   'type': 'tool_call'},
  {'name': 'Memory',
   'args': {'content': 'Lance went to Tartine and ate a croissant. He was also thinking about his trip to Japan and going back this winter.'},
   'id': 'call_fQAxxRypV914Xev6nJ9VKw3X',
   'type': 'tool_call'}]]
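The `patches` list in the `PatchDoc` call above uses JSON Patch operations (`op`, `path`, `value`). As a rough illustration of what such a patch means, the sketch below applies a small subset of JSON Patch to a plain dict. `apply_patches` is a hypothetical helper written for this example; it is not how Trustcall applies patches internally, and it ignores list indices and `~` escapes.

```python
import copy


def apply_patches(doc: dict, patches: list) -> dict:
    """Apply a small subset of JSON Patch (replace/add/remove on dict keys)."""
    result = copy.deepcopy(doc)
    for patch in patches:
        # "/a/b" -> ["a", "b"]; list indices and "~" escapes are not handled here
        *parents, leaf = patch["path"].lstrip("/").split("/")
        target = result
        for key in parents:
            target = target[key]
        if patch["op"] in ("replace", "add"):
            target[leaf] = patch["value"]
        elif patch["op"] == "remove":
            del target[leaf]
        else:
            raise ValueError(f"unsupported op: {patch['op']}")
    return result


existing = {"content": "Lance had a nice bike ride in San Francisco this morning."}
patches = [{
    "op": "replace",
    "path": "/content",
    "value": "Lance had a nice bike ride in San Francisco this morning. "
             "Afterward, he went to Tartine and ate a croissant.",
}]
updated = apply_patches(existing, patches)
```

Because a patch touches only the listed paths, the model does not have to regenerate the whole document to change one field.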

In [None]:
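def extract_tool_info(tool_calls, schema_name="Memory"):
    """
    Extract information from tool calls for both patches and new memories

    What it does:
    - parses the tool calls captured by the Spy
    - separates updates (PatchDoc) from creations (Memory/ToDo/Profile)
    - formats them as a human-readable change summary

    Args:
        tool_calls: list of tool calls from the model (spy.called_tools)
        schema_name: name of the schema tool (e.g. "Memory", "ToDo", "Profile")

    Returns:
        A formatted string describing all changes

    Use case:
        In the agent's update_todos node:
        1. The Spy captures all of Trustcall's tool calls
        2. This function parses them
        3. The parsed result is returned to the agent as a ToolMessage
        4. The agent can tell the user exactly what changed
    """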

    # ================== Initialize the list of changes ==================
    changes = []

    # ================== Iterate over the tool calls ==================
    # Python notes:
    # - tool_calls is a nested list: [[call1, call2], [call3]]
    # - outer list: one entry per chat_model call
    # - inner list: the tool calls made in that call

    for call_group in tool_calls:
        for call in call_group:

            # ========== PatchDoc - an update ==========
            if call['name'] == 'PatchDoc':
                # PatchDoc is the tool Trustcall uses internally to update existing documents
                #
                # Structure of call['args']:
                # - json_doc_id: ID of the document being updated
                # - planned_edits: text description of the update plan
                # - patches: list of JSON Patch operations
                #   - op: operation type (replace, add, remove)
                #   - path: JSON path
                #   - value: new value

                changes.append({
                    'type': 'update',  # mark as an update
                    'doc_id': call['args']['json_doc_id'],  # document ID
                    'planned_edits': call['args']['planned_edits'],  # update plan
                    'value': call['args']['patches'][0]['value']  # new value (first patch only)
                })

            # ========== Memory/ToDo/Profile - a creation ==========
            elif call['name'] == schema_name:
                # This tool call creates a new memory
                # call['args']: the full content of the new memory

                changes.append({
                    'type': 'new',  # mark as a creation
                    'value': call['args']  # content of the new memory
                })

    # ================== Format the result ==================
    # Python notes:
    # - a for loop builds one formatted string per change
    # - str.join(): concatenates the list of strings into one

    result_parts = []

    for change in changes:
        if change['type'] == 'update':
            # Format an update
            # Example output:
            # "Document 0 updated:
            #  Plan: Add a deadline for the task
            #  Added content: 2024-11-30T23:59:59"
            result_parts.append(
                f"Document {change['doc_id']} updated:\n"
                f"Plan: {change['planned_edits']}\n"
                f"Added content: {change['value']}"
            )
        else:
            # Format a creation
            # Example output:
            # "New Memory created:
            #  Content: {'content': 'User likes biking'}"
            result_parts.append(
                f"New {schema_name} created:\n"
                f"Content: {change['value']}"
            )

    # Join all changes, separated by a blank line
    return "\n\n".join(result_parts)

# ================== Demo ==================

# Inspect spy.called_tools to see what happened during extraction
# spy.called_tools holds all of Trustcall's tool calls
schema_name = "Memory"

# Parse the tool calls with extract_tool_info
changes = extract_tool_info(spy.called_tools, schema_name)

# Print the formatted change summary
# This is what the agent will show to the user
print(changes)

# Example of the actual output:
# "Document 0 updated:
#  Plan: 1. Replace the existing content with the updated memory...
#  Added content: Lance had a nice bike ride in San Francisco this morning...
# 
#  New Memory created:
#  Content: {'content': 'Lance went to Tartine and ate a croissant...'}"
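Note that `extract_tool_info` reads only `patches[0]['value']`, so it reports just the first patch and would raise a `KeyError` for an operation without a `value` (a JSON Patch `remove`, for example). A more defensive formatter might look like the sketch below; `describe_patches` is a hypothetical helper and is not part of the notebook's agent.

```python
def describe_patches(patches):
    """Render every JSON Patch operation in a PatchDoc call as one readable line."""
    lines = []
    for patch in patches:
        op, path = patch["op"], patch["path"]
        if "value" in patch:
            lines.append(f"{op} {path}: {patch['value']}")
        else:
            # `remove` operations carry no `value` field
            lines.append(f"{op} {path}")
    return "\n".join(lines)


example = [
    {"op": "replace", "path": "/status", "value": "done"},
    {"op": "remove", "path": "/deadline"},
]
summary = describe_patches(example)
```

Whether the extra detail is worth showing to the user is a design choice; the single-value summary keeps the agent's tool message short.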

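## Creating an agent

There are many different [agent](https://langchain-ai.github.io/langgraph/concepts/high_level/) architectures to choose from.

Here, we'll implement something simple, a [ReAct](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/#react-implementation) agent.

This agent will be a helpful companion for creating and managing a ToDo list.

This agent can decide to update three types of long-term memory:

(a) Create or update a user `profile` with general user information

(b) Add or update items in a ToDo list `collection`

(c) Update its own `instructions` on how to update items in the ToDo list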
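To picture how the three memory types can sit side by side, here is a minimal sketch that uses a dict-backed stand-in for a namespaced store. `TinyStore`, the user ID, and the fixed keys `"profile"` and `"user_instructions"` are assumptions made for illustration; the notebook itself uses LangGraph's `InMemoryStore`.

```python
import uuid


class TinyStore:
    """Hypothetical dict-backed stand-in for a namespaced memory store."""

    def __init__(self):
        self._data = {}

    def put(self, namespace, key, value):
        # Writing to an existing key overwrites the stored value
        self._data.setdefault(namespace, {})[key] = value

    def search(self, namespace):
        # Return every value stored under the namespace
        return list(self._data.get(namespace, {}).values())


store = TinyStore()
user_id = "lance"

# (a) profile: one document per user, overwritten in place
store.put(("profile", user_id), "profile", {"name": "Lance"})
store.put(("profile", user_id), "profile", {"name": "Lance", "location": "SF"})

# (b) todo: a collection with one key per item
for task in ["Book swim lessons", "Fix the lock"]:
    store.put(("todo", user_id), str(uuid.uuid4()), {"task": task})

# (c) instructions: a single free-text document
store.put(("instructions", user_id), "user_instructions",
          {"memory": "Include specific local businesses."})
```

Separate namespaces keep each memory type independently searchable while still scoping everything to one user.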

In [None]:
from typing import TypedDict, Literal

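# ================== UpdateMemory tool - the agent's memory-update decision ==================

class UpdateMemory(TypedDict):
    """
    Tool the agent uses to decide which type of memory to update

    Agent architecture notes:
    - ReAct agent: Reason -> Act
    - The agent needs tools to carry out concrete actions
    - UpdateMemory is a "meta tool" used to route to the different memory-update nodes

    Workflow:
    1. The agent (task_mAIstro) analyzes the user's message
    2. It decides whether memory needs updating
    3. If so, it calls the UpdateMemory tool with an update_type
    4. The graph's router (route_message) routes on update_type:
       - 'user' -> update_profile node
       - 'todo' -> update_todos node
       - 'instructions' -> update_instructions node

    The three memory types:

    1. 'user' - user profile (semantic memory)
       - Content: personal information (name, location, job, family, ...)
       - Schema: Profile (Pydantic model)
       - Storage: ("profile", user_id) namespace
       - Character: a single object, updated over time
       - Example: "My name is Lance. I live in SF with my wife."

    2. 'todo' - ToDo list (semantic memory - collection)
       - Content: a list of tasks, each with details
       - Schema: ToDo (Pydantic model)
       - Storage: ("todo", user_id) namespace
       - Character: many objects that can be added and updated
       - Example: "Book swim lessons for the baby"

    3. 'instructions' - instructions for creating ToDos (procedural memory)
       - Content: the user's preferences for how ToDos are created
       - Schema: free text (stored as {'memory': str})
       - Storage: ("instructions", user_id) namespace
       - Character: a single object that guides the agent's behavior
       - Example: "Include specific local businesses when creating tasks"

    Why this tool is needed:
    - Lets the agent decide: memories are saved based on context, not on every turn
    - Multi-type memory management: one agent manages three different memory types
    - Clear routing: update_type states exactly which memory to update
    """

    # Literal type hint: restricts update_type to one of these three values
    # Python note: Literal is checked by static type checkers (e.g. mypy), not enforced at runtime
    update_type: Literal['user', 'todo', 'instructions']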


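## Graph definition

We add a simple router, `route_message`, that reads the `update_type` of the `UpdateMemory` tool call and routes to the matching update node, or ends the turn when no tool call was made.

Updates to the ToDo collection are handled by `Trustcall` in the `update_todos` node, as before!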
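The routing decision can be sketched in a few lines. This is an illustration under stated assumptions, not the notebook's `route_message`: the last message is modeled as a plain dict rather than a LangChain message object, `END` is a placeholder string for the graph's end sentinel, and `route` is a hypothetical name. The node names come from the `UpdateMemory` docstring.

```python
END = "__end__"  # placeholder for the graph's end sentinel (assumption)

# update_type -> node name, as listed in the UpdateMemory docstring
ROUTES = {
    "user": "update_profile",
    "todo": "update_todos",
    "instructions": "update_instructions",
}


def route(last_message: dict) -> str:
    """Pick the next node from the tool calls on the last AI message."""
    tool_calls = last_message.get("tool_calls") or []
    if not tool_calls:
        # No tool call: the agent simply answered, so the turn ends
        return END
    update_type = tool_calls[0]["args"]["update_type"]
    if update_type not in ROUTES:
        raise ValueError(f"unknown update_type: {update_type!r}")
    return ROUTES[update_type]


next_node = route({"tool_calls": [{"name": "UpdateMemory",
                                   "args": {"update_type": "todo"}}]})
```

Only the first tool call is inspected, which fits an agent that binds its tool with `parallel_tool_calls=False`.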

In [None]:
import uuid
from IPython.display import Image, display

from datetime import datetime
from trustcall import create_extractor
from typing import Optional
from pydantic import BaseModel, Field

from langchain_core.runnables import RunnableConfig
from langchain_core.messages import merge_message_runs, HumanMessage, SystemMessage

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, MessagesState, END, START
from langgraph.store.base import BaseStore
from langgraph.store.memory import InMemoryStore

from langchain_openai import ChatOpenAI

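# ================== Initialize the model ==================

# Use GPT-4o as the core model
# - temperature=0: reduces randomness so output is more stable and predictable
model = ChatOpenAI(model="gpt-4o", temperature=0)

# ================================================================================
# Part 1: Memory schemas
# ================================================================================

"""
The memory agent manages three types of long-term memory:

1. Profile (user information) - semantic memory
   - a single object, updated over time
   - stored in the ("profile", user_id) namespace
   - e.g. name, location, job, family members

2. ToDo (task list) - semantic memory (collection)
   - many objects that can be added and updated
   - stored in the ("todo", user_id) namespace
   - each task carries details (task, time, deadline, solutions, status)

3. Instructions (how to create ToDos) - procedural memory
   - a single object that guides the agent's behavior
   - stored in the ("instructions", user_id) namespace
   - e.g. "Include specific local businesses"

These memories live in the Store (InMemoryStore) and are shared across threads.
InMemoryStore keeps them in process memory only, so they are lost when the process exits.
"""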

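# ==================== Profile Schema ====================

class Profile(BaseModel):
    """
    User profile schema - stores personal information about the user

    Design:
    - Every field has a default: the user may provide information gradually
    - default=None: scalar fields stay None until provided
    - default_factory=list: list fields default to an empty list

    How it differs from ToDo:
    - Profile: a single object, updated over time (no enable_inserts)
    - ToDo: many objects, new ones can be added (enable_inserts=True)

    Pydantic notes:
    - BaseModel: Pydantic base class providing validation and serialization
    - Field: attaches metadata (description) to a field
    - Optional[str]: the field may be a str or None
    """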
    name: Optional[str] = Field(
        description="The user's name",
        default=None
    )
    location: Optional[str] = Field(
        description="The user's location",
        default=None
    )
    job: Optional[str] = Field(
        description="The user's job",
        default=None
    )
    connections: list[str] = Field(
        description="Personal connection of the user, such as family members, friends, or coworkers",
        default_factory=list  # defaults to an empty list
    )
    interests: list[str] = Field(
        description="Interests that the user has",
        default_factory=list
    )

# ==================== ToDo Schema ====================

class ToDo(BaseModel):
    """
    ToDo schema - stores the details of a single task

    Design:
    - task: required
    - time_to_complete: Optional[int] with no default, so the key is required but may be None;
      the agent estimates it
    - deadline: optional - a deadline given by the user
    - solutions: list that defaults to empty; when provided it must hold at least one item
    - status: Literal with a default - the task's state

    Why a solutions field:
    - pushes the agent to give specific, actionable suggestions
    - can include local businesses, service providers, concrete options
    - e.g. ["Contact La Petite Baleen Swim School", "Check with SF Recreation"]

    Why datetime:
    - deadline is a datetime rather than a str
    - makes date arithmetic and comparison easy
    - Pydantic parses ISO 8601 strings from the tool call into datetime

    Pydantic notes:
    - Literal: restricts a field to a fixed set of values
    - min_length=1: a provided solutions list needs at least one element
      (Pydantic v2 name; the v1 name min_items is deprecated)
    - datetime: date/time type from the Python standard library
    """
    task: str = Field(
        description="The task to be completed."
    )
    time_to_complete: Optional[int] = Field(
        description="Estimated time to complete the task (minutes)."
    )
    deadline: Optional[datetime] = Field(
        description="When the task needs to be completed by (if applicable)",
        default=None
    )
    solutions: list[str] = Field(
        description="List of specific, actionable solutions (e.g., specific ideas, service providers, or concrete options relevant to completing the task)",
        min_length=1,  # at least one solution when the list is provided
        default_factory=list
    )
    status: Literal["not started", "in progress", "done", "archived"] = Field(
        description="Current status of the task",
        default="not started"  # new tasks start as "not started"
    )

# ================== Create the Profile extractor ==================

"""
Why Profile and ToDo use different extractors:

Profile extractor:
- does not need enable_inserts (there is only one profile)
- always updates the same object

ToDo extractor:
- needs enable_inserts=True (several todos can be added)
- is created inside the update_todos node (it needs the Spy)

Note:
- profile_extractor is created here, globally
- todo_extractor is created in the update_todos node (so a spy listener can be attached)
"""

profile_extractor = create_extractor(
    model,
    tools=[Profile],  # use only the Profile schema
    tool_choice="Profile",  # force the Profile tool
    # Note: enable_inserts is not needed (Profile is a single object)
)

# ================================================================================
# Part 2: Prompt templates
# ================================================================================

# ==================== Main agent system prompt ====================

MODEL_SYSTEM_MESSAGE = """You are a helpful chatbot. 

You are designed to be a companion to a user, helping them keep track of their ToDo list.

You have a long term memory which keeps track of three things:
1. The user's profile (general information about them) 
2. The user's ToDo list
3. General instructions for updating the ToDo list

Here is the current User Profile (may be empty if no information has been collected yet):
<user_profile>
{user_profile}
</user_profile>

Here is the current ToDo List (may be empty if no tasks have been added yet):
<todo>
{todo}
</todo>

Here are the current user-specified preferences for updating the ToDo list (may be empty if no preferences have been specified yet):
<instructions>
{instructions}
</instructions>

Here are your instructions for reasoning about the user's messages:

1. Reason carefully about the user's messages as presented below. 

2. Decide whether any of your long-term memory should be updated:
- If personal information was provided about the user, update the user's profile by calling UpdateMemory tool with type `user`
- If tasks are mentioned, update the ToDo list by calling UpdateMemory tool with type `todo`
- If the user has specified preferences for how to update the ToDo list, update the instructions by calling UpdateMemory tool with type `instructions`

3. Tell the user that you have updated your memory, if appropriate:
- Do not tell the user you have updated the user's profile
- Tell the user when you update the todo list
- Do not tell the user that you have updated instructions

4. Err on the side of updating the todo list. No need to ask for explicit permission.

5. Respond naturally to the user after a tool call was made to save memories, or if no tool call was made."""

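"""
Reading MODEL_SYSTEM_MESSAGE:

Design:
- make the agent the user's ToDo management companion
- the agent decides when to update memory (rather than always updating)
- the three memory types have different update policies

Key instructions:

1. "Reason carefully" - the Reason part of ReAct
   - the agent analyzes the user's message
   - and decides whether it contains information worth saving

2. "Decide whether ... should be updated" - decision logic
   - personal information -> update profile
   - tasks -> update todos
   - preferences -> update instructions
   - routing happens by calling the UpdateMemory tool

3. "Tell the user" - user experience
   - profile updates: not announced (avoids noise)
   - ToDo updates: announced (the user should know what was added)
   - instructions updates: not announced (internal setting)

4. "Err on the side of updating" - initiative
   - lean toward saving ToDos (no explicit permission needed)
   - makes the agent more proactive and useful

5. "Respond naturally" - conversational flow
   - respond naturally after a tool call
   - keep the conversation from feeling mechanical

Prompt engineering techniques:
- XML tags (<user_profile>, <todo>, <instructions>) clearly delimit content
- a numbered list gives clear reasoning steps
- "if/then" structure makes the conditions explicit
- concrete do/don't guidance
"""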

# ==================== Trustcall system prompt ====================

TRUSTCALL_INSTRUCTION = """Reflect on the following interaction. 

Use the provided tools to retain any necessary memories about the user. 

Use parallel tool calling to handle updates and insertions simultaneously.

System Time: {time}"""

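"""
Reading TRUSTCALL_INSTRUCTION:

Purpose:
- tells the Trustcall extractor how to process the conversation
- extract and update memories

Key points:

1. "Reflect on the following interaction"
   - Trustcall analyzes the whole conversation history
   - and extracts information worth saving

2. "Use the provided tools"
   - the tools are Pydantic schemas (Profile, ToDo)
   - Trustcall calls them to create/update memories

3. "Use parallel tool calling"
   - handles several updates and insertions in one model call
   - more efficient
   - e.g. update one ToDo and create another at the same time

4. "System Time: {time}"
   - gives the current time as context
   - used for time-related fields such as deadline
   - formatted with datetime.now().isoformat()

Why a separate Trustcall instruction:
- the agent's system prompt handles user interaction and decisions
- the Trustcall instruction focuses on memory extraction and updating
- separating the two responsibilities improves accuracy
"""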

# ==================== Instructions update prompt ====================

CREATE_INSTRUCTIONS = """Reflect on the following interaction.

Based on this interaction, update your instructions for how to update ToDo list items. 

Use any feedback from the user to update how they like to have items added, etc.

Your current instructions are:

<current_instructions>
{current_instructions}
</current_instructions>"""

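"""
Reading CREATE_INSTRUCTIONS:

Purpose:
- lets the agent learn the user's preferences
- adapts how ToDos are created
- implements procedural memory

Key points:

1. "Reflect on the following interaction"
   - analyze the user's feedback and instructions
   - extract information about preferences

2. "Update your instructions"
   - revise the existing instructions rather than start over
   - accumulate what is learned about the user's preferences

3. "Use any feedback"
   - the user may give explicit feedback
   - e.g. "Include specific local businesses"
   - e.g. "Add an estimated completion time"

4. current_instructions is provided as context
   - the model sees the existing instructions
   - earlier preferences are not lost
   - updates are incremental rather than rewrites

Use case:
- User: "When creating ToDos, include specific local businesses"
- Agent: calls UpdateMemory(update_type='instructions')
- update_instructions node: updates the instructions with this prompt
- Next ToDo: the agent follows the preference

Procedural vs semantic memory:
- semantic memory (Profile, ToDo): factual information
- procedural memory (Instructions): knowledge of how to perform a task
"""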

# ================================================================================
# Part 3: Node definitions - the core logic of the graph
# ================================================================================

"""
The memory agent graph has 4 main nodes:

1. task_mAIstro (main agent node)
   - interacts with the user
   - loads all three memories
   - decides whether to call the UpdateMemory tool

2. update_profile (profile update node)
   - uses Trustcall to update user information
   - saves to the ("profile", user_id) namespace

3. update_todos (todo update node)
   - uses Trustcall to update the ToDo list
   - uses the Spy to capture changes
   - saves to the ("todo", user_id) namespace

4. update_instructions (instructions update node)
   - uses the model to generate new instructions
   - saves to the ("instructions", user_id) namespace

+ 1 conditional edge:
- route_message: routes to the matching node based on UpdateMemory's update_type
"""

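# ==================== Node 1: task_mAIstro ====================

def task_mAIstro(state: MessagesState, config: RunnableConfig, store: BaseStore):
    """
    Main agent node - task_mAIstro

    Responsibilities:
    1. Load all three long-term memories from the store
    2. Format the memories into the system prompt
    3. Call the model to generate a response
    4. Decide whether to call the UpdateMemory tool

    Args:
        state: MessagesState - holds the conversation history
        config: RunnableConfig - holds configuration (user_id, thread_id)
        store: BaseStore - long-term memory store

    Returns:
        A dict with a "messages" key whose value is the model's response

    Flow:
    user message -> load memories -> format prompt -> model responds -> return response
                                                        |
                                                        v
                                            (may call the UpdateMemory tool)

    LangGraph notes:
    - Node functions take the state and may also declare config and store parameters,
      which LangGraph injects; the nodes here use all three
    - state: current state of the graph
    - config: runtime configuration (thread_id, user_id, ...)
    - store: storage shared across threads
    - The dict a node returns updates the state (here, "messages")
    """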

    # ========== Step 1: get the user_id ==========
    # Python notes:
    # - config["configurable"]: the user-supplied configuration
    # - used to build namespaces that separate each user's memories
    user_id = config["configurable"]["user_id"]

    # ========== Step 2: load the Profile memory ==========
    # Why search rather than get:
    # - search(): returns every item under the namespace
    # - get(): requires the exact key
    # - there is only one profile, but we do not know its key (it may be a UUID)

    namespace = ("profile", user_id)
    memories = store.search(namespace)

    # Handle the empty case
    if memories:
        # memories[0]: the first (and only) profile
        # .value: the stored value (the Profile as a dict)
        user_profile = memories[0].value
    else:
        # No profile yet: use None
        # The system prompt notes that it "may be empty"
        user_profile = None

    # ========== Step 3: load the ToDo memories ==========
    # Collection pattern: there may be several ToDos

    namespace = ("todo", user_id)
    memories = store.search(namespace)

    # Python notes:
    # - generator expression: (expression for item in iterable)
    # - "\n".join(): joins strings with newlines
    # - f-string: f"{mem.value}" formats the value as a string

    # Format all ToDos as a multi-line string
    # Example output:
    # {'task': 'Book swim lessons', ...}
    # {'task': 'Fix Yale lock', ...}
    todo = "\n".join(f"{mem.value}" for mem in memories)

    # ========== Step 4: load the Instructions memory ==========
    # Instructions are a single object, like Profile

    namespace = ("instructions", user_id)
    memories = store.search(namespace)

    if memories:
        instructions = memories[0].value
    else:
        # No instructions yet: use an empty string
        instructions = ""

    # ========== Step 5: format the system prompt ==========
    # Insert the three memories into the MODEL_SYSTEM_MESSAGE template
    system_msg = MODEL_SYSTEM_MESSAGE.format(
        user_profile=user_profile,
        todo=todo,
        instructions=instructions
    )

    # ========== Step 6: call the model ==========
    # LangChain notes:
    # - bind_tools(): binds tools to the model
    # - parallel_tool_calls=False: at most one tool call per response
    #   (so UpdateMemory calls happen one at a time)

    # Why only UpdateMemory is bound:
    # - task_mAIstro only decides which memory to update
    # - the actual extraction is done by the update nodes
    # - UpdateMemory is a "routing tool" used for the decision

    # Message list:
    # [SystemMessage(system prompt), ...conversation history...]
    response = model.bind_tools(
        [UpdateMemory],
        parallel_tool_calls=False  # one tool call at a time
    ).invoke([SystemMessage(content=system_msg)] + state["messages"])

    # ========== Step 7: return the response ==========
    # LangGraph notes:
    # - keys of the returned dict must be fields of the state schema
    # - MessagesState has a "messages" field
    # - returned messages are appended to state["messages"]
    return {"messages": [response]}

# ==================== Node 2: update_profile ====================

def update_profile(state: MessagesState, config: RunnableConfig, store: BaseStore):
    """
    Profile update node

    Responsibilities:
    1. Load the existing profile from the store
    2. Format it the way Trustcall expects
    3. Call profile_extractor to update the profile
    4. Save the updated profile to the store
    5. Return a ToolMessage to task_mAIstro

    Flow:
    task_mAIstro calls UpdateMemory(update_type='user')
                    ↓
            route_message routes to this node
                    ↓
            load the existing profile
                    ↓
            call the Trustcall extractor
                    ↓
            save the updated profile
                    ↓
            return a ToolMessage
                    ↓
            back to task_mAIstro

    Why return a ToolMessage:
    - task_mAIstro called the UpdateMemory tool
    - chat model APIs such as OpenAI's expect every tool call to be answered by a matching tool message
    - it tells the model that the tool has finished
    - task_mAIstro can then produce its final response
    """
    
    # ========== 步骤 1: 获取 user_id ==========
    user_id = config["configurable"]["user_id"]

    # ========== 步骤 2: 定义 namespace ==========
    # Profile 存储在 ("profile", user_id)
    namespace = ("profile", user_id)

    # ========== 步骤 3: 检索现有 Profile ==========
    existing_items = store.search(namespace)

    # ========== 步骤 4: 格式化为 Trustcall 格式 ==========
    # Trustcall 的 "existing" 参数需要特定格式:
    # List[(key, tool_name, value)]
    #
    # - key: 文档的唯一标识符 (UUID 或自定义 ID)
    # - tool_name: schema 的名称 (这里是 "Profile")
    # - value: 文档的内容 (字典)
    #
    # 为什么需要这个格式:
    # - Trustcall 需要知道哪些文档已存在
    # - 对于已存在的文档,使用 PatchDoc 更新
    # - 对于不存在的文档,创建新的
    
    tool_name = "Profile"
    
    # Python 知识点:
    # - 条件表达式: value if condition else other_value
    # - 列表推导式: [(expr) for item in list]
    existing_memories = (
        [(existing_item.key, tool_name, existing_item.value)
         for existing_item in existing_items]
        if existing_items  # 如果有现有项目
        else None  # 否则为 None
    )

    # ========== Step 5: Prepare the input messages for Trustcall ==========
    # LangChain notes:
    # - merge_message_runs(): merges consecutive messages of the same type
    #   - e.g. [HumanMessage, HumanMessage] -> [HumanMessage]
    #   - avoids API errors (some APIs reject consecutive messages of the same type)
    #
    # Why state["messages"][:-1]:
    # - state["messages"][-1] is task_mAIstro's response
    # - it carries the UpdateMemory tool call
    # - Trustcall does not need it, only the conversation before it
    # - we only extract information the user provided
    
    # Format the Trustcall instruction with the current time
    TRUSTCALL_INSTRUCTION_FORMATTED = TRUSTCALL_INSTRUCTION.format(
        time=datetime.now().isoformat()  # ISO 8601, e.g. 2024-11-04T13:30:00.123456
    )
    
    # Build the message list:
    # [SystemMessage(Trustcall instruction), ...conversation history (without the last message)...]
    updated_messages = list(
        merge_message_runs(
            messages=[SystemMessage(content=TRUSTCALL_INSTRUCTION_FORMATTED)] +
                     state["messages"][:-1]
        )
    )

    # ========== Step 6: Call the Trustcall extractor ==========
    # profile_extractor was created at module level
    # Arguments:
    # - messages: the conversation to extract from
    # - existing: the existing profile (if any)
    result = profile_extractor.invoke({
        "messages": updated_messages,
        "existing": existing_memories
    })

    # ========== Step 7: Save to the store ==========
    # Python notes:
    # - zip(): iterates over several sequences in parallel
    #   - list(zip([1,2], ['a','b'])) -> [(1,'a'), (2,'b')]
    #
    # Structure of result:
    # - result["responses"]: list of Profile objects
    # - result["response_metadata"]: list of metadata dicts (may include the document ID)
    #
    # Why json_doc_id matters:
    # - For an update, json_doc_id is the ID of the existing document
    # - For a creation there is no json_doc_id, so a new UUID is generated
    # - rmeta.get("json_doc_id", str(uuid.uuid4()))
    #   - uses json_doc_id when present
    #   - otherwise falls back to a new UUID
    
    for r, rmeta in zip(result["responses"], result["response_metadata"]):
        store.put(
            namespace,  # ("profile", user_id)
            rmeta.get("json_doc_id", str(uuid.uuid4())),  # key
            r.model_dump(mode="json"),  # value (the Profile as a dict)
        )
    
    # ========== Step 8: Return a ToolMessage ==========
    # LangChain notes:
    # - ToolMessage: the message type for a tool's result
    # - Written as a dict, it needs:
    #   - role: "tool"
    #   - content: the result (a string)
    #   - tool_call_id: the ID of the tool call being answered
    #
    # How tool_call_id is obtained:
    # - state['messages'][-1]: task_mAIstro's last message
    # - .tool_calls: all tool calls in that message
    # - [0]: the first tool call (UpdateMemory)
    # - ['id']: the ID of that tool call
    
    tool_calls = state['messages'][-1].tool_calls
    
    # Return a short confirmation
    # The agent does not tell the user that the profile was updated (per the system prompt)
    return {"messages": [
        {
            "role": "tool",
            "content": "updated profile",  # short confirmation
            "tool_call_id": tool_calls[0]['id']  # links back to the tool call
        }
    ]}

# ==================== Node 3: update_todos ====================

def update_todos(state: MessagesState, config: RunnableConfig, store: BaseStore):
    """
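    Update ToDo list node.
    
    Responsibilities:
    1. Load existing todos from the store
    2. Create todo_extractor (with a Spy listener)
    3. Call the extractor to update the todos
    4. Save the updated todos to the store
    5. Use the Spy to extract what changed
    6. Return a detailed ToolMessage to task_mAIstro
    
    Main differences from update_profile:
    1. Uses enable_inserts=True (allows new todos to be added)
    2. Uses a Spy listener to capture changes
    3. Returns a detailed change description (rather than a plain "updated")
    
    Why todo_extractor is created here:
    - Each invocation needs a fresh Spy instance
    - The Spy has to be attached as a listener on the extractor
    - profile_extractor needs no Spy, so it can be created once at module level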
    """
    
    # ========== Step 1: Get user_id ==========
    user_id = config["configurable"]["user_id"]

    # ========== Step 2: Define the namespace ==========
    namespace = ("todo", user_id)

    # ========== Step 3: Retrieve existing ToDos ==========
    existing_items = store.search(namespace)

    # ========== Step 4: Convert to Trustcall's format ==========
    tool_name = "ToDo"  # note: "ToDo" here, not "Profile"
    
    existing_memories = (
        [(existing_item.key, tool_name, existing_item.value)
         for existing_item in existing_items]
        if existing_items
        else None
    )

    # ========== Step 5: Prepare the input messages for Trustcall ==========
    TRUSTCALL_INSTRUCTION_FORMATTED = TRUSTCALL_INSTRUCTION.format(
        time=datetime.now().isoformat()
    )
    
    updated_messages = list(
        merge_message_runs(
            messages=[SystemMessage(content=TRUSTCALL_INSTRUCTION_FORMATTED)] + 
                     state["messages"][:-1]
        )
    )

    # ========== Step 6: Initialize the Spy ==========
    # Why a new Spy every time:
    # - Spy.called_tools accumulates tool calls
    # - Each invocation needs an empty Spy
    # - This keeps tool calls from different invocations apart
    spy = Spy()
    
    # ========== Step 7: Create the ToDo extractor (with the Spy) ==========
    # Why it is not created at module level:
    # - It needs a new Spy listener attached
    # - The Spy differs on every invocation
    
    # Differences from profile_extractor:
    # - enable_inserts=True: allows new todos to be added
    # - .with_listeners(on_end=spy): attaches the Spy listener
    todo_extractor = create_extractor(
        model,
        tools=[ToDo],  # use the ToDo schema
        tool_choice=tool_name,  # force the ToDo tool
        enable_inserts=True  # ✅ key difference: allow inserting new memories
    ).with_listeners(on_end=spy)  # ✅ attach the Spy listener

    # ========== Step 8: Call the Trustcall extractor ==========
    result = todo_extractor.invoke({
        "messages": updated_messages,
        "existing": existing_memories
    })

    # ========== Step 9: Save to the store ==========
    # Same logic as in update_profile
    for r, rmeta in zip(result["responses"], result["response_metadata"]):
        store.put(
            namespace,
            rmeta.get("json_doc_id", str(uuid.uuid4())),
            r.model_dump(mode="json"),
        )
        
    # ========== Step 10: Extract the tool calls (via the Spy) ==========
    # This is where update_todos differs most from update_profile
    
    # Get the tool calls so we can read tool_call_id
    tool_calls = state['messages'][-1].tool_calls

    # ✅ Use the extract_tool_info function defined earlier
    # - spy.called_tools: every tool call the Spy captured
    # - tool_name: "ToDo"
    # - returns a formatted description of the changes
    todo_update_msg = extract_tool_info(spy.called_tools, tool_name)
    
    # Example output:
    # "New ToDo created:
    #  Content: {'task': 'Book swim lessons', ...}
    # 
    #  Document 0 updated:
    #  Plan: Add a deadline for the task
    #  Added content: 2024-11-30T23:59:59"
    
    # ========== Step 11: Return a detailed ToolMessage ==========
    # Difference from update_profile:
    # - content is not a plain "updated todos"
    # - it is a detailed description of the changes
    # - task_mAIstro can relay this information to the user
    return {"messages": [
        {
            "role": "tool",
            "content": todo_update_msg,  # ✅ detailed change description
            "tool_call_id": tool_calls[0]['id']
        }
    ]}

# ==================== Node 4: update_instructions ====================

def update_instructions(state: MessagesState, config: RunnableConfig, store: BaseStore):
    """
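    Update Instructions node.
    
    Responsibilities:
    1. Load the existing instructions from the store
    2. Fill in the CREATE_INSTRUCTIONS prompt
    3. Call the model to generate new instructions
    4. Overwrite the instructions in the store
    5. Return a ToolMessage to task_mAIstro
    
    Main differences from the other update nodes:
    1. **No Trustcall**: calls the model directly
    2. **No Pydantic schema**: instructions are free text
    3. **Overwrite rather than patch**: uses a fixed key
    
    Why Trustcall is not used:
    - Instructions are free-form text
    - No structured extraction is needed
    - The model can produce the updated instructions directly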
    """
    
    # ========== Step 1: Get user_id ==========
    user_id = config["configurable"]["user_id"]
    
    # ========== Step 2: Define the namespace ==========
    namespace = ("instructions", user_id)

    # ========== Step 3: Get the existing instructions ==========
    # Why get rather than search:
    # - Instructions live under a fixed key: "user_instructions"
    # - get(namespace, key) fetches directly by key
    # - this is more efficient than search
    
    # BaseStore.get() returns:
    # - an Item object (with a .value attribute) if the key exists
    # - None if it does not
    existing_memory = store.get(namespace, "user_instructions")
        
    # ========== Step 4: Format the system prompt ==========
    # Insert the existing instructions (if any)
    system_msg = CREATE_INSTRUCTIONS.format(
        current_instructions=existing_memory.value if existing_memory else None
    )
    
    # ========== Step 5: Call the model to generate new instructions ==========
    # LangChain notes:
    # - model.invoke() is called directly, without bind_tools
    # - no tools are needed, only generated text
    #
    # The message list consists of:
    # [
    #   SystemMessage(the CREATE_INSTRUCTIONS prompt),
    #   ...conversation history (without the last message)...,
    #   HumanMessage("Please update the instructions based on the conversation")
    # ]
    #
    # Why the extra HumanMessage:
    # - it states explicitly what the model should do
    # - it prompts the model to produce the instructions
    new_memory = model.invoke(
        [SystemMessage(content=system_msg)] +
        state['messages'][:-1] +  # conversation history (without the UpdateMemory tool call)
        [HumanMessage(content="Please update the instructions based on the conversation")]
    )
    
    # new_memory is an AIMessage
    # new_memory.content is the generated instructions text

    # ========== Step 6: Save to the store ==========
    # Key differences:
    # - uses a fixed key: "user_instructions"
    # - stored as {"memory": str}
    # - overwritten every time (not appended or patched)
    
    key = "user_instructions"
    
    # BaseStore.put() arguments:
    # - namespace: ("instructions", user_id)
    # - key: "user_instructions"
    # - value: {"memory": instructions_text}
    store.put(namespace, key, {"memory": new_memory.content})
    
    # ========== Step 7: Return a ToolMessage ==========
    tool_calls = state['messages'][-1].tool_calls
    
    # Return a short confirmation
    # The agent does not tell the user that the instructions were updated (per the system prompt)
    return {"messages": [
        {
            "role": "tool",
            "content": "updated instructions",
            "tool_call_id": tool_calls[0]['id']
        }
    ]}

# ================================================================================
# Part 4: Conditional Edge - routing logic
# ================================================================================

# ==================== Conditional Edge: route_message ====================

def route_message(
    state: MessagesState,
    config: RunnableConfig,
    store: BaseStore
) -> Literal[END, "update_todos", "update_instructions", "update_profile"]:
    """
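    Conditional edge: route to the matching node based on the UpdateMemory tool call.
    
    Responsibilities:
    1. Check whether task_mAIstro called the UpdateMemory tool
    2. If it did, route to the node that matches update_type
    3. If it did not, end the run
    
    Args:
        state: current graph state
        config: runtime configuration
        store: persistent store
    
    Returns:
        END or a node name:
        - END: stop graph execution
        - "update_profile": route to the update_profile node
        - "update_todos": route to the update_todos node
        - "update_instructions": route to the update_instructions node
    
    LangGraph notes:
    - Conditional edge: the function's return value picks the next node
    - The Literal return annotation lists the possible destinations;
      LangGraph reads it when drawing the graph
    - END: LangGraph's special node name that marks the end of the run
    
    Workflow:
    task_mAIstro -> route_message -> (END / update_profile / update_todos / update_instructions)
                                            |              |              |               |
                                            v              v              v               v
                                           end    to task_mAIstro  to task_mAIstro  to task_mAIstro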
    """
    
    # ========== Step 1: Get the last message ==========
    # state['messages'][-1]: task_mAIstro's response
    message = state['messages'][-1]
    
    # ========== Step 2: Check for tool calls ==========
    # AIMessage notes:
    # - .tool_calls: list of all tool calls in the message
    # - with no tool calls the list is empty (len() == 0)
    # - otherwise each element is a dict:
    #   {
    #     'name': 'UpdateMemory',
    #     'args': {'update_type': 'user'},
    #     'id': 'call_xxx'
    #   }
    
    if len(message.tool_calls) == 0:
        # No tool calls -> the agent chose not to update memory -> end the run
        return END
    else:
        # ========== Step 3: Extract the tool call ==========
        # Take only the first tool call (parallel_tool_calls=False)
        tool_call = message.tool_calls[0]
        
        # ========== Step 4: Route on update_type ==========
        # tool_call['args']: the UpdateMemory argument dict
        # tool_call['args']['update_type']: 'user' / 'todo' / 'instructions'
        
        if tool_call['args']['update_type'] == "user":
            # user information -> update_profile
            return "update_profile"
        
        elif tool_call['args']['update_type'] == "todo":
            # ToDo task -> update_todos
            return "update_todos"
        
        elif tool_call['args']['update_type'] == "instructions":
            # preferences -> update_instructions
            return "update_instructions"
        
        else:
            # Unexpected update_type -> raise an error
            # This should not happen (update_type is declared as a Literal)
            raise ValueError(f"Unknown update_type: {tool_call['args']['update_type']}")

# ================================================================================
# Part 5: Graph construction - assembling the Memory Agent
# ================================================================================

"""
Graph architecture overview:

        START
          ↓
    task_mAIstro  ←──────┐
          ↓               │
    route_message         │
       ↙  ↓  ↘           │
      /   |   \          │
     /    |    \         │
update  update  update   │
profile todos  instructions
     \    |    /         │
      \   |   /          │
       ↘  ↓  ↙           │
    task_mAIstro ─────────┘
          ↓
         END

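Notes:
1. Execution starts at START and enters task_mAIstro
2. task_mAIstro analyzes the messages and may call the UpdateMemory tool
3. route_message checks for a tool call:
   - none -> END
   - present -> route on update_type
4. The selected update node updates its type of memory
5. When the update node finishes, control returns to task_mAIstro
6. task_mAIstro generates the final response, which passes through route_message again
7. This time there is no tool call (it has been handled), so route_message returns END

Why the update nodes return to task_mAIstro:
- An update node returns only a ToolMessage
- A ToolMessage is not a user-facing response
- task_mAIstro has to turn the ToolMessage into a reply for the user
- For example: "I've updated your ToDo list with..."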
"""

# ========== Step 1: Create the StateGraph ==========
# LangGraph notes:
# - StateGraph: LangGraph's core graph-builder class
# - Argument: the state schema (here MessagesState)
# - MessagesState: a predefined state with a "messages" field
builder = StateGraph(MessagesState)

# ========== Step 2: Add nodes ==========
# LangGraph notes:
# - add_node(name, function): adds a node
#   - name: optional, defaults to the function name
#   - function: the node function
# - Node function signature: (state, config, store) -> dict
builder.add_node(task_mAIstro)  # main agent node
builder.add_node(update_todos)  # updates todos
builder.add_node(update_profile)  # updates the profile
builder.add_node(update_instructions)  # updates instructions

# ========== Step 3: Add edges ==========
# LangGraph notes:
# - add_edge(from, to): adds a fixed edge
#   - the graph always flows from the from node to the to node
# - add_conditional_edges(from, function): adds a conditional edge
#   - the function's return value decides which node comes next

# START -> task_mAIstro: always begin at task_mAIstro
builder.add_edge(START, "task_mAIstro")

# task_mAIstro -> route_message: route_message decides the next step
# route_message may return: END / "update_profile" / "update_todos" / "update_instructions"
builder.add_conditional_edges("task_mAIstro", route_message)

# update_todos -> task_mAIstro: after updating todos, return to task_mAIstro
builder.add_edge("update_todos", "task_mAIstro")

# update_profile -> task_mAIstro: after updating the profile, return to task_mAIstro
builder.add_edge("update_profile", "task_mAIstro")

# update_instructions -> task_mAIstro: after updating instructions, return to task_mAIstro
builder.add_edge("update_instructions", "task_mAIstro")

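# ========== Step 4: Initialize storage ==========

# Long-term memory (across threads) - InMemoryStore
# LangGraph notes:
# - InMemoryStore: an in-memory key-value store
# - Supports namespaces for organizing different kinds of memory
# - put(namespace, key, value): store an item
# - get(namespace, key): fetch an item
# - search(namespace): list the items under a namespace
#
# Why the name "across_thread_memory":
# - This is long-term memory that spans threads (sessions)
# - A newly created thread can still read this store
# - Example: the user's profile is shared across all sessions
across_thread_memory = InMemoryStore()

# Short-term memory (within a thread) - MemorySaver
# LangGraph notes:
# - MemorySaver: a LangGraph checkpointer
# - Saves the graph's execution state (including messages)
# - Each thread_id has its own checkpoints
# - Used for conversation history and session state
#
# Why the name "within_thread_memory":
# - This is short-term memory, scoped to a single thread (session)
# - Each thread has its own conversation history
# - Switching to a new thread does not carry the history over
within_thread_memory = MemorySaver()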

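# ========== Step 5: Compile the graph ==========
# LangGraph notes:
# - compile(): turns the builder into an executable graph
# - Arguments:
#   - checkpointer: short-term memory (conversation history)
#   - store: long-term memory (data shared across sessions)
# - Returns: a compiled graph (CompiledStateGraph) that supports invoke() and stream()
graph = builder.compile(
    checkpointer=within_thread_memory,  # conversation history
    store=across_thread_memory  # long-term memory
)

# ========== Step 6: Visualize the graph ==========
# LangGraph notes:
# - get_graph(): returns a drawable representation of the graph
# - xray=1: also shows the internal structure of nested nodes
# - draw_mermaid_png(): renders the Mermaid diagram as a PNG
#
# IPython notes:
# - display(): shows content in a Jupyter notebook
# - Image(): wraps image data for display
display(Image(graph.get_graph(xray=1).draw_mermaid_png()))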

"""
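Summary: the complete Memory Agent architecture

1. **Three memory types**:
   - Profile: user information (a single object)
   - ToDo: task list (a collection)
   - Instructions: preferences for creating ToDos (procedural memory)

2. **Two memory layers**:
   - Short-term memory (MemorySaver): conversation history, within a thread
   - Long-term memory (InMemoryStore): data shared across threads

3. **Four core nodes**:
   - task_mAIstro: main agent; loads memories and decides what to do
   - update_profile: updates user information
   - update_todos: updates the task list (uses the Spy)
   - update_instructions: updates preferences

4. **Routing**:
   - UpdateMemory tool: how the agent expresses its intent to update
   - route_message: routes to the matching node based on update_type

5. **Trustcall integration**:
   - Profile: single object, no enable_inserts
   - ToDo: collection, enable_inserts=True
   - Instructions: no Trustcall, calls the model directly

6. **Visibility (Spy)**:
   - Captures Trustcall's tool calls
   - Extracts the changes so they can be shown to the user
   - Makes memory updates transparent to the user

7. **ReAct pattern**:
   - Reason: task_mAIstro analyzes the messages
   - Act: it calls the UpdateMemory tool
   - The call is routed to an update node, which performs the update
   - Control returns to task_mAIstro, which generates the response

This Memory Agent brings together the concepts from Module 5:
- Profile vs Collection
- Trustcall with/without enable_inserts
- Spy listener for visibility
- Dual memory system (checkpointer + store)
- Multi-type memory management
- ReAct agent architecture

It is a working ToDo assistant that can:
- remember user information
- manage a task list
- learn user preferences
- keep memories across sessions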
"""

In [23]:
# We supply a thread ID for short-term (within-thread) memory
# We supply a user ID for long-term (across-thread) memory 
config = {"configurable": {"thread_id": "1", "user_id": "Lance"}}

# User input to create a profile memory
input_messages = [HumanMessage(content="My name is Lance. I live in SF with my wife. I have a 1 year old daughter.")]

# Run the graph
for chunk in graph.stream({"messages": input_messages}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()


My name is Lance. I live in SF with my wife. I have a 1 year old daughter.
Tool Calls:
  UpdateMemory (call_rOuw3bLYjFFKuSVWsIHF27k5)
 Call ID: call_rOuw3bLYjFFKuSVWsIHF27k5
  Args:
    update_type: user

updated profile

Got it! How can I assist you today, Lance?
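
In the run above, the tool call carried `update_type: user`, so the conditional edge sent execution to `update_profile`. The same decision can be read as a table lookup. The sketch below is a dependency-free illustration, not the notebook's `route_message`: `ROUTES`, `END_SENTINEL`, and `route` are hypothetical names, and tool calls are plain dicts shaped like the entries of `AIMessage.tool_calls`.

```python
# Simplified stand-in for route_message; all names here are illustrative.
END_SENTINEL = "__end__"  # placeholder standing in for LangGraph's END (assumption)

# update_type value -> name of the node that handles it
ROUTES = {
    "user": "update_profile",
    "todo": "update_todos",
    "instructions": "update_instructions",
}

def route(tool_calls: list[dict]) -> str:
    """Return the next node name for a list of tool-call dicts."""
    if not tool_calls:
        # No tool call: the agent chose not to update memory.
        return END_SENTINEL
    # Only the first tool call is considered, as in the notebook.
    update_type = tool_calls[0]["args"]["update_type"]
    try:
        return ROUTES[update_type]
    except KeyError:
        raise ValueError(f"Unknown update_type: {update_type}") from None

print(route([]))
print(route([{"name": "UpdateMemory", "args": {"update_type": "user"}, "id": "call_1"}]))
```

A dict keeps the mapping in one place, at the cost of hiding the possible destinations from a `Literal` return annotation.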


In [24]:
# User input for a ToDo
input_messages = [HumanMessage(content="My wife asked me to book swim lessons for the baby.")]

# Run the graph
for chunk in graph.stream({"messages": input_messages}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()


My wife asked me to book swim lessons for the baby.
Tool Calls:
  UpdateMemory (call_VjLbRpbLqniJ8we2CNKQ0m3P)
 Call ID: call_VjLbRpbLqniJ8we2CNKQ0m3P
  Args:
    update_type: todo

New ToDo created:
Content: {'task': 'Book swim lessons for 1-year-old daughter.', 'time_to_complete': 30, 'solutions': ['Check local swim schools in SF', 'Look for baby swim classes online', 'Ask friends for recommendations'], 'status': 'not started'}

I've added "Book swim lessons for your 1-year-old daughter" to your ToDo list. If you need any help with that, just let me know!


In [25]:
# User input to update instructions for creating ToDos
input_messages = [HumanMessage(content="When creating or updating ToDo items, include specific local businesses / vendors.")]

# Run the graph
for chunk in graph.stream({"messages": input_messages}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()


When creating or updating ToDo items, include specific local businesses / vendors.
Tool Calls:
  UpdateMemory (call_22w3V3Krhjf8WxDeH9YrQILa)
 Call ID: call_22w3V3Krhjf8WxDeH9YrQILa
  Args:
    update_type: instructions

updated instructions

Got it! I'll make sure to include specific local businesses or vendors in San Francisco when creating or updating your ToDo items. Let me know if there's anything else you need!


In [26]:
# Check for updated instructions
user_id = "Lance"

# Search 
for memory in across_thread_memory.search(("instructions", user_id)):
    print(memory.value)

{'memory': '<current_instructions>\nWhen creating or updating ToDo list items for Lance, include specific local businesses or vendors in San Francisco. For example, when adding a task like booking swim lessons, suggest local swim schools or classes in the area.\n</current_instructions>'}


In [27]:
# User input for a ToDo
input_messages = [HumanMessage(content="I need to fix the jammed electric Yale lock on the door.")]

# Run the graph
for chunk in graph.stream({"messages": input_messages}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()


I need to fix the jammed electric Yale lock on the door.
Tool Calls:
  UpdateMemory (call_7ooNemi3d6qWMfjf2g2h97EF)
 Call ID: call_7ooNemi3d6qWMfjf2g2h97EF
  Args:
    update_type: todo

New ToDo created:
Content: {'task': 'Fix the jammed electric Yale lock on the door.', 'time_to_complete': 60, 'solutions': ['Contact a local locksmith in SF', "Check Yale's customer support for troubleshooting", 'Look for repair guides online'], 'status': 'not started'}

Document ed0af900-52fa-4f15-907c-1aed1e17b0ce updated:
Plan: Add specific local businesses or vendors to the solutions for booking swim lessons.
Added content: ['Check local swim schools in SF', 'Look for baby swim classes online', 'Ask friends for recommendations', 'Contact La Petite Baleen Swim School', 'Check with SF Recreation and Parks for classes']

I've added "Fix the jammed electric Yale lock on the door" to your ToDo list. If you need any specific recommendations or help, feel free to ask!


In [28]:
# Namespace for the memory to save
user_id = "Lance"

# Search 
for memory in across_thread_memory.search(("todo", user_id)):
    print(memory.value)

{'task': 'Book swim lessons for 1-year-old daughter.', 'time_to_complete': 30, 'deadline': None, 'solutions': ['Check local swim schools in SF', 'Look for baby swim classes online', 'Ask friends for recommendations', 'Contact La Petite Baleen Swim School', 'Check with SF Recreation and Parks for classes'], 'status': 'not started'}
{'task': 'Fix the jammed electric Yale lock on the door.', 'time_to_complete': 60, 'deadline': None, 'solutions': ['Contact a local locksmith in SF', "Check Yale's customer support for troubleshooting", 'Look for repair guides online'], 'status': 'not started'}
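
The store still holds two items even though the swim-lesson ToDo was modified: the save loop reuses `json_doc_id` as the key, so an update overwrites the existing item instead of adding a new one. A minimal sketch of that behavior follows, with a plain dict standing in for the store. `save_responses` is a hypothetical helper, and the metadata dicts are assumed shapes for illustration.

```python
import uuid

def save_responses(store: dict, namespace: tuple, responses: list[dict], metadata: list[dict]) -> None:
    """Write each document under its existing key, or under a fresh UUID."""
    for doc, meta in zip(responses, metadata):
        # Same fallback as in the notebook: reuse json_doc_id when present.
        key = meta.get("json_doc_id", str(uuid.uuid4()))
        store[(namespace, key)] = doc

store = {}  # toy store keyed by (namespace, key)
ns = ("todo", "Lance")

# First call: a new item, so the metadata carries no json_doc_id.
save_responses(store, ns, [{"task": "Book swim lessons", "deadline": None}], [{"id": "call_a"}])
existing_key = next(iter(store))[1]

# Second call: an update to that item, so json_doc_id names the existing key.
save_responses(
    store,
    ns,
    [{"task": "Book swim lessons", "deadline": "2024-11-30T23:59:59"}],
    [{"id": "call_b", "json_doc_id": existing_key}],
)

print(len(store))  # 1
```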


In [29]:
# User input to update an existing ToDo
input_messages = [HumanMessage(content="For the swim lessons, I need to get that done by end of November.")]

# Run the graph
for chunk in graph.stream({"messages": input_messages}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()


For the swim lessons, I need to get that done by end of November.
Tool Calls:
  UpdateMemory (call_6AbsrTps4EPyD0gKBzkMIC90)
 Call ID: call_6AbsrTps4EPyD0gKBzkMIC90
  Args:
    update_type: todo

Document ed0af900-52fa-4f15-907c-1aed1e17b0ce updated:
Plan: Add a deadline for the swim lessons task to ensure it is completed by the end of November.
Added content: 2024-11-30T23:59:59

I've updated the swim lessons task with a deadline to be completed by the end of November. If there's anything else you need, just let me know!


We can see that Trustcall patched the existing memory:

https://smith.langchain.com/public/4ad3a8af-3b1e-493d-b163-3111aa3d575a/r
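
To make "patching" concrete: rather than regenerating the whole ToDo, a patch changes only the named fields. The sketch below applies a small subset of JSON Patch (RFC 6902) to a flat dict. It is an illustration under stated assumptions, not Trustcall's implementation: `apply_patches` is a hypothetical helper, and the two patch operations are invented to mirror the updates seen in the output above.

```python
import copy

def apply_patches(doc: dict, patches: list[dict]) -> dict:
    """Apply a small subset of JSON Patch (RFC 6902) to a flat document.

    Supported: "replace" and "add" on a top-level field ("/deadline"),
    and "add" with the "/field/-" form, which appends to a list.
    """
    patched = copy.deepcopy(doc)  # leave the caller's document untouched
    for patch in patches:
        parts = patch["path"].lstrip("/").split("/")
        field = parts[0]
        if patch["op"] == "add" and len(parts) == 2 and parts[1] == "-":
            patched[field].append(patch["value"])
        elif patch["op"] in ("add", "replace") and len(parts) == 1:
            patched[field] = patch["value"]
        else:
            raise ValueError(f"Unsupported patch: {patch}")
    return patched

todo = {
    "task": "Book swim lessons for 1-year-old daughter.",
    "time_to_complete": 30,
    "deadline": None,
    "solutions": ["Check local swim schools in SF"],
    "status": "not started",
}

patched = apply_patches(todo, [
    {"op": "replace", "path": "/deadline", "value": "2024-11-30T23:59:59"},
    {"op": "add", "path": "/solutions/-", "value": "Contact La Petite Baleen Swim School"},
])

print(patched["deadline"])
print(patched["solutions"])
```

Fields that no patch mentions, such as `task` and `status`, come through unchanged, which is the property that makes patching cheaper and safer than rewriting the document.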

In [30]:
# User input for a ToDo
input_messages = [HumanMessage(content="Need to call back City Toyota to schedule car service.")]

# Run the graph
for chunk in graph.stream({"messages": input_messages}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()


Need to call back City Toyota to schedule car service.
Tool Calls:
  UpdateMemory (call_tDuYZL7njpwOkg2YMEcf6DDJ)
 Call ID: call_tDuYZL7njpwOkg2YMEcf6DDJ
  Args:
    update_type: todo

New ToDo created:
Content: {'task': 'Call back City Toyota to schedule car service.', 'time_to_complete': 10, 'solutions': ["Find City Toyota's contact number", 'Check car service availability', 'Prepare car details for service scheduling'], 'status': 'not started'}

Document a77482f0-d654-4b41-ab74-d6f2b343a969 updated:
Plan: Add specific local businesses or vendors to the solutions for fixing the jammed electric Yale lock.
Added content: Contact City Locksmith SF

I've added "Call back City Toyota to schedule car service" to your ToDo list. If you need any assistance with that, just let me know!


In [31]:
# Namespace for the memory to save
user_id = "Lance"

# Search 
for memory in across_thread_memory.search(("todo", user_id)):
    print(memory.value)

{'task': 'Book swim lessons for 1-year-old daughter.', 'time_to_complete': 30, 'deadline': '2024-11-30T23:59:59', 'solutions': ['Check local swim schools in SF', 'Look for baby swim classes online', 'Ask friends for recommendations', 'Contact La Petite Baleen Swim School', 'Check with SF Recreation and Parks for classes'], 'status': 'not started'}
{'task': 'Fix the jammed electric Yale lock on the door.', 'time_to_complete': 60, 'deadline': None, 'solutions': ['Contact a local locksmith in SF', "Check Yale's customer support for troubleshooting", 'Look for repair guides online', 'Contact City Locksmith SF', 'Visit SF Lock and Key for assistance'], 'status': 'not started'}
{'task': 'Call back City Toyota to schedule car service.', 'time_to_complete': 10, 'deadline': None, 'solutions': ["Find City Toyota's contact number", 'Check car service availability', 'Prepare car details for service scheduling'], 'status': 'not started'}


Now we can create a new thread.

This starts a new session.

The Profile, ToDos, and Instructions saved to long-term memory remain accessible.
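
The difference between the two memory scopes can be sketched with two toy classes. These are hypothetical stand-ins for illustration only, not LangGraph's `MemorySaver` or `InMemoryStore`: `ToyCheckpointer` keys history by `thread_id`, while `ToyStore` keys items by namespace and key, independent of any thread.

```python
from collections import defaultdict

class ToyCheckpointer:
    """Short-term memory: one message history per thread_id."""

    def __init__(self):
        self._threads = defaultdict(list)

    def append(self, thread_id: str, message: str) -> None:
        self._threads[thread_id].append(message)

    def history(self, thread_id: str) -> list[str]:
        return list(self._threads[thread_id])

class ToyStore:
    """Long-term memory: items keyed by (namespace, key), not by thread."""

    def __init__(self):
        self._items = {}

    def put(self, namespace: tuple, key: str, value: dict) -> None:
        self._items[(namespace, key)] = value

    def search(self, namespace: tuple) -> list[dict]:
        return [v for (ns, _), v in self._items.items() if ns == namespace]

checkpointer = ToyCheckpointer()
store = ToyStore()

# Thread "1": the user mentions a task; it is recorded in both layers.
checkpointer.append("1", "Need to call back City Toyota to schedule car service.")
store.put(("todo", "Lance"), "todo-1", {"task": "Call back City Toyota", "time_to_complete": 10})

# Thread "2": a fresh conversation for the same user.
print(checkpointer.history("2"))        # []
print(store.search(("todo", "Lance")))  # the ToDo saved during thread "1"
```

The new thread starts with an empty history, yet the ToDo saved earlier is still found because the lookup depends only on the namespace and user ID.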

In [32]:
# We supply a thread ID for short-term (within-thread) memory
# We supply a user ID for long-term (across-thread) memory 
config = {"configurable": {"thread_id": "2", "user_id": "Lance"}}

# Chat with the chatbot
input_messages = [HumanMessage(content="I have 30 minutes, what tasks can I get done?")]

# Run the graph
for chunk in graph.stream({"messages": input_messages}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()


I have 30 minutes, what tasks can I get done?

You can work on the following tasks that fit within your 30-minute timeframe:

1. **Book swim lessons for your 1-year-old daughter.** 
   - Estimated time to complete: 30 minutes
   - Solutions include checking local swim schools in SF, looking for baby swim classes online, asking friends for recommendations, contacting La Petite Baleen Swim School, or checking with SF Recreation and Parks for classes.

2. **Call back City Toyota to schedule car service.**
   - Estimated time to complete: 10 minutes
   - Solutions include finding City Toyota's contact number, checking car service availability, and preparing car details for service scheduling.

You can choose either of these tasks to complete within your available time.


In [33]:
# Chat with the chatbot
input_messages = [HumanMessage(content="Yes, give me some options to call for swim lessons.")]

# Run the graph
for chunk in graph.stream({"messages": input_messages}, config, stream_mode="values"):
    chunk["messages"][-1].pretty_print()


Yes, give me some options to call for swim lessons.

Here are some options you can consider for booking swim lessons for your 1-year-old daughter in San Francisco:

1. **La Petite Baleen Swim School**: Known for their baby swim classes, you can contact them to inquire about their schedule and availability.

2. **SF Recreation and Parks**: They often offer swim classes for young children. Check their website or contact them for more information.

3. **Local Swim Schools**: Search for other local swim schools in SF that offer baby swim classes. You might find some good options nearby.

4. **Ask Friends for Recommendations**: Reach out to friends or family in the area who might have experience with swim lessons for young children.

These options should help you get started on booking swim lessons.


Trace: 

https://smith.langchain.com/public/84768705-be91-43e4-8a6f-f9d3cee93782/r

## Studio

![Screenshot 2024-11-04 at 1.00.19 PM.png](https://cdn.prod.website-files.com/65b8cd72835ceeacd4449a53/6732cfb05d9709862eba4e6c_Screenshot%202024-11-11%20at%207.46.40%E2%80%AFPM.png)