# How to add memory to chatbots

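A key feature of chatbots is their ability to use the content of previous conversation turns as context. This state management can take several forms, including:

- Simply stuffing previous messages into a chat model prompt.
- The above, but trimming old messages to reduce the amount of distracting information the model has to deal with.
- More complex modifications, such as synthesizing summaries for long-running conversations.

We'll go into more detail on a few of these techniques below.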

:::note

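This how-to guide previously built a chatbot using [RunnableWithMessageHistory](https://v03.api.js.langchain.com/classes/_langchain_core.runnables.RunnableWithMessageHistory.html). You can find that version of the tutorial in the [v0.2 docs](https://js.langchain.com/v0.2/docs/how_to/chatbots_memory/).

The LangGraph implementation offers a number of advantages over `RunnableWithMessageHistory`, including the ability to persist arbitrary components of an application's state (not only messages).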

:::

## Setup

You'll need to install a few packages, select your chat model, and set its environment variable.

```{=mdx}
import Npm2Yarn from "@theme/Npm2Yarn"

<Npm2Yarn>
  @langchain/core @langchain/langgraph
</Npm2Yarn>
```

Let's set up a chat model that we'll use for the examples below.

```{=mdx}
import ChatModelTabs from "@theme/ChatModelTabs";

<ChatModelTabs />
```

## Message passing

The simplest form of memory is passing chat history messages into a chain. Here's an example:

In [22]:
// @lc-docs-hide-cell

import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({ model: "gpt-4o" })

In [23]:
import { HumanMessage, AIMessage } from "@langchain/core/messages";
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    "You are a helpful assistant. Answer all questions to the best of your ability.",
  ],
  new MessagesPlaceholder("messages"),
]);

const chain = prompt.pipe(llm);

await chain.invoke({
  messages: [
    new HumanMessage(
      "Translate this sentence from English to French: I love programming."
    ),
    new AIMessage("J'adore la programmation."),
    new HumanMessage("What did you just say?"),
  ],
});

AIMessage {
  "id": "chatcmpl-ABSxUXVIBitFRBh9MpasB5jeEHfCA",
  "content": "I said \"J'adore la programmation,\" which means \"I love programming\" in French.",
  "additional_kwargs": {},
  "response_metadata": {
    "tokenUsage": {
      "completionTokens": 18,
      "promptTokens": 58,
      "totalTokens": 76
    },
    "finish_reason": "stop",
    "system_fingerprint": "fp_e375328146"
  },
  "tool_calls": [],
  "invalid_tool_calls": [],
  "usage_metadata": {
    "input_tokens": 58,
    "output_tokens": 18,
    "total_tokens": 76
  }
}


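We can see that by passing the previous conversation into a chain, the model can use it as context to answer questions. This is the basic concept underpinning chatbot memory. The rest of this guide demonstrates convenient techniques for passing or reformatting messages.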
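The same idea can be shown without any library at all. The following is a minimal sketch in plain TypeScript, assuming a hypothetical `ChatMessage` shape and a stand-in `fakeModel` function (neither is a LangChain API): the caller keeps the history array, appends each new turn, and sends the whole array on every call.

```typescript
// Hypothetical message shape for this sketch; not a LangChain type.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Stand-in for a chat model: it only reports how much context it received.
const fakeModel = (messages: ChatMessage[]): ChatMessage => ({
  role: "assistant",
  content: `I can see ${messages.length} message(s) of context.`,
});

// The caller owns the history and is responsible for keeping it up to date.
const history: ChatMessage[] = [];

// Append the user input, send the full history, then record the reply.
function chatTurn(userInput: string): ChatMessage {
  history.push({ role: "user", content: userInput });
  const reply = fakeModel(history);
  history.push(reply);
  return reply;
}

const first = chatTurn(
  "Translate this sentence from English to French: I love programming."
);
const second = chatTurn("What did you just say?");

console.log(first.content);
console.log(second.content);
```

On the second turn the stand-in model receives three messages (the first question, its own reply, and the new question), which is what lets a real model answer follow-up questions.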

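## Automatic history management

The previous examples pass messages to the chain (and model) explicitly. This is a perfectly acceptable approach, but it requires you to manage new messages outside the chain. LangChain also provides a way to build applications with memory using LangGraph's persistence. You can enable persistence in a LangGraph application by providing a `checkpointer` when compiling the graph.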

In [24]:
import { START, END, MessagesAnnotation, StateGraph, MemorySaver } from "@langchain/langgraph";


// Define the function that calls the model
const callModel = async (state: typeof MessagesAnnotation.State) => {
  const systemPrompt = 
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability.";
  const messages = [{ role: "system", content: systemPrompt }, ...state.messages];
  const response = await llm.invoke(messages);
  return { messages: response };
};

const workflow = new StateGraph(MessagesAnnotation)
// Define the node and edge
  .addNode("model", callModel)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
// highlight-start
const memory = new MemorySaver();
const app = workflow.compile({ checkpointer: memory });
// highlight-end
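To build intuition for what persistence adds, here is a minimal plain-TypeScript sketch of the idea. It is not how `MemorySaver` is implemented: `ThreadStore` and `invokeWithMemory` are hypothetical names invented for this illustration, and the "model" is a stub. The point is only that state is looked up by a thread identifier, extended with the new input, and written back, so separate threads keep separate histories.

```typescript
// Hypothetical message shape for this sketch; not a LangChain type.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Hypothetical in-memory store keyed by thread ID (a stand-in for a checkpointer).
class ThreadStore {
  private threads = new Map<string, ChatMessage[]>();

  load(threadId: string): ChatMessage[] {
    return this.threads.get(threadId) ?? [];
  }

  save(threadId: string, messages: ChatMessage[]): void {
    this.threads.set(threadId, messages);
  }
}

const store = new ThreadStore();

// Load the thread's history, add the new input, call a stub model, save the result.
function invokeWithMemory(threadId: string, userInput: string): ChatMessage[] {
  const messages: ChatMessage[] = [
    ...store.load(threadId),
    { role: "user", content: userInput },
  ];
  const userTurns = messages.filter((m) => m.role === "user").length;
  messages.push({ role: "assistant", content: `This is user turn ${userTurns}.` });
  store.save(threadId, messages);
  return messages;
}

invokeWithMemory("1", "Translate to French: I love programming.");
const threadOne = invokeWithMemory("1", "What did I just ask you?");
const threadTwo = invokeWithMemory("2", "Hello!");

console.log(threadOne.length, threadTwo.length);
```

The caller only supplies the newest message; the earlier turns for thread `"1"` come from the store, while thread `"2"` starts from an empty history.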

We'll pass only the latest input to the `app` and let LangGraph keep track of the conversation history using the checkpointer:

In [25]:
await app.invoke(
  {
    messages: [
      {
        role: "user",
        content: "Translate to French: I love programming."
      }
    ]
  },
  {
    configurable: { thread_id: "1" }
  }
);

{
  messages: [
    HumanMessage {
      "id": "227b82a9-4084-46a5-ac79-ab9a3faa140e",
      "content": "Translate to French: I love programming.",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIMessage {
      "id": "chatcmpl-ABSxVrvztgnasTeMSFbpZQmyYqjJZ",
      "content": "J'adore la programmation.",
      "additional_kwargs": {},
      "response_metadata": {
        "tokenUsage": {
          "completionTokens": 5,
          "promptTokens": 35,
          "totalTokens": 40
        },
        "finish_reason": "stop",
        "system_fingerprint": "fp_52a7f40b0b"
      },
      "tool_calls": [],
      "invalid_tool_calls": [],
      "usage_metadata": {
        "input_tokens": 35,
        "output_tokens": 5,
        "total_tokens": 40
      }
    }
  ]
}


In [26]:
await app.invoke(
  {
    messages: [
      {
        role: "user",
        content: "What did I just ask you?"
      }
    ]
  },
  {
    configurable: { thread_id: "1" }
  }
);

{
  messages: [
    HumanMessage {
      "id": "1a0560a4-9dcb-47a1-b441-80717e229706",
      "content": "Translate to French: I love programming.",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIMessage {
      "id": "chatcmpl-ABSxVrvztgnasTeMSFbpZQmyYqjJZ",
      "content": "J'adore la programmation.",
      "additional_kwargs": {},
      "response_metadata": {
        "tokenUsage": {
          "completionTokens": 5,
          "promptTokens": 35,
          "totalTokens": 40
        },
        "finish_reason": "stop",
        "system_fingerprint": "fp_52a7f40b0b"
      },
      "tool_calls": [],
      "invalid_tool_calls": []
    },
    HumanMessage {
      "id": "4f233a7d-4b08-4f53-bb60-cf0141a59721",
      "content": "What did I just ask you?",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIMessage {
      "id": "chatcmpl-ABSxVs5QnlPfbihTOmJrCVg1Dh7Ol",
      "content": "You asked me to translate \"I love programming\" into French

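## Modifying chat history

Modifying stored chat messages can help your chatbot handle a variety of situations. Here are some examples:

### Trimming messages

LLMs and chat models have limited context windows, and even if you're not directly hitting those limits, you may want to reduce the amount of distracting information the model has to deal with. One solution is to trim the messages before passing them to the model. Let's use an example history with the `app` we declared above: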

In [27]:
const demoEphemeralChatHistory = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
];

await app.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory,
      { role: "user", content: "What's my name?" }
    ]
  },
  {
    configurable: { thread_id: "2" }
  }
);

{
  messages: [
    HumanMessage {
      "id": "63057c3d-f980-4640-97d6-497a9f83ddee",
      "content": "Hey there! I'm Nemo.",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIMessage {
      "id": "c9f0c20a-8f55-4909-b281-88f2a45c4f05",
      "content": "Hello!",
      "additional_kwargs": {},
      "response_metadata": {},
      "tool_calls": [],
      "invalid_tool_calls": []
    },
    HumanMessage {
      "id": "fd7fb3a0-7bc7-4e84-99a9-731b30637b55",
      "content": "How are you today?",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIMessage {
      "id": "09b0debb-1d4a-4856-8821-b037f5d96ecf",
      "content": "Fine thanks!",
      "additional_kwargs": {},
      "response_metadata": {},
      "tool_calls": [],
      "invalid_tool_calls": []
    },
    HumanMessage {
      "id": "edc13b69-25a0-40ac-81b3-175e65dc1a9a",
      "content": "What's my name?",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIM

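We can see the app remembers the preloaded name.

But let's say we have a very small context window, and we want to trim the messages passed to the model down to only the two most recent ones. We can use the built-in [trimMessages](/docs/how_to/trim_messages/) utility to trim messages based on their token count before they reach our prompt. In this case we'll count each message as 1 "token" and keep only the last two messages: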

In [28]:
import { START, END, MessagesAnnotation, StateGraph, MemorySaver } from "@langchain/langgraph";
import { trimMessages } from "@langchain/core/messages";

// Define trimmer
// highlight-start
// count each message as 1 "token" (tokenCounter: (msgs) => msgs.length) and keep only the last two messages
const trimmer = trimMessages({ strategy: "last", maxTokens: 2, tokenCounter: (msgs) => msgs.length });
// highlight-end

// Define the function that calls the model
const callModel2 = async (state: typeof MessagesAnnotation.State) => {
  // highlight-start
  const trimmedMessages = await trimmer.invoke(state.messages);
  const systemPrompt = 
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability.";
  const messages = [{ role: "system", content: systemPrompt }, ...trimmedMessages];
  // highlight-end
  const response = await llm.invoke(messages);
  return { messages: response };
};

const workflow2 = new StateGraph(MessagesAnnotation)
  // Define the node and edge
  .addNode("model", callModel2)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
const app2 = workflow2.compile({ checkpointer: new MemorySaver() });

Let's call this new app and check the response:

In [29]:
await app2.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory,
      { role: "user", content: "What is my name?" }
    ]
  },
  {
    configurable: { thread_id: "3" }
  }
);

{
  messages: [
    HumanMessage {
      "id": "0d9330a0-d9d1-4aaf-8171-ca1ac6344f7c",
      "content": "What is my name?",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIMessage {
      "id": "3a24e88b-7525-4797-9fcd-d751a378d22c",
      "content": "Fine thanks!",
      "additional_kwargs": {},
      "response_metadata": {},
      "tool_calls": [],
      "invalid_tool_calls": []
    },
    HumanMessage {
      "id": "276039c8-eba8-4c68-b015-81ec7704140d",
      "content": "How are you today?",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIMessage {
      "id": "2ad4f461-20e1-4982-ba3b-235cb6b02abd",
      "content": "Hello!",
      "additional_kwargs": {},
      "response_metadata": {},
      "tool_calls": [],
      "invalid_tool_calls": []
    },
    HumanMessage {
      "id": "52213cae-953a-463d-a4a0-a7368c9ee4db",
      "content": "Hey there! I'm Nemo.",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AI

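We can see that `trimMessages` was called and only the two most recent messages are passed to the model. The checkpointed state still holds every message, because trimming happens inside the node just before the model call. In this case, it means the model forgot the name we gave it.

Check out our [how-to guide on trimming messages](/docs/how_to/trim_messages/) for more.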
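To make the "keep the last messages that fit a budget" behavior concrete, here is a simplified plain-TypeScript sketch. It is not the `trimMessages` implementation, which supports more options than shown here. `keepLast` and `ChatMessage` are hypothetical names for this illustration; the only pieces borrowed from the example above are the idea of a `maxTokens` budget and a `tokenCounter` callback that counts each message as one "token".

```typescript
// Hypothetical message shape for this sketch; not a LangChain type.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Walk backwards from the newest message and keep adding older ones
// for as long as the counted size stays within the budget.
function keepLast(
  messages: ChatMessage[],
  maxTokens: number,
  tokenCounter: (msgs: ChatMessage[]) => number
): ChatMessage[] {
  let kept: ChatMessage[] = [];
  for (const message of [...messages].reverse()) {
    const candidate = [message, ...kept];
    if (tokenCounter(candidate) > maxTokens) break;
    kept = candidate;
  }
  return kept;
}

const fullHistory: ChatMessage[] = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
  { role: "user", content: "What is my name?" },
];

// Count each message as one "token" and keep at most two.
const trimmed = keepLast(fullHistory, 2, (msgs) => msgs.length);
console.log(trimmed.map((m) => m.content));
```

With a budget of two, only "Fine thanks!" and "What is my name?" survive, so the message that introduced the name never reaches the model.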

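### Summary memory

We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our app. Let's recreate our chat history: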

In [30]:
const demoEphemeralChatHistory2 = [
  { role: "user", content: "Hey there! I'm Nemo." },
  { role: "assistant", content: "Hello!" },
  { role: "user", content: "How are you today?" },
  { role: "assistant", content: "Fine thanks!" },
];

Now, let's update the model-calling function to distill previous interactions into a summary:

In [31]:
import { START, END, MessagesAnnotation, StateGraph, MemorySaver } from "@langchain/langgraph";
import { RemoveMessage } from "@langchain/core/messages";


// Define the function that calls the model
const callModel3 = async (state: typeof MessagesAnnotation.State) => {
  const systemPrompt = 
    "You are a helpful assistant. " +
    "Answer all questions to the best of your ability. " +
    "The provided chat history includes a summary of the earlier conversation.";
  const systemMessage = { role: "system", content: systemPrompt };
  const messageHistory = state.messages.slice(0, -1); // exclude the most recent user input
  
  // Summarize the messages if the chat history reaches a certain size
  if (messageHistory.length >= 4) {
    const lastHumanMessage = state.messages[state.messages.length - 1];
    // Invoke the model to generate conversation summary
    const summaryPrompt = 
      "Distill the above chat messages into a single summary message. " +
      "Include as many specific details as you can.";
    const summaryMessage = await llm.invoke([
      ...messageHistory,
      { role: "user", content: summaryPrompt }
    ]);

    // Delete messages that we no longer want to show up
    const deleteMessages = state.messages.map(m => new RemoveMessage({ id: m.id }));
    // Re-add user message
    const humanMessage = { role: "user", content: lastHumanMessage.content };
    // Call the model with summary & response
    const response = await llm.invoke([systemMessage, summaryMessage, humanMessage]);
    return { messages: [summaryMessage, humanMessage, response, ...deleteMessages] };
  } else {
    const response = await llm.invoke([systemMessage, ...state.messages]);
    return { messages: response };
  }
};

const workflow3 = new StateGraph(MessagesAnnotation)
  // Define the node and edge
  .addNode("model", callModel3)
  .addEdge(START, "model")
  .addEdge("model", END);

// Add simple in-memory checkpointer
const app3 = workflow3.compile({ checkpointer: new MemorySaver() });

Let's see if it remembers the name we gave it:

In [32]:
await app3.invoke(
  {
    messages: [
      ...demoEphemeralChatHistory2,
      { role: "user", content: "What did I say my name was?" }
    ]
  },
  {
    configurable: { thread_id: "4" }
  }
);

{
  messages: [
    AIMessage {
      "id": "chatcmpl-ABSxXjFDj6WRo7VLSneBtlAxUumPE",
      "content": "Nemo greeted the assistant and asked how it was doing, to which the assistant responded that it was fine.",
      "additional_kwargs": {},
      "response_metadata": {
        "tokenUsage": {
          "completionTokens": 22,
          "promptTokens": 60,
          "totalTokens": 82
        },
        "finish_reason": "stop",
        "system_fingerprint": "fp_e375328146"
      },
      "tool_calls": [],
      "invalid_tool_calls": [],
      "usage_metadata": {
        "input_tokens": 60,
        "output_tokens": 22,
        "total_tokens": 82
      }
    },
    HumanMessage {
      "id": "8b1309b7-c09e-47fb-9ab3-34047f6973e3",
      "content": "What did I say my name was?",
      "additional_kwargs": {},
      "response_metadata": {}
    },
    AIMessage {
      "id": "chatcmpl-ABSxYAQKiBsQ6oVypO4CLFDsi1HRH",
      "content": "You mentioned that your name is Nemo.",
      "additional

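Note that invoking the app again will keep accumulating history until it reaches the specified number of messages (four in our case). At that point, we generate another summary from the initial summary plus the new messages, and so on.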
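The accumulate-then-summarize cycle can be traced with a small plain-TypeScript sketch. Everything here is a stand-in: `summarize` is a stub that a real application would replace with a model call, `addTurn` and `SUMMARY_THRESHOLD` are hypothetical names, and the stored state is a plain array rather than LangGraph state. It only mirrors the threshold logic of `callModel3`.

```typescript
// Hypothetical message shape for this sketch; not a LangChain type.
type ChatMessage = { role: "user" | "assistant"; content: string };

// Stub summarizer: a real application would call the model here.
const summarize = (msgs: ChatMessage[]): ChatMessage => ({
  role: "assistant",
  content: `Summary of ${msgs.length} earlier message(s).`,
});

// Same threshold as the example above: summarize once four messages are stored.
const SUMMARY_THRESHOLD = 4;

// Either append the new turn, or replace the stored history with a summary first.
function addTurn(stored: ChatMessage[], userInput: string): ChatMessage[] {
  const userMessage: ChatMessage = { role: "user", content: userInput };
  const reply: ChatMessage = { role: "assistant", content: "(model reply)" };
  if (stored.length >= SUMMARY_THRESHOLD) {
    return [summarize(stored), userMessage, reply];
  }
  return [...stored, userMessage, reply];
}

let state: ChatMessage[] = [];
const sizes: number[] = [];
for (const input of ["a", "b", "c", "d", "e"]) {
  state = addTurn(state, input);
  sizes.push(state.length);
}

console.log(sizes); // [2, 4, 3, 5, 3]
console.log(state[0].content);
```

The stored history grows to four messages, collapses to a summary plus the newest turn, then grows again. The second summary is built from five messages, one of which is the first summary, so earlier details are carried forward only as far as each summary preserves them.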