# Memory Management

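A key feature of chatbots is their ability to use the content of previous conversation turns as context. This state management can take several forms, including:

- Simply stuffing previous messages into the chat model prompt.
- The same as above, but trimming old messages to reduce the amount of distracting information the model has to handle.
- More complex modifications, such as synthesizing summaries for long-running conversations.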

## Introduction

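A memory system needs to support two basic actions: reading and writing. Recall that every chain defines some core execution logic that expects certain inputs. Some of these inputs come directly from the user, but others can come from memory. In a given run, a chain interacts with its memory system twice:

- After receiving the initial user input but before executing the core logic, the chain reads from its memory system and augments the user input.
- After executing the core logic but before returning the answer, the chain writes the inputs and outputs of the current run to memory so that they can be referred to in future runs.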
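
The read-then-write cycle described above can be sketched in plain Python. This is only an illustration, not LangChain code: `SimpleMemory`, `run_chain`, and `echo_logic` are hypothetical names, and `echo_logic` stands in for a real model call.

```python
class SimpleMemory:
    """Hypothetical in-memory store; not a LangChain class."""

    def __init__(self):
        self.turns = []  # list of (user_input, output) pairs

    def load(self):
        # READ: render past turns as context for the next run.
        return "\n".join(f"Human: {i}\nAI: {o}" for i, o in self.turns)

    def save(self, user_input, output):
        # WRITE: record this run so later runs can refer to it.
        self.turns.append((user_input, output))


def run_chain(user_input, memory, core_logic):
    history = memory.load()                   # 1. read before the core logic
    output = core_logic(history, user_input)  # 2. execute the core logic
    memory.save(user_input, output)           # 3. write before returning
    return output


def echo_logic(history, user_input):
    # Stand-in for a model call: reports how many history lines it received.
    seen = len(history.splitlines())
    return f"(saw {seen} history lines) you said: {user_input}"


memory = SimpleMemory()
first = run_chain("hi", memory, echo_logic)
second = run_chain("how are you?", memory, echo_logic)
```

The second call sees the two history lines written by the first call, which is the whole point of the write step.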

## Setup

In [None]:
%pip install --upgrade --quiet langchain langchain-google-genai

In [None]:
from google.colab import userdata
API_KEY = userdata.get('API_KEY')

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI

chat = ChatGoogleGenerativeAI(model="gemini-1.5-pro-latest", google_api_key=API_KEY)

## Message Passing

The simplest form of memory is to pass the chat history messages into a chain. Here is an example:

In [None]:
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | chat

response = chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)
response

AIMessage(content='I said "J\'adore la programmation" which is the French translation for "I love programming". \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-88045024-8232-4b31-905a-35fe7f321bbe-0')

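We can see that by passing the previous conversation into the chain, the chatbot can use it as context to answer questions. This is the basic concept underpinning chatbot memory.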

## Chat Message History

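Storing and passing messages directly as a list is perfectly fine, but we can also use LangChain's built-in message history classes to store and load messages.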

In [None]:
from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()
demo_ephemeral_chat_history.add_user_message(
    "Translate this sentence from English to French: I love programming."
)
demo_ephemeral_chat_history.add_ai_message("J'adore la programmation.")

demo_ephemeral_chat_history

InMemoryChatMessageHistory(messages=[HumanMessage(content='Translate this sentence from English to French: I love programming.'), AIMessage(content="J'adore la programmation.")])

In [None]:
demo_ephemeral_chat_history.messages

[HumanMessage(content='Translate this sentence from English to French: I love programming.'),
 AIMessage(content="J'adore la programmation.")]

We can use it directly to store conversation turns for our chain:

In [None]:
demo_ephemeral_chat_history = ChatMessageHistory()

input1 = "Translate this sentence from English to French: I love programming."
demo_ephemeral_chat_history.add_user_message(input1)

response = chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)
response

AIMessage(content="J'adore programmer. \n", response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-5b427786-fbb8-47f8-9e4d-c6e6137c4df5-0')

In [None]:
print(response.content)

J'adore programmer. 



In [None]:
demo_ephemeral_chat_history.add_ai_message(response)

input2 = "What did I just ask you?"
demo_ephemeral_chat_history.add_user_message(input2)

response = chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)
response

AIMessage(content='You asked me to translate the sentence "I love programming" from English to French. \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-1d9afa50-1701-45b8-a23c-24217a20ed45-0')

In [None]:
print(response.content)

You asked me to translate the sentence "I love programming" from English to French. 



## Memory Types

### Conversation Buffer

`ConversationBufferMemory` is an extremely simple form of memory: it just keeps the chat messages in a buffer and passes them into the prompt template.

In [None]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")

memory

ConversationBufferMemory(chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='hi!'), AIMessage(content="what's up?")]))

In [None]:
memory.chat_memory

InMemoryChatMessageHistory(messages=[HumanMessage(content='hi!'), AIMessage(content="what's up?")])

In [None]:
memory.chat_memory.messages

[HumanMessage(content='hi!'), AIMessage(content="what's up?")]

It can also be used this way:

In [None]:
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
memory.save_context({"input": "hi"}, {"output": "whats up"})

memory.load_memory_variables({})

{'history': 'Human: hi\nAI: whats up'}

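In this example, note that `load_memory_variables` returns a single key, `history`. This means your chain (and likely your prompt) should expect an input named `history`. You can usually control this variable through parameters on the memory class. For example, if you want the memory variables to be returned under the key `chat_history`, you can do the following: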

In [None]:
memory = ConversationBufferMemory(memory_key="chat_history")
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")

memory.load_memory_variables({})

{'chat_history': "Human: hi!\nAI: what's up?"}

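One of the most common types of memory returns a list of chat messages. These can be returned either as a single string with all messages concatenated together (useful when they are passed into LLMs) or as a list of chat messages (useful when they are passed into ChatModels).

By default, they are returned as a single string. To return them as a list of messages instead, set `return_messages=True`.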
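
To make the two return shapes concrete, here is a plain-Python sketch of how a buffer might render the same messages either as one string or as a list, under a configurable key. `Msg` and `load_history` are hypothetical helpers for illustration, not part of LangChain.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    role: str     # "Human" or "AI"
    content: str


def load_history(messages, memory_key="history", return_messages=False):
    if return_messages:
        # Chat models can consume a list of message objects directly.
        return {memory_key: list(messages)}
    # Plain LLMs take a single prompt string, so join "Role: text" lines.
    return {memory_key: "\n".join(f"{m.role}: {m.content}" for m in messages)}


messages = [Msg("Human", "hi"), Msg("AI", "whats up")]
as_text = load_history(messages)
as_list = load_history(messages, memory_key="chat_history", return_messages=True)
```

Here `as_text` holds a single string under `history`, while `as_list` holds a copy of the message list under `chat_history`.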

In [None]:
memory = ConversationBufferMemory(return_messages=True)

memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")

memory.load_memory_variables({})

{'history': [HumanMessage(content='hi!'), AIMessage(content="what's up?")]}

In [None]:
memory = ConversationBufferMemory(return_messages=True)
memory.save_context({"input": "hi"}, {"output": "whats up"})

memory.load_memory_variables({})

{'history': [HumanMessage(content='hi'), AIMessage(content='whats up')]}

Using it in a chain:

In [None]:
from langchain_google_genai import GoogleGenerativeAI
from langchain.chains import ConversationChain

llm = GoogleGenerativeAI(model="gemini-1.5-pro-latest", google_api_key=API_KEY)
conversation = ConversationChain(
    llm=llm,
    verbose=True,
    memory=ConversationBufferMemory()
)

In [None]:
conversation.predict(input="Hi there!")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi there!
AI:

> Finished chain.


'Hello! 👋 How can I help you today? 😊 \n'

### Conversation Buffer Window

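`ConversationBufferWindowMemory` keeps a list of the interactions of the conversation over time, but uses only the last K of them. This gives a sliding window of the most recent interactions, so the buffer does not grow too large.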
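
The sliding-window idea can be sketched with a bounded `deque`. This is a simplified illustration with a hypothetical `WindowMemorySketch` class; it discards old exchanges at write time, which is an assumption of the sketch and not necessarily how the library class manages its storage.

```python
from collections import deque


class WindowMemorySketch:
    """Hypothetical sliding window over the last k exchanges."""

    def __init__(self, k):
        # A deque with maxlen drops the oldest exchange automatically.
        self.exchanges = deque(maxlen=k)

    def save_context(self, user_input, output):
        self.exchanges.append((user_input, output))

    def load(self):
        return "\n".join(f"Human: {i}\nAI: {o}" for i, o in self.exchanges)


window = WindowMemorySketch(k=1)
window.save_context("hi", "whats up")
window.save_context("not much you", "not much")
window_text = window.load()
```

With `k=1`, only the second exchange survives.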

In [None]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})

memory.load_memory_variables({})

{'history': 'Human: not much you\nAI: not much'}

In [None]:
memory = ConversationBufferWindowMemory(k=1, return_messages=True)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})

memory.load_memory_variables({})

{'history': [HumanMessage(content='not much you'),
  AIMessage(content='not much')]}

### Entity

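Entity memory remembers given facts about specific entities in a conversation. It extracts information about entities (using an LLM) and builds up its knowledge about each entity over time (also using an LLM).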

In [None]:
from langchain.memory import ConversationEntityMemory

memory = ConversationEntityMemory(llm=llm)
_input = {"input": "Deven & Sam are working on a hackathon project"}
memory.load_memory_variables(_input)
memory

ConversationEntityMemory(llm=GoogleGenerativeAI(model='gemini-1.5-pro-latest', google_api_key=SecretStr('**********'), client=genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)), entity_cache=['Deven', 'Sam'])

In [None]:
memory.save_context(
    _input,
    {"output": " That sounds like a great project! What kind of project are they working on?"}
)
memory

ConversationEntityMemory(chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='Deven & Sam are working on a hackathon project'), AIMessage(content=' That sounds like a great project! What kind of project are they working on?')]), llm=GoogleGenerativeAI(model='gemini-1.5-pro-latest', google_api_key=SecretStr('**********'), client=genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)), entity_cache=['Deven', 'Sam'], entity_store=InMemoryEntityStore(store={'Deven': 'Updated summary: Deven is working on a hackathon project with Sam.', 'Sam': 'Updated summary: Sam is working on a hackathon project with Deven.'}))

In [None]:
memory.load_memory_variables({"input": 'who is Sam'})

{'history': 'Human: Deven & Sam are working on a hackathon project\nAI:  That sounds like a great project! What kind of project are they working on?',
 'entities': {'Sam': 'Updated summary: Sam is working on a hackathon project with Deven.'}}

In [None]:
memory = ConversationEntityMemory(llm=llm, return_messages=True)

_input = {"input": "Deven & Sam are working on a hackathon project"}
memory.load_memory_variables(_input)
memory.save_context(
    _input,
    {"output": " That sounds like a great project! What kind of project are they working on?"}
)
memory.load_memory_variables({"input": 'who is Sam'})



{'history': [HumanMessage(content='Deven & Sam are working on a hackathon project'),
  AIMessage(content=' That sounds like a great project! What kind of project are they working on?')],
 'entities': {'Sam': 'Updated summary:\nSam is working on a hackathon project with Deven.'}}

### Conversation Knowledge Graph

This type of memory uses a knowledge graph to recreate memory.
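
As a rough mental model, a knowledge graph can be stored as (subject, predicate, object) triples indexed by subject. The sketch below is plain Python with a hypothetical `TripleStore` class and hand-written illustrative triples; in the real memory class an LLM extracts the triples from the conversation.

```python
from collections import defaultdict


class TripleStore:
    """Hypothetical triple store keyed by lower-cased subject."""

    def __init__(self):
        self.by_subject = defaultdict(list)

    def add(self, subject, predicate, obj):
        self.by_subject[subject.lower()].append((subject, predicate, obj))

    def about(self, entity):
        # Render every known fact about the entity as a short sentence.
        triples = self.by_subject.get(entity.lower(), [])
        return [f"{s} {p} {o}" for s, p, o in triples]


graph = TripleStore()
graph.add("Sam", "is", "a friend")      # illustrative triple
graph.add("Sam", "works with", "Deven")  # illustrative triple
facts = graph.about("sam")
unknown = graph.about("Alex")
```

Looking up an entity returns only the facts recorded for it, which is what lets this memory inject a few relevant statements instead of the whole transcript.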

In [None]:
from langchain.memory import ConversationKGMemory

memory = ConversationKGMemory(llm=llm)
memory.save_context({"input": "say hi to sam"}, {"output": "who is sam"})
memory.save_context({"input": "sam is a friend"}, {"output": "okay"})

memory.load_memory_variables({"input": "who is sam"})

{'history': 'On Sam: Sam is a friend.'}

In [None]:
memory = ConversationKGMemory(llm=llm, return_messages=True)
memory.save_context({"input": "say hi to sam"}, {"output": "who is sam"})
memory.save_context({"input": "sam is a friend"}, {"output": "okay"})

memory.load_memory_variables({"input": "who is sam"})

{'history': [SystemMessage(content='On Sam: Sam is a friend.')]}

We can also more modularly get the current entities from a new message (using previous messages as context).

In [None]:
memory.get_current_entities("what's Sams favorite color?")

['Sams']

We can also more modularly get knowledge triplets from a new message (using previous messages as context).

In [None]:
memory.get_knowledge_triplets("her favorite color is red")

[KnowledgeTriple(subject='her', predicate='favorite color', object_='red')]

### Conversation Summary

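Now let's look at a slightly more complex type of memory, `ConversationSummaryMemory`. This type of memory creates a summary of the conversation over time, which is useful for condensing information that accumulates as the conversation goes on. Conversation summary memory summarizes the conversation as it happens and stores the current summary in memory. That summary can then be used to inject the conversation so far into a prompt or chain. This memory is most useful for long conversations, where keeping the past message history in the prompt verbatim would take up too many tokens.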
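
The rolling-summary loop can be sketched as follows. Everything here is hypothetical: `SummaryMemorySketch` is not a LangChain class, and `stub_summarize` merely concatenates and truncates text where a real implementation would call an LLM.

```python
def stub_summarize(previous_summary, new_lines):
    # Placeholder for an LLM call: concatenate, then cap the length.
    text = (previous_summary + " " + " ".join(new_lines)).strip()
    return text[:200]


class SummaryMemorySketch:
    def __init__(self, summarize):
        self.summarize = summarize
        self.summary = ""  # the only state kept between turns

    def save_context(self, user_input, output):
        new_lines = [f"Human: {user_input}", f"AI: {output}"]
        # Fold the new exchange into the running summary.
        self.summary = self.summarize(self.summary, new_lines)

    def load(self):
        return {"history": self.summary}


summary_memory = SummaryMemorySketch(stub_summarize)
summary_memory.save_context("hi", "whats up")
summary_memory.save_context("not much you", "not much")
```

Each save passes the previous summary plus the new exchange to the summarizer, so the stored state stays bounded however long the conversation runs.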

In [None]:
from langchain.memory import ConversationSummaryMemory, ChatMessageHistory

memory = ConversationSummaryMemory(llm=llm)
memory.save_context({"input": "hi"}, {"output": "whats up"})

memory.load_memory_variables({})

{'history': 'New summary:\nThe human greets the AI and the AI responds with an informal greeting in return. \n'}

In [None]:
memory = ConversationSummaryMemory(llm=llm, return_messages=True)
memory.save_context({"input": "hi"}, {"output": "whats up"})

memory.load_memory_variables({})

{'history': [SystemMessage(content='Current summary: \nThe human greeted the AI. The AI responded by asking what was happening. \n')]}

We can also use the `predict_new_summary` method directly:

In [None]:
messages = memory.chat_memory.messages
messages

[HumanMessage(content='hi'), AIMessage(content='whats up')]

In [None]:
previous_summary = ""
memory.predict_new_summary(messages, previous_summary)

'Current summary:\nThe human greeted the AI. The AI responded by asking what was happening. \n'

#### Initializing with messages/existing summary

You can easily initialize `ConversationSummaryMemory` from a `ChatMessageHistory`. A summary is computed automatically during loading.

In [None]:
history = ChatMessageHistory()
history.add_user_message("hi")
history.add_ai_message("hi there!")

memory = ConversationSummaryMemory.from_messages(
    llm=llm,
    chat_memory=history,
    return_messages=True
)

memory

ConversationSummaryMemory(llm=GoogleGenerativeAI(model='gemini-1.5-pro-latest', google_api_key=SecretStr('**********'), client=genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)), chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='hi'), AIMessage(content='hi there!')]), return_messages=True, buffer='Current summary: \nThe human greeted the AI. The AI returned the greeting. \n')

In [None]:
memory.buffer

'Current summary: \nThe human greeted the AI. The AI returned the greeting. \n'

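Optionally, you can speed up initialization by reusing a previously generated summary and initializing the class directly, which avoids regenerating the summary.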

In [None]:
memory = ConversationSummaryMemory(
    llm=llm,
    buffer="The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.",
    chat_memory=history,
    return_messages=True
)
memory

ConversationSummaryMemory(llm=GoogleGenerativeAI(model='gemini-1.5-pro-latest', google_api_key=SecretStr('**********'), client=genai.GenerativeModel(
    model_name='models/gemini-1.5-pro-latest',
    generation_config={},
    safety_settings={},
    tools=None,
    system_instruction=None,
)), chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='hi'), AIMessage(content='hi there!')]), return_messages=True, buffer='The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.')

### Conversation Token Buffer

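`ConversationTokenBufferMemory` keeps a buffer of recent interactions in memory and uses token length, rather than the number of interactions, to decide when to flush interactions.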
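
A minimal sketch of token-based pruning follows. It assumes a crude tokenizer that counts whitespace-separated words (`count_tokens` is a stand-in, not a real model tokenizer), and `TokenBufferSketch` is a hypothetical class rather than the library's implementation.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per word.
    return len(text.split())


class TokenBufferSketch:
    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.messages = []  # list of (role, content) pairs

    def _total_tokens(self):
        return sum(count_tokens(content) for _, content in self.messages)

    def save_context(self, user_input, output):
        self.messages.append(("Human", user_input))
        self.messages.append(("AI", output))
        # Drop the oldest messages until the buffer fits the budget.
        while self.messages and self._total_tokens() > self.max_token_limit:
            self.messages.pop(0)

    def load(self):
        return "\n".join(f"{role}: {content}" for role, content in self.messages)


token_memory = TokenBufferSketch(max_token_limit=5)
token_memory.save_context("hi", "whats up")
token_memory.save_context("not much you", "not much")
token_text = token_memory.load()
```

After the second exchange the buffer holds eight word-tokens, so the two oldest messages are dropped to get back to the limit of five.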

In [None]:
from langchain.memory import ConversationTokenBufferMemory

memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=10)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})

memory.load_memory_variables({})

{'history': 'Human: not much you\nAI: not much'}

### Conversation Summary Buffer

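`ConversationSummaryBufferMemory` combines the two ideas. It keeps a buffer of recent interactions in memory, but rather than simply discarding old interactions, it compiles them into a summary and uses both the buffer and the summary. It also uses token length, rather than the number of interactions, to decide when to flush interactions from the buffer.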
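
The combination can be sketched like this: messages that overflow the token budget are not thrown away but handed to a summarizer. The names are hypothetical, the tokenizer counts words, and `stub_summarize` just joins the dropped lines where a real implementation would call an LLM.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per word.
    return len(text.split())


def stub_summarize(previous_summary, dropped):
    # Placeholder for an LLM call: append the dropped lines to the summary.
    lines = [f"{role}: {content}" for role, content in dropped]
    parts = ([previous_summary] if previous_summary else []) + lines
    return " | ".join(parts)


class SummaryBufferSketch:
    def __init__(self, max_token_limit, summarize):
        self.max_token_limit = max_token_limit
        self.summarize = summarize
        self.summary = ""
        self.buffer = []  # recent (role, content) pairs kept verbatim

    def _buffer_tokens(self):
        return sum(count_tokens(content) for _, content in self.buffer)

    def save_context(self, user_input, output):
        self.buffer.append(("Human", user_input))
        self.buffer.append(("AI", output))
        dropped = []
        while self.buffer and self._buffer_tokens() > self.max_token_limit:
            dropped.append(self.buffer.pop(0))
        if dropped:
            # Old messages are folded into the summary instead of being lost.
            self.summary = self.summarize(self.summary, dropped)

    def load(self):
        lines = [f"{role}: {content}" for role, content in self.buffer]
        if self.summary:
            lines.insert(0, f"System: {self.summary}")
        return "\n".join(lines)


sb = SummaryBufferSketch(max_token_limit=5, summarize=stub_summarize)
sb.save_context("hi", "whats up")
sb.save_context("not much you", "not much")
sb_text = sb.load()
```

The loaded history starts with a `System:` line carrying the summary of the flushed messages, followed by the verbatim recent messages.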

In [None]:
from langchain.memory import ConversationSummaryBufferMemory

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=10)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})

memory.load_memory_variables({})

{'history': 'System: Current summary:\nThe human greets the AI. The AI responds with an informal greeting. \n\nHuman: not much you\nAI: not much'}

## Other

### Automatic History Management

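In the previous examples, we passed messages to the chain explicitly. This is a perfectly acceptable approach, but it requires managing new messages yourself. LangChain also provides a wrapper for LCEL chains, `RunnableWithMessageHistory`, that can handle this process automatically.

To show how it works, let's slightly modify the prompt above so that it takes a final `input` variable, which populates a `HumanMessage` template after the chat history. This means we will expect a `chat_history` parameter that contains all messages before the current one, rather than all messages including it.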

In [None]:
chat = ChatGoogleGenerativeAI(model="gemini-1.5-pro-latest", google_api_key=API_KEY)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)

chain = prompt | chat

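Here we pass only the latest input into the conversation and let the `RunnableWithMessageHistory` class wrap our chain and take care of appending that `input` variable to the chat history.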

In [None]:
from langchain_core.runnables.history import RunnableWithMessageHistory

demo_ephemeral_chat_history_for_chain = ChatMessageHistory()

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history_for_chain,
    input_messages_key="input",
    history_messages_key="chat_history",
)

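In addition to the chain we want to wrap, this class takes a few parameters:

- A factory function that returns the message history for a given session ID. This lets your chain serve multiple users at once by loading different messages for different conversations.
- `input_messages_key` specifies which part of the input should be tracked and stored in the chat history. In this example, we want to track the string passed in as `input`.
- `history_messages_key` specifies the name under which previous messages are injected into the prompt. Our prompt has a `MessagesPlaceholder` named `chat_history`, so we set this property to match.
- (For chains with multiple outputs) `output_messages_key` specifies which output to store as history. It is the counterpart of `input_messages_key`.

We can invoke this new chain as usual, with an additional `configurable` field that specifies the `session_id` to pass to the factory function. The field is unused in this demo, but in a real-world chain you would want to return the chat history that corresponds to the session ID passed in.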
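
A session-aware factory could look roughly like the sketch below. It uses plain lists as stand-ins for history objects so that it runs without any dependencies; `session_store` and `get_session_history` are hypothetical names. In a real chain, a function of this shape that returns a chat message history object per session could take the place of the lambda used above.

```python
session_store = {}  # hypothetical in-process store: session_id -> history


def get_session_history(session_id):
    # Create the history on first use, then return the same object each time.
    return session_store.setdefault(session_id, [])


get_session_history("alice").append("Human: hi, I'm Alice")
get_session_history("bob").append("Human: hi, I'm Bob")
get_session_history("alice").append("AI: hello Alice")
```

Because each session ID maps to its own history, messages from different conversations never mix.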

In [None]:
response = chain_with_message_history.invoke(
    {"input": "Translate this sentence from English to French: I love programming."},
    {"configurable": {"session_id": "unused"}},
)
response



AIMessage(content="J'adore programmer. \n", response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-80dfa9c4-d6e2-4a61-aa6f-e5442179916c-0')

In [None]:
response = chain_with_message_history.invoke(
    {"input": "What did I just ask you?"}, {"configurable": {"session_id": "unused"}}
)
response



AIMessage(content='You asked me to translate the sentence "I love programming" from English to French. \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-44501353-255d-4183-abeb-c420cf64b783-0')

### Modifying Chat History

Modifying stored chat messages can help your chatbot handle a variety of situations. Here are some examples:

#### Trimming Messages

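LLMs and ChatModels have limited context windows. Even if you are not hitting those limits directly, you may want to limit the amount of distraction the model has to deal with. One solution is to load and store only the `n` most recent messages. Here is an example history with some preloaded messages: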

In [None]:
demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("Hey there! I'm Nemo.")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("How are you today?")
demo_ephemeral_chat_history.add_ai_message("Fine thanks!")

demo_ephemeral_chat_history.messages

[HumanMessage(content="Hey there! I'm Nemo."),
 AIMessage(content='Hello!'),
 HumanMessage(content='How are you today?'),
 AIMessage(content='Fine thanks!')]

Let's use this message history with a `RunnableWithMessageHistory` chain like the one defined earlier:

In [None]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)

chain = prompt | chat

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

response = chain_with_message_history.invoke(
    {"input": "What's my name?"},
    {"configurable": {"session_id": "unused"}},
)
response



AIMessage(content='Nemo! \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-5174d81b-e1bd-4cab-8b0e-47f689cb3c91-0')

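We can see that the chain remembers the preloaded name.

But suppose we have a very small context window and want to trim the number of messages passed to the chain down to the two most recent ones. We can use the `clear` method to remove the stored messages and then re-add only the ones we want to keep. This is not required, but to make sure the method always runs, we can put it at the front of the chain: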

In [None]:
from langchain_core.runnables import RunnablePassthrough


def trim_messages(chain_input):
    stored_messages = demo_ephemeral_chat_history.messages
    if len(stored_messages) <= 2:
        return False

    demo_ephemeral_chat_history.clear()

    for message in stored_messages[-2:]:
        demo_ephemeral_chat_history.add_message(message)

    return True


chain_with_trimming = (
    RunnablePassthrough.assign(messages_trimmed=trim_messages)
    | chain_with_message_history
)

In [None]:
response = chain_with_trimming.invoke(
    {"input": "How are you today?"},
    {"configurable": {"session_id": "unused"}},
)
response



AIMessage(content="I am an AI language model, so I don't have feelings like humans do. However, I'm here to assist you and make your day better in any way I can! How can I help you today? \n", response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-027922c9-4014-4f15-8405-59808e0f2ef9-0')

In [None]:
demo_ephemeral_chat_history.messages

[HumanMessage(content="What's my name?"),
 AIMessage(content='Nemo! \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-5174d81b-e1bd-4cab-8b0e-47f689cb3c91-0'),
 HumanMessage(content='How are you today?'),
 AIMessage(content="I am an AI language model, so I don't have feelings like humans do. However, I'm here to assist you and make your day better in any way I can! How can I help you today? \n", response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'H

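We can see that the two oldest messages have been removed from the history, while the most recent exchange has been appended at the end. The next time the chain is invoked, `trim_messages` runs again, and only the two most recent messages are passed to the model. In this case, the model will have forgotten the name we gave it by the next invocation: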

In [None]:
response = chain_with_trimming.invoke(
    {"input": "What is my name?"},
    {"configurable": {"session_id": "unused"}},
)
response



AIMessage(content="I don't have access to past conversations, so I don't know your name. Would you like to tell me? \n", response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-3a1e833f-90d3-465d-a295-ba72de0bf773-0')

In [None]:
demo_ephemeral_chat_history.messages

[HumanMessage(content='How are you today?'),
 AIMessage(content="I am an AI language model, so I don't have feelings like humans do. However, I'm here to assist you and make your day better in any way I can! How can I help you today? \n", response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-027922c9-4014-4f15-8405-59808e0f2ef9-0'),
 HumanMessage(content='What is my name?'),
 AIMessage(content="I don't have access to past conversations, so I don't know your name. Would you like to tell me? \n", response_metadata={'prompt_feedback': {'block_re

#### Summary Memory

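We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our chain. Let's recreate the chat history and the chatbot chain: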

In [None]:
demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("Hey there! I'm Nemo.")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("How are you today?")
demo_ephemeral_chat_history.add_ai_message("Fine thanks!")

demo_ephemeral_chat_history.messages

[HumanMessage(content="Hey there! I'm Nemo."),
 AIMessage(content='Hello!'),
 HumanMessage(content='How are you today?'),
 AIMessage(content='Fine thanks!')]

We'll slightly modify the prompt so that the LLM knows it will receive a condensed summary rather than the full chat history:

In [None]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability. "
            "The provided chat history includes facts about the user you are speaking with.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{input}"),
    ]
)

chain = prompt | chat

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

Now let's write a function that distills previous interactions into a summary. We can add this function to the front of the chain as well:

In [None]:
def summarize_messages(chain_input):
    stored_messages = demo_ephemeral_chat_history.messages
    if len(stored_messages) == 0:
        return False
    summarization_prompt = ChatPromptTemplate.from_messages(
        [
            MessagesPlaceholder(variable_name="chat_history"),
            (
                "user",
                "Distill the above chat messages into a single summary message. Include as many specific details as you can.",
            ),
        ]
    )
    summarization_chain = summarization_prompt | chat

    summary_message = summarization_chain.invoke({"chat_history": stored_messages})

    demo_ephemeral_chat_history.clear()

    demo_ephemeral_chat_history.add_message(summary_message)

    return True


chain_with_summarization = (
    RunnablePassthrough.assign(messages_summarized=summarize_messages)
    | chain_with_message_history
)

Let's see whether it remembers the name we gave it:

In [None]:
response = chain_with_summarization.invoke(
    {"input": "What did I say my name was?"},
    {"configurable": {"session_id": "unused"}},
)
response



AIMessage(content='You said your name was Nemo. \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-bd2c4686-dc8f-47d4-a7a2-cfd19e4cd091-0')

In [None]:
demo_ephemeral_chat_history.messages

[AIMessage(content='Nemo greeted the assistant and asked how they were doing. The assistant replied that they were doing well. \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-431872aa-72b5-42b7-9cf5-effd32135343-0'),
 HumanMessage(content='What did I say my name was?'),
 AIMessage(content='You said your name was Nemo. \n', response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked':

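Note that invoking the chain again will generate another summary from the existing summary plus any messages added since. You could also take a hybrid approach in which a certain number of messages are retained in the chat history while the others are summarized.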
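
One possible shape for that hybrid approach is sketched below in plain Python: keep the last few messages verbatim and replace everything older with one summary entry. `compact_history` and `stub_summary` are hypothetical, and the stub only counts messages where a real version would call an LLM.

```python
def compact_history(messages, keep_last, summarize):
    """Keep the last `keep_last` messages; summarize everything older."""
    split = len(messages) - keep_last
    if split <= 0:
        return list(messages)  # nothing old enough to summarize
    older, recent = messages[:split], messages[split:]
    return [("System", summarize(older))] + list(recent)


def stub_summary(older):
    # Placeholder for an LLM call.
    return f"summary of {len(older)} earlier messages"


history = [
    ("Human", "Hey there! I'm Nemo."),
    ("AI", "Hello!"),
    ("Human", "How are you today?"),
    ("AI", "Fine thanks!"),
]
compacted = compact_history(history, keep_last=2, summarize=stub_summary)
untouched = compact_history(history, keep_last=10, summarize=stub_summary)
```

The two most recent messages stay intact, while the earlier ones are represented only by the summary entry at the front.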