<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/memory/composable_memory.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


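# Simple Composable Memory

`SimpleComposableMemory` is a simple memory class that combines several memory sources, each with its own storage and retrieval behavior, into a single memory object.


In this notebook, we demonstrate how to inject multiple memory sources into an agent. Specifically, we use `SimpleComposableMemory`, which consists of a `primary_memory` and, optionally, several secondary memory sources (stored in `secondary_memory_sources`). The main difference is that `primary_memory` serves as the agent's main chat buffer, whereas any messages retrieved from `secondary_memory_sources` are injected into the system prompt message only.

Multiple memory sources are useful when, for example, you have a long-term memory such as `VectorMemory` that you want to use alongside the default `ChatMemoryBuffer`. As this notebook shows, `SimpleComposableMemory` lets you effectively "load" the desired messages from long-term memory into the main memory (i.e., the `ChatMemoryBuffer`).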


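## How does `SimpleComposableMemory` work?


We begin with the basic usage of `SimpleComposableMemory`. Here we construct a `VectorMemory` as well as a default `ChatMemoryBuffer`. The `VectorMemory` will be our secondary memory source, and the `ChatMemoryBuffer` will be the primary one. To instantiate a `SimpleComposableMemory` object, we supply a `primary_memory` and (optionally) a list of `secondary_memory_sources`.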


![SimpleComposableMemoryIllustration](https://d3ddy8balm3goa.cloudfront.net/llamaindex/simple-composable-memory.excalidraw.svg)


In [None]:
from llama_index.core.memory import (
    VectorMemory,
    SimpleComposableMemory,
    ChatMemoryBuffer,
)
from llama_index.core.llms import ChatMessage
from llama_index.embeddings.openai import OpenAIEmbedding

vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use the default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 1},
)

# let's set some initial messages in our secondary vector memory
msgs = [
    ChatMessage.from_str("You are a SOMEWHAT helpful assistant.", "system"),
    ChatMessage.from_str("Bob likes burgers.", "user"),
    ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
    ChatMessage.from_str("Alice likes apples.", "user"),
]
vector_memory.set(msgs)

chat_memory_buffer = ChatMemoryBuffer.from_defaults()

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)

In [None]:
composable_memory.primary_memory

ChatMemoryBuffer(chat_store=SimpleChatStore(store={}), chat_store_key='chat_history', token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'))

In [None]:
composable_memory.secondary_memory_sources

[VectorMemory(vector_index=<llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x137b912a0>, retriever_kwargs={'similarity_top_k': 1}, batch_by_user_message=True, cur_batch_textnode=TextNode(id_='288b0ef3-570e-4698-a1ae-b3531df66361', embedding=None, metadata={'sub_dicts': [{'role': <MessageRole.USER: 'user'>, 'content': 'Alice likes apples.', 'additional_kwargs': {}}]}, excluded_embed_metadata_keys=['sub_dicts'], excluded_llm_metadata_keys=['sub_dicts'], relationships={}, text='Alice likes apples.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'))]

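### `put()` messages into memory


Since `SimpleComposableMemory` is itself a subclass of `BaseMemory`, we add messages to it in the same way as we do for other memory modules. Note that for `SimpleComposableMemory`, calling `.put()` effectively calls `.put()` on all memory sources. In other words, the message gets added to both the `primary` and `secondary` sources.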
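
The fan-out behavior of `.put()` can be pictured with a small pure-Python sketch. `ToyMemory` and `ToyComposableMemory` are hypothetical stand-ins invented for illustration; they are not the library's classes, and the real implementation may differ in detail.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Hypothetical stand-in for a memory module: an ordered message log."""

    messages: list = field(default_factory=list)

    def put(self, message: str) -> None:
        self.messages.append(message)


@dataclass
class ToyComposableMemory:
    """Hypothetical composite that forwards every write to all of its sources."""

    primary: ToyMemory
    secondaries: list

    def put(self, message: str) -> None:
        # One call on the composite writes to the primary source...
        self.primary.put(message)
        # ...and to every secondary source as well.
        for source in self.secondaries:
            source.put(message)


primary = ToyMemory()
long_term = ToyMemory()
memory = ToyComposableMemory(primary=primary, secondaries=[long_term])
memory.put("Jerry likes juice.")
```

After the single `put` call, both `primary.messages` and `long_term.messages` hold the message.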


In [None]:
msgs = [
    ChatMessage.from_str("You are a REALLY helpful assistant.", "system"),
    ChatMessage.from_str("Jerry likes juice.", "user"),
]

In [None]:
# load the messages into all memory sources
for m in msgs:
    composable_memory.put(m)

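### `get()` messages from memory


When `.get()` is invoked, we similarly execute the `.get()` method of the `primary` memory as well as of every `secondary` source. This leaves us with a sequence of message lists that we have to "compose" into a single sensible set of messages (to pass downstream to our agents). In general, special care must be taken to ensure that the final sequence of messages is both sensible and conforms to the chat API of the LLM provider.

For `SimpleComposableMemory`, we **inject the messages from the `secondary` sources into the system message of the `primary` memory**. The rest of the message history of the `primary` source is left intact, and this composition is what is ultimately returned.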
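
As a rough illustration of this composition step, the sketch below merges retrieved messages into the leading system message. The function `compose` and the `(role, content)` tuple representation are assumptions made for this example, and the template text is copied from the printed output in this notebook rather than from the library source.

```python
DEFAULT_SYSTEM = "You are a helpful assistant."  # default seen in this notebook's output


def compose(primary, retrieved_per_source):
    """Merge retrieved secondary messages into the primary system message.

    primary: list of (role, content) tuples from the chat buffer.
    retrieved_per_source: one list of (role, content) tuples per secondary source.
    """
    # Nothing retrieved: return the primary history unchanged.
    if not any(retrieved_per_source):
        return list(primary)

    lines = [
        "Below are a set of relevant dialogues retrieved from "
        "potentially several memory sources:",
        "",
    ]
    for i, retrieved in enumerate(retrieved_per_source, start=1):
        lines.append(f"=====Relevant messages from memory source {i}=====")
        lines.append("")
        lines.extend(f"\t{role.upper()}: {content}" for role, content in retrieved)
        lines.append("")
    lines.append("This is the end of the retrieved message dialogues.")
    block = "\n".join(lines)

    # Append the block to the existing system message, or create one if absent.
    if primary and primary[0][0] == "system":
        system_text, rest = primary[0][1], list(primary[1:])
    else:
        system_text, rest = DEFAULT_SYSTEM, list(primary)
    return [("system", system_text + "\n\n" + block)] + rest


result = compose(
    [("system", "You are a REALLY helpful assistant."), ("user", "Jerry likes juice.")],
    [[("user", "Bob likes burgers."), ("assistant", "Indeed, Bob likes apples.")]],
)
```

Only the first element of `result` changes; the user message that follows it is passed through untouched.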


In [None]:
msgs = composable_memory.get("What does Bob like?")
msgs

 ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]

In [None]:
# see the memory injected into the system message of the primary memory
print(msgs[0])

system: You are a REALLY helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: Bob likes burgers.
	ASSISTANT: Indeed, Bob likes apples.


This is the end of the retrieved message dialogues.


### Successive calls to `get()`


Successive calls to `get()` simply replace the `secondary` memory messages loaded into the system prompt.
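
The reason nothing accumulates is that the injected block is rebuilt from scratch for every query. The sketch below shows the idea with a toy retriever that ranks by shared words; this is an assumption standing in for the embedding similarity that `VectorMemory` uses, and the names `retrieve` and `get` are hypothetical.

```python
def retrieve(store, query, top_k=1):
    """Toy retriever: rank stored messages by words shared with the query.

    Word overlap is only a stand-in for embedding similarity.
    """
    query_words = set(query.lower().replace("?", "").split())

    def overlap(text):
        return len(query_words & set(text.lower().replace(".", "").split()))

    return sorted(store, key=overlap, reverse=True)[:top_k]


def get(system_prompt, store, query):
    """Build a fresh system message per query; the stored prompt is never mutated."""
    hits = retrieve(store, query)
    return system_prompt + "\n\nRetrieved: " + " | ".join(hits)


store = ["Bob likes burgers.", "Alice likes apples."]
system_prompt = "You are a REALLY helpful assistant."

first = get(system_prompt, store, "What does Bob like?")
second = get(system_prompt, store, "What does Alice like?")
```

Here `second` contains only the message about Alice: the message about Bob retrieved for `first` does not carry over.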


In [None]:
msgs = composable_memory.get("What does Alice like?")
msgs

 ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]

In [None]:
# see the memory injected into the system message of the primary memory
print(msgs[0])

system: You are a REALLY helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: Alice likes apples.


This is the end of the retrieved message dialogues.


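### What if `get()` retrieves `secondary` messages that already exist in `primary` memory?


If messages retrieved from `secondary` memory already exist in `primary` memory, these redundant `secondary` messages are not added to the system message. In the example below, the message "Jerry likes juice." was `put` into all memory sources, so the system message is not altered.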
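
One way to picture this filtering is shown below. The helper `drop_known` and the `(role, content)` tuples are hypothetical, and the library's actual matching criterion may be different; this is only a sketch of the idea.

```python
def drop_known(primary, retrieved):
    """Keep only retrieved messages that are not already in the primary history."""
    known = set(primary)
    return [message for message in retrieved if message not in known]


primary = [
    ("system", "You are a REALLY helpful assistant."),
    ("user", "Jerry likes juice."),
]

# The retriever returns a message that was also put into the primary buffer.
already_known = drop_known(primary, [("user", "Jerry likes juice.")])

# A message that lives only in long-term memory survives the filter.
new_context = drop_known(primary, [("user", "Alice likes apples.")])
```

When the filtered list is empty, as for `already_known`, there is nothing to inject and the system message stays as it was.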


In [None]:
msgs = composable_memory.get("What does Jerry like?")
msgs

[ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.', additional_kwargs={}),
 ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]

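### How to `reset` memory


Similar to the other methods `put()` and `get()`, calling `reset()` executes `reset()` on both the `primary` and `secondary` memory sources. If you want to reset only the `primary`, call `reset()` on it directly.


#### `reset()` only primary memory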


In [None]:
composable_memory.primary_memory.reset()

In [None]:
composable_memory.primary_memory.get()

[]

In [None]:
composable_memory.secondary_memory_sources[0].get("What does Alice like?")

[ChatMessage(role=<MessageRole.USER: 'user'>, content='Alice likes apples.', additional_kwargs={})]

#### `reset()` all memory sources


In [None]:
composable_memory.reset()

In [None]:
composable_memory.primary_memory.get()

[]

In [None]:
composable_memory.secondary_memory_sources[0].get("What does Alice like?")

[]

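## Use `SimpleComposableMemory` with an agent


Here we use a `SimpleComposableMemory` with an agent and demonstrate how messages from one agent conversation can serve as a secondary, long-term memory source in a conversation with another agent session.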


In [None]:
from llama_index.llms.openai import OpenAI
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgentWorker

import nest_asyncio

nest_asyncio.apply()

### Define our memory modules


In [None]:
vector_memory = VectorMemory.from_defaults(
    vector_store=None,  # leave as None to use the default in-memory vector store
    embed_model=OpenAIEmbedding(),
    retriever_kwargs={"similarity_top_k": 2},
)

chat_memory_buffer = ChatMemoryBuffer.from_defaults()

composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=chat_memory_buffer,
    secondary_memory_sources=[vector_memory],
)

### Define our agent


In [None]:
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result integer."""
    return a * b


def mystery(a: int, b: int) -> int:
    """Mystery function on two numbers."""
    return a**2 - b**2


multiply_tool = FunctionTool.from_defaults(fn=multiply)
mystery_tool = FunctionTool.from_defaults(fn=mystery)

In [None]:
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent_worker = FunctionCallingAgentWorker.from_tools(
    [multiply_tool, mystery_tool], llm=llm, verbose=True
)
agent = agent_worker.as_agent(memory=composable_memory)

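### Execute some function calls


When `.chat()` is invoked, the messages are put into the composable memory, which, as we saw in the previous section, means that all messages are put into both the `primary` and `secondary` sources.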


In [None]:
response = agent.chat("What is the mystery function on 5 and 6?")

Added user message to memory: What is the mystery function on 5 and 6?
=== Calling Function ===
Calling function: mystery with args: {"a": 5, "b": 6}
=== Function Output ===
-11
=== LLM Response ===
The mystery function on 5 and 6 returns -11.


In [None]:
response = agent.chat("What happens if you multiply 2 and 3?")

Added user message to memory: What happens if you multiply 2 and 3?
=== Calling Function ===
Calling function: multiply with args: {"a": 2, "b": 3}
=== Function Output ===
6
=== LLM Response ===
If you multiply 2 and 3, the result is 6.


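### New agent sessions


Now that we've added the messages to our `vector_memory`, we can see the effect of using this memory with a new agent session. Specifically, we ask the new agents to "recall" the outputs of the function calls rather than recompute them.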


#### An agent without our past memory

This agent is created without our composable memory, so it cannot draw on the messages from the earlier session.


In [None]:
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent_worker = FunctionCallingAgentWorker.from_tools(
    [multiply_tool, mystery_tool], llm=llm, verbose=True
)
agent_without_memory = agent_worker.as_agent()

In [None]:
response = agent_without_memory.chat(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)

Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
=== LLM Response ===
I'm sorry, but I don't have access to the previous output of the mystery function on 5 and 6.


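#### An agent with our past memory

As we saw, the agent without access to our past memory cannot complete the task. For the next agent we pass in our previous long-term memory (i.e., `vector_memory`). Note that we even use a fresh `ChatMemoryBuffer`, which means this agent has no `chat_history`. Nonetheless, it can retrieve the past dialogue it needs from our long-term memory.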
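
The next cell passes a deep copy of `vector_memory` rather than the original. The sketch below illustrates why a copy keeps the new session's writes away from the original store. It uses plain lists and `copy.deepcopy` as a stand-in, on the assumption that the memory object's own `copy(deep=True)` behaves comparably; the `put` helper is hypothetical.

```python
import copy

# Long-term store filled during the earlier session.
long_term = [
    "USER: What is the mystery function on 5 and 6?",
    "ASSISTANT: The mystery function on 5 and 6 returns -11.",
]

session_store = copy.deepcopy(long_term)  # isolated copy for the new session
session_buffer = []  # fresh primary buffer: no chat history yet


def put(message):
    """Write to the new session's primary buffer and to its copied store."""
    session_buffer.append(message)
    session_store.append(message)


put("USER: What was the output of the mystery function on 5 and 6 again?")
```

The new session can still read the two earlier messages from `session_store`, while `long_term` itself keeps only its original two entries.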


In [None]:
llm = OpenAI(model="gpt-3.5-turbo-0613")
agent_worker = FunctionCallingAgentWorker.from_tools(
    [multiply_tool, mystery_tool], llm=llm, verbose=True
)
composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=ChatMemoryBuffer.from_defaults(),
    secondary_memory_sources=[
        # using a copy here for illustration purposes;
        # later we will use the original vector_memory again
        vector_memory.copy(deep=True)
    ],
)
agent_with_memory = agent_worker.as_agent(memory=composable_memory)

In [None]:
agent_with_memory.chat_history  # an empty chat history

[]

In [None]:
response = agent_with_memory.chat(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)

Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
=== LLM Response ===
The output of the mystery function on 5 and 6 is -11.


In [None]:
response = agent_with_memory.chat(
    "What was the output of the multiply function on 2 and 3 again? Don't recompute."
)

Added user message to memory: What was the output of the multiply function on 2 and 3 again? Don't recompute.
=== LLM Response ===
The output of the multiply function on 2 and 3 is 6.


In [None]:
agent_with_memory.chat_history

[ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the mystery function on 5 and 6 again? Don't recompute.", additional_kwargs={}),
 ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the mystery function on 5 and 6 is -11.', additional_kwargs={}),
 ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the multiply function on 2 and 3 again? Don't recompute.", additional_kwargs={}),
 ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the multiply function on 2 and 3 is 6.', additional_kwargs={})]

### What happens under the hood with `.chat(user_input)`


Under the hood, `.chat(user_input)` calls the memory's `.get()` method with `user_input` as the argument. As we learned in the previous section, this ultimately returns a composition of the `primary` and all of the `secondary` memory sources. These composed messages are what is passed to the LLM's chat API as the chat history.
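
A heavily simplified sketch of that flow follows. The `chat` function, `ToyMemory`, and `fake_llm` are hypothetical names for illustration only; the real agent worker also handles tool calls, and its exact ordering of steps may differ.

```python
def chat(user_input, memory, llm):
    """Simplified chat turn: read composed history, call the LLM, store the turn."""
    history = memory.get(user_input)  # the query drives secondary retrieval
    reply = llm(history + [("user", user_input)])
    memory.put(("user", user_input))
    memory.put(("assistant", reply))
    return reply


class ToyMemory:
    """Hypothetical memory: a fixed system prompt plus a message log."""

    def __init__(self):
        self.log = []

    def get(self, query):
        # A composable memory would also inject retrieved context here.
        return [("system", "You are a helpful assistant.")] + self.log

    def put(self, message):
        self.log.append(message)


seen = []  # records what the fake LLM receives


def fake_llm(messages):
    seen.append(list(messages))
    return "ok"


memory = ToyMemory()
reply = chat("What was the output again?", memory, fake_llm)
```

The point to notice is that the LLM never sees the raw stores: it only receives whatever `memory.get(user_input)` returns, followed by the new user message.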


In [None]:
composable_memory = SimpleComposableMemory.from_defaults(
    primary_memory=ChatMemoryBuffer.from_defaults(),
    secondary_memory_sources=[
        # copy made to illustrate what happened under the hood
        # in the previous subsection
        vector_memory.copy(deep=True)
    ],
)
agent_with_memory = agent_worker.as_agent(memory=composable_memory)

In [None]:
agent_with_memory.memory.get(
    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
)



In [None]:
print(
    agent_with_memory.memory.get(
        "What was the output of the mystery function on 5 and 6 again? Don't recompute."
    )[0]
)

system: You are a helpful assistant.

Below are a set of relevant dialogues retrieved from potentially several memory sources:

=====Relevant messages from memory source 1=====

	USER: What is the mystery function on 5 and 6?
	ASSISTANT: None
	TOOL: -11
	ASSISTANT: The mystery function on 5 and 6 returns -11.


This is the end of the retrieved message dialogues.
