# LangChain: Memory

This section shows how to use LangChain so that a chatbot can remember the context of a conversation.

## Environment setup

In [None]:
!pip install python-dotenv
!pip install openai
!pip install --upgrade langchain

In [None]:
%env OPENAI_API_KEY=<your-openai-api-key>

In [4]:
import os
import openai

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']

## ConversationBufferMemory

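The conclusions first:

- A language model does not, by itself, keep any record of its conversation with the user; every API request is independent.
- `ConversationBufferMemory` works by storing the complete conversation between the AI and the Human in a buffer; every time the Human asks something, the entire conversation history is sent along automatically.
- As the conversation gets longer, the memory needed to store it grows as well, and so does the cost of sending that many tokens to the LLM.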
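
The three points above can be illustrated without calling any API. The following is a minimal sketch, not LangChain's actual implementation: `SimpleBufferMemory`, `fake_llm`, and `predict` are hypothetical names introduced here, and the stand-in model only reports how long its prompt was. It shows that the "memory" amounts to replaying the full transcript in every request, so the prompt grows with each turn.

```python
class SimpleBufferMemory:
    """Hypothetical stand-in for a buffer memory: stores every turn verbatim."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) pairs

    def save_context(self, human_text, ai_text):
        self.turns.append((human_text, ai_text))

    @property
    def buffer(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


def fake_llm(prompt):
    """Stateless stand-in for a model call: it only sees the prompt it is given."""
    return f"(reply to a prompt of {len(prompt)} characters)"


def predict(memory, user_input):
    # Every request carries the entire stored history plus the new input.
    history = memory.buffer
    prompt = (history + "\n" if history else "") + f"Human: {user_input}\nAI:"
    reply = fake_llm(prompt)
    memory.save_context(user_input, reply)
    return prompt


memory = SimpleBufferMemory()
prompt_lengths = [
    len(predict(memory, text))
    for text in ["Hi, my name is DwD", "What is 1+1?", "What is my name?"]
]
print(prompt_lengths)  # each prompt is longer than the previous one
```

Because the whole buffer is resent on every turn, the number of tokens sent over a conversation grows faster than the number of turns, which is the cost concern raised in the last bullet.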


In [5]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

You do not need to understand how `ConversationChain` works for now; later chapters cover it.

In [6]:
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferMemory()
conversation = ConversationChain(
  llm=llm,
  memory=memory,
  verbose=True # set verbose to True to see the details of what LangChain is doing
)

In [7]:
conversation.predict(input="Hi, my name is DwD")



> Entering new chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi, my name is DwD
AI:

> Finished chain.


"Hello DwD, it's nice to meet you! My name is OpenAI. How can I assist you today?"

In [8]:
conversation.predict(input="What is 1+1?")



> Entering new chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is DwD
AI: Hello DwD, it's nice to meet you! My name is OpenAI. How can I assist you today?
Human: What is 1+1?
AI:

> Finished chain.


'The answer to 1+1 is 2.'

In [9]:
conversation.predict(input="What is my name?")



> Entering new chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is DwD
AI: Hello DwD, it's nice to meet you! My name is OpenAI. How can I assist you today?
Human: What is 1+1?
AI: The answer to 1+1 is 2.
Human: What is my name?
AI:

> Finished chain.


'Your name is DwD, as you mentioned earlier.'

`memory.buffer` returns the full conversation history recorded so far.

In [10]:
print(memory.buffer)

Human: Hi, my name is DwD
AI: Hello DwD, it's nice to meet you! My name is OpenAI. How can I assist you today?
Human: What is 1+1?
AI: The answer to 1+1 is 2.
Human: What is my name?
AI: Your name is DwD, as you mentioned earlier.


You can also retrieve the conversation history this way:

In [11]:
memory.load_memory_variables({})

{'history': "Human: Hi, my name is DwD\nAI: Hello DwD, it's nice to meet you! My name is OpenAI. How can I assist you today?\nHuman: What is 1+1?\nAI: The answer to 1+1 is 2.\nHuman: What is my name?\nAI: Your name is DwD, as you mentioned earlier."}

Here `load_memory_variables` is called with an empty dictionary. The function also supports more advanced customization, which this section does not cover.

Besides writing context into memory through a conversation, you can also write it manually, which appends to the buffer:

In [12]:
memory.save_context(
  {"input": "How's the weather today?"},
  {"output": "Today is a sunny day."}
)

print(memory.buffer)

Human: Hi, my name is DwD
AI: Hello DwD, it's nice to meet you! My name is OpenAI. How can I assist you today?
Human: What is 1+1?
AI: The answer to 1+1 is 2.
Human: What is my name?
AI: Your name is DwD, as you mentioned earlier.
Human: How's the weather today?
AI: Today is a sunny day.


## ConversationBufferWindowMemory

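Purpose: as the name suggests, this class adds a `window`, which caps its capacity. `ConversationBufferWindowMemory` keeps only the most recent few rounds of conversation, which prevents the history from growing without bound.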

In [13]:
from langchain.memory import ConversationBufferWindowMemory

In [14]:
llm = ChatOpenAI(temperature=0.0)
memory = ConversationBufferWindowMemory(k=1) # k sets the window size; here only the most recent round of conversation is kept
conversation = ConversationChain(
  llm=llm,
  memory=memory,
  verbose=True # set verbose to True to see the details of what LangChain is doing
)

In [15]:
memory.save_context(
  {"input": "How's the weather today?"},
  {"output": "Today is a sunny day."}
)
memory.save_context(
  {"input": "My name is DwD."},
  {"output": "Nice to meet you!"}
)

memory.load_memory_variables({})

{'history': 'Human: My name is DwD.\nAI: Nice to meet you!'}

As you can see, although two exchanges were written, the memory holds only one.
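
The windowing behavior can be sketched in plain Python. This is an illustrative sketch only: `SimpleWindowMemory` is a hypothetical class, and LangChain's internals may differ (for example in whether older turns are discarded when saved or only hidden when loaded). It relies on `collections.deque` with `maxlen`, which drops the oldest item once the window is full.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical sketch: keeps only the last k (human, ai) exchanges."""

    def __init__(self, k):
        # A deque with maxlen silently drops the oldest item when full.
        self.turns = deque(maxlen=k)

    def save_context(self, human_text, ai_text):
        self.turns.append((human_text, ai_text))

    def load_memory_variables(self):
        history = "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)
        return {"history": history}


window = SimpleWindowMemory(k=1)
window.save_context("How's the weather today?", "Today is a sunny day.")
window.save_context("My name is DwD.", "Nice to meet you!")
print(window.load_memory_variables())
# {'history': 'Human: My name is DwD.\nAI: Nice to meet you!'}
```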

## ConversationTokenBufferMemory

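Purpose: most LLM APIs bill by the token, so `ConversationTokenBufferMemory` limits the length of the history in tokens. With a limit of 50, for example, it keeps the most recent history that fits within 50 tokens (not necessarily exactly 50; the total just must not exceed the limit).

Note that this requires installing a new dependency.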

In [16]:
!pip install tiktoken

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tiktoken
  Downloading tiktoken-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB)
Installing collected packages: tiktoken
Successfully installed tiktoken-0.4.0


In [17]:
from langchain.memory import ConversationTokenBufferMemory

Note the `llm` parameter: it is needed because different LLMs count tokens differently.

In [18]:
llm = ChatOpenAI(temperature=0.0)
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
conversation = ConversationChain(
  llm=llm,
  memory=memory,
  verbose=True # set verbose to True to see the details of what LangChain is doing
)

In [19]:
memory.save_context(
  {"input": "How's the weather today?"},
  {"output": "Today is a sunny day."}
)
memory.save_context(
  {"input": "My name is DwD."},
  {"output": "Nice to meet you!"}
)

memory.load_memory_variables({})

{'history': 'Human: My name is DwD.\nAI: Nice to meet you!'}
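
The pruning idea can be sketched as follows. This is a simplified illustration, not LangChain's code: `SimpleTokenBufferMemory` and `count_tokens` are hypothetical, and the counter assumes one token per whitespace-separated word, whereas a real model's tokenizer would give different counts. The oldest messages are dropped one at a time until the buffer fits the limit.

```python
def count_tokens(text):
    # Assumption for illustration only: one token per whitespace-separated word.
    # Real models use their own tokenizers, so real counts will differ.
    return len(text.split())


class SimpleTokenBufferMemory:
    """Hypothetical sketch: keeps the newest messages that fit a token budget."""

    def __init__(self, max_token_limit, counter=count_tokens):
        self.max_token_limit = max_token_limit
        self.counter = counter
        self.messages = []  # flat list of "Human: ..." / "AI: ..." strings

    def _total_tokens(self):
        return sum(self.counter(m) for m in self.messages)

    def save_context(self, human_text, ai_text):
        self.messages.append(f"Human: {human_text}")
        self.messages.append(f"AI: {ai_text}")
        # Drop the oldest messages until the buffer fits within the limit.
        while self.messages and self._total_tokens() > self.max_token_limit:
            self.messages.pop(0)

    def load_memory_variables(self):
        return {"history": "\n".join(self.messages)}


token_memory = SimpleTokenBufferMemory(max_token_limit=12)
token_memory.save_context("How's the weather today?", "Today is a sunny day.")
token_memory.save_context("My name is DwD.", "Nice to meet you!")
print(token_memory.load_memory_variables())
# {'history': 'Human: My name is DwD.\nAI: Nice to meet you!'}
```

Passing the counter in as a parameter mirrors the reason the real class takes `llm`: the budget only makes sense relative to a specific way of counting tokens.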

## ConversationSummaryBufferMemory

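Purpose: summarize the conversation history so far from a third-person perspective, so that the history placed in the prompt stays within the specified `max_token_limit`.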

In [20]:
from langchain.memory import ConversationSummaryBufferMemory

In [23]:
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the italian resturant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, {"output": f"{schedule}"})

In [24]:
memory.load_memory_variables({})

{'history': "System: The human and AI engage in small talk before discussing the day's schedule. The AI informs the human of a morning meeting with the product team, time to work on the LangChain project, and a lunch meeting with a customer interested in the latest AI developments."}
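
The mechanism behind this output can be approximated with a small sketch. Everything here is hypothetical and simplified: `SimpleSummaryBufferMemory` is not LangChain's class, `count_tokens` assumes one token per whitespace-separated word, and `stub_summarize` stands in for the LLM call that would write a real summary. The point it illustrates is that messages evicted from the buffer are folded into a running summary instead of being discarded.

```python
def count_tokens(text):
    # Assumption for illustration: one token per whitespace-separated word.
    return len(text.split())


def stub_summarize(previous_summary, dropped_messages):
    # Placeholder for the LLM call: keep only the first three words of each message.
    gists = [" ".join(m.split()[:3]) + " ..." for m in dropped_messages]
    return " | ".join(([previous_summary] if previous_summary else []) + gists)


class SimpleSummaryBufferMemory:
    """Hypothetical sketch: recent messages verbatim, older ones summarized."""

    def __init__(self, max_token_limit, summarize=stub_summarize):
        self.max_token_limit = max_token_limit
        self.summarize = summarize
        self.summary = ""
        self.messages = []

    def save_context(self, human_text, ai_text):
        self.messages += [f"Human: {human_text}", f"AI: {ai_text}"]
        dropped = []
        while self.messages and sum(map(count_tokens, self.messages)) > self.max_token_limit:
            dropped.append(self.messages.pop(0))
        if dropped:
            # Fold the evicted messages into the running summary.
            self.summary = self.summarize(self.summary, dropped)

    def load_memory_variables(self):
        parts = ([f"System: {self.summary}"] if self.summary else []) + self.messages
        return {"history": "\n".join(parts)}


summary_memory = SimpleSummaryBufferMemory(max_token_limit=8)
summary_memory.save_context("Hello", "What's up")
summary_memory.save_context("Not much, just hanging", "Cool")
print(summary_memory.load_memory_variables()["history"])
```

In the notebook run above, the schedule message alone exceeds the 100-token limit, which would explain why the returned history contains only a `System:` summary and no verbatim messages.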

## Vector data memory

## Entity memories