[Feature] Implementation of token buffer memory #3211

Merged · 19 commits · Nov 14, 2023
20 changes: 20 additions & 0 deletions docs/docs/modules/memory/how_to/buffer_token_memory.mdx
@@ -0,0 +1,20 @@
# Conversation token buffer memory

This notebook covers how to use `ConversationTokenBufferMemory`. This memory keeps a buffer of recent interactions in memory, and uses token length rather than number of interactions to determine when to flush interactions.

Below is a basic example of how to use token buffer memory.

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/memory/token_buffer.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>

We can also get the history as a list of messages, useful if you are using this with `MessagesPlaceholder` in a chat prompt template.

```typescript
const memory = new ConversationTokenBufferMemory({
llm: model,
maxTokenLimit: 10,
returnMessages: true,
});
```
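
For instance, here is a rough sketch (illustrative only, not part of this PR's example file) of how the `returnMessages` form could be plugged into a chat prompt and chain; the `ChatPromptTemplate`, `MessagesPlaceholder`, and `ConversationChain` wiring below is assumed from the other memory how-to docs.

```typescript
import { ChatOpenAI } from "langchain/chat_models/openai";
import { ChatPromptTemplate, MessagesPlaceholder } from "langchain/prompts";
import { ConversationChain } from "langchain/chains";
import { ConversationTokenBufferMemory } from "langchain/memory";

const chatModel = new ChatOpenAI({});

// The placeholder name must match the memory's `memoryKey` ("history" by default).
const chatPrompt = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful assistant."],
  new MessagesPlaceholder("history"),
  ["human", "{input}"],
]);

const chatMemory = new ConversationTokenBufferMemory({
  llm: chatModel,
  maxTokenLimit: 100,
  returnMessages: true,
});

const chain = new ConversationChain({
  llm: chatModel,
  prompt: chatPrompt,
  memory: chatMemory,
});

console.log(await chain.call({ input: "Hi, I'm Jonas!" }));
```

As long as the placeholder name matches the memory's `memoryKey`, the (token-pruned) message list is injected into the prompt on each call.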
18 changes: 18 additions & 0 deletions examples/src/memory/token_buffer.ts
@@ -0,0 +1,18 @@
import { OpenAI } from "langchain/llms/openai";
import { ConversationTokenBufferMemory } from "langchain/memory";

const model = new OpenAI({});
const memory = new ConversationTokenBufferMemory({
llm: model,
maxTokenLimit: 10,
});

await memory.saveContext({ input: "hi" }, { output: "whats up" });
await memory.saveContext({ input: "not much you" }, { output: "not much" });

const result1 = await memory.loadMemoryVariables({});
console.log(result1);

/*
{ history: 'Human: not much you\nAI: not much' }
*/
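// Only the most recent exchange remains: with `maxTokenLimit: 10`, saving the second
// exchange pushed the buffer past the limit, so the oldest messages were pruned.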
115 changes: 115 additions & 0 deletions langchain/src/memory/buffer_token_memory.ts
@@ -0,0 +1,115 @@
import {
InputValues,
MemoryVariables,
getBufferString,
OutputValues,
} from "./base.js";

import { BaseChatMemory, BaseChatMemoryInput } from "./chat_memory.js";
import { BaseLanguageModel } from "../base_language/index.js";

/**
* Interface for the input parameters of the `ConversationTokenBufferMemory` class.
*/

export interface ConversationTokenBufferMemoryInput
extends BaseChatMemoryInput {
/* Prefix for human messages in the buffer. */
humanPrefix?: string;

/* Prefix for AI messages in the buffer. */
aiPrefix?: string;

/* The LLM for this instance. */
llm: BaseLanguageModel;

/* Memory key for buffer instance. */
memoryKey?: string;

/* Maximum number of tokens allowed in the buffer. */
maxTokenLimit?: number;
}

/**
* Class that represents a conversation chat memory with a token buffer.
* It extends the `BaseChatMemory` class and implements the
* `ConversationTokenBufferMemoryInput` interface.
*/

export class ConversationTokenBufferMemory
extends BaseChatMemory
implements ConversationTokenBufferMemoryInput
{
humanPrefix = "Human";

aiPrefix = "AI";

memoryKey = "history";

maxTokenLimit = 2000; // Default max token limit of 2000, which can be overridden

llm: BaseLanguageModel;

constructor(fields: ConversationTokenBufferMemoryInput) {
super(fields);
this.llm = fields.llm;
this.humanPrefix = fields?.humanPrefix ?? this.humanPrefix;
this.aiPrefix = fields?.aiPrefix ?? this.aiPrefix;
this.memoryKey = fields?.memoryKey ?? this.memoryKey;
this.maxTokenLimit = fields?.maxTokenLimit ?? this.maxTokenLimit;
}

get memoryKeys() {
return [this.memoryKey];
}

/**
* Loads the memory variables. It takes an `InputValues` object as a
* parameter and returns a `Promise` that resolves with a
* `MemoryVariables` object.
* @param _values `InputValues` object.
* @returns A `Promise` that resolves with a `MemoryVariables` object.
*/
async loadMemoryVariables(_values: InputValues): Promise<MemoryVariables> {
const messages = await this.chatHistory.getMessages();
if (this.returnMessages) {
const result = {
[this.memoryKey]: messages,
};
return result;
}
const result = {
[this.memoryKey]: getBufferString(
messages,
this.humanPrefix,
this.aiPrefix
),
};
return result;
}

/**
* Saves the context from this conversation to the buffer. If the number
* of tokens in the buffer exceeds `maxTokenLimit`, the oldest messages
* are pruned until it fits.
*/
async saveContext(inputValues: InputValues, outputValues: OutputValues) {
await super.saveContext(inputValues, outputValues);

// Prune buffer if it exceeds the max token limit set for this instance.
const buffer = await this.chatHistory.getMessages();
let currBufferLength = await this.llm.getNumTokens(
getBufferString(buffer, this.humanPrefix, this.aiPrefix)
);

if (currBufferLength > this.maxTokenLimit) {
const prunedMemory = [];
while (currBufferLength > this.maxTokenLimit) {
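// Drop the oldest message (collected in `prunedMemory`, which is not used further here)
// and recount the buffer's tokens until it fits within `maxTokenLimit`.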
prunedMemory.push(buffer.shift());
currBufferLength = await this.llm.getNumTokens(
getBufferString(buffer, this.humanPrefix, this.aiPrefix)
);
}
}
}
}
4 changes: 4 additions & 0 deletions langchain/src/memory/index.ts
@@ -35,3 +35,7 @@ export {
ConversationSummaryBufferMemory,
type ConversationSummaryBufferMemoryInput,
} from "./summary_buffer.js";
export {
ConversationTokenBufferMemory,
type ConversationTokenBufferMemoryInput,
} from "./buffer_token_memory.js";
53 changes: 53 additions & 0 deletions langchain/src/memory/tests/buffer_token_memory.int.test.ts
@@ -0,0 +1,53 @@
import { test, expect } from "@jest/globals";
import { OpenAI } from "../../llms/openai.js";
import { ConversationTokenBufferMemory } from "../buffer_token_memory.js";
import { ChatMessageHistory } from "../../stores/message/in_memory.js";
import { HumanMessage, AIMessage } from "../../schema/index.js";

test("Test buffer token memory with LLM", async () => {
const memory = new ConversationTokenBufferMemory({
llm: new OpenAI(),
maxTokenLimit: 10,
});
const result1 = await memory.loadMemoryVariables({});
expect(result1).toStrictEqual({ history: "" });

await memory.saveContext({ input: "foo" }, { output: "bar" });
const expectedString = "Human: foo\nAI: bar";
const result2 = await memory.loadMemoryVariables({});
expect(result2).toStrictEqual({ history: expectedString });

await memory.saveContext({ foo: "foo" }, { bar: "bar" });
await memory.saveContext({ foo: "bar" }, { bar: "foo" });
const expectedString3 = "Human: bar\nAI: foo";
const result3 = await memory.loadMemoryVariables({});
expect(result3).toStrictEqual({ history: expectedString3 });
});

test("Test buffer token memory return messages", async () => {
const memory = new ConversationTokenBufferMemory({
llm: new OpenAI(),
returnMessages: true,
});
const result1 = await memory.loadMemoryVariables({});
expect(result1).toStrictEqual({ history: [] });

await memory.saveContext({ foo: "bar" }, { bar: "foo" });
const expectedResult = [new HumanMessage("bar"), new AIMessage("foo")];
const result2 = await memory.loadMemoryVariables({});
expect(result2).toStrictEqual({ history: expectedResult });
});

test("Test buffer token memory with pre-loaded history", async () => {
const pastMessages = [
new HumanMessage("My name's Jonas"),
new AIMessage("Nice to meet you, Jonas!"),
];
const memory = new ConversationTokenBufferMemory({
llm: new OpenAI(),
returnMessages: true,
chatHistory: new ChatMessageHistory(pastMessages),
});
const result = await memory.loadMemoryVariables({});
expect(result).toStrictEqual({ history: pastMessages });
});