agent-memory

TypeScript library providing persistent memory for AI agents — conversation history, long-term memory with vector search, knowledge base, and automatic fact extraction.

English | 中文

Design Document: English | 中文

Why agent-memory?

Most AI agent frameworks lack built-in persistent memory. agent-memory fills this gap with a production-ready, embedded memory system for LLM-powered agents and chatbots. No external databases required — just npm install and go.

Works with OpenAI, Anthropic, LangChain, and any LLM/embedding provider
Ideal for building RAG (Retrieval-Augmented Generation) pipelines
Drop-in context management with automatic token budgeting
Local-first: all data stays on your machine via SQLite + HNSW vector index

Features

Three-layer memory: Working (transient) → Conversation (session) → Long-term (persistent)
Knowledge base: Pre-loaded reference documents with ref-only injection and on-demand full-text loading
Hybrid retrieval: Keyword + vector search across conversations, memories, and knowledge base
Token budget: Context assembly with automatic ranking and budget-aware truncation
Natural forgetting: Access-based decay simulating human memory curves
Embedded storage: SQLite + HNSW vector index, no external database or service required
LLM tool integration: Export tool definitions for OpenAI / Anthropic / LangChain
CLI included: Built-in command-line tool for debugging and data management
Provider-agnostic: Ships with a built-in local embedding model and accepts injectable custom LLM and embedding providers

Install

npm install agent-memory

Requires Node.js >= 18.

Quick Start

import { createMemory } from 'agent-memory';

const memory = await createMemory();

// Append conversation messages
await memory.appendMessage('user', 'I prefer TypeScript over JavaScript');
await memory.appendMessage('assistant', 'Noted! I will use TypeScript in examples.');

// Save a fact to long-term memory
await memory.saveMemory('preference', 'language', 'User prefers TypeScript');

// Assemble context for the next LLM call
const ctx = await memory.assembleContext('What language should I use?');
// ctx.text → formatted memory context ready for prompt injection
// ctx.tokenCount → tokens used
// ctx.sources → retrieval audit trail

// Clean up
await memory.close();
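
The Quick Start stops at ctx.text. As a minimal sketch of the "prompt injection" step, the helper below places that text into a system prompt. The ChatMessage shape, the buildMessages name, and the "# Memory context" heading are illustrative assumptions, not part of agent-memory; adapt them to whatever chat format your LLM SDK expects.

```typescript
// Hypothetical message shape; not defined by agent-memory.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Build the message list for the next LLM call, placing the assembled
// memory context (ctx.text in the Quick Start) inside the system prompt.
function buildMessages(
  systemPrompt: string,
  memoryContext: string,
  userInput: string,
): ChatMessage[] {
  const system =
    memoryContext.trim().length > 0
      ? `${systemPrompt}\n\n# Memory context\n${memoryContext}`
      : systemPrompt; // nothing retrieved: send the base prompt unchanged
  return [
    { role: 'system', content: system },
    { role: 'user', content: userInput },
  ];
}
```

In the Quick Start this would be called as buildMessages(basePrompt, ctx.text, 'What language should I use?'), where basePrompt is your own system prompt.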

Configuration

Every option is optional and has a sensible default:

const memory = await createMemory({
  // Data directory (default: $AGENT_MEMORY_DATA_DIR || './memoryData')
  dataDir: './my-agent-data',

  // Custom embedding provider (default: built-in all-MiniLM-L6-v2, 384d)
  embedding: myEmbeddingProvider,

  // LLM provider for archive summaries & fact extraction (default: none)
  llm: myLLMProvider,

  // Token budget
  tokenBudget: {
    contextWindow: 128000,
    systemPromptReserve: 2000,
    outputReserve: 1000,
  },

  // Archive scheduler
  archive: {
    quietMinutes: 5,
    windowHours: 24,
    minBatch: 5,
    maxBatch: 20,
  },

  // Decay / forgetting
  decay: {
    dormantAfterDays: 90,
    expireAfterDays: 180,
  },

  // Capacity limits
  limits: {
    maxConversationMessages: 500,
    maxLongTermMemories: 1000,
  },

  // Callback on decay warning
  onDecayWarning: (item) => console.log('Decaying:', item.key),
});
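
To make the decay thresholds concrete, here is a rough sketch of how dormantAfterDays and expireAfterDays could map a memory to a state. It assumes the thresholds are measured from the last access time (in line with "access-based decay" above); the library's actual rule may differ, and classifyDecay and DecayState are hypothetical names.

```typescript
type DecayConfig = { dormantAfterDays: number; expireAfterDays: number };
type DecayState = 'active' | 'dormant' | 'expired';

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Classify a memory by how long it has gone without being accessed.
// Assumption: both thresholds count days since the last access.
function classifyDecay(lastAccessedAt: Date, now: Date, cfg: DecayConfig): DecayState {
  const idleDays = (now.getTime() - lastAccessedAt.getTime()) / MS_PER_DAY;
  if (idleDays >= cfg.expireAfterDays) return 'expired';
  if (idleDays >= cfg.dormantAfterDays) return 'dormant';
  return 'active';
}

// Example values taken from the configuration above.
const decayCfg: DecayConfig = { dormantAfterDays: 90, expireAfterDays: 180 };
const referenceTime = new Date('2025-01-01T00:00:00Z');
const daysAgo = (n: number): Date => new Date(referenceTime.getTime() - n * MS_PER_DAY);
```

Under these assumptions, a memory last touched 10 days ago stays active, one idle for 90 days becomes dormant, and one idle for 180 days or more is expired.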

Knowledge Base

Pre-load reference documents that the agent can access on demand:

// Add knowledge chunks
await memory.addKnowledge('api-docs', 'Authentication', 'All requests require Bearer token...');
await memory.addKnowledgeBatch([
  { source: 'faq', title: 'Pricing', content: 'Plans start at...' },
  { source: 'faq', title: 'Refunds', content: 'Refunds within 30 days...' },
]);

// Search knowledge
const results = await memory.searchKnowledge('how to authenticate', 5);

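// Replace a source: remove all of its chunks, then re-add the updated content
await memory.removeKnowledgeBySource('api-docs');
await memory.addKnowledge('api-docs', 'Authentication', 'All requests require Bearer token...');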

When assembleContext() runs, knowledge base results are injected as title + excerpt + reference ID only. The LLM can then call knowledge_read(id) to load full content on demand.
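
The ref-only format described above can be pictured with a small formatter. This is only a sketch: the KnowledgeHit fields, the 80-character excerpt length, and the bracketed ID layout are assumptions made for illustration, not the library's actual output format.

```typescript
// Hypothetical shape of a knowledge search hit; field names are illustrative.
type KnowledgeHit = { id: string; title: string; content: string };

// Render a hit as "reference ID + title + excerpt", leaving the full content out
// so the model has to request it explicitly (for example via knowledge_read).
function formatKnowledgeRef(hit: KnowledgeHit, excerptChars: number = 80): string {
  const flat = hit.content.replace(/\s+/g, ' ').trim(); // collapse whitespace
  const excerpt = flat.length > excerptChars ? `${flat.slice(0, excerptChars)}...` : flat;
  return `[${hit.id}] ${hit.title}: ${excerpt}`;
}
```

Keeping only a short excerpt in the prompt means a large knowledge base costs few tokens until the model decides a document is worth loading in full.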

LLM Tool Integration

Export memory operations as tool definitions for function calling:

// Get tool definitions for your LLM SDK
const tools = memory.getToolDefinitions('openai'); // or 'anthropic' | 'langchain'

// In your tool call handler
const result = await memory.executeTool('memory_search', { query: 'user preferences' });

Available tools: memory_search, memory_save, memory_list, memory_delete, memory_get_history, knowledge_read, knowledge_search.
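
A tool call handler usually has to validate what the model sent before dispatching it. The sketch below checks the tool name against the list above and parses the JSON argument string. The ToolExecutor interface is a structural assumption based on the executeTool example; parseToolCall and handleToolCall are hypothetical helpers, not library functions.

```typescript
// Tool names as listed in this README.
const TOOL_NAMES = [
  'memory_search', 'memory_save', 'memory_list', 'memory_delete',
  'memory_get_history', 'knowledge_read', 'knowledge_search',
] as const;
type ToolName = (typeof TOOL_NAMES)[number];

// Minimal structural view of the memory instance used here.
// Assumption: executeTool accepts a plain argument object, as in the example above.
interface ToolExecutor {
  executeTool(name: string, args: Record<string, unknown>): Promise<unknown>;
}

// Validate a model-issued tool call: known name, JSON-object arguments.
function parseToolCall(
  name: string,
  rawArgs: string,
): { name: ToolName; args: Record<string, unknown> } {
  if ((TOOL_NAMES as readonly string[]).indexOf(name) === -1) {
    throw new Error(`Unknown memory tool: ${name}`);
  }
  const parsed: unknown = JSON.parse(rawArgs); // throws on malformed JSON
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error('Tool arguments must be a JSON object');
  }
  return { name: name as ToolName, args: parsed as Record<string, unknown> };
}

// Route a validated call to the memory instance.
async function handleToolCall(
  memory: ToolExecutor,
  name: string,
  rawArgs: string,
): Promise<unknown> {
  const call = parseToolCall(name, rawArgs);
  return memory.executeTool(call.name, call.args);
}
```

Rejecting unknown names before dispatch keeps a hallucinated tool name from reaching the memory store.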

Custom Providers

Embedding Provider

const memory = await createMemory({
  embedding: {
    dimensions: 1536,
    async embed(text: string): Promise<number[]> {
      // Call OpenAI, Cohere, Ollama, etc.
      return await myEmbeddingAPI(text);
    },
  },
});
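
For offline tests it can help to plug in a provider that needs no network or model download. The sketch below follows the same { dimensions, embed } shape as the example above but uses a toy hashing scheme; hashEmbed, testEmbedding, and the choice of 64 dimensions are illustrative assumptions, and the vectors carry no real semantic meaning.

```typescript
const TEST_DIMENSIONS = 64;

// Deterministic toy embedding: hash each word into a bucket, then L2-normalize.
// Texts sharing words get overlapping buckets; nothing more is captured.
function hashEmbed(text: string, dimensions: number = TEST_DIMENSIONS): number[] {
  const vec = new Array<number>(dimensions).fill(0);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 2166136261; // FNV-1a offset basis
    for (let i = 0; i < word.length; i++) {
      h ^= word.charCodeAt(i);
      h = Math.imul(h, 16777619); // FNV-1a prime, 32-bit multiply
    }
    vec[(h >>> 0) % dimensions] += 1; // unsigned hash picks the bucket
  }
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? vec : vec.map((x) => x / norm);
}

// Same { dimensions, embed } shape as the provider above.
const testEmbedding = {
  dimensions: TEST_DIMENSIONS,
  async embed(text: string): Promise<number[]> {
    return hashEmbed(text, TEST_DIMENSIONS);
  },
};
```

Passing testEmbedding as the embedding option should keep test runs deterministic, at the cost of meaningless similarity scores.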

LLM Provider

Enables archive summarization and LLM-based fact extraction:

const memory = await createMemory({
  llm: {
    async generate(prompt: string): Promise<string> {
      return await myLLM.complete(prompt);
    },
  },
});

Dynamic Token Budget

Update the token budget at runtime, or override it for a single call:

// Update instance-level budget
memory.updateTokenBudget({ contextWindow: 32000 });

// Override for a single call
const ctx = await memory.assembleContext('query', { contextWindow: 16000 });
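
One way to reason about how an instance-level budget and a per-call override interact is sketched below. It assumes the override replaces fields one by one and that the tokens left for memory context are the context window minus both reserves; the library's internal arithmetic is not documented here, so treat resolveBudget and availableForContext as hypothetical helpers.

```typescript
type TokenBudget = {
  contextWindow: number;
  systemPromptReserve: number;
  outputReserve: number;
};

// Per-call override wins over instance-level settings, field by field.
function resolveBudget(instance: TokenBudget, override: Partial<TokenBudget> = {}): TokenBudget {
  return { ...instance, ...override };
}

// Assumption: memory context may use whatever remains after both reserves.
function availableForContext(b: TokenBudget): number {
  return Math.max(0, b.contextWindow - b.systemPromptReserve - b.outputReserve);
}

// Values from the Configuration section, with the 16000-token override above.
const instanceBudget: TokenBudget = {
  contextWindow: 128000,
  systemPromptReserve: 2000,
  outputReserve: 1000,
};
const perCallBudget = resolveBudget(instanceBudget, { contextWindow: 16000 });
```

With these numbers the instance-level budget leaves 125000 tokens for context, while the single-call override leaves 13000.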

CLI

The package includes a command-line tool for debugging and data management:

# Append a message
memory append user "I prefer dark mode"

# Search memories
memory search "user preferences"

# Assemble context
memory context "What does the user like?"

# Manage knowledge base
memory kb-add --source api --title Auth --file auth.md
memory kb-list --source api
memory kb-search "authentication"

# Stats & maintenance
memory stats
memory maintenance
memory export --output backup.json
memory import backup.json

Use --data-dir <path> to specify a custom data directory. Run memory help for the full command list.

Maintenance

// Manual archive + decay detection
const result = await memory.runMaintenance();

// Export / import
const data = await memory.export();
await memory.import(data);

// Permanently remove soft-deleted entries
await memory.purge();

Architecture

Three-layer memory + Knowledge Base
────────────────────────────────────
 L1  Working Memory     (in-process RAM, managed by Agent runtime)
 L2  Conversation Memory (SQLite, sliding window, auto-archive)
 L3  Long-term Memory    (SQLite + HNSW vectors, semantic search)
 KB  Knowledge Base      (SQLite + HNSW vectors, ref-only injection)

Retrieval: L2 keyword + L3 vector + KB vector → merge → rank → budget fill
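
The "merge → rank → budget fill" step can be sketched as a greedy selection. This is an illustration under stated assumptions, not the library's implementation: it assumes scores are already comparable across layers, that IDs are unique across layers, and that each candidate carries a precomputed token count. Candidate and fillBudget are hypothetical names.

```typescript
type Layer = 'conversation' | 'long-term' | 'knowledge';
type Candidate = { id: string; layer: Layer; score: number; tokens: number; text: string };

// Merge per-layer hits, rank by score, then greedily fill the token budget.
function fillBudget(groups: Candidate[][], budget: number): Candidate[] {
  // Merge: flatten all layers, dropping duplicate IDs.
  const seen = new Set<string>();
  const merged: Candidate[] = [];
  for (const group of groups) {
    for (const c of group) {
      if (seen.has(c.id)) continue;
      seen.add(c.id);
      merged.push(c);
    }
  }
  // Rank: highest score first.
  merged.sort((a, b) => b.score - a.score);
  // Budget fill: take each item that still fits, skip the ones that do not.
  const picked: Candidate[] = [];
  let used = 0;
  for (const c of merged) {
    if (used + c.tokens > budget) continue;
    picked.push(c);
    used += c.tokens;
  }
  return picked;
}
```

Skipping an oversized item instead of stopping lets smaller, lower-ranked items still use the remaining budget; a stricter variant could stop at the first item that does not fit.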

License

MIT

Keywords: AI agent memory, LLM memory, persistent memory, conversation history, vector search, semantic search, knowledge base, RAG, retrieval-augmented generation, fact extraction, token budget, context window management, SQLite vector database, HNSW, TypeScript AI library, chatbot memory, long-term memory, OpenAI memory, Anthropic memory, LangChain memory
