Performance

Optimization strategies and best practices for Memory Journal MCP Server.

Design Philosophy: Fast context retrieval is critical for AI workflows. Memory Journal is optimized for sub-second query response times, enabling AI to access project history without noticeable latency.

Performance Overview

Memory Journal achieves:

2-3 second startup (10x improvement from v1.0)
<10ms entry creation (typical)
<50ms full-text search (for 1000 entries)
<1s semantic search (reliable on first load after v1.2.1 fix)
No performance degradation from v2.x refactoring (all async operations maintained)

Startup Performance

Lazy Loading Optimization

Problem (early versions):

Startup time: 14 seconds
ML dependencies loaded eagerly

Solution:

Lazy initialization of ML model (@huggingface/transformers)
Model loads only on first semantic search
Startup: 2-3 seconds

Modular Architecture

v3.0.0 TypeScript architecture:

Modular TypeScript codebase with clear separation of concerns
Handler layer, manager layer, and database layer
Each module focused on a single responsibility

Performance Impact:

✅ No degradation - All async operations preserved
✅ Fast startup - 2-3 seconds
✅ No overhead from modularization
✅ Better maintainability - Easy to optimize specific components
✅ Full TypeScript strict mode - Zero type errors

The modular architecture makes it easier to identify and optimize performance bottlenecks!

Implementation (v3.0.0 TypeScript):

// Lazy initialization - model loads on first semantic search
private async ensureInitialized(): Promise<void> {
  if (this.initialized) return;

  // Load embeddings model (lazy, ~5s first time)
  this.embedder = await pipeline(
    'feature-extraction',
    'Xenova/all-MiniLM-L6-v2'
  );

  this.initialized = true;
}

Timeline:

Server startup: 2-3s
First semantic search: +5s (one-time model load)
Subsequent searches: <1s

Database Performance

SQLite Optimization

Memory Journal uses optimal SQLite settings:

-- better-sqlite3 uses native file-based database with WAL mode
PRAGMA journal_mode = WAL;           -- Write-Ahead Logging for concurrent reads
PRAGMA synchronous = NORMAL;         -- Balanced durability/performance
PRAGMA foreign_keys = ON;
PRAGMA busy_timeout = 30000;         -- 30s wait for locks

Benefits:

WAL mode: Concurrent readers with single writer
Native file-based: Direct disk access for reliable persistence
Efficient I/O for queries

Indexing Strategy

Indexes created:

-- Entry lookups
CREATE INDEX idx_memory_journal_timestamp ON memory_journal(timestamp);
CREATE INDEX idx_memory_journal_type ON memory_journal(entry_type);
CREATE INDEX idx_memory_journal_personal ON memory_journal(is_personal);
CREATE INDEX idx_memory_journal_deleted ON memory_journal(deleted_at);

-- Tag lookups
CREATE INDEX idx_tags_name ON tags(name);
CREATE INDEX idx_entry_tags_entry ON entry_tags(entry_id);
CREATE INDEX idx_entry_tags_tag ON entry_tags(tag_id);

-- Relationship lookups
CREATE INDEX idx_relationships_from ON relationships(from_entry_id);
CREATE INDEX idx_relationships_to ON relationships(to_entry_id);

Query optimization:

All queries use appropriate indexes
No full table scans (except ANALYZE)
Covering indexes where possible

Full-Text Search Performance

Full-text search speed:

Entries	Search Time
100	<5ms
1,000	<10ms
10,000	<50ms
100,000	<200ms

Optimization:

FTS5 queries with BM25 ranking
Query term escaping for safe search
Result limits (default 10)
Fallback to LIKE for queries with special characters

Transaction Management

Single Connection Pattern

Anti-pattern (causes locking):

async function updateEntry(...) {
  const db1 = getDb();
  db1.run("UPDATE memory_journal ...");
  // Nested connection - causes lock!
  await autoCreateTags(tags); // Opens separate db handle
}

Correct pattern:

async function updateEntry(...) {
  const db = getDb();
  db.run("UPDATE memory_journal ...");
  // Use same connection
  db.run("INSERT OR IGNORE INTO tags ...");
  // No nested connections = no locks
}

Benefits:

No database locking
Atomic transactions
Better concurrency

Thread-Safe Tag Creation

Batched race-safe pattern (v4.5+):

// Batch insert all tags at once — INSERT OR IGNORE prevents race conditions
const placeholders = tags.map(() => "(?, 0)").join(", ");
db.run(
  `INSERT OR IGNORE INTO tags (name, usage_count) VALUES ${placeholders}`,
  tags,
);

// Batch lookup all tag IDs in one query
const inClause = tags.map(() => "?").join(", ");
const result = db.exec(
  `SELECT id, name FROM tags WHERE name IN (${inClause})`,
  tags,
);

Benefits:

Batch insert + batch lookup (2 queries instead of 2N)
No duplicates, no conflicts
Multiple entries creating same tag handled atomically

Search Performance

Full-Text Search (FTS5)

Fastest option:

better-sqlite3 native file-based database
FTS5 with BM25 ranking
No external dependencies

Best practices:

// Use specific terms
search_entries({ query: "lazy loading pattern" }); // Fast

// Avoid overly broad
search_entries({ query: "a" }); // Slow (too many matches)

// Use limits
search_entries({ query: "optimization", limit: 10 }); // Fast

Date Range Search

Very fast:

Indexed by timestamp
Simple range query
~5ms typical

Best practices:

// Reasonable ranges
search_by_date_range({
  start_date: "2025-10-01",
  end_date: "2025-10-31", // 1 month
});

// Add filters for large ranges
search_by_date_range({
  start_date: "2025-01-01",
  end_date: "2025-12-31", // 1 year
  tags: ["performance"], // Filter!
});

Semantic Search

Performance characteristics:

Operation	Time
First search (model load)	~5s
Generate embedding	50-100ms
sqlite-vec search	10-50ms
Fetch entries	10-50ms
Total (subsequent)	70-200ms

Optimization:

Model cached after first load
sqlite-vec index in SQLite database
Batch entry fetching

Best practices:

// Use reasonable limits
semantic_search({ query: "...", limit: 10 }); // Fast

// Higher thresholds = faster
semantic_search({
  query: "...",
  similarity_threshold: 0.5, // Fewer results
});

Visualization Performance

Mermaid Generation

Performance:

Entry-centric (depth 2): <100ms
Tag-based (20 entries): <50ms
Recursive CTE: efficient graph traversal

Best practices:

// Reasonable depth
visualize_relationships({ entry_id: 42, depth: 2 }); // Good

// Reasonable limits
visualize_relationships({ tags: ["feature"], limit: 20 }); // Good

// Avoid huge graphs
visualize_relationships({ depth: 5, limit: 100 }); // Slow to render

Memory Usage

Typical Memory Footprint

Minimal (no semantic search):

Server: 20-30 MB
SQLite cache: 64 MB
Total: ~100 MB

With semantic search:

Server: 20-30 MB
SQLite cache: 64 MB
Model: 80-100 MB
sqlite-vec index: 1-10 MB (depends on entries)
Total: ~200 MB

Scaling Considerations

Small Journal (<1000 entries)

Performance:

All operations <50ms
Startup: 2-3s
No optimization needed

Medium Journal (1000-10000 entries)

Performance:

Full-text search: <50ms
Semantic search: <200ms
Startup: 2-3s

Recommendations:

Use search filters (tags, dates)
Limit visualization size
Regular ANALYZE (monthly)

Large Journal (>10000 entries)

Performance:

Full-text search: <200ms
Semantic search: <500ms
Startup: 2-3s

Recommendations:

Archive old entries
Use specific search queries
Increase similarity_threshold
Consider database partitioning

Token Efficiency

Tool Filtering

Reduce context window consumption by disabling unused tools:

MEMORY_JOURNAL_MCP_TOOL_FILTER="-search,-analytics,-relationships,-export,-admin,-github,-backup"

Token savings: From the full 67-tool set (~6,500 tokens baseline), filtering can save up to ~86% by exposing only the 6 core tools. Common configurations like -github (~36%) or -admin,-github (~48%) provide significant savings with minimal functionality loss.

Benefits:

Faster AI responses - Smaller context = faster processing
Reduced API costs - Fewer tokens = lower bills
Stay under client limits - Essential for Windsurf (100-tool limit)
Better tool selection - AI makes better choices with fewer options

Complete Tool Filtering Guide →

Optimization Tips

For Faster Searches

1. Use specific queries:

// Good
search_entries({ query: "lazy loading pattern" });

// Poor
search_entries({ query: "loading" });

2. Filter early:

// Good
search_entries({
  query: "optimization",
  is_personal: false, // Reduces search space
});

3. Use appropriate limits:

// Default 10 is good
search_entries({ query: "...", limit: 10 });

// Only increase if needed
search_entries({ query: "...", limit: 50 }); // Slower

For Faster Entry Creation

1. Disable auto_context if not needed:

create_entry({
  content: "...",
  auto_context: false, // Skips Git subprocess
});

2. Tag batching is automatic:

As of v4.5+, linkTagsToEntry batches all tag inserts and lookups internally. No user action needed — passing multiple tags is already optimized.

For Faster Visualization

1. Use appropriate depth:

// Good balance
visualize_relationships({ entry_id: 42, depth: 2 });

// Only use depth 3 if necessary
visualize_relationships({ entry_id: 42, depth: 3 }); // Slower

2. Limit graph size:

visualize_relationships({
  tags: ["feature"],
  limit: 20, // Sweet spot
});

Benchmarks

Memory Journal is designed for extremely low overhead during AI task execution. We include a vitest bench suite to maintain these baseline guarantees:

Database Reads: Operations execute in fractions of a millisecond. calculateImportance is ~7x faster than retrieving 50 recent entries (composite index optimization narrows this gap by accelerating getRecentEntries ~4x).
Vector Search Engine: Both search (780 ops/sec) and indexing (640 ops/sec) are high-throughput via sqlite-vec with SQL-native KNN queries.
Core MCP Routines: getTools uses cached O(1) dispatch (~4800x faster than tool execution). create_entry and search_entries execute through the full MCP layer with sub-millisecond overhead.

To run the benchmarking suite locally:

npm run bench

Internal Optimizations

Performance improvements implemented in the server internals (v4.5+). These are transparent to users — no API changes.

Batch Tag Fetching (N+1 Elimination)

Multi-row methods (getRecentEntries, getEntriesPage, searchEntries, searchByDateRange) previously called getTagsForEntry() per row, causing N+1 queries. Now uses batchGetTagsForEntries() + rowsToEntries() to fetch all tags in a single IN (...) query.

Operation	Before	After
`getRecentEntries(50)`	51 queries	2 queries
`searchEntries` (20 results)	21 queries	2 queries

Batch Tag Linking

linkTagsToEntry() batches tag inserts and lookups:

Step	Before (per tag)	After (batched)
Insert tags	N × `INSERT OR IGNORE`	1 × multi-value `INSERT`
Lookup IDs	N × `SELECT id`	1 × `SELECT ... WHERE IN`
Total statements	4N	2 + 2N

Tool Dispatch Cache

callTool() caches tool definitions in a Map for O(1) lookup instead of rebuilding all 67 ToolDefinition objects on every call.

Benchmark	Before	After	Speedup
`create_entry`	105 ops/sec	800 ops/sec	7.6×
`search_entries`	95 ops/sec	733 ops/sec	7.7×
`get_recent_entries`	82 ops/sec	106 ops/sec	1.3×

Cache invalidates when context parameters change (db, github, vectorManager, config, teamDb).

Conditional JOIN in Date Range Search

searchByDateRange only JOINs tag tables (entry_tags, tags) when a tag filter is provided. Without tags, queries skip the JOIN and DISTINCT, avoiding unnecessary row multiplication.

Consolidated Statistics Queries

getStatistics() reduced from 5 sequential db.exec() calls to 3 using multi-statement exec() and SUM(CASE ...) aggregation.

Simplified Vector Index Rebuild

rebuildIndex() removed a redundant orphan detection pass that preceded a delete-all pass. Now performs a single delete-all before re-indexing.

Maintenance for Performance

Monthly

-- Update statistics
ANALYZE;

-- Check index usage
EXPLAIN QUERY PLAN SELECT ...;

Quarterly

-- Reclaim space
VACUUM;

-- Integrity check
PRAGMA integrity_check;

As Needed

Archive old entries
Review slow queries
Check database size

Troubleshooting Performance Issues

Slow Startup

Check:

Version (should be latest)
ML dependencies installed
System resources (CPU, disk)

Fix:

Update to the latest version (includes all performance optimizations)
Use Docker image (optimized)
Remove ML dependencies if not needed

Slow Searches

Check:

Database size
Query complexity
Filters applied

Fix:

Use more specific queries
Add filters (tags, dates, is_personal)
Run ANALYZE
Consider archiving

Database Locks

Check:

Multiple connections
Long transactions
Nested database calls

Fix:

Update to the latest version (includes all locking fixes)
Use single connection per transaction
Avoid nested database calls

Automated Maintenance (HTTP Only)

For long-running HTTP/SSE deployments, the server includes a built-in scheduler that can automate database optimization and index rebuilds on configurable intervals. See Configuration → Automated Scheduler for CLI flags and setup.

Performance

Performance

Performance Overview

Startup Performance

Lazy Loading Optimization

Modular Architecture

Database Performance

SQLite Optimization

Indexing Strategy

Full-Text Search Performance

Transaction Management

Single Connection Pattern

Thread-Safe Tag Creation

Search Performance

Full-Text Search (FTS5)

Date Range Search

Semantic Search

Visualization Performance

Mermaid Generation

Memory Usage

Typical Memory Footprint

Scaling Considerations

Small Journal (<1000 entries)

Medium Journal (1000-10000 entries)

Large Journal (>10000 entries)

Token Efficiency

Tool Filtering

Optimization Tips

For Faster Searches

For Faster Entry Creation

For Faster Visualization

Benchmarks

Internal Optimizations

Batch Tag Fetching (N+1 Elimination)

Batch Tag Linking

Tool Dispatch Cache

Conditional JOIN in Date Range Search

Consolidated Statistics Queries

Simplified Vector Index Rebuild

Maintenance for Performance

Monthly

Quarterly

As Needed

Troubleshooting Performance Issues

Slow Startup

Slow Searches

Database Locks

Automated Maintenance (HTTP Only)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

🏠 Home

Getting Started

Core Features