Performance

Optimization strategies and best practices for Memory Journal MCP Server.

Design Philosophy: Fast context retrieval is critical for AI workflows. Memory Journal is optimized for sub-second query response times, enabling AI to access project history without noticeable latency.

Performance Overview

Memory Journal v2.2.0 achieves:

2-3 second startup (10x improvement from v1.0)
<10ms entry creation (typical)
<50ms full-text search (for 1000 entries)
<1s semantic search (reliable on first load after v1.2.1 fix)
No performance degradation from v2.x refactoring (all async operations maintained)

Startup Performance

v1.1.3 Optimization: Lazy Loading

Problem (v1.0):

Startup time: 14 seconds
ML dependencies loaded eagerly
import sentence_transformers took ~14s

Solution (v1.1.3):

Lazy import of ML libraries
Model loads only on first semantic search
Startup: 2-3 seconds

v2.0.0 Optimization: Modular Architecture

Changes:

Refactored from single 4093-line file to 30 modules
Each module ~150-300 lines (highly maintainable)
Clear separation of concerns

Performance Impact:

✅ No degradation - All async operations preserved
✅ Same startup time - 2-3 seconds maintained
✅ Same operation speed - No overhead from modularization
✅ Better maintainability - 10x easier to optimize specific components
✅ True Pyright strict - 700+ type issues fixed, zero exclusions (better IDE support)

The modular architecture makes it easier to identify and optimize performance bottlenecks in the future!

Implementation:

# Top level - check availability only
VECTOR_SEARCH_AVAILABLE = False
if importlib.util.find_spec("sentence_transformers"):
    VECTOR_SEARCH_AVAILABLE = True

# In VectorSearchManager
def _ensure_initialized(self):
    if not self.initialized:
        # Load on first use
        from sentence_transformers import SentenceTransformer
        import faiss
        self.model = SentenceTransformer(self.model_name)
        self.initialized = True

Timeline:

Server startup: 2-3s
First semantic search: +15s (one-time)
Subsequent searches: <1s

Database Performance

SQLite Optimization

Memory Journal uses optimal SQLite settings:

PRAGMA journal_mode = WAL;          # Write-Ahead Logging
PRAGMA synchronous = NORMAL;         # Balance speed/safety
PRAGMA cache_size = -64000;          # 64MB cache
PRAGMA mmap_size = 268435456;        # 256MB memory-mapped I/O
PRAGMA temp_store = MEMORY;          # Temp tables in memory
PRAGMA busy_timeout = 30000;         # 30s lock timeout

Benefits:

WAL mode: Concurrent reads + writes
Large cache: Hot data stays in memory
Memory-mapped I/O: Direct memory access
Memory temp store: Fast temporary operations

Indexing Strategy

Indexes created:

-- Entry lookups
CREATE INDEX idx_memory_journal_timestamp ON memory_journal(timestamp);
CREATE INDEX idx_memory_journal_type ON memory_journal(entry_type);
CREATE INDEX idx_memory_journal_personal ON memory_journal(is_personal);
CREATE INDEX idx_memory_journal_deleted ON memory_journal(deleted_at);

-- Tag lookups
CREATE INDEX idx_tags_name ON tags(name);
CREATE INDEX idx_entry_tags_entry ON entry_tags(entry_id);
CREATE INDEX idx_entry_tags_tag ON entry_tags(tag_id);

-- Relationship lookups
CREATE INDEX idx_relationships_from ON relationships(from_entry_id);
CREATE INDEX idx_relationships_to ON relationships(to_entry_id);

Query optimization:

All queries use appropriate indexes
No full table scans (except ANALYZE)
Covering indexes where possible

FTS5 Performance

Full-text search speed:

Entries	Search Time
100	<5ms
1,000	<10ms
10,000	<50ms
100,000	<200ms

Optimization:

Porter stemming (matches variations)
BM25 ranking (relevance scoring)
Result limits (default 10)

Transaction Management

Single Connection Pattern

Anti-pattern (v1.0 - caused locking):

def update_entry(...):
    with db.get_connection() as conn1:
        conn1.execute("UPDATE memory_journal ...")
        # Nested connection - causes lock!
        db.auto_create_tags(tags)  # Opens conn2

Correct pattern (v1.2.1):

def update_entry(...):
    with db.get_connection() as conn:
        conn.execute("UPDATE memory_journal ...")
        # Use same connection
        conn.execute("INSERT OR IGNORE INTO tags ...")
        # No nested connections = no locks

Benefits:

No database locking
Atomic transactions
Better concurrency

Thread-Safe Tag Creation

Race-safe pattern:

# INSERT OR IGNORE prevents race conditions
conn.execute(
    "INSERT OR IGNORE INTO tags (name, usage_count) VALUES (?, 1)",
    (tag_name,)
)

# Then lookup
cursor = conn.execute("SELECT id FROM tags WHERE name = ?", (tag_name,))
tag_id = cursor.fetchone()[0]

Handles concurrent tag creation:

Multiple entries creating same tag
No duplicates
No conflicts

Search Performance

Full-Text Search (FTS5)

Fastest option:

SQLite native
Optimized C implementation
No external dependencies

Best practices:

// Use specific terms
search_entries({ query: "lazy loading pattern" })  // Fast

// Avoid overly broad
search_entries({ query: "a" })  // Slow (too many matches)

// Use limits
search_entries({ query: "optimization", limit: 10 })  // Fast

Date Range Search

Very fast:

Indexed by timestamp
Simple range query
~5ms typical

Best practices:

// Reasonable ranges
search_by_date_range({
  start_date: "2025-10-01",
  end_date: "2025-10-31"  // 1 month
})

// Add filters for large ranges
search_by_date_range({
  start_date: "2025-01-01",
  end_date: "2025-12-31",  // 1 year
  tags: ["performance"]     // Filter!
})

Semantic Search

Performance characteristics:

Operation	Time
First search (model load)	15s
Generate embedding	50-100ms
FAISS search	10-50ms
Fetch entries	10-50ms
Total (subsequent)	70-200ms

Optimization:

Model cached after first load
FAISS index in memory
Batch entry fetching

Best practices:

// Use reasonable limits
semantic_search({ query: "...", limit: 10 })  // Fast

// Higher thresholds = faster
semantic_search({
  query: "...",
  similarity_threshold: 0.5  // Fewer results
})

Visualization Performance

Mermaid Generation

Performance:

Entry-centric (depth 2): <100ms
Tag-based (20 entries): <50ms
Recursive CTE: efficient graph traversal

Best practices:

// Reasonable depth
visualize_relationships({ entry_id: 42, depth: 2 })  // Good

// Reasonable limits
visualize_relationships({ tags: ["feature"], limit: 20 })  // Good

// Avoid huge graphs
visualize_relationships({ depth: 5, limit: 100 })  // Slow to render

Memory Usage

Typical Memory Footprint

Minimal (no semantic search):

Server: 20-30 MB
SQLite cache: 64 MB
Total: ~100 MB

With semantic search:

Server: 20-30 MB
SQLite cache: 64 MB
Model: 80-100 MB
FAISS index: 1-10 MB (depends on entries)
Total: ~200 MB

Scaling Considerations

Small Journal (<1000 entries)

Performance:

All operations <50ms
Startup: 2-3s
No optimization needed

Medium Journal (1000-10000 entries)

Performance:

FTS5 search: <50ms
Semantic search: <200ms
Startup: 2-3s

Recommendations:

Use search filters (tags, dates)
Limit visualization size
Regular ANALYZE (monthly)

Large Journal (>10000 entries)

Performance:

FTS5 search: <200ms
Semantic search: <500ms
Startup: 2-3s

Recommendations:

Archive old entries
Use specific search queries
Increase similarity_threshold
Consider database partitioning

Token Efficiency (v2.2.0)

Tool Filtering

Reduce context window consumption by disabling unused tools:

MEMORY_JOURNAL_MCP_TOOL_FILTER="-search,-analytics,-relationships,-export,-admin,-test"

Token savings by configuration:

Configuration	Filter	Tools	Token Reduction
Full (default)	(none)	16	Baseline (~2,450 tokens)
Production	`-test`	14	~12%
Read-only	`-admin`	14	~15%
Focused	`-test,-admin`	12	~25%
Lightweight	(core only)	5	~69%

Benefits:

Faster AI responses - Smaller context = faster processing
Reduced API costs - Fewer tokens = lower bills
Stay under client limits - Essential for Windsurf (100-tool limit)
Better tool selection - AI makes better choices with fewer options

Complete Tool Filtering Guide →

Optimization Tips

For Faster Searches

1. Use specific queries:

// Good
search_entries({ query: "lazy loading pattern" })

// Poor
search_entries({ query: "loading" })

2. Filter early:

// Good
search_entries({
  query: "optimization",
  is_personal: false  // Reduces search space
})

3. Use appropriate limits:

// Default 10 is good
search_entries({ query: "...", limit: 10 })

// Only increase if needed
search_entries({ query: "...", limit: 50 })  // Slower

For Faster Entry Creation

1. Disable auto_context if not needed:

create_entry({
  content: "...",
  auto_context: false  // Skips Git subprocess
})

2. Batch tag creation:

// Create entries with same tags together
// Tags only created once

For Faster Visualization

1. Use appropriate depth:

// Good balance
visualize_relationships({ entry_id: 42, depth: 2 })

// Only use depth 3 if necessary
visualize_relationships({ entry_id: 42, depth: 3 })  // Slower

2. Limit graph size:

visualize_relationships({
  tags: ["feature"],
  limit: 20  // Sweet spot
})

Benchmarks

Entry Operations

Operation	Time
Create entry (no Git)	5-10ms
Create entry (with Git)	50-100ms
Update entry	5-15ms
Delete entry (soft)	5-10ms
Delete entry (permanent)	10-20ms
Get entry by ID	1-5ms

Search Operations

Operation	Entries	Time
FTS5 search	1,000	<10ms
FTS5 search	10,000	<50ms
Date range	1,000	<5ms
Date range	10,000	<20ms
Semantic search	1,000	100-200ms
Semantic search	10,000	200-500ms

Visualization

Operation	Entries	Relationships	Time
Entry-centric (depth 2)	10	15	<50ms
Tag-based	20	30	<100ms
Recent	20	25	<100ms

Maintenance for Performance

Monthly

-- Update statistics
ANALYZE;

-- Check index usage
EXPLAIN QUERY PLAN SELECT ...;

Quarterly

-- Reclaim space
VACUUM;

-- Integrity check
PRAGMA integrity_check;

As Needed

Archive old entries
Review slow queries
Check database size

Troubleshooting Performance Issues

Slow Startup

Check:

Version (should be v2.1.0+)
ML dependencies installed (v1.2.1 fixed first-load delay)
System resources (CPU, disk)

Fix:

Update to v2.1.0 (includes all performance optimizations)
Use Docker image (optimized)
Remove ML dependencies if not needed

Slow Searches

Check:

Database size
Query complexity
Filters applied

Fix:

Use more specific queries
Add filters (tags, dates, is_personal)
Run ANALYZE
Consider archiving

Database Locks

Check:

Multiple connections
Long transactions
Nested database calls

Fix:

Update to v2.1.0 (includes all locking fixes)
Use single connection per transaction
Avoid nested database calls

Next: Check Architecture or Security.

Performance

Performance

Performance Overview

Startup Performance

v1.1.3 Optimization: Lazy Loading

v2.0.0 Optimization: Modular Architecture

Database Performance

SQLite Optimization

Indexing Strategy

FTS5 Performance

Transaction Management

Single Connection Pattern

Thread-Safe Tag Creation

Search Performance

Full-Text Search (FTS5)

Date Range Search

Semantic Search

Visualization Performance

Mermaid Generation

Memory Usage

Typical Memory Footprint

Scaling Considerations

Small Journal (<1000 entries)

Medium Journal (1000-10000 entries)

Large Journal (>10000 entries)

Token Efficiency (v2.2.0)

Tool Filtering

Optimization Tips

For Faster Searches

For Faster Entry Creation

For Faster Visualization

Benchmarks

Entry Operations

Search Operations

Visualization

Maintenance for Performance

Monthly

Quarterly

As Needed

Troubleshooting Performance Issues

Slow Startup

Slow Searches

Database Locks

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

🏠 Home

Getting Started

Core Features

Knowledge Graph

GitHub Integration

Workflows

Deployment