Skip to content
Chris & Mike edited this page Mar 14, 2026 · 15 revisions

Database Schema

Complete reference for Memory Journal's SQLite database schema - the persistent storage layer for project context.

Schema Design: Optimized for three critical AI context operations: fast entry retrieval, relationship graph traversal, and full-text search.


Overview

Memory Journal uses SQLite via better-sqlite3 (native):

  • Native file-based database with WAL mode
  • FTS5 full-text search with BM25 ranking
  • Foreign key constraints
  • Comprehensive indexes
  • Automatic schema migrations

Core Tables

memory_journal

Main table storing journal entries.

CREATE TABLE memory_journal (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    entry_type TEXT NOT NULL,
    content TEXT NOT NULL,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
    is_personal INTEGER DEFAULT 1,
    significance_type TEXT,
    auto_context TEXT,
    deleted_at TEXT,
    -- GitHub Projects (v1.2.0)
    project_number INTEGER,
    project_owner TEXT,
    -- GitHub Issues & PRs (v2.0.0)
    issue_number INTEGER,
    issue_url TEXT,
    pr_number INTEGER,
    pr_url TEXT,
    pr_status TEXT,
    -- GitHub Actions (v2.1.0)
    workflow_run_id INTEGER,
    workflow_name TEXT,
    workflow_status TEXT
);

Columns:

Column Type Nullable Description
id INTEGER No Primary key (auto-increment)
entry_type TEXT No Type of entry (personal_reflection, technical_achievement, etc.)
content TEXT No Entry content (max 50KB)
timestamp TEXT No ISO 8601 timestamp (auto-set)
is_personal INTEGER No 1 = personal, 0 = project (default: 1)
significance_type TEXT Yes Significance marker (technical_breakthrough, milestone, etc.)
auto_context TEXT Yes JSON string of Git/GitHub context
deleted_at TEXT Yes Soft delete timestamp (NULL = not deleted)
project_number INTEGER Yes GitHub Project number (v1.2.0)
project_owner TEXT Yes GitHub Project owner (user or org) (v1.2.0)
issue_number INTEGER Yes GitHub Issue number (v2.0.0)
issue_url TEXT Yes Full URL to GitHub Issue (v2.0.0)
pr_number INTEGER Yes Pull Request number (v2.0.0)
pr_url TEXT Yes Full URL to Pull Request (v2.0.0)
pr_status TEXT Yes PR status: draft/open/merged/closed (v2.0.0)

| workflow_run_id | INTEGER | Yes | GitHub Actions workflow run ID (v2.1.0) | | workflow_name | TEXT | Yes | Workflow name for quick reference (v2.1.0) | | workflow_status | TEXT | Yes | Workflow status: queued/in_progress/completed (v2.1.0) |

Indexes:

CREATE INDEX idx_memory_journal_timestamp ON memory_journal(timestamp);
CREATE INDEX idx_memory_journal_type ON memory_journal(entry_type);
CREATE INDEX idx_memory_journal_personal ON memory_journal(is_personal);
CREATE INDEX idx_memory_journal_deleted ON memory_journal(deleted_at);
-- GitHub Projects indexes (v1.2.0)
CREATE INDEX idx_memory_journal_project ON memory_journal(project_number);
-- GitHub Issues & PRs indexes (v2.0.0)
CREATE INDEX idx_memory_journal_issue_number ON memory_journal(issue_number);
CREATE INDEX idx_memory_journal_pr_number ON memory_journal(pr_number);
-- GitHub Actions index (v2.1.0)
CREATE INDEX idx_memory_journal_workflow_run_id ON memory_journal(workflow_run_id);

Entry Types:

  • personal_reflection
  • project_decision
  • technical_achievement
  • bug_fix
  • feature_implementation
  • code_review
  • meeting_notes
  • learning
  • research
  • planning
  • retrospective
  • standup
  • other

Significance Types:

  • milestone
  • breakthrough
  • technical_breakthrough
  • decision
  • lesson_learned
  • blocker_resolved
  • release

tags

Tag definitions with usage tracking.

CREATE TABLE tags (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE NOT NULL,
    usage_count INTEGER DEFAULT 0
);

Columns:

Column Type Nullable Description
id INTEGER No Primary key (auto-increment)
name TEXT No Tag name (unique, max 100 chars)
usage_count INTEGER No Number of entries using this tag

Indexes:

CREATE INDEX idx_tags_name ON tags(name);

Tag Management:

  • Auto-created on first use (INSERT OR IGNORE)
  • usage_count updated automatically
  • Lowercase recommended (not enforced)

entry_tags

Junction table linking entries to tags (many-to-many).

CREATE TABLE entry_tags (
    entry_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL,
    PRIMARY KEY (entry_id, tag_id),
    FOREIGN KEY (entry_id) REFERENCES memory_journal(id) ON DELETE CASCADE,
    FOREIGN KEY (tag_id) REFERENCES tags(id) ON DELETE CASCADE
);

Columns:

Column Type Nullable Description
entry_id INTEGER No Reference to memory_journal.id
tag_id INTEGER No Reference to tags.id

Composite Primary Key: (entry_id, tag_id)

Indexes:

CREATE INDEX idx_entry_tags_entry ON entry_tags(entry_id);
CREATE INDEX idx_entry_tags_tag ON entry_tags(tag_id);

Cascading Deletes:

  • Deleting an entry removes all its tag associations
  • Deleting a tag removes all entry associations

relationships

Typed relationships between entries.

CREATE TABLE relationships (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    from_entry_id INTEGER NOT NULL,
    to_entry_id INTEGER NOT NULL,
    relationship_type TEXT NOT NULL,
    description TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (from_entry_id) REFERENCES memory_journal(id) ON DELETE CASCADE,
    FOREIGN KEY (to_entry_id) REFERENCES memory_journal(id) ON DELETE CASCADE
);

Columns:

Column Type Nullable Description
id INTEGER No Primary key (auto-increment)
from_entry_id INTEGER No Source entry ID
to_entry_id INTEGER No Target entry ID
relationship_type TEXT No Type of relationship
description TEXT Yes Optional description
created_at TEXT No ISO 8601 timestamp (auto-set)

Indexes:

CREATE INDEX idx_relationships_from ON relationships(from_entry_id);
CREATE INDEX idx_relationships_to ON relationships(to_entry_id);

Relationship Types:

  • references - General connection
  • implements - Implementation
  • clarifies - Explanation
  • evolves_from - Iteration
  • response_to - Reply
  • blocked_by - Blocked by (causal)
  • resolved - Resolved (causal)
  • caused - Caused (causal)

Cascading Deletes:

  • Deleting an entry removes all its relationships (both directions)

embeddings

Vector embeddings for semantic search (optional).

CREATE TABLE embeddings (
    entry_id INTEGER PRIMARY KEY,
    embedding BLOB NOT NULL,
    model_name TEXT NOT NULL,
    FOREIGN KEY (entry_id) REFERENCES memory_journal(id) ON DELETE CASCADE
);

Columns:

Column Type Nullable Description
entry_id INTEGER No Primary key, reference to memory_journal.id
embedding BLOB No Binary vector data (384 dimensions)
model_name TEXT No Model name (e.g., 'all-MiniLM-L6-v2')

Storage:

  • Each embedding: ~1.5KB (384 floats × 4 bytes)
  • 1000 entries: ~1.5MB
  • 10000 entries: ~15MB

Cascading Deletes:

  • Deleting an entry removes its embedding

Full-Text Search

FTS5 Full-Text Search

The codebase uses SQLite FTS5 for full-text search with BM25 relevance ranking, phrase queries, and prefix matching.

Note

An FTS5 virtual table (memory_journal_fts) is maintained via triggers that sync with the memory_journal table on insert, update, and delete. Queries containing unsafe FTS5 characters automatically fall back to LIKE matching.

Current implementation:

-- FTS5 virtual table (managed by migrateSchema):
CREATE VIRTUAL TABLE memory_journal_fts USING fts5(
  content, tokenize='porter unicode61'
);
-- Triggers: memory_journal_ai, memory_journal_au, memory_journal_ad

-- Full-text search uses FTS5 MATCH with BM25 ranking:
SELECT mj.* FROM memory_journal mj
JOIN memory_journal_fts fts ON mj.rowid = fts.rowid
WHERE memory_journal_fts MATCH ? AND mj.deleted_at IS NULL
ORDER BY bm25(memory_journal_fts)
LIMIT ?;

Features:

  • BM25 relevance ranking
  • Phrase search and prefix matching
  • Porter stemming via unicode61 tokenizer
  • Automatic fallback to LIKE for unsafe queries

Schema Migrations

version 1.1.3 Migration

Added soft delete and relationships.

function runMigrations(db: Database): void {
  // Add deleted_at column if missing
  const columns = db.exec("PRAGMA table_info(memory_journal)");
  const columnNames = columns[0]?.values.map((row) => row[1]) ?? [];

  if (!columnNames.includes("deleted_at")) {
    console.error("Running migration: Adding deleted_at column");
    db.run("ALTER TABLE memory_journal ADD COLUMN deleted_at TEXT");
  }

  // Create relationships table if missing
  const tables = db.exec(
    "SELECT name FROM sqlite_master WHERE type='table' AND name='relationships'",
  );
  if (tables.length === 0 || tables[0].values.length === 0) {
    console.error("Running migration: Creating relationships table");
    db.run(`
      CREATE TABLE relationships (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        from_entry_id INTEGER NOT NULL,
        to_entry_id INTEGER NOT NULL,
        relationship_type TEXT NOT NULL,
        description TEXT,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
        FOREIGN KEY (from_entry_id) REFERENCES memory_journal(id) ON DELETE CASCADE,
        FOREIGN KEY (to_entry_id) REFERENCES memory_journal(id) ON DELETE CASCADE
      )
    `);
    db.run(
      "CREATE INDEX idx_relationships_from ON relationships(from_entry_id)",
    );
    db.run("CREATE INDEX idx_relationships_to ON relationships(to_entry_id)");
  }
}

Migration Strategy:

  • Check if columns/tables exist
  • Add if missing
  • Non-destructive (data preserved)
  • Idempotent (safe to run multiple times)

Database Size Estimates

Typical Entry

  • Content: 500 bytes (average)
  • Metadata: 200 bytes
  • Tags: 50 bytes (2-3 tags)
  • Embedding: 1500 bytes (if enabled)
  • Total: ~2250 bytes per entry

Size Projections

Entries With Embeddings Without Embeddings
100 220 KB 75 KB
1,000 2.2 MB 750 KB
10,000 22 MB 7.5 MB
100,000 220 MB 75 MB
  • Tags table: negligible
  • Relationships: ~100 bytes each

Query Examples

Get Entry with Tags

SELECT
    e.*,
    GROUP_CONCAT(t.name, ', ') as tags
FROM memory_journal e
LEFT JOIN entry_tags et ON e.id = et.entry_id
LEFT JOIN tags t ON et.tag_id = t.id
WHERE e.id = ?
  AND e.deleted_at IS NULL
GROUP BY e.id;

Full-Text Search

SELECT
    e.id,
    e.content,
    e.entry_type,
    e.timestamp
FROM memory_journal e
WHERE e.content LIKE '%' || ? || '%'
  AND e.deleted_at IS NULL
ORDER BY e.timestamp DESC
LIMIT ?;

Entry Relationships (Recursive)

WITH RECURSIVE related_entries(entry_id, depth) AS (
    VALUES(?, 0)
    UNION
    SELECT DISTINCT
        CASE
            WHEN r.from_entry_id = re.entry_id THEN r.to_entry_id
            ELSE r.from_entry_id
        END,
        re.depth + 1
    FROM relationships r
    JOIN related_entries re ON
        r.from_entry_id = re.entry_id OR r.to_entry_id = re.entry_id
    WHERE re.depth < ?
)
SELECT DISTINCT e.*, re.depth
FROM related_entries re
JOIN memory_journal e ON e.id = re.entry_id
WHERE e.deleted_at IS NULL
ORDER BY re.depth, e.timestamp DESC;

Maintenance

VACUUM

Reclaim space after deletes:

VACUUM;

When to run:

  • After many permanent deletes
  • Database size is much larger than data
  • Typically monthly or quarterly

ANALYZE

Update query planner statistics:

ANALYZE;

When to run:

  • After bulk inserts
  • After schema changes
  • Typically monthly

Note: v1.1.3 removed automatic ANALYZE from startup (too expensive).

Integrity Check

Verify database integrity:

PRAGMA integrity_check;

Index Usage

Check if indexes are being used:

EXPLAIN QUERY PLAN
SELECT * FROM memory_journal WHERE timestamp > '2025-10-01';

Backup & Restore

Backup

Use the built-in MCP backup tools rather than raw SQLite commands:

// Create a backup
backup_journal({ name: "pre-migration" });

// List available backups
list_backups();

// Restore from backup (auto-creates safety backup first)
restore_backup({ filename: "backup_pre-migration.db", confirm: true });

// Clean up old backups
cleanup_backups({ keep_count: 5 });

Note

better-sqlite3 uses a native file-based database with WAL mode. Use the MCP backup tools for managed backup and restore operations.


Schema Evolution

Future Additions

Planned schema changes:

  1. Attachments table - File attachments
  2. Versions table - Entry version history
  3. Metadata table - Schema version tracking
  4. Collections table - User-defined collections

Backward Compatibility

  • Migrations preserve existing data
  • Old clients continue to work (graceful degradation)
  • New columns have defaults

Next: Check Architecture or Performance.

Clone this wiki locally