Skip to content

SQLite tags join table for indexed tag queries #43

@pathosDev

Description

@pathosDev

Follow-up to #36. The `SqliteJournal` stores tags as a CSV column; `SqliteQuery.currentEventsByTag` filters with `WHERE ',' || tags || ',' LIKE '%,foo,%'` — correct, but a sequential scan that gets expensive once the journal is large.

Strategy: add a normalised `event_tags` join table populated alongside every `append`. Tag queries become an indexed lookup. Existing CSV column stays so `Journal.read` keeps returning tags inline (no API change for the journal contract).

Schema:

```sql
CREATE TABLE event_tags (
persistence_id TEXT NOT NULL,
sequence_nr INTEGER NOT NULL,
tag TEXT NOT NULL,
timestamp INTEGER NOT NULL,
PRIMARY KEY (tag, timestamp, persistence_id, sequence_nr),
FOREIGN KEY (persistence_id, sequence_nr) REFERENCES events
);
CREATE INDEX idx_event_tags_pid_seq ON event_tags(persistence_id, sequence_nr);
```

Migration: on startup, if `event_tags` is missing OR is empty while `events` has tagged rows, run a one-shot backfill from the CSV column. Idempotent — re-running is a no-op.

Components:

File Task
`src/persistence/journals/SqliteJournal.ts` Create `event_tags` table; insert into it inside the `append` transaction.
`src/persistence/query/SqliteQuery.ts` New `currentEventsByTag` query that joins `events ⋈ event_tags WHERE tag = ? AND timestamp >= ?`.
`src/persistence/journals/SqliteMigration.ts` (new) Backfill helper for upgrading existing databases.
`tests/unit/persistence/journals/SqliteTagsMigration.test.ts` (new) CSV → join-table migration; new path is correct + faster.

Estimate: 2-3 days.

Verification:

  • Tag-query benchmark: 100k events with 5 distinct tags — old path scans all 100k rows; new path does an index lookup.
  • Migration test: open a v0 DB (CSV-only), trigger backfill, verify all events present in `event_tags`.

Out of scope:

  • Removing the CSV column (kept for backward-compat + so `Journal.read` still returns tags inline).

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions