Follow-up to #36. The `SqliteJournal` stores tags as a CSV column; `SqliteQuery.currentEventsByTag` filters with `WHERE ',' || tags || ',' LIKE '%,foo,%'` — correct, but a sequential scan that gets expensive once the journal is large.
Strategy: add a normalised `event_tags` join table populated alongside every `append`. Tag queries become an indexed lookup. Existing CSV column stays so `Journal.read` keeps returning tags inline (no API change for the journal contract).
Schema:
```sql
CREATE TABLE event_tags (
persistence_id TEXT NOT NULL,
sequence_nr INTEGER NOT NULL,
tag TEXT NOT NULL,
timestamp INTEGER NOT NULL,
PRIMARY KEY (tag, timestamp, persistence_id, sequence_nr),
FOREIGN KEY (persistence_id, sequence_nr) REFERENCES events
);
CREATE INDEX idx_event_tags_pid_seq ON event_tags(persistence_id, sequence_nr);
```
Migration: on startup, if `event_tags` is missing OR is empty while `events` has tagged rows, run a one-shot backfill from the CSV column. Idempotent — re-running is a no-op.
Components:
| File |
Task |
| `src/persistence/journals/SqliteJournal.ts` |
Create `event_tags` table; insert into it inside the `append` transaction. |
| `src/persistence/query/SqliteQuery.ts` |
New `currentEventsByTag` query that joins `events ⋈ event_tags WHERE tag = ? AND timestamp >= ?`. |
| `src/persistence/journals/SqliteMigration.ts` (new) |
Backfill helper for upgrading existing databases. |
| `tests/unit/persistence/journals/SqliteTagsMigration.test.ts` (new) |
CSV → join-table migration; new path is correct + faster. |
Estimate: 2-3 days.
Verification:
- Tag-query benchmark: 100k events with 5 distinct tags — old path scans all 100k rows; new path does an index lookup.
- Migration test: open a v0 DB (CSV-only), trigger backfill, verify all events present in `event_tags`.
Out of scope:
- Removing the CSV column (kept for backward-compat + so `Journal.read` still returns tags inline).
Follow-up to #36. The `SqliteJournal` stores tags as a CSV column; `SqliteQuery.currentEventsByTag` filters with `WHERE ',' || tags || ',' LIKE '%,foo,%'` — correct, but a sequential scan that gets expensive once the journal is large.
Strategy: add a normalised `event_tags` join table populated alongside every `append`. Tag queries become an indexed lookup. Existing CSV column stays so `Journal.read` keeps returning tags inline (no API change for the journal contract).
Schema:
```sql
CREATE TABLE event_tags (
persistence_id TEXT NOT NULL,
sequence_nr INTEGER NOT NULL,
tag TEXT NOT NULL,
timestamp INTEGER NOT NULL,
PRIMARY KEY (tag, timestamp, persistence_id, sequence_nr),
FOREIGN KEY (persistence_id, sequence_nr) REFERENCES events
);
CREATE INDEX idx_event_tags_pid_seq ON event_tags(persistence_id, sequence_nr);
```
Migration: on startup, if `event_tags` is missing OR is empty while `events` has tagged rows, run a one-shot backfill from the CSV column. Idempotent — re-running is a no-op.
Components:
Estimate: 2-3 days.
Verification:
Out of scope: