fix(phlower): drop covering index that blocks pod startup #25
Merged
Conversation
PR #24 merged with a (finished_at, task_id) covering index in the SCHEMA. On the existing prod DB (~26 GB), the CREATE INDEX takes 20+ min, longer than the 3 min startup-probe budget. Each pod restart gets killed mid-build, SQLite rolls back the partial index, and no progress carries over; the pod sits in CrashLoopBackOff for 30+ min. The index was a marginal optimization, not a fix: the batched purge from PR #24 already solves the lock-holding death spiral. Without the new index, the inner SELECT still uses idx_inv_finished (fast filter) plus the PK autoindex on invocation_details (covering); verified a clean query plan via EXPLAIN. Removing it lets the prod-us pod start in seconds.
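The fallback query plan claimed here can be checked with a small sqlite3 script. The column lists below are assumptions reconstructed from the PR text; only the table and index names come from the description:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: real column lists live in SCHEMA and are not in the PR text.
conn.executescript("""
CREATE TABLE invocations (
    task_id     INTEGER PRIMARY KEY,  -- rowid alias, stored in every index entry
    finished_at INTEGER
);
CREATE INDEX idx_inv_finished ON invocations(finished_at);
CREATE TABLE invocation_details (
    task_id INTEGER PRIMARY KEY,      -- PK autoindex covers lookups by task_id
    payload BLOB
);
""")

# Shape of the purge's inner SELECT: filter by finished_at, then hit details by PK.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT d.task_id
    FROM invocation_details AS d
    WHERE d.task_id IN (SELECT task_id FROM invocations WHERE finished_at < ?)
""", (0,)).fetchall()
for row in plan:
    print(row[3])   # detail column: which index each step uses
```

Because `task_id` aliases rowid, `idx_inv_finished` already covers the inner `SELECT task_id ... WHERE finished_at < ?`, which is why dropping the new index costs little.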
5 tasks
webjunkie added a commit that referenced this pull request on May 2, 2026
**Why**

Hourly purge runs `DELETE FROM invocations WHERE finished_at < ?` against a 26 GB table. The DELETE holds the SQLite write lock long enough that the flush loop can't drain its in-memory buffer. New events keep arriving, the buffer grows, each subsequent flush is bigger, takes longer, and falls further behind. RSS runs away and the liveness probe kills the pod.

PR #25 made this gentler (batched purge with yields between batches), but it didn't fix the root cause; it just moved the threshold. The same death spiral fired again on May 2: mega-flushes of 200K → 678K → 2.55M → 1.47M records, then SIGKILL at 03:15 UTC.

**What**

Move invocations to daily-partitioned tables (`invocations_YYYYMMDD`, `invocation_details_YYYYMMDD`). Retention is now `DROP TABLE`, a metadata operation, so the hourly purge stops contending with the flush loop entirely.

Pre-partition databases get migrated on first boot: `ALTER TABLE invocations RENAME TO invocations_legacy`. The rename is metadata-only, instant even on the 26 GB prod-us DB. Reads UNION across the legacy table until its data ages past retention, then it gets dropped wholesale.

Also caps the in-memory write-behind buffer (`SQLITE_PENDING_BUFFER_CAP`, default 200k records) with drop-oldest semantics. Bounded RAM matters more than capturing every record during overload; the alternative is OOM and losing all of them.

**Notes**

- `refresh_cached_stats` uses `MAX(rowid)` on the legacy table instead of `COUNT(*)`: index-fast, and accurate enough for healthz during the transition window.
- Recovery still chunks per-partition by 4-hour windows so a multi-GB legacy table doesn't hold the read snapshot for minutes.
- Today's and tomorrow's partitions get pre-created in the purge loop so the midnight-UTC rollover doesn't race with the first flush of the day.
- Disk-pressure path simplified: emergency mode just drops more partitions until under the cap. No more fallback DELETE.
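A minimal sketch of the partition-name scheme and `DROP TABLE` retention, including the pre-creation of today's and tomorrow's partitions. Function names, column lists, and the retention parameter are assumptions for illustration; only the `invocations_YYYYMMDD` naming comes from the commit message:

```python
import sqlite3
from datetime import date, timedelta

def partition_name(day: date) -> str:
    return f"invocations_{day:%Y%m%d}"

def ensure_partition(conn, day: date) -> None:
    # Columns are assumed; the real definition lives in SCHEMA.
    conn.execute(f"""CREATE TABLE IF NOT EXISTS {partition_name(day)} (
        task_id INTEGER PRIMARY KEY, finished_at INTEGER)""")

def purge_tick(conn, today: date, retention_days: int) -> None:
    # Pre-create today's and tomorrow's partitions so the first flush
    # after midnight UTC never races table creation.
    ensure_partition(conn, today)
    ensure_partition(conn, today + timedelta(days=1))
    cutoff = partition_name(today - timedelta(days=retention_days))
    tables = conn.execute("""SELECT name FROM sqlite_master
                             WHERE type='table' AND name LIKE 'invocations_2%'""").fetchall()
    for (name,) in tables:
        if name < cutoff:                       # lexicographic == chronological for YYYYMMDD
            conn.execute(f"DROP TABLE {name}")  # metadata-only: no row-by-row DELETE

conn = sqlite3.connect(":memory:")
for d in range(5):
    ensure_partition(conn, date(2026, 5, 2) - timedelta(days=d))
purge_tick(conn, date(2026, 5, 2), retention_days=2)
names = sorted(n for (n,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"))
print(names)
```

Dropping a whole table touches only the schema and the freelist, so retention no longer holds the write lock for the duration of a 26 GB scan.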
**Test plan**

- [x] Smoke test: write across two days, query, purge; verified partition rotation
- [x] Migration test: pre-partition DB → renamed to `_legacy`, reads find both, recovery works (1000 rows replayed)
- [x] Docker build + boot: clean fresh-DB and migrated-DB scenarios
- [ ] Deploy to prod-us, monitor RSS over the next purge cycle (~01:00 UTC May 3)
- [ ] Verify `dropped-invocations` counter stays at 0 in healthz logs
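The migration path exercised in the test plan (metadata-only rename to `_legacy`, reads that UNION across old and new tables) can be sketched as follows. Table names follow the commit message; the columns and the specific partition date are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Pre-partition database: one monolithic table holding old rows.
conn.execute("CREATE TABLE invocations (task_id INTEGER PRIMARY KEY, finished_at INTEGER)")
conn.executemany("INSERT INTO invocations VALUES (?, ?)", [(1, 100), (2, 200)])

# First boot after the upgrade: metadata-only rename, instant even on a 26 GB file.
conn.execute("ALTER TABLE invocations RENAME TO invocations_legacy")

# New writes land in the current daily partition.
conn.execute("CREATE TABLE invocations_20260502 (task_id INTEGER PRIMARY KEY, finished_at INTEGER)")
conn.execute("INSERT INTO invocations_20260502 VALUES (3, 300)")

# Reads UNION across the legacy table until its data ages past retention.
rows = conn.execute("""
    SELECT task_id, finished_at FROM invocations_legacy
    UNION ALL
    SELECT task_id, finished_at FROM invocations_20260502
    ORDER BY task_id
""").fetchall()
print(rows)  # [(1, 100), (2, 200), (3, 300)]
```

Once every row in `invocations_legacy` is older than the retention window, the table is dropped wholesale and the UNION branch disappears.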
Hotfix: PR #24 merged with a heavy covering index in SCHEMA. CREATE INDEX on the existing 26 GB prod DB takes 20+ min, longer than the startup probe budget. The pod gets stuck in CrashLoopBackOff: each attempt is killed mid-build, SQLite rolls back the partial index, net zero progress.
The batched purge from #24 already fixes the death spiral on its own. The covering index was a marginal optimization, not worth the deploy hazard.
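The batched purge referenced here can be sketched as below. The batch size, yield interval, and function name are assumptions; the rowid-subquery form is used because SQLite's `DELETE ... LIMIT` syntax is an optional compile-time feature:

```python
import sqlite3
import time

def batched_purge(conn, cutoff, batch=1000, pause=0.0):
    """Delete expired rows in small batches, ending the write transaction
    between batches so the flush loop can grab the lock and drain its buffer."""
    total = 0
    while True:
        cur = conn.execute("""
            DELETE FROM invocations WHERE rowid IN (
                SELECT rowid FROM invocations WHERE finished_at < ? LIMIT ?)""",
            (cutoff, batch))
        conn.commit()           # write lock released here, between batches
        if cur.rowcount == 0:
            break
        total += cur.rowcount
        time.sleep(pause)       # yield to concurrent writers
    return total

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invocations (task_id INTEGER PRIMARY KEY, finished_at INTEGER)")
conn.executemany("INSERT INTO invocations VALUES (?, ?)",
                 [(i, i) for i in range(5000)])
deleted = batched_purge(conn, cutoff=4000, batch=1000)
print(deleted)  # 4000
```

Each batch is a short transaction, so the purge never holds the write lock for the full table scan; that is what un-wedges the flush loop even though the total work is unchanged.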
Test plan