fix(phlower): drop covering index that blocks pod startup #25
Merged
Conversation
PR #24 merged with a (finished_at, task_id) covering index in the SCHEMA. On the existing prod DB (~26 GB), the CREATE INDEX takes 20+ min, longer than the 3 min startup-probe budget. Each pod restart gets killed mid-build, SQLite rolls back the partial index, and no progress carries over; the pod sits in CrashLoopBackOff for 30+ min. The index was a marginal optimization, not a fix: the batched purge from PR #24 already solves the lock-holding death spiral. Without the new index, the inner SELECT still uses idx_inv_finished (fast filter) plus the PK autoindex on invocation_details (covering); verified a clean query plan via EXPLAIN. Removing it lets the prod-us pod start in seconds.
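The fallback query plan claimed here can be checked with a small sqlite3 script. The column lists below are assumptions reconstructed from the PR text; only the table and index names come from the description:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: real column lists live in SCHEMA and are not in the PR text.
conn.executescript("""
CREATE TABLE invocations (
    task_id     INTEGER PRIMARY KEY,  -- rowid alias, stored in every index entry
    finished_at INTEGER
);
CREATE INDEX idx_inv_finished ON invocations(finished_at);
CREATE TABLE invocation_details (
    task_id INTEGER PRIMARY KEY,      -- PK autoindex covers lookups by task_id
    payload BLOB
);
""")

# Shape of the purge's inner SELECT: filter by finished_at, then hit details by PK.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT d.task_id
    FROM invocation_details AS d
    WHERE d.task_id IN (SELECT task_id FROM invocations WHERE finished_at < ?)
""", (0,)).fetchall()
for row in plan:
    print(row[3])   # detail column: which index each step uses
```

Because `task_id` aliases rowid, `idx_inv_finished` already covers the inner `SELECT task_id ... WHERE finished_at < ?`, which is why dropping the new index costs little.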
5 tasks
webjunkie added a commit that referenced this pull request on May 2, 2026
**Why**

Hourly purge runs `DELETE FROM invocations WHERE finished_at < ?` against a 26 GB table. The DELETE holds the SQLite write lock long enough that the flush loop can't drain its in-memory buffer. New events keep arriving, the buffer grows, each subsequent flush is bigger, takes longer, and falls further behind. RSS runs away and the liveness probe kills the pod.

PR #25 made this gentler (batched purge with yields between batches), but it didn't fix the root cause; it just moved the threshold. The same death spiral fired again on May 2: mega-flushes of 200K → 678K → 2.55M → 1.47M records, then SIGKILL at 03:15 UTC.

**What**

Move invocations to daily-partitioned tables (`invocations_YYYYMMDD`, `invocation_details_YYYYMMDD`). Retention is now `DROP TABLE`, a metadata operation, so the hourly purge stops contending with the flush loop entirely.

Pre-partition databases get migrated on first boot: `ALTER TABLE invocations RENAME TO invocations_legacy`. The rename is metadata-only, instant even on the 26 GB prod-us DB. Reads UNION across the legacy table until its data ages past retention, then it gets dropped wholesale.

Also caps the in-memory write-behind buffer (`SQLITE_PENDING_BUFFER_CAP`, default 200k records) with drop-oldest semantics. Bounded RAM matters more than capturing every record during overload; the alternative is OOM and losing all of them.

**Notes**

- `refresh_cached_stats` uses `MAX(rowid)` on the legacy table instead of `COUNT(*)`: index-fast, and accurate enough for healthz during the transition window.
- Recovery still chunks per-partition by 4-hour windows so a multi-GB legacy table doesn't hold the read snapshot for minutes.
- Today's and tomorrow's partitions get pre-created in the purge loop so the midnight-UTC rollover doesn't race with the first flush of the day.
- Disk-pressure path simplified: emergency mode just drops more partitions until under the cap. No more fallback DELETE.
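A minimal sketch of the partition-name scheme and `DROP TABLE` retention, including the pre-creation of today's and tomorrow's partitions. Function names, column lists, and the retention parameter are assumptions for illustration; only the `invocations_YYYYMMDD` naming comes from the commit message:

```python
import sqlite3
from datetime import date, timedelta

def partition_name(day: date) -> str:
    return f"invocations_{day:%Y%m%d}"

def ensure_partition(conn, day: date) -> None:
    # Columns are assumed; the real definition lives in SCHEMA.
    conn.execute(f"""CREATE TABLE IF NOT EXISTS {partition_name(day)} (
        task_id INTEGER PRIMARY KEY, finished_at INTEGER)""")

def purge_tick(conn, today: date, retention_days: int) -> None:
    # Pre-create today's and tomorrow's partitions so the first flush
    # after midnight UTC never races table creation.
    ensure_partition(conn, today)
    ensure_partition(conn, today + timedelta(days=1))
    cutoff = partition_name(today - timedelta(days=retention_days))
    tables = conn.execute("""SELECT name FROM sqlite_master
                             WHERE type='table' AND name LIKE 'invocations_2%'""").fetchall()
    for (name,) in tables:
        if name < cutoff:                       # lexicographic == chronological for YYYYMMDD
            conn.execute(f"DROP TABLE {name}")  # metadata-only: no row-by-row DELETE

conn = sqlite3.connect(":memory:")
for d in range(5):
    ensure_partition(conn, date(2026, 5, 2) - timedelta(days=d))
purge_tick(conn, date(2026, 5, 2), retention_days=2)
names = sorted(n for (n,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"))
print(names)
```

Dropping a whole table touches only the schema and the freelist, so retention no longer holds the write lock for the duration of a 26 GB scan.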
**Test plan**

- [x] Smoke test: write across two days, query, purge; verified partition rotation
- [x] Migration test: pre-partition DB → renamed to `_legacy`, reads find both, recovery works (1000 rows replayed)
- [x] Docker build + boot: clean fresh-DB and migrated-DB scenarios
- [ ] Deploy to prod-us, monitor RSS over the next purge cycle (~01:00 UTC May 3)
- [ ] Verify `dropped-invocations` counter stays at 0 in healthz logs
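The migration path exercised in the test plan (metadata-only rename to `_legacy`, reads that UNION across old and new tables) can be sketched as follows. Table names follow the commit message; the columns and the specific partition date are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Pre-partition database: one monolithic table holding old rows.
conn.execute("CREATE TABLE invocations (task_id INTEGER PRIMARY KEY, finished_at INTEGER)")
conn.executemany("INSERT INTO invocations VALUES (?, ?)", [(1, 100), (2, 200)])

# First boot after the upgrade: metadata-only rename, instant even on a 26 GB file.
conn.execute("ALTER TABLE invocations RENAME TO invocations_legacy")

# New writes land in the current daily partition.
conn.execute("CREATE TABLE invocations_20260502 (task_id INTEGER PRIMARY KEY, finished_at INTEGER)")
conn.execute("INSERT INTO invocations_20260502 VALUES (3, 300)")

# Reads UNION across the legacy table until its data ages past retention.
rows = conn.execute("""
    SELECT task_id, finished_at FROM invocations_legacy
    UNION ALL
    SELECT task_id, finished_at FROM invocations_20260502
    ORDER BY task_id
""").fetchall()
print(rows)  # [(1, 100), (2, 200), (3, 300)]
```

Once every row in `invocations_legacy` is older than the retention window, the table is dropped wholesale and the UNION branch disappears.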
Hotfix: PR #24 merged with a heavy covering index in SCHEMA. CREATE INDEX on the existing 26 GB prod DB takes 20+ min, longer than the startup probe budget. The pod gets stuck in CrashLoopBackOff: each attempt is killed mid-build, SQLite rolls back the partial index, net zero progress.
The batched purge from #24 already fixes the death spiral on its own. The covering index was a marginal optimization, not worth the deploy hazard.
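The batched purge referenced here can be sketched as below. The batch size, yield interval, and function name are assumptions; the rowid-subquery form is used because SQLite's `DELETE ... LIMIT` syntax is an optional compile-time feature:

```python
import sqlite3
import time

def batched_purge(conn, cutoff, batch=1000, pause=0.0):
    """Delete expired rows in small batches, ending the write transaction
    between batches so the flush loop can grab the lock and drain its buffer."""
    total = 0
    while True:
        cur = conn.execute("""
            DELETE FROM invocations WHERE rowid IN (
                SELECT rowid FROM invocations WHERE finished_at < ? LIMIT ?)""",
            (cutoff, batch))
        conn.commit()           # write lock released here, between batches
        if cur.rowcount == 0:
            break
        total += cur.rowcount
        time.sleep(pause)       # yield to concurrent writers
    return total

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invocations (task_id INTEGER PRIMARY KEY, finished_at INTEGER)")
conn.executemany("INSERT INTO invocations VALUES (?, ?)",
                 [(i, i) for i in range(5000)])
deleted = batched_purge(conn, cutoff=4000, batch=1000)
print(deleted)  # 4000
```

Each batch is a short transaction, so the purge never holds the write lock for the full table scan; that is what un-wedges the flush loop even though the total work is unchanged.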
Test plan