perf(kiloclaw): index-backed first-instance lookup for setup promo#3260
Conversation
Lock in current behavior before refactoring the hot-path query that accounts for ~21% of read-replica time. Real-DB tests cover destroyed-row semantics, multi-instance first-instance ordering, custom maxAgeHours, and per-user isolation.
This query was ~21% of read-replica time-consumed. The old aggregate `min(created_at) >= now() - interval` filtered only on `user_id`; every existing index on that column is partial on `destroyed_at IS NULL`, so the planner fell back to scanning the whole user history (destroyed rows must still count for first-instance semantics, so we can't add the partial predicate to the query). Add a non-partial `(user_id, created_at)` index and rewrite the lookup as `ORDER BY created_at LIMIT 1` with the window check in JS. Verified with EXPLAIN that this becomes an index-only scan returning a single row. Migration uses CREATE INDEX CONCURRENTLY (precedent: 0115) since `kiloclaw_instances` is a hot table.
Code Review SummaryStatus: No Issues Found | Recommendation: Merge Files Reviewed (6 files)
NotesMigration — Query rewrite — Semantics — destroyed instances still count (no Test coverage — 9 cases cover: no instances, inside/outside window, multiple instances (oldest wins), destroyed-instance counting, custom Fix these issues in Kilo Cloud Reviewed by claude-4.6-sonnet-20260217 · 852,599 tokens |
jeanduplessis
left a comment
There was a problem hiding this comment.
Reviewed the final combined diff and PR context. Approving per request without line comments.
Summary
The
userIsWithinFirstKiloClawInstanceWindowquery inapps/web/src/lib/kiloclaw/setup-promo.tswas responsible for ~21% of read-replica time-consumed.The original query used
min(created_at) >= now() - intervaland filtered only onuser_id. Every existing index onkiloclaw_instances.user_idis partial ondestroyed_at IS NULL, so the planner couldn't use them — destroyed rows must still count for "first instance" semantics, otherwise a returning user could destroy their first instance and re-qualify for the setup-promo window indefinitely.Two changes:
(user_id, created_at)btree index (IDX_kiloclaw_instances_user_id_created_at). This is the only index that can serve auser_id-only lookup over the full instance history.min()aggregate withORDER BY created_at LIMIT 1plus a JS-side window check.EXPLAINconfirms this becomes an index-only scan returning a single row.The function's behavior is unchanged. New tests in
setup-promo.test.ts(added in the first commit on this branch) lock in the semantics — including the destroyed-rows-still-count case — and pass against both the old and new implementations.The migration uses
CREATE INDEX CONCURRENTLY(precedent:0115_real_energizer.sql) sincekiloclaw_instancesis a hot table and a blocking index build would lock writes.Verification
EXPLAINagainst the local test database that the rewritten query is an index-only scan usingIDX_kiloclaw_instances_user_id_created_at, no heap fetches, no sequential scan.EXPLAINfor the originalmin()query with the new index in place — planner correctly rewrites it to a Limit + Index Only Scan, confirming the index alone would have resolved the perf issue.Visual Changes
N/A
Reviewer Notes
COMMIT;and ends withBEGIN;to allowCREATE INDEX CONCURRENTLYto run outside the transaction Drizzle wraps migrations in. Same pattern aspackages/db/src/migrations/0115_real_energizer.sql. Worth a sanity check from whoever owns the migration apply path.min()semantics.min()-via-index optimization (which can occasionally be defeated by parallel workers / unusual stats).