perf(telemetry): cache approxIndexSizeBytes - codedb_status 9.4x faster #474
…9.4x faster

The approxIndexSizeBytes function iterates 3 indexes (word_index, trigram_index, sparse_ngram_index) to sum approximate memory usage. Its only caller is codedb_status, which calls it once per status call.

After PR #471 lifted the trigram cap from 64KB to 1MB (a correctness fix that pulled large code files into the index), this iteration got ~2x slower because trigram_index grew. The CI bench flagged codedb_status +96% (216us -> 423us), which was noisy enough to gate the parity merge.

Fix: cache the computed size with a 5-second TTL. Status is a human-readable summary that says "index_memory: NN KB"; 5-second staleness is fine for an approximate counter. The first call after each 5s window still does the full iteration to refresh.

Measured on React corpus (100 iter warm, codedb_status MCP):
- before (no cache): ~423us p50
- after (cached): 45us p50 (9.4x speedup), 150us p99 (the periodic refresh)

Also faster than the pre-cap-lift baseline (216us p50), because the cached fast-path skips even the original iteration cost.

Cost: ~32 bytes of atomic state (two u64 atomics).
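As a rough illustration of the TTL scheme described above, here is a minimal sketch, not the PR's actual code. It assumes Zig 0.12 or later (`std.atomic.Value`, lowercase `.monotonic`). The names `size_cache_at_ms`, `computeSizeSlow`, and `approxIndexSizeCached` are hypothetical, the slow path is a stand-in counter rather than the real three-index walk, and the clock is passed in as `now_ms` so the behavior is deterministic.

```zig
const std = @import("std");

// Assumed TTL: the PR describes a 5-second window.
const SIZE_CACHE_TTL_MS: u64 = 5_000;

// Two u64 atomics: the cached size and the time it was computed.
// A timestamp of 0 is reserved to mean "nothing cached yet".
var size_cache_value = std.atomic.Value(u64).init(0);
var size_cache_at_ms = std.atomic.Value(u64).init(0);

// Hypothetical stand-in for the full index iteration. It counts its own
// invocations so a caller can observe when the slow path actually runs.
var slow_calls: u64 = 0;
fn computeSizeSlow() u64 {
    slow_calls += 1;
    return 1024 * slow_calls;
}

// Returns the cached size while it is younger than the TTL, otherwise
// recomputes and refreshes the cache. `now_ms` must be non-zero.
fn approxIndexSizeCached(now_ms: u64) u64 {
    const cached_at = size_cache_at_ms.load(.monotonic);
    // Wrapping subtraction: if the clock steps backwards the difference
    // becomes huge, so the entry is treated as expired instead of trapping.
    if (cached_at != 0 and now_ms -% cached_at < SIZE_CACHE_TTL_MS) {
        return size_cache_value.load(.monotonic);
    }
    const size = computeSizeSlow();
    size_cache_value.store(size, .monotonic);
    size_cache_at_ms.store(now_ms, .monotonic);
    return size;
}
```

With relaxed (`.monotonic`) ordering, a concurrent reader may briefly pair a fresh timestamp with the previous value, or two threads may both refresh. For an approximate counter that seems an acceptable trade against taking a lock on every status call.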
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f056a3568a
    if (cached_at != 0 and now - cached_at < SIZE_CACHE_TTL_MS) {
        return size_cache_value.load(.monotonic);
Scope index-size cache per explorer
This cache is process-global, so codedb_status can return the wrong index_memory when two different projects are queried within 5 seconds: MCP resolves project to different Explorer instances via ProjectCache.get(...) and then calls handleStatus(...), but approxIndexSizeBytes reuses one shared cached value for all explorers. In that scenario, project B will report project A’s index size until the TTL expires, which is a functional regression (not just staleness) for multi-project MCP sessions.
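One way to address this, sketched under assumptions rather than taken from the codebase, is to move the two atomics into the explorer so each project keeps its own cached size. The `Explorer` struct below is a hypothetical stand-in (the real type is not shown in this PR), `index_bytes` replaces the actual index walk, and `now_ms` is passed in to keep the example deterministic. It assumes Zig 0.12 or later.

```zig
const std = @import("std");

const SIZE_CACHE_TTL_MS: u64 = 5_000;

// Hypothetical per-instance cache slot: same two atomics, but owned by
// the explorer instead of living at process scope.
const SizeCache = struct {
    value: std.atomic.Value(u64) = std.atomic.Value(u64).init(0),
    at_ms: std.atomic.Value(u64) = std.atomic.Value(u64).init(0),
};

// Minimal stand-in for the real Explorer type.
const Explorer = struct {
    index_bytes: u64, // placeholder for summing the three indexes
    size_cache: SizeCache = .{},

    fn approxIndexSizeBytes(self: *Explorer, now_ms: u64) u64 {
        const cached_at = self.size_cache.at_ms.load(.monotonic);
        // 0 means "never cached"; wrapping subtraction tolerates a clock
        // that steps backwards by treating the entry as expired.
        if (cached_at != 0 and now_ms -% cached_at < SIZE_CACHE_TTL_MS) {
            return self.size_cache.value.load(.monotonic);
        }
        const size = self.index_bytes;
        self.size_cache.value.store(size, .monotonic);
        self.size_cache.at_ms.store(now_ms, .monotonic);
        return size;
    }
};
```

With this layout, querying project B right after project A reads B's own slot, so the only remaining inaccuracy is the intended 5-second staleness. The cost grows from one pair of atomics per process to one pair per explorer.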
Benchmark Regression Report
Thresholds: 10.00% and 50,000 ns absolute delta
Fixes the codedb_status +96% regression flagged after PR #471 (trigram cap-lift). Caches the index-size computation for 5 seconds - appropriate for a "this is approximate" memory counter.
Measured (React corpus, 100 iter warm, MCP)
9.4x faster, also faster than the pre-cap-lift baseline (216 us). Cost: 32 bytes of atomic state.