fix(health): single-flight HealthMetricsService.compute() to prevent concurrent dedup full-scans (#262)#264
Conversation
…concurrent dedup full-scans (#262) Four callers can invoke compute() concurrently (hourly cron, HTTP refresh endpoint, snapshot service, dream-cycle completion hook). Each call runs a full GROUP BY raw over the memories table. On a ~430K-row / 3.9GB table, three overlapping invocations consumed ~240% CPU and saturated the host fans in an observed incident (issue #262). This change adds a single-flight guard: concurrent compute() callers share the same in-flight promise, so the underlying queries run at most once per overlap window. The slot is released in finally(), so subsequent non-overlapping callers compute fresh as before, and a rejection does not poison future calls. Note: this does not address the underlying query cost. A follow-up will add an expression index on memories(raw) or relax the stat to a sampled approximation. Issue #262 stays open until both fixes land. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughHealthMetricsService now caches in-flight ChangesSingle-flight concurrent health-metrics computation
Sequence DiagramsequenceDiagram
participant Caller1
participant Caller2
participant Caller3
participant HealthMetricsService
Caller1->>HealthMetricsService: compute() [no cached promise]
HealthMetricsService->>HealthMetricsService: create inFlightCompute promise
HealthMetricsService->>HealthMetricsService: start computeUncached()
HealthMetricsService->>Caller1: return promise
Caller2->>HealthMetricsService: compute() [inFlightCompute exists]
HealthMetricsService->>Caller2: return same promise
Caller3->>HealthMetricsService: compute() [inFlightCompute exists]
HealthMetricsService->>Caller3: return same promise
HealthMetricsService->>HealthMetricsService: computeUncached completes
HealthMetricsService->>HealthMetricsService: clear inFlightCompute (finally)
Caller1->>Caller1: resolve with result
Caller2->>Caller2: resolve with result
Caller3->>Caller3: resolve with result
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Recall Benchmark ResultsFull outputCommit: 56e36f6 |
Closes #262 (partial — single-flight only; expression index / sampled stat is the follow-up).
Why a v2 PR?
PR #263 was branched off a stale base and ended up carrying PR #242 (Elasticsearch hybrid search) plus the actual fix, which produced unresolvable merge conflicts against current staging. This PR is the same fix commit cherry-picked clean onto current staging. Closing #263.
The fix
Four callers can invoke `HealthMetricsService.compute()` concurrently (hourly cron, HTTP refresh endpoint, snapshot service, dream-cycle completion hook). Each call runs a full `GROUP BY raw` over the memories table. On a ~430K-row / 3.9GB table, three overlapping invocations were observed consuming ~240% CPU and saturating host fans (issue #262).
Adds a single-flight guard so concurrent callers share the in-flight promise. The slot is released in `finally()` — subsequent non-overlapping callers compute fresh, and a rejection does not poison future calls.
Scope
What this does NOT fix
The underlying query cost. `GROUP BY raw` on 3.9GB still scans the whole table — it just won't be scanned 3x in parallel anymore. Follow-up needed: expression index on `memories(raw)` or sampled approximation. Issue #262 stays open for that.
Verification
```
pnpm jest src/health/health-metrics.service.spec.ts
Tests: 11 passed, 11 total
```
🤖 Generated with Claude Code
Summary by CodeRabbit