CS-11128: DB-coordinated cost barrier (replaces per-process pendingCostPromises) #4878

Merged
lukemelia merged 1 commit into main from cs-11128-replace-per-process-pendingcostpromises-with-db-coordinated
May 19, 2026

@lukemelia

Summary

  • Replaces the in-memory Map<matrixUserId, Promise> cost barrier in packages/realm-server/lib/proxy-forward.ts with a two-layer serializer that holds across replicas: pg_advisory_xact_lock(hash('cost-barrier:' + userId)) for cross-replica serialization, plus an in-process promise chain (PgAdapter#userCostQueue) so the per-replica pool footprint stays bounded to one pinned client per active user, not one per concurrent request.
  • handleStreamingRequest now awaits saveUsageCost inline (no more fire-and-forget) so the lock is held until the cost row commits.
  • New DBAdapter.withUserCostLock(matrixUserId, fn). PgAdapter implements it; SQLite is a passthrough.
  • ai-bot is intentionally not changed: it runs N=1 and talks to OpenRouter directly, not through the realm-server's /_request-forward or /_openrouter/chat/completions endpoints. If ai-bot ever scales out, wrapping its critical section in assistant.pgAdapter.withUserCostLock(userId, ...) will extend the barrier with no schema changes.
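
A minimal sketch of the two-layer shape described above, not the PR's actual code. The class name `UserCostSerializer` and the injected `CrossReplicaLock` are hypothetical; the injected function stands in for the Postgres transaction that takes `pg_advisory_xact_lock`, so the sketch runs without a database.

```typescript
// Hypothetical stand-in for the cross-replica layer. In the PR that layer is
// a Postgres transaction holding pg_advisory_xact_lock; it is injected here
// so the sketch runs without a database.
type CrossReplicaLock = <T>(userId: string, fn: () => Promise<T>) => Promise<T>;

class UserCostSerializer {
  // Tail of the in-process chain for each user with work queued or running.
  private tails = new Map<string, Promise<unknown>>();
  private acquire: CrossReplicaLock;

  constructor(acquire: CrossReplicaLock) {
    this.acquire = acquire;
  }

  async withUserCostLock<T>(userId: string, fn: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(userId);
    const run = (async () => {
      // Wait for the prior same-user caller, swallowing its rejection so one
      // failure does not cascade down the chain.
      if (previous) await previous.catch(() => undefined);
      // Only the head of the chain reaches the cross-replica lock.
      return this.acquire(userId, fn);
    })();
    this.tails.set(userId, run);
    try {
      return await run;
    } finally {
      // Drop the entry once the last queued caller for this user finishes.
      if (this.tails.get(userId) === run) this.tails.delete(userId);
    }
  }
}
```

Because later callers wait on the in-process chain before touching the injected lock, at most one caller per user per process is ever inside `acquire`, which is the property that bounds pinned pool clients.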

Why

proxy-forward.ts:17 (pre-PR) gated a user's next billable upstream call behind the prior call's cost deduction landing in credits_ledger, via a process-local map. Under N replicas with no stickiness, two concurrent same-user requests on different replicas would each see an empty map, both forward to OpenRouter, and the bounded-overdraft guard the map was supposed to enforce was silently disabled. Identified in the 2026-05-12 audit for Unlock Horizontal Scaling of realm-server.
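
The failure mode can be reproduced with a toy model. This is a simplified simulation, not the pre-PR code: `ReplicaWithLocalGate`, the in-memory ledger, and the timings are all assumptions chosen to make the race visible.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Simplified model of the pre-PR gate: each replica tracks pending cost
// deductions in its own process-local map.
class ReplicaWithLocalGate {
  private pendingCost = new Map<string, Promise<void>>();
  private ledger: Map<string, number>;

  constructor(sharedLedger: Map<string, number>) {
    this.ledger = sharedLedger;
  }

  async handle(userId: string, cost: number): Promise<"forwarded" | "rejected"> {
    // Waits only for deductions that this process knows about.
    await this.pendingCost.get(userId);
    if ((this.ledger.get(userId) ?? 0) <= 0) return "rejected";
    await sleep(1); // stand-in for the upstream call
    // Fire-and-forget deduction that lands in the ledger a little later.
    const deduction = sleep(10).then(() => {
      this.ledger.set(userId, (this.ledger.get(userId) ?? 0) - cost);
    });
    this.pendingCost.set(userId, deduction);
    return "forwarded";
  }
}

// Sends two same-user requests, either to one replica or to two replicas
// that share a ledger but not a map.
async function simulate(replicaCount: 1 | 2) {
  const user = "@alice:example.org";
  const ledger = new Map<string, number>([[user, 10]]);
  const first = new ReplicaWithLocalGate(ledger);
  const second = replicaCount === 1 ? first : new ReplicaWithLocalGate(ledger);
  const outcomes = [await first.handle(user, 10), await second.handle(user, 10)];
  await sleep(30); // let pending deductions land
  return { outcomes, balance: ledger.get(user) };
}
```

With one replica, `simulate(1)` yields `forwarded, rejected` and a final balance of 0. With two, `simulate(2)` yields `forwarded, forwarded` and a balance of -10, because the second replica's map is empty and the first deduction has not landed yet.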

The ticket's recommendation was "wrap upstream-call + cost-deduction in pg_advisory_xact_lock(hashtext(matrixUserId)); the existing in-memory map can be deleted." The straightforward version of that pins one pool client per concurrent same-user request, which is unacceptable on the realm-server's shared pool (sized for the indexer's 20-client baseline + ~20 margin). Retaining the in-memory map as an in-process coalescer brings the per-replica footprint back to one pinned client per active user.

Key design notes

  • Namespaced lock key. hashUserIdForCostLock prefixes 'cost-barrier:' before sha256 so user-cost locks can never collide with the realm-write lock space (hashRealmUrlForAdvisoryLock), even though input shapes are disjoint in practice.
  • Failure chain. A prior in-process caller's rejection does NOT cascade: await previous.catch(...) swallows it so the queue marches on. The lock's rollback semantics are the same as withWriteLock: an aborted transaction releases the xact-lock when it rolls back, so there is no stale-lock risk.
  • Callback receives no txQuerier. Cost-save runs through the shared dbAdapter on a separate pool connection; the only pinned client is the lock holder.
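
A sketch of how a namespaced lock key could be derived. The PR states only that `'cost-barrier:'` is prefixed before sha256 and that the result is in 64-bit range; taking the first 8 digest bytes as a signed big-endian integer is an assumption here, and `costLockKeySketch` is a hypothetical name, not the PR's `hashUserIdForCostLock`.

```typescript
import { createHash } from "node:crypto";

// Assumption: the key is the first 8 bytes of sha256('cost-barrier:' + id),
// read as a signed 64-bit integer so it fits a Postgres bigint argument.
function costLockKeySketch(matrixUserId: string): bigint {
  const digest = createHash("sha256")
    .update("cost-barrier:" + matrixUserId)
    .digest();
  return digest.readBigInt64BE(0);
}

// The same input always maps to the same key; the prefix means the key
// differs from a hash of the bare id.
const key = costLockKeySketch("@alice:example.org");
console.log(key.toString());
```

A key like this would typically be passed as a string parameter to `SELECT pg_advisory_xact_lock($1)` inside an open transaction, since JavaScript numbers cannot represent the full 64-bit range.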

Test plan

  • pnpm --filter @cardstack/realm-server test-module realm-advisory-locks-test.ts → 18/18 pass (7 new: hash determinism, namespace disjointness, 64-bit range, serialization, parallel-different-users, throw-releases, N=8 in-process coalescing, chain-failure resilience, no-realm-write-cross-talk)
  • pnpm --filter @cardstack/realm-server test-module request-forward-test.ts → 13/13 pass (1 new: "serializes concurrent same-user OpenRouter calls via the cost-barrier lock" — gates two concurrent requests at the fetch stub, asserts the second blocks until the first's cost lands, then verifies both debits)
  • pnpm --filter @cardstack/realm-server test-module openrouter-passthrough-test.ts → 5/5 still pass
  • CI green
  • Smoke-test on staging once deployed: a single user with two tabs hitting the same realm-server fleet should see correct serialization and no overdraft
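
The gating technique used by the request-forward test can be sketched as follows. This is an illustration of the pattern only: `deferred`, `withLock`, and `runGatedScenario` are hypothetical names, and the single promise chain stands in for the real lock under test.

```typescript
// A promise whose resolution the test controls.
function deferred(): { promise: Promise<void>; resolve: () => void } {
  let resolve: () => void = () => {};
  const promise = new Promise<void>((r) => {
    resolve = () => r();
  });
  return { promise, resolve };
}

// Hypothetical stand-in for the lock under test: one promise chain.
let tail: Promise<unknown> = Promise.resolve();
function withLock<T>(fn: () => Promise<T>): Promise<T> {
  const run = tail.catch(() => undefined).then(fn);
  tail = run;
  return run;
}

const nextTick = () => new Promise<void>((r) => setTimeout(r, 0));

async function runGatedScenario() {
  const events: string[] = [];
  const gates = [deferred(), deferred()];
  let fetchCalls = 0;

  // Fetch stub that parks each call on its own gate.
  const fetchStub = async () => {
    const i = fetchCalls++;
    events.push(`fetch-${i}`);
    await gates[i].promise;
    return i;
  };
  const request = () =>
    withLock(async () => {
      const i = await fetchStub();
      events.push(`cost-saved-${i}`); // stand-in for saveUsageCost
    });

  const first = request();
  const second = request();
  await nextTick();
  // While the first request is parked at the stub, the second must not
  // have reached fetch at all.
  const fetchCallsWhileFirstHeld = fetchCalls;

  gates[0].resolve();
  await first;
  gates[1].resolve();
  await second;
  return { fetchCallsWhileFirstHeld, events };
}
```

The assertion that matters is that the second `fetch` appears in `events` only after the first `cost-saved`, which is what "the second blocks until the first's cost lands" means in practice.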

🤖 Generated with Claude Code

… cost barrier

Pre-CS-11128, packages/realm-server/lib/proxy-forward.ts gated a user's
next billable proxy call behind a per-process in-memory Map<userId,
Promise>. Under N realm-server replicas with no stickiness, two
concurrent same-user requests landing on different replicas bypass the
gate, both forward to OpenRouter before the credit ledger has caught
up, and the bounded-overdraft guard the map was supposed to enforce is
silently disabled.

Replace with a two-layer barrier:

1. Cross-replica: pg_advisory_xact_lock on a namespaced hash of the
   matrix user id (hashUserIdForCostLock prefix 'cost-barrier:' keeps
   the key space disjoint from withWriteLock's realm-URL space). Held
   across validate-credits → upstream call → save-cost; it does the
   old in-memory map's job, but holds across replicas.
2. In-process: PgAdapter#userCostQueue chains same-user callers in
   one process on an in-memory promise. Only the head of the chain is
   actively waiting on the DB lock, so per-replica pool footprint is
   bounded to one pinned client per active user, not per concurrent
   request. Without this, N concurrent same-user tabs on one replica
   would pin N pool clients while blocked on the lock, eating into
   the shared pool's ~40-client budget (the indexer's 20-client
   baseline plus ~20 margin).

handleStreamingRequest now awaits saveUsageCost inline (not
fire-and-forget) so the lock is held until the cost row commits.
Handlers (request-forward, openrouter-passthrough) wrap the critical
section in dbAdapter.withUserCostLock(userId, …).
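
A sketch of what wrapping the critical section might look like. Only the
withUserCostLock(matrixUserId, fn) shape and the validateCredits and
saveUsageCost names come from this PR; BillingDeps, callUpstream,
forwardBillableRequest, the helper signatures, and the 402 status are
assumptions for illustration.

```typescript
// Shape taken from the PR description; everything else is illustrative.
interface CostLockAdapter {
  withUserCostLock<T>(matrixUserId: string, fn: () => Promise<T>): Promise<T>;
}

// Hypothetical dependencies standing in for the real handler's helpers.
interface BillingDeps {
  validateCredits(userId: string): Promise<boolean>;
  callUpstream(): Promise<{ body: string; cost: number }>;
  saveUsageCost(userId: string, cost: number): Promise<void>;
}

interface ForwardResult {
  status: number;
  body: string;
}

async function forwardBillableRequest(
  db: CostLockAdapter,
  deps: BillingDeps,
  userId: string,
): Promise<ForwardResult> {
  return db.withUserCostLock(userId, async (): Promise<ForwardResult> => {
    if (!(await deps.validateCredits(userId))) {
      // Status code is an assumption, not taken from the PR.
      return { status: 402, body: "insufficient credits" };
    }
    const upstream = await deps.callUpstream();
    // Awaited inline, not fire-and-forget: the lock must stay held until
    // the cost has been saved.
    await deps.saveUsageCost(userId, upstream.cost);
    return { status: 200, body: upstream.body };
  });
}

// Passthrough adapter, in the spirit of the SQLite implementation.
const passthroughAdapter: CostLockAdapter = {
  withUserCostLock<T>(_matrixUserId: string, fn: () => Promise<T>): Promise<T> {
    return fn();
  },
};
```

If saveUsageCost were not awaited, the callback would return and the
lock would be released before the deduction landed, reopening the
window the barrier is meant to close.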

ai-bot is N=1 by design and talks to OpenRouter directly (not through
realm-server), so it remains on its in-memory barrier — documented in
memory rather than changed here.

Tests:
- realm-advisory-locks-test.ts: 18 pass, 7 new (hashUserIdForCostLock
  determinism + namespace separation + 64-bit range; withUserCostLock
  serialization, parallel-different-users, throw-releases, N=8
  in-process coalescing, chain-failure resilience, no cross-talk with
  withWriteLock).
- request-forward-test.ts: 13 pass, 1 new ("serializes concurrent
  same-user OpenRouter calls via the cost-barrier lock") — gates two
  concurrent same-user requests at the fetch stub, asserts the second
  blocks until the first's cost lands, then verifies both debits.
- openrouter-passthrough-test.ts: 5/5 still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented May 19, 2026

Preview deployments

Host Test Results

    1 files      1 suites   1h 45m 48s ⏱️
2 673 tests 2 658 ✅ 15 💤 0 ❌
2 692 runs  2 677 ✅ 15 💤 0 ❌

Results for commit af89225.

Realm Server Test Results

    1 files      1 suites   8m 44s ⏱️
1 416 tests 1 416 ✅ 0 💤 0 ❌
1 507 runs  1 507 ✅ 0 💤 0 ❌

Results for commit af89225.


Copilot AI left a comment

Pull request overview

Replaces an in-memory per-user cost-barrier Map in proxy-forward.ts with a two-layer serializer (Postgres advisory xact-lock + per-process promise queue) exposed via a new DBAdapter.withUserCostLock(matrixUserId, fn). This fixes a silent overdraft bug under N>1 realm-server replicas, where concurrent same-user calls on different replicas could each pass the per-process gate before the other's cost was deducted, while keeping per-replica pool footprint bounded to one pinned connection per active user (not per concurrent request).

Changes:

  • Adds withUserCostLock to DBAdapter; implemented in PgAdapter with namespaced hashUserIdForCostLock + pg_advisory_xact_lock and an in-process queue, no-op in SQLiteAdapter.
  • Refactors proxy-forward.ts to remove pendingCostPromises/awaitPendingCost/trackCostDeduction; handleStreamingRequest now awaits saveUsageCost inline so the lock is held until the cost row commits.
  • Updates handle-request-forward.ts and handle-openrouter-passthrough.ts to wrap validateCredits → upstream call → saveUsageCost inside withUserCostLock, plus new advisory-lock and request-forward tests.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| packages/runtime-common/db.ts | Adds withUserCostLock to DBAdapter interface with design comments. |
| packages/postgres/pg-adapter.ts | Implements withUserCostLock + hashUserIdForCostLock; extracts shared #runWithAdvisoryXactLock. |
| packages/host/app/lib/sqlite-adapter.ts | Passthrough withUserCostLock for SQLite. |
| packages/realm-server/lib/proxy-forward.ts | Removes in-memory cost-barrier helpers; awaits saveUsageCost inline before returning from streaming handler. |
| packages/realm-server/handlers/handle-request-forward.ts | Wraps credit-check/fetch/save-cost in withUserCostLock; consolidates streaming-support check. |
| packages/realm-server/handlers/handle-openrouter-passthrough.ts | Same lock wrapper as above for the OpenRouter passthrough route. |
| packages/realm-server/tests/realm-advisory-locks-test.ts | New tests for hash determinism, namespacing, serialization, parallel different users, failure-release, in-process coalescing, no cross-talk with realm-write lock. |
| packages/realm-server/tests/request-forward-test.ts | New end-to-end test that two concurrent same-user calls serialize at the fetch stub. |
| packages/realm-server/tests/{prerender-proxy,screenshot-card,indexing-event-sink}-test.ts | Test-double DBAdapters add withUserCostLock stub. |
| packages/bot-runner/tests/{bot-runner,command-runner}-test.ts | Test-double DBAdapters add withUserCostLock stub. |
| packages/runtime-common/tests/run-command-task-shared-tests.ts | Test-double DBAdapter adds withUserCostLock stub. |

@lukemelia lukemelia marked this pull request as ready for review May 19, 2026 16:35
@lukemelia lukemelia requested review from a team and jurgenwerk May 19, 2026 16:36
@lukemelia lukemelia merged commit 0b75191 into main May 19, 2026
133 of 135 checks passed

3 participants