Phase 2: job-scoped same-realm search cache during indexing#4791
Conversation
Wrap `searchRealms` in `_federated-search` with a per-process `JobScopedSearchCache` keyed by (jobId, normalizedQuery, normalizedOpts). Entries store the *resolved* doc only (not in-flight promises) so a slow first populate cannot tail-latency-block concurrent peers past their render-timeout window. Concurrent same-key callers each run their own populate; Phase 1's in-flight dedup at `RealmIndexQueryEngine.searchCards` absorbs the duplicate inner SQL+loadLinks work. This cache only optimises *sequential* repeats within the same indexing job. Gated on three conditions all holding: - `x-boxel-job-id` present (only the indexer worker stamps this; live user / API callers never carry it and therefore always see fresh data). - `x-boxel-consuming-realm` present (the host's render route only sets it during prerender; new constant + sanitizer live in runtime-common's prerender-headers.ts so the host SPA and realm- server can both import it). - The request's `realms` list is exactly `[consumingRealm]` — cross- realm reads bypass the cache because a peer realm can swap its `boxel_index` mid-batch and a cached value would freeze a stale snapshot. Cache lifetime: 10-minute TTL with `setTimeout(unref)` cleanup. A `clearJob(jobId)` method is exposed for future NOTIFY-driven eviction on worker job done/reject; until then, leaked entries persist at most TTL after the job ends (no correctness impact — `jobId` keying means no future job hits a stale entry). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ob-cache # Conflicts: # packages/host/app/services/store.ts # packages/realm-server/handlers/handle-search.ts # packages/realm-server/server.ts
Prettier line wrapping + replace assert.ok with logical operators by binding the boolean to a local first (qunit lint rule no-assert-logical-expression).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1501a8cfdb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Preview deploymentsHost Test Results 1 files ± 0 1 suites ±0 4h 9m 24s ⏱️ + 2h 8m 3s Results for commit c6f1e2c. ± Comparison against earlier commit 82fb488. For more details on these errors, see this check. Realm Server Test Results 1 files ±0 1 suites ±0 11m 16s ⏱️ + 2m 2s Results for commit c6f1e2c. ± Comparison against earlier commit 82fb488. |
- Introduce `X_BOXEL_JOB_ID_HEADER` in runtime-common's
prerender-headers.ts as the canonical definition; realm-server's
prerender-constants.ts re-exports as `PRERENDER_JOB_ID_HEADER` so
existing imports keep working.
- prerender-app's `/prerender-visit` route forwards the inbound
`x-boxel-job-id` header into `prerenderer.prerenderVisit({...jobId})`.
- Prerenderer + RenderRunner thread `jobId` through to
`prerenderVisitAttempt`, which `page.evaluate`s
`globalThis.__boxelJobId = jobId` before the render begins.
- Host's `_federated-search` fetch wrapper reads the global and
attaches the header alongside `consumingRealmHeader()` /
`duringPrerenderHeaders()`.
- CORS allow-list adds `X-Boxel-Job-Id` (preflight blocker on the
prior commit).
With this in place, `handle-search`'s cache gate
(`jobId && consumingRealm && realms === [consumingRealm]`) can
actually evaluate true during indexing.
- Set `Access-Control-Max-Age: 86400` on the CORS middleware. Without
this @koa/cors omits the header and Chrome falls back to its ~5 s
default preflight TTL, which forces a fresh OPTIONS round-trip in
front of nearly every cross-origin QUERY the host fires during a
long indexing run. Doubles realm-server arrival count and adds a
serial RTT in front of each QUERY.
- Fold the new `__boxelJobId` assignment into the existing
`localStorage.setItem('boxel-session', ...)` `page.evaluate` so we
pay one CDP round-trip per visit instead of two. Same effect:
global is set before transitionTo, host's `_federated-search`
wrapper reads it and stamps `x-boxel-job-id` on outbound calls.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds a job-scoped, same-realm-only cache for _federated-search during indexing/prerender to eliminate sequential duplicate searches within a single indexing job, plus the header plumbing needed to safely gate that cache.
Changes:
- Introduces
JobScopedSearchCachein realm-server and wires it into_federated-searchbehind a strict “same realm + job id + consuming realm” gate. - Adds
x-boxel-consuming-realm/x-boxel-job-idheader constants + sanitization inruntime-common, and plumbs globals/headers through host prerender renders. - Extends CORS allow-headers and adds unit tests for both the cache and header sanitizer.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/runtime-common/tests/consuming-realm-header-test.ts | Shared test suite for consuming-realm header sanitizer. |
| packages/runtime-common/prerender-headers.ts | Adds header constants and sanitizeConsumingRealmHeader(). |
| packages/runtime-common/index.ts | Re-exports prerender-headers from runtime-common entrypoint. |
| packages/realm-server/tests/job-scoped-search-cache-test.ts | Unit tests for JobScopedSearchCache. |
| packages/realm-server/tests/index.ts | Registers the new realm-server tests. |
| packages/realm-server/tests/consuming-realm-header-test.ts | Runs the shared runtime-common sanitizer tests under realm-server’s test runner. |
| packages/realm-server/server.ts | Adds new headers to CORS allow-list. |
| packages/realm-server/routes.ts | Instantiates a process-wide JobScopedSearchCache and passes it to _federated-search. |
| packages/realm-server/prerender/render-runner.ts | Injects globalThis.__boxelJobId into the page before render so host can stamp the job header. |
| packages/realm-server/prerender/prerenderer.ts | Threads jobId through prerender visit plumbing into the render runner. |
| packages/realm-server/prerender/prerender-constants.ts | Re-exports job-id header constant from runtime-common; keeps sanitizer local. |
| packages/realm-server/prerender/prerender-app.ts | Reads/sanitizes inbound job id header and forwards it into prerenderer visit args. |
| packages/realm-server/job-scoped-search-cache.ts | New cache implementation keyed by (jobId, normalized query, normalized opts) with TTL eviction. |
| packages/realm-server/handlers/handle-search.ts | Adds cache gate and searchCache.getOrPopulate() wrapping searchRealms(). |
| packages/host/app/services/store.ts | Stamps consuming-realm + job-id headers onto _federated-search requests when globals are present. |
| packages/host/app/routes/render.ts | Sets globalThis.__boxelConsumingRealm for the card being rendered. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Hard cap on total entries with FIFO eviction by populate-order sequence. Bounds worst-case memory under a synthetic-jobId flood: an authenticated reader could otherwise mint arbitrary `<digits>.<digits>` jobId values and grow the cache without bound for the full TTL window. Default 5000 entries. - Routes.ts comment said "TTL-evicts entries 10 min after last touch" — actually the timer fires 10 min after the initial populate; hits don't refresh it. Tying TTL to populate time bounds the leak deterministically, so the code stays; only the comment changes. - job-scoped-search-cache-test.ts: dropped the stale "handler-level integration test below covers the gate" claim and replaced it with a `TODO` pointing at the right home for that coverage (`realm-endpoints/search-test.ts`). Added a unit test for the cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c543f6846f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
The first attempt and the post-restart retry call `prerenderVisitAttempt` with the same args, but the retry was missing `jobId`. Without it, `render-runner` injects `undefined` into `__boxelJobId` on the restarted page → the host's federated-search wrapper omits the header on outbound calls → `handle-search` bypasses the cache for the whole retried render even though those calls still belong to the same indexing job. Render-runner's `page.evaluate` is already a single statement that overwrites both globals atomically, so this fix only needs to pass the ID through the missing destructure site.
The cap-eviction test asserted "B was a cache hit (not evicted)" after re-requesting A on a full cache. That expectation was LRU-shaped — under the actual strict-FIFO semantics, B is the oldest survivor after A is evicted by the D insert; re-inserting A pushes B out next. Rewrite the assertions to match strict FIFO: - After A is evicted by D and re-inserted (seq 4), the map holds C, D, A (B is now evicted). - D was the youngest survivor before A re-entered, so D should still be a hit. Pick D rather than C as the "still hit" assertion so the test is stable under future cap-math changes. - B is the entry the re-insert pushed out, so B should miss now — add that assertion to make the eviction observable. Test now correctly characterises the implementation; impl unchanged.
Summary
Wraps
searchRealmsin_federated-searchwith a per-processJobScopedSearchCachekeyed by(jobId, normalizedQuery, normalizedOpts). Within an indexing batch,boxel_indexis snapshot-stable for same-realm reads (the writer touchesboxel_index_workinguntilapplyBatchUpdatescommits) — so once a batch's first_federated-searchfor a given query resolves, every later same-key call within the samejobIdshort-circuits to the cached doc.The cache stores resolved docs, not in-flight promises. Concurrent same-key callers each run their own populate; Phase 1's in-flight dedup at
RealmIndexQueryEngine.searchCards(#4784) already coalesces the inner SQL + loadLinks walk for same-realm calls arriving concurrently. Storing promises was tried first and produced a tail-latency stall — a slow first populate blocked every later same-key caller past its render-timeout window. Resolved-only avoids that failure mode and keeps the sequential-dedup win, which is what the cohort-render pattern actually needs.Plumbing
X_BOXEL_CONSUMING_REALM_HEADER+X_BOXEL_JOB_ID_HEADERconstants (with sanitizers) inpackages/runtime-common/prerender-headers.ts. Re-exported from realm-server'sprerender-constants.tsso existing imports ofPRERENDER_JOB_ID_HEADERkeep working.globalThis.__boxelConsumingRealminmodel(); prerender's render-runner setsglobalThis.__boxelJobIdviapage.evaluate(folded into the existing auth setup, one CDP round-trip total). The host's federated-search wrapper inservices/store.tsreads both globals and attaches the headers alongside the existingx-boxel-during-prerender.X-Boxel-Consuming-Realm+X-Boxel-Job-Idand setsAccess-Control-Max-Age: 86400— withoutmaxAge, @koa/cors falls back to Chrome's ~5 s default and the new headers force a fresh OPTIONS round-trip in front of nearly every QUERY.Cache gate
Three conditions all hold to consult the cache:
x-boxel-job-idpresent + well-formed (only the indexer worker stamps this; user / API callers never carry it and always see fresh data).x-boxel-consuming-realmpresent + well-formed (host's render route only sets it during prerender).realmsarray is exactly[consumingRealm]— cross-realm reads bypass because a peer realm can swap itsboxel_indexmid-batch and a cached value would freeze a stale snapshot.Cache bounds
<digits>.<digits>jobId values).clearJob(jobId)eviction (hook exists) will release entries the moment the worker completes a job; for now TTL is the bound.Bench (local, vite-dev host, cold tabs)
Fresh cold stack after
mise run kill-all+TRUNCATE job_reservations, jobs, job_progress+mise run dev-all+ drain + lazy-mount.user/prudent-octopus(~118 files)0119bac76dc543f6846fuser/ambitious-piranha(~461 files)cht-*.jsoncohort, attempt=1/12, 150 s prerender abortOctopus: −60 % wall-clock, no errors, identical output.
Piranha: both variants reject on the same single-card cohort-render timeout (
attempt=1/12of acht-*.jsonCohort card, 150 s prerender abort). Main lasts 652 s because more non-cohort cards process before the indexer reaches the cohort; Phase 2 lasts 183 s because the cache makes non-cohort renders so much faster that the indexer reaches the cohort sooner. The terminal failure is identical and orthogonal to this PR — see CS-11123 (pre-warm modules table) which targets the cohort-render slowness directly. Phase 2 + CS-11123 should stack: pre-warm gets the cohort under 150 s, Phase 2 dedupes the 86 sub-searches it fires.vite-dev rig is ~3–4× slower than the prod-built host setup that produced the ticket-quoted baselines (190 s octopus / ~585 s piranha). Absolute numbers aren't directly comparable to the ticket — the same-rig deltas across Main → Phase 2 are the load-bearing signal.
Follow-up: horizontal scaling
CS-11119 refactors this cache to DB-backed under the DB-Authoritative Realm Registry project. The current in-process Map works correctly for single-replica deployments (the launch configuration) but degrades to ~1/N hit rate under multi-replica because each replica only sees its own cache. The DB-backed version stores entries in a
search_cachetable keyed by(jobId, query-digest, opts-digest, consuming_realm)and trivially survives horizontal scaling.Tests
JobScopedSearchCacheunit tests: hit/miss by(jobId, query, opts), miss across different jobIds, sequential hit, concurrent each-runs-own-populate (last-write-wins),clearJobisolation, TTL eviction, opts variance, FIFO cap eviction (new in latest commit), realm-agnostic at the cache layer (the gate lives in handle-search).sanitizeConsumingRealmHeaderunit tests: accepts plain http(s) URLs, trims whitespace, rejects non-http(s) schemes, empty/null, control characters (log-injection defence), pathologically long values, non-string inputs.Test plan
Stacked PR — this is Phase 2 of CS-11115. Phase 1 (#4784) is merged.
🤖 Generated with Claude Code