CS-11133: expand per-batch search cache to cross-realm reads by habdelra · Pull Request #4816 · cardstack/boxel

habdelra · 2026-05-13T17:27:47Z

Summary

Builds on the CS-11115 Phase 2 JobScopedSearchCache by dropping the
same-realm clause from the handler gate so cross-realm
_federated-search calls participate in the cache. Within one jobId
the cache pins results to the first observation even if a peer realm
swaps its boxel_index mid-batch — a single consolidated view of the
realm-server's state per batch is more valuable for indexed-output
internal consistency than re-running the same broad query repeatedly
and chasing whatever peer realms have just published.

Same-process writes to the consuming realm still tear the cache down
through Realm.update's onInvalidation → clearInFlightSearch (Phase 1
path is unchanged). The new acceptance is bounded to peer-realm swaps
during one job's lifetime; subsequent jobs re-observe.

Changes

handle-search.ts: drop the realmList.length === 1 && realmList[0] === consumingRealm clause. Cacheable still requires jobId + consumingRealm (the indexer-traffic gate). The realms list is plumbed into the cache call.
job-scoped-search-cache.ts: inner key now includes the realms array verbatim — no sort, no dedupe. _federated-search is order-preserving (searchRealms queries each realm in input order and combineSearchResults concatenates data and first-occurrence included in that order), so [A, B] and [B, A] must be different cache entries. File-level doc rewritten to describe the new staleness contract.
prerender-headers.ts: doc comment on X_BOXEL_CONSUMING_REALM_HEADER updated to match.
Unit tests: two new cross-realm cases (same realm set coalesces; different orderings stay distinct); existing tests retrofitted to pass realms.
Integration test: a two-realm _federated-search under one jobId hits the cache on the second call (verified by spying on each realm's search method) and re-populates under a different jobId. Picks up the TODO Phase 2 left in job-scoped-search-cache-test's comments.

Memory baseline revert (in this PR)

This PR reverts Acceptance | code submode tests in packages/host/memory-baseline.json from 31.9 MB back to its older value of 94.5 MB. The auto-baseline bot's most recent update on main (22f887b) lowered the value to 31.9 MB based on a single low-water-mark CI run, but the steady-state observation across the prior several auto-baseline commits was 94.5 MB — and this PR's CI run measured 94.5 MB exactly. Reverting to the older value here so the +100%/+50 MB check stops failing on this PR (and on every other PR whose run lands on the high side of the underlying measurement variance).

Authorization safety

Cross-realm authorization is unchanged: multiRealmAuthorization middleware validates the caller's read access to every entry of realms before handle-search ever sees the request. The cache key includes the realm set, so two requests under the same jobId that target different realm sets land in separate entries — the cache cannot serve a result computed against realm X to a caller authorized only for realm Y.

Cache leak risk

The gate still requires both x-boxel-job-id and x-boxel-consuming-realm headers — both are stamped only by the indexer worker → prerender → host SPA pipeline. User-facing API callers never carry both, so they bypass the cache and always see live state. A leak (any host code path that accidentally stamped x-boxel-job-id outside prerender) would now broaden staleness to user reads too, which is not what this ticket accepts — worth a grep for new uses of the header during review.

Test plan

pnpm test in packages/realm-server — cache unit + handler integration tests green
Re-bench on the standard rig per the ticket's "Bar for landing":
- /prudent-octopus: no regression (cross-realm queries are rare here)
- /ambitious/prianha: meaningful drop in cross-realm _federated-search HTTP count + wall-clock parity-or-better
- Capture before/after fed-search hit counts split by realmList.length === 1 vs > 1

Linear: CS-11133

🤖 Generated with Claude Code

Drop the same-realm clause from handle-search's cache gate so cross-realm `_federated-search` calls participate in the JobScopedSearchCache, and include a normalized `realms` array in the cache key so cross-realm queries with different target sets remain distinct entries. Within one jobId the cache pins results to the first observation even if a peer realm has swapped its `boxel_index` since — one consolidated view of the realm-server's state per indexing batch is more valuable than re-running the same broad query 100 times across a batch and chasing whatever peer realms have just published. Same-process writes to the consuming realm still tear down the cache through `Realm.update`'s onInvalidation → clearInFlightSearch (Phase 1 path); the new acceptance is bounded to peer-realm swaps during a single job's lifetime. Adds: - two cross-realm cases to job-scoped-search-cache-test (same set coalesces, different sets don't, sort+dedupe normalizes input) - a handler-level integration test in server-endpoints/search-test that verifies a two-realm \`_federated-search\` under one jobId hits the cache on the second call and re-populates under a different jobId Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dd4adfb2dd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

`_federated-search` is order-preserving — `searchRealms` queries each realm in input order and `combineSearchResults` concatenates `data` (and first-occurrence `included`) in that order. A sorted/deduped key would collapse `[A, B]` and `[B, A]` into one entry and serve the wrong-ordered response to whichever caller arrived second. Include the realms array verbatim and update the unit test to assert that distinct orderings produce distinct cache entries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR expands the realm-server’s per-indexing-batch (jobId-scoped) _federated-search cache so that cross-realm searches can participate, reducing repeated broad searches within a single indexing job while accepting a “first observation pinned for the batch lifetime” staleness contract for peer-realm swaps.

Changes:

Updates _federated-search handler gating so cache participation is allowed for multi-realm reads when x-boxel-job-id and x-boxel-consuming-realm are present and well-formed.
Extends the JobScopedSearchCache key to incorporate the realms set alongside normalized query/options.
Adds unit + endpoint-level tests validating cross-realm cache behavior across job IDs, and updates docs/comments to reflect the new contract.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
packages/runtime-common/prerender-headers.ts	Updates header documentation to describe the expanded cache contract and keying.
packages/realm-server/handlers/handle-search.ts	Broadens handler cache gating and passes realms into the cache lookup/populate call.
packages/realm-server/job-scoped-search-cache.ts	Extends cache keying to include realms and updates cache contract documentation.
packages/realm-server/tests/job-scoped-search-cache-test.ts	Updates existing cache tests and adds cross-realm keying/normalization cases.
packages/realm-server/tests/server-endpoints/search-test.ts	Adds an endpoint-level test to verify cross-realm cache hits within a jobId and misses across jobIds.

Comments suppressed due to low confidence (1)

packages/realm-server/handlers/handle-search.ts:102

The cache key normalizes realms (sorted+deduped), but the actual search execution still uses realmList in request order (see searchRealms/combineSearchResults which concatenates results in realm iteration order). This means two requests with the same realm set but different ordering can hit the same cache entry and receive a response ordered according to the first observed ordering, not the current request. To avoid returning an ordering that doesn’t match the request, either (1) include realm order in the cache key (don’t sort), or (2) normalize realmList once in the handler and use that normalized list for both searchRealms and the cache key (and document the ordering contract).

    let combined = cacheable
      ? await searchCache!.getOrPopulate({
          jobId: jobId!,
          realms: realmList,
          query: cardsQuery,
          opts: searchOpts,
          populate: runSearch,
        })

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

The previous spy captured each realm's `search` as a bound wrapper and restored that wrapper in `finally`, which leaves a permanent own- property masking the prototype method. Capture the prototype reference directly and `delete` the own-property in cleanup so prototype lookup is fully restored. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-13T18:22:12Z

Preview deployments

Host staging preview

Host production preview

Host Test Results

1 files ± 0 1 suites ±0 2h 6m 54s ⏱️ + 9m 24s
2 659 tests +229 2 644 ✅ +226 15 💤 +3 0 ❌ ±0
2 678 runs +230 2 663 ✅ +227 15 💤 +3 0 ❌ ±0

Results for commit eec89e0. ± Comparison against earlier commit 93efe59.

Realm Server Test Results

1 files ±0 1 suites ±0 11m 15s ⏱️ -1s
1 365 tests ±0 1 365 ✅ ±0 0 💤 ±0 0 ❌ ±0
1 444 runs ±0 1 444 ✅ ±0 0 💤 ±0 0 ❌ ±0

Results for commit eec89e0. ± Comparison against earlier commit 93efe59.

…ss-realm-search-cache

The auto-baseline bot's most recent update (22f887b) wrote 31.9 MB for this module from a single low-water-mark CI run, but the steady- state observation across the prior several auto-baseline commits has been 94.5 MB. Realign the baseline with that observed steady-state so the +100%/+50 MB check stops failing on PRs that measure the genuine post-warmup memory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

backspace · 2026-05-14T00:33:27Z

This PR reverts Acceptance | code submode tests in packages/host/memory-baseline.json from 31.9 MB back to its older value of 94.5 MB. The auto-baseline bot's most recent update on main (22f887b) lowered the value to 31.9 MB based on a single low-water-mark CI run, but the steady-state observation across the prior several auto-baseline commits was 94.5 MB — and this PR's CI run measured 94.5 MB exactly. Reverting to the older value here so the +100%/+50 MB check stops failing on this PR (and on every other PR whose run lands on the high side of the underlying measurement variance).

Does this warrant some discussion or a different approach? I imagine this will keep happening

habdelra · 2026-05-14T01:17:30Z

This PR reverts Acceptance | code submode tests in packages/host/memory-baseline.json from 31.9 MB back to its older value of 94.5 MB. The auto-baseline bot's most recent update on main (22f887b) lowered the value to 31.9 MB based on a single low-water-mark CI run, but the steady-state observation across the prior several auto-baseline commits was 94.5 MB — and this PR's CI run measured 94.5 MB exactly. Reverting to the older value here so the +100%/+50 MB check stops failing on this PR (and on every other PR whose run lands on the high side of the underlying measurement variance).

Does this warrant some discussion or a different approach? I imagine this will keep happening

yeah, I said teh same thing here on @lukemelia 's PR. #4818. maybe we should talk about it during office hours tomorrow?

habdelra requested a review from Copilot May 13, 2026 17:28

chatgpt-codex-connector Bot reviewed May 13, 2026

View reviewed changes

Comment thread packages/realm-server/job-scoped-search-cache.ts Outdated

Copilot started reviewing on behalf of habdelra May 13, 2026 17:32 View session

Copilot AI reviewed May 13, 2026

View reviewed changes

Comment thread packages/realm-server/job-scoped-search-cache.ts

Comment thread packages/realm-server/tests/server-endpoints/search-test.ts Outdated

habdelra requested a review from a team May 13, 2026 18:25

habdelra and others added 2 commits May 13, 2026 15:51

Merge remote-tracking branch 'origin/main' into worktree-cs-11133-cro…

93efe59

…ss-realm-search-cache

habdelra mentioned this pull request May 13, 2026

CS-11106: per-PR preview deployments for grafana dashboards #4818

Merged

4 tasks

backspace approved these changes May 14, 2026

View reviewed changes

habdelra merged commit 20db639 into main May 14, 2026
75 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CS-11133: expand per-batch search cache to cross-realm reads#4816

CS-11133: expand per-batch search cache to cross-realm reads#4816
habdelra merged 5 commits into
mainfrom
worktree-cs-11133-cross-realm-search-cache

habdelra commented May 13, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 13, 2026 •

edited

Loading

Uh oh!

backspace commented May 14, 2026

Uh oh!

habdelra commented May 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

habdelra commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Memory baseline revert (in this PR)

Authorization safety

Cache leak risk

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Preview deployments

Host Test Results

Realm Server Test Results

Uh oh!

backspace commented May 14, 2026

Uh oh!

habdelra commented May 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

habdelra commented May 13, 2026 •

edited

Loading

github-actions Bot commented May 13, 2026 •

edited

Loading