Skip query-backed linksToMany expansion in prerender requests#4934

Merged: habdelra merged 10 commits into main from cs-11236-prerender-skip-query-field-expansion on May 25, 2026

@habdelra habdelra commented May 22, 2026

Summary

Inside a prerender request, the realm-server's loadLinks walker no longer expands query-backed linksTo / linksToMany fields into included[]. The host's SearchResource instead resolves those relationships through stable per-URL GETs, using the IDs already serialized in the parent doc's relationships.{field}.data, rather than firing a live _federated-search re-query.

Net effect on a card render whose template fans out across query-backed linksToMany fields:

| | Before | After |
| --- | --- | --- |
| _federated-search QUERY requests during the render | ~240 | ~40 (just the top-level template searches) |
| Per-card cascade | every query-backed linksToMany getter fires a _federated-search and recursively expands | every getter loads the listed IDs by URL; each GET is one DB lookup |
| Wall-clock to data-prerender-status=ready (local stack, ~900-instance realm with 5 query-backed linksToMany fields per top-level card) | ~213 s | ~125 s |
| Bulk type search TTFB (direct _federated-search with the prerender header) | 25–48 s | 0.2–3 s (9–60× per type) |

The live-SPA path is unchanged.

Why are N+1 small GETs faster than one big _federated-search?

It's counter-intuitive that replacing one server response that side-loads the linked resources with many per-URL GETs is faster. Two reasons:

1. Today's "one big response" isn't one request — it's a cascade

The current host behavior is: every query-backed linksToMany getter fires its own live _federated-search because the resource is isLive: true. So a Dashboard with M query-backed fields per card and N matched parents fans out into M×N×… cascaded searches, each of which itself returns a doc whose query-backed fields the next render layer also re-queries.

Concrete count during the local ~900-instance Dashboard render:

| | _federated-search requests during render | Wall-clock to ready |
| --- | --- | --- |
| Before | ~240 | ~213 s |
| After | ~40 (template's top-level searches only) | ~125 s |

The 200 requests we eliminate aren't "the parent's expansion" — they're cascaded re-queries from every child layer the template touches.

2. Each _federated-search was paying the full graph walk; per-URL GETs aren't

A _federated-search for a top-level type doesn't just read N rows — it executes the filter, then walks every query-backed linksToMany on every matched card, then walks their query-backed fields, serializing every reachable resource into included[]. Realm-server CPU and the SQL graph walk dominate; the actual filter is single-digit ms.

Measured directly with curl -X QUERY /_federated-search (TTFB, prerender header on, ~900-instance realm):

| Top-level type search | data rows | included rows (before) | TTFB (before) | TTFB (after) | Speedup |
| --- | --- | --- | --- | --- | --- |
| Heavy type A | 120 | 660 (full transitive walk) | 14.8 s | <0.3 s | ~50× |
| Heavy type B | 30 | 780 (whole realm reachable through one query field) | 21.7 s | <0.3 s | ~70× |
| Heavy type C | 40 | ~600 | 20.7 s | <0.3 s | ~70× |
| Heavy type D | 380 | ~580 | 16.9 s | ~3 s | ~6× |
| Heavy type E | 200 | ~580 | 16.2 s | ~2 s | ~8× |
| Static-only type F | 30 | 0 | 0.2 s | 0.2 s | |
| Static-only type G | 50 | 0 | 0.2 s | 0.2 s | |

The per-URL GETs that replace the included-expansion hit boxel_index by primary key, return the doc already-serialized at index time, and (post-CS-11176) reuse a job-scoped search cache when one of those linked instances is itself looked up by URL more than once during the render. Each GET is bounded by its own static-linksTo closure under the same prerender skip — no recursive query-field expansion through the cascade.

So the trade is: 1 expensive _federated-search (full-graph serialization) → 1 cheap _federated-search (just the top-level rows) + N cheap GETs (PK lookups, deduped by URL, multiplexed over HTTP/2). The wall-clock time is the sum of both, and collapsing the cascade is what makes the result faster overall.

The contract change

Server side: loadLinks skips query-backed expansion in prerender

packages/runtime-common/realm-index-query-engine.ts gains a new skipQueryBackedExpansion option on the loadLinks walker. When set, the walker still populates relationships.{field}.data for query-backed linksTo / linksToMany fields but does not push the linked resources onto the next layer of the BFS. Static linksTo / linksToMany continue to expand transitively.

The per-field follow point reads the umbrella relationship's links.search. applyQueryResults is the only writer of that key, so its presence on a relationship is the unambiguous "this field is query-backed" signal at follow time.
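As a rough illustration of the gate described above, here is a minimal sketch of a breadth-first link walker that honors a skipQueryBackedExpansion option. The Resource and Relationship shapes, the lookup callback, and the sample data are simplified assumptions for illustration; the real walker in realm-index-query-engine.ts is not shown in this PR description and certainly handles more cases.

```typescript
// Hypothetical, simplified JSON:API-style shapes; not the real realm-server types.
interface Relationship {
  links?: { search?: string };
  data?: { id: string } | { id: string }[] | null;
}

interface Resource {
  id: string;
  relationships?: Record<string, Relationship>;
}

interface LoadLinksOptions {
  skipQueryBackedExpansion?: boolean;
}

// Breadth-first walk that collects linked resources into `included`.
// `lookup` stands in for whatever index read the real walker performs.
function loadLinks(
  root: Resource,
  lookup: (id: string) => Resource | undefined,
  opts: LoadLinksOptions = {},
): Resource[] {
  const included: Resource[] = [];
  const visited = new Set<string>([root.id]);
  let layer: Resource[] = [root];

  while (layer.length > 0) {
    const next: Resource[] = [];
    for (const resource of layer) {
      for (const rel of Object.values(resource.relationships ?? {})) {
        // Presence of `links.search` marks the relationship as query-backed.
        const isQueryBacked = rel.links?.search !== undefined;
        if (isQueryBacked && opts.skipQueryBackedExpansion) {
          continue; // the `data` IDs stay on the doc, but are not followed
        }
        const targets: { id: string }[] =
          rel.data == null ? [] : Array.isArray(rel.data) ? rel.data : [rel.data];
        for (const { id } of targets) {
          if (visited.has(id)) continue;
          visited.add(id);
          const linked = lookup(id);
          if (linked) {
            included.push(linked);
            next.push(linked); // static links keep expanding transitively
          }
        }
      }
    }
    layer = next;
  }
  return included;
}

// Sample data (made up): one static link and one query-backed link.
const docs: Record<string, Resource> = {
  author: { id: "author" },
  post1: { id: "post1" },
};
const parent: Resource = {
  id: "parent",
  relationships: {
    author: { data: { id: "author" } },
    posts: { links: { search: "hypothetical-search-url" }, data: [{ id: "post1" }] },
  },
};
const lookupDoc = (id: string): Resource | undefined => docs[id];

const expandedIds = loadLinks(parent, lookupDoc).map((r) => r.id);
const skippedIds = loadLinks(parent, lookupDoc, { skipQueryBackedExpansion: true }).map((r) => r.id);
```

In this sketch the default call side-loads both linked docs, while the skip call side-loads only the static link and leaves the query-backed IDs in relationships.posts.data for the host to resolve.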

packages/realm-server/handlers/handle-search.ts sets the opt for /_federated-search when x-boxel-during-prerender is on the request, mirroring the same gating used today for cacheOnlyDefinitions. The four cardDocument(..., { loadLinks: true }) call sites in packages/runtime-common/realm.ts (writeMany / patch / patch-noop / GET) thread the same opt via isDuringPrerenderRequest(request).
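The header gating might look roughly like the sketch below. The helper names hasPrerenderHeader and searchOptionsFor are hypothetical, the HeaderSource interface is a stand-in for the real request type, and treating mere presence of the header as "on" is an assumption; the description only says the header is "on the request".

```typescript
// Minimal header reader so the sketch does not depend on a specific Request type.
interface HeaderSource {
  get(name: string): string | null;
}

const PRERENDER_HEADER = "x-boxel-during-prerender";

// Assumption: any value, as long as the header is present, means "prerender".
function hasPrerenderHeader(headers: HeaderSource): boolean {
  return headers.get(PRERENDER_HEADER) !== null;
}

// Derive the walker option from the request instead of from global state,
// so non-prerender requests keep the default (expanding) behavior.
function searchOptionsFor(headers: HeaderSource): { skipQueryBackedExpansion: boolean } {
  return { skipQueryBackedExpansion: hasPrerenderHeader(headers) };
}

// Tiny adapter for trying the sketch without an HTTP stack.
function headersFromMap(map: Map<string, string>): HeaderSource {
  return { get: (name) => map.get(name.toLowerCase()) ?? null };
}
```

Deriving the option per request keeps the live-SPA path untouched: a request without the header yields skipQueryBackedExpansion: false.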

The walker is gated, not the indexer. boxel_index rows are written exactly as before. Only response shape changes — and only for requests inside a prerender.

Host side: SearchResource consumes the relationship IDs via per-URL GETs

packages/base/query-field-support.ts::captureQueryFieldSeedData captures the parent's relationships.{field}.data IDs into a new seedCardURLs array on the field's state, alongside the existing seedRecords / seedSearchURL / seedRealms.

packages/host/app/resources/search.ts::SearchResource.applySeed consumes those IDs: when the seed has empty cards but non-empty cardURLs, it loads each by URL through the runtime store (store.get(url)). The realm-server's instance-GET runs the same prerender skip, so each GET returns the bare resource plus its static-linksTo closure — no transitive query-backed walk.
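A minimal sketch of that per-URL load path follows, assuming simplified Card, Seed, and Store shapes that are not the real host types. The function name resolveSeed is hypothetical. The warn-on-failure behavior mirrors a later commit in this PR, and the URL dedupe reflects the "deduped by URL" note above rather than a confirmed detail of applySeed.

```typescript
// Hypothetical shapes; the real seed and store types live in the host package.
interface Card {
  id: string;
}

interface Seed {
  cards: Card[];
  cardURLs?: string[];
}

interface Store {
  get(url: string): Promise<Card>;
}

async function resolveSeed(seed: Seed, store: Store): Promise<Card[]> {
  // If the parent already side-loaded the instances, use them directly.
  if (seed.cards.length > 0) return seed.cards;

  // Dedupe while keeping first-seen order.
  const urls = Array.from(new Set(seed.cardURLs ?? []));

  const results = await Promise.all(
    urls.map(async (url) => {
      try {
        return await store.get(url);
      } catch (err) {
        // Surface the failure instead of silently dropping the item.
        console.warn(`seed load failed for ${url}`, err);
        return undefined;
      }
    }),
  );
  return results.filter((card): card is Card => card !== undefined);
}
```

Promise.all preserves the order of the input URLs, so the resolved cards come back in relationship order even though the GETs run concurrently.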

The "seed is authoritative" predicate now reads three signals:

  • seed.cards.length > 0 (parent serialized resolved instances in included)
  • seed.cardURLs !== undefined (parent's relationship.data array was captured, including the empty-no-items case)
  • seed.searchURL set (legacy signal — only present when the relationship is fully resolved)

In prerender, any of these short-circuits the live re-query. Outside prerender (isLive: true) this branch is bypassed entirely and the resource subscribes / re-validates as before.
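The three signals above can be read as a single predicate. The sketch below is an assumption about how they combine, with hypothetical function names and a trimmed-down Seed shape; it is not copied from search.ts.

```typescript
// Trimmed-down seed shape for illustration only.
interface Seed {
  cards: unknown[];
  cardURLs?: string[];
  searchURL?: string;
}

// Any one of the three signals makes the seed authoritative.
function seedIsAuthoritative(seed: Seed): boolean {
  return (
    seed.cards.length > 0 ||      // parent serialized resolved instances
    seed.cardURLs !== undefined || // relationship data was captured, even if empty
    Boolean(seed.searchURL)        // legacy signal
  );
}

// Live resources bypass the seed short-circuit entirely.
function shouldRunLiveQuery(seed: Seed, isLive: boolean): boolean {
  return isLive || !seedIsAuthoritative(seed);
}
```

Note the empty-array case: a seed with cardURLs: [] is still authoritative in this sketch, so a query field that legitimately matched nothing does not trigger a re-query during prerender.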

Globals: one prerender flag, not two

Two globalThis flags signaled "this code is running in a prerender" at different layers and they didn't agree:

  • __boxelDuringPrerender was set by the prerender server's evaluateOnNewDocument and read by the host's fetch wrapper.
  • __boxelRenderContext was set by the host's route entry points and read by the seed-resolution logic.

The split caused the prerender header to silently go missing in any test or driven render where the route flag was set but the server flag wasn't. This PR converges on __boxelRenderContext: the prerender server's evaluateOnNewDocument sets it; host routes (render / module / file-extract / command-runner) set it; and the fetch wrapper, job-priority resolver, and seed-resolution path read it. The command-runner route's teardown, which previously cleared the flag, now matches the other routes' isTesting() guard, so the prerender server's persistent value survives between consecutive renders in a pooled tab.
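For illustration, reading the single flag could look like the sketch below. The helper names are hypothetical and the header value "true" is an assumption. The strict === true comparison follows a later commit in this PR, which requires the flag to be exactly true rather than merely truthy before the prerender header is stamped.

```typescript
// Hypothetical helpers; the real header-stamping code lives in prerender-fetch-headers.ts.
type PrerenderGlobal = { __boxelRenderContext?: unknown };

function inPrerenderContext(
  scope: PrerenderGlobal = globalThis as unknown as PrerenderGlobal,
): boolean {
  // Strict comparison: a merely truthy value does not count.
  return scope.__boxelRenderContext === true;
}

function withPrerenderHeader(
  headers: Record<string, string>,
  scope?: PrerenderGlobal,
): Record<string, string> {
  return inPrerenderContext(scope)
    ? { ...headers, "x-boxel-during-prerender": "true" } // header value is an assumption
    : headers;
}
```

Passing the scope explicitly makes the check easy to exercise in a unit test without mutating globalThis.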

How the new prerender request flow looks

  1. Indexer fires _federated-search for the card it's rendering.
  2. Server returns data + included containing only the static-linksTo closure (no recursion through query-backed fields). relationships.{field}.data still names the matched IDs for every query-backed field.
  3. Host's applySeed reads cardURLs from the captured seed and issues per-URL GETs to materialize the listed cards.
  4. Each per-URL GET also runs through the prerender skip, so it returns its own resource + static-linksTo closure without re-expanding query-backed fields.
  5. Recursion terminates at the natural boundary of what the rendering template actually reads.

The cascade depth is bounded by how far the template walks the relationship graph, not by what the server eagerly includes.
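To make the "bounded by template reach" point concrete, here is a toy model of the flow above. The in-memory index, the card IDs, and the countGets function are all made up for illustration; it only counts per-URL GETs under the assumption that the store dedupes by URL.

```typescript
// Toy index: each card lists the IDs named by its query-backed relationship.
const index: Record<string, string[]> = {
  dashboard: ["customer/1", "customer/2"],
  "customer/1": ["policy/1", "policy/2"],
  "customer/2": ["policy/2"],
  "policy/1": ["claim/1"],
  "policy/2": [],
  "claim/1": [],
};

// Count the per-URL GETs a render issues when its template reads `depth`
// levels of relationships below the root card. The root itself arrives via
// the top-level search, so it is not counted as a GET.
function countGets(root: string, depth: number): number {
  const fetched = new Set<string>(); // dedupe by URL
  let frontier: string[] = [root];
  for (let level = 0; level < depth; level++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const child of index[id] ?? []) {
        if (!fetched.has(child)) {
          fetched.add(child);
          next.push(child);
        }
      }
    }
    frontier = next;
  }
  return fetched.size;
}
```

In this toy graph a template that reads one level issues 2 GETs, two levels issue 4 (policy/2 is fetched once despite two parents), and three levels issue 5. The server never decides the depth; the template does.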

Files changed

  • packages/runtime-common/realm-index-query-engine.ts: skipQueryBackedExpansion opt + per-field follow gate in loadLinks.
  • packages/runtime-common/realm.ts: thread the opt through Realm.search and all four cardDocument(..., { loadLinks: true }) call sites.
  • packages/runtime-common/search-utils.ts: extend SearchableRealm / searchRealms opts.
  • packages/realm-server/handlers/handle-search.ts: set the opt for /_federated-search based on the prerender header.
  • packages/realm-server/prerender/page-pool.ts / prerender-constants.ts: set __boxelRenderContext (was __boxelDuringPrerender).
  • packages/base/card-api.gts: extend the seed type with cardURLs.
  • packages/base/query-field-support.ts: capture relationships.{field}.data IDs into seedCardURLs.
  • packages/host/app/resources/search.ts: new per-URL load path in applySeed; widen the seed-authoritative predicate.
  • packages/host/app/lib/prerender-fetch-headers.ts: read __boxelRenderContext (was __boxelDuringPrerender).
  • packages/host/app/services/store.ts: extend the seed shape with cardURLs; same global rename.
  • packages/host/app/routes/command-runner.ts: isTesting() guard on the route's __boxelRenderContext teardown so the persistent value survives between renders in a prerender tab (matches existing pattern in render.ts / module.ts / file-extract.ts).
  • packages/host/tests/unit/job-priority-header-test.ts: global rename in test descriptions / comments.
  • packages/realm-server/tests/skip-query-backed-expansion-test.ts: NEW unit test exercising both cardDocument and searchCards paths with and without the skip opt.

Test plan

  • pnpm lint clean in packages/runtime-common and packages/base.
  • CI lint on all packages (delegating realm-server / host to CI).
  • New skip-query-backed-expansion-test.ts covers default-expand vs. skip on both code paths.
  • Existing prerender, query-field, and search tests still pass on CI.
  • Manual: indexer-driven prerender of a card with query-backed linksToMany fields on a realm-server build of this branch.

Out of scope

  • Live-SPA behavior is unchanged. Outside a prerender, query-backed getters keep their live subscription semantics.
  • The host's per-URL load path uses the existing store.get(url) instance-GET. No new endpoint is introduced.
  • Cross-realm relationship data flows through the same path; cross-realm cards still resolve via federated mechanisms unchanged.

Risks

  • The "is this field query-backed?" predicate keys off the umbrella relationship's links.search. If a future code path writes that key from somewhere other than applyQueryResults, the gate would mis-classify; the test in this PR pins the current writer.
  • The host's per-URL fetch fan-out can recurse two levels for a Dashboard-shaped render (parent's query-backed linksToMany → each child's query-backed linksToMany). Termination is bounded by the template's actual reach into the relationship graph; tested locally.
  • Instance-GET responses are also consumed by the live SPA. The skip only triggers when the request is identified as prerender via isDuringPrerenderRequest; non-prerender GETs are unchanged.

🤖 Generated with Claude Code

Inside a prerender request, the realm-server's loadLinks walker now
populates relationships.{field}.data for query-backed linksTo /
linksToMany fields but stops short of pushing the linked resources
into included[]. Static linksTo / linksToMany still expand
transitively. The signal is the umbrella relationship's links.search,
written exclusively by applyQueryResults, so per-field detection at
the loadLinks follow point is unambiguous.

On the host, the SearchResource consumes the parent doc's
relationships.{field}.data IDs as the seed's cardURLs. In prerender
context the resource short-circuits the live re-query and applySeed
loads each ID via runtimeStore.get(url). Per-URL GETs are stable
(deterministic by URL) and the realm-server's instance-GET runs the
same query-field skip in prerender mode, so each GET stays cheap.

The two prerender signals on globalThis are merged into one
(__boxelRenderContext), set by both the prerender server's
evaluateOnNewDocument and the host's prerender-shaped routes
(render.ts / module.ts / file-extract.ts / command-runner.ts). The
host's fetch wrapper and job-priority resolver now read this single
flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc7212ed6e

Comment thread packages/host/app/resources/search.ts
Comment thread packages/host/app/resources/search.ts

Copilot AI left a comment

Pull request overview

This PR optimizes prerender-time relationship resolution by preventing the realm-server from eagerly expanding query-backed linksTo / linksToMany into included[], and instead having the host prerender path resolve those relationships via stable per-URL GETs based on the parent document’s relationships.{field}.data IDs. It also consolidates prerender detection onto a single globalThis.__boxelRenderContext flag.

Changes:

  • Add skipQueryBackedExpansion plumbing from request detection → realm search / cardDocument → loadLinks walker to suppress query-backed transitive expansion during prerender.
  • Extend query-field seed capture and host SearchResource.applySeed to use captured relationship IDs (cardURLs) and fetch per-URL rather than re-querying _federated-search.
  • Rename prerender global from __boxelDuringPrerender to __boxelRenderContext and update routes/tests/headers to match.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| packages/runtime-common/search-utils.ts | Extends search options to carry skipQueryBackedExpansion through searchRealms. |
| packages/runtime-common/realm.ts | Threads prerender-only skipQueryBackedExpansion into search and instance document generation. |
| packages/runtime-common/realm-index-query-engine.ts | Adds skipQueryBackedExpansion option and gates query-backed link traversal in loadLinks. |
| packages/realm-server/handlers/handle-search.ts | Enables skipQueryBackedExpansion for prerender-marked _federated-search requests. |
| packages/realm-server/prerender/prerender-constants.ts | Updates prerender global documentation to __boxelRenderContext. |
| packages/realm-server/prerender/page-pool.ts | Injects globalThis.__boxelRenderContext = true into pages. |
| packages/realm-server/tests/skip-query-backed-expansion-test.ts | Adds coverage for default vs. skip behavior for cardDocument and searchCards. |
| packages/base/card-api.gts | Extends search seed type to include cardURLs. |
| packages/base/query-field-support.ts | Captures relationship IDs into seedCardURLs for prerender seed resolution. |
| packages/host/app/resources/search.ts | Uses cardURLs to fetch per-URL instances in prerender seed mode; expands “seed is authoritative” predicate. |
| packages/host/app/lib/prerender-fetch-headers.ts | Switches prerender header gating to read __boxelRenderContext. |
| packages/host/app/services/store.ts | Renames prerender global usage and extends seed shape with cardURLs. |
| packages/host/app/routes/command-runner.ts | Aligns __boxelRenderContext teardown with other prerender routes (testing-only clear). |
| packages/host/tests/unit/job-priority-header-test.ts | Updates test descriptions/comments to the new global name. |

Comment thread packages/base/query-field-support.ts
Comment thread packages/host/app/lib/prerender-fetch-headers.ts
Comment thread packages/host/app/resources/search.ts

github-actions Bot commented May 22, 2026

Preview deployments

Host Test Results

1 files, 1 suites, 3h 38m 28s ⏱️

| | Total | ✅ | 💤 | ❌ | 🔥 |
| --- | --- | --- | --- | --- | --- |
| Tests | 2 724 | 2 706 | 15 | 0 | 3 |
| Runs | 5 486 | 5 450 | 30 | 3 | 3 |

Results for commit f9a4060.

For more details on these errors, see this check.

Realm Server Test Results

1 files ±0, 1 suites ±0, 7m 48s ⏱️ (- 1m 12s)

| | Total | ✅ | 💤 | ❌ |
| --- | --- | --- | --- | --- |
| Tests | 1 482 ±0 | 1 482 ±0 | 0 ±0 | 0 ±0 |
| Runs | 1 573 ±0 | 1 573 ±0 | 0 ±0 | 0 ±0 |

Results for commit f9a4060. ± Comparison against earlier commit 7393d33.

habdelra and others added 5 commits May 22, 2026 03:11
…op early returns

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nder-flag check; log seed-URL load failures

- query-field-support: when shouldTreatEmptySeedAsUnresolved is true
  (seed comes from a search result with unhydrated nested query
  fields, or relationship.data is absent), leave seedCardURLs
  undefined so SearchResource falls back to a live query instead of
  treating an empty seed as authoritative.
- prerender-fetch-headers: require __boxelRenderContext === true (not
  just truthy) before stamping the prerender header, matching the
  surrounding prerender-gated logic.
- search.ts applySeed: console.warn on runtimeStore.get failures
  during the per-URL hydration path so missing relationship items are
  diagnosable in prerender logs instead of silently dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… search result

Previous fix used the broader shouldTreatEmptySeedAsUnresolved gate
which also triggers on relationshipHasUnhydratedTargets — but that
case (relationship.data names IDs, no records hydrated) is the
expected prerender-server-skip shape, not an unresolved seed.
Clearing seedCardURLs there gutted the per-URL GET path the whole
contract relies on.

Tighten the gate to seedComesFromSearch && seedRecords.length === 0
— that catches the only truly-untrustworthy case (nested query
fields on a search-result document) while leaving the prerender's
indexer-populated relationship IDs authoritative.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
With the prerender server now skipping query-backed expansion in
`included[]` and the host's SearchResource resolving the listed IDs
via per-URL GETs, the live fallback `_federated-search` no longer
fires for query-backed `linksToMany` fields during a prerender.
Flip the first assertion in the directory-ops prerender test to
expect `delayedSearchPatch.getRequestCount() === 0`, matching the
new contract. The HTML and searchDoc assertions stay — those verify
that per-URL GETs still hydrate every level of the relationship
graph correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n fires

The previous skip prevented the per-item resources from being added
to `included[]`, but left the `fieldName.N` entries on
`resource.relationships`. The host deserializer treats every
per-item entry as a follow-able relationship and expects its target
in `included[]`, so the orphan-link mismatch produced silent
"error getting instance" failures (Error.toJSON drops fields so the
log just printed `{}`).

Before traversal, when the response is in skip mode, walk each
query-backed umbrella (relationship with `links.search`) and remove
its `fieldName.N` sub-keys. The umbrella entry itself stays — it
carries `links.search` plus the `data: [array of IDs]` the host's
per-URL hydration path consumes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
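
The sub-key stripping described in the commit message above can be sketched as follows. The function name and the Relationship shape are hypothetical, and this version returns a filtered copy, whereas the commit describes removing the keys from resource.relationships; treat it as an illustration of the key-matching rule only.

```typescript
// Simplified relationship shape for illustration only.
interface Relationship {
  links?: { search?: string };
  data?: unknown;
}

// Drop `fieldName.N` entries for every query-backed umbrella, keeping the
// umbrella itself (it carries `links.search` and the `data` ID array).
function stripQueryBackedItems(
  relationships: Record<string, Relationship>,
): Record<string, Relationship> {
  // Umbrella names are the relationships that carry `links.search`.
  const queryBacked = Object.entries(relationships)
    .filter(([, rel]) => rel.links?.search !== undefined)
    .map(([name]) => name);

  const result: Record<string, Relationship> = {};
  for (const [key, rel] of Object.entries(relationships)) {
    const isItemOfQueryField = queryBacked.some((name) => {
      if (!key.startsWith(`${name}.`)) return false;
      // Only numeric suffixes count as per-item entries.
      return /^\d+$/.test(key.slice(name.length + 1));
    });
    if (!isItemOfQueryField) result[key] = rel;
  }
  return result;
}
```

Entries that belong to static fields are left alone in this sketch, so the host deserializer still finds their targets in `included[]`.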
@habdelra habdelra marked this pull request as draft May 22, 2026 08:39
habdelra and others added 2 commits May 22, 2026 09:05
`captureQueryFieldSeedData` now treats `relationship.links.search` as
the unambiguous signal that the indexer (not the user's raw source)
populated the query-backed `linksToMany` umbrella. A raw source's
`data: []` no longer arrives as an authoritative empty seed, so the
SearchResource falls through to a live `_federated-search` instead of
silently rendering an empty field — restoring the host-side behavior
the prerender contract assumed.

Also flip the `card prerender resolves query fallback via per-URL GETs`
test back to `delayedSearchPatch.getRequestCount() > 0`: source-mode
instance loads keep each query-backed field firing one fallback search,
and the PR's win lands on the *response* side (per-search payload
drops ~10-60× via `skipQueryBackedExpansion`), not on the number of
searches. The test still verifies the HTML / searchDoc reach Bob /
Alice / Eve, which guards both the search firing and the per-URL hydration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous gate cleared `seedCardURLs` whenever the seed came from a
`_federated-search` response with empty `seedRecords` — the seed-from-
search marker stayed but the resolved IDs were thrown away. That fired
for every query-backed `linksToMany` hydration off a search-result
resource (the IDs are in `relationship.data`, the resources are NOT in
`included[]` under prerender skip), so each parent's query field
re-fanned out into its own live `_federated-search` instead of using
the per-URL GET path. Measured on the bxl-dependency-order-test
dashboard render: a Customer search returning 120 customers caused
120 follow-up Customer.policies searches → ~296 total `_federated-search`
requests, ~600s wall-clock.

`relationship.links.search` is the unambiguous "indexer wrote this"
signal (`applyQueryResults` is the only writer), and the IDs in
`relationship.data` next to it are the indexer's pre-resolved
pointers — authoritative regardless of whether `included[]` also
carries the resources. The prior secondary clause was a stand-in from
before `links.search` was the gate; with the gate in place it's
redundant and actively wrong.

Same dashboard render after this change: ~10 top-level queries + ~200
per-URL Policy GETs (the cardURLs cascade collapse) + ~230 nested
Policy.claims searches = ~117s wall-clock, matching the PR's claimed
target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
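
The gate that this commit series settles on can be sketched roughly as below: `links.search` decides whether the IDs in `relationship.data` are trusted. The function name captureSeedCardURLs and the Relationship shape are hypothetical, and returning undefined when `data` is absent is an assumption based on the earlier commit message rather than a confirmed detail of captureQueryFieldSeedData.

```typescript
// Simplified relationship shape for illustration only.
interface Relationship {
  links?: { search?: string };
  data?: { id: string }[];
}

// Returns the IDs to use as seed card URLs, or undefined when the seed should
// not be treated as authoritative and a live query should run instead.
function captureSeedCardURLs(relationship: Relationship | undefined): string[] | undefined {
  // No `links.search` means the indexer did not write this relationship
  // (for example, a raw source doc with `data: []`).
  if (!relationship || relationship.links?.search === undefined) return undefined;
  // Assumption: without an ID array there is nothing to trust.
  if (!Array.isArray(relationship.data)) return undefined;
  // Indexer-written IDs are authoritative even when `included[]` is empty.
  return relationship.data.map((item) => item.id);
}
```

The distinction between undefined and an empty array matters downstream: undefined falls back to a live `_federated-search`, while an empty array renders an empty field without re-querying.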
@habdelra habdelra marked this pull request as ready for review May 22, 2026 13:58
habdelra and others added 2 commits May 22, 2026 10:35
The file-extract route's runtime-dep tracker was missing
`${baseRealm.url}file-api` whenever matrix-service's
`importResource(() => '…/file-api')` did not happen to fire inside the
route's active tracker context. On this branch the timing shift makes
the miss deterministic, so `prerendering-test.ts > file prerender
returns extracted metadata` fails reliably.

Explicitly importing the URL inside the file-extract route's
`withRuntimeDependencyTrackingContext` makes the dep deterministic
without touching the card prerender path. Adds a console.warn that
fires if the URL still slips out of the snapshot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ace)

The previous fix pinned `${baseRealm.url}file-api` as a runtime dep by
calling `loader.import(fileApiURL)` inside the file-extract route's
`withRuntimeDependencyTrackingContext` window. That call's synchronous
`trackRuntimeModuleDependency` sits behind three layers of loader
indirection — the moduleShims set, the `resolveImport` URL rewrite,
and the `advanceToState` state machine — and a fourth indirection if
the loader instance was replaced between page boot and this call (e.g.
after `BrowserManager.restartBrowser()` from the preceding test).
Empirically the test still flaked with that fix in place.

Stamp the tracker directly with `trackRuntimeModuleDependency(...)`.
That's one synchronous call with no moving parts and no dependency on
the current loader instance's bookkeeping. Also include `fileApiURL`
in the merged deps unconditionally as a belt-and-suspenders: even if
the tracker session is ever closed prematurely between this call and
the snapshot, the indexer's invalidation contract still gets the URL
it needs to invalidate file extracts on `file-api` changes.

The extractor doesn't need `file-api` to be physically loaded — it
imports `card-api` for the FileDef class — so dropping the
`loader.import(...)` call is safe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
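
The "belt-and-suspenders" merge from the commit message above amounts to a set union with one required member. A minimal sketch, with a hypothetical helper name and a placeholder URL:

```typescript
// Hypothetical helper: merge tracked runtime deps with one dep that must
// always be present in the result.
function mergeRuntimeDeps(tracked: Iterable<string>, required: string): string[] {
  const merged = new Set<string>(tracked); // Set drops duplicates
  merged.add(required); // added unconditionally, even if tracking missed it
  return Array.from(merged);
}

// Placeholder standing in for `${baseRealm.url}file-api`.
const fileApiURL = "https://example.test/base/file-api";
```

Because the required URL is added regardless of what the tracker recorded, the result is the same whether or not the tracker session captured it.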
@habdelra habdelra requested a review from a team May 22, 2026 15:17
@habdelra habdelra merged commit e9ac607 into main May 25, 2026
96 of 101 checks passed