Indexer: pre-warm every realm .gts/.gjs module, not just per-row deps by habdelra · Pull Request #4864 · cardstack/boxel

habdelra · 2026-05-18T18:56:52Z

Stacked on top of #4863. Merge that first (or retarget this PR's base to main after #4863 merges).

Why

The indexer's pre-warm phase seeds the modules cache before the visit loop runs, so card renders don't have to fire a sub-prerenderModule mid-render for a missing definition. The existing pre-warm walks each invalidation's boxel_index.deps array and pre-warms every executable URL it finds — i.e. import-graph dependencies. That misses modules referenced by string in card templates.

Pattern that triggered every staging failure on ctse/ambitious-piranha:

// dashboard.gts — imports only `card-api`, references cohort.gts by STRING
get cohortRef() { return { module: this.moduleUrl('cohort'), name: 'Cohort' }; }
// then in template:
<Search @query={{filter: {type: this.cohortRef}}} @realms={{this.realmHrefs}} />

cohort.gts is a parameter to a search filter; it's never an import. So:

cohort.gts is not in dashboard.gts's runtime module graph.
dashboard.json's deps row in boxel_index does not list cohort.gts.
Pre-warm walks the dashboard's deps and skips cohort.gts.
When the dashboard renders, <Search> fires _federated-search with filter.type.module = '.../cohort.gts' → cache miss → spawns a same-affinity prerenderModule for cohort.gts mid-card-render → self-referential prerender deadlock.

Verified empirically against the staging index — dashboard.json on ctse/ambitious-piranha has exactly two Tracking-realm entries in deps (.../Tracking/dashboard plus a .glimmer-scoped.css artifact), and none of the sibling card modules every dashboard render needs.

What this PR changes

Adds a realm-wide pre-warm layer that runs in addition to the existing per-row deps walk.

Source of the realm-wide list — filesystem mtimes filtered by extension.

From-scratch path: discoverInvalidations already calls reader.mtimes() and returns the result. Filter to .gts/.gjs and pass into preWarmModulesTable as a new allRealmCardModules argument. Zero added cost.
Incremental path: reader.mtimes() hadn't been called yet — add one call before pre-warm. ~< 200 ms on realistic realm sizes, one call per job, dominated by the indexing work that follows.

Why .gts / .gjs only — optimization, not correctness gate. .ts / .js files CAN host CardDef (e.g. command-input cards), but they almost never do in practice. If pre-warm misses such a module, the on-demand lookupDefinition cache read-through during the visit fires a prerenderModule for it — safe because the PagePool now materializes a tab for non-file callers instead of queueing them behind the render that triggered the lookup (the change in the parent branch). Restricting the sweep to the extensions where cards live almost exclusively avoids paying the prerender cost on every reindex for files that don't define cards (typical realms have many helper .ts files alongside their card .gts).

No-card markers in the cache. A pre-warmed module that turns out not to define any cards (rare among .gts; e.g. a .gts Glimmer template helper with no CardDef export) gets persisted to the modules table with definitions: {}. CachingDefinitionLookup.getCachedDefinitions treats that as a valid cache hit on subsequent lookups — so the wasted prerender is a one-shot, not a per-call cost.

Adds hasCardExtension / cardExtensions to runtime-common/index.ts, parallel in shape to the existing hasExecutableExtension / executableExtensions.

Stays serial. Pre-warm loop body unchanged from the existing serial-await shape; the only behavioral change is which modules end up in toWarm. Bounded-parallel pre-warm is a separate follow-up after we've measured serial.

Bench results — local cold full-reindex

Run against a freshly-restarted realm-server with clearLastModified: true so the from-scratch path walks the full filesystem mtime sweep.

realm	wall time	errored rows	deadlock-fingerprint rows
`user/ambitious-piranha`	665 s (~11 min)	0	0
`user/prudent-octopus`	145 s (~2.4 min)	0	0

Baseline before this branch on the same workload:

ambitious-piranha was rejected by the manager-side 150 s /prerender-visit abort, or completed at ~36 min with multiple errored rows whenever it didn't.
Errored rows carried the documented self-referential prerender deadlock fingerprint: renderStage = waiting-stability, queryLoadsInFlight with source: search-resource:*, and affinitySnapshot.sameAffinityActivity entries of {queue: 'module', state: 'queued'} on the same affinity.

Spot-checked timing_diagnostics on the heaviest rows post-fix (dashboard cards on both realms): no {state: queued} entries in sameAffinityActivity, confirming the deadlock path no longer fires.

Files

packages/runtime-common/index.ts — new hasCardExtension / cardExtensions exports.
packages/runtime-common/index-runner.ts — preWarmModulesTable takes an allRealmCardModules arg; the from-scratch and incremental call sites compute the realm-wide list and pass it in.

Tests

The cache-contract this PR relies on — definitions: {} persists as a no-card marker, subsequent lookups short-circuit at the cache — is pinned by 'getCachedDefinitions caches non-card modules as no-card markers' in the parent branch's definition-lookup-test.ts.

A focused integration test for the realm-wide sweep (intercept getCachedDefinitions, assert it's invoked for every .gts/.gjs in the realm including sibling-referenced ones) is worth adding as a follow-up — the bench-time validation is the immediate gate; CI catches regressions on the existing deps-driven layer.

Test plan

CI passes
Local cold full-reindex of user/prudent-octopus lands with zero errored rows
Local cold full-reindex of user/ambitious-piranha lands with zero errored rows
Spot-check timing_diagnostics on the heaviest rows post-fix — no {state: queued} entries in sameAffinityActivity

🤖 Generated with Claude Code

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4fbac4ddd3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copilot

Pull request overview

This PR expands the indexer’s module-cache pre-warm phase so it seeds the modules table for all card-template modules in a realm (via a realm-wide .gts/.gjs sweep), not only modules discovered via per-row deps. This reduces on-demand prerenderModule sub-renders during indexing, avoiding same-affinity deadlock conditions when _federated-search needs card definitions referenced by string in templates.

Changes:

Add cardExtensions and hasCardExtension() helper exports alongside the existing executable-extension helpers.
Update IndexRunner to compute allRealmCardModules from reader.mtimes() (reusing discoverInvalidations mtimes for from-scratch; adding a new mtimes call for incremental).
Extend preWarmModulesTable to seed its warming set from the realm-wide card-module list, then layer existing per-row deps warming on top.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
packages/runtime-common/index.ts	Introduces `.gts`/`.gjs` “card extension” constants and a `hasCardExtension()` helper for realm-wide filtering.
packages/runtime-common/index-runner.ts	Threads a realm-wide card-module list into `preWarmModulesTable` and populates it from filesystem mtimes for both from-scratch and incremental jobs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

github-actions · 2026-05-18T19:21:14Z

Host Test Results

1 files ± 0 1 suites ±0 3h 22m 26s ⏱️ + 1h 35m 42s
2 661 tests ± 0 2 646 ✅ ± 0 15 💤 ± 0 0 ❌ ±0
5 283 runs +2 603 5 254 ✅ +2 589 29 💤 +14 0 ❌ ±0

Results for commit a89c949. ± Comparison against earlier commit 8bff668.

Realm Server Test Results

1 files ±0 1 suites ±0 8m 44s ⏱️ +5s
1 405 tests ±0 1 405 ✅ ±0 0 💤 ±0 0 ❌ ±0
1 489 runs ±0 1 489 ✅ ±0 0 💤 ±0 0 ❌ ±0

Results for commit a89c949. ± Comparison against earlier commit 8bff668.

@query

`preWarmModulesTable` previously walked each invalidation's `boxel_index.deps` and added any executable URLs it found. That misses modules referenced by *string* in card templates — a typical pattern in dashboard cards: <Search @query={{filter: {type: {module: '.../cohort.gts', name: 'Cohort'}}}}> `cohort.gts` is referenced as a string parameter to the search filter, never imported into the dashboard's runtime module graph. So it never appears in the dashboard row's `deps`, never gets pre-warmed, and when the dashboard renders during indexing the `_federated-search` call fires a same-affinity `prerenderModule` for `cohort.gts` mid-card- render. That sub-prerender queues behind the dashboard's tab — the self-referential prerender deadlock. Add a realm-wide sweep on top of the existing per-row deps walk. Source is the filesystem-mtimes map: from-scratch already pays for the walk via `discoverInvalidations`, so we just filter and reuse it; incremental adds a fresh `reader.mtimes()` call before pre-warm (typical ~200 ms on a 500-file realm, one call per job, completely dominated by the indexing work that follows). Filter to `.gts` + `.gjs` only — cards can only live in template-bearing modules, so `.ts`/`.js` helpers are correctly excluded from the realm-wide layer. Helpers that ARE needed by a card already get pre-warmed through the existing per-row deps walk; the realm-wide layer is purely additive on top. Adds `hasCardExtension` / `cardExtensions` next to `hasExecutableExtension` / `executableExtensions` in `runtime-common/index.ts`, same shape — `.gts` and `.gjs` are the extensions a `CardDef` / `FieldDef` can live in because they're the ones with a Glimmer template surface. The modules cache treats `definitions: {}` (an empty result from a non-card module — e.g. a rare `.gts` helper) as a valid cache hit at `definition-lookup.ts:561-563`. So the worst case is one wasted prerender per non-card `.gts` per `clearRealmCache` cycle; subsequent lookups skip the prerender entirely. That contract is pinned by `getModuleCacheEntry caches non-card modules as no-card markers` in PR A. Cost analysis for ambitious-piranha (the staging realm where the deadlock fired): 9 `.gts` files + 9 `.ts` files = 18 executables, but only 9 `.gts` files in the realm-wide sweep (the `.ts` files are `bxl/*` helpers). Pre-warm is still serial (concurrency comes in a follow-up PR after baseline measurement). Roughly ~200 ms per file × 9 files = ~1.8 s additional pre-warm cost per from-scratch reindex. Stacked on top of PR #4863 (PagePool tab-materialization + priority plumbing). The two changes attack the same bug from different angles: PR #4863 makes sure the deadlock cannot happen when a sub-prerender DOES fire; this PR makes sure it almost never has to fire in the first place. PR #4863 is the load-bearing fix; this is the preventive layer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…adlock-pre-warm-realm-modules

Drops the `hasCardExtension` helper added in the previous commit. Copilot's review caught a real case I'd missed: `packages/catalog-realm/commands/collect-submission-files.ts` defines `class CollectSubmissionFilesInput extends CardDef` — a card-defining `.ts` file, no Glimmer template. The `.gts`/`.gjs` extension filter would have skipped it on every realm-wide pre-warm sweep, and any `_federated-search` filter referencing it as a `type:` would have fired a same-affinity `prerenderModule` mid-card-render — the exact deadlock pattern this PR aims to prevent. Fix: realm-wide sweep now uses the existing `hasExecutableExtension` (`.gts` / `.gjs` / `.ts` / `.js`, excluding `.d.ts`). Non-card modules pre-warmed by the sweep persist to the modules table as empty- definitions rows (no-card markers — `definition-lookup.ts:561-563`), so subsequent `getCachedDefinitions` / `lookupDefinition` calls short-circuit at the cache without re-firing the prerenderer. That contract is pinned by the `getCachedDefinitions caches non-card modules as no-card markers` test in PR #4863. Variable renamed `allRealmCardModules` → `allRealmExecutables` and comments updated. The `hasCardExtension` helper + `cardExtensions` constant are removed entirely. Cost analysis update: ambitious-piranha goes from 9 candidates (just the `.gts` files) to 18 (9 `.gts` + 9 `.ts` helpers in `bxl/`). At ~200 ms per prerender × 18 modules with serial pre-warm = ~3.6 s additional cost on top of the from-scratch reindex. After the first sweep, the modules table caches the helper rows; subsequent reindexes only pay for modules that actually have card defs. Bounded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reverts the previous "broaden to all executables" commit. The case that broadening fixed — `.ts` files that host CardDef (rare; the only example in the repo today is command-input cards) — is handled correctly by the on-demand `lookupDefinition` read-through during the visit. The PagePool's new tab-materialization for module/command callers makes that on-demand path safe (the sub-prerender gets its own tab instead of queueing behind the render that triggered it), so the cost trade favors the narrow sweep: - Realm-wide pre-warm of `.gts` / `.gjs` only: ~9 prerenders × ~200 ms on a typical card-heavy realm. - Broader sweep including `.ts` / `.js` helpers: ~18+ prerenders, most of them producing no-card-marker rows for files that almost never define a card. This is an optimization on every reindex, not a correctness change. Comments updated to call out the trade-off honestly: `.ts` / `.js` CAN host card defs, the narrow filter exists because the on-demand path is now safe. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…adlock-pre-warm-realm-modules

chatgpt-codex-connector Bot reviewed May 18, 2026

View reviewed changes

Comment thread packages/runtime-common/index-runner.ts

habdelra requested a review from Copilot May 18, 2026 19:02

Copilot started reviewing on behalf of habdelra May 18, 2026 19:05 View session

Copilot AI reviewed May 18, 2026

View reviewed changes

Comment thread packages/runtime-common/index.ts Outdated

Comment thread packages/runtime-common/index-runner.ts

habdelra force-pushed the prerender-deadlock-pre-warm-realm-modules branch from 4fbac4d to 95e988e Compare May 18, 2026 19:41

habdelra and others added 3 commits May 18, 2026 15:52

Merge branch 'prerender-deadlock-pagepool-priority' into prerender-de…

978b020

…adlock-pre-warm-realm-modules

habdelra requested a review from a team May 18, 2026 21:24

Merge branch 'prerender-deadlock-pagepool-priority' into prerender-de…

a89c949

…adlock-pre-warm-realm-modules

habdelra mentioned this pull request May 18, 2026

Indexer: bounded-parallel pre-warm (concurrency=4) #4871

Closed

4 tasks

backspace approved these changes May 19, 2026

View reviewed changes

habdelra merged commit de0548f into prerender-deadlock-pagepool-priority May 19, 2026
94 of 96 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Indexer: pre-warm every realm .gts/.gjs module, not just per-row deps#4864

Indexer: pre-warm every realm .gts/.gjs module, not just per-row deps#4864
habdelra merged 5 commits into
prerender-deadlock-pagepool-priorityfrom
prerender-deadlock-pre-warm-realm-modules

habdelra commented May 18, 2026 •

edited by backspace

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 18, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

habdelra commented May 18, 2026 • edited by backspace Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

What this PR changes

Bench results — local cold full-reindex

Files

Tests

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Host Test Results

Realm Server Test Results

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

habdelra commented May 18, 2026 •

edited by backspace

Loading

github-actions Bot commented May 18, 2026 •

edited

Loading