Indexer: bounded-parallel pre-warm (concurrency=4) by habdelra · Pull Request #4871 · cardstack/boxel

habdelra · 2026-05-18T22:13:00Z

Stacked on top of #4864. Merge that first (or retarget this PR's base to main after #4864 merges).

⚠️ Bench regression — do not merge. First local bench against ctse/ambitious-piranha showed a wall-time regression vs. #4864's serial baseline; the job was rejected after 12 min with dashboards stuck in renderStage = waiting-stability carrying ~99 in-flight queries. Same fingerprint family as the self-referential prerender deadlock #4863 addresses, but firing at a different point in the lifecycle. Investigation needed before re-trying. See the Bench section below.

Why

preWarmModulesTable's loop is currently strictly serial: each getCachedDefinitions(moduleUrl) round-trips through the prerender server before the next URL fires. On a realm whose pre-warm set is dominated by the realm-wide .gts/.gjs sweep (the layer #4864 introduces), wall-time is roughly |toWarm| × per-module-prerender-cost. For piranha-class realms that's hundreds of modules in series before the visit phase even begins.

The prerender server has spare capacity during pre-warm — the visit phase hasn't started yet so the realm's affinity-tab budget is fully idle. Sweeping with bounded parallelism uses that headroom.

What changes

Replaces the for-loop in preWarmModulesTable with a worker-pool helper at concurrency = 4. The bound is chosen to match the prerender server's default #fileAdmissionCap (affinityTabMax − 1, default 4 with the default PRERENDER_AFFINITY_TAB_MAX=5). The intent was:

Keep pre-warm from oversubscribing the per-affinity tab budget. At most #fileAdmissionCap file/render tabs can be in flight in the realm's affinity; pre-warm borrows that same headroom and releases it before the visit phase starts.
Bound contention. The existing serial shape already absorbs a prior experiment that regressed under wider parallelism. Capping at 4 was meant to keep the experiment's failure mode out of reach.

No semantic change to the cache contract: same toWarm set, same per-call getCachedDefinitions invocation, same failed += 1 accounting. Only the loop shape differs. The pre-warm complete perfLog line now reports concurrency=N for ops attribution.

Bench

Result: regressed. Do not merge until investigated.

Local cold full-reindex against https://localhost:4201/user/ambitious-piranha/ on this branch (with #4863 + #4864 merged in, CORS fix applied):

approach	wall time	outcome
serial pre-warm (#4864 baseline)	~665 s (~11 min)	0 errored rows
bounded-parallel pre-warm, concurrency=4 (this PR)	rejected at 12+ min	dashboards stuck in `waiting-stability`

The rejected run's slowest rows pulled via boxel_index_working.timing_diagnostics:

url                                                | render_ms | stage              | queries_stuck
Tracking/Cohort/cohort-2023-plus.json              |   131 407 | waiting-stability  |             0
PolicyTrackingDashboard/dashboard-2-3.json         |   111 417 | waiting-stability  |            99
Tracking/Cohort/cht-2023-2023-2.json               |   111 156 |                    |             0

Snapshot of same-affinity activity on the stuck dashboard:

portfolio.gts          kind=module  state=running  priority=0  ageMs=2193
book-of-business.gts   kind=module  state=running  priority=0  ageMs=2190
cohort.gts             kind=module  state=queued   priority=0  ageMs=1949
policy.gts             kind=module  state=queued   priority=0  ageMs=1925
claim.gts              kind=module  state=queued   priority=0  ageMs=1541

Five module renders against the realm's affinity at the time the dashboard timed out — two running, three queued. Same fingerprint family as the original self-referential prerender deadlock #4863 addresses, except these modules are firing during the dashboard's <Search> block instead of from a fresh visit. Pre-warm completed (18 passes=fileExtract visits before the visit phase kicked off), and the modules table did have cached definitions for the heavy .gts files, but something during the dashboard's search path is still triggering module sub-renders in the same affinity.

The clear directive from the plan applies: roll back to serial, investigate before re-trying.

Hypotheses worth checking:

The 4 concurrent pre-warm fileExtract tabs leave the prerender server in a Chrome-state that's not equivalent to what serial pre-warm produces — e.g., LRU eviction patterns, GC pressure, or warm-tab affinity ordering. This would manifest as visit-phase renders being slower-than-serial despite a fully populated modules table.
A search-time lookupDefinition cache-scope mismatch — pre-warm caches modules under cacheScope=realm-auth, authUserId=<realm owner>, but the dashboard's <Search> triggers a search whose lookupDefinition resolves a different cacheScope/authUserId tuple, missing the cache and firing a fresh prerenderModule. Would explain why the modules are running in the same affinity after pre-warm completed.
A race between pre-warm completion and the visit phase: parallel pre-warm's await Promise.all(...) returns after the slowest worker, but the modules table writes may not have propagated to the lookup path used by the visit phase by the time the first card visit fires. Less plausible since the writes are synchronous within getCachedDefinitions, but worth ruling out.

Files

packages/runtime-common/index-runner.ts — single function body change inside preWarmModulesTable.

Test plan

Investigate the hypothesis above and identify the root cause before merging.
If addressable: re-run cold full-reindex on prudent-octopus and ambitious-piranha, three trials each, both branches.
If not addressable cleanly: close this PR. The serial baseline in Indexer: pre-warm every realm .gts/.gjs module, not just per-row deps #4864 is already a major win over main.
CI green.

🤖 Generated with Claude Code

Replaces the serial `for (let moduleUrl of toWarm)` loop in `preWarmModulesTable` with a bounded-parallel sweep at concurrency = 4. **Bound rationale.** Matches the prerender server's default `#fileAdmissionCap` (`affinityTabMax − 1`, default 4 with the default `PRERENDER_AFFINITY_TAB_MAX=5`). Tying parallelism to the per-affinity file-admission cap keeps pre-warm from oversubscribing the realm's tab budget — at most `#fileAdmissionCap` file/render tabs can be in flight in the realm's affinity, and pre-warm borrows the same headroom strictly *before* the visit phase starts. The visit phase pays no penalty because the pre-warm has fully drained by the time the first visit fires. Higher concurrency re-introduces the prerender-server contention that motivated the original serial shape (an earlier experimental parallel pre-warm regressed reindex time on these realms — that experiment ran with the *wrong* set of modules in `toWarm`, but the contention story still applied for the modules it did warm). Lower concurrency leaves prerender tabs idle for the duration of the sweep. **No semantic change.** The set of URLs in `toWarm` is unchanged, the per-call `getCachedDefinitions` invocation is unchanged, and the failure handling (`failed += 1`, log on any non-zero) is unchanged. Only the loop shape differs. The `pre-warm complete` perfLog line now also reports `concurrency=N` so wall-time deltas between serial and parallel runs are easy to attribute. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

habdelra · 2026-05-18T22:36:09Z

Closing — bench result showed a clear wall-time regression vs. the serial baseline (#4864), with dashboards stuck in waiting-stability while pre-warmed modules were re-rendering inside the same affinity. The serial pre-warm in #4864 is already the production target. If we revisit parallel pre-warm later, the regression analysis in the closed PR description plus the three hypotheses there are the starting point.

Copilot

Pull request overview

This PR updates the indexer’s preWarmModulesTable phase to run module-cache warmups with bounded parallelism (worker pool) instead of strictly serial awaits, aiming to reduce wall time when many realm modules need warming.

Changes:

Replace the serial for..of await pre-warm loop with a bounded-parallel worker pool (intended concurrency = 4).
Add detailed in-code rationale for the chosen concurrency bound relative to prerender PagePool behavior.
Extend the pre-warm completion perf log to include a concurrency= field.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    // Bounded-parallel sweep. The bound matches the default
+    // `#fileAdmissionCap` of the prerender server's PagePool
+    // (`affinityTabMax − 1`, default 4). Tying it to the
+    // file-admission cap keeps the parallel pre-warm from
+    // oversubscribing the per-affinity tab budget — at most
+    // `#fileAdmissionCap` file/render tabs will be in flight in the
+    // realm's affinity, and pre-warm modules borrow the same headroom
+    // before the visit phase starts. The visit phase pays no penalty
+    // because pre-warm runs strictly before it. Concurrency above 4
+    // re-introduces the contention that motivated the original serial
+    // shape; below 4 leaves the prerender server's idle tabs unused
+    // for the duration of the sweep.
+    let prewarmConcurrency = 4;
+    let urls = [...toWarm];
+    let cursor = 0;


vojtabiberle · 2026-05-18T22:38:32Z

+    await Promise.all(
+      Array.from({ length: Math.min(prewarmConcurrency, urls.length) }, worker),
+    );
    if (failed > 0) {
      this.#log.warn(
        `${jobIdentity(this.#jobInfo)} ${failed} of ${toWarm.size} module pre-warm lookups failed; the visit phase will retry on-demand if needed`,
      );
    }

    this.#perfLog.debug(
-      `${jobIdentity(this.#jobInfo)} pre-warm complete in ${Date.now() - preWarmStart} ms (candidates=${toWarm.size} failed=${failed})`,
+      `${jobIdentity(this.#jobInfo)} pre-warm complete in ${Date.now() - preWarmStart} ms (candidates=${toWarm.size} failed=${failed} concurrency=${prewarmConcurrency})`,
    );


Same — index keys are fine for a read-only list; ${type}:${email} triggers dup-key warnings Copilot earlier asked us to avoid.

@vojtabiberle ???

+    let prewarmConcurrency = 4;
+    let urls = [...toWarm];


habdelra requested a review from Copilot May 18, 2026 22:26

Copilot started reviewing on behalf of habdelra May 18, 2026 22:26 View session

habdelra closed this May 18, 2026

habdelra deleted the bounded-parallel-prewarm branch May 18, 2026 22:36

Copilot AI reviewed May 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Indexer: bounded-parallel pre-warm (concurrency=4)#4871

Indexer: bounded-parallel pre-warm (concurrency=4)#4871
habdelra wants to merge 1 commit into
prerender-deadlock-pre-warm-realm-modulesfrom
bounded-parallel-prewarm

habdelra commented May 18, 2026 •

edited

Loading

Uh oh!

habdelra commented May 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

vojtabiberle May 18, 2026

Uh oh!

habdelra May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

habdelra commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

What changes

Bench

Files

Test plan

Uh oh!

habdelra commented May 18, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

vojtabiberle May 18, 2026

Choose a reason for hiding this comment

Uh oh!

habdelra May 18, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

habdelra commented May 18, 2026 •

edited

Loading