Indexer: bounded-parallel pre-warm (concurrency=4) #4871
Replaces the serial `for (let moduleUrl of toWarm)` loop in `preWarmModulesTable` with a bounded-parallel sweep at concurrency = 4.

**Bound rationale.** Matches the prerender server's default `#fileAdmissionCap` (`affinityTabMax − 1`, default 4 with the default `PRERENDER_AFFINITY_TAB_MAX=5`). Tying parallelism to the per-affinity file-admission cap keeps pre-warm from oversubscribing the realm's tab budget: at most `#fileAdmissionCap` file/render tabs can be in flight in the realm's affinity, and pre-warm borrows the same headroom strictly *before* the visit phase starts. The visit phase pays no penalty because the pre-warm has fully drained by the time the first visit fires. Higher concurrency re-introduces the prerender-server contention that motivated the original serial shape (an earlier experimental parallel pre-warm regressed reindex time on these realms; that experiment ran with the *wrong* set of modules in `toWarm`, but the contention story still applied for the modules it did warm). Lower concurrency leaves prerender tabs idle for the duration of the sweep.

**No semantic change.** The set of URLs in `toWarm` is unchanged, the per-call `getCachedDefinitions` invocation is unchanged, and the failure handling (`failed += 1`, log on any non-zero) is unchanged. Only the loop shape differs. The `pre-warm complete` perfLog line now also reports `concurrency=N` so wall-time deltas between serial and parallel runs are easy to attribute.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closing — bench result showed a clear wall-time regression vs. the serial baseline (#4864), with dashboards stuck in |
Pull request overview
This PR updates the indexer’s preWarmModulesTable phase to run module-cache warmups with bounded parallelism (worker pool) instead of strictly serial awaits, aiming to reduce wall time when many realm modules need warming.
Changes:
- Replace the serial `for..of await` pre-warm loop with a bounded-parallel worker pool (intended concurrency = 4).
- Add detailed in-code rationale for the chosen concurrency bound relative to prerender PagePool behavior.
- Extend the pre-warm completion perf log to include a `concurrency=` field.
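
The bounded-parallel worker pool named in the changes above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `runBounded` and `task` are hypothetical names, and the failure counting only mirrors the `failed += 1` accounting described in the PR.

```typescript
// Hypothetical helper, for illustration only. `runBounded` and `task`
// are not names from the PR.
async function runBounded<T>(
  items: Iterable<T>,
  concurrency: number,
  task: (item: T) => Promise<void>,
): Promise<{ failed: number }> {
  const queue = [...items];
  let cursor = 0;
  let failed = 0;

  // Each worker claims the next unclaimed index until the queue is empty.
  // `cursor++` needs no lock: JavaScript runs each synchronous segment to
  // completion before another worker can resume.
  async function worker(): Promise<void> {
    while (cursor < queue.length) {
      const item = queue[cursor++];
      try {
        await task(item);
      } catch {
        // Count the failure and keep sweeping the remaining items.
        failed += 1;
      }
    }
  }

  // Never start more workers than there are items to process.
  const workerCount = Math.min(concurrency, queue.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
  return { failed };
}
```

With this shape, at most `concurrency` tasks are in flight at any moment, and `Promise.all` resolves only once every worker has drained the queue.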
// Bounded-parallel sweep. The bound matches the default
// `#fileAdmissionCap` of the prerender server's PagePool
// (`affinityTabMax − 1`, default 4). Tying it to the
// file-admission cap keeps the parallel pre-warm from
// oversubscribing the per-affinity tab budget: at most
// `#fileAdmissionCap` file/render tabs will be in flight in the
// realm's affinity, and pre-warm modules borrow the same headroom
// before the visit phase starts. The visit phase pays no penalty
// because pre-warm runs strictly before it. Concurrency above 4
// re-introduces the contention that motivated the original serial
// shape; below 4 leaves the prerender server's idle tabs unused
// for the duration of the sweep.
let prewarmConcurrency = 4;
let urls = [...toWarm];
let cursor = 0;

await Promise.all(
Array.from({ length: Math.min(prewarmConcurrency, urls.length) }, worker),
);
if (failed > 0) {
this.#log.warn(
`${jobIdentity(this.#jobInfo)} ${failed} of ${toWarm.size} module pre-warm lookups failed; the visit phase will retry on-demand if needed`,
);
}
  this.#perfLog.debug(
- `${jobIdentity(this.#jobInfo)} pre-warm complete in ${Date.now() - preWarmStart} ms (candidates=${toWarm.size} failed=${failed})`,
+ `${jobIdentity(this.#jobInfo)} pre-warm complete in ${Date.now() - preWarmStart} ms (candidates=${toWarm.size} failed=${failed} concurrency=${prewarmConcurrency})`,
  );
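
To compare serial and parallel runs from the perf log, a line in the format above could be parsed along these lines. This is only a sketch: `parsePreWarmLine` and `PreWarmStats` are hypothetical names, and the regular expression assumes the message text matches the template strings shown in the diff exactly, with the `concurrency=` field absent on older serial runs.

```typescript
// Hypothetical parser for the `pre-warm complete` perf-log line.
interface PreWarmStats {
  ms: number;
  candidates: number;
  failed: number;
  // Undefined for lines written before the `concurrency=` field existed.
  concurrency: number | undefined;
}

const PRE_WARM_LINE =
  /pre-warm complete in (\d+) ms \(candidates=(\d+) failed=(\d+)(?: concurrency=(\d+))?\)/;

function parsePreWarmLine(line: string): PreWarmStats | undefined {
  const m = PRE_WARM_LINE.exec(line);
  if (!m) return undefined; // Not a pre-warm completion line.
  return {
    ms: Number(m[1]),
    candidates: Number(m[2]),
    failed: Number(m[3]),
    concurrency: m[4] === undefined ? undefined : Number(m[4]),
  };
}
```

The job-identity prefix is ignored because the pattern is unanchored, so the same function accepts lines from both branches.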
Same — index keys are fine for a read-only list; ${type}:${email} triggers dup-key warnings Copilot earlier asked us to avoid.
let prewarmConcurrency = 4;
let urls = [...toWarm];
Why
`preWarmModulesTable`'s loop is currently strictly serial: each `getCachedDefinitions(moduleUrl)` round-trips through the prerender server before the next URL fires. On a realm whose pre-warm set is dominated by the realm-wide `.gts`/`.gjs` sweep (the layer #4864 introduces), wall-time is roughly `|toWarm| × per-module-prerender-cost`. For piranha-class realms that's hundreds of modules in series before the visit phase even begins.

The prerender server has spare capacity during pre-warm: the visit phase hasn't started yet, so the realm's affinity-tab budget is fully idle. Sweeping with bounded parallelism uses that headroom.
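
The wall-time estimate above can be put into a small idealised model. This is a sketch under strong assumptions: a uniform per-module cost and no contention on the prerender server. The function names and the numbers in the usage note are illustrative, not measurements.

```typescript
// Idealised model only; `perModuleMs` is a hypothetical uniform cost.

// Serial sweep: every module waits for the previous one to finish.
function serialWallMs(moduleCount: number, perModuleMs: number): number {
  return moduleCount * perModuleMs;
}

// Bounded-parallel sweep: modules run in waves of at most `concurrency`.
function boundedWallMs(
  moduleCount: number,
  perModuleMs: number,
  concurrency: number,
): number {
  if (concurrency < 1) {
    throw new RangeError("concurrency must be at least 1");
  }
  return Math.ceil(moduleCount / concurrency) * perModuleMs;
}
```

For example, with a made-up 300 modules at 200 ms each, `serialWallMs(300, 200)` gives 60000 ms and `boundedWallMs(300, 200, 4)` gives 15000 ms. The model is a best case: it says nothing about what happens when concurrent renders compete for the same affinity.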
What changes
Replaces the for-loop in `preWarmModulesTable` with a worker-pool helper at concurrency = 4. The bound is chosen to match the prerender server's default `#fileAdmissionCap` (`affinityTabMax − 1`, default 4 with the default `PRERENDER_AFFINITY_TAB_MAX=5`). The intent was: `#fileAdmissionCap` file/render tabs can be in flight in the realm's affinity; pre-warm borrows that same headroom and releases it before the visit phase starts.

No semantic change to the cache contract: same `toWarm` set, same per-call `getCachedDefinitions` invocation, same `failed += 1` accounting. Only the loop shape differs. The `pre-warm complete` perfLog line now reports `concurrency=N` for ops attribution.

Bench
Result: regressed. Do not merge until investigated.
Local cold full-reindex against `https://localhost:4201/user/ambitious-piranha/` on this branch (with #4863 + #4864 merged in, CORS fix applied):

waiting-stability

The rejected run's slowest rows pulled via `boxel_index_working.timing_diagnostics`:

Snapshot of same-affinity activity on the stuck dashboard:
Five module renders against the realm's affinity at the time the dashboard timed out: two running, three queued. Same fingerprint family as the original self-referential prerender deadlock #4863 addresses, except these modules are firing during the dashboard's `<Search>` block instead of from a fresh visit. Pre-warm completed (18 `passes=fileExtract` visits before the visit phase kicked off), and the `modules` table did have cached definitions for the heavy `.gts` files, but something during the dashboard's search path is still triggering module sub-renders in the same affinity.

The clear directive from the plan applies: roll back to serial, investigate before re-trying.
Hypotheses worth checking:
- `modules` table `lookupDefinition` cache-scope mismatch: pre-warm caches modules under `cacheScope=realm-auth, authUserId=<realm owner>`, but the dashboard's `<Search>` triggers a search whose `lookupDefinition` resolves a different cacheScope/authUserId tuple, missing the cache and firing a fresh `prerenderModule`. Would explain why the modules are running in the same affinity after pre-warm completed.
- `await Promise.all(...)` returns after the slowest worker, but the `modules` table writes may not have propagated to the lookup path used by the visit phase by the time the first card visit fires. Less plausible since the writes are synchronous within `getCachedDefinitions`, but worth ruling out.

Files

`packages/runtime-common/index-runner.ts`: single function body change inside `preWarmModulesTable`.

Test plan

- `prudent-octopus` and `ambitious-piranha`, three trials each, both branches.
- `main`.

🤖 Generated with Claude Code