Skip to content

CS-11123 Phase 1: Pre-warm modules table before indexing (serial)#4799

Merged
habdelra merged 1 commit into
mainfrom
cs-11123-prewarm-phase1
May 13, 2026
Merged

CS-11123 Phase 1: Pre-warm modules table before indexing (serial)#4799
habdelra merged 1 commit into
mainfrom
cs-11123-prewarm-phase1

Conversation

@habdelra
Copy link
Copy Markdown
Contributor

@habdelra habdelra commented May 13, 2026

Summary

  • Walk the invalidation set before the file-visit loop and ensure every module URL the upcoming renders will need has a populated modules row.
  • Reads go through CachingDefinitionLookup.getModuleCacheEntry — same read-through path lookupDefinition uses, so DB hits return immediately and misses share the in-flight dedup + cross-process coalescer with regular lookups.
  • Serial pass first: validates that pre-warm as a concept clears the nested-prerender deadlock in isolation from concurrency-driven variability. Phase 2 (separate PR) bound-parallelizes this.

Why this matters: A file render that fires _federated-searchpopulateQueryFieldslookupDefinition for a cold definition spawns a nested prerender into the same affinity-scoped tab queue the original render is holding — deadlock. cacheOnlyDefinitions:true papers over this; pre-warming kills it at the root. With the table primed, lookupDefinition hits a populated DB row instead of spawning a sub-prerender.

Signal sources for the URL set, in priority order:

  1. Existing boxel_index.deps — the runtime-captured dep list from the URL's prior successful render.
  2. adoptsFrom.module read via reader.readFile — for novel .json URLs without a prior deps row.
  3. The URL itself — for novel executable files; the file IS a module.

Failures in the pre-warm phase are warned but do not fail the batch. A mid-render sub-prerender will still fire on demand if pre-warm misses a module — degrading to the pre-PR behavior for that specific module, not deadlocking the batch.

Wired into both fromScratch and incremental, called once after orderInvalidationsByDependencies and before the visit loop opens.

Test plan

  • Bench /prudent-octopus from-scratch reindex on top of this PR — wall_seconds no worse than the main baseline (~190 s prod-built).
  • Bench /ambitious/prianha from-scratch reindex on top of this PR — must complete (no rejection at the 150 s prerender abort on the heavy Tracking/Cohort/*.json render). Capture wall_seconds, instancesIndexed, instanceErrors, and the _federated-search HTTP count.
  • Confirm pre-warm walk wall-clock cost: how many module URLs, how many prerenders fired, total time. Should be a small fraction of the total reindex time.
  • Existing realm-server indexing integration tests still pass.

This is Phase 1 of a 3-PR stack:

  • Phase 1 (this PR): serial pre-warm.
  • Phase 2: bound-parallelize the pre-warm walk.
  • Phase 3: unwind the cacheOnlyDefinitions workaround now that pre-warm covers the same ground.

🤖 Generated with Claude Code


Bench results (vite-dev rig, 2026-05-12)

Protocol: mise run kill-allTRUNCATE job_reservations, jobs, job_progressmise run dev-all → boot drain (unfulfilled=0 active=0) → readiness-check on each realm → mount drain → time curl /_grafana-reindex?realm=user/<realm>. Same-rig as the prior main baseline (commit 0119bac76d) captured in the Linear ticket comment.

Phase 1 — current SHA (c85633d237), post the extensionless-adoptsFrom fix

Realm SHA Wall (s) Status fed-search hits filesIndexed instancesIndexed instanceErrors
user/prudent-octopus main 0119bac76d 617 resolved 120 118 91 0
user/prudent-octopus Phase 1 c85633d237 467 resolved 176 118 91 0
user/ambitious-piranha main 0119bac76d 652 rejected (500) 614
user/ambitious-piranha Phase 1 c85633d237 195 rejected (500) 18

Octopus

Phase 1 trims 150 s (~24%) off the main baseline — pre-warm front-loads the modules a cohort render needs so the visit phase doesn't have to spawn nested prerenders inside populateQueryFields. 91 instances / 0 errors matches main exactly; the extensionless-adoptsFrom fix on the latest push resolved the earlier 90/1-error result (adoptsFrom.module was being filtered out for the common relative-extensionless form like "../author", so the cohort instance referencing it never got its module pre-warmed and errored at index time). fed-search count goes 120 → 176 (+47%) — consistent with more renders completing the live populateQueryFields path rather than degrading.

Piranha

Still rejects with the 150 s prerender-server abort during the first cohort render — same outcome as main, which also rejects piranha. The bar is piranha completes, and pre-warm alone does not clear it (only 18 fed-search hits before the 150 s single-render budget is exhausted on the heavy Tracking/Cohort/*.json render). Comparing wall-clock-to-rejection (195 s vs 652 s) is noise — both runs fail the bar, so time-to-failure is not meaningful.

Pre-warm Phase 1 alone is insufficient to clear piranha. The next dependent ticket (CS-11117 — width-aware parallel visit loop) plus other cohort-render optimizations will be needed. Phase 2 (bounded-parallel pre-warm) and Phase 3 (unwind cacheOnlyDefinitions) bench results will be appended as they come in.

Pass-bar status for Phase 1

  • ✅ Octopus wall-clock no worse than baseline (improved by 24%, no instanceErrors).
  • ❌ Piranha does not complete; same end state as main, so pre-warm alone is not sufficient. Tracking as a follow-up signal for the broader pre-warm + parallel-visit-loop combination.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new “modules table pre-warm” step in the indexing pipeline to proactively populate module-cache rows for modules that upcoming renders are likely to need, aiming to prevent nested-prerender deadlocks during indexing (especially those triggered by cold lookupDefinition paths).

Changes:

  • Add a serial pre-warm pass (preWarmModulesTable) before the main visit loop in both fromScratch and incremental indexing.
  • Derive pre-warm candidates from existing dependency rows, plus adoptsFrom.module for novel JSON cards.
  • Add debug/warn logging for pre-warm duration and failures.
Comments suppressed due to low confidence (1)

packages/runtime-common/index-runner.ts:608

  • This adoptsFromModule path is also gated on hasExecutableExtension(...), but adoptsFrom.module values can be extensionless (the runtime normalizes modules by stripping executable extensions, and DefinitionLookup.getModuleCacheEntry() already supports extensionless URLs via its candidate-extension fallback logic). Consider adding adoptsFromModule to toWarm even when it’s extensionless (or normalizing consistently), otherwise novel .json cards may not actually pre-warm their base module.
      let adoptsFromModule = await this.#readAdoptsFromModuleFromDisk(url);
      if (adoptsFromModule && hasExecutableExtension(adoptsFromModule)) {
        toWarm.add(adoptsFromModule);
      }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread packages/runtime-common/index-runner.ts
@habdelra habdelra force-pushed the cs-11123-prewarm-phase1 branch 2 times, most recently from 004c9b8 to bdd34dc Compare May 13, 2026 00:27
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

Host Test Results

    1 files  ±    0      1 suites  ±0   2h 6m 0s ⏱️ - 1h 26m 56s
2 658 tests ±    0  2 643 ✅ ±    0  15 💤 ± 0  0 ❌ ±0 
2 677 runs   - 2 677  2 662 ✅  - 2 662  15 💤  - 15  0 ❌ ±0 

Results for commit c85633d. ± Comparison against earlier commit bdd34dc.

Realm Server Test Results

    1 files  ±0      1 suites  ±0   11m 57s ⏱️ -45s
1 345 tests ±0  1 345 ✅ ±0  0 💤 ±0  0 ❌ ±0 
1 424 runs  ±0  1 424 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit c85633d. ± Comparison against earlier commit bdd34dc.

@habdelra habdelra marked this pull request as ready for review May 13, 2026 02:45
@habdelra habdelra requested a review from a team May 13, 2026 02:46
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bdd34dca64

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/runtime-common/index-runner.ts Outdated
Walk the invalidation set before the file-visit loop fires and ensure
every module URL the renders are likely to need has a populated
`modules` row. With the table primed, `lookupDefinition` calls fired
from inside renders (via `populateQueryFields` → `_federated-search`)
hit a populated DB row instead of spawning a nested prerender —
clearing the affinity-scoped deadlock that PR #4777 papered over with
`cacheOnlyDefinitions:true`.

Signal sources for the URL set, in priority order:
  1. Existing `boxel_index.deps` — the runtime-captured dep list from
     the URL's prior successful render. Used for any URL with a row
     in `boxel_index_working` ∪ `boxel_index`.
  2. `adoptsFrom.module` read from disk — used for novel `.json`
     URLs without a prior `deps` row.
  3. The URL itself — used for novel executable files; the file IS a
     module, pre-warm it directly.

Reads go through `CachingDefinitionLookup.getModuleCacheEntry`, which
is the URL-only entry into the `loadModuleCacheEntry` pipeline that
`lookupDefinition` itself uses. On a DB hit the call returns
immediately; on a miss it walks through the in-flight dedup map, the
cross-process coalescer (CS-10953), the prerender, and the post-
prerender persist — so two callers asking for the same URL share one
prerender, and a pre-warm racing an invalidation never overwrites the
new generation snapshot.

Phase 1: serial. The pre-warm loop awaits each URL in turn. Validates
that pre-warm as a concept clears the deadlock in isolation from any
concurrency-driven variability. Phase 2 will bound-parallelize this
loop.

Failures in the pre-warm phase are warned but do not fail the batch.
A mid-render sub-prerender will still fire on demand if a module
fails to pre-warm — degrading to the pre-PR behavior for that
specific module, not deadlocking the batch.

Wired into both `fromScratch` and `incremental`, called once after
`orderInvalidationsByDependencies` and before the visit loop opens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@habdelra habdelra merged commit 9dea086 into main May 13, 2026
67 of 68 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants