
Pass-scoped compute memo #4926

Open
habdelra wants to merge 4 commits into main from cs-11208-computed-performance-improvements

Conversation

@habdelra
Contributor

@habdelra habdelra commented May 21, 2026

Reduces redundant computeVia invocations during the per-card prerender by roughly 63–66% on the heaviest cards (see Results) via two complementary fixes: the gist's Phase 2 "double-read" fix plus a pass-scoped compute memo. Driven by CS-11208 and the computed-performance gist.

Background

The gist proposed a six-phase optimization plan for computed fields. Audited each phase against the current code:

  1. Double-read in BaseDef.[queryableValue] (Phase 2) — for every contains / contains-many / links-to field in the search doc, we ran peekAtField (which invokes computeVia) and then re-read value[fieldName] through the descriptor (which invokes computeVia again). One traversal walked every computed twice. Landed.
  2. Pass-scoped compute memo (Phase 3, scoped variant) — serializeCard + searchDoc together touch the same (instance, fieldName) pair multiple times. A WeakMap-backed memo opened just for the synchronous traversal collapses those to a single computeVia invocation each. Bounded to one sync traversal so it can't interact with Glimmer reactivity. Landed.
  3. Two render.meta calls collapsed into one: initially attempted, then reverted after CI. The two calls aren't redundant. Between them, the fitted/embedded ancestor renders cause linksTo / linksToMany template reads to mark fields as "used" in the data bucket, and the second render.meta's queryableValue then includes those fields in the search doc via usedLinksToFieldsOnly: true. The "non-isolated formats render linked fields and those links appear in search doc" test caught this. Filed as follow-up (CS-11237).
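
The double-read in item 1 can be illustrated with a small stand-alone sketch. Everything in it (`PersonCard`, `peek`, `queryableBefore`, `queryableAfter`) is hypothetical and does not mirror the real `BaseDef.[queryableValue]` code; it only shows why re-reading a getter-backed field after peeking at it runs the compute twice.

```typescript
// Counts how often the stand-in computed runs.
let computeRuns = 0;

// Hypothetical card: the getter plays the role of a computeVia-backed field.
class PersonCard {
  firstName = 'Ada';
  lastName = 'Lovelace';
  get fullName(): string {
    computeRuns++;
    return `${this.firstName} ${this.lastName}`;
  }
}

type FieldName = keyof PersonCard;

// Stand-in for peekAtField: a single read, which runs the compute once.
function peek(card: PersonCard, field: FieldName): string {
  return card[field];
}

// Before the fix: peek, then read the field again, so the compute runs twice.
function queryableBefore(card: PersonCard, field: FieldName): string {
  const peeked = peek(card, field);
  if (peeked === '') return '';
  return card[field]; // second read re-invokes the getter
}

// After the fix: reuse the peeked value, so the compute runs once.
function queryableAfter(card: PersonCard, field: FieldName): string {
  return peek(card, field);
}

// Resets the counter, performs one read, and reports how many computes ran.
function countRuns(
  read: (card: PersonCard, field: FieldName) => string,
): number {
  computeRuns = 0;
  read(new PersonCard(), 'fullName');
  return computeRuns;
}

const runsBefore = countRuns(queryableBefore); // 2
const runsAfter = countRuns(queryableAfter); // 1
```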

Deferred from the gist: Phases 5 (shared aggregate rollups) and 6 (dependency-aware invalidation) — not worth complexity until we measure post-this-PR baselines.

Changes

packages/base/field-support.ts

  • New beginComputePass() / endComputePass() open a synchronous per-instance memo (WeakMap<BaseDef, Map<string, any>>).
  • getter() consults the memo when a pass is open; off-pass reads pay only one if (passComputeMemo === null) branch.
  • Counters computedCalls + computedCacheHits are incremented only inside the pass — production reads outside render.meta are unaffected.
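
A minimal sketch of how such a pass-scoped memo could be structured. The names `beginComputePass`, `endComputePass`, `passComputeMemo`, `computedCalls`, and `computedCacheHits` come from this description; the function bodies, the `readComputed` helper, and the snapshot shape are assumptions made for illustration and are not the code in field-support.ts.

```typescript
// Assumed snapshot shape; the real ComputePassSnapshot may differ.
type ComputePassSnapshot = { computedCalls: number; computedCacheHits: number };

// Module-global memo: null means "no pass open".
let passComputeMemo: WeakMap<object, Map<string, unknown>> | null = null;
let computedCalls = 0;
let computedCacheHits = 0;

function beginComputePass(): void {
  passComputeMemo = new WeakMap();
  computedCalls = 0;
  computedCacheHits = 0;
}

function endComputePass(): ComputePassSnapshot {
  passComputeMemo = null; // later reads go back to the off-pass fast path
  return { computedCalls, computedCacheHits };
}

// Hypothetical stand-in for the computed branch of getter().
function readComputed<T>(
  instance: object,
  fieldName: string,
  compute: () => T,
): T {
  const memo = passComputeMemo;
  if (memo === null) {
    return compute(); // off-pass: one branch, no counters, no Map work
  }
  let perInstance = memo.get(instance);
  if (perInstance === undefined) {
    perInstance = new Map<string, unknown>();
    memo.set(instance, perInstance);
  }
  if (perInstance.has(fieldName)) {
    computedCacheHits++;
    return perInstance.get(fieldName) as T;
  }
  computedCalls++;
  const value = compute();
  perInstance.set(fieldName, value);
  return value;
}

// Usage: three reads inside one pass run the compute once.
let runs = 0;
const instance = {};
const total = (): number => {
  runs++;
  return 42;
};

beginComputePass();
readComputed(instance, 'total', total);
readComputed(instance, 'total', total);
readComputed(instance, 'total', total);
const snapshot = endComputePass(); // { computedCalls: 1, computedCacheHits: 2 }
const runsInsidePass = runs; // 1

readComputed(instance, 'total', total); // off-pass: computes again
const runsAfterPass = runs; // 2
```

Keying the outer map weakly by instance means the memo holds no strong reference to a card, and because the whole map is dropped when the pass closes, nothing survives into a later reactive read.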

packages/base/card-api.gts

  • BaseDef.[queryableValue] reuses the value already peeked at the top of each iteration instead of re-reading value[fieldName]. For computed fields the re-read re-invoked computeVia — this is the gist's Phase 2 fix.
  • Re-exports beginComputePass, endComputePass, ComputePassSnapshot so the host route can consume them.

packages/host/app/routes/render/meta.ts

  • Wraps the serializeCard + searchDoc traversal with beginComputePass / endComputePass.
  • Emits PrerenderMetaDiagnostics { computedCalls, computedCacheHits, serializeMs, searchDocMs } on the PrerenderMeta response.
  • Logs the per-card counts to the new host:computed-perf logger for local dev.
  • try { ... } finally { endComputePass() } so a throw inside serializeCard or searchDoc always closes the pass — without this, the module-global memo in field-support.ts would stay set and later off-pass getter calls would read stale memoized values across reactive cycles (raised by Codex + Copilot bot review).
  • Defensive typeof api.beginComputePass === 'function' guard — during a cold dev boot the host can briefly load a stale base/card-api build (vite still optimizing, or a transpile race). In that window the route skips the pass; getter's fast-path produces a correct doc + searchDoc, just without per-row compute counters for those few cards.
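
The guard and the `finally` close described above could be combined roughly as follows. `CardApiLike`, `PassCounters`, and `runTraversal` are hypothetical names for this sketch; only the `typeof api.beginComputePass === 'function'` check and the try/finally pattern are taken from the description.

```typescript
type PassCounters = { computedCalls: number; computedCacheHits: number };

// Hypothetical view of the loaded card-api module. Both functions are
// optional to model a stale build that predates the new exports.
interface CardApiLike {
  beginComputePass?: () => void;
  endComputePass?: () => PassCounters;
}

function runTraversal<T>(
  api: CardApiLike,
  traverse: () => T,
): { value: T; counters?: PassCounters } {
  const { beginComputePass, endComputePass } = api;
  if (
    typeof beginComputePass !== 'function' ||
    typeof endComputePass !== 'function'
  ) {
    // Stale module: skip the pass. The traversal still produces a correct
    // result, just without compute counters.
    return { value: traverse() };
  }

  beginComputePass();
  let value: T;
  let counters: PassCounters;
  try {
    value = traverse();
  } finally {
    // Runs on success and on throw, so the pass never stays open.
    counters = endComputePass();
  }
  return { value, counters };
}
```

If `traverse` throws, the error still propagates to the caller; the `finally` block only guarantees that the memo is cleared first.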

packages/realm-server/prerender/render-settlement.ts

  • decorateRenderErrorsWithTimings now lifts the host's success-path diagnostics block off response.card (same way it already lifts error-path diagnostics). This is how computedCalls etc. reach the indexer's boxel_index.timing_diagnostics column.

packages/runtime-common/index.ts

  • New PrerenderMetaDiagnostics interface; extends PrerenderMeta with an optional diagnostics field.
  • Extends RenderTimeoutDiagnostics with the same four fields so they ride alongside the existing server-observed timings in boxel_index.timing_diagnostics and error_doc.diagnostics.
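
For orientation, the diagnostics payload and the lift performed in render-settlement might look like the sketch below. The four field names come from this description; their types, the `ResponseLike` shape, and the `liftCardDiagnostics` helper are assumptions and do not reproduce the real `decorateRenderErrorsWithTimings` signature.

```typescript
// Field names are from the PR description; types are assumed.
interface PrerenderMetaDiagnostics {
  computedCalls: number;
  computedCacheHits: number;
  serializeMs: number;
  searchDocMs: number;
}

// Hypothetical, trimmed-down response shape.
interface ResponseLike {
  card?: { diagnostics?: PrerenderMetaDiagnostics };
  meta: { diagnostics?: Record<string, unknown> };
}

// Copies success-path diagnostics from the card payload onto the response
// meta, keeping any server-observed timings that are already there.
function liftCardDiagnostics(response: ResponseLike): ResponseLike {
  const fromCard = response.card?.diagnostics;
  if (!fromCard) {
    return response;
  }
  return {
    ...response,
    meta: {
      ...response.meta,
      diagnostics: { ...response.meta.diagnostics, ...fromCard },
    },
  };
}
```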

.claude/skills/indexing-diagnostics/SKILL.md

  • Documents the four new diagnostic fields in the JSONC reference example.
  • Adds a "Computed-field hot path" row to Classify in one pass — describes the pattern (computedCalls > 1000 AND searchDocMs + serializeMs ≈ renderElapsedMs → look at the card's computeds, not data loads or browser stalls).
  • Adds an SQL pattern to rank rows in a realm by computedCalls so operators can find the compute-heavy outliers.
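
The "Computed-field hot path" classification can be expressed as a small predicate. This is a sketch only: the `TimingRow` shape is hypothetical, and the 20% tolerance used to interpret "≈" is an assumption, since the skill text does not pin down how close the two timings must be.

```typescript
// Hypothetical subset of a boxel_index.timing_diagnostics row.
interface TimingRow {
  computedCalls: number;
  serializeMs: number;
  searchDocMs: number;
  renderElapsedMs: number;
}

// True when the row matches the pattern:
//   computedCalls > minCalls AND serializeMs + searchDocMs ≈ renderElapsedMs
// `tolerance` (assumed) is the allowed relative gap between the two timings.
function isComputedHotPath(
  row: TimingRow,
  minCalls = 1000,
  tolerance = 0.2,
): boolean {
  if (row.computedCalls <= minCalls || row.renderElapsedMs <= 0) {
    return false;
  }
  const traversalMs = row.serializeMs + row.searchDocMs;
  const gap = Math.abs(row.renderElapsedMs - traversalMs);
  return gap <= tolerance * row.renderElapsedMs;
}
```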

Measurement methodology

Local stack against a compute-heavy fixture realm (an insurance-pricing workload with aggregate cards over a linksToMany policy collection — the aggregate card has 27 computeds, each linked card has 64). Triggered a single-realm reindex via the _grafana-reindex route and read computedCalls / computedCacheHits / serializeMs / searchDocMs straight off boxel_index.timing_diagnostics.

Identifying card slugs are in CS-11208; the table below uses role labels.

Results

| Card role | calls | cacheHits | total reads | % elided by memo | serializeMs | searchDocMs | renderElapsedMs |
|---|---|---|---|---|---|---|---|
| Policy (heaviest of 6 ranked) | 393 | 747 | 1140 | 65.5% | 58.4 | 16.5 | 855 |
| Customer | 393 | 680 | 1073 | 63.4% | 68.3 | 18.3 | 1133 |
| Policy | 357 | 668 | 1025 | 65.2% | 34.1 | 9.6 | 990 |
| Aggregate Portfolio | 351 | 643 | 994 | 64.7% | 62 | 8.3 | 1370 |
| Policy | 331 | 643 | 974 | 66.0% | 40.8 | 12.2 | 1061 |
| Policy | 331 | 563 | 894 | 63.0% | 41.7 | 7.9 | 1065 |

Across the compute-heavy slice the memo elides 63–66% of compute reads consistently — combined gain from the double-read fix + pass-memo. The gist's Phase 2 alone projected 10–40% on this card shape; with the pass-memo layered on, we hit the upper end of Phase 3's 50–80% range.
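
The "% elided by memo" column follows from the two counters: total reads are `calls + cacheHits`, and the elided share is `cacheHits / total`. A short sketch of that arithmetic, using a hypothetical `elidedPercent` helper and the first and last rows of the table as inputs:

```typescript
// Share of compute reads served from the memo, as a percentage.
function elidedPercent(calls: number, cacheHits: number): number {
  const totalReads = calls + cacheHits;
  return totalReads === 0 ? 0 : (cacheHits / totalReads) * 100;
}

// Heaviest Policy row: 393 calls, 747 cache hits, 1140 total reads.
const heaviest = elidedPercent(393, 747).toFixed(1); // "65.5"

// Last Policy row: 331 calls, 563 cache hits, 894 total reads.
const lightest = elidedPercent(331, 563).toFixed(1); // "63.0"
```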

Hot-path overhead — direct microbenchmark

Concern was that adding any per-getter check might tax the host UI's tight reactive-read loop where pass is closed. Measured directly in the running host app:

1,000,000 iterations:
  baseline (empty loop):           3.9 ms
  with `if (memo === null)` check: 5.7 ms
  overhead per call:               1.8 ns

A typical computeVia invocation costs 1–100 µs, so at 1.8 ns the new branch costs roughly 550× to 55,000× less than the work it gates, and it is well below the floor of anything visible in a Chrome flame chart.
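
The per-call figure is the difference between the two loop timings divided by the iteration count. The sketch below reproduces that arithmetic from the numbers reported above; the helper name `overheadNsPerCall` is hypothetical, and no new measurement is implied.

```typescript
// Converts two whole-loop timings (ms) into per-iteration overhead (ns).
function overheadNsPerCall(
  baselineMs: number,
  withCheckMs: number,
  iterations: number,
): number {
  const extraMs = withCheckMs - baselineMs;
  return (extraMs * 1_000_000) / iterations; // 1 ms = 1,000,000 ns
}

// Reported timings: 3.9 ms baseline, 5.7 ms with the check, 1,000,000 iterations.
const perCallNs = overheadNsPerCall(3.9, 5.7, 1_000_000); // about 1.8

// How many times cheaper the branch is than a 1 µs and a 100 µs computeVia.
const ratioAtOneMicro = 1_000 / perCallNs; // about 556
const ratioAtHundredMicro = 100_000 / perCallNs; // about 55,556
```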

Follow-up: optimizing the render.meta double pass

The current two-call renderMeta exists because the second call's queryableValue depends on linksTo / linksToMany field-usage state that the fitted/embedded renders between the calls populate. A clean win is still possible (e.g. expose card types via a cheaper synchronous endpoint, or run a single render.meta after all formats), but it needs its own design + test plan. Tracked in CS-11237.
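
Why the order of calls matters can be shown with a toy model of the "used" tracking. `CardData`, `readLink`, and `searchDoc` are hypothetical stand-ins; the real data bucket and `getFields({ usedLinksToFieldsOnly: true })` are more involved. The sketch only demonstrates that a search doc built before any embedded render misses the linked fields.

```typescript
// Toy model: link fields are included in the search doc only once a
// template has read them.
class CardData {
  private readonly links: Record<string, string>;
  private readonly usedLinks = new Set<string>();

  constructor(links: Record<string, string>) {
    this.links = links;
  }

  // Stand-in for a template read of a linksTo field: marks it as used.
  readLink(field: string): string | undefined {
    this.usedLinks.add(field);
    return this.links[field];
  }

  // Stand-in for queryableValue restricted to used link fields.
  searchDoc(): Record<string, string> {
    const doc: Record<string, string> = {};
    this.usedLinks.forEach((field) => {
      const value = this.links[field];
      if (value !== undefined) {
        doc[field] = value;
      }
    });
    return doc;
  }
}

const pet = new CardData({ owner: 'Hassan' });

// First meta capture: nothing has rendered yet, so no links are "used".
const firstDoc = pet.searchDoc(); // {}

// Embedded render between the two captures reads the link.
pet.readLink('owner');

// Second meta capture now includes the linked field.
const secondDoc = pet.searchDoc(); // { owner: 'Hassan' }
```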

Deferred from the gist

  • Phase 5 (shared aggregate rollups) — pattern only matters if aggregate cards dominate the post-this-PR budget. Wait for prod numbers.
  • Phase 6 (dependency-aware invalidation via compute.bxl.deps / a computeDeps annotation) — BXL-specific today; needs an additive opt-in for plain-JS computeds; plumbing through getFields is real work. Deferred until incremental edit numbers prove it necessary.

Test plan

  • CI green (a regression in the "non-isolated formats render linked fields and those links appear in search doc" test triggered the revert in commit 3)
  • Local reindex of a compute-heavy fixture realm — confirm 0 errors, computedCalls populated on every row
  • Spot-check serializeMs + searchDocMs doesn't regress wall-clock for compute-light cards (e.g. base realm cards)
  • Run host test suite (pnpm test --filter @cardstack/host) — pass-memo lifecycle must not interact with Glimmer's tracked entanglement

🤖 Generated with Claude Code

Reduces redundant `computeVia` work during the per-card prerender:

- `BaseDef.[queryableValue]` no longer re-reads `value[fieldName]`
  after `peekAtField` — the second read re-invoked `computeVia` for
  every contains/contains-many/links-to field in the search doc.
- `beginComputePass` / `endComputePass` install a synchronous
  per-instance compute memo around `serializeCard + searchDoc` so a
  card with N computeds runs each one once per render.meta pass
  instead of once per traversal.
- The prerender server's two-shot render.meta capture is collapsed
  to a single call. `model.capturedDeps` is frozen at parent
  readyPromise resolution and the fitted/embedded renders that ran
  between the two captures don't mutate the card model, so the
  second call was emitting the same bytes as the first.

Host emits `computedCalls`, `computedCacheHits`, `serializeMs`,
`searchDocMs` per row; render-settlement lifts them onto
`response.meta.diagnostics` so they persist into
`boxel_index.timing_diagnostics` for SQL-side perf triage. The
indexing-diagnostics skill documents the new fields and adds a SQL
pattern for ranking rows by computed-call pressure.

Off-pass reads pay a single `if (passComputeMemo === null)` branch
in `getter`, no counter increment and no Map operations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

github-actions Bot commented May 21, 2026

Preview deployments

Host Test Results

    1 files  ± 0      1 suites  ±0   1h 47m 31s ⏱️ - 1m 22s
2 723 tests +11  2 708 ✅ +11  15 💤 ±0  0 ❌ ±0 
2 742 runs  +11  2 727 ✅ +11  15 💤 ±0  0 ❌ ±0 

Results for commit 0cc300f. ± Comparison against earlier commit 2a472dd.

Realm Server Test Results

    1 files  ±0      1 suites  ±0   10m 27s ⏱️ -26s
1 480 tests ±0  1 480 ✅ ±0  0 💤 ±0  0 ❌ ±0 
1 571 runs  ±0  1 571 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 0cc300f. ± Comparison against earlier commit 2a472dd.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2102efa80f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/host/app/routes/render/meta.ts
Contributor

Copilot AI left a comment


Pull request overview

This PR optimizes prerender performance by (1) memoizing computed-field (computeVia) reads within a single render.meta traversal and (2) removing a redundant second render.meta call during per-card prerender. It also plumbs host-side computed/timing diagnostics through to the prerender/indexing diagnostics payload for SQL-side analysis.

Changes:

  • Add a synchronous “compute pass” memo in base to collapse duplicate computed-field reads during serializeCard + searchDoc.
  • Update host render.meta to wrap traversals in the compute pass and emit computedCalls/computedCacheHits/serializeMs/searchDocMs diagnostics.
  • Deduplicate prerender server’s double render.meta invocation and lift success-path diagnostics into persisted timing diagnostics.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
|---|---|
| packages/base/field-support.ts | Introduces pass-scoped computed memo + counters via beginComputePass/endComputePass, used by getter() for computed fields. |
| packages/base/card-api.gts | Re-exports compute pass APIs and fixes a double-read in [queryableValue] by reusing the peeked value. |
| packages/host/app/routes/render/meta.ts | Opens/closes compute pass around serializeCard + searchDoc, measures timings, logs, and attaches diagnostics to PrerenderMeta. |
| packages/realm-server/prerender/render-runner.ts | Removes the second render.meta call and reuses the single result for both ancestor rendering and final response. |
| packages/realm-server/prerender/render-settlement.ts | Lifts success-path card.diagnostics into response.meta.diagnostics so it persists to timing diagnostics. |
| packages/runtime-common/index.ts | Adds PrerenderMetaDiagnostics and extends PrerenderMeta / RenderTimeoutDiagnostics to carry the new fields. |
| .claude/skills/indexing-diagnostics/SKILL.md | Documents the new diagnostics fields and adds SQL/examples for computed hot-path triage. |


Comment thread packages/host/app/routes/render/meta.ts Outdated
Two robustness fixes around the new `beginComputePass` / `endComputePass`
lifecycle:

1. **Stale `api` fallback.** During a cold dev start the host can briefly
   load a `base/card-api` build that predates the beginComputePass /
   endComputePass exports — vite is still optimizing dependencies or a
   transpile race serves a stale module. Without a guard this surfaced
   as `api.beginComputePass is not a function` errors during the very
   first reindex on a freshly-started stack, then went away once
   modules stabilized.

   Skip the pass when the methods aren't on `api` — `getter`'s
   `passComputeMemo === null` fast path still produces a correct
   serialized doc and searchDoc, just without per-row computedCalls /
   computedCacheHits diagnostics for those few cards. serializeMs +
   searchDocMs still surface either way.

2. **Close in `finally` on throw.** `endComputePass` now runs in a
   `finally` so a throw inside `serializeCard` or `searchDoc` still
   closes the pass — otherwise the module-global `passComputeMemo` in
   field-support.ts stays set and later off-pass `getter` calls would
   read stale memoized values across reactive cycles. Caught by Codex
   and Copilot bot review on PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@habdelra habdelra force-pushed the cs-11208-computed-performance-improvements branch from 2102efa to 77cefe7 May 21, 2026 19:39
…tract

The dedup broke the
`non-isolated formats render linked fields and those links appear in search doc`
prerendering test (Realm Server Tests shard 1, 6).

The two render.meta calls aren't duplicate work after all. Between
them the fitted / embedded ancestor renders touch linksTo /
linksToMany values from the embedded template; those reads mark the
fields as "used" in the per-instance data bucket. The *second*
renderMeta's queryableValue runs `getFields` with
`usedLinksToFieldsOnly: true`, which now picks up those linked fields
and includes their values in `searchDoc.owner.name`,
`searchDoc.owners[*].name`, etc. Collapsing the two calls runs
searchDoc *before* any fitted / embedded render and skips those
fields entirely.

Reverts only the render-runner change. The double-read fix in
BaseDef.[queryableValue] and the pass-scoped memo are independent
wins and stay — they're what produced the 63-66% reduction in
computeVia invocations across compute-heavy cards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@habdelra habdelra changed the title from "Pass-scoped compute memo + collapse double render.meta pass" to "Pass-scoped compute memo" May 21, 2026
@habdelra habdelra requested a review from a team May 21, 2026 20:43
…formance-improvements

# Conflicts:
#	packages/host/app/routes/render/meta.ts

3 participants