Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion packages/host/memory-baseline.json
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@
"delta_mb": 69
},
"Acceptance | code submode tests": {
"delta_mb": 31.9
"delta_mb": 94.5
},
"Acceptance | code-submode | card playground": {
"delta_mb": 119.1
Expand Down
31 changes: 19 additions & 12 deletions packages/realm-server/handlers/handle-search.ts
Original file line number Diff line number Diff line change
Expand Up @@ -63,32 +63,39 @@ export default function handleSearch(opts?: {
searchOpts,
);

// Job-scoped same-realm cache. Gated on all three:
// Job-scoped cache. Gated on:
// (a) `x-boxel-job-id` is present and well-formed (only the
// indexer worker stamps this; live user / API callers never
// carry it and therefore always see fresh data),
// (b) `x-boxel-consuming-realm` is present and well-formed (the
// host's render route only sets it during prerender),
// (c) the request's `realms` list is exactly `[consumingRealm]`
// — cross-realm reads bypass the cache because a peer
// realm can swap its `boxel_index` mid-batch and the cached
// value would freeze a stale snapshot.
// host's render route only sets it during prerender).
//
// Cross-realm reads participate too. The contract is "one
// consolidated view of the realm-server's state per indexing
// batch": within a single jobId we pin search results to the
// first observation, even if a peer realm swaps its `boxel_index`
// mid-batch. A batch producing one consistent snapshot is more
// valuable than chasing post-swap state across repeated reads of
// the same query. Same-process writes (this batch's own swap)
// still trip `Realm.update`'s onInvalidation → clearInFlightSearch
// (Phase 1 path), so the cache only freezes *peer-realm* swaps
// within the job's lifetime.
//
// `multiRealmAuthorization` has already validated read access to
// every entry of `realmList` for this caller, so the cache cannot
// surface results across an authorization boundary.
let jobId = searchCache
? sanitizePrerenderJobId(ctxt.get(PRERENDER_JOB_ID_HEADER))
: null;
let consumingRealm = searchCache
? sanitizeConsumingRealmHeader(ctxt.get(X_BOXEL_CONSUMING_REALM_HEADER))
: null;
let cacheable =
searchCache &&
jobId &&
consumingRealm &&
realmList.length === 1 &&
realmList[0] === consumingRealm;
let cacheable = searchCache && jobId && consumingRealm;

let combined = cacheable
? await searchCache!.getOrPopulate({
jobId: jobId!,
realms: realmList,
query: cardsQuery,
opts: searchOpts,
populate: runSearch,
Expand Down
66 changes: 43 additions & 23 deletions packages/realm-server/job-scoped-search-cache.ts
Original file line number Diff line number Diff line change
Expand Up @@ -35,24 +35,34 @@ type CachedEntry = {
fifoSeq: number;
};

// Same-realm read cache used during indexing. Each entry is keyed by
// `(jobId, normalizedQuery, normalizedOpts)` and represents one
// `_federated-search` populate computed during the lifetime of one
// indexing job. Safe because within an indexing batch the writer
// touches `boxel_index_working`, not `boxel_index` — so every read of
// the same realm's `boxel_index` returns identical bytes until the
// batch's `applyBatchUpdates` swap fires. The job-id boundary scopes
// the cache to a single batch; a subsequent job hashes to different
// keys and never reuses a stale value.
// Per-batch read cache used during indexing. Each entry is keyed by
// `(jobId, normalizedRealms, normalizedQuery, normalizedOpts)` and
// represents one `_federated-search` populate computed during the
// lifetime of one indexing job. The job-id boundary scopes the cache
// to a single batch; a subsequent job hashes to different keys and
// never reuses a stale value.
//
// The handler gates entry into this cache on three conditions all
// holding: `x-boxel-job-id` present, `x-boxel-consuming-realm` present,
// and the request's `realms` array is exactly `[consumingRealm]`.
// Cross-realm reads bypass the cache because peer realms can swap
// independently — a cached read against a foreign realm could freeze
// a stale snapshot. Anonymous (no jobId) reads also bypass: those
// callers are not inside the batch's snapshot-stable read window and
// must always see live state.
// Same-realm reads are safe by construction: within an indexing batch
// the writer touches `boxel_index_working`, not `boxel_index`, so
// every read of the consuming realm's `boxel_index` returns identical
// bytes until the batch's `applyBatchUpdates` swap fires (at which
// point `Realm.update`'s onInvalidation tears down the cache via
// `clearInFlightSearch`, Phase 1's path — unchanged here).
//
// Cross-realm reads accept a different staleness contract: within one
// jobId, results are pinned to the *first* observation regardless of
// whether a peer realm has swapped since. The rationale is "one
// consolidated view of the realm-server's state per batch" — repeated
// reads of the same broad cross-realm query during one batch are
// strictly better when they all see the same snapshot than when they
// each chase whatever a peer realm has just published. The bound is
// the job's lifetime; a subsequent job re-observes.
//
// The handler gates entry into this cache on two conditions:
// `x-boxel-job-id` present and well-formed, and
// `x-boxel-consuming-realm` present and well-formed. Both headers are
// only stamped by indexer-driven prerender requests, so user-facing
// API callers always bypass and see live state.
//
// Entries store the *resolved* doc, not the in-flight promise.
// Concurrent same-key callers each run their own `populate` (Phase 1's
Expand Down Expand Up @@ -86,11 +96,12 @@ export class JobScopedSearchCache {

async getOrPopulate(args: {
jobId: string;
realms: string[];
query: Query;
opts: unknown | undefined;
populate: () => Promise<LinkableCollectionDocument>;
}): Promise<LinkableCollectionDocument> {
let innerKey = buildInnerKey(args.query, args.opts);
let innerKey = buildInnerKey(args.realms, args.query, args.opts);
let jobMap = this.#byJob.get(args.jobId);
let existing = jobMap?.get(innerKey);
if (existing) {
Expand Down Expand Up @@ -193,12 +204,21 @@ export class JobScopedSearchCache {

// Compose the per-job inner key. Excludes jobId since the outer Map is
// already partitioned by jobId — this keeps inner-key length bounded
// regardless of how the call site formats the jobId. Excludes the
// realms array (the cache gate already enforces same-realm-only), so
// two requests with `realms: [R]` produce the same inner key
// regardless of array identity.
function buildInnerKey(query: Query, opts: unknown | undefined): string {
// regardless of how the call site formats the jobId. The realms array
// is included verbatim (no sort, no dedupe): `_federated-search`
// preserves input order in its `data` array and first-occurrence
// `included`, so `[A, B]` and `[B, A]` are *different* responses and
// must hash to different cache entries. A duplicated realm entry
// likewise contributes duplicate per-realm searches at the handler
// layer — preserve that observable shape too rather than silently
// canonicalising it here.
function buildInnerKey(
realms: string[],
query: Query,
opts: unknown | undefined,
): string {
return JSON.stringify([
realms,
Comment thread
habdelra marked this conversation as resolved.
normalizeQueryForSignature(query),
opts ? sortKeysDeep(opts) : null,
]);
Expand Down
Loading
Loading