createRealm: enqueue exactly one priority-10 index job (CS-11157) #4849

Merged
habdelra merged 3 commits into cs-11157-server-refactor from cs-11157-skip-mount-indexing on May 15, 2026

@habdelra habdelra commented May 15, 2026

Summary

Fixes the "realm creation hangs during deploy-triggered reindex storms" shape from CS-11157.

Realm creation now enqueues exactly one from-scratch-index job for the new realm, at userInitiatedPriority, so a backed-up queue of system-priority indexing work (e.g. a deploy-triggered reindex storm) does not stall the HTTP create-realm request.

Stacked on #4846 (the server.ts refactor) because the change lands in the extracted handlers/create-realm.ts.

How

Thread a new fromScratchIndexPriority option from the realm-creation handler through the mount pipeline so the realm's own #startup produces the single canonical job at the chosen priority:

createRealm
  → reconciler.lookupOrMount(url, { fromScratchIndexPriority: userInitiatedPriority })
    → ensureMounted(row, opts)
      → realm.start(opts)
        → #startup(opts)
          → realmIndexUpdater.fullIndex(
              opts.fromScratchIndexPriority ?? this.#fromScratchIndexPriority
            )
            → publishFullIndex(priority)   // .then() updates
                                            // #stats / #ignoreData /
                                            // #ignoreDataVersion

The option defaults to this.#fromScratchIndexPriority (currently systemInitiatedPriority) at every layer, so callers that don't pass it — boot-time pinned-realm mount, lazy-mount on first request — keep their existing behaviour.
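
The fallback rule can be illustrated with a minimal sketch. This is not the actual realm.ts code: `SketchRealm` and its `enqueued` array are hypothetical, and the numeric values for the two priority constants are assumed from the "priority-10" / "priority-0" wording used elsewhere in this PR.

```typescript
// Assumed values, taken from the "priority-10" / "priority-0" wording in this PR.
const systemInitiatedPriority = 0;
const userInitiatedPriority = 10;

interface StartOptions {
  fromScratchIndexPriority?: number;
}

// Hypothetical stand-in for Realm; only the priority fallback is modelled.
class SketchRealm {
  // Records the priority of every from-scratch job this realm would enqueue.
  enqueued: number[] = [];

  // TypeScript `private` stands in for the real #fromScratchIndexPriority field.
  private defaultPriority: number;

  constructor(defaultPriority: number = systemInitiatedPriority) {
    this.defaultPriority = defaultPriority;
  }

  start(opts: StartOptions = {}): void {
    this.startup(opts);
  }

  private startup(opts: StartOptions): void {
    // `??` falls back only on undefined/null, so an explicit 0 is respected.
    const priority = opts.fromScratchIndexPriority ?? this.defaultPriority;
    this.enqueued.push(priority);
  }
}

// createRealm-style caller: passes the user-initiated priority explicitly.
const created = new SketchRealm();
created.start({ fromScratchIndexPriority: userInitiatedPriority });

// Boot-time or lazy-mount caller: passes nothing and keeps the default.
const lazilyMounted = new SketchRealm();
lazilyMounted.start();
```

In this sketch each `start()` call records exactly one job, and only the caller that opts in changes the priority.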

prepareRealmFromRow (in main.ts) publishes the realm into realms[] + virtualNetwork synchronously, before realm.start() runs. So a worker _mtimes fetch that races the mount resolves via the existing mount in findOrMountRealm and never re-enters the lazy-mount path — no duplicate enqueue is possible from inside this server-instance during the mount window.
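
A toy model of why publishing before starting closes the window, under stated assumptions: the `realms` array, `startHooks`, and the function bodies below are hypothetical stand-ins that borrow the names `findOrMountRealm` and `realms[]` from the description, and the "worker fetch" is simulated by a hook that runs in the middle of start.

```typescript
interface SketchRealm {
  url: string;
  indexJobsEnqueued: number;
}

// Stand-in for the server's realms[] list.
const realms: SketchRealm[] = [];

// Hypothetical hooks that let the demo inject a lookup while start is running.
const startHooks: Array<() => void> = [];

function lookupMounted(url: string): SketchRealm | undefined {
  for (const realm of realms) {
    if (realm.url === url) {
      return realm;
    }
  }
  return undefined;
}

function startRealm(realm: SketchRealm): void {
  // Starting the realm is what enqueues its from-scratch index job.
  realm.indexJobsEnqueued += 1;
  for (const hook of startHooks) {
    hook();
  }
}

function findOrMountRealm(url: string): SketchRealm {
  const existing = lookupMounted(url);
  if (existing !== undefined) {
    // Already published: resolve via the existing mount, no second start.
    return existing;
  }
  const realm: SketchRealm = { url, indexJobsEnqueued: 0 };
  realms.push(realm); // publish first ...
  startRealm(realm);  // ... then start
  return realm;
}

const demoUrl = "http://localhost:4201/demo/"; // illustrative URL only
const racedLookups: SketchRealm[] = [];
startHooks.push(() => {
  // Simulated worker fetch arriving while the mount is still starting.
  racedLookups.push(findOrMountRealm(demoUrl));
});

const mounted = findOrMountRealm(demoUrl);
```

Because the realm is already in `realms` when the simulated fetch arrives, the racing lookup returns the same object and `indexJobsEnqueued` stays at 1.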

Regression test

realm-lifecycle-test.ts tightens the POST /_create-realm assertion from "at least one priority-10 job exists" to exactly one job, at userInitiatedPriority. With the previous shape (explicit enqueue at 10 + implicit enqueue at 0 via realm.start(), intended to coalesce), a worker claim landing between the two enqueues would have produced two rows and failed this assertion.
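
The difference between the old and new assertion can be sketched as two predicates over job rows. The `JobRow` shape, the `"from-scratch-index"` label, and the priority value are assumptions for illustration; the real test reads its rows from the job queue and may be structured differently.

```typescript
interface JobRow {
  jobType: string;
  priority: number;
}

const userInitiatedPriority = 10; // assumed value
const FROM_SCRATCH = "from-scratch-index"; // assumed job type label

// Old shape: passes as long as some priority-10 job exists.
function looseCheck(rows: JobRow[]): boolean {
  return rows.some(
    (row) => row.jobType === FROM_SCRATCH && row.priority === userInitiatedPriority,
  );
}

// New shape: exactly one from-scratch job, and it is at user-initiated priority.
function strictCheck(rows: JobRow[]): boolean {
  const fromScratch = rows.filter((row) => row.jobType === FROM_SCRATCH);
  return (
    fromScratch.length === 1 &&
    fromScratch.every((row) => row.priority === userInitiatedPriority)
  );
}

// Rows as they would look after the race: explicit p10 plus implicit p0.
const duplicatedRows: JobRow[] = [
  { jobType: FROM_SCRATCH, priority: 10 },
  { jobType: FROM_SCRATCH, priority: 0 },
];

// Rows as they should look after this PR.
const singleRow: JobRow[] = [{ jobType: FROM_SCRATCH, priority: 10 }];
```

Here `looseCheck(duplicatedRows)` is true while `strictCheck(duplicatedRows)` is false, which is why only the tightened assertion catches the duplicate.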

Files

  • packages/runtime-common/realm.ts: start() / #startup() accept { fromScratchIndexPriority?: number }
  • packages/realm-server/lib/realm-registry-reconciler.ts: lookupOrMount / ensureMounted pass the option through
  • packages/realm-server/handlers/create-realm.ts: passes fromScratchIndexPriority: userInitiatedPriority to lookupOrMount; drops the deps it no longer needs (queue)
  • packages/realm-server/routes.ts: drops queue from the CreateRealmDeps bag
  • packages/realm-server/tests/server-endpoints/realm-lifecycle-test.ts: tightened assertion

Out of scope

handlers/handle-publish-realm.ts follows the same pattern (explicit enqueueReindexRealmJob followed by reconciler.lookupOrMount that re-enqueues via realm.start). Same fix would apply; deferred to a follow-up since this PR is scoped to realm creation per the ticket.

🤖 Generated with Claude Code

Threads a skipFromScratchIndex option through
reconciler.lookupOrMount → ensureMounted → realm.start → #startup.
When set, #startup mounts the realm without enqueuing its own
from-scratch-index job.

createRealm now:
  1. enqueues one from-scratch-index job at userInitiatedPriority
  2. lookupOrMount({ skipFromScratchIndex: true }) — mounts without
     enqueuing a duplicate
  3. awaits the priority-10 job's completion before returning

Prior behaviour had two enqueue sites (explicit at priority 10, plus
the implicit one via realm.start at default priority). They were
intended to coalesce via chooseFromScratch keeping maxPriority, but a
worker claim landing between the two inserts moved the first job into
the in-flight bucket — which the from-scratch coalesce ignores — so
the second job survived as a separate priority-0 row that could sit
behind any backlog of system-priority indexing work.
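
The race can be reproduced in a toy in-memory queue. This is only a sketch of the coalescing behaviour as described in this message: `SketchQueue`, the status names, and the method names are hypothetical, and the real queue is not implemented this way.

```typescript
type JobStatus = "unfulfilled" | "in-flight";

interface Job {
  realm: string;
  priority: number;
  status: JobStatus;
}

class SketchQueue {
  jobs: Job[] = [];

  // Coalesce with a waiting job for the same realm, keeping the max priority.
  // In-flight jobs are ignored, mirroring the behaviour described above.
  enqueueFromScratch(realm: string, priority: number): void {
    for (const job of this.jobs) {
      if (job.realm === realm && job.status === "unfulfilled") {
        job.priority = Math.max(job.priority, priority);
        return;
      }
    }
    this.jobs.push({ realm, priority, status: "unfulfilled" });
  }

  // A worker claims the highest-priority waiting job.
  claimNext(): Job | undefined {
    let best: Job | undefined;
    for (const job of this.jobs) {
      if (job.status !== "unfulfilled") {
        continue;
      }
      if (best === undefined || job.priority > best.priority) {
        best = job;
      }
    }
    if (best !== undefined) {
      best.status = "in-flight";
    }
    return best;
  }
}

const realmUrl = "http://localhost:4201/demo/"; // illustrative URL only

// No claim between the two enqueues: they coalesce into one p10 row.
const quiet = new SketchQueue();
quiet.enqueueFromScratch(realmUrl, 10);
quiet.enqueueFromScratch(realmUrl, 0);

// A claim lands between the enqueues: the second one survives as a p0 row.
const raced = new SketchQueue();
raced.enqueueFromScratch(realmUrl, 10);
raced.claimNext();
raced.enqueueFromScratch(realmUrl, 0);
```

In this model `quiet.jobs` ends with one row at priority 10, while `raced.jobs` ends with two rows: the in-flight priority-10 job and a separate waiting priority-0 job.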

Tightened realm-lifecycle-test to assert exactly one job exists at
userInitiatedPriority (was: "at least one, at least one at p10").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b62b6e8e0a

Comment thread packages/realm-server/handlers/create-realm.ts Outdated

Copilot AI left a comment

Pull request overview

This PR aims to prevent duplicate from-scratch indexing jobs during realm creation by allowing startup indexing to be skipped when createRealm has already enqueued a high-priority job.

Changes:

  • Adds a skipFromScratchIndex option through realm startup and reconciler mounting.
  • Updates createRealm to enqueue one priority-10 index job, mount with startup indexing skipped, and await that job.
  • Tightens lifecycle test expectations to exactly one user-priority from-scratch job.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| packages/runtime-common/realm.ts | Adds startup option to skip automatic from-scratch indexing. |
| packages/realm-server/lib/realm-registry-reconciler.ts | Threads mount options from lookup through realm startup. |
| packages/realm-server/handlers/create-realm.ts | Changes realm creation to enqueue and await a single explicit indexing job. |
| packages/realm-server/tests/server-endpoints/realm-lifecycle-test.ts | Updates regression assertion for exactly one high-priority job. |

Comment thread packages/realm-server/handlers/create-realm.ts Outdated
habdelra and others added 2 commits May 15, 2026 10:32

A worker claiming the priority-10 job between publish and the
lookupOrMount call would fetch the new realm's _mtimes against this
server-instance. That fetch would land in findOrMountRealm, whose
lazy-mount path calls reconciler.lookupOrMount(url) without the
skipFromScratchIndex option — re-introducing the priority-0 duplicate
enqueue this PR is trying to eliminate.

Reorder so the realm is mounted (via skipFromScratchIndex
lookupOrMount) before the index job is published. Once the realm is
in realms[] / virtualNetwork / the reconciler's `mounted` map, a
worker fetch routed here resolves via the existing mount and never
triggers the lazy-mount path. The sibling-instance race (reconciler
NOTIFY in flight) remains but is the same pre-existing window that
exists for any newly-created realm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Earlier shape (skipFromScratchIndex + explicit enqueueReindexRealmJob
in createRealm) bypassed RealmIndexUpdater.publishFullIndex, so the
mounted realm's #stats, #ignoreData, and #ignoreDataVersion never
updated when the job completed.
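
A simplified sketch of that failure, with loud assumptions: the completion callback on `SketchJob` stands in for the promise `.then()` the real code uses, and `SketchIndexUpdater`, `enqueueDirectly`, and the `IndexStats` shape are hypothetical.

```typescript
interface IndexStats {
  instancesIndexed: number; // hypothetical field
}

// Hypothetical job handle; callbacks run synchronously to keep the demo simple.
class SketchJob {
  readonly priority: number;
  private listeners: Array<(stats: IndexStats) => void> = [];

  constructor(priority: number) {
    this.priority = priority;
  }

  onDone(listener: (stats: IndexStats) => void): void {
    this.listeners.push(listener);
  }

  complete(stats: IndexStats): void {
    for (const listener of this.listeners) {
      listener(stats);
    }
  }
}

class SketchIndexUpdater {
  stats: IndexStats = { instancesIndexed: 0 };

  // The only place that wires job completion back into the updater's state.
  publishFullIndex(priority: number): SketchJob {
    const job = new SketchJob(priority);
    job.onDone((stats) => {
      this.stats = stats;
    });
    return job;
  }
}

// Enqueueing outside the updater: nothing listens for completion.
function enqueueDirectly(priority: number): SketchJob {
  return new SketchJob(priority);
}

const viaUpdater = new SketchIndexUpdater();
viaUpdater.publishFullIndex(10).complete({ instancesIndexed: 3 });

const bypassed = new SketchIndexUpdater();
enqueueDirectly(10).complete({ instancesIndexed: 3 });
```

Only `viaUpdater.stats` reflects the finished job; `bypassed.stats` keeps its initial value because the completion handler was never attached.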

Replace skipFromScratchIndex with fromScratchIndexPriority. The
mount pipeline itself drives the one-and-only from-scratch job at
the chosen priority:

  lookupOrMount(url, { fromScratchIndexPriority })
    → ensureMounted(row, opts)
      → realm.start(opts)
        → #startup(opts)
          → #realmIndexUpdater.fullIndex(opts.fromScratchIndexPriority
                                         ?? this.#fromScratchIndexPriority)
            → publishFullIndex(...)  // .then updates #stats/#ignoreData/...

publishFullIndex remains the single source of truth for full-index
state updates; createRealm just picks the priority. prepareRealmFromRow
publishes the realm into realms[] / virtualNetwork synchronously, so
worker self-fetches that race the mount still resolve via the existing
mount and never re-enter the lazy-mount path.

Drop the now-unused enqueueReindexRealmJob + queue dep from
create-realm.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot commented May 15, 2026

Host Test Results

    1 files      1 suites   3h 25m 5s ⏱️
2 659 tests 2 644 ✅ 15 💤 0 ❌
5 262 runs  5 232 ✅ 30 💤 0 ❌

Results for commit 392ea9d.

Realm Server Test Results

    1 files  ±    0      1 suites  +1   9m 21s ⏱️ + 9m 21s
1 377 tests +1 377  1 377 ✅ +1 377  0 💤 ±0  0 ❌ ±0 
1 458 runs  +1 458  1 458 ✅ +1 458  0 💤 ±0  0 ❌ ±0 

Results for commit 392ea9d. ± Comparison against earlier commit acd64ac.

@habdelra habdelra requested a review from a team May 15, 2026 16:41
@habdelra habdelra merged commit 0c9a30c into cs-11157-server-refactor May 15, 2026
94 of 96 checks passed