Skip to content

realm-server: reset retrieveIndexHTML cache when work throws#4859

Merged
habdelra merged 1 commit into
mainfrom
worktree-cs-11159-serve-index-deferred
May 18, 2026
Merged

realm-server: reset retrieveIndexHTML cache when work throws#4859
habdelra merged 1 commit into
mainfrom
worktree-cs-11159-serve-index-deferred

Conversation

@habdelra
Copy link
Copy Markdown
Contributor

@habdelra habdelra commented May 18, 2026

Summary

Closes CS-11159.

Fixes a latent bug in packages/realm-server/handlers/serve-index.ts that can wedge the realm-server: a single transient throw inside retrieveIndexHTML poisons the in-process cache so that every subsequent request to / (and any host-app HTML route) hangs forever until the process is restarted. The bug ships every deploy; it just doesn't fire unless the rewrite work happens to throw.

The bug

retrieveIndexHTML builds the production HTML shell once and memoizes it. The shape (pre-fix):

let promiseForIndexHTML: Promise<string> | undefined;

async function retrieveIndexHTML(): Promise<string> {
  let isDev = assetsURL.hostname === 'localhost';

  if (!isDev && promiseForIndexHTML) {
    return promiseForIndexHTML;       // subsequent requests block here
  }

  let deferred = new Deferred<string>();
  if (!isDev) {
    promiseForIndexHTML = deferred.promise;  // cached BEFORE the work runs
  }

  let indexHTML = (await getIndexHTML()).replace(/.../, ...);   // can throw
  // ... more rewrite work that can also throw ...

  deferred.fulfill(indexHTML);   // never reached on throw
  return indexHTML;
}

The cache slot is set to a Deferred<string>.promise before the work that will resolve it runs. If anything between that assignment and deferred.fulfill(...) throws, the exception bubbles out of retrieveIndexHTML and the request handler returns 500 — but promiseForIndexHTML is now permanently set to a pending promise that nothing will ever resolve or reject. Every subsequent caller takes the if (!isDev && promiseForIndexHTML) return promiseForIndexHTML branch and awaits that dead promise. The request hangs until process restart.

Throw sources that can trigger this

  • getIndexHTML() itself — reads the host-app shell from disk / assets URL; any I/O or asset-fetch failure throws.
  • JSON.parse(decodeURIComponent(g2)) inside the meta-rewrite replacer if the embedded @cardstack/host/config/environment content is malformed.
  • import('crypto') failing on an exotic Node build (extremely unlikely but technically possible).

How it surfaced

Copilot flagged the pattern on #4846 (thread). The reachability claim was checked against the code — this is a real reachable bug, not a defensive guard for an impossible state. The same shape existed in server.ts before the #4846 refactor and was moved verbatim, so this is not a regression — just finally being addressed.

The fix

Drop the manually-controlled Deferred and cache the in-flight Promise directly via an IIFE. Hook work.catch(...) to clear the cache slot on rejection so the next caller retries getIndexHTML() instead of awaiting a permanently-broken promise. The Deferred wrapper only existed to bridge "I want to set the cache before the work has resolved" — which is exactly the requirement that creates the bug.

if (!isDev && promiseForIndexHTML) {
  return promiseForIndexHTML;
}

let work = (async () => {
  let indexHTML = (await getIndexHTML()).replace(/.../, ...);
  // ... rewrite work ...
  return indexHTML;
})();

if (!isDev) {
  promiseForIndexHTML = work;
  work.catch(() => {
    promiseForIndexHTML = undefined;   // next caller retries
  });
}

return work;

Single-flight behaviour is preserved: concurrent callers still share one in-flight getIndexHTML() run, and repeat callers reuse the cached value on success. The only behavioural change is that a thrown error no longer poisons the cache forever.

Tests

New packages/realm-server/tests/serve-index-test.ts:

  1. Async throw recoverygetIndexHTML throws on the first call and succeeds on the second; the first retrieveIndexHTML() rejects, the second resolves, and getIndexHTML is invoked twice.
  2. Sync rewrite throw recoverygetIndexHTML returns HTML with a malformed embedded config so JSON.parse(decodeURIComponent(...)) throws inside the replacer; same recovery shape.
  3. Single-flight memoization — three concurrent retrieveIndexHTML() callers share one cached promise (getIndexHTML invoked exactly once); a subsequent call still reuses the cache.

Test plan

  • pnpm --filter @cardstack/realm-server test (filtered to serve-index-test + the existing server-config-test that also exercises retrieveIndexHTML) — 4/4 passing
  • pnpm --filter @cardstack/realm-server lint:js clean
  • prettier --check clean on changed files
  • Manual: monkey-patch getIndexHTML in dev to throw once, verify the realm-server doesn't enter a permanent hang on /

🤖 Generated with Claude Code

The Deferred-based cache assigned promiseForIndexHTML before the rewrite
work ran, so a throw inside getIndexHTML or the meta-rewrite block left
the cache permanently pointing at a never-resolving promise. Subsequent
requests awaited the dead promise instead of retrying, hanging
index.html for the lifetime of the process.

Replace the manual Deferred with an IIFE promise cached only after the
work is kicked off, with a catch hook that clears the slot on rejection
so the next caller re-runs getIndexHTML. Adds tests for the async
getIndexHTML throw path, the sync rewrite throw path, and the
single-flight memoization invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a latent bug where retrieveIndexHTML cached an unresolved Deferred.promise before the work began. If getIndexHTML() or the meta-rewrite (JSON.parse(decodeURIComponent(...))) threw, the slot was permanently set to a pending promise, causing the realm-server to hang on / and host-app HTML routes forever. The fix replaces the Deferred with an IIFE promise that is cached only after work starts, and uses work.catch(...) to clear the cache on rejection.

Changes:

  • Replace Deferred<string> with an IIFE-produced Promise<string> that is cached only after the work starts; clear cache on rejection so retries are possible.
  • Expose retrieveIndexHTML from createServeIndex for isolated testing.
  • Add new tests covering async failure recovery, synchronous rewrite-step failure recovery, and the single-flight memoization invariant.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
packages/realm-server/handlers/serve-index.ts Removes the broken Deferred-based cache and introduces a self-clearing IIFE promise to fix the hang on rewrite/getIndexHTML failures.
packages/realm-server/tests/serve-index-test.ts New test file verifying cache-clear on async/sync failure and single-flight memoization on success.
packages/realm-server/tests/index.ts Registers the new serve-index-test in the realm-server live test runner.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 18, 2026

Host Test Results

    1 files      1 suites   1h 28m 33s ⏱️
2 661 tests 2 646 ✅ 15 💤 0 ❌
2 680 runs  2 665 ✅ 15 💤 0 ❌

Results for commit eb57706.

Realm Server Test Results

    1 files      1 suites   7m 58s ⏱️
1 389 tests 1 389 ✅ 0 💤 0 ❌
1 470 runs  1 470 ✅ 0 💤 0 ❌

Results for commit eb57706.

@habdelra habdelra requested a review from a team May 18, 2026 16:50
@habdelra habdelra merged commit e448747 into main May 18, 2026
71 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants