Skip to content

fix(isr): KVCacheHandler.set() now awaits KV put to prevent perpetual STALE#425

Merged
james-elicx merged 2 commits intomainfrom
fix/isr-kv-stale-regen
Mar 10, 2026
Merged

fix(isr): KVCacheHandler.set() now awaits KV put to prevent perpetual STALE#425
james-elicx merged 2 commits intomainfrom
fix/isr-kv-stale-regen

Conversation

@james-elicx
Copy link
Copy Markdown
Collaborator

Problem

After a cache entry's TTL expires, every subsequent request returned X-Vinext-Cache: STALE and never transitioned to HIT. Background regeneration appeared to trigger but the cache was never updated.

Root cause: KVCacheHandler.set() called _putInBackground() and immediately returned Promise.resolve(). In the background regen path in app-rsc-entry.ts:

```js
await Promise.all([
__isrSet(htmlKey, ...), // resolves immediately — KV put not done
__isrSet(rscKey, ...),
]);
// renderFn() resolves here, before KV write completes
```

ctx.waitUntil(renderFn()) expired as soon as renderFn() resolved. The Cloudflare Workers runtime tore down the isolate while the kv.put() network operation was still in flight, silently killing the write. The entry stayed STALE forever.

The MISS path was not affected — it tees the HTML stream and builds an explicit __cachePromise that fully awaits the write before ctx.waitUntil can expire.

Fix

Replace the fire-and-forget _putInBackground() with _put() which returns the actual kv.put() promise. set() now returns that promise instead of Promise.resolve().

```ts
// Before
this._putInBackground(kvKey, json, opts);
return Promise.resolve(); // resolves before KV write

// After
return this._put(kvKey, json, opts); // resolves when KV write completes
```

_put() also registers with ctx.waitUntil() as belt-and-suspenders (same as before), so the Workers runtime keeps the isolate alive even if a caller doesn't await the returned promise.

The full regen chain is now properly awaited end-to-end:
ctx.waitUntil(renderFn())await Promise.all([__isrSet(...)])await handler.set(...)return this._put(...)return kv.put(...)

Tests

Adds three regression tests in tests/kv-cache-handler.test.ts under "STALE → regen → HIT lifecycle":

  1. set() resolves only after the KV put completes
  2. set() returned promise is the actual kv.put promise, not an immediately-resolved stub (uses a latch-controlled mock to prove this)
  3. Background regen waitUntil covers the actual KV write with a delayed put — regen promise does not settle until the write completes

… STALE

Background ISR regen was calling ctx.waitUntil(renderFn()) where renderFn()
resolved before the KV put network operation completed. set() returned
Promise.resolve() immediately, so await __isrSet() in the regen resolved early,
the waitUntil expired, and the Workers runtime killed the in-flight KV write —
leaving the cache entry perpetually STALE on every request after the TTL.

Fix: replace the fire-and-forget _putInBackground() with _put() which returns
the actual kv.put() promise. set() now returns that promise, so await __isrSet()
only resolves after the write is fully persisted. ctx.waitUntil() is also still
registered as belt-and-suspenders.

Adds three regression tests covering the STALE → regen → HIT lifecycle with a
delayed KV put mock to prove the set() promise is not an immediately-resolved stub.
@james-elicx
Copy link
Copy Markdown
Collaborator Author

/bigbonk review

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented Mar 10, 2026

Open in StackBlitz

npm i https://pkg.pr.new/vinext@425

commit: c241bd5

@github-actions
Copy link
Copy Markdown

Example Preview Production Original
app-router-cloudflare preview production
pages-router-cloudflare preview production
app-router-playground preview production original
realworld-api-rest preview production
nextra-docs-template preview production
benchmarks preview production
hackernews preview production original

Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: fix(isr): KVCacheHandler.set() now awaits KV put to prevent perpetual STALE

Good fix. The root cause analysis is correct and the change is minimal and targeted. The await chain from ctx.waitUntil(renderFn()) through __isrSet() through handler.set() to kv.put() is now properly connected end-to-end.

Correctness

The fix is sound. Before this change, set() returned Promise.resolve() which meant the renderFn() promise resolved immediately, ctx.waitUntil had nothing left to keep the isolate alive, and the actual kv.put() network operation was silently killed by the Workers runtime. Now set() returns this._put(...) which returns the real kv.put() promise, so the full chain stays alive until the write completes.

The dual registration pattern (return the promise AND ctx.waitUntil(promise)) is fine — waitUntil just keeps the isolate alive, it doesn't interfere with promise resolution semantics. In the regen path, both the outer ctx.waitUntil(renderFnPromise) and the inner ctx.waitUntil(kvPutPromise) point at the same underlying work, which is harmless redundancy.

The Node.js path (no ctx) now correctly depends on the caller awaiting the returned promise, which isrSet() and __isrSet() both do via await handler.set(...). This is actually better than the old behavior where Node.js was also fire-and-forget on put.

The asymmetry with _deleteInBackground (still fire-and-forget) is intentional and correct — deletes are cleanup for corrupted/invalidated entries and don't affect correctness if they're delayed or lost. Cache writes during regen must complete or the entry stays stale forever, which is exactly the bug this fixes.

Tests

The three regression tests are well-structured and prove the right things:

  1. The basic smoke test is simple but establishes the baseline invariant.
  2. The latch-controlled test (test 2) is the key one — it definitively proves set() is coupled to kv.put() completion by showing the promise stays pending while the latch is held.
  3. The regen simulation (test 3) validates the full chain from renderFn through ctx.waitUntil, matching the actual pattern in app-rsc-entry.ts.

Minor nits

Two small things that don't block merge:

  1. Dead variable in test 2 (line 504): putResolvedBeforeSetAwaited is declared but never read. Looks like it was part of an earlier draft of the test. Can be removed in a follow-up.

  2. Extra blank line at line 282-283 in kv-cache-handler.ts — there's a double blank line between _deleteInBackground and the _put doc comment. Cosmetic only.

LGTM — clean fix with good test coverage for a real production bug.

});

it("set() returned promise is the kv.put promise (not an immediately-resolved stub)", async () => {
// Swap out kv.put with a delayed version so we can verify that the
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: putResolvedBeforeSetAwaited is declared but never read — leftover from an earlier draft?

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk Bot commented Mar 10, 2026

Review posted with approval and inline comment. Here's a summary:

Approved PR #425 — The fix is correct and well-tested.

What the PR does: KVCacheHandler.set() was returning Promise.resolve() instead of the actual kv.put() promise. This meant the ISR background regeneration chain (ctx.waitUntil(renderFn()) -> await __isrSet() -> await handler.set()) resolved before the KV write completed, causing the Workers runtime to tear down the isolate and silently kill the write. Entries stayed STALE forever.

The fix: Renames _putInBackground() to _put(), returns the real kv.put() promise from set(), and keeps ctx.waitUntil() registration as belt-and-suspenders. Minimal, targeted change.

One nit flagged: Dead variable putResolvedBeforeSetAwaited in test 2 (declared but never read).

github run

@james-elicx james-elicx enabled auto-merge (squash) March 10, 2026 23:07
@james-elicx james-elicx merged commit aa878d5 into main Mar 10, 2026
27 checks passed
@james-elicx james-elicx deleted the fix/isr-kv-stale-regen branch March 10, 2026 23:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant