fix: Pages Router ISR background regeneration re-renders HTML#487

Merged
james-elicx merged 5 commits into cloudflare:main from NathanDrake2406:fix/isr-pages-router-stale-html
Mar 12, 2026

@NathanDrake2406
Contributor

Summary

  • Pages Router ISR background regeneration was caching stale HTML permanently. The regen re-ran getStaticProps to get fresh data but stored the old HTML back into the cache — only pageData JSON was updated, never the rendered HTML. Every subsequent HIT served the original HTML forever.
  • Fixed in both dev server (dev-server.ts) and production server entry (pages-server-entry.ts).
  • Also fixed two related issues found during review:
    • getStaticProps was called without locale context during regeneration (missing locale, locales, defaultLocale)
    • Dev server didn't update the revalidate duration map after regeneration, so Cache-Control headers could diverge if revalidate changed
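A minimal sketch of the bug and the fix summarized above. All names here (`CacheEntry`, `staleEntry`, `freshEntry`, `buildStaticPropsContext`) are hypothetical and chosen for illustration; they are not taken from vinext's source, and the fallback to `defaultLocale` is an assumption.

```typescript
// Hypothetical shapes for illustration only; not vinext's actual types.
interface I18nConfig { locales: string[]; defaultLocale: string }
interface StaticPropsContext {
  params: Record<string, string>;
  locale?: string;
  locales?: string[];
  defaultLocale?: string;
}
interface CacheEntry { html: string; pageData: Record<string, unknown> }

// Build the getStaticProps context, including the locale fields when i18n is configured.
// Falling back to defaultLocale is an assumption made for this sketch.
function buildStaticPropsContext(
  params: Record<string, string>,
  locale: string | undefined,
  i18n?: I18nConfig,
): StaticPropsContext {
  if (!i18n) return { params };
  return {
    params,
    locale: locale ?? i18n.defaultLocale,
    locales: i18n.locales,
    defaultLocale: i18n.defaultLocale,
  };
}

// Buggy shape: fresh props reach pageData, but the old HTML is written back.
function staleEntry(old: CacheEntry, freshProps: Record<string, unknown>): CacheEntry {
  return { html: old.html, pageData: freshProps };
}

// Fixed shape: the HTML is re-rendered from the same fresh props, so both stay in sync.
function freshEntry(
  freshProps: Record<string, unknown>,
  render: (props: Record<string, unknown>) => string,
): CacheEntry {
  return { html: render(freshProps), pageData: freshProps };
}
```

With the buggy shape, every later HIT keeps returning the first render even though `pageData` changes on each regeneration.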

Test plan

  • Added "background regeneration re-renders HTML with fresh props" integration test
    • Populates ISR cache, waits for TTL expiry, triggers STALE response
    • Waits for background regeneration, asserts the HIT response has a different timestamp in both HTML and __NEXT_DATA__
    • Proves both server-rendered HTML and hydration data are fresh and in sync
  • All existing ISR tests pass (14/14)
  • ISR cache unit tests pass (31/31)
  • Typecheck clean
  • CI: full Vitest suite + Playwright E2E
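The MISS, STALE, HIT sequence that the new integration test exercises can be modeled without real timers. The sketch below uses a hypothetical in-memory cache (`TinyIsrCache`) with an injectable clock; it is not the project's test or cache implementation, only an illustration of the state transitions the test relies on.

```typescript
// Hypothetical in-memory ISR cache with an injectable clock; names are illustrative.
type CacheState = "MISS" | "STALE" | "HIT";
interface Entry { html: string; expiresAt: number }

class TinyIsrCache {
  private entries = new Map<string, Entry>();
  private now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  // Classify a lookup as MISS (absent), STALE (expired), or HIT (within TTL).
  get(key: string): { state: CacheState; html?: string } {
    const e = this.entries.get(key);
    if (!e) return { state: "MISS" };
    return { state: this.now() >= e.expiresAt ? "STALE" : "HIT", html: e.html };
  }

  set(key: string, html: string, revalidateSeconds: number): void {
    this.entries.set(key, { html, expiresAt: this.now() + revalidateSeconds * 1000 });
  }
}

// Walk the same sequence as the test, advancing a fake clock instead of sleeping.
function runScenario(): { staleHtml: string; hitHtml: string } {
  let clock = 0;
  const cache = new TinyIsrCache(() => clock);
  const render = () => `<p>${clock}</p>`; // stands in for a page that prints a timestamp

  cache.set("/isr", render(), 1);   // initial population with a 1-second TTL
  clock += 1200;                    // TTL expires
  const stale = cache.get("/isr");  // STALE: the old HTML is served
  cache.set("/isr", render(), 1);   // regeneration stores re-rendered HTML
  const hit = cache.get("/isr");    // HIT: the fresh HTML is served
  if (stale.state !== "STALE" || hit.state !== "HIT") throw new Error("unexpected cache state");
  return { staleHtml: stale.html ?? "", hitHtml: hit.html ?? "" };
}
```

The assertion that matters is that `hitHtml` differs from `staleHtml`; before the fix, the regeneration step would have stored the old HTML again and the two would be identical.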

Background regeneration only re-ran getStaticProps but cached the old
HTML, so server-rendered content was permanently stale after the first
revalidation cycle. The pageData JSON was updated but never used to
re-render the page.

Now the regeneration callback:
- Re-renders the page component with fresh props
- Rebuilds __NEXT_DATA__ with fresh props
- Updates the revalidate duration map (dev server)
- Passes full locale context to getStaticProps

Fixed in both dev server and production server entry.
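To illustrate why the revalidate duration map needs updating after regeneration, here is a small sketch. The map, the helper names, and the exact `Cache-Control` format are assumptions made for this example and may not match what vinext emits.

```typescript
// Hypothetical map from cache key to revalidate seconds; not vinext's actual data structure.
const revalidateDurations = new Map<string, number>();

// Record the duration returned by the most recent getStaticProps run.
function recordRevalidate(cacheKey: string, revalidateSeconds: number): void {
  revalidateDurations.set(cacheKey, revalidateSeconds);
}

// Derive a Cache-Control value from the recorded duration.
// The header format below is an assumption for illustration.
function cacheControlFor(cacheKey: string): string | undefined {
  const seconds = revalidateDurations.get(cacheKey);
  return seconds === undefined ? undefined : `s-maxage=${seconds}, stale-while-revalidate`;
}
```

If a regeneration returns a different `revalidate` value and `recordRevalidate` is not called, the header keeps advertising the old duration while the cache entry uses the new one.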
@pkg-pr-new

pkg-pr-new Bot commented Mar 12, 2026

npm i https://pkg.pr.new/vinext@487

commit: c79efda

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f3f09f4b85

);
const hydrationScript = hydrationMatch?.[0] ?? "";

const freshHtml = `<!DOCTYPE html><html><head></head><body><div id="__next">${freshBody}</div>${freshNextData}\n ${hydrationScript}</body></html>`;

P2: Preserve document head when caching regenerated dev HTML

The regeneration path now caches a hard-coded shell (<!DOCTYPE html><html><head></head>...) instead of reusing the rendered document structure, so after the first STALE request the next HIT can lose next/head output, font/meta tags, and custom _document head/body wrappers in dev. This is a regression from the previous behavior (which at least preserved the existing shell), and it makes ISR pages render without their expected document chrome during local testing.

Contributor Author

Not a regression — the dev ISR cache has always used a hardcoded simple template (see dev-server.ts:971):

const isrHtml = `<!DOCTYPE html><html><head></head><body><div id="__next">${isrBodyHtml}</div>${allScripts}</body></html>`;

There was never any _document, font links, or head content in the dev ISR cache to begin with. The regen path matches the miss path exactly.

Comment on lines +860 to +864
var _cachedStr = cached.value.value.html;
var _btag = '<div id="__next">';
var _bidx = _cachedStr.indexOf(_btag);
var _head = _bidx >= 0 ? _cachedStr.slice(0, _bidx + _btag.length) : '<!DOCTYPE html>\\n<html>\\n<head>\\n</head>\\n<body>\\n <div id="__next">';
var _freshHtml = _head + _freshBody + '</div>\\n ' + _freshNDS + '\\n</body>\\n</html>';

P1: Keep custom _document suffix during ISR regeneration

This rebuild only keeps the cached prefix up to <div id="__next"> and then appends a fixed </div>... </body></html> suffix, so any custom _document markup after <Main /> (for example extra body nodes or scripts) is dropped on regenerated cache entries. Production HITs served after background regeneration will therefore return a different/broken document structure for apps relying on _document body content.

Contributor Author

Good catch — fixed in 20774da. The regen now splits the cached HTML at <div id="__next"> and <script>window.__NEXT_DATA__ markers to preserve the full document structure:

  • Head section (fonts, assets, _document head): carried forward from cached HTML
  • Gap content (custom elements between <Main /> and <NextScript />): preserved via lastIndexOf('</div>') to find the __next closer
  • Tail (anything after __NEXT_DATA__ </script>): carried forward

Only the body HTML and __NEXT_DATA__ are replaced with fresh content.
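The marker-based split described above can be sketched as follows. This is an illustration written as a standalone function with hypothetical names (`rebuildCachedHtml` and its constants); the generated production entry uses its own ES5-style variables and may differ in details such as whitespace between sections.

```typescript
const BODY_OPEN = '<div id="__next">';
const ND_OPEN = "<script>window.__NEXT_DATA__";
const SCRIPT_CLOSE = "</script>";
const DIV_CLOSE = "</div>";

// Replace only the page body and the __NEXT_DATA__ script, carrying the head,
// gap, and tail forward from the cached document.
// Returns null when any marker is missing, so the caller can fall back.
function rebuildCachedHtml(cached: string, freshBody: string, freshNextData: string): string | null {
  const bodyStart = cached.indexOf(BODY_OPEN);
  const ndStart = cached.indexOf(ND_OPEN);
  if (bodyStart < 0 || ndStart < bodyStart) return null;
  const scriptClose = cached.indexOf(SCRIPT_CLOSE, ndStart);
  if (scriptClose < 0) return null;

  // Head: everything up to and including the opening __next div.
  const head = cached.slice(0, bodyStart + BODY_OPEN.length);
  // Region: old body, the __next closer, and any gap content before __NEXT_DATA__.
  const region = cached.slice(bodyStart + BODY_OPEN.length, ndStart);
  const lastClose = region.lastIndexOf(DIV_CLOSE);
  if (lastClose < 0) return null;
  const gap = region.slice(lastClose + DIV_CLOSE.length);
  // Tail: everything after the old __NEXT_DATA__ closing tag.
  const tail = cached.slice(scriptClose + SCRIPT_CLOSE.length);

  return head + freshBody + DIV_CLOSE + gap + freshNextData + tail;
}
```

Note that in this sketch a `</div>` inside the gap content itself would be mistaken for the `__next` closer, because `lastIndexOf` picks the final `</div>` before the `__NEXT_DATA__` script.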

The prod entry's regeneration was hardcoding a simple HTML suffix,
losing custom _document content between <Main /> and <NextScript />.

Now splits the cached HTML at known markers (<div id="__next"> and
__NEXT_DATA__ script) to preserve the full document structure: head
section, gap content, and tail are carried forward from the original
cached entry.

Also updates the entry-templates snapshot.
@james-elicx
Collaborator

/bigbonk review

Contributor

@ask-bonk ask-bonk Bot left a comment

Review Summary

Good bug fix. The core issue is real and well-diagnosed: background ISR regeneration was calling getStaticProps to get fresh data but caching the old HTML — only pageData JSON was ever updated. The fix correctly re-renders the page component with fresh props and stores the new HTML.

CI is fully green (Vitest, all 7 Playwright projects, typecheck, lint, format). The new test directly proves the fix works by asserting timestamp divergence between STALE and subsequent HIT responses.

I have a few observations, but nothing blocking.

Observations

1. Dev regen doesn't preserve _document shell (matches existing behavior)

The dev regen path (line 669) builds a hardcoded <!DOCTYPE html><html><head></head>... shell. This is intentional and matches the initial ISR cache population at line 971 — both dev paths use the same simple template. Codex flagged this but the PR author correctly explained it's not a regression.

2. Prod regen preserves the shell via HTML slicing (good)

The production entry does the right thing: it splits the cached HTML at <div id="__next"> and <script>window.__NEXT_DATA__ markers to preserve the document head, _document markup, and tail content. The lastIndexOf('</div>') approach to find the __next closing tag is sound for well-formed HTML.

3. Dev regen re-loads _app module (lines 616-623)

The regen callback loads _app via server.ssrLoadModule() independently, which means if _app was edited between the original render and the regen, the regen picks up the new version. This is actually better behavior for dev — HMR consistency. But it does mean the regen closure captures pageModule from the outer scope (already loaded) while loading _app fresh. This asymmetry is fine in practice since page modules are also HMR-aware through Vite's module graph.

4. Variable naming style in generated entry

The production entry uses underscore-prefixed var declarations (_fp, _el, _freshBody, etc.) which matches the existing ES5-style conventions in the generated server entry. Consistent with the rest of the file.

5. Test is timing-dependent but reasonable

The test uses setTimeout(1200) to wait for TTL expiry and setTimeout(500) for regen completion. These are generous margins for a 1-second revalidate period and should be reliable in CI. The existing ISR tests use the same pattern (lines 471, 483).

)}${i18nConfig ? `;window.__VINEXT_LOCALE__=${safeJsonStringify(locale ?? i18nConfig.defaultLocale)};window.__VINEXT_LOCALES__=${safeJsonStringify(i18nConfig.locales)};window.__VINEXT_DEFAULT_LOCALE__=${safeJsonStringify(i18nConfig.defaultLocale)}` : ""}</script>`;

const hydrationMatch = cachedHtml.match(
/<script type="module">[\s\S]*?<\/script>/,
Contributor

Nit: the regex /<script type="module">[\s\S]*?<\/script>/ has no g flag and uses a non-greedy quantifier, so match() stops at the first closing tag. If the cached HTML has multiple <script type="module"> tags (e.g. from Vite's HMR client injection), only the first one is captured. In practice this works because the ISR cache HTML at line 971 only contains one hydration <script type="module"> block. But if that ever changes (e.g. additional module scripts), this would silently drop them.

Not blocking — just something to keep in mind if the dev ISR cache format evolves.
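If the dev ISR cache format did grow additional module scripts, one option would be a global match that collects every block. The helper below is a hypothetical sketch (`collectModuleScripts` is not part of the PR) and keeps the same non-greedy pattern.

```typescript
// Collect every <script type="module"> block in document order, not only the first.
// With the g flag, String.prototype.match returns all matches (or null when there are none).
function collectModuleScripts(html: string): string {
  const blocks = html.match(/<script type="module">[\s\S]*?<\/script>/g) || [];
  return blocks.join("");
}
```

The non-greedy `*?` still matters here: a greedy quantifier would swallow everything between the first opening tag and the last closing tag as a single match.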

var _lastClose = _region.lastIndexOf('</div>');
var _gap = _lastClose >= 0 ? _region.slice(_lastClose + 6) : '';
// Tail: everything after the old __NEXT_DATA__ </script>
var _ndEnd = _cachedStr.indexOf('</script>', _ndStart) + 9;
Contributor

The indexOf('</script>', _ndStart) + 9 calculation assumes that the first literal </script> after _ndStart is the closing tag of the __NEXT_DATA__ script, written exactly as </script> with no case or whitespace variations. This is safe here since vinext controls the exact format of the __NEXT_DATA__ script tag (no attributes, no extra whitespace), but it's worth a comment noting that assumption.

Also: if indexOf returns -1 (no </script> found — shouldn't happen with well-formed HTML), _ndEnd becomes 8, and _tail would be almost the entire cached string. The if (_bodyStart >= 0 && _ndStart >= 0) guard above protects against the _ndStart case but not against a missing closing </script> tag. Extremely unlikely in practice since we wrote that HTML ourselves.
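The arithmetic pitfall above is easy to reproduce in isolation: adding the needle length to a `-1` result yields a small positive index rather than a failure signal. A hypothetical helper (`indexAfter`, not part of the PR) that keeps the `-1` sentinel could look like this:

```typescript
// Returns the index just past `needle`, or -1 when `needle` is absent.
// Checking before adding avoids turning -1 into a valid-looking offset.
function indexAfter(haystack: string, needle: string, from: number): number {
  const at = haystack.indexOf(needle, from);
  return at < 0 ? -1 : at + needle.length;
}

// The unguarded form: -1 + 9 evaluates to 8 when the tag is missing.
const unguarded = "no closing tag".indexOf("</script>") + 9;
```

A caller can then treat `-1` the same way as a missing `_bodyStart` or `_ndStart` and skip the rebuild.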

// update the cache so the next request is a HIT with fresh data.
// Trigger background regeneration: re-run getStaticProps,
// re-render the page, and cache the fresh HTML.
triggerBackgroundRegeneration(cacheKey, async () => {
Contributor

Good: the regen callback correctly captures cacheKey, pageModule, route, params, locale, i18nConfig, routerShim, matcher, pagesDir, and cachedHtml from the enclosing scope. These are all stable for the lifetime of the regen since they were computed from the request that triggered it.

One subtlety: pageModule is the Vite SSR module reference captured at callback creation time. If the page file is edited between the STALE response and the regen execution, pageModule.getStaticProps still calls the old version (Vite SSR modules are cached objects). This is actually correct — the regen should use the version that was active when the stale entry was created, not a potentially half-edited module.
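The capture semantics described here are ordinary closure behavior and can be shown with a toy example. The sketch models a module reload as replacing a variable with a new object, which is an assumption about how a reload would surface; it says nothing about Vite's internals.

```typescript
// Hypothetical stand-in for a loaded page module.
interface PageModule { getStaticProps: () => number }

// The callback closes over the module object it was given at creation time.
function makeRegen(pageModule: PageModule): () => number {
  return () => pageModule.getStaticProps();
}

function demoCapturedModule(): number {
  let current: PageModule = { getStaticProps: () => 1 };
  const regen = makeRegen(current);
  // Simulate an edit: the outer binding now points at a new module object.
  current = { getStaticProps: () => 2 };
  if (current.getStaticProps() !== 2) throw new Error("reload did not take effect");
  // The callback still holds the object captured earlier.
  return regen();
}
```

Calling `demoCapturedModule()` runs the originally captured `getStaticProps`, not the replacement.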

@ask-bonk
Contributor

ask-bonk Bot commented Mar 12, 2026

Review posted and approved. Here's a summary:

Verdict: Approved

The PR fixes a real bug where ISR background regeneration was caching stale HTML — getStaticProps ran with fresh data but the old HTML was stored back into the cache, meaning pageData JSON updated but server-rendered HTML never did.

The fix correctly:

  • Re-renders the page component with fresh props during background regeneration (both dev and prod paths)
  • Passes locale context to getStaticProps during regeneration (was missing before)
  • Updates the revalidate duration map in dev after regeneration
  • Preserves the _document shell in production via HTML string slicing at known markers

All CI checks pass (Vitest, 7 Playwright E2E projects, typecheck, lint, format). The new test directly proves the fix by asserting that timestamps in both HTML and __NEXT_DATA__ diverge after regeneration.

I left three non-blocking inline comments noting:

  1. The hydration script regex only captures the first <script type="module"> tag (fine today, fragile if format changes)
  2. The </script> tag search in prod could theoretically fail on malformed HTML (extremely unlikely since vinext writes it)
  3. An observation that pageModule capture semantics in the dev regen closure are actually correct

@james-elicx james-elicx merged commit d093c33 into cloudflare:main Mar 12, 2026
19 checks passed
@NathanDrake2406 NathanDrake2406 deleted the fix/isr-pages-router-stale-html branch March 18, 2026 09:50