fix(toolkit-docs-generator): preserve previous summary when regen fails #927

Merged
jottakka merged 3 commits into main from fix/toolkit-docs-preserve-summary on Apr 20, 2026

@jottakka (Contributor) commented Apr 17, 2026

Summary

  • maybeGenerateSummary silently wiped the toolkit summary field in two fallback paths — when no LLM generator was configured and when the LLM call threw.
  • Both paths now carry the previous toolkit's summary forward, so a signature change never loses hand-refined content on its own.
  • Adds two regression tests: missing-generator + failing-generator.

Why

Rendered toolkit pages only show the overview prose block when data.summary is truthy (app/_components/toolkit-docs/components/toolkit-page.tsx). The silent wipe meant that every `[AUTO] Adding MCP Servers docs update` merge that touched a toolkit's tool set could drop its overview on Vercel with no warning surfaced to the reader.

An earlier version of this fix (missing-generator path only) was written on the PR #907 branch but got dropped during merge to main. This PR restores it and additionally covers the LLM-failure path.

Scope

Changes are isolated to toolkit-docs-generator/src/merger/data-merger.ts (+ tests). No generator output is regenerated in this PR; summary restoration is split into follow-up PRs so reviewers can evaluate content separately from the code fix.

Test plan

  • `pnpm vitest run toolkit-docs-generator/tests/merger/data-merger.test.ts` — 58 passed (+2 new)
  • Confirm CI green

Closes #926

🤖 Generated with Claude Code


Note

Medium Risk
Changes toolkit JSON output semantics by preserving prior summary on LLM-unavailable/failure paths and introducing summaryStale flags; downstream consumers/CI may start failing if stale summaries are produced.

Overview
Prevents toolkit overviews from being silently dropped by carrying forward the previous summary when summary regeneration is skipped (no LLM configured) or fails, instead of wiping the field.

Introduces summaryStale/summaryStaleReason on MergedToolkit and propagates these through previous-output parsing; summaries reused via signature match are now only considered fresh if the prior summary was not already flagged stale.

Adds a CI gate (tests/stale-summaries.test.ts) and a companion script (scripts/check-stale-summaries.ts) to fail/report when committed toolkit JSON contains stale summaries, plus expanded DataMerger tests covering stale-flag behavior and clearing on successful regeneration.

Reviewed by Cursor Bugbot for commit f5a58e6.

maybeGenerateSummary silently wiped the summary field in two paths:

1. When no LLM generator was configured (e.g. --skip-summary or missing
   ANTHROPIC_API_KEY), and the tool signature had changed since the last
   run — causing the summary-reuse fast path to miss.
2. When the LLM call threw (rate limit, network error, invalid JSON) —
   only a warning was recorded and the field stayed undefined.

Both paths produce the observed regression where toolkits like github,
jira, salesforce, googledocs, linear, and daytona went from rich,
hand-refined summaries to `null` after an auto docs update. The
rendered docs (app/_components/toolkit-docs/components/toolkit-page.tsx)
then omit the prose block entirely.

Carry the previous toolkit's summary forward in both fallback paths.
A slightly stale summary is better than losing it; the next run with a
working LLM will regenerate based on the new signature.
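
The two fallback paths described above might look roughly like the sketch below. It is illustrative only and is not the code in data-merger.ts: the `Toolkit` and `SummaryGenerator` shapes, the function signature, and the warning text are assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real types in data-merger.ts may differ.
interface Toolkit {
  id: string;
  summary?: string;
}

interface SummaryGenerator {
  generate(toolkit: Toolkit): Promise<string>;
}

// Sketch of the two fallback paths: no generator configured, and generator throws.
async function maybeGenerateSummary(
  toolkit: Toolkit,
  previous: Toolkit | undefined,
  generator: SummaryGenerator | undefined,
  warnings: string[],
): Promise<void> {
  if (!generator) {
    // Path 1: no LLM configured. Keep the previous summary instead of leaving the field empty.
    if (previous?.summary) {
      toolkit.summary = previous.summary;
    }
    return;
  }
  try {
    toolkit.summary = await generator.generate(toolkit);
  } catch (error) {
    // Path 2: the LLM call threw. Record a warning (assumed wording) and fall back.
    const message = error instanceof Error ? error.message : String(error);
    warnings.push(`Summary generation failed for ${toolkit.id}: ${message}`);
    if (previous?.summary) {
      toolkit.summary = previous.summary;
    }
  }
}
```

In both branches the previous summary is copied only when one exists, so a toolkit that never had a summary still ends up without one.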

Add two regression tests covering the missing-generator and failing-
generator paths.

Closes ARC-TOOLKIT-DOCS-SUMMARY-LOSS

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel bot commented Apr 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: docs | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Apr 18, 2026 6:58pm

jottakka added a commit that referenced this pull request Apr 17, 2026
The summary field for github, googledocs, jira, salesforce, daytona,
and linear was silently wiped to null across recent [AUTO] docs update
merges (see #926 for the root cause — regeneration fallback paths did
not preserve the previous summary).

Each restored summary was rewritten from the last known-good version in
git history and then updated to reflect the toolkit's CURRENT tools.
Notable capability drift accounted for:

- github: drop notifications (no notification tools currently); drop
  CLASSIC_PERSONAL_ACCESS_TOKEN (only GITHUB_SERVER_URL remains); add
  explicit Projects V2 fields/items and user-centric views (review
  workload, open items, recent activity).
- googledocs: clarify SearchAndRetrieveDocuments returns body content
  while SearchDocuments is metadata-only; keep File Picker recovery.
- jira: no capability drift; tighten wording around single-call batching
  and name/key/email reference resolution.
- salesforce: no capability drift; add explicit mention of related-
  object enrichment (contact roles, line items) visible in current
  descriptions.
- daytona: drop abstract "metadata for coordination"; add ListRegions,
  signed vs. standard port preview URLs, snapshots as a separate
  lifecycle.
- linear: drop the comment-anchoring caveat (no longer present in tool
  descriptions); add pagination for project/initiative descriptions and
  ManageIssueSubscription.

This restoration is intentionally separate from #927 (the code fix) so
content review can happen independently of the regression fix.

Refs #926

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When maybeGenerateSummary carries a previous summary forward because the
LLM generator is unavailable or the LLM call threw, the signature-mismatch
fallback produces a summary that no longer accurately describes the
toolkit's current tools. The previous commit made that fallback happen at
all (so summaries stopped being silently wiped); this commit marks those
cases visibly so they don't linger.

New fields on MergedToolkit:
- summaryStale: boolean (optional)
- summaryStaleReason: string (optional — "llm_generator_unavailable" or
  "llm_generation_failed")

Both are set together when a previous summary is preserved as a fallback,
cleared together when (a) a fresh summary is successfully generated,
(b) the signature-match reuse path runs and confirms the existing summary
is still accurate, or (c) an overview chunk takes precedence. A warning
with the "Summary is stale for <id>" prefix is pushed onto the merge
result for in-run visibility.
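
One way to keep the two fields moving together is to route every change through a pair of helpers, as in the sketch below. The helper names and the trimmed-down `MergedToolkit` shape are hypothetical; of the warning text, only the "Summary is stale for <id>" prefix is taken from this commit message, and the rest of the format is assumed.

```typescript
// Hypothetical subset of MergedToolkit; only the fields discussed here are modeled.
type StaleReason = "llm_generator_unavailable" | "llm_generation_failed";

interface MergedToolkit {
  id: string;
  summary?: string;
  summaryStale?: boolean;
  summaryStaleReason?: string;
}

// Carry the previous summary forward, set both stale fields, and record a warning.
function markSummaryStale(
  toolkit: MergedToolkit,
  previousSummary: string,
  reason: StaleReason,
  warnings: string[],
): void {
  toolkit.summary = previousSummary;
  toolkit.summaryStale = true;
  toolkit.summaryStaleReason = reason;
  // Everything after the "Summary is stale for <id>" prefix is an assumed format.
  warnings.push(`Summary is stale for ${toolkit.id}: ${reason}`);
}

// Store a summary known to be fresh and clear both stale fields together.
function setFreshSummary(toolkit: MergedToolkit, summary: string): void {
  toolkit.summary = summary;
  delete toolkit.summaryStale;
  delete toolkit.summaryStaleReason;
}
```

Because no other code path in this sketch touches `summaryStale` or `summaryStaleReason`, one field cannot be set or cleared without the other.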

previous-output.ts carries the flags through the fallback parser so the
next run can see an already-stale summary and either refresh it (success
path clears the flag) or leave it flagged.
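
A minimal sketch of carrying the flags through a parse step follows. The `RawToolkit` and `PreviousToolkit` shapes and the function name are assumptions, not the actual contents of previous-output.ts; both fields use the `!== undefined` guard that the later review fix in this PR settles on.

```typescript
// Hypothetical shapes; the real fallback parser in previous-output.ts may differ.
interface RawToolkit {
  id: string;
  summary?: string | null;
  summaryStale?: boolean;
  summaryStaleReason?: string;
}

interface PreviousToolkit {
  id: string;
  summary?: string;
  summaryStale?: boolean;
  summaryStaleReason?: string;
}

// Copy the stale flags only when they are present in the previous output.
function carryPreviousToolkit(raw: RawToolkit): PreviousToolkit {
  const parsed: PreviousToolkit = { id: raw.id };
  if (raw.summary) {
    parsed.summary = raw.summary;
  }
  if (raw.summaryStale !== undefined) {
    parsed.summaryStale = raw.summaryStale;
  }
  // Same guard as above, so an empty-string reason still travels with its flag.
  if (raw.summaryStaleReason !== undefined) {
    parsed.summaryStaleReason = raw.summaryStaleReason;
  }
  return parsed;
}
```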

CI gate:
- tests/stale-summaries.test.ts reads committed data/toolkits/*.json and
  fails if any toolkit has summaryStale: true, with a human-readable
  list of offenders and their reasons. This intentionally lets the
  scheduled [AUTO] Adding MCP Servers docs update PR open with the stale
  flag set — its CI run will fail, which is the signal for a reviewer
  to rerun generation with a working LLM or fix the summary by hand.
- scripts/check-stale-summaries.ts: standalone tsx script that runs the
  same scan locally and exits non-zero on findings.
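
The scan both entry points share could be sketched as a pure function like the one below. The function name, the input shape, and the choice to report unparseable files as findings are assumptions; reading data/toolkits/*.json from disk is left out so the sketch stays self-contained.

```typescript
interface StaleFinding {
  file: string;
  reason: string;
}

// Scan already-loaded toolkit JSON and collect the files flagged as stale.
function findStaleSummaries(
  files: Array<{ name: string; content: string }>,
): StaleFinding[] {
  const findings: StaleFinding[] = [];
  for (const file of files) {
    let data: unknown;
    try {
      data = JSON.parse(file.content);
    } catch {
      // Assumed behavior: report malformed JSON by filename instead of crashing.
      findings.push({ file: file.name, reason: "invalid JSON" });
      continue;
    }
    if (typeof data !== "object" || data === null) {
      continue;
    }
    const record = data as { summaryStale?: unknown; summaryStaleReason?: unknown };
    if (record.summaryStale === true) {
      const reason =
        typeof record.summaryStaleReason === "string"
          ? record.summaryStaleReason
          : "unknown";
      findings.push({ file: file.name, reason });
    }
  }
  return findings;
}
```

A test would assert that the returned list is empty, and a standalone script would print the findings and exit non-zero when it is not.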

Tests (data-merger.test.ts) now assert:
- preservation-without-generator sets summaryStale + reason + warning
- preservation-on-throw sets summaryStale + reason
- successful regen clears an existing stale flag
- signature-match reuse clears an existing stale flag (because the
  signature itself proves freshness)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jottakka
Contributor Author

Follow-up: stale-summary flag + CI gate

Added in commit 9e392f5 — scoped to this preservation PR as requested; the deterministic scans/enforcement live in #932.

Shape:

  • MergedToolkit gets optional summaryStale: boolean + summaryStaleReason: string.
  • Set together whenever maybeGenerateSummary carries the previous summary forward because regen was skipped (no LLM) or threw. A warning (`Summary is stale for : . Previous summary carried forward.`) is also pushed onto the merge result.
  • Cleared whenever a fresh summary is generated, the signature-match reuse path confirms the existing summary is still accurate, or an overview chunk takes over.

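CI gate:

  • tests/stale-summaries.test.ts reads committed data/toolkits/*.json and fails if any toolkit has summaryStale: true.
  • scripts/check-stale-summaries.ts runs the same scan locally and exits non-zero on findings.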

Tests added:

  • Existing preservation tests now assert summaryStale/reason are set.
  • New test: successful regen clears an existing stale flag.
  • New test: signature-match reuse clears an existing stale flag.

526 tests passing, type-check clean.

…y flag

Five findings from acr-run on the stale-summary flag PR:

1. HIGH — Signature-match reuse was clearing legitimate stale flags.
   Scenario: run N-1 carried summary S_AB forward into toolset
   {A,B,C} with summaryStale=true. Run N with the same {A,B,C} would
   match the signature and mark the summary fresh — but S_AB still
   describes the wrong toolset. Guard the reuse fast path with
   `!previousToolkit.summaryStale` so a stale-then-matching-signature
   run correctly falls through to regeneration (or, if no generator is
   available, keeps the stale flag).

2. MEDIUM — previous-output.ts used `!== undefined` for summaryStale
   but a truthy check for summaryStaleReason, so an empty-string reason
   would break the pair invariant. Match both guards to `!== undefined`.

3. MEDIUM — Moved the `result.toolkit.summary` defensive guard to the
   top of maybeGenerateSummary so the no-generator fallback path can
   never clobber an already-set summary.

4. LOW — Wrapped readdirSync and JSON.parse in check-stale-summaries.ts
   in try/catch so a missing directory or a malformed JSON file produces
   a clean exit-1 with the filename instead of an uncaught exception.

5. LOW — stale-summaries.test.ts now resolves TOOLKITS_DIR from
   import.meta.url rather than process.cwd(), so it works from any
   working directory (repo root or toolkit-docs-generator/).
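
The guard from finding 1 can be illustrated with a small decision function. This is a sketch under assumptions: the `PreviousSummaryState` shape, the action names, and the idea of storing the signature next to the summary are hypothetical, and only the `!previous.summaryStale` condition is taken from the finding.

```typescript
type SummaryAction = "reuse" | "regenerate" | "preserve_as_stale";

// Hypothetical view of what the previous run left behind for one toolkit.
interface PreviousSummaryState {
  signature: string;
  summary?: string;
  summaryStale?: boolean;
}

// Decide what to do with the summary for the current tool signature.
function decideSummaryAction(
  currentSignature: string,
  previous: PreviousSummaryState | undefined,
  hasGenerator: boolean,
): SummaryAction {
  const canReuse =
    previous !== undefined &&
    previous.summary !== undefined &&
    previous.signature === currentSignature &&
    // A matching signature proves freshness only if the summary was not already stale.
    !previous.summaryStale;
  if (canReuse) {
    return "reuse";
  }
  // Otherwise regenerate when possible, else carry the old summary forward flagged stale.
  return hasGenerator ? "regenerate" : "preserve_as_stale";
}
```

Without the `!previous.summaryStale` term, the second of two consecutive runs on toolset {A,B,C} would return "reuse" and mark S_AB as fresh.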

Tests updated:
- Replaced the former "does not flag stale when signature matches" test
  (which validated the broken behavior) with three tests covering the
  three reuse scenarios:
  a) signature match + previous fresh → reuse, no regen
  b) signature match + previous stale + generator → regen clears flag
  c) signature match + previous stale + no generator → keep stale flag

528 tests pass, type-check clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jottakka jottakka marked this pull request as ready for review April 18, 2026 20:15
@jottakka jottakka requested a review from a team April 18, 2026 20:16
@jottakka jottakka merged commit 84f7511 into main Apr 20, 2026
6 checks passed
@jottakka jottakka deleted the fix/toolkit-docs-preserve-summary branch April 20, 2026 17:24
Development

Successfully merging this pull request may close these issues.

Toolkit docs generator silently wipes summary on LLM unavailable / failure
