fix(toolkit-docs-generator): preserve previous summary when regen fails #927
Merged
maybeGenerateSummary silently wiped the summary field in two paths:

1. When no LLM generator was configured (e.g. --skip-summary or missing ANTHROPIC_API_KEY), and the tool signature had changed since the last run, causing the summary-reuse fast path to miss.
2. When the LLM call threw (rate limit, network error, invalid JSON): only a warning was recorded and the field stayed undefined.

Both paths produce the observed regression where toolkits like github, jira, salesforce, googledocs, linear, and daytona went from rich, hand-refined summaries to `null` after an auto docs update. The rendered docs (app/_components/toolkit-docs/components/toolkit-page.tsx) then omit the prose block entirely.

Carry the previous toolkit's summary forward in both fallback paths. A slightly stale summary is better than losing it; the next run with a working LLM will regenerate based on the new signature.

Add two regression tests covering the missing-generator and failing-generator paths.

Closes ARC-TOOLKIT-DOCS-SUMMARY-LOSS

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
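The two fallback paths described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in data-merger.ts: the type names, the function name `generateSummaryWithFallback`, and the warning text are hypothetical, and the generator is modeled as a synchronous function to keep the example short (a real LLM call would presumably be asynchronous).

```typescript
// Hypothetical, simplified shapes. The real types in data-merger.ts are not
// shown in this conversation.
interface ToolkitSketch {
  id: string;
  summary?: string;
}

interface MergeResultSketch {
  toolkit: ToolkitSketch;
  warnings: string[];
}

// Assumption: synchronous for brevity; the real generator is likely async.
type SummaryGeneratorSketch = (toolkit: ToolkitSketch) => string;

// Copy the previous summary onto the current toolkit, if there is one.
function carryForward(
  result: MergeResultSketch,
  previous: ToolkitSketch | undefined,
): void {
  const carried = previous?.summary;
  if (carried !== undefined) {
    result.toolkit.summary = carried;
  }
}

function generateSummaryWithFallback(
  result: MergeResultSketch,
  previous: ToolkitSketch | undefined,
  generator: SummaryGeneratorSketch | undefined,
): void {
  // Path 1: no generator configured. Keep the old summary instead of
  // leaving the field undefined.
  if (generator === undefined) {
    carryForward(result, previous);
    return;
  }

  try {
    result.toolkit.summary = generator(result.toolkit);
  } catch (error) {
    // Path 2: the generator threw. Record a warning (hypothetical wording),
    // then fall back to the previous summary.
    const message = error instanceof Error ? error.message : String(error);
    result.warnings.push(
      `Summary generation failed for ${result.toolkit.id}: ${message}`,
    );
    carryForward(result, previous);
  }
}
```

In both fallback branches the summary may describe an older tool set, which is the trade-off the commit message accepts: a slightly stale summary rather than none.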
jottakka added a commit that referenced this pull request on Apr 17, 2026
The summary field for github, googledocs, jira, salesforce, daytona, and linear was silently wiped to null across recent [AUTO] docs update merges (see #926 for the root cause: regeneration fallback paths did not preserve the previous summary).

Each restored summary was rewritten from the last known-good version in git history and then updated to reflect the toolkit's CURRENT tools. Notable capability drift accounted for:

- github: drop notifications (no notification tools currently); drop CLASSIC_PERSONAL_ACCESS_TOKEN (only GITHUB_SERVER_URL remains); add explicit Projects V2 fields/items and user-centric views (review workload, open items, recent activity).
- googledocs: clarify SearchAndRetrieveDocuments returns body content while SearchDocuments is metadata-only; keep File Picker recovery.
- jira: no capability drift; tighten wording around single-call batching and name/key/email reference resolution.
- salesforce: no capability drift; add explicit mention of related-object enrichment (contact roles, line items) visible in current descriptions.
- daytona: drop abstract "metadata for coordination"; add ListRegions, signed vs. standard port preview URLs, snapshots as a separate lifecycle.
- linear: drop the comment-anchoring caveat (no longer present in tool descriptions); add pagination for project/initiative descriptions and ManageIssueSubscription.

This restoration is intentionally separate from #927 (the code fix) so content review can happen independently of the regression fix.

Refs #926

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Apr 17, 2026
When maybeGenerateSummary carries a previous summary forward because the LLM generator is unavailable or the LLM call threw, the signature-mismatch fallback produces a summary that no longer accurately describes the toolkit's current tools. The previous commit made that fallback happen at all (so summaries stopped being silently wiped); this commit marks those cases visibly so they don't linger.

New fields on MergedToolkit:
- summaryStale: boolean (optional)
- summaryStaleReason: string (optional; "llm_generator_unavailable" or "llm_generation_failed")

Both are set together when a previous summary is preserved as a fallback, cleared together when (a) a fresh summary is successfully generated, (b) the signature-match reuse path runs and confirms the existing summary is still accurate, or (c) an overview chunk takes precedence. A warning with the "Summary is stale for <id>" prefix is pushed onto the merge result for in-run visibility.

previous-output.ts carries the flags through the fallback parser so the next run can see an already-stale summary and either refresh it (success path clears the flag) or leave it flagged.

CI gate:
- tests/stale-summaries.test.ts reads committed data/toolkits/*.json and fails if any toolkit has summaryStale: true, with a human-readable list of offenders and their reasons. This intentionally lets the scheduled [AUTO] Adding MCP Servers docs update PR open with the stale flag set; its CI run will fail, which is the signal for a reviewer to rerun generation with a working LLM or fix the summary by hand.
- scripts/check-stale-summaries.ts: standalone tsx script that runs the same scan locally and exits non-zero on findings.

Tests (data-merger.test.ts) now assert:
- preservation-without-generator sets summaryStale + reason + warning
- preservation-on-throw sets summaryStale + reason
- successful regen clears an existing stale flag
- signature-match reuse clears an existing stale flag (because the signature itself proves freshness)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
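To make the paired-flag rule and the CI scan concrete, here is a minimal sketch. It assumes a reduced, hypothetical view of MergedToolkit (`ToolkitSummaryState`) and hypothetical helper names (`preserveAsStale`, `storeFreshSummary`, `findStaleSummaries`); only the field names, the two reason strings, and the "Summary is stale for <id>" prefix come from the commit message. The scan is written as a pure function over already-parsed toolkits, so the file reading that the real test and script perform is left out.

```typescript
// Reason strings named in the commit message.
type SummaryStaleReason = "llm_generator_unavailable" | "llm_generation_failed";

// Hypothetical subset of MergedToolkit: only the fields discussed here.
interface ToolkitSummaryState {
  id: string;
  summary?: string;
  summaryStale?: boolean;
  summaryStaleReason?: string;
}

// Fallback path: keep the old summary and set both flags together.
// Returns the warning text that would be pushed onto the merge result.
function preserveAsStale(
  toolkit: ToolkitSummaryState,
  previousSummary: string,
  reason: SummaryStaleReason,
): string {
  toolkit.summary = previousSummary;
  toolkit.summaryStale = true;
  toolkit.summaryStaleReason = reason;
  return `Summary is stale for ${toolkit.id}: ${reason}`;
}

// Success path: store the fresh summary and clear both flags together.
function storeFreshSummary(toolkit: ToolkitSummaryState, summary: string): void {
  toolkit.summary = summary;
  delete toolkit.summaryStale;
  delete toolkit.summaryStaleReason;
}

// CI gate reduced to its core: list every toolkit still flagged stale.
function findStaleSummaries(toolkits: ToolkitSummaryState[]): string[] {
  return toolkits
    .filter((toolkit) => toolkit.summaryStale === true)
    .map(
      (toolkit) =>
        `${toolkit.id}: ${toolkit.summaryStaleReason ?? "unknown reason"}`,
    );
}
```

A gate built this way would fail whenever `findStaleSummaries` returns a non-empty list, printing the entries as the human-readable list of offenders.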
Follow-up: stale-summary flag + CI gate
Added in commit 9e392f5, scoped to this preservation PR as requested; the deterministic scans/enforcement live in #932.
Shape:
CI gate:
Tests added:
526 tests passing, type-check clean.
…y flag
Five findings from acr-run on the stale-summary flag PR:
1. HIGH — Signature-match reuse was clearing legitimate stale flags.
Scenario: run N-1 carried summary S_AB forward into toolset
{A,B,C} with summaryStale=true. Run N with the same {A,B,C} would
match the signature and mark the summary fresh — but S_AB still
describes the wrong toolset. Guard the reuse fast path with
`!previousToolkit.summaryStale` so a stale-then-matching-signature
run correctly falls through to regeneration (or, if no generator is
available, keeps the stale flag).
2. MEDIUM — previous-output.ts used `!== undefined` for summaryStale
but a truthy check for summaryStaleReason, so an empty-string reason
would break the pair invariant. Match both guards to `!== undefined`.
3. MEDIUM — Moved the `result.toolkit.summary` defensive guard to the
top of maybeGenerateSummary so the no-generator fallback path can
never clobber an already-set summary.
4. LOW — Wrapped readdirSync and JSON.parse in check-stale-summaries.ts
in try/catch so a missing directory or a malformed JSON file produces
a clean exit-1 with the filename instead of an uncaught exception.
5. LOW — stale-summaries.test.ts now resolves TOOLKITS_DIR from
import.meta.url rather than process.cwd(), so it works from any
working directory (repo root or toolkit-docs-generator/).
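Findings 2 and 4 can be illustrated with a small sketch. The function names `readStaleFlags` and `parseToolkitJson` are hypothetical, as is the error wording; the point is only that both flags use the same `!== undefined` guard, and that a malformed file produces an error that names the file. The real script additionally wraps readdirSync, which is omitted here so the example has no file-system dependency.

```typescript
interface StaleFlags {
  summaryStale?: boolean;
  summaryStaleReason?: string;
}

// Finding 2: guard both fields with `!== undefined`, so an empty-string
// reason is carried through and the pair stays consistent.
// Assumption: values are coerced with Boolean()/String() for simplicity.
function readStaleFlags(raw: Record<string, unknown>): StaleFlags {
  const flags: StaleFlags = {};
  const stale = raw["summaryStale"];
  const reason = raw["summaryStaleReason"];
  if (stale !== undefined) {
    flags.summaryStale = Boolean(stale);
  }
  if (reason !== undefined) {
    flags.summaryStaleReason = String(reason);
  }
  return flags;
}

// Finding 4: turn a JSON.parse failure into an error that includes the
// filename, so a caller can print it and exit with status 1.
function parseToolkitJson(filename: string, text: string): Record<string, unknown> {
  try {
    const parsed: unknown = JSON.parse(text);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      throw new Error("expected a JSON object");
    }
    return parsed as Record<string, unknown>;
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(`Failed to read ${filename}: ${message}`);
  }
}
```

With a truthy check on the reason instead, `{ summaryStale: true, summaryStaleReason: "" }` would come back with the flag but no reason, which is the broken pair the finding describes.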
Tests updated:
- Replaced the former "does not flag stale when signature matches" test
(which validated the broken behavior) with three tests covering the
three reuse scenarios:
a) signature match + previous fresh → reuse, no regen
b) signature match + previous stale + generator → regen clears flag
c) signature match + previous stale + no generator → keep stale flag
528 tests pass, type-check clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
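The reuse guard from finding 1 and the three reuse scenarios listed under "Tests updated" can be summarized as a small decision function. This is a sketch under stated assumptions: `PreviousSummaryState`, `SummaryAction`, and `chooseSummaryAction` are hypothetical names, and the real maybeGenerateSummary performs the work rather than returning a label.

```typescript
// Hypothetical subset of the previous run's toolkit record.
interface PreviousSummaryState {
  summary?: string;
  summaryStale?: boolean;
}

type SummaryAction = "reuse" | "regenerate" | "preserve_as_stale" | "skip";

function chooseSummaryAction(
  previous: PreviousSummaryState | undefined,
  signatureMatches: boolean,
  hasGenerator: boolean,
): SummaryAction {
  const hasPreviousSummary =
    previous !== undefined && previous.summary !== undefined;
  const previousIsStale =
    previous !== undefined && previous.summaryStale === true;

  // The reuse fast path only applies when the previous summary is not
  // already flagged stale; a matching signature alone is not enough.
  if (signatureMatches && hasPreviousSummary && !previousIsStale) {
    return "reuse";
  }
  if (hasGenerator) {
    return "regenerate";
  }
  // No generator: keep (or newly set) the stale flag on the old summary,
  // or do nothing when there is no old summary to preserve.
  return hasPreviousSummary ? "preserve_as_stale" : "skip";
}

// The three reuse scenarios, all with a matching signature:
const scenarioA = chooseSummaryAction({ summary: "S" }, true, true);
const scenarioB = chooseSummaryAction({ summary: "S", summaryStale: true }, true, true);
const scenarioC = chooseSummaryAction({ summary: "S", summaryStale: true }, true, false);
console.log(scenarioA, scenarioB, scenarioC);
// prints: reuse regenerate preserve_as_stale
```

Without the stale check, scenarios b and c would both return "reuse", which is the behavior finding 1 reports as marking a stale summary fresh.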
byrro approved these changes on Apr 20, 2026
Summary
- `maybeGenerateSummary` silently wiped the toolkit `summary` field in two fallback paths: when no LLM generator was configured and when the LLM call threw.
- Both paths now carry the previous `summary` forward, so a signature change never loses hand-refined content on its own.

Why
Rendered toolkit pages only show the overview prose block when `data.summary` is truthy (`app/_components/toolkit-docs/components/toolkit-page.tsx`). The silent wipe meant that every `[AUTO] Adding MCP Servers docs update` merge that touched a toolkit's tool set could drop its overview on Vercel with no warning surfaced to the reader.

An earlier version of this fix (missing-generator path only) was written on the PR #907 branch but got dropped during merge to `main`. This PR restores it and additionally covers the LLM-failure path.

Scope
Changes are isolated to `toolkit-docs-generator/src/merger/data-merger.ts` (+ tests). No generator output is regenerated in this PR; summary restoration is split into follow-up PRs so reviewers can evaluate content separately from the code fix.

Test plan
Closes #926
🤖 Generated with Claude Code
Note
Medium Risk
Changes toolkit JSON output semantics by preserving prior `summary` on LLM-unavailable/failure paths and introducing `summaryStale` flags; downstream consumers/CI may start failing if stale summaries are produced.

Overview
Prevents toolkit overviews from being silently dropped by carrying forward the previous `summary` when summary regeneration is skipped (no LLM configured) or fails, instead of wiping the field.

Introduces `summaryStale`/`summaryStaleReason` on `MergedToolkit` and propagates these through previous-output parsing; summaries reused via signature match are now only considered fresh if the prior summary was not already flagged stale.

Adds a CI gate (`tests/stale-summaries.test.ts`) and a companion script (`scripts/check-stale-summaries.ts`) to fail/report when committed toolkit JSON contains stale summaries, plus expanded `DataMerger` tests covering stale-flag behavior and clearing on successful regeneration.

Reviewed by Cursor Bugbot for commit f5a58e6. Bugbot is set up for automated code reviews on this repo.