rc-docs-sync: pre-agent fast-path, strip version bumps, cover AI Content Planner JS #398

Merged
enricobattocchi merged 9 commits into main from rc-docs-sync-reduce-agent-work
May 11, 2026

enricobattocchi commented May 11, 2026

Summary

Reduce unnecessary work by the RC docs-sync agent, based on patterns observed across the entire 27.6 RC cycle (RC1–RC8). This covers the three original concerns from the proposal (fast-path, AGENT_MAP gap, version-bump noise) plus refinements discovered while validating locally.

What changed

  1. Pre-agent fast-path (the "Detect new public surface in filtered diff" step). When the filtered diff has no new `register_rest_route(...)`, no new `WP_CLI::add_command(...)`, and every `apply_filters` / `do_action` symbol referenced in added lines is already in `symbol-index.txt`, the workflow posts the run-summary itself with the `rc-docs-sync:v1` marker and skips the Claude Code Action.
  2. Strip version-string bumps from `rc.diff.filtered` via `git diff -I<regex>` for the `Version:` header, `WPSEO_VERSION`, `"version":`, `"pluginVersion":`, `CURRENT_RELEASE`, and `MINIMUM_SUPPORTED`.
  3. AGENT_MAP coverage for AI Content Planner JS: add `packages/js/src/ai-*/**` to the ai area's source paths so the gap flagged in 4 consecutive RCs is no longer flagged each run.
  4. Fix pathspec semantics and add `spec/**`: `:(exclude)**/__tests__` and `:(exclude)**/__snapshots__` only matched the directory entry itself, not its contents (a latent bug, unobserved because no `__tests__` directory appears in 27.6-cycle diffs). Fixed both, plus added `**/spec/**` (the yoastseo JS package's spec/ directory contains a 46,806-line sampleVocabulary.json fixture that was driving ~78% of the stable→RC diff size).
  5. Filename-suffix test exclusions: `**/*.test.*`, `**/*.spec.*`, `**/*.stories.*` per the AGENT_MAP convention. Catches `packages/js/tests/*.test.js` files that fell through the directory-based exclusions.
  6. Widen safety-net condition: the earlier fast-path commit narrowed the safety-net to fire only on `has_public_surface == 'true'`, which would have left a missing marker if the new detect step itself crashed. Reverted to `any_content == 'true'`; the step body already self-deduplicates.
  7. Weekday-only schedule: cron changed from `'0 6 * * *'` to `'0 6 * * 1-5'`. Weekend runs almost always find nothing; `workflow_dispatch` still works for urgent off-hours backfill.

Measured impact (validated locally against the real Yoast/wordpress-seo repo at each tag)

Filtered diff sizes — original vs. final:

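| Prev → RC | Original | Final | Reduction |
| --- | ---: | ---: | ---: |
| 27.5 → 27.6-RC1 | 62,175 | 6,775 | 89% |
| 27.5 → 27.6-RC2 | 62,314 | 6,904 | 89% |
| RC1 → RC2 | 391 | 356 | 9% |
| RC2 → RC3 | 154 | 62 | 60% |
| RC3 → RC4 | 711 | 467 | 34% |
| RC4 → RC5 | 1,804 | 1,042 | 42% |
| RC5 → RC6 | 264 | 168 | 36% |
| RC6 → RC7 | 927 | 376 | 59% |
| RC7 → RC8 | 918 | 643 | 30% |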
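
The Reduction column follows directly from the two line counts. As a quick arithmetic check, here is a minimal Python sketch using three rows copied from the table above; the function name `reduction_pct` is illustrative and not part of the workflow.

```python
def reduction_pct(original: int, final: int) -> int:
    """Share of lines removed, rounded to the nearest whole percent."""
    return round(100 * (original - final) / original)


# (original, final) filtered-diff line counts taken from the table above.
rows = {
    "27.5 -> 27.6-RC1": (62_175, 6_775),
    "RC1 -> RC2": (391, 356),
    "RC6 -> RC7": (927, 376),
}

for pair, (original, final) in rows.items():
    print(f"{pair}: {reduction_pct(original, final)}%")
```

The same function reproduces the remaining rows, for example `reduction_pct(918, 643)` for RC7 → RC8.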

Fast-path decisions — has_public_surface matches actual agent behavior in every case:

| Prev → RC | Real-world | Fast-path verdict | Match |
| --- | --- | --- | --- |
| 27.5 → 27.6-RC1 | agent ran ($4.51, opened PR #393) | agent (2 new routes) | ✓ |
| 27.5 → 27.6-RC2 | agent ran ($2.10, 0 PRs) | agent (2 routes + new hook) | ✓ |
| RC2 → RC3 | agent ran ($0.89, 0 PRs) | fast-path (skipped) | ✓ |
| RC3 → RC4 | agent ran ($1.08, 0 PRs) | fast-path (skipped) | ✓ |
| RC4 → RC5 | agent ran ($1.30, 0 PRs) | fast-path (skipped) | ✓ |
| RC5 → RC6 | infra fail (no agent output) | agent (new public filter `wpseo_enable_ai_content_planner_inline_banner`) | ✓ correct call |
| RC6 → RC7 | agent ran ($1.40, 0 PRs) | fast-path (skipped) | ✓ |
| RC7 → RC8 | agent ran ($0.60, 0 PRs, route classified internal) | agent (1 new route) | ✓ |

Retroactive saving across RC3, RC4, RC5, RC7: ~$4.67 / 179 turns without changing any outcome. Future stable→RC runs (e.g. when 27.7-RC1 is cut against 27.6 stable) will have ~7k filtered lines instead of ~62k for the agent to read.

Risk and mitigation

The fast-path could miss "behavior-only" changes that don't introduce new symbols. The agent prompt already flags those as uncertain "Needs human verification" cases, and none of the 7 zero-PR runs surfaced one. The fast-path comment includes a "Top changed files" details block so a maintainer sweeping the tracking issue can spot anomalies without reading the diff.

The weekday-only schedule introduces up to ~3 days of latency for an RC tagged late Friday (in the 27.6 cycle, RC4 was tagged Fri 2026-05-01 and would slip from Saturday to Monday under the new schedule). workflow_dispatch remains available for urgent backfill.

Test plan

  • Workflow YAML parses (yaml.safe_load).
  • Embedded Python parses (ast.parse).
  • End-to-end local test against a real Yoast/wordpress-seo clone for all 9 historical RC-pair scenarios (27.5→RC1, 27.5→RC2, RC1→RC2, RC2→RC3, …, RC7→RC8) — every has_public_surface decision matches the actual agent run's outcome.
  • Version-bump filtering verified file-by-file: wp-seo.php / wp-seo-main.php / package.json no longer appear in rc.diff.filtered.
  • Test/spec exclusions verified: packages/yoastseo/spec/**/*.json and packages/js/tests/**/*.test.js no longer appear.
  • Optional smoke test on a real run via workflow_dispatch with product=wordpress-seo rc_tag=27.6-RC4 (or any past RC) before/instead of waiting for the next 06:00 UTC schedule.
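
The first two checks in the test plan are plain parse checks. A minimal sketch of the Python half, using only the standard library, might look like the following; the helper name `python_parses` is hypothetical, and the YAML half would be the analogous call to PyYAML's `yaml.safe_load`, omitted here to keep the example dependency-free.

```python
import ast


def python_parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python.

    ast.parse only checks syntax; it does not execute the code.
    """
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(python_parses("count = len([1, 2, 3])\n"))
print(python_parses("def broken(:\n    pass\n"))
```

A parse check like this catches quoting and indentation mistakes in embedded scripts, but says nothing about runtime behavior, which is why the plan also lists the end-to-end local run.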

🤖 Generated with Claude Code

enricobattocchi and others added 3 commits May 11, 2026 09:38
The Content Planner JS implementation (packages/js/src/ai-content-planner/**)
fell through every area's source_paths, so the RC docs-sync agent flagged it
as a coverage gap in four consecutive runs (27.6-RC4 / RC5 / RC7 / RC8). Add
the per-feature `packages/js/src/ai-*/**` pattern to the `ai` area for both
Yoast SEO and Yoast SEO Premium so frontend AI work is attributed to the right
area instead of repeatedly re-flagged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every recent zero-PR run-summary called out "Version bumps in package.json,
wp-seo-main.php, wp-seo.php — noise" — the agent was reading those hunks each
RC just to dismiss them. Apply `-I<regex>` to the filtered diff so hunks
containing *only* version-string changes are dropped at diff-compute time:

  -I' \* Version: '       # PHP file header (" * Version: 27.6-RC8")
  -I'WPSEO_VERSION'       # define( 'WPSEO_VERSION', '...' );
  -I'"version":'          # package.json / composer.json
  -I'CURRENT_RELEASE'
  -I'MINIMUM_SUPPORTED'

Hunks that mix a version line with anything else are preserved (semantics of
`-I` require *all* changed lines in a hunk to match). `rc.diff.full` keeps
the bumps as a cross-check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five of the six successful zero-PR runs in the 27.6 cycle so far burned an
average of ~40 turns and ~$1/run for the agent to conclude that an
incremental RC contains no new public symbols, REST routes, or CLI commands.
Detect that case deterministically before invoking Claude.

After the bundle is built, scan `rc.diff.filtered` for *added* lines matching:
- `register_rest_route(...)`
- `WP_CLI::add_command(...)`
- `apply_filters('<sym>', ...)` / `do_action('<sym>', ...)` where `<sym>` is
  not already in `symbol-index.txt` (i.e., not already documented).

If none of these hit, the workflow posts the run-summary comment itself (with
the rc-docs-sync:v1 marker, diff size, symbol-index size, and the top-changed
file list for human spot-check) and skips the `anthropics/claude-code-action`
step entirely. The safety-net marker step is also gated on the new condition
so it only fires when the agent was actually supposed to run.

Validated against the 27.6 cycle's real diffs:
- RC2→RC3, RC3→RC4, RC4→RC5, RC6→RC7 → has_public_surface=false (fast-path)
- RC7→RC8 → has_public_surface=true (new internal-classified route — agent
  still needed to apply the Step 1.6 internal-vs-public discrimination)
- 27.5→27.6-RC2 → has_public_surface=true (2 new routes + 1 new hook symbol)

Risk: misses behavior-only changes that don't introduce new symbols. The
agent prompt already flags those as uncertain "Needs human verification" cases
and they haven't been observed in any zero-PR run so far. Mitigation: the
fast-path comment includes the "Top changed files" list so a maintainer
sweeping the tracking issue can spot anomalies without reading the diff.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
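
The detection rule described in this commit can be sketched in Python roughly as follows. This is an illustration under assumptions, not the workflow's actual step: the function name, the regexes, and the representation of `symbol-index.txt` as a set of strings are all hypothetical.

```python
import re

# Added lines that register a REST route or a WP-CLI command.
ROUTE_OR_CLI = re.compile(r"\b(?:register_rest_route|WP_CLI::add_command)\s*\(")
# Hook calls whose first argument is a quoted literal; captures the hook name.
HOOK = re.compile(r"\b(?:apply_filters|do_action)\s*\(\s*['\"]([^'\"]+)['\"]")


def has_public_surface(diff_text: str, known_symbols: set) -> bool:
    """Return True if any *added* line introduces new public surface."""
    for line in diff_text.splitlines():
        # Only '+' lines count; '+++ b/file' headers are skipped.
        # Assumption: a real added line never itself starts with '++'.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        if ROUTE_OR_CLI.search(added):
            return True
        if any(sym not in known_symbols for sym in HOOK.findall(added)):
            return True
    return False


diff = (
    "+++ b/src/example.php\n"
    "+\t$title = apply_filters( 'wpseo_title', $title );\n"
    "-\tregister_rest_route( 'yoast/v1', '/old' );\n"
)
print(has_public_surface(diff, {"wpseo_title"}))  # hook already indexed
print(has_public_surface(diff, set()))            # hook not in the index
```

Note that removed lines are ignored by design, so a deleted route does not trigger the agent in this sketch; hooks with dynamically built names are also not matched.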

cloudflare-workers-and-pages Bot commented May 11, 2026

Deploying yoast-developer with Cloudflare Pages

Latest commit: 77fea3a
Status: ✅  Deploy successful!
Preview URL: https://1517eddc.yoast-developer.pages.dev
Branch Preview URL: https://rc-docs-sync-reduce-agent-wo.yoast-developer.pages.dev

enricobattocchi and others added 6 commits May 11, 2026 10:00
Local test against wordpress-seo at 27.6-RC2..RC3 showed the package.json
"pluginVersion" change slipped through the version-bump filter — the
"version" regex doesn't match because pluginVersion lives in the "yoast"
block, separate from npm's top-level "version" field. Add an explicit
'"pluginVersion":' regex so the package.json hunk is also dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
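
Both the `-I` hunk semantics and the `"pluginVersion":` miss can be modeled in a few lines of Python. This is a simplified model, assuming a hunk is dropped only when every changed line matches at least one pattern; it uses Python's `re` rather than git's own regex engine, and the function name is hypothetical.

```python
import re

# The six patterns named in the PR description.
VERSION_PATTERNS = [
    r" \* Version: ",
    r"WPSEO_VERSION",
    r'"version":',
    r'"pluginVersion":',
    r"CURRENT_RELEASE",
    r"MINIMUM_SUPPORTED",
]


def is_version_only(changed_lines, patterns=VERSION_PATTERNS):
    """True if every changed line in a hunk matches some ignore pattern."""
    compiled = [re.compile(p) for p in patterns]
    return bool(changed_lines) and all(
        any(rx.search(line) for rx in compiled) for line in changed_lines
    )


# A pure version bump is dropped.
print(is_version_only([" * Version: 27.6-RC7", " * Version: 27.6-RC8"]))
# A hunk mixing a version line with other code is kept.
print(is_version_only([" * Version: 27.6-RC8", "$enabled = true;"]))
# '"version":' alone does not match a "pluginVersion" line.
print(is_version_only(['  "pluginVersion": "27.6-RC3",'], patterns=[r'"version":']))
```

The last call shows why the extra pattern was needed: the leading quote and lowercase `v` in `"version":` never line up with `"pluginVersion":`.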
The existing :(exclude)tests pathspec only matches a top-level tests/ dir,
so packages/js/tests/**/*.test.js leaked into the filtered diff. The agent
prompt and AGENT_MAP both list filename-suffix patterns (*.test.*, *.spec.*,
*.stories.*) in the noise-exclusion convention; mirror that in the workflow.

Local test against the 27.6 cycle's diffs:
- RC2→RC3 filtered: 141 → 62 lines  (56% smaller)
- RC3→RC4 filtered: 698 → 467 lines (33% smaller)
- RC4→RC5 filtered: 1804 → 1042 lines (42% smaller)
- RC6→RC7 filtered: 927 → 376 lines (59% smaller)
- RC7→RC8 filtered: 918 → 643 lines (30% smaller)

All has_public_surface decisions unchanged (verified: 4 fast-paths and 5
agent-invocations still classified the same way), confirming the exclusion
doesn't change semantics, only trims noise the agent had to read.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
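
As an illustration of the filename-suffix convention, the sketch below classifies paths by basename using Python's `fnmatch`. It is an approximation only: git pathspec matching has its own rules, and the helper name and example paths are hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Suffix globs from the noise-exclusion convention described above.
NOISE_SUFFIX_GLOBS = ("*.test.*", "*.spec.*", "*.stories.*")


def is_test_noise(path: str) -> bool:
    """True if the file name carries a test, spec, or stories suffix."""
    name = basename(path)
    return any(fnmatchcase(name, glob) for glob in NOISE_SUFFIX_GLOBS)


for path in (
    "packages/js/tests/ai/example.test.js",   # hypothetical test file
    "packages/js/src/ai-content-planner/index.js",
    "packages/ui/src/button.stories.js",      # hypothetical story file
):
    print(path, is_test_noise(path))
```

Matching on the basename keeps the rule independent of directory depth, which is the gap the top-level `tests` exclusion left open.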
Two issues in the existing diff-bundle exclusions:

1. ':(exclude)**/__tests__' and ':(exclude)**/__snapshots__' only match the
   directory entry itself, not files inside it, because git pathspec globs
   at non-root depth need a trailing '/**'. Latent bug (no __tests__ in any
   27.6-cycle diff, so unobserved) but now fixed.

2. The yoastseo JS package puts its specs under packages/yoastseo/spec/,
   which wasn't excluded. A single sample-vocabulary.json fixture inside it
   contributes 46,806 of the ~60,000 lines in a stable→RC filtered diff —
   the agent reads it just to ignore it.

Local test against 27.5 → 27.6-RC1 (which has the big spec fixture):
  Before: 59,777 filtered lines (62k full)
  After:    6,775 filtered lines (89% smaller)

All 9 has_public_surface decisions across the 27.6 cycle remain identical
(verified before/after).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
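
To make the corrected exclusions concrete, here is a hedged sketch of how the filtered-diff command line could be assembled in Python. The function name and the argv layout are assumptions; only the three globs (with the trailing `/**` this commit describes) come from the PR text, and the real workflow may build the command differently.

```python
# Exclusion globs after the fix: each ends in '/**' so that the
# directory's contents are excluded, not just the directory entry.
EXCLUDE_GLOBS = [
    "**/__tests__/**",
    "**/__snapshots__/**",
    "**/spec/**",
]


def filtered_diff_argv(prev_ref, rc_ref, exclude_globs=EXCLUDE_GLOBS):
    """Build a `git diff` argv with ':(exclude)' pathspecs appended."""
    argv = ["git", "diff", f"{prev_ref}..{rc_ref}", "--", "."]
    argv += [f":(exclude){glob}" for glob in exclude_globs]
    return argv


print(filtered_diff_argv("27.5", "27.6-RC1"))
```

Passing the result to `subprocess.run` as a list avoids shell quoting problems with the `*` and `(` characters in the pathspecs.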
The earlier fast-path commit narrowed the safety-net step's condition to
'has_public_surface == true', meaning the step would not fire if the new
detect step itself crashed (output unset != 'true'). That regressed the
"no marker is ever missing" invariant the safety net exists to preserve.

The step body already deduplicates by checking for an existing rc-docs-sync:v1
marker on the tracking issue and exiting early if found. So the safest
condition is just 'any_content == true' — the step then runs after every
"agent-or-fast-path-was-supposed-to-handle-this" case and is a no-op when
something earlier did post a marker. Failure modes now covered: fast-path
step crash, detect step crash, agent step OOM, all return-to-the-net.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Saturday and Sunday runs almost always find nothing to process (RCs aren't
typically cut over the weekend) and just add zero-PR noise to the tracking
issue and run history. Restrict the schedule to Mon-Fri 06:00 UTC.

Trade-off: an RC tagged late Friday won't be picked up until Monday morning
(up to ~3 days delay). If a weekend tag needs faster turnaround, the
workflow_dispatch path still works.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
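
The latency trade-off can be checked with a small Python sketch that finds the next Mon-Fri 06:00 UTC slot. The function name is illustrative, and the 18:00 UTC tag time is an assumption standing in for "late Friday"; the PR only gives the date.

```python
from datetime import datetime, timedelta, timezone


def next_scheduled_run(after: datetime) -> datetime:
    """Next 06:00 UTC slot on a weekday, strictly after `after`.

    Mirrors the cron expression '0 6 * * 1-5'. Expects an aware datetime.
    """
    after = after.astimezone(timezone.utc)
    candidate = after.replace(hour=6, minute=0, second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(days=1)
    while candidate.weekday() > 4:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


# RC4 was tagged on Friday 2026-05-01; the time of day is assumed.
tagged = datetime(2026, 5, 1, 18, 0, tzinfo=timezone.utc)
run = next_scheduled_run(tagged)
print(run.isoformat(), "delay:", run - tagged)
```

Under the old daily schedule the same tag would have been picked up on Saturday 2026-05-02 at 06:00 UTC.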
Both "Post no-op summary if filtered diff is empty" and "Post fast-path
marker" steps unconditionally posted a comment to the tracking issue. That
made manual backfills via workflow_dispatch destructive: re-running for an
already-processed RC produced a duplicate marker comment.

Mirror the safety-net's dedup pattern in both steps: check whether a marker
for this (product, rc_tag) already exists on the tracking issue and exit
cleanly if so. Now workflow_dispatch backfills are safe to re-run, including
as a smoke test of this branch's code against a real past RC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
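
The dedup pattern described in this commit amounts to "look for an existing marker, post only if absent". A minimal sketch follows; the exact marker layout is not shown in this PR, so the HTML-comment format below is purely hypothetical, as are the function names.

```python
MARKER_PREFIX = "rc-docs-sync:v1"


def marker_for(product: str, rc_tag: str) -> str:
    # Hypothetical layout; only the 'rc-docs-sync:v1' prefix is from the PR.
    return f"<!-- {MARKER_PREFIX} product={product} rc_tag={rc_tag} -->"


def should_post(existing_comment_bodies, product: str, rc_tag: str) -> bool:
    """True only if no comment already carries this (product, rc_tag) marker."""
    marker = marker_for(product, rc_tag)
    return not any(marker in body for body in existing_comment_bodies)


comments = []
if should_post(comments, "wordpress-seo", "27.6-RC4"):
    comments.append("Run summary\n" + marker_for("wordpress-seo", "27.6-RC4"))

# A second run for the same RC finds the marker and posts nothing.
print(should_post(comments, "wordpress-seo", "27.6-RC4"))
print(should_post(comments, "wordpress-seo", "27.6-RC5"))
```

Keying the marker on both product and RC tag is what makes a `workflow_dispatch` re-run for an already-processed RC a no-op while still allowing other RCs through.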
enricobattocchi merged commit f83376a into main May 11, 2026
3 checks passed
enricobattocchi deleted the rc-docs-sync-reduce-agent-work branch May 11, 2026 09:33