Harden skill-availability check: require two exact-form failures before noop #133
Merged
theletterf merged 1 commit into main on May 5, 2026
The openings sweep noop'd on its first real run on
elastic/docs-content despite the skill being available, because
the agent stochastically invoked the skill with a non-exact form
the first time:
11:33:22 ✗ skill(docs-page-opening-optimizer) Skill not found
11:33:27 ● noop "docs-page-opening-optimizer skill unavailable..."
11:33:33 ✓ skill(skill: docs-page-opening-optimizer) succeeded
The prompt's "abort with noop on skill failure" rule fired on the
first non-exact attempt, before the agent retried with the correct
`skill(skill: …)` form and got results. By the time the exact
invocation succeeded, the noop was already committed.
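The ordering problem described above can be checked mechanically. The following is a minimal sketch, assuming the agent log keeps the `HH:MM:SS <marker> <call>` shape of the three lines quoted above; the `Event` type and both function names are hypothetical and not part of the workflow tooling.

```python
import re
from dataclasses import dataclass

# Assumed line shape, modeled on the quoted log: "HH:MM:SS <marker> <text>".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) (\S) (.+)$")


@dataclass
class Event:
    time: str
    marker: str  # "✗" failed call, "●" safe output, "✓" successful call
    text: str


def parse(log: str) -> list[Event]:
    """Turn raw log text into events, skipping lines that do not match."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append(Event(*match.groups()))
    return events


def noop_preceded_success(events: list[Event]) -> bool:
    """Return True if a noop was emitted before a later successful skill call."""
    noop_seen = False
    for event in events:
        if event.marker == "●" and event.text.startswith("noop"):
            noop_seen = True
        elif event.marker == "✓" and event.text.startswith("skill(") and noop_seen:
            return True
    return False


LOG = """\
11:33:22 ✗ skill(docs-page-opening-optimizer) Skill not found
11:33:27 ● noop "docs-page-opening-optimizer skill unavailable..."
11:33:33 ✓ skill(skill: docs-page-opening-optimizer) succeeded
"""

if __name__ == "__main__":
    print(noop_preceded_success(parse(LOG)))
```

Run against the quoted log, the check flags the run because the noop at 11:33:27 comes before the successful call at 11:33:33.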
Frontmatter, applies-to, and style produced fix-issues fine in the
same orchestrator run. The difference: the frontmatter sweep
prompt explicitly tolerates partial skill failures ("if one fails,
report only the other"); applies-to and style happened to land on
the exact form first try. Openings was the unlucky one.
Lift the docs-review pattern (PRs #117/#118) where skill
availability is only declared after a confirmed exact-form
failure. Apply to the three sweeps that had the strict
abort-on-first-failure rule:
- openings (was hit by this in real run 25373751367)
- applies-to (would hit it next time the agent picks the wrong form)
- style (same)
Each now spells out the procedure: try the exact form, retry
once on failure, only noop after the second exact-form failure.
A single non-exact failure is not sufficient evidence.
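The rule is written as prompt text for the agent, not as code, but the decision it describes can be sketched as a small function. This is an illustration only: `is_exact_form`, `should_noop`, and the regular expression are hypothetical names and assumptions, not part of the sweep prompts.

```python
import re

# Assumption: the "exact form" is the literal text skill(skill: <name>).
EXACT_FORM = re.compile(r"^skill\(skill: [\w-]+\)$")


def is_exact_form(call: str) -> bool:
    """Return True if the call text uses the exact skill(skill: ...) form."""
    return bool(EXACT_FORM.match(call))


def should_noop(attempts: list[tuple[str, bool]]) -> bool:
    """Decide whether a noop is justified after the given attempts.

    attempts holds (call text, succeeded) pairs in order. Any success rules
    out a noop; otherwise a noop needs at least two exact-form failures.
    Failures of a non-exact form are not counted as evidence.
    """
    exact_failures = 0
    for call, succeeded in attempts:
        if succeeded:
            return False
        if is_exact_form(call):
            exact_failures += 1
    return exact_failures >= 2


if __name__ == "__main__":
    first_run = [("skill(docs-page-opening-optimizer)", False)]
    print(should_noop(first_run))
```

Under this rule the single non-exact failure from the original run would not have justified a noop.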
The frontmatter prompt already had a softer fallback ("merge what
works, note any failure in Notes"); coherence and staleness use
different mechanisms (>50% MCP failure threshold and full-repo
deterministic respectively); typos doesn't import any skill.
None of those need this change.
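For contrast, a threshold rule of the kind mentioned for coherence looks at a whole batch of calls instead of aborting on one. The sketch below assumes the rule is evaluated after all calls have finished and that each call is recorded as a boolean; `exceeds_failure_threshold` is a hypothetical name, and the actual coherence prompt may word the rule differently.

```python
def exceeds_failure_threshold(call_results: list[bool], threshold: float = 0.5) -> bool:
    """Return True if the share of failed calls is strictly above threshold.

    call_results holds one boolean per call, True for success. An empty
    batch never exceeds the threshold.
    """
    if not call_results:
        return False
    failure_rate = call_results.count(False) / len(call_results)
    return failure_rate > threshold


if __name__ == "__main__":
    # One failure out of three calls stays under the 50% threshold.
    print(exceeds_failure_threshold([True, True, False]))
```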
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
theletterf added a commit that referenced this pull request on May 5, 2026
Reverses the strict abort rule from #133 in the openings, applies-to, and style sweeps.

Real-run evidence (elastic/docs-content runs 25377518194 and 25378183549, both openings) showed the hardened "two failures = noop" rule firing prematurely:
- 13:09:34 ✗ skill(docs-page-opening-optimizer) Skill not found
- 13:09:37 ✗ skill(docs-page-opening-optimizer) Skill not found
- 13:09:41 ● noop "skill unavailable — confirmed after exact-form attempts"
- 13:14:36 ● create_issue "shard 20/28 — 20 pages" ← agent eventually produced an issue, but suppressed because noop already fired

The agent's tool serialization keeps logging the skill call as `skill(docs-X)`, without the `skill:` prefix, which always returns "Skill not found" from Copilot CLI. Whether the agent's underlying invocation was actually reformatted, or whether it's a log-rendering quirk, doesn't matter: my hardening committed the agent to noop after two of these "failures", before it had a chance to do useful work.

Replace with frontmatter's softer pattern: try the skill once; if "Skill not found" comes back, fall back to manual analysis using bash + the agent's own judgment, and note the skill failure once in the issue body's Notes section. Only noop if even manual analysis produces no high-confidence findings.

Applied uniformly to:
- gh-aw-docs-openings-sweep.md (the sweep that's been noop'ing)
- gh-aw-docs-applies-to-sweep.md
- gh-aw-docs-style-sweep.md

Frontmatter, coherence, staleness, and typos already had softer or deterministic-only fallback paths and don't need this change.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
The openings sweep noop'd on its first run after the master+subs refactor on elastic/docs-content despite `docs-page-opening-optimizer` being available. The cause was a stochastic agent retry pattern interacting with an over-aggressive abort rule:

The agent first invoked the skill with a shortened form (no `skill:` prefix), got "Skill not found", and honored the prompt's "abort with noop on skill failure" rule. Six seconds later it retried with the correct exact form `skill(skill: docs-page-opening-optimizer)`, which worked. By that point the noop was already committed; no fix-issue ever landed.

Frontmatter, applies-to, and style each produced their fix-issues in the same orchestrator run. They either got lucky with the exact form on the first try or (in frontmatter's case) ran under a prompt that explicitly tolerates partial skill failures. Openings was the sweep that hit the strict abort rule.

Fix
Lift the docs-review hardening from PRs #117 / #118: only treat a skill as unavailable after a confirmed exact-form failure. Each of the three strict-abort sweeps now spells out the procedure:
- Try the exact form `skill(skill: docs-X)`.
- On failure, retry the exact form once.
- Only after the second exact-form failure, call `noop`.

A single first-attempt failure (especially with a non-exact form) is explicitly not sufficient evidence to noop.

Applied to:
- `gh-aw-docs-openings-sweep.md` (the sweep that actually got bitten)
- `gh-aw-docs-applies-to-sweep.md` (same abort wording, would hit next time the agent picks the wrong form)
- `gh-aw-docs-style-sweep.md` (same)

Sweeps left unchanged because their failure handling is already softer or uses a different mechanism:
- `frontmatter`: explicitly tolerates partial failures ("merge what works, note in `Notes`")
- `coherence`: uses a >50%-MCP-failure threshold instead of single-call abort
- `staleness`: deterministic pre-step + MCP for one optional category; no skill import
- `typos`: no skill import at all

Test plan
- … v1.
- Run `docs-openings-sweep` directly via `gh workflow run docs-openings-sweep.yml` from elastic/docs-content. Confirm a `docs-fix:openings` issue is opened (or at minimum that the agent log shows the second exact-form retry happening before any noop).
- Run `docs-applies-to-sweep` and `docs-style-sweep` similarly. Confirm no regression: they kept producing issues in the previous run and should keep doing so now.

🤖 Generated with Claude Code