fix: resolve-failures ci_only_failure verdict when fix was applied #1969
Merged
Trecek merged 2 commits on May 6, 2026
- Add Step 2c: Verdict Override Rule (any fix committed + tests pass → verdict = real_fix unconditionally, regardless of failure_subtype)
- Rename Step 2d to "No-Fix Verdict Decision Tree" and scope it to fixes_applied == 0 only
- Remove confusing rows from decision table (fix-applied path is now handled exclusively by Step 2c/Step 3)
- Add Step 2c override to Step 3 green exit condition
- Add Step 2c override to Step 2.5 fix-loop entry
- Update verdict invariant: ci_only_failure never emitted when fix applied
- Add four new structural tests in test_resolve_failures_ci_aware.py (REQ-RF-001/002 guards)
- Minor rewording of Step 2a, 2c, 2d, 2.5 body references to avoid spurious regex matches in existing test patterns

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
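The override rule in the commit message above can be read as a small decision function. The sketch below is only an illustration of that reading, not the skill's actual logic (which lives as prose in SKILL.md). The function names, the `"ci_only"` subtype value, and the `"unresolved"` and `"continue_fix_loop"` return values are hypothetical placeholders; only `real_fix`, `ci_only_failure`, `fixes_applied`, and `failure_subtype` come from the PR.

```python
def no_fix_verdict(failure_subtype: str) -> str:
    """Stand-in for the Step 2d "No-Fix Verdict Decision Tree".

    The real table is not shown in this PR; this mapping is hypothetical.
    """
    if failure_subtype == "ci_only":  # hypothetical subtype value
        return "ci_only_failure"
    return "unresolved"  # hypothetical placeholder verdict


def decide_verdict(fixes_applied: int, tests_pass: bool, failure_subtype: str) -> str:
    # Step 2c override: a committed fix plus passing tests is always real_fix,
    # regardless of failure_subtype.
    if fixes_applied > 0 and tests_pass:
        return "real_fix"
    # Step 2d is consulted only on the no-fix path (fixes_applied == 0).
    if fixes_applied == 0:
        return no_fix_verdict(failure_subtype)
    # A fix was applied but tests still fail: this sketch simply signals
    # that the fix loop is not finished (hypothetical placeholder).
    return "continue_fix_loop"


print(decide_verdict(1, True, "ci_only"))   # override wins over the subtype
print(decide_verdict(0, False, "ci_only"))  # no-fix path still reaches ci_only_failure
```

Putting the override check first is what keeps the no-fix table from ever being evaluated once a fix has landed, which matches the stated invariant that `ci_only_failure` is never emitted when a fix was applied.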
Trecek (Collaborator, Author) commented on May 6, 2026:
AutoSkillit PR Review — Verdict: approved_with_comments
Trecek (Collaborator, Author) commented on May 6, 2026:
AutoSkillit review: warning-only findings detected. See inline comments — no blocking changes required.
Info findings (no action required):
tests/skills/test_resolve_failures_ci_aware.py
- L229 [info/tests]: test_step2d_table_scoped_to_no_fix_path includes "without entering step 3" as a valid matching phrase, but SKILL.md uses "the fix loop" in all prose after the rename. The "without entering step 3" alternative is therefore dead and will never match. Replace it with "fix loop was entered" or similar.
- L204 [info/tests]: test_skill_fix_applied_overrides_to_real_fix uses assert re.search(...) or re.search(...), msg with two alternatives. If both fail, the error message gives no hint as to which pattern was expected. Consider splitting or embedding pattern strings in the failure message.
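One way to act on the L204 finding is a small helper that tries each alternative in turn and names every pattern in the failure message. This is a minimal sketch under assumptions: `assert_any_pattern` is a hypothetical helper, not something that exists in the test file, and the sample text and patterns are made up for illustration.

```python
import re


def assert_any_pattern(text: str, patterns: list[str], flags: int = 0) -> str:
    """Return the first pattern that matches; otherwise fail and list all of them."""
    for pattern in patterns:
        if re.search(pattern, text, flags):
            return pattern
    tried = "\n".join(f"  - {p!r}" for p in patterns)
    raise AssertionError(f"none of the expected patterns matched:\n{tried}")


# Illustrative usage with made-up text and patterns.
sample = "If a fix was committed and tests pass, the verdict is real_fix."
matched = assert_any_pattern(
    sample,
    [r"fix applied.*real_fix", r"fix was committed.*real_fix"],
    re.DOTALL,
)
print(matched)
```

Compared with `assert re.search(a) or re.search(b), msg`, a failure here shows each expected pattern, so a reader does not have to open the test to learn what was being searched for.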
…truncation

Regex r"Step 3.*?Step 4" stopped at the first inline "Step 4" cross-reference inside the Step 3 body rather than at the ### Step 4 section boundary, making the test fragile. Replace with r"### Step 3.*?(?=\n### Step [45]|\Z)" which anchors on Markdown section headers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
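The difference between the two patterns is easy to see on a small sample. The snippet below is an illustrative sketch: the `skill_md` text is invented for the demonstration (it is not the real SKILL.md), and it assumes both patterns are applied with `re.DOTALL`, which the commit message does not state.

```python
import re

# Invented stand-in for a SKILL.md excerpt; note the inline "Step 4"
# cross-reference inside the Step 3 body.
skill_md = (
    "### Step 3: Fix loop\n"
    "Apply the fix. If tests pass, continue to Step 4 below.\n"
    "Verdict override: a committed fix with passing tests yields real_fix.\n"
    "\n"
    "### Step 4: Report\n"
    "Emit the verdict.\n"
)

# Old pattern: the lazy match ends at the first "Step 4" it meets,
# which is the inline cross-reference, so the section is cut short.
fragile = re.search(r"Step 3.*?Step 4", skill_md, re.DOTALL)

# New pattern: the lookahead ends the match only at the next
# "### Step 4" / "### Step 5" header, or at the end of the text.
anchored = re.search(r"### Step 3.*?(?=\n### Step [45]|\Z)", skill_md, re.DOTALL)

print("Verdict override" in fragile.group(0))   # False: body was truncated
print("Verdict override" in anchored.group(0))  # True: whole section captured
```

Because the lookahead does not consume the next header, the anchored match contains the Step 3 section and nothing from Step 4.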
Trecek added a commit that referenced this pull request on May 8, 2026
…1969)

## Summary

The `resolve-failures` SKILL.md has an ambiguous verdict decision flow that allows an LLM executor to emit `ci_only_failure` even after successfully applying a fix. The fix restructures the Step 2d decision tree to make the override rule explicit: **any time a code change is committed and tests pass, the verdict is `real_fix`, regardless of `failure_subtype`**. The Step 2d table is clarified to apply ONLY to the "no fix applied" path, and a post-fix-loop verdict override is added to prevent re-evaluation through the wrong decision path.

## Requirements

- REQ-RF-001: When `resolve-failures` applies a code change AND the subsequent CI run passes, the verdict MUST be `real_fix`, not `ci_only_failure`
- REQ-RF-002: `ci_only_failure` should only be emitted when no fix was applied or when the applied fix did not resolve the CI failure
- REQ-RF-003: The fix must not break the existing `ci_only_failure` path for genuinely unfixable CI failures

Closes #1954

## Implementation Plan

Plan file: `/home/talon/projects/autoskillit-runs/impl-20260505-180603-520539/.autoskillit/temp/make-plan/resolve_failures_ci_only_failure_verdict_fix_plan_2026-05-05_181000.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via AutoSkillit

<!-- autoskillit:pipeline-signature steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr -->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary

The `resolve-failures` SKILL.md has an ambiguous verdict decision flow that allows an LLM executor to emit `ci_only_failure` even after successfully applying a fix. The fix restructures the Step 2d decision tree to make the override rule explicit: any time a code change is committed and tests pass, the verdict is `real_fix`, regardless of `failure_subtype`. The Step 2d table is clarified to apply ONLY to the "no fix applied" path, and a post-fix-loop verdict override is added to prevent re-evaluation through the wrong decision path.

Requirements

- REQ-RF-001: When `resolve-failures` applies a code change AND the subsequent CI run passes, the verdict MUST be `real_fix`, not `ci_only_failure`
- REQ-RF-002: `ci_only_failure` should only be emitted when no fix was applied or when the applied fix did not resolve the CI failure
- REQ-RF-003: The fix must not break the existing `ci_only_failure` path for genuinely unfixable CI failures

Closes #1954

Implementation Plan

Plan file: `/home/talon/projects/autoskillit-runs/impl-20260505-180603-520539/.autoskillit/temp/make-plan/resolve_failures_ci_only_failure_verdict_fix_plan_2026-05-05_181000.md`

🤖 Generated with Claude Code via AutoSkillit
Token Usage Summary
Token Efficiency