fix(maestro): prefer terminal failure over transient retries in parseMaestroFailure (closes #118)#175
Merged
Conversation
…MaestroFailure Closes #118. PR #115's parseMaestroFailure ran each pattern against the whole buffer and returned the first lexical match anywhere in the output. On maestro-runner runs with retry hooks active, this captured a transient `[INFO] Element with id "X" not found in current screen — retrying` line earlier in the buffer — even when the actual terminal failure was on a different selector. The transient (already-resolved) selector then got sent to cdp_repair_action, burning a 24h-budget slot on a non-existent problem and missing the real failure. Fix: nested-loop selection that preserves BOTH invariants: 1. Outer loop walks PATTERNS in order (most-specific first). The first pattern that hits any line wins, regardless of line position. Keeps the existing pattern-specificity invariant (e.g. 1.0.9 `id=` shape outranks the catch-all `Element 'X' not found`). 2. Inner loop scans lines from END to START. Within a single pattern, the last matching line — the terminal failure — wins over earlier transient retries. Falls back to a whole-buffer scan if no line matches a known pattern, preserving prior behavior for single-line inputs and any future pattern that spans a `\n`. Tests: - Inverted `returns first match` → `returns LAST match` (the existing test encoded the broken behavior) - Added GH #118 transient-retry-then-real-failure shape using the issue's exact example - Added single-line input test confirming the fallback whole-buffer scan path still works - Updated the section comment that documented the old "first-match" semantic - Existing 1.0.9 `id=` priority test still passes (proves pattern specificity is preserved over line position) Verified: 1467/1467 cdp-bridge unit tests passing (+2 net new). Note: codex-pair flagged a hypothetical edge case (multiple known failure snippets on a SINGLE line — `line.match()` returns leftmost). Not addressing in this PR — no real-world maestro-runner output emits multiple failure snippets per line, and the recommended fix (collect all match indexes and return rightmost) materially complicates the code for an unobserved case. Can revisit with real captures if it ever surfaces. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #118 —
parseMaestroFailurewas returning the first lexical match anywhere in the buffer, which on real maestro-runner output captured a transient[INFO] ... not found ... retryingline and sent the (already-resolved) selector tocdp_repair_action. Burned auto-repair budget on a non-existent problem; missed the real failure.Fix
Nested-loop selection that preserves both invariants:
Falls back to a whole-buffer scan when no line matches — preserves prior behavior for single-line inputs and defensively covers any future pattern that straddles a
\n.Concretely, for the issue's example:
transient-foo(first lexical match) →cdp_repair_actiontries to repair a working selectorreal-failure(terminal match) → auto-repair targets the actual broken selectorThe 1.0.9
Element not found: id=...pattern still wins over the genericElement 'X' not foundfallback even when they appear on adjacent lines — covered by the pre-existing1.0.9 id= shape has priority over the generic fallbacktest, which would have caught a naive line-first-then-pattern selection.Test plan
returns first matchtest toreturns LAST match(the test had been encoding the regression behavior)id=priority, generic fallback, quote handling, etc.)Out of scope
line.match()would return the leftmost. No real-world maestro-runner output emits this shape; the recommended fix (collect all match indexes per line + return rightmost) materially complicates the code for an unobserved case. Can revisit with real captures.[ERROR]/FAILED:prefix-preference layer (strategy 1 in the issue) — terminal-match alone fixes the reported behavior without depending on prefix detection that would miss vanilla Maestro CLI output (which doesn't emit[ERROR]markers).Refs
scripts/cdp-bridge/src/domain/maestro-error-parser.ts:parseMaestroFailurescripts/cdp-bridge/test/unit/maestro-error-parser.test.js🤖 Generated with Claude Code