
fix(maestro): prefer terminal failure over transient retries in parseMaestroFailure (closes #118) #175

Merged
Lykhoyda merged 1 commit into main from fix/gh-118-maestro-parser-last-failure
May 19, 2026

Conversation

@Lykhoyda
Owner

Summary

Closes #118. parseMaestroFailure was returning the first lexical match anywhere in the buffer. On real maestro-runner output, that match was a transient [INFO] ... not found ... retrying line, so the (already-resolved) selector was sent to cdp_repair_action. This burned auto-repair budget on a non-existent problem and missed the real failure.

Fix

Nested-loop selection that preserves both invariants:

| Outer | Inner | Effect |
| --- | --- | --- |
| PATTERNS in declared order (most-specific first) | Lines in reverse (end → start) | Pattern specificity outranks line position; within a single pattern the terminal (last) match wins |

Falls back to a whole-buffer scan when no line matches — preserves prior behavior for single-line inputs and defensively covers any future pattern that straddles a \n.
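
The selection order described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `FailurePattern` shape, the three regexes, and the name `parseFailureSketch` are assumptions made for the example, and the real `PATTERNS` table in cdp-bridge may look different.

```typescript
// Hypothetical pattern table; the real PATTERNS in cdp-bridge may differ.
interface FailurePattern {
  name: string;
  regex: RegExp; // capture group 1 is assumed to hold the selector
}

const PATTERNS: FailurePattern[] = [
  // Most specific first: the 1.0.9 `id=` shape.
  { name: "id-shape", regex: /Element not found: id=(\S+)/ },
  // Quoted-id shape, as seen in the issue's example log.
  { name: "quoted-id", regex: /Element with id "([^"]+)" not found/ },
  // Generic catch-all.
  { name: "generic", regex: /Element '([^']+)' not found/ },
];

function parseFailureSketch(output: string): string | null {
  const lines = output.split("\n");

  // Outer loop: patterns in declared order, so specificity outranks position.
  for (const { regex } of PATTERNS) {
    // Inner loop: lines from end to start, so the terminal match wins.
    for (let i = lines.length - 1; i >= 0; i--) {
      const m = (lines[i] ?? "").match(regex);
      if (m) return m[1] ?? null;
    }
  }

  // Fallback: whole-buffer scan, for any pattern that could span a newline.
  for (const { regex } of PATTERNS) {
    const m = output.match(regex);
    if (m) return m[1] ?? null;
  }
  return null;
}
```

Putting the pattern loop on the outside means a more specific pattern on an early line still beats a generic pattern on a later line. The reverse line scan only breaks ties within one pattern.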

Concretely, for the issue's example:

[INFO] Tapping on element with id "transient-foo"
[INFO] Element with id "transient-foo" not found in current screen — retrying
[INFO] Tapping on element with id "transient-foo"   ← retry succeeded
[ERROR] Element with id "real-failure" not found
Test FAILED
  • Before: parser returns transient-foo (first lexical match) → cdp_repair_action tries to repair a working selector
  • After: parser returns real-failure (terminal match) → auto-repair targets the actual broken selector
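
To make the before/after difference concrete, here is a small sketch that runs both strategies over the issue's log. The single regex `QUOTED_ID` and the function names are hypothetical stand-ins; the real parser uses its own pattern list.

```typescript
// Hypothetical stand-in for one of the parser's patterns.
const QUOTED_ID = /Element with id "([^"]+)" not found/;

// The example log from the issue, without the "retry succeeded" annotation.
const issueLog = [
  '[INFO] Tapping on element with id "transient-foo"',
  '[INFO] Element with id "transient-foo" not found in current screen — retrying',
  '[INFO] Tapping on element with id "transient-foo"',
  '[ERROR] Element with id "real-failure" not found',
  "Test FAILED",
].join("\n");

// Before: one match against the whole buffer returns the leftmost hit.
function firstLexicalMatch(output: string): string | null {
  const m = output.match(QUOTED_ID);
  return m ? (m[1] ?? null) : null;
}

// After: scan lines from the end and return the last matching line's selector.
function terminalMatch(output: string): string | null {
  const lines = output.split("\n");
  for (let i = lines.length - 1; i >= 0; i--) {
    const m = (lines[i] ?? "").match(QUOTED_ID);
    if (m) return m[1] ?? null;
  }
  return null;
}

const beforeSelector = firstLexicalMatch(issueLog); // "transient-foo"
const afterSelector = terminalMatch(issueLog); // "real-failure"
```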

The 1.0.9 Element not found: id=... pattern still wins over the generic Element 'X' not found fallback even when they appear on adjacent lines. This is covered by the pre-existing test "1.0.9 id= shape has priority over the generic fallback", which would have caught a naive line-first-then-pattern selection.
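
The sketch below shows why loop order matters here. It assumes two simplified regexes for the `id=` shape and the generic fallback, plus an invented two-line input. None of these are taken from the real test suite.

```typescript
// Simplified, assumed versions of the two patterns discussed above.
const ID_SHAPE = /Element not found: id=(\S+)/;
const GENERIC = /Element '([^']+)' not found/;
const ORDERED: RegExp[] = [ID_SHAPE, GENERIC]; // most specific first

// Naive variant: lines outermost, patterns innermost.
// The last line wins even if only the generic pattern matches it.
function lineFirst(output: string): string | null {
  const lines = output.split("\n");
  for (let i = lines.length - 1; i >= 0; i--) {
    for (const re of ORDERED) {
      const m = (lines[i] ?? "").match(re);
      if (m) return m[1] ?? null;
    }
  }
  return null;
}

// Variant described in this PR: patterns outermost, lines innermost.
function patternFirst(output: string): string | null {
  const lines = output.split("\n");
  for (const re of ORDERED) {
    for (let i = lines.length - 1; i >= 0; i--) {
      const m = (lines[i] ?? "").match(re);
      if (m) return m[1] ?? null;
    }
  }
  return null;
}

// Invented input with both shapes on adjacent lines.
const adjacent = "Element not found: id=login-button\nElement 'Login' not found";
const naiveResult = lineFirst(adjacent); // "Login"
const orderedResult = patternFirst(adjacent); // "login-button"
```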

Test plan

  • 1467/1467 cdp-bridge unit tests passing locally (+2 net new)
  • Inverted the existing `returns first match` test to `returns LAST match` (the test had been encoding the regression behavior)
  • Added the issue's exact transient-retry shape as a regression test
  • Added single-line input test to exercise the fallback whole-buffer path
  • Updated stale section comment that documented the old first-match semantic
  • All pre-existing pattern-specificity tests still pass (1.0.9 id= priority, generic fallback, quote handling, etc.)
  • CI green
  • Live verification: any future maestro-runner failure with retry hooks active now picks the terminal selector

Out of scope

  • Hypothetical edge case flagged by codex-pair: a single line containing multiple failure snippets where line.match() would return the leftmost. No real-world maestro-runner output emits this shape; the recommended fix (collect all match indexes per line + return rightmost) materially complicates the code for an unobserved case. Can revisit with real captures.
  • The [ERROR]/FAILED: prefix-preference layer (strategy 1 in the issue) — terminal-match alone fixes the reported behavior without depending on prefix detection that would miss vanilla Maestro CLI output (which doesn't emit [ERROR] markers).
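
For reference, the deferred "rightmost match per line" idea from the first item could look something like the sketch below. It is not part of this PR. The helper name `rightmostSelector`, the choice to carry over only the `i` and `m` flags, and the sample line are all assumptions for illustration.

```typescript
// Returns capture group 1 of the rightmost match of `pattern` in `line`,
// or null when nothing matches. `line.match()` alone returns the leftmost.
function rightmostSelector(line: string, pattern: RegExp): string | null {
  // exec() only advances through a string when the regex is global,
  // so rebuild the pattern with "g" (carrying over the i and m flags).
  const flags =
    "g" + (pattern.ignoreCase ? "i" : "") + (pattern.multiline ? "m" : "");
  const re = new RegExp(pattern.source, flags);

  let last: string | null = null;
  let m: RegExpExecArray | null;
  while ((m = re.exec(line)) !== null) {
    last = m[1] ?? null;
    // Guard against an infinite loop on zero-length matches.
    if (m[0].length === 0) re.lastIndex++;
  }
  return last;
}

// Invented single-line input with two failure snippets.
const doubled =
  'Element with id "a" not found; Element with id "b" not found';
const rightmost = rightmostSelector(
  doubled,
  /Element with id "([^"]+)" not found/
); // "b"
```

Even in this small form, every pattern needs a global copy and a loop per line, which is the extra complexity the note above declines to take on without real captures.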

Refs

🤖 Generated with Claude Code

…MaestroFailure

Closes #118.

PR #115's parseMaestroFailure ran each pattern against the whole buffer
and returned the first lexical match anywhere in the output. On
maestro-runner runs with retry hooks active, this captured a transient
`[INFO] Element with id "X" not found in current screen — retrying` line
earlier in the buffer — even when the actual terminal failure was on a
different selector. The transient (already-resolved) selector then got
sent to cdp_repair_action, burning a 24h-budget slot on a non-existent
problem and missing the real failure.

Fix: nested-loop selection that preserves BOTH invariants:

  1. Outer loop walks PATTERNS in order (most-specific first). The first
     pattern that hits any line wins, regardless of line position. Keeps
     the existing pattern-specificity invariant (e.g. 1.0.9 `id=` shape
     outranks the catch-all `Element 'X' not found`).
  2. Inner loop scans lines from END to START. Within a single pattern,
     the last matching line — the terminal failure — wins over earlier
     transient retries.

Falls back to a whole-buffer scan if no line matches a known pattern,
preserving prior behavior for single-line inputs and any future pattern
that spans a `\n`.

Tests:
- Inverted `returns first match` → `returns LAST match` (the existing
  test encoded the broken behavior)
- Added GH #118 transient-retry-then-real-failure shape using the issue's
  exact example
- Added single-line input test confirming the fallback whole-buffer scan
  path still works
- Updated the section comment that documented the old "first-match"
  semantic
- Existing 1.0.9 `id=` priority test still passes (proves pattern
  specificity is preserved over line position)

Verified: 1467/1467 cdp-bridge unit tests passing (+2 net new).

Note: codex-pair flagged a hypothetical edge case (multiple known
failure snippets on a SINGLE line — `line.match()` returns leftmost).
Not addressing in this PR — no real-world maestro-runner output emits
multiple failure snippets per line, and the recommended fix (collect
all match indexes and return rightmost) materially complicates the
code for an unobserved case. Can revisit with real captures if it
ever surfaces.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Lykhoyda Lykhoyda merged commit 3af569f into main May 19, 2026
7 checks passed
@Lykhoyda Lykhoyda deleted the fix/gh-118-maestro-parser-last-failure branch May 19, 2026 21:59
Development

Successfully merging this pull request may close these issues.

Maestro error parser: prefer terminal failure lines over INFO/DEBUG noise

1 participant