Skip to content

fix(browser): __piloRefMap fallback when data-pilo-ref is stripped#450

Open
lmorchard wants to merge 1 commit into
mainfrom
fix/refmap-fallback
Open

fix(browser): __piloRefMap fallback when data-pilo-ref is stripped#450
lmorchard wants to merge 1 commit into
mainfrom
fix/refmap-fallback

Conversation

@lmorchard
Copy link
Copy Markdown
Collaborator

Summary

  • When React-style DOM reconciliation strips the data-pilo-ref attribute between snapshot and click, the attribute-selector lookup in validateElementRef returned 0 matches and threw InvalidRefException even though the Element reference held in globalThis.__piloRefMap was still valid.
  • This conflated real failures (LLM hallucination, element genuinely removed) with recoverable failures (attribute stripped, element still on the page), making stale_element_refs-classified failures in eval logs a mix of distinct root causes — and unnecessarily aborting agent tasks on React-heavy sites (Booking, Google Flights, Apple).

Design Decisions

  • Fall through to __piloRefMap only on the 0-match path. Preserves the happy-path cost (one selector lookup) when the attribute is intact (common case) and the loud-failure contract on the >1 match path.
  • Re-attach the attribute on recovery rather than returning an ElementHandle. Keeps validateElementRef's return type stable as Locator and helps any subsequent lookup in the same iteration find the element via the existing selector path.
  • Verify Element.isConnected and ownerDocument === document before recovering. Detached elements still fail loudly; same-origin iframe elements (which the snapshot walks inline but the main-frame locator can't reach) continue to fail with immediate InvalidRefException rather than recovering and then timing out at click time with an opaque error.
  • No changes to error messages or InvalidRefException shape. Recovery is silent; callers see either a working locator or the same exception as today.

Changes

  • packages/core/src/browser/playwrightBrowser.tsvalidateElementRef falls back to page.evaluate consulting globalThis.__piloRefMap when the attribute selector returns 0. Re-attaches and re-queries via the locator on success.
  • packages/core/test/playwrightBrowser.test.ts — three new tests (recovery success path, no-map-entry miss, detached-element miss). Two pre-existing tests updated to include evaluate in their mock page, since the new code path consults it on the 0-match branch.
  • packages/extension/src/background/ExtensionBrowser.ts — same recovery logic in the injected scripting.executeScript function. Simpler than the Playwright case because the injected code already runs in the page context (direct Map.get + setAttribute, no evaluate round-trip).

Test Plan

  • pnpm --filter pilo-core run test passes — 672/672, including 3 new tests
  • pnpm --filter pilo-extension run test passes — 266/266
  • pnpm run check passes cleanly across all four packages
  • gitleaks detect clean
  • Manual eval spot-check on a React-heavy site (Booking or Google Flights) after merge — confirm fewer stale_element_refs classifications from attribute-strip cases. Not a merge blocker; will run via the standard eval pipeline.

Why no automated test for the extension change

The injected function runs in the Chrome page context via scripting.executeScript and isn't exercised by the extension package's Vitest unit suite. Refactoring to extract a separately-testable helper was explicitly out of scope per the spec. The pnpm --filter pilo-extension run test:e2e harness covers the live execution path.

References

🤖 Generated with Claude Code

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves element-ref validation for React-heavy pages by recovering from cases where data-pilo-ref gets stripped between snapshot generation and an action, using the page-side globalThis.__piloRefMap as a fallback while keeping existing failure behavior for true invalid refs.

Changes:

  • Core (Playwright): validateElementRef falls back to __piloRefMap only when the selector finds 0 matches, reattaches data-pilo-ref, then proceeds via the selector path again.
  • Core tests: adds coverage for the recovery path and the fallback miss cases; updates existing mocks to include evaluate where needed.
  • Extension: adds equivalent recovery logic inside the scripting.executeScript page-context function.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
packages/core/src/browser/playwrightBrowser.ts Adds __piloRefMap fallback + attribute reattachment when [data-pilo-ref=...] lookup returns 0.
packages/core/test/playwrightBrowser.test.ts Adds unit tests for recovery/miss paths and updates mocks for the new evaluate call.
packages/extension/src/background/ExtensionBrowser.ts Adds __piloRefMap fallback in the injected action executor when data-pilo-ref is missing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

}, ref);

if (!reattached) throw new InvalidRefException(ref);
return this.page.locator(`[data-pilo-ref="${ref}"]`);
…is stripped

When React-style DOM reconciliation strips the data-pilo-ref attribute
between snapshot generation and click execution, the attribute-selector
lookup in validateElementRef returned 0 matches and threw
InvalidRefException — even though the Element reference held in
globalThis.__piloRefMap was still valid. The agent then logged the
failure as 'Invalid element reference', mixing it together with
genuine LLM hallucinations in failure traces.

Adds a fallback path in both validateElementRef (playwrightBrowser.ts)
and the injected lookup function (ExtensionBrowser.ts): when the
attribute selector returns 0, consult globalThis.__piloRefMap, verify
the mapped Element is connected to the DOM and belongs to the current
frame's document, re-attach the attribute, and proceed.

Post-recovery the Playwright path re-runs count() on the new locator
and throws InvalidRefException for 0 or >1 — preserving the same
validation semantics as the non-fallback path against a reconciliation
race that strips the attribute again, or against an unexpected
duplicate attribute landing.

Genuine hallucinations (ref absent from the map) and genuine DOM
removals (Element.isConnected === false) still fail loudly via the
existing InvalidRefException path.

The ownerDocument check excludes same-origin iframe elements that the
snapshot may have walked inline — those can't be acted on from the
main-document context anyway, and recovering them would push the
failure to click time as an opaque 30s locator timeout. With the
guard, iframe refs continue to fail with the same immediate
InvalidRefException as before.

Closes #449
@lmorchard lmorchard force-pushed the fix/refmap-fallback branch from b9c128b to 15571c6 Compare May 15, 2026 14:24
@lmorchard
Copy link
Copy Markdown
Collaborator Author

Addressed Copilot's review comment on the post-recovery validation. After re-attaching the attribute, validateElementRef now re-runs count() on the new locator and throws InvalidRefException for 0 / >1 matches — preserving the same validation semantics as the non-fallback path against reconciliation races and unexpected duplicates. Added two tests covering the new branches (post-recovery zero / multiple).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fall back to __piloRefMap when [data-pilo-ref] selector returns 0 in validateElementRef

2 participants