fix(cua-driver-rs)(windows)(#1621): x,y click skips UIA Invoke on canvas-like surfaces #1622

Merged: f-trycua merged 1 commit into main from fix/cua-driver-rs-windows-click-xy-skip-canvas-uia on May 21, 2026
Conversation

@f-trycua f-trycua commented May 21, 2026

Summary

click(x, y) was routing through try_invoke_in_window_at_point for any UIA element advertising InvokePattern at the click coordinates. For container surfaces (Pane / Image / Custom / Document / Group) Invoke() fires the element's default action at its centre and ignores the requested (x, y) — silently breaking pixel precision on canvases, paint surfaces, image maps, and 3D viewports.

The fix adds is_coord_independent_action() — a control-type whitelist for elements whose primary action is coord-independent (Button, MenuItem, Hyperlink, TabItem, ListItem, CheckBox, RadioButton, SplitButton, TreeItem). For these, UIA Invoke is semantically correct. For everything else, even if the element advertises InvokePattern, fall through to PostMessage with the literal coords.
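The whitelist can be sketched roughly as below. This is a minimal illustration, not the code from `windows_enum.rs`: the `ControlType` enum is a hypothetical stand-in for the UIA control-type IDs, and the function signature is an assumption. Only the function name and the list of whitelisted types come from this PR.

```rust
/// Hypothetical stand-in for the UIA control types named in this PR.
/// The real code works with UIA control-type IDs; this enum only
/// illustrates the classification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ControlType {
    // Whitelisted: the element identity is the action target.
    Button,
    MenuItem,
    Hyperlink,
    TabItem,
    ListItem,
    CheckBox,
    RadioButton,
    SplitButton,
    TreeItem,
    // Container-like surfaces where the click position matters.
    Pane,
    Image,
    Custom,
    Document,
    Group,
}

/// Returns true when the control's primary action does not depend on
/// where inside the element the click lands (signature is assumed).
pub fn is_coord_independent_action(control_type: ControlType) -> bool {
    matches!(
        control_type,
        ControlType::Button
            | ControlType::MenuItem
            | ControlType::Hyperlink
            | ControlType::TabItem
            | ControlType::ListItem
            | ControlType::CheckBox
            | ControlType::RadioButton
            | ControlType::SplitButton
            | ControlType::TreeItem
    )
}
```

Writing the check as a whitelist rather than a blacklist means any control type not listed, including ones not shown in this sketch, defaults to the coordinate-preserving PostMessage path.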

Why this asymmetry

The UIA-first path was added so click(x, y) could reach UWP / WebView2 / DirectComposition surfaces where PostMessage(WM_LBUTTONDOWN) no-ops (input routes via Windows.UI.Input instead of the message queue). That's still valuable for buttons + menu items in those targets — which is exactly what the whitelist preserves. But invoking the container of a canvas, paint surface, or 3D viewport gives the caller the canvas's geometric centre instead of the requested coords — that's the user-visible bug from #1621.

Repro (from issue #1621)

cua-driver call click '{"pid":<edge-pid>,"x":110,"y":677}'
# Before: "✅ Performed UIA Invoke at (110,677)"
#         #canvas-status shows: "canvas: clicked at (152,77)" — canvas centre
# After:  "✅ Posted click to pid <pid>"
#         #canvas-status shows: "canvas: clicked at (<computed x>, <computed y>)" — requested coords

What this does NOT change

  • Element-indexed click (click(element_index=N)) is unchanged — different code site (impl_.rs:1168). Callers asking for "invoke this specific element" by index keep the previous behaviour, regardless of control type.
  • Right-click + multi-click already skipped the UIA path (use_uia = (btn == "left" || btn == "middle") && count == 1); they remain on PostMessage.
  • ExpandCollapsePattern preference for Qt menu items (added in #1566, "feat(cua-driver-rs): Phase 2 panel + structural fixes (opt-in)") is unchanged: it runs after the new control-type filter and only fires when a whitelisted-type element also has ExpandCollapse.
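Taken together, the button/count gate and the new control-type filter amount to a small routing decision. The sketch below is an assumption-laden illustration: `Route`, `HitElement`, and `choose_route` are hypothetical names, and only the `use_uia` expression is taken from the description above.

```rust
/// Hypothetical result of the routing decision for click(x, y).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    UiaInvoke,
    PostMessage,
}

/// Hypothetical summary of the UIA element found at the click point.
#[derive(Debug, Clone, Copy)]
pub struct HitElement {
    pub has_invoke: bool,
    /// Result of the control-type whitelist check.
    pub coord_independent: bool,
}

/// Gate quoted in the PR: only single left or middle clicks may use UIA.
pub fn use_uia(btn: &str, count: u32) -> bool {
    (btn == "left" || btn == "middle") && count == 1
}

/// Picks the delivery path for a coordinate click (illustrative only).
pub fn choose_route(btn: &str, count: u32, hit: Option<HitElement>) -> Route {
    if !use_uia(btn, count) {
        return Route::PostMessage;
    }
    match hit {
        // Whitelisted type that advertises Invoke: UIA Invoke is safe.
        Some(h) if h.has_invoke && h.coord_independent => Route::UiaInvoke,
        // Canvas-like surface, or nothing invokable: keep literal coords.
        _ => Route::PostMessage,
    }
}
```

For example, a single left click on an Image that advertises Invoke would map to `Route::PostMessage` in this model, while the same click on a Button would map to `Route::UiaInvoke`.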

Test plan

  • cargo check -p platform-windows clean (41.82s, 0 new warnings)

  • cargo build --release -p cua-driver clean (24.43s release)

  • Runtime verification on RDP-connected interactive session:

    1. Open Edge to libs/cua-driver-fixtures/test_page.html with the four flags from #1620 (cua-driver-rs Windows: launch_app should auto-inject anti-throttling flags for hidden Chromium browsers).
    2. get_window_state to find the canvas bounding rectangle.
    3. click(pid, x, y) at a non-centre point inside the canvas (e.g. canvas-relative (30, 30)).
    4. Tool response should say "✅ Posted click to pid X" (not "Performed UIA Invoke").
    5. #canvas-status should report the requested coords ±2px.

    The SSH-only session in tonight's autonomous run couldn't reach an interactive desktop (daemon ends up in Session 0; list_windows returns empty), so the live click test couldn't run end-to-end. Verification needs an RDP'd interactive session.
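    Steps 2 to 5 boil down to two small calculations: turning a canvas-relative point into click coordinates, and checking the reported position against the ±2px tolerance. A rough sketch follows; the `Rect` type and both helper names are hypothetical, and it assumes the rectangle from get_window_state is in the same coordinate space that click(pid, x, y) expects.

```rust
/// Hypothetical bounding rectangle, assumed to share the coordinate
/// space used by click(pid, x, y).
#[derive(Debug, Clone, Copy)]
pub struct Rect {
    pub left: i32,
    pub top: i32,
    pub width: i32,
    pub height: i32,
}

/// Converts a canvas-relative point to click coordinates.
/// Returns None when the point falls outside the canvas.
pub fn canvas_to_click(rect: Rect, rel: (i32, i32)) -> Option<(i32, i32)> {
    let (rx, ry) = rel;
    if rx < 0 || ry < 0 || rx >= rect.width || ry >= rect.height {
        return None;
    }
    Some((rect.left + rx, rect.top + ry))
}

/// True when the reported point is within `tol` pixels of the
/// requested point on both axes (the test plan uses tol = 2).
pub fn within_tolerance(reported: (i32, i32), requested: (i32, i32), tol: i32) -> bool {
    (reported.0 - requested.0).abs() <= tol && (reported.1 - requested.1).abs() <= tol
}
```

    Choosing a non-centre point such as canvas-relative (30, 30) matters: a click that is silently rerouted to the centre would otherwise pass the tolerance check by accident.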

Related

Closes #1621.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Enhanced interaction reliability for UI controls by improving coordinate-precision handling during element invocation.

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Ignored Ignored Preview May 21, 2026 12:08pm


coderabbitai Bot commented May 21, 2026

Caution

Review failed

The pull request is closed.


📥 Commits

Reviewing files that changed from the base of the PR and between 707d143 and 9325054.

📒 Files selected for processing (1)
  • libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs

Walkthrough

This PR refines Windows UIA click invocation to respect pixel-coordinate precision. It adds control-type classification to identify elements whose primary action is coordinate-independent (buttons, checkboxes, menus), then filters click candidates in try_invoke_in_window_at_point to exclude coordinate-sensitive element types, ensuring canvas and custom surfaces receive WM_LBUTTONDOWN/UP with exact requested coordinates instead of UIA Invoke.

Changes

UIA Coordinate-Independence Filtering

| Layer / File(s) | Summary |
|---|---|
| Coordinate-independence helper and invocation filtering (`libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs`) | Extends UIA control-type imports (button, menu, hyperlink, tab, list, tree, checkbox, radio, split-button); implements `is_coord_independent_action()` to classify control types whose primary action does not require pixel coordinates; and filters hit-test candidates in `try_invoke_in_window_at_point` to exclude non-coordinate-independent elements, preserving pixel-precision click delivery to canvas and coordinate-sensitive targets. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

  • #1621 — Addresses the silent UIA Invoke reroute on coordinate-sensitive targets like canvas by filtering out non-coordinate-independent element types, ensuring pixel-precision clicks bypass UIA Invoke and use WM_LBUTTONDOWN/UP instead.

Possibly related PRs

  • trycua/cua#1624 — Also modifies windows_enum.rs around UIA click/invoke selection logic, adding coordinate-independence classification and control-type ID imports for the same routing improvement.
  • trycua/cua#1549 — Introduces the original try_invoke_in_window_at_point logic in windows_enum.rs that this PR extends with a coordinate-independence guard.

Poem

🐰 A canvas waits for pixel truth,
Where coordinates should reign supreme,
We filter out the UIA sleuth,
And let WM_LBUTTONUP's sweet dream
Paint dots at (110, 677) — not the center's gleam! 🎨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Linked Issues check | ❓ Inconclusive | The PR addresses #1621 (canvas click coordinates) via control-type whitelist logic in windows_enum.rs, but linked issue #1620 (Chromium anti-throttling flags for launch_app) has no visible code implementation. | Verify whether #1620 implementation is deferred to a separate PR or if related code changes are missing from this PR. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main fix: preventing UIA Invoke rerouting for x,y clicks on coordinate-sensitive surfaces like canvases on Windows. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the PR objectives: fixture consolidation enables testing, windows_enum.rs implements the control-type whitelist for #1621, and old fixture duplicates are removed. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@f-trycua
Collaborator Author

Runtime verification (2026-05-21, daemon in Session 2 / interactive RDP)

Routing fix — PASS ✅

  • All four click(pid, x, y) calls returned "✅ Posted click to pid <pid>" — UIA Invoke reroute no longer happens.
  • The canvas in test_page.html exposes UIA ControlType=Image with Invoke pattern — exactly the trap the whitelist catches.

Surfaced a deeper Chromium bug — see #1623

Recommendation

Merge this PR (#1622, the fix for #1621) anyway: a silent reroute to the centre was worse than an honest failure, and the control-type whitelist is the correct semantics. #1623 (Chromium SendInput) is a separate fix that unlocks coord-click on Chromium content; without this fix, that work would have been blocked at the dispatcher level by the silent UIA fallback.

…vas-like surfaces

`click(x, y)` previously routed through `try_invoke_in_window_at_point`
for any UIA element advertising InvokePattern at the click coordinates.
For container surfaces (Pane / Image / Custom / Document / Group)
`Invoke()` fires the element's default action at its centre and ignores
the requested (x, y) — silently breaking pixel precision on canvases,
paint surfaces, image maps, and 3D viewports.

## Repro (from #1621)

Verified 2026-05-21 on the Windows VM against
`libs/cua-driver-fixtures/test_page.html` loaded in Edge.

```
cua-driver call click '{"pid":<edge-pid>,"x":110,"y":677}'
# → "✅ Performed UIA Invoke at (110,677)"
# But #canvas-status shows: "canvas: clicked at (152,77)"
#   ─ canvas center, not the requested (110, 677)
```

The canvas's `mousedown` handler fired at synthesised centre coords
because UIA `Invoke()` has no notion of "where inside the element".
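
A toy model of that behaviour, for illustration only: it assumes, as observed in the repro, that an invoked container reports its action at the centre of its bounding rectangle. `Rect`, `Route`, and `effective_point` are hypothetical names, not driver code.

```rust
/// Hypothetical bounding rectangle of the element under the cursor.
#[derive(Debug, Clone, Copy)]
pub struct Rect {
    pub left: i32,
    pub top: i32,
    pub width: i32,
    pub height: i32,
}

/// Hypothetical delivery path for the click.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    UiaInvoke,
    PostMessage,
}

/// Geometric centre of the rectangle (integer division).
pub fn centre(rect: Rect) -> (i32, i32) {
    (rect.left + rect.width / 2, rect.top + rect.height / 2)
}

/// Where the target ends up seeing the click in this model: Invoke
/// carries no position, so the requested point is dropped and the
/// centre is used; PostMessage delivers the requested point unchanged.
pub fn effective_point(route: Route, rect: Rect, requested: (i32, i32)) -> (i32, i32) {
    match route {
        Route::UiaInvoke => centre(rect),
        Route::PostMessage => requested,
    }
}
```

In this model the `requested` argument has no influence on the `UiaInvoke` branch, which is the bug in miniature.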

## Fix

Add `is_coord_independent_action()` — a control-type whitelist for
elements whose primary action is coord-independent (Button, MenuItem,
Hyperlink, TabItem, ListItem, CheckBox, RadioButton, SplitButton,
TreeItem). For these, UIA Invoke is the semantically correct path —
the element identity *is* the action target, and the click coords
don't matter past hit-testing.

For everything else (Pane / Image / Custom / Document / Group / etc.),
even if the element advertises InvokePattern, fall through to
PostMessage with the literal coords. This preserves UWP / WebView2
coverage for buttons + menu items (the original motivation for the
UIA-first path) without silently rerouting canvas clicks to "click at
centre".

## What this does NOT change

- Element-indexed click (`click(element_index=N)`) is unchanged — it
  takes the UIA Invoke path explicitly via a different code site
  (`impl_.rs:1168`). Callers asking for "invoke this specific element"
  by index keep the previous behaviour.
- Right-click and multi-click already skipped the UIA path
  (`use_uia = (btn == "left" || btn == "middle") && count == 1`); they
  remain on PostMessage.
- The ExpandCollapsePattern preference for Qt menu-bar items (added in
  #1566) is unchanged — that path runs after the new control-type
  filter and only fires when a whitelisted-type element happens to
  also have ExpandCollapse.
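
The ordering in the last bullet can be sketched as follows. This is a hedged illustration: the `Action` enum and `pick_action` are hypothetical, and whether the real code also requires InvokePattern before preferring ExpandCollapse is not stated here, so the sketch does not require it.

```rust
/// Hypothetical action chosen for the element under the click point.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    ExpandCollapse,
    Invoke,
    PostClick,
}

/// Applies the control-type filter first, then the ExpandCollapse
/// preference, then plain Invoke (illustrative ordering only).
pub fn pick_action(
    coord_independent: bool,
    has_invoke: bool,
    has_expand_collapse: bool,
) -> Action {
    // New filter: non-whitelisted types never reach the UIA patterns.
    if !coord_independent {
        return Action::PostClick;
    }
    // Existing preference (from #1566) only applies past the filter.
    if has_expand_collapse {
        Action::ExpandCollapse
    } else if has_invoke {
        Action::Invoke
    } else {
        Action::PostClick
    }
}
```

Because the filter runs first, a container that happens to expose ExpandCollapse still receives the literal coordinates in this model.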

## Test plan

- [x] `cargo check -p platform-windows` clean on the VM (41.82s,
      0 new warnings — all 28 warnings are pre-existing)
- [x] `cargo build --release -p cua-driver` clean (24.43s release build)
- [ ] **Runtime verification deferred to next interactive RDP session**:
      load `libs/cua-driver-fixtures/test_page.html` in Edge with the four
      anti-occlusion + a11y flags (see #1620), call `click(pid, x, y)`
      at a non-centre point inside the canvas, expect tool response to say
      `"✅ Posted click to pid <pid>"` (not `"Performed UIA Invoke"`) and
      `#canvas-status` to report the requested coords ±2px. The SSH-only
      session in tonight's autonomous run can't reach an interactive
      desktop (daemon ends up in Session 0; `list_windows` returns empty)
      so the click test couldn't run end-to-end — user needs to RDP in
      to verify.

## Related

- #1620 — Chromium anti-throttling flags (separate fix; needed for any
  Edge/Chrome DOM verification on hidden launches, including this test)

Closes #1621.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
f-trycua force-pushed the fix/cua-driver-rs-windows-click-xy-skip-canvas-uia branch from 029d228 to 9325054 on May 21, 2026 12:07
f-trycua merged commit 190b657 into main on May 21, 2026
4 of 5 checks passed
f-trycua deleted the fix/cua-driver-rs-windows-click-xy-skip-canvas-uia branch on May 21, 2026 12:08
Development

Successfully merging this pull request may close these issues.

cua-driver-rs Windows: click(x,y) silently rerouted to UIA Invoke when an actionable element is at that point
