fix(cua-driver-rs/windows): hotkey UIA routing for XAML/UWP targets (closes #1607)#1611
Conversation
…loses #1607) The `hotkey` tool was silently no-op'ing on modern XAML / UWP hosts (modern Notepad, Settings) because `PostMessage(WM_KEYDOWN/UP)` lands in the window's message queue but the CoreInput dispatcher these apps use only reads from the system input queue. Mirrors the fix in #1597 that routed `type_text` through UIA `ValuePattern.SetValue` for the same architectural reason — `hotkey` now routes through UIA accelerator discovery + pattern activation. ## How the routing works When the target HWND is a XAML host (`is_xaml_host_hwnd`): 1. **AcceleratorKey scan**: walk the UIA subtree from the window root, read each element's `UIA_AcceleratorKeyPropertyId`. If an element advertises an accelerator matching the requested combo (case- insensitive, normalized modifier order), use it. 2. **Name-pattern fallback**: many shipping XAML apps don't set AcceleratorKey at all and instead encode the shortcut in the visible element name as a parenthetical hint — e.g. modern Notepad ships `Button "Bold (Ctrl+B)"` rather than setting AcceleratorKey="Ctrl+B". When AcceleratorKey is empty we scan the Name property for that pattern (requires at least one modifier-like token inside the parens to avoid matching arbitrary parentheticals like "(2)" or "(beta)"). 3. **Pattern activation chain**: try `InvokePattern.Invoke()` first (conventional shortcut handler), then `TogglePattern.Toggle()` (Bold/Italic/Underline-style toolbar buttons). If neither pattern is supported on the matched element, surface an actionable error. 4. **Fall back to PostMessage** for non-XAML targets — non-UWP apps keep the existing fast path with zero behavior change. ## Empirical evidence (Windows 11 VM, modern Notepad) Before this PR: `cua-driver call hotkey {pid, key: "ctrl+b"}` returned "✅ Pressed ctrl+b on pid …" but Bold never toggled (PostMessage ignored by CoreInput dispatcher). After this PR: - `ctrl+b` → matched Name "Bold (Ctrl+B)" → TogglePattern.Toggle → Bold actually toggles ✅ - `ctrl+s` → no UIA AcceleratorKey + no element with "(Ctrl+S)" in name (Save is nested behind File menu) → returns actionable error pointing at `get_window_state` for inspection. Much better than silently no-op'ing. ## Known limitation Menu-nested actions (Save, "Save as", Print, etc. in modern Notepad) don't expose accelerators outside the closed menu's subtree. The fix for this is a richer matcher that walks menus by name + invokes the matching item directly. Tracked as a follow-up; the actionable error this PR produces is honest about the limit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR implements UI Automation-based accelerator key invocation for modern XAML/UWP hosts in Windows, addressing silent failures when PostMessage-delivered hotkeys are ignored by the CoreInput dispatcher. The changes add accelerator lookup and activation helpers in the UIA module, then route HotkeyTool invocation through XAML host detection to attempt UIA accelerator activation before falling back to PostMessage. ChangesXAML/UWP Hotkey Accelerator Support
Sequence Diagram(s)sequenceDiagram
HotkeyTool->>HotkeyTool: Check is_xaml_host_hwnd
alt XAML/UWP Host
HotkeyTool->>HotkeyTool: Build accelerator combo<br/>(modifiers + final key)
HotkeyTool->>try_invoke_accelerator_in_window: Call on blocking task
try_invoke_accelerator_in_window-->>HotkeyTool: (success: bool, scanned: count)
alt Match found and activated
HotkeyTool-->>Client: Return "via UIA<br/>InvokePattern/TogglePattern" success
else No match found
HotkeyTool-->>Client: Return error: PostMessage ignored,<br/>use get_window_state or UIA actions
else Match found but no pattern
HotkeyTool-->>Client: Return activation error
end
else Legacy Win32 Host
HotkeyTool->>post_key: PostMessage WM_KEYDOWN/UP
post_key-->>HotkeyTool: Success or error
HotkeyTool-->>Client: Return PostMessage result
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@libs/cua-driver-rs/crates/platform-windows/src/tools/impl_.rs`:
- Around line 1559-1565: Wrap the tokio::task::spawn_blocking call that runs
crate::uia::windows_enum::try_invoke_accelerator_in_window in a
tokio::time::timeout using the same timeout/duration used by get_window_state so
the hotkey path cannot hang forever; capture the JoinHandle, await it through
tokio::time::timeout(..., handle) and if the timeout elapses return/log an error
(and abort the blocking task if possible) instead of awaiting indefinitely.
Ensure you reference the spawn_blocking invocation and
try_invoke_accelerator_in_window when making the change so callers get a
deterministic timeout result consistent with get_window_state.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4bec1d8e-8a54-4b50-82ad-9d70ea4bba75
📒 Files selected for processing (2)
libs/cua-driver-rs/crates/platform-windows/src/tools/impl_.rslibs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs
| let result = tokio::task::spawn_blocking(move || { | ||
| crate::uia::windows_enum::try_invoke_accelerator_in_window( | ||
| hwnd as isize, | ||
| &accelerator_combo, | ||
| ) | ||
| }) | ||
| .await; |
There was a problem hiding this comment.
Add a timeout around the UIA accelerator scan.
This spawn_blocking is awaited indefinitely, but the new path does a full cross-process UIA subtree walk plus property reads. The same file already guards get_window_state because UIA providers can hang; without the same protection here, one bad provider can wedge hotkey instead of returning an error.
⏱️ Suggested shape
- let result = tokio::task::spawn_blocking(move || {
+ let blocking = tokio::task::spawn_blocking(move || {
crate::uia::windows_enum::try_invoke_accelerator_in_window(
hwnd as isize,
&accelerator_combo,
)
- })
- .await;
+ });
+ let result = match tokio::time::timeout(
+ std::time::Duration::from_secs(4),
+ blocking,
+ ).await {
+ Ok(join_result) => join_result,
+ Err(_) => {
+ return ToolResult::error(format!(
+ "hotkey timed out after 4s while scanning UIA accelerators for pid {raw_pid}."
+ ));
+ }
+ };📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let result = tokio::task::spawn_blocking(move || { | |
| crate::uia::windows_enum::try_invoke_accelerator_in_window( | |
| hwnd as isize, | |
| &accelerator_combo, | |
| ) | |
| }) | |
| .await; | |
| let blocking = tokio::task::spawn_blocking(move || { | |
| crate::uia::windows_enum::try_invoke_accelerator_in_window( | |
| hwnd as isize, | |
| &accelerator_combo, | |
| ) | |
| }); | |
| let result = match tokio::time::timeout( | |
| std::time::Duration::from_secs(4), | |
| blocking, | |
| ).await { | |
| Ok(join_result) => join_result, | |
| Err(_) => { | |
| return ToolResult::error(format!( | |
| "hotkey timed out after 4s while scanning UIA accelerators for pid {raw_pid}." | |
| )); | |
| } | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libs/cua-driver-rs/crates/platform-windows/src/tools/impl_.rs` around lines
1559 - 1565, Wrap the tokio::task::spawn_blocking call that runs
crate::uia::windows_enum::try_invoke_accelerator_in_window in a
tokio::time::timeout using the same timeout/duration used by get_window_state so
the hotkey path cannot hang forever; capture the JoinHandle, await it through
tokio::time::timeout(..., handle) and if the timeout elapses return/log an error
(and abort the blocking task if possible) instead of awaiting indefinitely.
Ensure you reference the spawn_blocking invocation and
try_invoke_accelerator_in_window when making the change so callers get a
deterministic timeout result consistent with get_window_state.
…dback) Per CodeRabbit on #1611: the new XAML hotkey path does a full cross-process UIA subtree walk + per-element property reads. Without a timeout, a hung UIA provider would wedge the entire `hotkey` call indefinitely instead of returning a deterministic error to the caller — exactly the failure mode `get_window_state` already guards against in the same file. Wrap the `spawn_blocking` invocation of `try_invoke_accelerator_in_window` in a 4-second `tokio::time::timeout`. On expiry the caller gets an actionable error that names the unresponsive provider's pid + hwnd rather than a daemon hang. 4 s matches the budget the rest of the file uses for similar UIA scans. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The error returned when the UIA accelerator scan finds nothing said: "could not find a descendant UIA AcceleratorKey matching ..." But the scan tries BOTH approaches: the `UIA_AcceleratorKeyPropertyId` property AND the `(Ctrl+X)`-style hint in element Names (added for modern Notepad, which doesn't set AcceleratorKey). The original wording implied only the first was tried, which is misleading when 30+ elements were scanned and nothing matched. Updated to: "could not find a UIA AcceleratorKey or `(Ctrl+X)`-style name hint matching ..." Also added a one-line hint about the most common cause (menu-nested actions like modern Notepad's Save behind the File menu) so the caller knows what to look for in `get_window_state` output. Found via end-to-end Claude Code test in #1611 review — Claude noted the wording was a touch misleading after seeing the actual error message it received from `cua-driver call hotkey ctrl+s` on modern Notepad. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…#1612) The error returned when the UIA accelerator scan finds nothing said: "could not find a descendant UIA AcceleratorKey matching ..." But the scan tries BOTH approaches: the `UIA_AcceleratorKeyPropertyId` property AND the `(Ctrl+X)`-style hint in element Names (added for modern Notepad, which doesn't set AcceleratorKey). The original wording implied only the first was tried, which is misleading when 30+ elements were scanned and nothing matched. Updated to: "could not find a UIA AcceleratorKey or `(Ctrl+X)`-style name hint matching ..." Also added a one-line hint about the most common cause (menu-nested actions like modern Notepad's Save behind the File menu) so the caller knows what to look for in `get_window_state` output. Found via end-to-end Claude Code test in #1611 review — Claude noted the wording was a touch misleading after seeing the actual error message it received from `cua-driver call hotkey ctrl+s` on modern Notepad. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…d dir (#1619) * fix(cua-driver-rs/windows): click prefers ExpandCollapsePattern for menu items (extends #1611) When the UIA element under the click point exposes BOTH InvokePattern AND ExpandCollapsePattern (Qt top-level MenuItems advertise both), the intended behavior is "open the submenu" — Invoke alone is a no-op for menu-bar items. Prefer ExpandCollapse.Expand in that case, fall back to Invoke on failure. Also relaxes the element filter to accept elements that support EITHER pattern (was: InvokePattern only). Without this, ExpandCollapse-only elements (rare but exist, e.g. some tree-view nodes) were skipped entirely by `try_invoke_in_window_at_point`. Found while testing FreeCAD on the Windows VM — click on File menu returned ✅ but the dropdown never appeared because Invoke on Qt menubar items doesn't expand the submenu. Not opening as PR per the in-flight overnight-test directive — branch pushed for backup; user reviews + opens PR when ready. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(test-fixtures): shared cua-driver gesture playground SPA + fixtures dir Creates `libs/cua-driver-fixtures/` as the new shared home for HTML test fixtures used by both Swift cua-driver and Rust cua-driver-rs integration test suites. Previously each port had its own duplicated copy of `interactive.html` etc. — the two `driver_client.py` files have drifted already and the HTML copies will too. This commit adds the *new* shared fixture only; the deprecation path for the duplicated fixtures under each port is documented in the README's Migration plan but not yet executed (would need both ports' integration tests adjusted to resolve from the shared path). ## What's in gesture-playground.html A single self-contained 461-line HTML page with embedded JS covering every cua-driver gesture: | Panel | Tool tested | State exposed | |---|---|---| | 1 click counter | `click(element_index)` | `state.counter` int | | 2 click types | `click`, `right_click`, `double_click` | `state.multi {type, at}` | | 3 type_text mirror | `type_text` | `state.text` | | 4 keyboard / hotkey | `press_key`, `hotkey` with modifier-state check | `state.key {key, code, ctrl, alt, shift, meta}` | | 5 pixel-coord click | `click(x, y)` with pixel-accuracy distance | `state.coord {x, y}`, `state.coord_dist` | | 6 drag-and-drop | `drag` (verifies dragstart → dragover → drop) | `state.drag {dropped_from, dropped_at}` | | 7 scroll | `scroll` | `state.scroll` (px) | | 8 canvas | mousedown/move/up on HTML5 canvas — proves SendInput-vs-PostMessage delivery to custom-drawn surfaces | `state.canvas {type, x, y}` | | 9 form-all-inputs | submit handler with every HTML input type (back-compat with existing fixtures) | `state.form` | | 10 cumulative state dump | reads everything back as JSON for test assertions | — | Each panel has stable `data-test="<id>"` attributes for targeting and inner elements have stable `id`s. State is exposed in a single JSON dump at `#state-dump` so tests can assert end-state in one read. ## Why this matters This playground specifically exercises gaps that surfaced during the overnight Windows VM stress test: - **Modifier-state propagation** (panel 4) verifies SendInput-vs-PostMessage: after `hotkey ["ctrl", "s"]`, `state.key.ctrl === true` proves SendInput is correctly updating GetKeyState - **Pixel accuracy** (panel 5) reports distance from a known target point - **Canvas vs DOM events** (panel 8) catches the universal "PostMessage doesn't reach custom-drawn surfaces" pattern seen in Audacity/GIMP/Blender Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(test-fixtures): consolidate canonical Swift fixtures into shared dir The previous commit on this branch added a separate `gesture-playground.html` SPA with its own ID conventions (`#counter-state`, `#type-state`, etc.) — a parallel harness, not a unified one. That was wrong: both ports already share three canonical HTML fixtures (`interactive.html`, `form_all_inputs.html`, `test_page.html`), and their integration tests target specific stable IDs in those files. The right consolidation is to make those existing fixtures the single source of truth, not to introduce a fourth fixture. ## What this commit does 1. Moves the three canonical fixtures into `libs/cua-driver-fixtures/`: - `interactive.html` (was `cua-driver/Tests/integration/fixtures/`) - `form_all_inputs.html` (was same) - `test_page.html` (was `cua-driver/Tests/integration/assets/`) 2. Replaces each port's copy with a relative symlink to the canonical: ``` cua-driver/Tests/integration/fixtures/interactive.html -> ../../../../cua-driver-fixtures/interactive.html cua-driver/Tests/integration/fixtures/form_all_inputs.html -> ../../../../cua-driver-fixtures/form_all_inputs.html cua-driver/Tests/integration/assets/test_page.html -> ../../../../cua-driver-fixtures/test_page.html cua-driver-rs/tests/integration/fixtures/interactive.html -> ../../../../cua-driver-fixtures/interactive.html cua-driver-rs/tests/integration/v2/assets/test_page.html -> ../../../../../cua-driver-fixtures/test_page.html ``` Existing test files keep working unchanged — every `os.path.join(_THIS_DIR, "fixtures", "interactive.html")` and `f"{html_server}/test_page.html"` resolves through the symlink to the canonical copy. 3. Removes the misguided `gesture-playground.html` from this branch. 4. Adds `gesture_panels.html` — a small (140-line) extension fixture that follows the *same* ID-convention style as `test_page.html`, covering four gestures the v2 harness doesn't probe and that the May 2026 Windows stress test showed needed coverage: - **Hotkey + modifier-state propagation** (`#hotkey-status` prints `ctrl=true|false` so SendInput-vs-PostMessage routing is observable in one assertion — directly the architectural proof for #1614) - **Pixel-coord pinpoint accuracy** (`#coord-status` prints distance from a known target at `(60,60)`) - **Drag-and-drop event sequence** (`#drag-status` records the `dragstart → dragover → drop` chain) - **Scroll position** (`#scroll-status` prints live `scrollTop`) Each panel exposes its state via `window.getGesturePanelState()` for `browser_eval`-based readback when Chromium is launched with `--remote-debugging-port`. ## Result Drift between Swift port and Rust port fixtures is impossible by construction — edits propagate to both ports via the single canonical copy. Net diff: +581 / -1045 lines (the deleted playground + 4 duplicate fixture copies vs. one canonical of each). ## Validated `python3 os.path.isfile` reports all five historic test paths resolve to the canonical copies with correct byte counts; no test code needs to change to consume the consolidated layout. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
The
hotkeytool was silently no-op'ing on modern XAML / UWP hosts (modern Notepad, Settings) becausePostMessage(WM_KEYDOWN/UP)lands in the window's message queue but the CoreInput dispatcher these apps use only reads from the system input queue. Mirrors the fix in #1597 that routedtype_textthrough UIAValuePattern.SetValuefor the same architectural reason —hotkeynow routes through UIA accelerator discovery + pattern activation.Acid test (Windows 11 VM, modern Notepad)
ctrl+bButton "Bold (Ctrl+B)" [actions=[toggle]](toolbar)Pressed ctrl+b on pid X via UIA InvokePattern/TogglePattern— Bold actually togglesctrl+sget_window_statefor inspection (vs silently no-op'ing before)How the routing works
When the target HWND is a XAML host (existing
is_xaml_host_hwndhelper):UIA_AcceleratorKeyPropertyIdon each element. Conventional UWP / WinUI 3 path."Bold (Ctrl+B)". We scan for that pattern, requiring at least one modifier-like token inside the parens so we don't match"(2)"or"(beta)".InvokePattern.Invoke()first, thenTogglePattern.Toggle()(Bold/Italic-style buttons live here — calling.Invokeon them returns the misleading(0x00000000)error).Known limitation (out of scope)
Menu-nested actions (Save, "Save as", Print, etc.) don't expose accelerators outside the closed menu's subtree. Need a richer matcher that walks menus by name + invokes the matching item directly. Tracked separately. The actionable error this PR produces is honest about the limit.
Test plan
cargo build -p cua-driver -p cua-driver-uia --releaseclean on the VMhotkey ctrl+bon modern Notepad toggles Bold (TogglePattern path)hotkey ctrl+son modern Notepad returns actionable error (no exposed accelerator for menu-nested Save)Closes #1607. Related: #1597 (
type_textUIA ValuePattern routing — same architectural fix for the typing path).🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes