test-fixtures: consolidate canonical Swift + Rust fixtures into shared dir #1619
…enu items (extends #1611)
When the UIA element under the click point exposes BOTH InvokePattern AND ExpandCollapsePattern (Qt top-level MenuItems advertise both), the intended behavior is "open the submenu": Invoke alone is a no-op for menu-bar items. Prefer ExpandCollapse.Expand in that case, and fall back to Invoke on failure.
Also relaxes the element filter to accept elements that support EITHER pattern (was: InvokePattern only). Without this, ExpandCollapse-only elements (rare, but they exist, e.g. some tree-view nodes) were skipped entirely by `try_invoke_in_window_at_point`.
Found while testing FreeCAD on the Windows VM: a click on the File menu returned ✅ but the dropdown never appeared, because Invoke on Qt menubar items doesn't expand the submenu.
Not opening as PR per the in-flight overnight-test directive; branch pushed for backup, user reviews + opens PR when ready.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
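The preference order in this commit message can be sketched as a small decision function. This is an illustrative Python model only, not the Rust code in `windows_enum.rs`: the name `activate_element`, the use of plain callables to stand in for the two UIA patterns, and `RuntimeError` as a stand-in for a failed COM call are all assumptions made for the example.

```python
from typing import Callable, Optional


def activate_element(
    expand: Optional[Callable[[], None]],
    invoke: Optional[Callable[[], None]],
) -> str:
    """Model of the click activation order: Expand first, Invoke as fallback.

    `expand` / `invoke` are None when the element does not support the
    corresponding pattern. Returns which path was taken.
    """
    # Elements that support neither pattern are filtered out entirely.
    if expand is None and invoke is None:
        return "skipped"

    # Prefer ExpandCollapse when it is available (menu-bar items, tree nodes).
    if expand is not None:
        try:
            expand()
            return "expand"
        except RuntimeError:
            # Expand failed; only give up if there is nothing to fall back to.
            if invoke is None:
                return "failed"

    # Either Expand was unavailable or it failed: fall back to Invoke.
    try:
        invoke()
        return "invoke"
    except RuntimeError:
        return "failed"
```

Under these assumptions, an element that advertises both patterns takes the `"expand"` path, and an ExpandCollapse-only element is no longer skipped.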
…res dir
Creates `libs/cua-driver-fixtures/` as the new shared home for HTML test
fixtures used by both Swift cua-driver and Rust cua-driver-rs integration
test suites. Previously each port had its own duplicated copy of
`interactive.html` etc. — the two `driver_client.py` files have drifted
already and the HTML copies will too.
This commit adds the *new* shared fixture only; the deprecation path for
the duplicated fixtures under each port is documented in the README's
Migration plan but not yet executed (would need both ports' integration
tests adjusted to resolve from the shared path).
## What's in gesture-playground.html
A single self-contained 461-line HTML page with embedded JS covering every
cua-driver gesture:
| Panel | Tool tested | State exposed |
|---|---|---|
| 1 click counter | `click(element_index)` | `state.counter` int |
| 2 click types | `click`, `right_click`, `double_click` | `state.multi {type, at}` |
| 3 type_text mirror | `type_text` | `state.text` |
| 4 keyboard / hotkey | `press_key`, `hotkey` with modifier-state check | `state.key {key, code, ctrl, alt, shift, meta}` |
| 5 pixel-coord click | `click(x, y)` with pixel-accuracy distance | `state.coord {x, y}`, `state.coord_dist` |
| 6 drag-and-drop | `drag` (verifies dragstart → dragover → drop) | `state.drag {dropped_from, dropped_at}` |
| 7 scroll | `scroll` | `state.scroll` (px) |
| 8 canvas | mousedown/move/up on HTML5 canvas — proves SendInput-vs-PostMessage delivery to custom-drawn surfaces | `state.canvas {type, x, y}` |
| 9 form-all-inputs | submit handler with every HTML input type (back-compat with existing fixtures) | `state.form` |
| 10 cumulative state dump | reads everything back as JSON for test assertions | — |
Each panel has stable `data-test="<id>"` attributes for targeting and
inner elements have stable `id`s. State is exposed in a single JSON dump
at `#state-dump` so tests can assert end-state in one read.
## Why this matters
This playground specifically exercises gaps that surfaced during the
overnight Windows VM stress test:
- **Modifier-state propagation** (panel 4) verifies SendInput-vs-PostMessage:
after `hotkey ["ctrl", "s"]`, `state.key.ctrl === true` proves SendInput
is correctly updating GetKeyState
- **Pixel accuracy** (panel 5) reports distance from a known target point
- **Canvas vs DOM events** (panel 8) catches the universal "PostMessage
doesn't reach custom-drawn surfaces" pattern seen in Audacity/GIMP/Blender
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ared dir
The previous commit on this branch added a separate `gesture-playground.html`
SPA with its own ID conventions (`#counter-state`, `#type-state`, etc.) — a
parallel harness, not a unified one. That was wrong: both ports already
share three canonical HTML fixtures (`interactive.html`, `form_all_inputs.html`,
`test_page.html`), and their integration tests target specific stable IDs
in those files. The right consolidation is to make those existing fixtures
the single source of truth, not to introduce a fourth fixture.
## What this commit does
1. Moves the three canonical fixtures into `libs/cua-driver-fixtures/`:
- `interactive.html` (was `cua-driver/Tests/integration/fixtures/`)
- `form_all_inputs.html` (was same)
- `test_page.html` (was `cua-driver/Tests/integration/assets/`)
2. Replaces each port's copy with a relative symlink to the canonical:
```
cua-driver/Tests/integration/fixtures/interactive.html -> ../../../../cua-driver-fixtures/interactive.html
cua-driver/Tests/integration/fixtures/form_all_inputs.html -> ../../../../cua-driver-fixtures/form_all_inputs.html
cua-driver/Tests/integration/assets/test_page.html -> ../../../../cua-driver-fixtures/test_page.html
cua-driver-rs/tests/integration/fixtures/interactive.html -> ../../../../cua-driver-fixtures/interactive.html
cua-driver-rs/tests/integration/v2/assets/test_page.html -> ../../../../../cua-driver-fixtures/test_page.html
```
Existing test files keep working unchanged — every `os.path.join(_THIS_DIR,
"fixtures", "interactive.html")` and `f"{html_server}/test_page.html"`
resolves through the symlink to the canonical copy.
3. Removes the misguided `gesture-playground.html` from this branch.
4. Adds `gesture_panels.html` — a small (140-line) extension fixture that
follows the *same* ID-convention style as `test_page.html`, covering four
gestures the v2 harness doesn't probe and that the May 2026 Windows
stress test showed needed coverage:
- **Hotkey + modifier-state propagation** (`#hotkey-status` prints
`ctrl=true|false` so SendInput-vs-PostMessage routing is observable
in one assertion — directly the architectural proof for #1614)
- **Pixel-coord pinpoint accuracy** (`#coord-status` prints distance from
a known target at `(60,60)`)
- **Drag-and-drop event sequence** (`#drag-status` records the
`dragstart → dragover → drop` chain)
- **Scroll position** (`#scroll-status` prints live `scrollTop`)
Each panel exposes its state via `window.getGesturePanelState()` for
`browser_eval`-based readback when Chromium is launched with
`--remote-debugging-port`.
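A test that reads one of these status elements needs to turn its text into something assertable. The sketch below is a hypothetical helper, not part of the fixture or the harness: only the `ctrl=true|false` fragment is described above, so the surrounding status text and the extra `alt` field in the example string are assumptions.

```python
import re


def parse_status_flags(text: str) -> dict[str, bool]:
    """Extract every name=true / name=false pair from a status string."""
    return {
        name: value == "true"
        for name, value in re.findall(r"(\w+)=(true|false)\b", text)
    }


# Hypothetical status text; the real fixture may print a different layout.
flags = parse_status_flags("hotkey: ctrl=true alt=false")
assert flags["ctrl"] is True
```

Parsing into a dictionary keeps the assertion independent of field order in the status line.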
## Result
Drift between Swift port and Rust port fixtures is impossible by construction
— edits propagate to both ports via the single canonical copy.
Net diff: +581 / -1045 lines (the deleted playground + 4 duplicate fixture
copies vs. one canonical of each).
## Validated
A `python3` check using `os.path.isfile` confirms that all five historic test
paths resolve to the canonical copies with correct byte counts; no test code
needs to change to consume the consolidated layout.
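A check along these lines could look like the following sketch. It is not the script used for the validation above: `check_fixture_links` and `demo` are hypothetical names, and the demo builds a throwaway directory layout rather than touching the repository. It assumes a platform where creating symlinks is permitted.

```python
import os
import tempfile


def check_fixture_links(links: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every link resolves.

    `links` maps each historic test path to the canonical file it should
    point at.
    """
    problems = []
    for link_path, canonical_path in links.items():
        if not os.path.islink(link_path):
            problems.append(f"{link_path}: not a symlink")
        elif not os.path.isfile(link_path):
            # isfile follows the link, so False here means the target is missing.
            problems.append(f"{link_path}: dangling symlink")
        elif not os.path.samefile(link_path, canonical_path):
            problems.append(f"{link_path}: points at a different file")
    return problems


def demo() -> list[str]:
    """Build a tiny canonical-plus-symlink layout and validate it."""
    with tempfile.TemporaryDirectory() as root:
        canonical = os.path.join(root, "cua-driver-fixtures", "interactive.html")
        link = os.path.join(root, "port", "fixtures", "interactive.html")
        os.makedirs(os.path.dirname(canonical))
        os.makedirs(os.path.dirname(link))
        with open(canonical, "w", encoding="utf-8") as handle:
            handle.write("<html></html>")
        # Relative target, computed from the directory that holds the link.
        os.symlink(os.path.relpath(canonical, os.path.dirname(link)), link)
        return check_fixture_links({link: canonical})
```

Because `os.path.samefile` compares the resolved files, a link that passes this check necessarily reports the same byte count as the canonical copy.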
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
📝 Walkthrough
This PR consolidates HTML test fixtures into a shared canonical library referenced by both cua-driver (Swift/macOS) and cua-driver-rs (Rust), eliminating duplication via symlinks. It also improves Windows UIA click activation to prefer ExpandCollapse patterns for menu handling.
Changes
- Canonical test fixture library consolidation
- Windows UIA pattern matching improvements
Actionable comments posted: 2
🧹 Nitpick comments (2)
libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs (2)
223-235: 💤 Low value. Consider importing `UIA_ExpandCollapsePatternId` for consistency.
The code uses the fully-qualified path for `UIA_ExpandCollapsePatternId` (lines 231, 264, 271, 284) while similar identifiers like `UIA_InvokePatternId` are imported at line 26. The same applies to `IUIAutomationExpandCollapsePattern` (lines 274, 287) vs the imported `IUIAutomationInvokePattern`.
Suggested import additions
 use windows::Win32::UI::Accessibility::{
-    CUIAutomation, IUIAutomation, IUIAutomationElement, IUIAutomationInvokePattern,
-    IUIAutomationTogglePattern, TreeScope_Children, TreeScope_Subtree,
+    CUIAutomation, IUIAutomation, IUIAutomationElement, IUIAutomationExpandCollapsePattern,
+    IUIAutomationInvokePattern, IUIAutomationTogglePattern, TreeScope_Children, TreeScope_Subtree,
     UIA_AcceleratorKeyPropertyId, UIA_InvokePatternId, UIA_PROPERTY_ID,
-    UIA_TogglePatternId,
+    UIA_ExpandCollapsePatternId, UIA_TogglePatternId,
 };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs` around lines 223 - 235, Import UIA_ExpandCollapsePatternId and IUIAutomationExpandCollapsePattern alongside the existing UIA_InvokePatternId/IUIAutomationInvokePattern imports and replace the fully-qualified usages (windows::Win32::UI::Accessibility::UIA_ExpandCollapsePatternId and windows::Win32::UI::Accessibility::IUIAutomationExpandCollapsePattern) with the short names in windows_enum.rs (e.g., where has_expand is computed and where the expand/collapse pattern is referenced) so the code is consistent and easier to read.
139-178: 💤 Low value. Update docstring to reflect ExpandCollapsePattern support.
The docstring still describes only `InvokePattern` support (lines 140-142, 169, 176-177), but the implementation now also handles `ExpandCollapsePattern`. Update the documentation to accurately describe the expanded behavior.
📒 Files selected for processing (16)
- libs/cua-driver-fixtures/README.md
- libs/cua-driver-fixtures/form_all_inputs.html
- libs/cua-driver-fixtures/gesture_panels.html
- libs/cua-driver-fixtures/interactive.html
- libs/cua-driver-fixtures/test_page.html
- libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs
- libs/cua-driver-rs/tests/integration/fixtures/interactive.html
- libs/cua-driver-rs/tests/integration/fixtures/interactive.html
- libs/cua-driver-rs/tests/integration/v2/assets/test_page.html
- libs/cua-driver-rs/tests/integration/v2/assets/test_page.html
- libs/cua-driver/Tests/integration/assets/test_page.html
- libs/cua-driver/Tests/integration/assets/test_page.html
- libs/cua-driver/Tests/integration/fixtures/form_all_inputs.html
- libs/cua-driver/Tests/integration/fixtures/form_all_inputs.html
- libs/cua-driver/Tests/integration/fixtures/interactive.html
- libs/cua-driver/Tests/integration/fixtures/interactive.html
var dragEvents = [];
var src = document.getElementById('drag-source');
var tgt = document.getElementById('drag-target');
src.addEventListener('dragstart', function(e) {
  dragEvents.push('dragstart');
  e.dataTransfer.setData('text/plain', 'DRAG ME');
  document.getElementById('drag-status').textContent = 'drag: ' + dragEvents.join(' → ');
});
Reset drag event history at the start of each drag interaction.
dragEvents is never cleared, so a second drag includes events from earlier runs and can make test assertions flaky.
Suggested fix
 src.addEventListener('dragstart', function(e) {
-  dragEvents.push('dragstart');
+  dragEvents = ['dragstart'];
   e.dataTransfer.setData('text/plain', 'DRAG ME');
   document.getElementById('drag-status').textContent = 'drag: ' + dragEvents.join(' → ');
 });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libs/cua-driver-fixtures/gesture_panels.html` around lines 98 - 105, The
dragEvents array is never cleared so subsequent drags accumulate previous
events; inside the 'dragstart' event listener registered on src (the element
retrieved with getElementById('drag-source')), reset dragEvents (e.g., set
dragEvents = [] or dragEvents.length = 0) at the start of the handler before
pushing 'dragstart' and updating the drag-status text, ensuring each drag
interaction begins with a fresh history.
```
libs/cua-driver/Tests/integration/
├── fixtures/
│   ├── interactive.html → ../../../../cua-driver-fixtures/interactive.html
│   └── form_all_inputs.html → ../../../../cua-driver-fixtures/form_all_inputs.html
└── assets/
    └── test_page.html → ../../../../cua-driver-fixtures/test_page.html

libs/cua-driver-rs/tests/integration/
├── fixtures/
│   └── interactive.html → ../../../../cua-driver-fixtures/interactive.html
└── v2/assets/
    └── test_page.html → ../../../../../cua-driver-fixtures/test_page.html
```
Add a language tag to the fenced tree block.
The code fence starting at Line 30 is missing a language identifier (MD040), which will keep markdown lint noisy.
Suggested fix
-```
+```text
libs/cua-driver/Tests/integration/
├── fixtures/
│ ├── interactive.html → ../../../../cua-driver-fixtures/interactive.html
│ └── form_all_inputs.html → ../../../../cua-driver-fixtures/form_all_inputs.html
└── assets/
└── test_page.html → ../../../../cua-driver-fixtures/test_page.html
libs/cua-driver-rs/tests/integration/
├── fixtures/
│ └── interactive.html → ../../../../cua-driver-fixtures/interactive.html
└── v2/assets/
└── test_page.html → ../../../../../cua-driver-fixtures/test_page.html
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 30-30: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libs/cua-driver-fixtures/README.md` around lines 30 - 43, The fenced code
block in libs/cua-driver-fixtures/README.md (the tree diagram showing
libs/cua-driver/Tests/integration and libs/cua-driver-rs/tests/integration) is
missing a language tag; update the opening fence from ``` to ```text so the
block is recognized as plain text (addressing MD040) and keep the block content
unchanged.
…ling flags in launch_app
`launch_app` uses `SW_SHOWNOACTIVATE` so launched windows don't steal focus. For Chromium-based browsers (Edge / Chrome / Brave / Vivaldi / Opera / Chromium / Arc / Thorium / Iridium / Yandex) this triggers occlusion-based renderer throttling: the renderer process is suspended for the *entire* tab lifetime, the UIA tree exposes only browser chrome, and `PrintWindow` returns a blank body. Downstream tools (`get_window_state`, `screenshot`, `click`, `type_text`) all fail silently against the page content.
## Fix
When `launch_app` resolves a target naming a Chromium-based browser, auto-prepend three flags to `additional_arguments`:
```
--disable-features=CalculateNativeWinOcclusion   ← root cause
--disable-backgrounding-occluded-windows         ← backstop 1 (process priority)
--disable-renderer-backgrounding                 ← backstop 2 (renderer throttle)
```
`CalculateNativeWinOcclusion` is the root cause; the two `--disable-backgrounding-*` flags backstop the same effect through the process-priority and renderer-throttling layers because Chromium suspends renderers on multiple signals. Injecting all three matches the flag set documented at Chromium's `chrome://flags`.
Two helpers, both pure logic:
- **`is_chromium_browser_target(target)`**: matches the executable basename (case-insensitive, with/without `.exe`) against the known Chromium browser names. Handles bare names (`"msedge"`), full paths (`r"C:\...\msedge.exe"`), forward-slash paths, and round-tripped launch paths with trailing arguments (`r#""C:\...\chrome.exe" --profile-directory=..."#`). Uses `split_launchable_target` to peel args off launch_path-style targets.
- **`inject_chromium_anti_throttling_flags(extra_args)`**: prepends the three flags. Idempotent: if `--disable-features=` already exists in the caller's args, merges `CalculateNativeWinOcclusion` into it (Chromium has subtle merging rules across duplicate `--disable-features` entries, so collapsing into one entry avoids ambiguity). The boolean flags are only inserted when absent.
## Where the injection runs
After target resolution in `LaunchAppTool::run`, gated on the target having been resolved from `launch_path` / `path` / `name` (i.e. the ShellExecuteExW path). UWP/AUMID routing is skipped because the packaged Edge channel routes differently and the modern Edge ships as a desktop install that hits the ShellExecuteExW path here.
## Tests
8 new unit tests under `chromium_flag_injection_tests`:
- `detects_bare_browser_names`: all 10 known names match (case-insensitive)
- `detects_full_paths`: both `C:\...` and `C:/...` separators
- `detects_launch_path_with_trailing_args`: `"<exe>" <args>` round-trip
- `does_not_match_non_chromium_apps`: firefox, notepad, explorer, code, soffice
- `injects_three_flags_into_empty_args`: base case
- `merges_into_existing_disable_features_list`: `--disable-features=Foo,Bar` + injection = single `--disable-features=Foo,Bar,CalculateNativeWinOcclusion`
- `idempotent_when_all_flags_already_present`: second call is a no-op
- `preserves_user_url_argument_after_flags`: URL stays in args after injection
All 8 pass on the VM (13.77s test compile, 0.00s test execution).
## E2E verification (against #1619's canonical `test_page.html`)
```
launch_app(path='msedge', additional_arguments=['file:///C:/...test_page.html'])
  → pid 6708, returned without page DOM
get_window_state (after Chromium lazy-builds the tree on first AT probe)
  → 33 elements, includes:
      Document "CUA Driver Test Page v2"
      Button "Click Me" id=clicker actions=[invoke]
screenshot
  → fully painted page (Button + Text Input + Checkbox + Dropdown all visible), not a blanked body
```
Edge launched non-foreground via `SW_SHOWNOACTIVATE`; renderer was NOT occlusion-throttled; DOM constructed, exposed via UIA, and painted. This is exactly the regression-prevention case the fix targets.
Closes #1620.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
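The two helpers described in this commit message are Rust functions; the Python sketch below only models their described behavior so the matching and merging rules are easy to follow. Everything here is an assumption made for illustration: the function bodies, the quoted-path handling (a stand-in for `split_launchable_target`), and the name set, which lists only the two executables that appear in the text rather than all ten.

```python
import ntpath

# Only the two names shown in the commit message; the real list has ten.
CHROMIUM_NAMES = {"msedge", "chrome"}

FEATURES_PREFIX = "--disable-features="
OCCLUSION_FEATURE = "CalculateNativeWinOcclusion"
BOOLEAN_FLAGS = [
    "--disable-backgrounding-occluded-windows",
    "--disable-renderer-backgrounding",
]


def is_chromium_browser_target(target: str) -> bool:
    """Match the executable basename, ignoring case, path and `.exe`."""
    target = target.strip()
    if target.startswith('"'):
        # Quoted launch path with trailing args: keep what is inside the quotes.
        end = target.find('"', 1)
        target = target[1:end] if end != -1 else target[1:]
    # ntpath.basename splits on both backslashes and forward slashes.
    name = ntpath.basename(target).lower()
    if name.endswith(".exe"):
        name = name[: -len(".exe")]
    return name in CHROMIUM_NAMES


def inject_chromium_anti_throttling_flags(extra_args: list[str]) -> list[str]:
    """Return a new argument list with the three flags present exactly once."""
    args = list(extra_args)
    injected = []

    existing = next(
        (i for i, arg in enumerate(args) if arg.startswith(FEATURES_PREFIX)), None
    )
    if existing is None:
        injected.append(FEATURES_PREFIX + OCCLUSION_FEATURE)
    else:
        # Merge into the caller's entry instead of adding a duplicate switch.
        features = [f for f in args[existing][len(FEATURES_PREFIX):].split(",") if f]
        if OCCLUSION_FEATURE not in features:
            features.append(OCCLUSION_FEATURE)
        args[existing] = FEATURES_PREFIX + ",".join(features)

    # Boolean flags are only added when the caller has not supplied them.
    injected.extend(flag for flag in BOOLEAN_FLAGS if flag not in args)
    return injected + args
```

In this model, calling the injection function on its own output returns the same list, which is the idempotence property the unit tests above are described as checking.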
…ling flags in launch_app (#1624)
…nstall.ps1 PS 5.1 workaround (#1627)
* docs(cua-driver): Windows behavior notes for the v0.2.9 fix chain + install.ps1 PS 5.1 workaround
Tonight's three cua-driver-rs Windows fixes (#1620 Chromium anti-throttling flag auto-inject in `launch_app`, #1621 control-type whitelist for the `click(x, y)` UIA Invoke pre-check, #1623 SendInput routing for Chromium coord clicks) shipped in v0.2.9 without docs updates. This PR closes that gap and documents the install.ps1 PS 5.1 parse bug as a known issue.
## mcp-tools.mdx
- New top-level section `## Windows behavior notes` at the end of the reference, gathering the three cross-cutting changes:
  - `launch_app` Chromium flag list + the 10 detected browser executables
  - `click(x, y)` control-type whitelist (Button / MenuItem / Hyperlink / TabItem / ListItem / CheckBox / RadioButton / SplitButton / TreeItem) + why canvases / Panes / Customs fall through
  - SendInput on Chromium with brief foreground swap + cursor jump, the UIAccess requirement, and the `cua-driver-uia.exe` proxy default
- `hotkey`'s SendInput-routed delivery + matching UIAccess constraint
- Inline cross-references from `click`, `launch_app`, and `hotkey` pointing to the Windows behavior section so callers reading any of those tool entries see the platform-specific notes.
## installation.mdx
- Callout under the Windows install one-liner documenting #1626 (PS 5.1 parse error on `install.ps1`) with the manual-zip workaround verbatim from the issue, scoped to PS 5.1 only (PS 7+ parses fine).
Closes the standing /docs update obligation for #1619, #1620, #1621, #1623.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(cua-driver): make manual-zip PATH update idempotent (CodeRabbit on #1627)
Re-running the manual-install workaround duplicated `$dest` in the User PATH because the snippet unconditionally prepended. Guards with a `-notcontains` check before `SetEnvironmentVariable` so the entry is added at most once.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
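The idempotent PATH guard from the second commit can be modeled as a pure function. The actual workaround is a PowerShell snippet that is not reproduced here; this Python sketch only illustrates the "add at most once" logic. The function name and the example directory are hypothetical, and the case-insensitive comparison mirrors the default behavior of PowerShell's `-notcontains`.

```python
def prepend_to_path(path_value: str, dest: str) -> str:
    """Prepend `dest` to a `;`-separated PATH string unless already present."""
    entries = [entry for entry in path_value.split(";") if entry]
    # Compare case-insensitively, as PowerShell's -notcontains does by default.
    if dest.casefold() in (entry.casefold() for entry in entries):
        return path_value
    return ";".join([dest] + entries)


# Hypothetical install directory, used only for the demonstration.
dest = r"C:\tools\cua-driver"
first = prepend_to_path(r"C:\Windows;C:\bin", dest)
second = prepend_to_path(first, dest)
assert first == second
```

Running the function a second time leaves the value unchanged, which is the property the unguarded snippet lacked.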
Summary
- Moves the three canonical fixtures (`interactive.html`, `form_all_inputs.html`, `test_page.html`) to a new top-level `libs/cua-driver-fixtures/` directory, the single source of truth shared by both `cua-driver` (Swift / macOS) and `cua-driver-rs` (Rust / Windows + Linux + macOS).
- Replaces each port's copy with a relative symlink, so every existing test path (`os.path.join(_THIS_DIR, "fixtures", "interactive.html")`, `f"{html_server}/test_page.html"`) keeps working unchanged.
- Adds `gesture_panels.html`, a 140-line companion fixture using the same ID-convention style as `test_page.html`, covering the four gestures the v2 harness doesn't currently probe: hotkey + modifier-state propagation, pixel-coord pinpoint, drag-and-drop sequence, scroll. Each panel exposes its state via `window.getGesturePanelState()` for `browser_eval` readback.
Why
Both ports' integration test trees previously carried duplicate copies of the same HTML fixtures: `interactive.html` in two places, `test_page.html` in two places. The duplicate copies are byte-identical today (`diff -q` shows no differences), but with no link between them they would drift the moment one port iterated on its harness.
`gesture_panels.html` exists because the May 2026 Windows VM stress test (Notepad++/VS Code/LibreOffice/FreeCAD/Inkscape/Audacity/Krita) showed four gestures need explicit harness probes that `test_page.html` doesn't cover. Most critical is modifier-state propagation: the page-level `#hotkey-status` prints `ctrl=true|false` so SendInput-vs-PostMessage hotkey routing is observable in a single DOM assertion, directly proving #1614's architectural fix.
What this is NOT
- … `gesture_panels.html` driver (no test invokes it yet; that's a follow-up once the Chromium-on-Windows browser-eval harness is wired up. The file is added now so it lives alongside the canonical fixtures from the start).
Result
Drift between Swift and Rust port fixtures is impossible by construction: edits propagate to both ports via the single canonical copy. Net diff: +581 / -1045 lines.
Test plan
- … `test_page.html` via `html_server` (no path changes)
- … `fixtures/interactive.html` via `os.path.join`
- … `git clone` with `core.symlinks=true` (default on POSIX), the symlinks resolve to the canonical files (validated: all 5 paths report correct byte counts)
- `gesture_panels.html` opens in any browser and the four panels update their status divs on direct interaction
- … `gesture_panels.html` into a Windows-specific cua-driver-rs pytest that drives Edge with `--remote-debugging-port` for `browser_eval` readback (separate PR; needs the Chromium harness improvements documented in the README's "Known browser-coverage gaps" section)
🤖 Generated with Claude Code