
feat(cua-driver-rs)(windows)(#1620): auto-inject Chromium anti-throttling flags in launch_app#1624

Merged
f-trycua merged 1 commit into main from feat/cua-driver-rs-windows-launch-app-chromium-flags
May 21, 2026

Conversation

@f-trycua
Collaborator

@f-trycua f-trycua commented May 21, 2026

Summary

launch_app uses SW_SHOWNOACTIVATE so launched windows don't steal focus. For Chromium-based browsers (Edge / Chrome / Brave / Vivaldi / Opera / Chromium / Arc / Thorium / Iridium / Yandex) that triggers occlusion-based renderer throttling — the renderer process is suspended for the entire tab lifetime, the UIA tree exposes only browser chrome, and PrintWindow returns a blank body. Downstream tools (get_window_state, screenshot, click, type_text) all fail silently against the page content.

This PR makes launch_app auto-inject three flags whenever the resolved target names a Chromium-based browser:

--disable-features=CalculateNativeWinOcclusion   ← root cause
--disable-backgrounding-occluded-windows         ← backstop 1 (process priority)
--disable-renderer-backgrounding                  ← backstop 2 (renderer throttle)

Callers don't need to know about the Chromium quirk; the driver handles it.

Implementation

Two pure-logic helpers + a small dispatch hook in LaunchAppTool::run:

  • is_chromium_browser_target(target): case-insensitive basename match against the known Chromium browser names. Handles bare names ("msedge"), full paths (r"C:\...\msedge.exe"), forward-slash paths, and round-tripped launch paths with trailing arguments (r#""C:\...\chrome.exe" --profile-directory=..."#). Uses split_launchable_target to peel args off launch_path-style targets.

  • inject_chromium_anti_throttling_flags(extra_args): prepends the three flags. Idempotent: if --disable-features= already exists, it merges CalculateNativeWinOcclusion into that entry (Chromium has subtle merging rules across duplicate --disable-features= entries, so collapsing into one entry avoids ambiguity). The boolean flags are inserted only when absent.
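
The basename match described above could look roughly like the following. This is a minimal sketch, not the code in this PR: `CHROMIUM_BASENAMES`, `executable_part` (a simplified stand-in for `split_launchable_target`), and `is_chromium_target_sketch` are hypothetical names, and the executable stems in the list are assumptions inferred from the browser names in the summary.

```rust
/// Assumed executable stems; the PR lists browser names, not exact file names.
const CHROMIUM_BASENAMES: [&str; 10] = [
    "msedge", "chrome", "brave", "vivaldi", "opera",
    "chromium", "arc", "thorium", "iridium", "yandex",
];

/// Simplified stand-in for `split_launchable_target`: returns the executable
/// part of a target and drops any trailing arguments.
fn executable_part(target: &str) -> &str {
    let t = target.trim();
    if let Some(rest) = t.strip_prefix('"') {
        // Quoted executable: everything up to the closing quote.
        return rest.split('"').next().unwrap_or(rest);
    }
    // Unquoted: cut after the first ".exe" that is followed by whitespace or
    // the end of input, so paths containing spaces stay intact.
    let lower = t.to_ascii_lowercase();
    let mut from = 0;
    while let Some(pos) = lower[from..].find(".exe") {
        let end = from + pos + ".exe".len();
        if lower[end..].chars().next().map_or(true, char::is_whitespace) {
            return &t[..end];
        }
        from = end;
    }
    t
}

/// Case-insensitive basename match, with or without the `.exe` suffix.
fn is_chromium_target_sketch(target: &str) -> bool {
    let exe = executable_part(target);
    let base = exe
        .rsplit(|c: char| c == '\\' || c == '/')
        .next()
        .unwrap_or(exe)
        .to_ascii_lowercase();
    let stem = base.strip_suffix(".exe").unwrap_or(base.as_str());
    CHROMIUM_BASENAMES.iter().any(|name| *name == stem)
}
```

Matching on the basename rather than the full path keeps the check independent of install location, which is presumably why the PR takes that approach.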

The injection runs after target resolution in LaunchAppTool::run, gated on the target coming from launch_path / path / name (the ShellExecuteExW path). UWP/AUMID routing is skipped; the modern Edge ships as a desktop install that goes through the ShellExecuteExW path here.

Tests

8 new unit tests under chromium_flag_injection_tests covering:

  • All 10 known Chromium browser names (case-insensitive, with/without .exe)
  • Full Windows paths (C:\...\msedge.exe)
  • Forward-slash paths
  • Round-tripped launch paths with shortcut args
  • Non-Chromium apps don't match (firefox, notepad, explorer, code, soffice)
  • Injection into empty args (base case)
  • Merging into existing --disable-features=Foo,Bar
  • Idempotency (second call is a no-op)
  • URL argument preserved after injection
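
To make the injection cases in this list concrete, here is a minimal sketch of a merge-and-prepend routine. It is an illustration under stated assumptions, not the PR's implementation: `inject_flags_sketch` and the constants are hypothetical names, and the exact position of the prepended flags relative to an existing `--disable-features=` entry is a guess.

```rust
const DISABLE_FEATURES: &str = "--disable-features=";
const OCCLUSION_FEATURE: &str = "CalculateNativeWinOcclusion";
const BOOLEAN_FLAGS: [&str; 2] = [
    "--disable-backgrounding-occluded-windows",
    "--disable-renderer-backgrounding",
];

/// Hypothetical sketch: ensures the three anti-throttling flags are present,
/// keeping a single `--disable-features=` entry and leaving other args intact.
fn inject_flags_sketch(args: &mut Vec<String>) {
    let mut prefix: Vec<String> = Vec::new();

    match args.iter_mut().find(|a| a.starts_with(DISABLE_FEATURES)) {
        Some(existing) => {
            // Merge into the caller's entry instead of adding a duplicate.
            let already_listed = existing[DISABLE_FEATURES.len()..]
                .split(',')
                .any(|feature| feature == OCCLUSION_FEATURE);
            if !already_listed {
                if !existing.ends_with('=') {
                    existing.push(',');
                }
                existing.push_str(OCCLUSION_FEATURE);
            }
        }
        None => prefix.push(format!("{}{}", DISABLE_FEATURES, OCCLUSION_FEATURE)),
    }

    // Boolean flags are added only when the caller has not passed them.
    for flag in BOOLEAN_FLAGS {
        if !args.iter().any(|a| a.as_str() == flag) {
            prefix.push(flag.to_string());
        }
    }

    // Prepend the new flags so caller arguments (such as a URL) stay last.
    prefix.append(args);
    *args = prefix;
}
```

Because every step checks for presence before inserting, calling the function a second time leaves the argument list unchanged, which is the idempotency property the tests describe.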

All 8 pass on the VM (13.77s test compile, 0.00s test exec).

E2E verification

Against libs/cua-driver-fixtures/test_page.html (the canonical Swift harness, now shared per #1619):

launch_app(path='msedge', additional_arguments=['file:///C:/.../test_page.html'])
  → pid 6708, returned without page DOM
get_window_state (after Chromium lazy-builds the tree on first AT probe)
  → 33 elements, includes:
     Document "CUA Driver Test Page v2"
     Button "Click Me" id=clicker actions=[invoke]
screenshot
  → fully painted page (Button + Text Input + Checkbox + Dropdown visible),
    not a blanked body

Edge launched non-foreground via SW_SHOWNOACTIVATE; renderer was NOT occlusion-throttled; DOM constructed, exposed via UIA, and painted — the regression-prevention case the fix targets.

Related

Closes #1620.

Test plan

  • cargo test -p platform-windows --lib chromium_flag_injection_tests — 8 tests pass on the VM
  • cargo build --release -p cua-driver clean (24.79s)
  • E2E: launch_app(path='msedge', additional_arguments=['file:///...']) with no caller-supplied flags → page DOM exposed via UIA + screenshot shows rendered content
  • Optional follow-up: extend the test to confirm Chrome / Brave behavior matches Edge (only Edge was on the VM tonight)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation

    • Updated test fixture documentation for cross-platform driver support.
  • New Features

    • Added comprehensive test fixtures for form inputs, interactive gestures, and coordinate-based interactions.
    • Expanded test coverage for keyboard modifiers, drag-and-drop event sequences, and scroll position tracking.
  • Improvements

    • Enhanced Chromium browser support on Windows with automatic performance optimizations.
    • Improved UI element interaction handling for better pixel-coordinate accuracy and control type recognition.

@vercel
Contributor

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Ignored Ignored Preview May 21, 2026 12:08pm

@coderabbitai
Contributor

coderabbitai Bot commented May 21, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b7eecdbf-9054-4256-a533-ab0414731af1

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR consolidates HTML test fixtures from both Swift and Rust ports into a shared canonical directory, updates Windows platform driver behavior for hidden Chromium browsers and UIA interactions, and documents fixture coverage and browser compatibility gaps.

Changes

Canonical fixture consolidation and documentation

Layer / File(s) Summary
Fixture documentation and coverage guide
libs/cua-driver-fixtures/README.md
README establishes the canonical fixture directory, documents how both ports reference fixtures via relative paths, introduces gesture_panels.html and its state-polling interface, enumerates Windows/Chromium environment gaps (GPU occlusion, accessibility detection, hotkey integrity), and provides guidance for adding new fixtures.
Canonical HTML fixture files
libs/cua-driver-fixtures/form_all_inputs.html, libs/cua-driver-fixtures/gesture_panels.html, libs/cua-driver-fixtures/interactive.html, libs/cua-driver-fixtures/test_page.html
form_all_inputs.html provides comprehensive form field coverage (text, password, email, number, tel, textarea, select, checkbox, radio, range, date, color) with handleSubmit() and getFieldValues() accessors; gesture_panels.html defines four gesture probes (hotkey/modifier capture, pixel-coordinate pinpoint, drag-and-drop sequence, scroll monitoring) exposing state via getGesturePanelState(); interactive.html provides simple click counter and text mirroring; test_page.html offers comprehensive UI coverage (click, input, checkbox, dropdown, textarea, link, canvas) for baseline integration testing.
Fixture pointer symlinks in both ports
libs/cua-driver-rs/tests/integration/v2/assets/test_page.html, libs/cua-driver/Tests/integration/assets/test_page.html, libs/cua-driver/Tests/integration/fixtures/form_all_inputs.html
Both cua-driver (Swift) and cua-driver-rs (Rust) test directories now reference the canonical fixtures via single-line relative-path pointers instead of maintaining duplicate copies; old local interactive.html instances removed.

Windows platform improvements

Layer / File(s) Summary
Chromium anti-throttling flag injection
libs/cua-driver-rs/crates/platform-windows/src/tools/impl_.rs
launch_app now auto-detects Chromium browsers (msedge, chrome, brave, opera, vivaldi, chromium, arc) by resolving executable basename and registry shortcuts, and injects three anti-occlusion flags (--disable-features=CalculateNativeWinOcclusion, --disable-backgrounding-occluded-windows, --disable-renderer-backgrounding) into additional_arguments with deduplication and merging into existing --disable-features= entries. Flag injection is gated to desktop-style launches (not UWP/package routing) and is verified by comprehensive unit tests covering target matching and injection behavior.
UIA coordinate-independent action selection and invocation
libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs
try_invoke_in_window_at_point now accepts descendants with either InvokePattern or ExpandCollapsePattern, classifies elements by control type to identify coordinate-independent actions (canvas, container-like surfaces), filters candidates by coordinate-independence when using screen coordinates, and updates invocation to prefer ExpandCollapse.Expand() over Invoke when both patterns exist, with proper fallback handling. UIA control-type imports expanded to support the new classification logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

The PR spans fixture creation (straightforward HTML/documentation), fixture reorganization (systematic pointer/symlink changes across two port directories), and two independent Windows driver features (Chromium detection+flag injection with comprehensive tests, and UIA selection+invocation logic with moderate complexity). While the total file count is moderately high, each section is coherent and reviewable in sequence without extensive cross-file reasoning.

Possibly related PRs

  • trycua/cua#1549: Also modifies libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs to improve UIA hit-testing behavior via coordinate-independent action selection and pattern preference.
  • trycua/cua#1375: Depends on form_all_inputs.html fixture from this PR; adds background Safari form-fill assertions in test_hermes_form_fill.py that consume handleSubmit(), window._submitted, and getFieldValues().
  • trycua/cua#1551: Also extends try_invoke_in_window_at_point in the same file to refine window-scoped UIA invocation logic.

Poem

🐰 Fixtures once scattered, now one canonical home,
Both Swift and Rust ports no longer roam,
Gesture probes capture hotkeys, clicks, and scrolls,
While Chromium wakes from occlusion's throttling tolls!
UIA patterns expand with wisdom anew.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: auto-injection of Chromium anti-throttling flags in the Windows launch_app implementation.
Linked Issues check ✅ Passed The PR addresses both #1619 (fixture consolidation and gesture_panels.html addition) and #1620 (Chromium flag auto-injection for Windows launch_app) with comprehensive implementation of all requirements.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the linked issues: fixture consolidation (#1619), gesture panel addition (#1619), Chromium flag injection (#1620), and supporting Windows UIA click behavior improvements.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@libs/cua-driver-fixtures/gesture_panels.html`:
- Around line 98-105: The drag event history array dragEvents is persistent
across runs; reset it at the start of each drag by reinitializing dragEvents =
[] inside the src.addEventListener('dragstart', ...) handler before pushing the
first 'dragstart', then update
document.getElementById('drag-status').textContent as before so each drag run is
deterministic and doesn't include stale events.

In `@libs/cua-driver-fixtures/README.md`:
- Around line 30-43: Add a language tag to the fenced code block that contains
the directory tree beginning with "libs/cua-driver/Tests/integration/" (the
README.md snippet showing fixtures/assets and cua-driver-rs paths); change the
opening triple backticks to include a language token such as "text" (e.g.,
```text) so markdownlint MD040 is satisfied while leaving the block content
unchanged.

In `@libs/cua-driver-rs/crates/platform-windows/src/tools/impl_.rs`:
- Around line 837-843: The Chromium anti-throttling flags are skipped for
launches that supply a non-AUMID bundle_id because the current condition only
looks at launch_path_opt, path_opt, or name_opt; update the conditional so
launches with a bundle_id also enter the Chromium-injection branch.
Specifically, in the block that checks if launch_path_opt.is_some() ||
path_opt.is_some() || name_opt.is_some(), include the bundle_id option (or
ensure name_opt is set from bundle_id) so that when let Some(t) =
target.as_deref() and is_chromium_browser_target(t) is true you still call
inject_chromium_anti_throttling_flags(&mut extra_args); keep the same
target/is_chromium_browser_target and inject_chromium_anti_throttling_flags
calls but broaden the presence check to include bundle_id.

In `@libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs`:
- Around line 182-211: The inserted helper is now between the existing doc
comment and pub fn try_invoke_in_window_at_point, so the long doc got attached
to is_coord_independent_action and try_invoke_in_window_at_point is left
undocumented and stale; fix by moving is_coord_independent_action (and its short
doc) either before the big doc block or after pub fn
try_invoke_in_window_at_point so the /// block correctly documents
try_invoke_in_window_at_point, and update the doc text to mention that the
routine accepts InvokePattern OR ExpandCollapsePattern and that it filters by
coord-independent control types (reference the functions
is_coord_independent_action and try_invoke_in_window_at_point and the
Invoke/Expand patterns in the updated wording).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: beb8adff-e4d3-49b5-a02b-eef5e127dc66

📥 Commits

Reviewing files that changed from the base of the PR and between 5e9afd6 and 46f72c9.

📒 Files selected for processing (17)
  • libs/cua-driver-fixtures/README.md
  • libs/cua-driver-fixtures/form_all_inputs.html
  • libs/cua-driver-fixtures/gesture_panels.html
  • libs/cua-driver-fixtures/interactive.html
  • libs/cua-driver-fixtures/test_page.html
  • libs/cua-driver-rs/crates/platform-windows/src/tools/impl_.rs
  • libs/cua-driver-rs/crates/platform-windows/src/uia/windows_enum.rs
  • libs/cua-driver-rs/tests/integration/fixtures/interactive.html
  • libs/cua-driver-rs/tests/integration/fixtures/interactive.html
  • libs/cua-driver-rs/tests/integration/v2/assets/test_page.html
  • libs/cua-driver-rs/tests/integration/v2/assets/test_page.html
  • libs/cua-driver/Tests/integration/assets/test_page.html
  • libs/cua-driver/Tests/integration/assets/test_page.html
  • libs/cua-driver/Tests/integration/fixtures/form_all_inputs.html
  • libs/cua-driver/Tests/integration/fixtures/form_all_inputs.html
  • libs/cua-driver/Tests/integration/fixtures/interactive.html
  • libs/cua-driver/Tests/integration/fixtures/interactive.html

Comment on lines +98 to +105
var dragEvents = [];
var src = document.getElementById('drag-source');
var tgt = document.getElementById('drag-target');
src.addEventListener('dragstart', function(e) {
  dragEvents.push('dragstart');
  e.dataTransfer.setData('text/plain', 'DRAG ME');
  document.getElementById('drag-status').textContent = 'drag: ' + dragEvents.join(' → ');
});

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reset drag event history per drag run.

dragEvents keeps growing across multiple drags, so later assertions may include stale events. Reinitialize it in dragstart to keep each probe deterministic.

Suggested diff
 src.addEventListener('dragstart', function(e) {
+  dragEvents = [];
   dragEvents.push('dragstart');
   e.dataTransfer.setData('text/plain', 'DRAG ME');
   document.getElementById('drag-status').textContent = 'drag: ' + dragEvents.join(' → ');
 });

Comment on lines +30 to +43
```
libs/cua-driver/Tests/integration/
├── fixtures/
│   ├── interactive.html      → ../../../../cua-driver-fixtures/interactive.html
│   └── form_all_inputs.html  → ../../../../cua-driver-fixtures/form_all_inputs.html
└── assets/
    └── test_page.html        → ../../../../cua-driver-fixtures/test_page.html

libs/cua-driver-rs/tests/integration/
├── fixtures/
│   └── interactive.html      → ../../../../cua-driver-fixtures/interactive.html
└── v2/assets/
    └── test_page.html        → ../../../../../cua-driver-fixtures/test_page.html
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced code block.

The block starting at Line 30 is missing a fence language, which triggers markdownlint MD040.

Suggested diff
-```
+```text
 libs/cua-driver/Tests/integration/
 ├── fixtures/
 │   ├── interactive.html      → ../../../../cua-driver-fixtures/interactive.html
 │   └── form_all_inputs.html  → ../../../../cua-driver-fixtures/form_all_inputs.html
 └── assets/
     └── test_page.html        → ../../../../cua-driver-fixtures/test_page.html

 libs/cua-driver-rs/tests/integration/
 ├── fixtures/
 │   └── interactive.html      → ../../../../cua-driver-fixtures/interactive.html
 └── v2/assets/
     └── test_page.html        → ../../../../../cua-driver-fixtures/test_page.html

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 30-30: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Comment on lines +837 to +843
if launch_path_opt.is_some() || path_opt.is_some() || name_opt.is_some() {
    if let Some(t) = target.as_deref() {
        if is_chromium_browser_target(t) {
            inject_chromium_anti_throttling_flags(&mut extra_args);
        }
    }
}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include non-AUMID bundle_id launches in the Chromium injection path.

Line 837 excludes bundle_id, even though this tool documents non-AUMID bundle_id as a name alias. That means launch_app({ "bundle_id": "chrome" }) still goes through the ShellExecuteEx path without the anti-throttling flags, so hidden Chromium launches via that supported entry point keep the renderer-occlusion behavior this PR is meant to fix.

🔧 Suggested fix
-        if launch_path_opt.is_some() || path_opt.is_some() || name_opt.is_some() {
+        let uses_shell_execute_target = launch_path_opt.is_some()
+            || path_opt.is_some()
+            || name_opt.is_some()
+            || bundle_id_opt
+                .as_deref()
+                .is_some_and(|s| !crate::launch_uwp::is_aumid(s));
+        if uses_shell_execute_target {
             if let Some(t) = target.as_deref() {
                 if is_chromium_browser_target(t) {
                     inject_chromium_anti_throttling_flags(&mut extra_args);
                 }
             }
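
The broadened gate suggested above can be exercised in isolation. The following is a self-contained sketch under stated assumptions: `LaunchRequest` and `looks_like_aumid` are hypothetical, and the `!` check is only a rough stand-in for `crate::launch_uwp::is_aumid`, based on the common `PackageFamilyName!AppId` shape of packaged-app AUMIDs. The real helper may use different rules.

```rust
/// Hypothetical container for the four target-selecting inputs.
#[derive(Default)]
struct LaunchRequest<'a> {
    launch_path: Option<&'a str>,
    path: Option<&'a str>,
    name: Option<&'a str>,
    bundle_id: Option<&'a str>,
}

/// Assumed heuristic: packaged-app AUMIDs usually look like
/// `PackageFamilyName!AppId`. Not the driver's actual `is_aumid`.
fn looks_like_aumid(s: &str) -> bool {
    s.contains('!')
}

/// True when the launch would go through the ShellExecuteExW path, including
/// a `bundle_id` that is used as a plain name alias rather than an AUMID.
fn uses_shell_execute_target(req: &LaunchRequest<'_>) -> bool {
    req.launch_path.is_some()
        || req.path.is_some()
        || req.name.is_some()
        || req.bundle_id.is_some_and(|s| !looks_like_aumid(s))
}
```

With this predicate, a request carrying only `bundle_id: "chrome"` passes the gate, while one carrying only an AUMID-shaped `bundle_id` does not.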

Comment on lines 182 to 211
/// Returns `true` when the element's control type has a *coord-independent*
/// primary action — i.e. a UIA `Invoke()` on it does something semantically
/// equivalent to "click the element" regardless of where inside its bounding
/// rectangle the click was requested.
///
/// Used by the `x, y` click path to decide whether to take the UIA Invoke
/// route or fall through to PostMessage with the literal coords. The split
/// matters for canvases, panes, and custom-drawn surfaces where Invoke would
/// fire `mousedown` at the element centre — losing the caller's pixel
/// precision (see #1621).
fn is_coord_independent_action(elem: &IUIAutomationElement) -> bool {
    let ct: UIA_CONTROLTYPE_ID = match unsafe { elem.CurrentControlType() } {
        Ok(t) => t,
        Err(_) => return false,
    };
    matches!(
        ct,
        UIA_ButtonControlTypeId
            | UIA_MenuItemControlTypeId
            | UIA_HyperlinkControlTypeId
            | UIA_TabItemControlTypeId
            | UIA_ListItemControlTypeId
            | UIA_CheckBoxControlTypeId
            | UIA_RadioButtonControlTypeId
            | UIA_SplitButtonControlTypeId
            | UIA_TreeItemControlTypeId
    )
}

pub fn try_invoke_in_window_at_point(hwnd: isize, sx: i32, sy: i32) -> bool {

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

New helper inserted between try_invoke_in_window_at_point's doc block and its definition — doc now misattributed and stale.

The pre-existing /// block at lines 142–181 was the doc for try_invoke_in_window_at_point. Rust attaches /// to the immediately following item, so by inserting is_coord_independent_action (line 192) between that doc and pub fn try_invoke_in_window_at_point (line 211), two things happen:

  1. The entire 142–191 block now documents is_coord_independent_action, which is jarring (most of it describes a windowed UIA invoke walk, not a control-type classifier), and try_invoke_in_window_at_point is left undocumented.
  2. The wording inside that block is now stale anyway — line 144 (fire 'Invoke' on the deepest descendant ... AND which supports 'InvokePattern') and lines 178–181 (exposes 'InvokePattern'. Smallest-area approximates "deepest"...) no longer reflect the new behavior, which accepts InvokePattern OR ExpandCollapsePattern and filters by coord-independent control type.

Move the new helper out of the way (e.g. place is_coord_independent_action and its doc before the existing 142–181 block, or after try_invoke_in_window_at_point) and refresh the stale wording to mention ExpandCollapse + the coord-independent filter.

📝 Suggested re-ordering (place helper before the doc block)
+/// Returns `true` when the element's control type has a *coord-independent*
+/// primary action — i.e. a UIA `Invoke()` on it does something semantically
+/// equivalent to "click the element" regardless of where inside its bounding
+/// rectangle the click was requested.
+///
+/// Used by the `x, y` click path to decide whether to take the UIA Invoke
+/// route or fall through to PostMessage with the literal coords. The split
+/// matters for canvases, panes, and custom-drawn surfaces where Invoke would
+/// fire `mousedown` at the element centre — losing the caller's pixel
+/// precision (see `#1621`).
+fn is_coord_independent_action(elem: &IUIAutomationElement) -> bool {
+    let ct: UIA_CONTROLTYPE_ID = match unsafe { elem.CurrentControlType() } {
+        Ok(t) => t,
+        Err(_) => return false,
+    };
+    matches!(
+        ct,
+        UIA_ButtonControlTypeId
+            | UIA_MenuItemControlTypeId
+            | UIA_HyperlinkControlTypeId
+            | UIA_TabItemControlTypeId
+            | UIA_ListItemControlTypeId
+            | UIA_CheckBoxControlTypeId
+            | UIA_RadioButtonControlTypeId
+            | UIA_SplitButtonControlTypeId
+            | UIA_TreeItemControlTypeId
+    )
+}
+
 /// Hit-test screen point `(sx, sy)` against the UIA subtree rooted at
-/// `hwnd` and fire `Invoke` on the deepest descendant whose bounding
-/// rect contains the point AND which supports `InvokePattern`. Returns
-/// `true` iff such an element was found and `Invoke()` succeeded.
+/// `hwnd` and activate the deepest descendant whose bounding rect contains
+/// the point AND which supports `InvokePattern` or `ExpandCollapsePattern`
+/// AND whose control type has a coord-independent primary action (see
+/// `is_coord_independent_action`). Returns `true` iff such an element was
+/// found and `Invoke()` / `Expand()` succeeded.
 ...
-/// Returns `true` when the element's control type has a *coord-independent*
-/// primary action — i.e. a UIA `Invoke()` on it does something semantically
-/// equivalent to "click the element" regardless of where inside its bounding
-/// rectangle the click was requested.
-///
-/// Used by the `x, y` click path to decide whether to take the UIA Invoke
-/// route or fall through to PostMessage with the literal coords. The split
-/// matters for canvases, panes, and custom-drawn surfaces where Invoke would
-/// fire `mousedown` at the element centre — losing the caller's pixel
-/// precision (see `#1621`).
-fn is_coord_independent_action(elem: &IUIAutomationElement) -> bool {
-    ...
-}
-
 pub fn try_invoke_in_window_at_point(hwnd: isize, sx: i32, sy: i32) -> bool {

Also refresh lines 178–181 to reference Invoke/Expand and the coord-independent filter rather than just InvokePattern.

…ling flags in launch_app

`launch_app` uses `SW_SHOWNOACTIVATE` so launched windows don't steal
focus. For Chromium-based browsers (Edge / Chrome / Brave / Vivaldi /
Opera / Chromium / Arc / Thorium / Iridium / Yandex) this triggers
occlusion-based renderer throttling: the renderer process is suspended
for the *entire* tab lifetime, the UIA tree exposes only browser chrome,
and `PrintWindow` returns a blank body. Downstream tools
(`get_window_state`, `screenshot`, `click`, `type_text`) all fail
silently against the page content.

## Fix

When `launch_app` resolves a target naming a Chromium-based browser,
auto-prepend three flags to `additional_arguments`:

```
--disable-features=CalculateNativeWinOcclusion   ← root cause
--disable-backgrounding-occluded-windows         ← backstop 1 (process priority)
--disable-renderer-backgrounding                  ← backstop 2 (renderer throttle)
```

`CalculateNativeWinOcclusion` is the root cause;
`--disable-backgrounding-occluded-windows` and
`--disable-renderer-backgrounding` backstop the same effect through the
process-priority and renderer-throttling layers, because Chromium
suspends renderers on multiple signals. Injecting all three matches the
flag set documented at Chromium's `chrome://flags`.

Two helpers, both pure logic:

- **`is_chromium_browser_target(target)`** — matches the executable
  basename (case-insensitive, with/without `.exe`) against the known
  Chromium browser names. Handles bare names (`"msedge"`), full paths
  (`r"C:\...\msedge.exe"`), forward-slash paths, and round-tripped
  launch paths with trailing arguments (`r#""C:\...\chrome.exe"
  --profile-directory=..."#`). Uses `split_launchable_target` to peel
  args off launch_path-style targets.

- **`inject_chromium_anti_throttling_flags(extra_args)`** — prepends the
  three flags. Idempotent: if `--disable-features=` already exists in
  the caller's args, merges `CalculateNativeWinOcclusion` into it
  (Chromium has subtle merging rules across duplicate `--disable-features`
  entries — collapsing into one entry avoids ambiguity). The boolean
  flags are only inserted when absent.
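
A minimal sketch of the basename matching described for `is_chromium_browser_target`, not the merged implementation. Assumptions: the name list below is a guess apart from `msedge` and `chrome`, which appear in this description, and `split_exe_from_args` is a hypothetical stand-in for `split_launchable_target` that only handles the quoted `"<exe>" <args>` form.

```rust
/// Lowercase executable stems treated as Chromium-based browsers.
/// ASSUMPTION: only `msedge` and `chrome` are confirmed by the PR text;
/// the remaining stems are illustrative guesses.
const CHROMIUM_BROWSERS: &[&str] = &[
    "msedge", "chrome", "brave", "vivaldi", "opera",
    "chromium", "arc", "thorium", "iridium", "yandex",
];

/// HYPOTHETICAL stand-in for `split_launchable_target`: returns the
/// executable part of a `"<exe>" <args>` string. Unquoted input is
/// returned whole, because unquoted Windows paths may contain spaces.
fn split_exe_from_args(target: &str) -> &str {
    let trimmed = target.trim();
    match trimmed.strip_prefix('"') {
        // Quoted executable: take everything up to the closing quote.
        Some(rest) => match rest.find('"') {
            Some(end) => &rest[..end],
            None => rest,
        },
        None => trimmed,
    }
}

/// Case-insensitive basename match, with or without `.exe`.
fn is_chromium_browser_target(target: &str) -> bool {
    let exe = split_exe_from_args(target);
    // Basename is the text after the last `\` or `/` separator.
    let basename = exe
        .rsplit(|c: char| c == '\\' || c == '/')
        .next()
        .unwrap_or(exe);
    let lower = basename.to_ascii_lowercase();
    let stem = lower.strip_suffix(".exe").unwrap_or(lower.as_str());
    CHROMIUM_BROWSERS.iter().any(|&name| name == stem)
}
```

Matching on the basename rather than a substring keeps a path such as `C:\chrome-tools\notepad.exe` from being misdetected.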

## Where the injection runs

The hook runs after target resolution in `LaunchAppTool::run`, and only
when the target was resolved from `launch_path` / `path` / `name` (i.e.
the ShellExecuteExW path). UWP/AUMID routing is skipped: the packaged
Edge channel routes differently, and modern Edge ships as a desktop
install that already takes the ShellExecuteExW path.
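
The gate can be pictured roughly as below. This is a sketch under assumptions: `TargetSource` and `should_inject_flags` are hypothetical names for illustration, and the real `LaunchAppTool::run` may represent the resolution route differently.

```rust
/// HYPOTHETICAL: which request field the launch target was resolved from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TargetSource {
    LaunchPath,
    Path,
    Name,
    /// UWP / AUMID activation, which does not go through ShellExecuteExW.
    Aumid,
}

/// Inject only on the ShellExecuteExW route, and only for Chromium targets.
fn should_inject_flags(source: TargetSource, is_chromium_target: bool) -> bool {
    let shell_execute_route = matches!(
        source,
        TargetSource::LaunchPath | TargetSource::Path | TargetSource::Name
    );
    shell_execute_route && is_chromium_target
}
```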

## Tests

8 new unit tests under `chromium_flag_injection_tests`:
- `detects_bare_browser_names` — all 10 known names match (case-insensitive)
- `detects_full_paths` — both `C:\...` and `C:/...` separators
- `detects_launch_path_with_trailing_args` — `"<exe>" <args>` round-trip
- `does_not_match_non_chromium_apps` — firefox, notepad, explorer, code, soffice
- `injects_three_flags_into_empty_args` — base case
- `merges_into_existing_disable_features_list` — `--disable-features=Foo,Bar`
  + injection = single `--disable-features=Foo,Bar,CalculateNativeWinOcclusion`
- `idempotent_when_all_flags_already_present` — second call is a no-op
- `preserves_user_url_argument_after_flags` — URL stays in args after injection

All 8 pass on the VM (13.77s test compile, 0.00s test execution).
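
The merge and idempotency behaviour these tests exercise can be sketched as follows. This is an illustration, not the merged code. It assumes a slice-in, `Vec`-out signature, and it assumes that an existing `--disable-features=` entry is edited in place rather than moved to the front.

```rust
const OCCLUSION_FEATURE: &str = "CalculateNativeWinOcclusion";
const DISABLE_FEATURES_PREFIX: &str = "--disable-features=";
const BOOLEAN_FLAGS: [&str; 2] = [
    "--disable-backgrounding-occluded-windows",
    "--disable-renderer-backgrounding",
];

/// Returns the caller's args with the anti-throttling flags prepended.
/// ASSUMPTION: the signature and in-place merge position are illustrative.
fn inject_chromium_anti_throttling_flags(extra_args: &[String]) -> Vec<String> {
    let mut caller_args: Vec<String> = extra_args.to_vec();
    let mut injected: Vec<String> = Vec::new();

    // Collapse into a single --disable-features= entry when one exists.
    match caller_args
        .iter_mut()
        .find(|a| a.starts_with(DISABLE_FEATURES_PREFIX))
    {
        Some(existing) => {
            let (has_feature, list_is_empty) = {
                let list = &existing[DISABLE_FEATURES_PREFIX.len()..];
                (
                    list.split(',').any(|f| f == OCCLUSION_FEATURE),
                    list.is_empty(),
                )
            };
            if !has_feature {
                if !list_is_empty {
                    existing.push(',');
                }
                existing.push_str(OCCLUSION_FEATURE);
            }
        }
        None => injected.push(format!(
            "{}{}",
            DISABLE_FEATURES_PREFIX, OCCLUSION_FEATURE
        )),
    }

    // Boolean switches are added only when the caller has not passed them.
    for &flag in BOOLEAN_FLAGS.iter() {
        if !caller_args.iter().any(|a| a == flag) {
            injected.push(flag.to_string());
        }
    }

    // Injected flags go first; the caller's args (e.g. a URL) keep their order.
    injected.extend(caller_args);
    injected
}
```

Because every flag is checked for presence before insertion, feeding the output back into the function returns it unchanged, which is the property `idempotent_when_all_flags_already_present` describes.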

## E2E verification (against #1619's canonical `test_page.html`)

```
launch_app(path='msedge', additional_arguments=['file:///C:/...test_page.html'])
  → pid 6708, returned without page DOM
get_window_state (after Chromium lazy-builds the tree on first AT probe)
  → 33 elements, includes:
     Document "CUA Driver Test Page v2"
     Button "Click Me" id=clicker actions=[invoke]
screenshot
  → fully painted page (Button + Text Input + Checkbox + Dropdown all visible),
    not a blanked body
```

Edge launched non-foreground via `SW_SHOWNOACTIVATE`; renderer was NOT
occlusion-throttled; DOM constructed, exposed via UIA, and painted —
exactly the regression-prevention case the fix targets.

Closes #1620.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@f-trycua f-trycua force-pushed the feat/cua-driver-rs-windows-launch-app-chromium-flags branch from 46f72c9 to 3c19a7a Compare May 21, 2026 12:07
@f-trycua f-trycua merged commit 1fc348c into main May 21, 2026
5 checks passed
@f-trycua f-trycua deleted the feat/cua-driver-rs-windows-launch-app-chromium-flags branch May 21, 2026 12:08
