perf: speed up iOS replay runner by thymikee · Pull Request #557 · callstack/agent-device

thymikee · 2026-05-18T17:39:31Z

Summary

Speeds up the iOS replay runner while keeping the reliability fixes that fell out of the Maestro comparison.

High-level changes that moved the needle:

Reuses XCTest runner build products safely with version/source/build-setting metadata and artifact manifests, so normal local runs avoid DerivedData cleanup and recursive xctestrun scans.
Adds build-time artifact manifests too, so pnpm build:xcuitest leaves the runtime cache in the fast-path state instead of forcing one follow-up scan.
Routes simple iOS selectors (id, label, text, value) through direct runner queries/taps for click, get, is, and the first wait probe, while preserving snapshot fallback for misses and transport failures.
Keeps wait/find snapshot polling fresh, so the 750ms snapshot cache cannot return stale UI during selector polling.
Preserves hard selector failures like ambiguous matches instead of silently falling back to snapshot and potentially touching/reading a different node.
Scopes replay actions to the active runner/provider, reducing repeated provider/session resolution in replay loops.
Keeps the iOS runner app headless for app sessions when an appBundleId is known, avoiding the visible helper-app hop.
Adds gesture stabilization and keyboard-aware dragging so faster replay actions still wait for the UI state that matters.
Improves iOS text entry by pinning coordinate-targeted fields, waiting for keyboard readiness, verifying append/replacement where appropriate, and repairing only through the pinned target.
Adds Maestro YAML replay support for the benchmark subset so the same form flow can be compared against Maestro.
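
The selector routing described above (direct runner query first, snapshot fallback for misses and transport failures, no fallback for hard failures such as ambiguous matches) can be pictured with a minimal sketch. Every name here (`DirectOutcome`, `resolveTarget`, `SelectorError`, and the injected `direct` and `snapshot` callbacks) is hypothetical and not taken from the agent-device source; the sketch only assumes that a direct query ends in a hit, a miss, a transport failure, or an ambiguous match.

```typescript
// Hypothetical outcome of a direct runner query; not the real agent-device types.
type DirectOutcome =
  | { kind: 'hit'; nodeId: string }
  | { kind: 'miss' }
  | { kind: 'transport-failure'; message: string }
  | { kind: 'ambiguous'; matches: number };

type ResolveResult = { via: 'direct' | 'snapshot'; nodeId: string };

class SelectorError extends Error {}

function resolveTarget(
  selector: string,
  direct: (selector: string) => DirectOutcome,
  snapshot: (selector: string) => string | undefined,
): ResolveResult {
  const outcome = direct(selector);
  switch (outcome.kind) {
    case 'hit':
      // Fast path: the runner resolved the selector on its own.
      return { via: 'direct', nodeId: outcome.nodeId };
    case 'ambiguous':
      // Hard failure: falling back here could touch or read a different node.
      throw new SelectorError(`selector ${selector} matched ${outcome.matches} nodes`);
    case 'miss':
    case 'transport-failure': {
      // Soft failure: let the slower snapshot path have a try.
      const nodeId = snapshot(selector);
      if (nodeId === undefined) {
        throw new SelectorError(`no match for selector ${selector}`);
      }
      return { via: 'snapshot', nodeId };
    }
  }
}
```

Modelling the outcome as a tagged union keeps the fallback decision in one `switch`, so a new hard-failure kind cannot slip into the fallback branch unnoticed.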

Touched scope: 64 files. Scope expanded beyond the original benchmark prototype because the comparison exposed iOS runner startup, selector fallback, gesture, and text-entry reliability issues that affected the measured path.

Validation

Validated with cache-focused unit coverage, selector/interaction unit coverage, tooling checks, XCTest builds, and repeated iOS benchmark runs.

Checks that passed:

pnpm exec vitest run src/platforms/ios/__tests__/runner-client.test.ts src/platforms/ios/__tests__/runner-xctestrun.test.ts src/platforms/ios/__tests__/index.test.ts src/daemon/handlers/__tests__/snapshot-handler.test.ts
pnpm exec vitest run src/daemon/handlers/__tests__/interaction.test.ts src/daemon/handlers/__tests__/snapshot-handler.test.ts src/utils/__tests__/interactors.test.ts
node --check scripts/write-xcuitest-cache-metadata.mjs
pnpm build:xcuitest
pnpm format
pnpm check:quick
zsh -n .tmp/maestro-prototype/run-ios-benchmark.zsh
git diff --check

Cache reliability notes:

The xctestrun cache is invalidated by package version, runner source fingerprint, platform/device kind, destination family, bundle ids, signing settings, and performance build settings.
Cached artifact paths are trusted only when the xctestrun and every referenced product still exist with matching mtimes.
Stale/missing manifests fall back to scan or rebuild; package-version mismatches rebuild; stale worktree xctestruns are rejected; expected macOS runner repair failures rebuild once.
The only cache issue found in this pass was performance-only: build-time metadata previously omitted artifact paths, so the next runtime command had to scan DerivedData once. This PR now writes artifact manifests from the build script too. Verified locally that both iOS and macOS metadata contain an xctestrun path and three existing product paths after pnpm build:xcuitest.
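
The trust rule above (cached paths are used only when the xctestrun and every referenced product still exist with matching mtimes) could look roughly like the following sketch. The manifest shape, the function names, and the file names in the demo are assumptions made for illustration; the PR does not show the actual manifest format.

```typescript
import { mkdtempSync, rmSync, statSync, utimesSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical manifest shape: one xctestrun entry plus the products it references.
interface ArtifactEntry {
  path: string;
  mtimeMs: number;
}

interface ArtifactManifest {
  xctestrun: ArtifactEntry;
  products: ArtifactEntry[];
}

// Capture the path and current mtime at build time.
function recordArtifact(path: string): ArtifactEntry {
  return { path, mtimeMs: statSync(path).mtimeMs };
}

// An entry is current only if the file still exists with the recorded mtime.
function entryIsCurrent(entry: ArtifactEntry): boolean {
  // throwIfNoEntry: false makes statSync return undefined for a missing file.
  const stats = statSync(entry.path, { throwIfNoEntry: false });
  return stats !== undefined && stats.mtimeMs === entry.mtimeMs;
}

// Trust the manifest only when every referenced artifact is current.
function manifestIsTrusted(manifest: ArtifactManifest): boolean {
  return [manifest.xctestrun, ...manifest.products].every(entryIsCurrent);
}

// Demo with placeholder files in a temporary directory.
const demoDir = mkdtempSync(join(tmpdir(), 'artifact-manifest-'));
const xctestrunPath = join(demoDir, 'Runner.xctestrun');
const productPath = join(demoDir, 'Runner.app');
writeFileSync(xctestrunPath, '');
writeFileSync(productPath, '');

const manifest: ArtifactManifest = {
  xctestrun: recordArtifact(xctestrunPath),
  products: [recordArtifact(productPath)],
};

const trustedAfterRecord = manifestIsTrusted(manifest); // true: nothing changed yet
utimesSync(productPath, new Date(0), new Date(1000)); // rewrite the product mtime
const trustedAfterTouch = manifestIsTrusted(manifest); // false: mtime mismatch
rmSync(demoDir, { recursive: true, force: true });
const trustedAfterDelete = manifestIsTrusted(manifest); // false: files are gone

console.log({ trustedAfterRecord, trustedAfterTouch, trustedAfterDelete });
```

When the check returns false, the notes above describe falling back to a scan or rebuild rather than reusing the cached paths.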

Latest serial benchmark, after dropping one local close outlier:

| Case | Device / port | Average | Improvement vs 0.14.9 |
| --- | --- | --- | --- |
| Local worktree agent-device | iPhone 17 Pro / 8082 | 15.223s | +74.4% |
| Maestro | iPhone 17 / 8083 | 21.306s | +64.1% |
| Global agent-device 0.14.9 baseline | iPhone 16e / 8084 | 59.372s | baseline |
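
The improvement column is consistent with a relative reduction in average time against the 0.14.9 baseline. That formula is an assumption, since the PR does not state how the column was computed; the helper name below is hypothetical.

```typescript
// Relative time reduction versus the baseline, in percent (assumed formula).
function improvementPercent(baselineSeconds: number, candidateSeconds: number): number {
  return ((baselineSeconds - candidateSeconds) / baselineSeconds) * 100;
}

const baselineSeconds = 59.372; // Global agent-device 0.14.9

const localImprovement = improvementPercent(baselineSeconds, 15.223).toFixed(1);
const maestroImprovement = improvementPercent(baselineSeconds, 21.306).toFixed(1);

console.log(`Local worktree: +${localImprovement}%`); // Local worktree: +74.4%
console.log(`Maestro: +${maestroImprovement}%`); // Maestro: +64.1%
```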

Parallel benchmark mode is also available for visual side-by-side inspection, but it is not a fair timing source on this machine: under three-simulator load, the local runner replay stayed stable at around 13.7s while app readiness moved between open and explicit wait.

github-actions · 2026-05-18T17:41:28Z

PR Preview Action v1.8.1
🚀 View preview at https://callstackincubator.github.io/agent-device/pr-preview/pr-557/
Built to branch `gh-pages` at 2026-05-18 21:04 UTC. Preview will be ready when the GitHub Pages deployment is complete.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ecba5906b8


chatgpt-codex-connector · 2026-05-18T17:45:07Z

+    while (Date.now() - startedAt < timeoutMs) {
+      const result = await queryDirectIosSelector(params, params.session!, selector);
+      if (result?.found) {
+        const payload = {
+          kind: 'selector',
+          selector: selector.raw,
+          waitedMs: Date.now() - startedAt,
+          selectorChain: [selector.raw],
+          directSelector: true,
+        };
+        recordIfSession(params.sessionStore, params.sessionName, params.req, payload);
+        return payload;
+      }
+      await sleep(DIRECT_WAIT_POLL_INTERVAL_MS);
+    }
+    throw new AppError('COMMAND_FAILED', `wait timed out for selector: ${selector.raw}`);


Fall back when direct iOS wait cannot resolve

For simple iOS selectors, this direct path now owns the whole wait loop. However, queryDirectIosSelector returns null on runner errors and ambiguous matches, and found: false for matches the runner filters out as non-hittable. In both cases this loop keeps polling until it throws instead of falling back to the snapshot selector runtime. That regresses cases the existing waitForSelector path supports, such as a selector that matches multiple nodes or a visible non-hittable/container node: findSelectorChainMatch needs only at least one snapshot match, while the runner-side findElement requires a unique hittable element.
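
One possible shape for the fallback this comment asks for is sketched below: the direct query serves only as the first probe, and any unresolved result (null or found: false) hands the rest of the wait to snapshot polling. This is an illustration under stated assumptions, not the change that was merged; `WaitDeps`, `waitForSelectorSketch`, and the injected functions are hypothetical stand-ins for the real runner and snapshot calls.

```typescript
// null models a runner error or ambiguous match; found: false models a filtered-out match.
type DirectProbe = { found: boolean } | null;

// Hypothetical dependencies, injected so the loop can be exercised without a device.
interface WaitDeps {
  queryDirect: () => Promise<DirectProbe>;
  querySnapshot: () => Promise<boolean>; // true when at least one snapshot node matches
  sleep: (ms: number) => Promise<void>;
  now: () => number;
}

interface WaitResult {
  waitedMs: number;
  directSelector: boolean;
}

async function waitForSelectorSketch(
  deps: WaitDeps,
  timeoutMs: number,
  pollIntervalMs: number,
): Promise<WaitResult> {
  const startedAt = deps.now();

  // First probe: the fast direct runner query.
  const first = await deps.queryDirect();
  if (first?.found) {
    return { waitedMs: deps.now() - startedAt, directSelector: true };
  }

  // The direct probe could not resolve the selector, so poll snapshots instead.
  while (deps.now() - startedAt < timeoutMs) {
    if (await deps.querySnapshot()) {
      return { waitedMs: deps.now() - startedAt, directSelector: false };
    }
    await deps.sleep(pollIntervalMs);
  }
  throw new Error('wait timed out for selector');
}
```

Injecting `now` and `sleep` keeps the timeout logic testable with a fake clock, at the cost of a slightly wider function signature.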


thymikee force-pushed the codex/maestro-benchmark-rebase-guard branch from ecba590 to 955a1f8 Compare May 18, 2026 17:40

chatgpt-codex-connector Bot reviewed May 18, 2026


thymikee force-pushed the codex/maestro-benchmark-rebase-guard branch 4 times, most recently from 6326daf to c68fd4c Compare May 18, 2026 18:42

perf: speed up iOS replay runner

33a7f5e

thymikee force-pushed the codex/maestro-benchmark-rebase-guard branch from c68fd4c to 33a7f5e Compare May 18, 2026 18:47

thymikee mentioned this pull request May 18, 2026

Track Maestro flow compatibility for replay --maestro #558

Open

thymikee added 3 commits May 18, 2026 22:39

fix: harden ios replay fast paths

fd768a8

fix: address ci validation failures

064cf66

refactor: trim unused ios replay surface

f269420

thymikee merged commit 094c290 into main May 19, 2026
19 checks passed

thymikee deleted the codex/maestro-benchmark-rebase-guard branch May 19, 2026 09:36

thymikee merged 4 commits into main from codex/maestro-benchmark-rebase-guard
