Skip to content

fix(ios): expose Launch/Terminate as object schemas with uri field#2383

Merged
quanru merged 5 commits intomainfrom
fix/ios-launch-cli-schema-2313
Apr 23, 2026
Merged

fix(ios): expose Launch/Terminate as object schemas with uri field#2383
quanru merged 5 commits intomainfrom
fix/ios-launch-cli-schema-2313

Conversation

@quanru
Copy link
Copy Markdown
Collaborator

@quanru quanru commented Apr 21, 2026

Summary

Fixes #2313. npx @midscene/ios launch <bundle-id> could not pass the bundle id on to the device: the CLI had no --uri flag, positional args were dropped, and the handler ended up calling z.string().parse({}), surfacing as "Expected string, received object".

Root cause

  • packages/ios/src/device.ts:1040 — the Launch and Terminate actions used a bare z.string() paramSchema, unlike Android/Harmony which use z.object({ uri: z.string() }).
  • packages/shared/src/mcp/tool-generator.ts:158-168extractActionSchema only surfaces named fields when paramSchema is a ZodObject. For a primitive schema it casts the ZodString instance to a Record, so every enumerable prototype property leaks into the CLI schema (--parse, --safeParse, --_def, …) and there is no --uri to bind to.
  • packages/shared/src/cli/cli-args.ts:25-27parseCliArgs only captures --flag args and silently drops positional arguments, so launch com.xx.xx.xxx never even reaches validation.
  • packages/core/src/common.ts:710 — with empty parsed args, parseActionParam({}, z.string()) throws Expected string, received object, which is the error the reporter saw.

Before the fix, midscene-ios launch --help surfaced the leak:

Options:
  --spa
  --_def     App name, bundle ID, or URL to launch. ...
  --parse
  --safeParse
  --parseAsync
  ...

After the fix:

Options:
  --uri      App name, bundle ID, or URL to launch. ...
  --device-id   iOS device UDID (aliases: --deviceId)
  --wda-host    WebDriverAgent host ...
  ...

Changes

  • packages/ios/src/device.ts: switch launchParamSchema and terminateParamSchema to z.object({ uri: z.string() }) (matches Android/Harmony). The call implementations now read param.uri and reject empty strings with a consistent error.
  • packages/ios/src/agent.ts: replace the auto-wrapped launch/terminate fields with explicit async launch(uri: string) / async terminate(uri: string) methods that wrap the argument into { uri } before dispatching. Preserves the existing programmatic API used by docs, agent.launch('com.apple.Preferences'), YAML CLI setup (packages/cli/src/create-yaml-player.ts:330), and existing tests.
  • packages/ios/tests/unit-test/device.test.ts: regression tests asserting that Launch/Terminate paramSchemas are ZodObject with a uri: ZodString field, that Launch.call delegates param.uri to device.launch, and that empty uri is rejected.
  • packages/ios/tests/unit-test/agent.test.ts: update the mocked action-space call stubs to forward param.uri (mirroring the real implementation).

YAML shortcuts (ios.launch: com.xxx) already wrap string values into { uri } via buildShortcutActionParam (packages/core/src/yaml/player.ts:108), so no YAML changes were needed.

Validation

  • pnpm run lint
  • npx nx test ios — 114 tests pass (including the new regression tests)
  • npx nx test ios-mcp — 2 tests pass
  • npx nx test cli — 114 tests pass
  • npx nx build ios — success (including declaration bundling)
  • Manual: ran the CLI through runToolsCLI with argv ['launch', '--help'], ['launch', '--uri', 'com.xx.xx.xxx'], and ['launch', 'com.xx.xx.xxx']. --uri is now accepted and the only remaining error is the expected "WebDriverAgent not running" when no device is present.

Test plan

  • Run midscene-ios launch --uri com.apple.Preferences against a connected simulator/device and confirm it launches.
  • Run midscene-ios launch --help and confirm --uri is the only action option listed (no --parse/--safeParse/etc. leak).
  • Run a YAML script containing ios.launch: com.apple.Preferences and confirm the app launches (covers the shortcut path).
  • Run existing code that uses agent.launch('com.apple.Preferences') and confirm it still works (programmatic API unchanged).

The Launch and Terminate actions used a bare `z.string()` paramSchema.
Because the CLI/MCP schema layer only surfaces fields from ZodObject
shapes, there was no `--uri` flag to bind to and `launch --help` instead
listed leaked ZodString prototype methods (`--parse`, `--safeParse`,
`--_def`, ...). Running `midscene-ios launch com.xx.xx.xxx` silently
dropped the positional argument and eventually tripped
`z.string().parse({})`, surfacing as "Expected string, received object".

Switch both actions to `z.object({ uri })` to match Android/Harmony,
read `param.uri` in the `call` implementation, and keep
`agent.launch(uri)` / `agent.terminate(uri)` as string-typed wrappers so
existing programmatic callers (docs, YAML setup, tests) continue to work.

Fixes #2313.
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Apr 21, 2026

Deploying midscene with  Cloudflare Pages  Cloudflare Pages

Latest commit: c142f77
Status: ✅  Deploy successful!
Preview URL: https://ce93cbe6.midscene.pages.dev
Branch Preview URL: https://fix-ios-launch-cli-schema-23.midscene.pages.dev

View logs

quanru added 4 commits April 21, 2026 16:04
Before this change `extractActionSchema` fell through to
`paramSchema as Record<string, z.ZodTypeAny>` when the schema wasn't a
ZodObject. That cast leaked the Zod instance's prototype methods
(`parse`, `safeParse`, `_def`, ...) as CLI flags and let a primitive
schema like `z.string()` silently reach runtime, which is how #2313
slipped past review on iOS while Android and Harmony were fine.

Throw a named, actionable error at tool-definition time when a schema
isn't a ZodObject. Any future platform that copies the iOS mistake will
now fail the platform's own `initTools` unit tests instead of the user's
CLI invocation. Added positive and negative unit tests in
`packages/shared`.
…contract

- iOS device.test: add Terminate.call routing + empty-uri rejection to
  mirror the Launch coverage added in the fix commit.
- iOS cli-launch.test (new): drive `runToolsCLI` with real argv for
  `launch --uri`, `terminate --uri`, `launch --help`, `terminate --help`
  and an invalid `--bundleId`. Guards the exact user-facing surface that
  regressed in #2313 — both the routing of the uri arg and the absence
  of ZodString prototype leaks (--parse / --safeParse / --_def / etc.)
  from the help output.
- Android page.test + Harmony device.test: assert Launch/Terminate
  paramSchemas are `z.object({ uri: z.string() })`. The shared
  tool-generator now rejects non-object schemas, but it does not pin
  field names — these assertions stop a future drift like "Android keeps
  uri, iOS renames to bundleId" from silently breaking CLI ergonomics.
@quanru quanru merged commit 3c2c00c into main Apr 23, 2026
9 checks passed
@quanru quanru deleted the fix/ios-launch-cli-schema-2313 branch April 23, 2026 06:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: midscene ios CLI launch command is not work

2 participants