Merged
8 changes: 8 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -17,6 +17,7 @@ The project is in early development and considered experimental. Pull requests a
- Core commands: `open`, `back`, `home`, `app-switcher`, `press`, `long-press`, `focus`, `type`, `fill`, `scroll`, `scrollintoview`, `wait`, `alert`, `screenshot`, `close`, `reinstall`, `push`, `trigger-app-event`.
- Inspection commands: `snapshot` (accessibility tree), `diff snapshot` (structural baseline diff), `appstate`, `apps`, `devices`.
- Clipboard commands: `clipboard read`, `clipboard write <text>`.
- Keyboard commands: `keyboard status|get|dismiss` (Android).
- Performance command: `perf` (alias: `metrics`) returns a metrics JSON blob for the active session; startup timing is currently sampled.
- App logs and traffic inspection: `logs path` returns session log metadata; `logs start` / `logs stop` stream app output; `logs clear` truncates session app logs; `logs clear --restart` resets and restarts stream in one step; `logs doctor` checks readiness; `logs mark` writes timeline markers; `network dump` parses recent HTTP(s) entries from session logs.
- Device tooling: `adb` (Android), `simctl`/`devicectl` (iOS via Xcode).
@@ -152,6 +153,7 @@ agent-device scrollintoview @e42
- `trace start`, `trace stop`
- `logs path`, `logs start`, `logs stop`, `logs clear`, `logs clear --restart`, `logs doctor`, `logs mark` (session app log file for grep; iOS simulator + iOS device + Android)
- `clipboard read`, `clipboard write <text>` (iOS simulator + Android)
- `keyboard [status|get|dismiss]` (Android emulator/device)
- `network dump [limit] [summary|headers|body|all]`, `network log ...` (best-effort HTTP(s) parsing from session app log)
- `settings wifi|airplane|location on|off`
- `settings appearance light|dark|toggle`
@@ -418,6 +420,12 @@ Clipboard:
- Supported on Android emulator/device and iOS simulator.
- iOS physical devices currently return `UNSUPPORTED_OPERATION` for clipboard commands.

Keyboard:
- `keyboard status` (or `keyboard get`) reports Android keyboard visibility and best-effort input type classification (`text`, `number`, `email`, `phone`, `password`, `datetime`).
- `keyboard dismiss` issues an Android back keyevent only when the keyboard is visible, then verifies that the keyboard is hidden.
- Works with an active session device or explicit selectors (`--platform`, `--device`, `--udid`, `--serial`).
- Supported on Android emulator/device.
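
The diff does not show how the "best-effort input type classification" is computed. As a rough illustration only, the sketch below maps a raw Android `inputType` bitfield to the categories listed above. The constant values come from `android.text.InputType`; the function name `classifyInputType` and the `unknown` fallback are assumptions made for this example, not part of the PR.

```typescript
// Hypothetical sketch: classify an Android inputType bitfield.
// Constant values mirror android.text.InputType; everything else is assumed.
type KeyboardType = 'text' | 'number' | 'email' | 'phone' | 'password' | 'datetime' | 'unknown';

const TYPE_MASK_CLASS = 0x0000000f;
const TYPE_MASK_VARIATION = 0x00000ff0;

const TYPE_CLASS_TEXT = 0x1;
const TYPE_CLASS_NUMBER = 0x2;
const TYPE_CLASS_PHONE = 0x3;
const TYPE_CLASS_DATETIME = 0x4;

const TEXT_VARIATION_EMAIL_ADDRESS = 0x20;
const TEXT_VARIATION_PASSWORD = 0x80;
const TEXT_VARIATION_VISIBLE_PASSWORD = 0x90;
const TEXT_VARIATION_WEB_EMAIL_ADDRESS = 0xd0;
const TEXT_VARIATION_WEB_PASSWORD = 0xe0;
const NUMBER_VARIATION_PASSWORD = 0x10;

function classifyInputType(inputType: number): KeyboardType {
  const inputClass = inputType & TYPE_MASK_CLASS;
  const variation = inputType & TYPE_MASK_VARIATION;

  switch (inputClass) {
    case TYPE_CLASS_TEXT:
      if (
        variation === TEXT_VARIATION_PASSWORD ||
        variation === TEXT_VARIATION_VISIBLE_PASSWORD ||
        variation === TEXT_VARIATION_WEB_PASSWORD
      ) {
        return 'password';
      }
      if (variation === TEXT_VARIATION_EMAIL_ADDRESS || variation === TEXT_VARIATION_WEB_EMAIL_ADDRESS) {
        return 'email';
      }
      return 'text';
    case TYPE_CLASS_NUMBER:
      // Numeric PIN fields are reported as passwords in this sketch.
      return variation === NUMBER_VARIATION_PASSWORD ? 'password' : 'number';
    case TYPE_CLASS_PHONE:
      return 'phone';
    case TYPE_CLASS_DATETIME:
      return 'datetime';
    default:
      // Assumed fallback; the README lists no category for unrecognized classes.
      return 'unknown';
  }
}
```

Flag bits above the variation mask (for example multi-line text) are ignored, which is why the result is described as best-effort.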

## Debug

- **App logs (token-efficient):** Logging is off by default in normal flows. Enable it on demand when debugging. With an active session, run `logs path` to get path + state metadata (e.g. `<state-dir>/sessions/<session>/app.log`). Run `logs start` to stream app output to that file; use `logs stop` to stop. Run `logs clear` to truncate `app.log` (and remove rotated `app.log.N` files) before a new repro window. Run `logs doctor` for tool/runtime checks and `logs mark "step"` to insert timeline markers. Grep the file when you need to inspect errors (e.g. `grep -n "Error\|Exception" <path>`) instead of pulling full logs into context. Supported on iOS simulator, iOS physical device, and Android.
3 changes: 3 additions & 0 deletions skills/agent-device/SKILL.md
@@ -138,6 +138,8 @@ agent-device is visible 'id="anchor"'
agent-device appstate
agent-device clipboard read
agent-device clipboard write "token"
agent-device keyboard status
agent-device keyboard dismiss
agent-device perf --json
agent-device network dump [limit] [summary|headers|body|all]
agent-device push <bundle|package> <payload.json|inline-json>
@@ -169,6 +171,7 @@ agent-device batch --steps-file /tmp/batch-steps.json --json
- Use `fill` for clear-then-type semantics; use `type` for focused append typing.
- iOS `appstate` is session-scoped; Android `appstate` is live foreground state.
- Clipboard helpers: `clipboard read` / `clipboard write <text>` are supported on Android and iOS simulators; iOS physical devices are not supported yet.
- Android keyboard helpers: `keyboard status|get|dismiss` report keyboard visibility/type and dismiss via keyevent when visible.
- `network dump` is best-effort and parses HTTP(s) entries from the session app log file.
- Biometric settings: iOS simulator supports `settings faceid|touchid <match|nonmatch|enroll|unenroll>`; Android supports `settings fingerprint <match|nonmatch>` where runtime tooling is available.
- For AndroidTV/tvOS selection, always pair `--target` with `--platform` (`ios`, `android`, or `apple` alias); target-only selection is invalid.
7 changes: 7 additions & 0 deletions src/core/__tests__/capabilities.test.ts
@@ -56,6 +56,12 @@ test('simulator-only iOS commands with Android support reject iOS devices', () =
  }
});

test('keyboard command is Android-only', () => {
  assert.equal(isCommandSupportedOnDevice('keyboard', iosSimulator), false, 'keyboard on iOS sim');
  assert.equal(isCommandSupportedOnDevice('keyboard', iosDevice), false, 'keyboard on iOS device');
  assert.equal(isCommandSupportedOnDevice('keyboard', androidDevice), true, 'keyboard on Android');
});

test('swipe supports iOS simulator, iOS device, and Android', () => {
  assert.equal(isCommandSupportedOnDevice('swipe', iosSimulator), true, 'swipe on iOS sim');
  assert.equal(isCommandSupportedOnDevice('swipe', iosDevice), true, 'swipe on iOS device');
@@ -127,6 +133,7 @@ test('tvOS follows iOS capability matrix by device kind', () => {
  for (const cmd of ['pinch', 'push', 'settings', 'alert']) {
    assert.equal(isCommandSupportedOnDevice(cmd, tvOsSimulator), true, `${cmd} on tvOS simulator`);
  }
  assert.equal(isCommandSupportedOnDevice('keyboard', tvOsSimulator), false, 'keyboard on tvOS simulator');
});

test('unknown commands default to supported', () => {
1 change: 1 addition & 0 deletions src/core/capabilities.ts
@@ -22,6 +22,7 @@ const COMMAND_CAPABILITY_MATRIX: Record<string, CommandCapability> = {
  boot: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
  click: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
  clipboard: { ios: { simulator: true }, android: { emulator: true, device: true, unknown: true } },
  keyboard: { ios: {}, android: { emulator: true, device: true, unknown: true } },
  close: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
  fill: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
  diff: { ios: { simulator: true, device: true }, android: { emulator: true, device: true, unknown: true } },
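
The body of `isCommandSupportedOnDevice` is not part of this diff. A minimal sketch of how a lookup against such a matrix could behave, consistent with the tests above (an empty `ios: {}` entry rejects every iOS device kind, and unknown commands default to supported), might look like the following. The type shapes, the two-entry matrix, and the name `isSupported` are assumptions for illustration only.

```typescript
// Hypothetical sketch of a capability lookup; not the project's actual implementation.
type Platform = 'ios' | 'android';
type DeviceKind = 'simulator' | 'emulator' | 'device' | 'unknown';

interface DeviceInfo {
  platform: Platform;
  kind: DeviceKind;
}

type KindSupport = Partial<Record<DeviceKind, boolean>>;
type CommandCapability = Partial<Record<Platform, KindSupport>>;

// Two entries copied from the matrix in the diff.
const MATRIX: Record<string, CommandCapability> = {
  clipboard: { ios: { simulator: true }, android: { emulator: true, device: true, unknown: true } },
  keyboard: { ios: {}, android: { emulator: true, device: true, unknown: true } },
};

function isSupported(command: string, device: DeviceInfo): boolean {
  const capability = MATRIX[command];
  // Commands missing from the matrix are treated as supported.
  if (!capability) return true;
  // A missing platform entry or a missing kind flag both mean "not supported".
  return capability[device.platform]?.[device.kind] === true;
}
```

With this shape, adding an Android-only command needs a single matrix row with an empty `ios` object, which is what the one-line change above does.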
35 changes: 35 additions & 0 deletions src/core/dispatch.ts
@@ -6,7 +6,9 @@ import { listAndroidDevices } from '../platforms/android/devices.ts';
import {
  appSwitcherAndroid,
  backAndroid,
  dismissAndroidKeyboard,
  ensureAdb,
  getAndroidKeyboardState,
  homeAndroid,
  pushAndroidNotification,
  readAndroidClipboardText,
@@ -461,6 +463,39 @@ export async function dispatchCommand(
      else await writeAndroidClipboardText(device, text);
      return { action, textLength: Array.from(text).length };
    }
    case 'keyboard': {
      if (device.platform !== 'android') {
        throw new AppError('UNSUPPORTED_OPERATION', 'keyboard is currently supported only on Android');
      }
      const action = (positionals[0] ?? 'status').toLowerCase();
      if (action !== 'status' && action !== 'get' && action !== 'dismiss') {
        throw new AppError('INVALID_ARGS', 'keyboard requires a subcommand: status, get, or dismiss');
      }
      if (positionals.length > 1) {
        throw new AppError('INVALID_ARGS', 'keyboard accepts at most one subcommand argument');
      }
      if (action === 'dismiss') {
        const result = await dismissAndroidKeyboard(device);
        return {
          platform: 'android',
          action: 'dismiss',
          attempts: result.attempts,
          wasVisible: result.wasVisible,
          dismissed: result.dismissed,
          visible: result.visible,
          inputType: result.inputType,
          type: result.type,
        };
      }
      const state = await getAndroidKeyboardState(device);
      return {
        platform: 'android',
        action: 'status',
        visible: state.visible,
        inputType: state.inputType,
        type: state.type,
      };
    }
    case 'settings': {
      const [setting, state, target, mode, appBundleId] = positionals;
      const permissionOptions =
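
`dismissAndroidKeyboard` itself is not shown in this diff, so the meaning of `attempts`, `wasVisible`, and `dismissed` has to be inferred from the `keyboard` case above and the README wording ("only when keyboard is visible, then verifies hidden state"). One plausible reading, sketched below with injected hooks instead of real `adb` calls, is a check, press-back, re-check loop. The `KeyboardProbe` interface, the function name `dismissKeyboard`, and the `maxAttempts` default of 2 are all assumptions for illustration.

```typescript
// Hypothetical sketch of a "dismiss only when visible, then verify" flow.
// The real dismissAndroidKeyboard may differ; inputType/type fields are omitted here.
interface KeyboardState {
  visible: boolean;
}

// Assumed device hooks; in practice these would wrap adb commands.
interface KeyboardProbe {
  readState(): Promise<KeyboardState>;
  pressBack(): Promise<void>;
}

interface DismissResult extends KeyboardState {
  attempts: number;
  wasVisible: boolean;
  dismissed: boolean;
}

async function dismissKeyboard(probe: KeyboardProbe, maxAttempts = 2): Promise<DismissResult> {
  let state = await probe.readState();
  const wasVisible = state.visible;
  let attempts = 0;

  // Send back only while the keyboard is still showing, so a hidden
  // keyboard never triggers an accidental app-level back navigation.
  while (state.visible && attempts < maxAttempts) {
    await probe.pressBack();
    attempts += 1;
    state = await probe.readState(); // verify the hidden state after each press
  }

  return {
    visible: state.visible,
    attempts,
    wasVisible,
    dismissed: wasVisible && !state.visible,
  };
}
```

Under this reading, a call on an already-hidden keyboard returns `attempts: 0` and `dismissed: false`, which lets callers tell "nothing to do" apart from "dismissal failed" (`wasVisible: true`, `visible: true`).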
106 changes: 106 additions & 0 deletions src/daemon/handlers/__tests__/session.test.ts
@@ -838,6 +838,112 @@ test('clipboard requires an active session or explicit device selector', async (
  }
});

test('keyboard requires an active session or explicit device selector', async () => {
  const sessionStore = makeSessionStore();
  const response = await handleSessionCommands({
    req: {
      token: 't',
      session: 'default',
      command: 'keyboard',
      positionals: ['status'],
      flags: {},
    },
    sessionName: 'default',
    logPath: path.join(os.tmpdir(), 'daemon.log'),
    sessionStore,
    invoke: noopInvoke,
  });

  assert.ok(response);
  assert.equal(response?.ok, false);
  if (response && !response.ok) {
    assert.equal(response.error.code, 'INVALID_ARGS');
    assert.match(response.error.message, /keyboard requires an active session or an explicit device selector/i);
  }
});

test('keyboard dismiss supports explicit selector without active session', async () => {
  const sessionStore = makeSessionStore();
  const selectedDevice: SessionState['device'] = {
    platform: 'android',
    id: 'emulator-5554',
    name: 'Pixel Emulator',
    kind: 'emulator',
    booted: true,
  };

  const response = await handleSessionCommands({
    req: {
      token: 't',
      session: 'default',
      command: 'keyboard',
      positionals: ['dismiss'],
      flags: { platform: 'android', serial: 'emulator-5554' },
    },
    sessionName: 'default',
    logPath: path.join(os.tmpdir(), 'daemon.log'),
    sessionStore,
    invoke: noopInvoke,
    ensureReady: async () => {},
    resolveTargetDevice: async () => selectedDevice,
    dispatch: async (device, command, positionals) => {
      assert.equal(device.id, 'emulator-5554');
      assert.equal(command, 'keyboard');
      assert.deepEqual(positionals, ['dismiss']);
      return { platform: 'android', action: 'dismiss', dismissed: true, visible: false };
    },
  });

  assert.ok(response);
  assert.equal(response?.ok, true);
  if (response && response.ok) {
    assert.equal(response.data?.platform, 'android');
    assert.equal(response.data?.action, 'dismiss');
    assert.equal(response.data?.dismissed, true);
    assert.equal(response.data?.visible, false);
  }
});

test('keyboard rejects unsupported iOS simulator devices', async () => {
  const sessionStore = makeSessionStore();
  const sessionName = 'ios-sim-session';
  sessionStore.set(
    sessionName,
    makeSession(sessionName, {
      platform: 'ios',
      id: 'sim-1',
      name: 'iPhone 17 Pro',
      kind: 'simulator',
      booted: true,
    }),
  );

  const response = await handleSessionCommands({
    req: {
      token: 't',
      session: sessionName,
      command: 'keyboard',
      positionals: ['status'],
      flags: {},
    },
    sessionName,
    logPath: path.join(os.tmpdir(), 'daemon.log'),
    sessionStore,
    invoke: noopInvoke,
    ensureReady: async () => {},
    dispatch: async () => {
      throw new Error('dispatch should not run for unsupported targets');
    },
  });

  assert.ok(response);
  assert.equal(response?.ok, false);
  if (response && !response.ok) {
    assert.equal(response.error.code, 'UNSUPPORTED_OPERATION');
    assert.match(response.error.message, /keyboard is not supported on this device/i);
  }
});

test('clipboard read uses active session device', async () => {
  const sessionStore = makeSessionStore();
  const sessionName = 'ios-sim-session';
14 changes: 14 additions & 0 deletions src/daemon/handlers/session.ts
@@ -806,6 +806,20 @@ export async function handleSessionCommands(params: {
    });
  }

  if (command === 'keyboard') {
    return await runSessionOrSelectorDispatch({
      req,
      sessionName,
      logPath,
      sessionStore,
      ensureReady,
      resolveDevice,
      dispatch,
      command: 'keyboard',
      positionals: req.positionals ?? [],
    });
  }

  if (command === 'perf') {
    const session = sessionStore.get(sessionName);
    if (!session) {
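
`runSessionOrSelectorDispatch` is defined outside this diff. Judging only from the tests above, it accepts either an active session or explicit selector flags (`--platform`, `--device`, `--udid`, `--serial`) and fails with `INVALID_ARGS` when neither is present. The sketch below captures just that decision. The names `chooseDeviceSource` and `SelectorFlags` are hypothetical, and letting an explicit selector win over an active session is an arbitrary choice made here, not something the diff confirms.

```typescript
// Hypothetical sketch of the "session or selector" decision; not the project's helper.
interface SelectorFlags {
  platform?: string;
  device?: string;
  udid?: string;
  serial?: string;
}

type DeviceSource =
  | { ok: true; source: 'session' | 'selector' }
  | { ok: false; code: 'INVALID_ARGS'; message: string };

function hasExplicitSelector(flags: SelectorFlags): boolean {
  return Boolean(flags.platform || flags.device || flags.udid || flags.serial);
}

function chooseDeviceSource(command: string, hasSession: boolean, flags: SelectorFlags): DeviceSource {
  // Assumption: an explicit selector takes precedence over the session device.
  if (hasExplicitSelector(flags)) return { ok: true, source: 'selector' };
  if (hasSession) return { ok: true, source: 'session' };
  return {
    ok: false,
    code: 'INVALID_ARGS',
    message: `${command} requires an active session or an explicit device selector`,
  };
}
```

The capability check that produces `UNSUPPORTED_OPERATION` for iOS targets would run after this step, once a concrete device is known.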