Merged
1 change: 1 addition & 0 deletions README.md
@@ -35,6 +35,7 @@ npx agent-device open SampleApp
```

The skill is also accessible on [ClawHub](https://clawhub.ai/okwasniewski/agent-device).
For structured exploratory QA workflows, use the dogfood skill at [skills/dogfood/SKILL.md](skills/dogfood/SKILL.md).

## Quick Start

1 change: 1 addition & 0 deletions skills/agent-device/SKILL.md
@@ -6,6 +6,7 @@ description: Automates interactions for iOS simulators/devices and Android emula
# Mobile Automation with agent-device

For exploration, use snapshot refs. For deterministic replay, use selectors.
For structured exploratory QA bug hunts and reporting, use [../dogfood/SKILL.md](../dogfood/SKILL.md).

## Start Here (Read This First)

183 changes: 183 additions & 0 deletions skills/dogfood/SKILL.md
@@ -0,0 +1,183 @@
---
name: dogfood
description: 'Systematically explore and test a mobile app on iOS/Android with agent-device to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", or "test this app" on mobile. Produces a structured report with reproducible evidence: screenshots, optional repro videos, and detailed steps for every issue.'
allowed-tools: Bash(agent-device:*), Bash(npx agent-device:*)
---

# Dogfood (agent-device)

Systematically explore a mobile app, find issues, and produce a report with full reproduction evidence for every finding.

## Setup

Only the **Target app** is required. Everything else has sensible defaults.

| Parameter | Default | Example override |
|-----------|---------|-----------------|
| **Target app** | _(required)_ | `Settings`, `com.example.app`, deep link URL |
| **Platform** | Infer from user context; otherwise ask (`ios` or `android`) | `--platform ios` |
| **Session name** | Slugified app/platform (for example `settings-ios`) | `--session my-session` |
| **Output directory** | `./dogfood-output/` | `Output directory: /tmp/mobile-qa` |
| **Scope** | Full app | `Focus on onboarding and profile` |
| **Authentication** | None | `Sign in to user@example.com` |

If the user gives enough context to start, begin immediately with the defaults. Ask follow-up questions only when a required detail is missing (for example, platform or credentials).
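
The session-name default in the table above can be sketched as a small shell helper. `session_slug` is a hypothetical name introduced here for illustration, not part of `agent-device`, and the exact slug rules are an assumption: lowercase the app and platform, then collapse every run of non-alphanumeric characters into one hyphen.

```bash
# Hypothetical helper (not an agent-device command): build a session
# slug such as "settings-ios" from a target app and a platform.
session_slug() {
  # $1 = target app, $2 = platform
  printf '%s-%s' "$1" "$2" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Example: SESSION=$(session_slug "Settings" ios)
```

Keeping the slug to lowercase letters, digits, and hyphens means the same value can double as a directory or file-name fragment without quoting surprises.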

Prefer the direct `agent-device` binary when it is available; otherwise fall back to `npx agent-device`.
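
A minimal sketch of that preference, assuming the only two invocation forms are the installed `agent-device` binary and `npx agent-device`. The function name `agent_device_cmd` is hypothetical and exists only for this example.

```bash
# Hypothetical helper: print the command prefix to use for this session.
agent_device_cmd() {
  if command -v agent-device >/dev/null 2>&1; then
    # The binary is on PATH, so call it directly.
    echo "agent-device"
  else
    # Otherwise go through npx.
    echo "npx agent-device"
  fi
}

# Example: AD=$(agent_device_cmd)
# Leave $AD unquoted when calling it so "npx agent-device" splits into two words.
```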

## Workflow

```
1. Initialize    Set up session, output dirs, report file
2. Launch/Auth   Open app and sign in if needed
3. Orient        Capture initial snapshot and map navigation
4. Explore       Systematically test flows and states
5. Document      Record reproducible evidence per issue
6. Wrap up       Reconcile summary, close session
```

### 1. Initialize

```bash
mkdir -p {OUTPUT_DIR}/screenshots {OUTPUT_DIR}/videos
cp {SKILL_DIR}/templates/dogfood-report-template.md {OUTPUT_DIR}/report.md
```
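
If setup might run more than once in the same output directory, a guarded variant avoids overwriting a report that already contains findings. This is a sketch under that assumption; `init_output` is a hypothetical helper, and the paths mirror the two commands above.

```bash
# Hypothetical helper: create the output dirs and seed report.md once.
init_output() {
  # $1 = output directory, $2 = skill directory containing templates/
  mkdir -p "$1/screenshots" "$1/videos" || return 1
  # Keep an existing report so re-running setup never discards findings.
  if [ ! -e "$1/report.md" ]; then
    cp "$2/templates/dogfood-report-template.md" "$1/report.md"
  fi
}

# Example: init_output ./dogfood-output "$SKILL_DIR"
```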

### 2. Launch/Auth

Start a named session and launch the target app:

```bash
agent-device --session {SESSION} open {TARGET_APP} --platform {PLATFORM}
agent-device --session {SESSION} snapshot -i
```

If login is required (the `@e1` to `@e3` refs below are placeholders; use the refs shown in your own snapshot output):

```bash
agent-device --session {SESSION} snapshot -i
agent-device --session {SESSION} fill @e1 "{EMAIL}"
agent-device --session {SESSION} fill @e2 "{PASSWORD}"
agent-device --session {SESSION} press @e3
agent-device --session {SESSION} wait 1000
agent-device --session {SESSION} snapshot -i
```

For OTP/email codes: ask the user, wait for input, then continue.

### 3. Orient

Capture initial evidence and navigation anchors:

```bash
agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/initial.png
agent-device --session {SESSION} snapshot -i
```

Map top-level navigation, tabs, and key workflows before deep testing.

### 4. Explore

Read [references/issue-taxonomy.md](references/issue-taxonomy.md) for severity/category calibration.

Strategy:

- Move through each major app area (tabs, drawers, settings pages).
- Test core journeys end-to-end (create, edit, delete, submit, recover).
- Validate edge states (empty/error/loading/offline/permissions denied).
- Use `diff snapshot -i` after UI transitions to avoid stale refs.
- Periodically capture `logs path` and inspect the app log when behavior looks suspicious.

Useful commands per screen:

```bash
agent-device --session {SESSION} snapshot -i
agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/{screen-name}.png
agent-device --session {SESSION} appstate
agent-device --session {SESSION} logs path
```

### 5. Document Issues (Repro-First)

Explore and document in one pass. When you find an issue, stop and fully capture evidence before continuing.

#### Interactive/behavioral issues

Use video + step screenshots:

1. Start recording:

```bash
agent-device --session {SESSION} record start {OUTPUT_DIR}/videos/issue-{NNN}-repro.mp4
```

2. Reproduce with visible pacing. Capture each step:

```bash
agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-1.png
sleep 1
# perform action
sleep 1
agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-2.png
```

3. Capture final broken state:

```bash
sleep 2
agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-result.png
```

4. Stop recording:

```bash
agent-device --session {SESSION} record stop
```

5. Append the issue to the report immediately, with numbered steps and screenshot references.
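
The `{NNN}` placeholder in the file names above needs a fresh, zero-padded number for each issue. One way to derive it, assuming issue headings in `report.md` follow the template's `### ISSUE-NNN: title` form and that the unfilled template block still carries the literal `{Short title}` text, is sketched below. `next_issue_id` is a hypothetical helper name.

```bash
# Hypothetical helper: print the next zero-padded issue number.
next_issue_id() {
  # $1 = path to report.md
  # Count issue headings, ignoring the template block that is not filled in yet.
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  count=$(grep '^### ISSUE-' "$1" | grep -vc '{Short title}' || true)
  printf '%03d\n' "$((count + 1))"
}

# Example: NNN=$(next_issue_id "$OUTPUT_DIR/report.md")
```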

#### Static/on-load issues

A single screenshot is sufficient; no video is required:

```bash
agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}.png
```

Set **Repro Video** to `N/A` in the report.

### 6. Wrap Up

Target 5-10 well-evidenced issues, then finish:

1. Reconcile summary severity counts in `report.md`.
2. Close session:

```bash
agent-device --session {SESSION} close
```

3. Report total issues, severity breakdown, and highest-risk findings.
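
Reconciling the summary counts can be checked mechanically. The sketch below assumes every documented issue keeps the template's `| **Severity** | value |` row with exactly one severity filled in; `count_severity` and `severity_summary` are hypothetical helpers, not `agent-device` commands.

```bash
# Hypothetical helper: count issues of one severity in the report.
count_severity() {
  # $1 = report file, $2 = severity (critical, high, medium, low)
  # The unfilled template row ("critical / high / medium / low") does not match.
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  grep -ci "^| \*\*Severity\*\* | $2 |" "$1" || true
}

# Hypothetical helper: print one "severity=count" line per level.
severity_summary() {
  for sev in critical high medium low; do
    printf '%s=%s\n' "$sev" "$(count_severity "$1" "$sev")"
  done
}

# Example: severity_summary "$OUTPUT_DIR/report.md"
```

Compare the printed counts against the Summary table in `report.md` before closing the session.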

## Guidance

- Repro quality matters more than issue count.
- Use refs (`@eN`) for fast exploration; use selectors for deterministic replay assertions when needed.
- Re-snapshot after any mutation (navigation, modal, list update, form submit).
- Use `fill` for clear-then-type semantics; use `type` for incremental typing behavior checks.
- Keep logs optional and targeted: enable/read app logs only when useful for diagnosis.
- Never read source code of the app under test; findings must come from observed runtime behavior.
- Write each issue immediately to avoid losing evidence.
- Never delete screenshots/videos/report artifacts during a session.

## References

| Reference | When to Read |
|-----------|--------------|
| [references/issue-taxonomy.md](references/issue-taxonomy.md) | Start of session; severity/categories/checklist |

## Templates

| Template | Purpose |
|----------|---------|
| [templates/dogfood-report-template.md](templates/dogfood-report-template.md) | Copy into output directory as the report file |
83 changes: 83 additions & 0 deletions skills/dogfood/references/issue-taxonomy.md
@@ -0,0 +1,83 @@
# Issue Taxonomy (Mobile)

Reference for categorizing issues found during mobile dogfooding.

## Severity Levels

| Severity | Definition |
|----------|------------|
| **critical** | Blocks a core workflow, causes data loss, or makes the app crash or enter a freeze loop |
| **high** | Major feature broken or unusable, no practical workaround |
| **medium** | Feature works with notable friction or partial failure; workaround exists |
| **low** | Minor cosmetic or polish issue |

## Categories

### Visual / UI

- Layout broken, clipped, overlapped, or unreadable text
- Safe-area/notch overlap issues
- Incorrect dark/light appearance rendering
- Missing assets/icons
- Animation glitches or flicker

### Functional

- Buttons/controls do nothing or trigger wrong action
- Flows fail (create/edit/delete/submit)
- Navigation dead-ends or wrong destination
- State loss after background/foreground transitions
- Deep link opens wrong screen or fails

### UX

- Confusing hierarchy or navigation labels
- Missing loading/progress feedback
- Unclear error handling or no recovery affordance
- Excessive steps for common tasks
- Inconsistent behavior between similar screens

### Content

- Typos, incorrect copy, placeholder text
- Wrong labels/help text
- Truncated text with no affordance to view the full content
- Inconsistent terminology across screens

### Performance

- Slow startup or route transitions
- Input lag or gesture jank
- Scroll hitches/frame drops
- Notable battery/thermal symptoms during basic usage

### Diagnostics / Logs

- Native crashes or repeated fatal exceptions
- Repeated warnings correlated with broken behavior
- Unhandled runtime errors visible during repro

### Permissions / Platform

- Permission prompt flow broken or loops forever
- Denied permissions not handled gracefully
- Platform-specific regressions (iOS-only or Android-only)
- Background/foreground lifecycle regressions

### Accessibility

- Missing labels or incorrect accessibility names
- Focus order/navigation issues for assistive tech
- Low contrast, or text that becomes unreadable when scaled
- Touch targets too small for reliable interaction

## Exploration Checklist

1. Visual scan: capture screenshot; verify layout/safe areas/text/icon rendering.
2. Interactions: press controls, open menus/modals, validate expected response.
3. Forms/input: test valid/invalid/empty/boundary input.
4. Navigation: traverse all top-level sections and return paths.
5. App states: loading/empty/error/offline/permission-denied/background-resume.
6. Logs/diagnostics: inspect app logs when behavior is suspicious.
7. Platform parity: verify critical flows on each requested platform.
8. Accessibility basics: labels, touch target sizes, readability/contrast.
52 changes: 52 additions & 0 deletions skills/dogfood/templates/dogfood-report-template.md
@@ -0,0 +1,52 @@
# Dogfood Report: {APP_NAME}

| Field | Value |
|-------|-------|
| **Date** | {DATE} |
| **Platform** | {PLATFORM} |
| **Target App** | {TARGET_APP} |
| **Session** | {SESSION_NAME} |
| **Scope** | {SCOPE} |

## Summary

| Severity | Count |
|----------|-------|
| Critical | 0 |
| High | 0 |
| Medium | 0 |
| Low | 0 |
| **Total** | **0** |

## Issues

<!-- Copy this block for each issue found. Interactive issues need video + step screenshots. Static issues can be screenshot-only (Repro Video = N/A). -->

### ISSUE-001: {Short title}

| Field | Value |
|-------|-------|
| **Severity** | critical / high / medium / low |
| **Category** | visual / functional / ux / content / performance / diagnostics / permissions / accessibility |
| **Screen / Route** | {screen where issue was found} |
| **Repro Video** | {path to video, or N/A for static issues} |

**Description**

{What is wrong, what was expected, and what actually happened.}

**Repro Steps**

1. Open {screen/entry point}
![Step 1](screenshots/issue-001-step-1.png)

2. {Action}
![Step 2](screenshots/issue-001-step-2.png)

3. {Action}
![Step 3](screenshots/issue-001-step-3.png)

4. **Observe:** {broken behavior}
![Result](screenshots/issue-001-result.png)

---
1 change: 1 addition & 0 deletions website/docs/docs/introduction.md
@@ -11,6 +11,7 @@ title: Introduction
- Session-aware workflows and replay

If you know `agent-browser`, this is the mobile-native counterpart for iOS/Android UI automation.
For exploratory QA and bug-hunting workflows, see `skills/dogfood/SKILL.md` in this repository.

## What it’s good at
