diff --git a/README.md b/README.md
index 3edc1223d..eedc25cbb 100644
--- a/README.md
+++ b/README.md
@@ -35,6 +35,7 @@ npx agent-device open SampleApp
 ```
 
 The skill is also accessible on [ClawHub](https://clawhub.ai/okwasniewski/agent-device).
+For structured exploratory QA workflows, use the dogfood skill at [skills/dogfood/SKILL.md](skills/dogfood/SKILL.md).
 
 ## Quick Start
 
diff --git a/skills/agent-device/SKILL.md b/skills/agent-device/SKILL.md
index 53c870b7c..e0cb318bc 100644
--- a/skills/agent-device/SKILL.md
+++ b/skills/agent-device/SKILL.md
@@ -6,6 +6,7 @@ description: Automates interactions for iOS simulators/devices and Android emula
 # Mobile Automation with agent-device
 
 For exploration, use snapshot refs. For deterministic replay, use selectors.
+For structured exploratory QA bug hunts and reporting, use [../dogfood/SKILL.md](../dogfood/SKILL.md).
 
 ## Start Here (Read This First)
 
diff --git a/skills/dogfood/SKILL.md b/skills/dogfood/SKILL.md
new file mode 100644
index 000000000..0aae558ea
--- /dev/null
+++ b/skills/dogfood/SKILL.md
@@ -0,0 +1,183 @@
+---
+name: dogfood
+description: Systematically explore and test a mobile app on iOS/Android with agent-device to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", or "test this app" on mobile. Produces a structured report with reproducible evidence: screenshots, optional repro videos, and detailed steps for every issue.
+allowed-tools: Bash(agent-device:*), Bash(npx agent-device:*)
+---
+
+# Dogfood (agent-device)
+
+Systematically explore a mobile app, find issues, and produce a report with full reproduction evidence for every finding.
+
+## Setup
+
+Only the **Target app** is required. Everything else has sensible defaults.
+
+| Parameter | Default | Example override |
+|-----------|---------|-----------------|
+| **Target app** | _(required)_ | `Settings`, `com.example.app`, deep link URL |
+| **Platform** | Infer from user context; otherwise ask (`ios` or `android`) | `--platform ios` |
+| **Session name** | Slugified app/platform (for example `settings-ios`) | `--session my-session` |
+| **Output directory** | `./dogfood-output/` | `Output directory: /tmp/mobile-qa` |
+| **Scope** | Full app | `Focus on onboarding and profile` |
+| **Authentication** | None | `Sign in to user@example.com` |
+
+If the user gives enough context to start, begin immediately with defaults. Ask follow-up only when a required detail is missing (for example platform or credentials).
+
+Prefer direct `agent-device` binary when available.
+
+## Workflow
+
+```
+1. Initialize    Set up session, output dirs, report file
+2. Launch/Auth   Open app and sign in if needed
+3. Orient        Capture initial snapshot and map navigation
+4. Explore       Systematically test flows and states
+5. Document      Record reproducible evidence per issue
+6. Wrap up       Reconcile summary, close session
+```
+
+### 1. Initialize
+
+```bash
+mkdir -p {OUTPUT_DIR}/screenshots {OUTPUT_DIR}/videos
+cp {SKILL_DIR}/templates/dogfood-report-template.md {OUTPUT_DIR}/report.md
+```
+
+### 2. Launch/Auth
+
+Start a named session and launch target app:
+
+```bash
+agent-device --session {SESSION} open {TARGET_APP} --platform {PLATFORM}
+agent-device --session {SESSION} snapshot -i
+```
+
+If login is required:
+
+```bash
+agent-device --session {SESSION} snapshot -i
+agent-device --session {SESSION} fill @e1 "{EMAIL}"
+agent-device --session {SESSION} fill @e2 "{PASSWORD}"
+agent-device --session {SESSION} press @e3
+agent-device --session {SESSION} wait 1000
+agent-device --session {SESSION} snapshot -i
+```
+
+For OTP/email codes: ask the user, wait for input, then continue.
+
+### 3. Orient
+
+Capture initial evidence and navigation anchors:
+
+```bash
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/initial.png
+agent-device --session {SESSION} snapshot -i
+```
+
+Map top-level navigation, tabs, and key workflows before deep testing.
+
+### 4. Explore
+
+Read [references/issue-taxonomy.md](references/issue-taxonomy.md) for severity/category calibration.
+
+Strategy:
+
+- Move through each major app area (tabs, drawers, settings pages).
+- Test core journeys end-to-end (create, edit, delete, submit, recover).
+- Validate edge states (empty/error/loading/offline/permissions denied).
+- Use `diff snapshot -i` after UI transitions to avoid stale refs.
+- Periodically capture `logs path` and inspect the app log when behavior looks suspicious.
+
+Useful commands per screen:
+
+```bash
+agent-device --session {SESSION} snapshot -i
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/{screen-name}.png
+agent-device --session {SESSION} appstate
+agent-device --session {SESSION} logs path
+```
+
+### 5. Document Issues (Repro-First)
+
+Explore and document in one pass. When you find an issue, stop and fully capture evidence before continuing.
+
+#### Interactive/behavioral issues
+
+Use video + step screenshots:
+
+1. Start recording:
+
+```bash
+agent-device --session {SESSION} record start {OUTPUT_DIR}/videos/issue-{NNN}-repro.mp4
+```
+
+2. Reproduce with visible pacing. Capture each step:
+
+```bash
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-1.png
+sleep 1
+# perform action
+sleep 1
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-step-2.png
+```
+
+3. Capture final broken state:
+
+```bash
+sleep 2
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}-result.png
+```
+
+4. Stop recording:
+
+```bash
+agent-device --session {SESSION} record stop
+```
+
+5. Append issue immediately to report with numbered steps and screenshot references.
+
+#### Static/on-load issues
+
+Single screenshot is sufficient; no video required:
+
+```bash
+agent-device --session {SESSION} screenshot {OUTPUT_DIR}/screenshots/issue-{NNN}.png
+```
+
+Set **Repro Video** to `N/A` in the report.
+
+### 6. Wrap Up
+
+Target 5-10 well-evidenced issues, then finish:
+
+1. Reconcile summary severity counts in `report.md`.
+2. Close session:
+
+```bash
+agent-device --session {SESSION} close
+```
+
+3. Report total issues, severity breakdown, and highest-risk findings.
+
+## Guidance
+
+- Repro quality matters more than issue count.
+- Use refs (`@eN`) for fast exploration, selectors for deterministic replay assertions when needed.
+- Re-snapshot after any mutation (navigation, modal, list update, form submit).
+- Use `fill` for clear-then-type semantics; use `type` for incremental typing behavior checks.
+- Keep logs optional and targeted: enable/read app logs only when useful for diagnosis.
+- Never read source code of the app under test; findings must come from observed runtime behavior.
+- Write each issue immediately to avoid losing evidence.
+- Never delete screenshots/videos/report artifacts during a session.
+
+## References
+
+| Reference | When to Read |
+|-----------|--------------|
+| [references/issue-taxonomy.md](references/issue-taxonomy.md) | Start of session; severity/categories/checklist |
+
+## Templates
+
+| Template | Purpose |
+|----------|---------|
+| [templates/dogfood-report-template.md](templates/dogfood-report-template.md) | Copy into output directory as the report file |
diff --git a/skills/dogfood/references/issue-taxonomy.md b/skills/dogfood/references/issue-taxonomy.md
new file mode 100644
index 000000000..ce6fda91e
--- /dev/null
+++ b/skills/dogfood/references/issue-taxonomy.md
@@ -0,0 +1,83 @@
+# Issue Taxonomy (Mobile)
+
+Reference for categorizing issues found during mobile dogfooding.
+
+## Severity Levels
+
+| Severity | Definition |
+|----------|------------|
+| **critical** | Blocks a core workflow, causes data loss, or crashes/freeze loops the app |
+| **high** | Major feature broken or unusable, no practical workaround |
+| **medium** | Feature works with notable friction or partial failure; workaround exists |
+| **low** | Minor cosmetic or polish issue |
+
+## Categories
+
+### Visual / UI
+
+- Layout broken, clipped, overlapped, or unreadable text
+- Safe-area/notch overlap issues
+- Incorrect dark/light appearance rendering
+- Missing assets/icons
+- Animation glitches or flicker
+
+### Functional
+
+- Buttons/controls do nothing or trigger wrong action
+- Flows fail (create/edit/delete/submit)
+- Navigation dead-ends or wrong destination
+- State loss after background/foreground transitions
+- Deep link opens wrong screen or fails
+
+### UX
+
+- Confusing hierarchy or navigation labels
+- Missing loading/progress feedback
+- Unclear error handling or no recovery affordance
+- Excessive steps for common tasks
+- Inconsistent behavior between similar screens
+
+### Content
+
+- Typos, incorrect copy, placeholder text
+- Wrong labels/help text
+- Truncated text with no affordance
+- Inconsistent terminology across screens
+
+### Performance
+
+- Slow startup or route transitions
+- Input lag or gesture jank
+- Scroll hitches/frame drops
+- Notable battery/thermal symptoms during basic usage
+
+### Diagnostics / Logs
+
+- Native crashes or repeated fatal exceptions
+- Repeated warnings correlated with broken behavior
+- Unhandled runtime errors visible during repro
+
+### Permissions / Platform
+
+- Permission prompt flow broken or loops forever
+- Denied permissions not handled gracefully
+- Platform-specific regressions (iOS-only or Android-only)
+- Background/foreground lifecycle regressions
+
+### Accessibility
+
+- Missing labels or incorrect accessibility names
+- Focus order/navigation issues for assistive tech
+- Low contrast or unreadable text scaling
+- Touch targets too small for reliable interaction
+
+## Exploration Checklist
+
+1. Visual scan: capture screenshot; verify layout/safe areas/text/icon rendering.
+2. Interactions: press controls, open menus/modals, validate expected response.
+3. Forms/input: test valid/invalid/empty/boundary input.
+4. Navigation: traverse all top-level sections and return paths.
+5. App states: loading/empty/error/offline/permission-denied/background-resume.
+6. Logs/diagnostics: inspect app logs when behavior is suspicious.
+7. Platform parity: verify critical flows on each requested platform.
+8. Accessibility basics: labels, touch target sizes, readability/contrast.
diff --git a/skills/dogfood/templates/dogfood-report-template.md b/skills/dogfood/templates/dogfood-report-template.md
new file mode 100644
index 000000000..92373f0ce
--- /dev/null
+++ b/skills/dogfood/templates/dogfood-report-template.md
@@ -0,0 +1,52 @@
+# Dogfood Report: {APP_NAME}
+
+| Field | Value |
+|-------|-------|
+| **Date** | {DATE} |
+| **Platform** | {PLATFORM} |
+| **Target App** | {TARGET_APP} |
+| **Session** | {SESSION_NAME} |
+| **Scope** | {SCOPE} |
+
+## Summary
+
+| Severity | Count |
+|----------|-------|
+| Critical | 0 |
+| High | 0 |
+| Medium | 0 |
+| Low | 0 |
+| **Total** | **0** |
+
+## Issues
+
+
+
+### ISSUE-001: {Short title}
+
+| Field | Value |
+|-------|-------|
+| **Severity** | critical / high / medium / low |
+| **Category** | visual / functional / ux / content / performance / diagnostics / permissions / accessibility |
+| **Screen / Route** | {screen where issue was found} |
+| **Repro Video** | {path to video, or N/A for static issues} |
+
+**Description**
+
+{What is wrong, what was expected, and what actually happened.}
+
+**Repro Steps**
+
+1. Open {screen/entry point}
+   ![Step 1](screenshots/issue-001-step-1.png)
+
+2. {Action}
+   ![Step 2](screenshots/issue-001-step-2.png)
+
+3. {Action}
+   ![Step 3](screenshots/issue-001-step-3.png)
+
+4. **Observe:** {broken behavior}
+   ![Result](screenshots/issue-001-result.png)
+
+---
diff --git a/website/docs/docs/introduction.md b/website/docs/docs/introduction.md
index f3bbfb1e5..74d1fe929 100644
--- a/website/docs/docs/introduction.md
+++ b/website/docs/docs/introduction.md
@@ -11,6 +11,7 @@ title: Introduction
 - Session-aware workflows and replay
 
 If you know `agent-browser`, this is the mobile-native counterpart for iOS/Android UI automation.
+For exploratory QA and bug-hunting workflows, see `skills/dogfood/SKILL.md` in this repository.
 
 ## What it’s good at
 
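
Note on the dogfood skill added in this patch: two steps leave their mechanics implicit, namely deriving the default session name ("Slugified app/platform") and reconciling the severity counts at wrap-up. The shell sketch below shows one way both could work. The function names `slugify_session` and `count_severity` and the exact slug rules are hypothetical, and the counter assumes every issue table has a filled-in `| **Severity** | <level> |` row; none of this is part of the skill or of `agent-device` itself.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build a session slug such as "settings-ios".
# Assumed rules (not specified by the skill): lowercase everything,
# collapse each run of non-alphanumerics into one "-", trim edge dashes.
slugify_session() {
  local app="$1" platform="$2"
  printf '%s-%s' "$app" "$platform" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# Hypothetical helper: count issues of one severity in a report that
# follows the template, by matching each issue's "**Severity**" table row.
# grep -c exits non-zero on zero matches, so "|| true" keeps the function
# usable under "set -e" while still printing 0.
count_severity() {
  local report="$1" level="$2"
  grep -ciE "^\| \*\*Severity\*\* \| ${level} \|" "$report" || true
}
```

With these definitions, `slugify_session Settings ios` prints `settings-ios`, matching the example in the Setup table, and `count_severity report.md high` prints the number to enter in the Summary table's High row. An unedited template row (`critical / high / medium / low`) is not counted, because the pattern requires the level to be immediately followed by the closing pipe.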