diff --git a/docs/cli/features/droid-control.mdx b/docs/cli/features/droid-control.mdx
index 92e7541..fd3b4dd 100644
--- a/docs/cli/features/droid-control.mdx
+++ b/docs/cli/features/droid-control.mdx
@@ -57,70 +57,104 @@ You also need the runtime tools for your use case (tuistory, agent-browser, ffmp
 Droid Control adds three slash commands. Each handles the full workflow end-to-end: planning, execution, recording, and reporting.
-
- Test a specific behavior claim and report findings with evidence.
+
+ Record a demo video of a feature or PR.
 ```
- /verify "ESC cancels streaming in bash mode"
+ /demo pr-1847
 ```
- Droid launches the app, attempts the claim, and reports what actually happened -- with screenshots and text snapshots as evidence.
+ Accepts a PR number, GitHub URL, or free-text description. Comparison PRs get side-by-side layout by default; new features get single-branch.
-
- The droid is framed as an **investigator**, not an advocate. If the claim is false, that's a valid finding. Anti-fabrication rules prevent staging evidence to match expected outcomes.
-
-
-
- Run automated QA against terminal CLIs or web/Electron apps.
+ Add flags for extra polish:
 ```
- /qa-test https://app.example.com -- login, create a project, invite a member
+ /demo pr-1847 -- showcase, keys
 ```
- Droid drives the browser (or terminal) through the flow, captures each step, and reports pass/fail with annotated screenshots.
+ | Flag | Effect |
+ |------|--------|
+ | `showcase` | Cinematic preset with warm backgrounds and film grain |
+ | `keys` | Keystroke overlay pills showing user actions |
+
+ #### How it works
+
+
+
+ Fetches the PR description, diff, and linked ticket. For each change, identifies what needs to be proven and what a viewer could confuse it with.
+
+
+ Scripts a sequence of actions that produces visible evidence the feature works. Both branches run identical interactions so only the behavior differs. Presents the plan and waits for your approval before recording.
+
+
+ Launches recorded sessions on the baseline and candidate branches in parallel using worker subagents.
+
+
+ Renders a polished video via Remotion with title cards, window chrome, and effects. Six visual presets range from cinematic (`factory`) to utilitarian (`minimal`).
+
+
+ Checks the final video against the original commitments before delivering.
+
+
+
-
- Record a demo video of a feature or PR.
+
+ Test a specific behavior claim and report findings with evidence.
 ```
- /demo pr-1847
+ /verify "ESC cancels streaming in bash mode"
 ```
- Droid reads the PR, scripts interactions that prove the change works, records both branches in parallel, and renders a side-by-side comparison video.
+ Also accepts a PR reference with an optional claim:
- Add flags for extra polish:
+ ```
+ /verify 11386 -- the fork flag creates a new session
+ ```
+
+ If given a PR number alone, Droid fetches the PR and identifies the most important testable claim.
+
+
+ The droid is framed as an **investigator**, not an advocate. If the claim is false, that's a valid finding. Anti-fabrication rules prevent staging evidence to match expected outcomes.
+
+
+ #### How it works
+
+
+
+ Identifies the specific behavior to observe and what evidence type is needed: text snapshots for functional claims, screenshots for visual claims, or raw byte captures for encoding claims.
+
+
+ Launches the app, runs the minimal interaction sequence that demonstrates the behavior, and captures the result. If the behavior contradicts the claim, that is evidence -- not an error.
+
+
+ Delivers a structured report with a **CONFIRMED**, **REFUTED**, or **INCONCLUSIVE** conclusion, along with all captured evidence inline.
+
+
+
+
+ Run automated QA against terminal CLIs, web apps, or Electron apps.
 ```
- /demo pr-1847 -- showcase, keys
+ /qa-test https://app.example.com -- login, create a project, invite a member
 ```
- | Flag | Effect |
- |------|--------|
- | `showcase` | Cinematic preset with warm backgrounds and film grain |
- | `keys` | Keystroke overlay pills showing user actions |
+ Also accepts a CLI command, Electron app name, PR reference, or free-text description. Test steps after `--` are optional -- Droid designs a reasonable flow if none are provided.
+
+ #### How it works
+
+
+
+ Determines the target (web, terminal, or Electron), designs test steps from your instructions or the app's UI, and identifies what evidence to capture at each step.
+
+
+ Launches the app and executes each step, capturing screenshots (browser) or text snapshots (terminal) along the way. If a step fails, it records the failure and continues for maximum coverage.
+
+
+ Delivers a step-level pass/fail table with inline evidence and a summary of any issues found.
+
+
+
-## How `/demo` works
-
-
-
- Fetches the PR description, diff, and linked ticket. Identifies what needs to be proven and what could be confused with existing behavior.
-
-
- Scripts a sequence of actions that produces visible evidence the feature works. For comparison PRs, both branches run identical interactions so only the behavior differs.
-
-
- Launches recorded sessions on the baseline and candidate branches in parallel using worker subagents.
-
-
- Renders a polished video via Remotion with title cards, window chrome, keystroke overlays, and effects. Six visual presets range from cinematic to utilitarian.
-
-
- Checks the final video against the original commitments before delivering.
-
-
-
 ### Example output
 Every video below was planned, recorded, and rendered entirely by a Droid.
@@ -140,24 +174,20 @@ Every video below was planned, recorded, and rendered entirely by a Droid.
- {/*
- To enable web/Electron demos, drop the videos into docs/images/features/ and uncomment:
-
-
+
-
+
- */}
 ## Automation drivers
diff --git a/docs/images/features/droid-control-web-comparison.mp4 b/docs/images/features/droid-control-web-comparison.mp4
new file mode 100644
index 0000000..cac988f
Binary files /dev/null and b/docs/images/features/droid-control-web-comparison.mp4 differ
diff --git a/docs/images/features/droid-control-web-single.mp4 b/docs/images/features/droid-control-web-single.mp4
new file mode 100644
index 0000000..dcfb7e4
Binary files /dev/null and b/docs/images/features/droid-control-web-single.mp4 differ