
feat: add droid-control plugin for terminal, browser, and computer automation #18

Open
factory-ain3sh wants to merge 5 commits into master from droid-control


factory-ain3sh commented Apr 10, 2026

Description

Adds droid-control as a Factory plugin that gives droids the ability to automate terminal TUIs, web pages, and Electron apps. Covers three workflows: recording demos, verifying behavior claims, and running QA test flows.

The plugin uses a composition architecture with three layers:

  • Orchestrator (skills/droid-control/) -- thin router with 3 orthogonal routing dimensions (target, stage, artifact) that tells the droid which atoms to load. ~93 lines.
  • 9 atom skills -- self-contained background knowledge loaded on demand. Split into driver atoms (tuistory, true-input, agent-browser), target atoms (droid-cli, pty-capture), stage atoms (capture, compose, verify), and a polish atom (showcase). Platform-specific content lives in platforms/ subdirs so droids load only the docs for the OS they are working on.
  • 3 workflow commands (/demo, /verify, /qa-test) -- thin intent declarations that parse user arguments into commitments, then delegate to stage atoms via hybrid handoffs (structured params for mechanical decisions, natural language for creative ones).

Drivers

  • tuistory -- virtual PTY automation with Playwright-style CLI, asciinema recording, forced truecolor output (FORCE_COLOR=3 baked into tctl runner)
  • true-input -- real terminal emulator automation via headless Wayland compositor (Linux), KVM/QEMU virsh send-key (Windows), or QEMU monitor sendkey (macOS). Platform docs split into platforms/{linux,windows,macos}.md to avoid context bloat.
  • agent-browser -- web/Electron automation via Playwright-backed CLI with CDP support

Concurrency

Multiple droids can film demos, run QA, or verify claims simultaneously on the same machine. Ground rule 5 requires every workflow to generate a run-scoped ID (RUN_ID), used as the prefix for all tctl session names, and a run-scoped temp directory (RUN_DIR) that holds all output files. This prevents session-name collisions in the shared /tmp/tctl-sessions/ namespace and conflicts between output paths.
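
A minimal sketch of how a workflow might satisfy ground rule 5, assuming a POSIX-style shell with mktemp available. The run-<timestamp>-<pid> ID format, the session_name helper, and the demo.cast file name are illustrative assumptions, not the plugin's actual scheme.

```shell
#!/bin/sh
set -eu

# Assumed ID format: timestamp plus shell PID, so two runs that start in the
# same second on one machine still get different IDs.
RUN_ID="run-$(date +%Y%m%d%H%M%S)-$$"

# mktemp -d creates the directory atomically and appends a random suffix,
# so no two runs can end up sharing an output directory.
RUN_DIR="$(mktemp -d "${TMPDIR:-/tmp}/${RUN_ID}.XXXXXX")"

# Hypothetical helper: prefix a logical session name with the run ID.
session_name() {
  printf '%s-%s' "$RUN_ID" "$1"
}

echo "session name: $(session_name demo)"
echo "recording to: $RUN_DIR/demo.cast"
```

Because every session name and output path derives from RUN_ID, a second droid running the same snippet gets a disjoint set of names without any coordination between runs.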

Video post-processing (Remotion)

The compose/showcase pipeline uses Remotion (React-based video renderer) for all video compositing. The Remotion project lives at plugins/droid-control/remotion/ (29 source files, 22 components).

Automatic cinematic layers (rendered without configuration):

  • Warm radial background with progressive amber glow intensification (Factory palette)
  • Floating accent-colored particles on Lissajous paths
  • SVG fractal noise overlay (analog film texture)
  • Color grade overlay (warm amber or cool blue, palette-driven)
  • Motion blur title-to-content transition (blur + scale + opacity crossfade)
  • Animated window chrome entrance (scale-in, traffic lights, glassmorphic border, warm-tinted shadow)
  • Staggered panel entrance for side-by-side layouts
  • Auto-scaled title/subtitle text via @remotion/layout-utils fitText (prevents long PR titles from overflowing)

Effect-driven layers (via props):

  • Spotlight overlays (SVG mask dimming with fade)
  • Keystroke overlays (pop-in monospace pills with backdrop-blur)
  • Directed zoom (3-phase: zoom in, hold, zoom out)
  • Section header banners (slide-in chapter labels with backdrop-blur)
  • Section transition sweeps (frosted-glass band that crosses the frame at section boundaries)

6 presets: factory, factory-hero, hero, macos, presentation, minimal. Each configures window chrome, spacing, background style, and palette. Factory presets produce rich cinematic warmth; Catppuccin presets stay subtle and cool.

Helper script (render-showcase.sh): auto-stages clips into Remotion public/, detects clip duration via ffprobe, renders, and cleans up. Eliminates the most common error class for droids.
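
The stage, detect, render, and clean-up sequence described above could look roughly like the following. This is a sketch under stated assumptions, not the contents of render-showcase.sh: the src/index.ts entry point, the Showcase composition ID, the clip and durationInFrames prop names, and the 30 fps default are all hypothetical. The ffprobe and remotion render invocations use standard flags of those tools.

```shell
#!/bin/sh

# Convert a duration in seconds (as printed by ffprobe, e.g. "12.48")
# into a frame count, rounded to the nearest whole frame.
duration_to_frames() {
  awk -v d="$1" -v fps="$2" 'BEGIN { print int(d * fps + 0.5) }'
}

# Print the container duration of a media file in seconds.
clip_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Stage a clip into public/, render it, and remove the staged copy.
# Must be run from the Remotion project directory.
render_showcase() {
  clip="$1"; out="$2"; fps="${3:-30}"
  name="$(basename "$clip")"
  staged="public/$name"

  cp "$clip" "$staged" || return 1
  frames="$(duration_to_frames "$(clip_duration "$staged")" "$fps")"

  status=0
  # Entry point, composition ID, and prop names are assumptions.
  npx remotion render src/index.ts Showcase "$out" \
    --props="{\"clip\":\"$name\",\"durationInFrames\":$frames}" || status=$?

  rm -f "$staged"   # clean up even when the render fails
  return "$status"
}
```

A call such as `render_showcase "$RUN_DIR/demo.mp4" "$RUN_DIR/showcase.mp4"` would then cover all four steps; deriving the frame count from ffprobe rather than typing it by hand is what guards against a mismatched duration.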

Design

See ARCHITECTURE.md for the rationale behind the composition model, waterfall routing, task delegation, hybrid handoffs, and platform isolation.

Plugin structure

plugins/droid-control/
├── README.md                                   # Plugin overview and quick start
├── ARCHITECTURE.md                             # Design rationale
├── NOTICES.md                                  # Third-party dependency attribution
├── bin/tctl                                    # Unified terminal control wrapper
├── remotion/                                   # Remotion video renderer (React/TSX)
│   ├── src/compositions/Showcase.tsx           # Main composition (wires all layers)
│   ├── src/components/                         # 22 visual components
│   ├── src/lib/                                # Schema, palettes, presets, duration
│   └── package.json                            # Remotion 4.x + React 19
├── scripts/
│   ├── render-showcase.sh                      # Render helper (stage + detect + render + cleanup)
│   ├── capture-terminal-bytes.py               # PTY byte capture matrix
│   ├── pty-hex-dumper.py                       # Raw-mode hex dumper
│   ├── windows/                                # Windows KVM/virsh VM control
│   └── macos/                                  # macOS QEMU monitor VM control
├── skills/
│   ├── droid-control/SKILL.md                  # Orchestrator (3 routing tables + 5 ground rules)
│   ├── tuistory/SKILL.md                       # Virtual PTY driver
│   ├── true-input/                             # Real terminal driver (per-platform)
│   ├── agent-browser/SKILL.md                  # Web/Electron driver
│   ├── droid-cli/SKILL.md                      # Droid CLI target patterns
│   ├── pty-capture/                            # Byte capture (per-platform)
│   ├── capture/SKILL.md                        # Recording lifecycle stage
│   ├── compose/SKILL.md                        # Video assembly (Remotion pipeline)
│   ├── verify/SKILL.md                         # Deliverable verification stage
│   └── showcase/SKILL.md                       # Visual polish reference
└── commands/
    ├── demo.md                                 # /demo -- record demo videos
    ├── verify.md                               # /verify -- test a claim and report findings
    └── qa-test.md                              # /qa-test -- automated QA flows

Potential Risk & Impact

  • No runtime code changes. Entirely plugin content (markdown skill docs, shell scripts, Remotion project). Zero impact on existing plugins or functionality.
  • Plugin skills are loaded on demand by droids only when relevant commands are invoked. No ambient token cost.
  • Platform-specific content is isolated in subdirs so droids only load the platform they need.
  • The Remotion project requires npm install in plugins/droid-control/remotion/ before first use. node_modules/ is gitignored.
  • External dependencies (tuistory, cage, wtype, ffmpeg, Node.js, Chrome) are documented in prerequisites but not auto-installed.
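
Since the external dependencies are not auto-installed, a quick pre-flight check can save a failed run. The helper below is a hypothetical sketch, not part of the plugin; the binary names passed to it (for example node for Node.js) are assumptions, and Chrome is left out of the usage example because its binary name varies by platform.

```shell
#!/bin/sh

# Print the name of every tool in the argument list that is not on PATH.
# Prints nothing when all tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For instance, `missing_tools tuistory cage wtype ffmpeg node` prints one line per missing binary, which a workflow could report before it starts recording.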

How Has This Been Tested?

  • All tctl operations validated end-to-end across Ghostty, Kitty, and Alacritty on Linux/Wayland
  • PTY byte capture confirmed encoding differences across terminal emulators
  • Color passthrough verified: FORCE_COLOR=3 produces full truecolor SGR codes in tuistory PTY recordings
  • Remotion project passes npx tsc --noEmit with zero errors (29 source files)
  • Remotion components audited by 5 parallel review workers checking: Remotion API correctness (no CSS animations, all frame-driven), no hallucinated APIs, performance (memoization, interpolation count), logic bugs (frequency math, frame boundaries, masking), and cross-file integration
  • Audit fixes applied: duration calculation, compound animation prevention, frequency correction, spotlight mask fix, useMemo optimizations
  • Title card auto-scaling verified with both long and short titles via remotion still
  • Section transition sweep verified via remotion still at mid-sweep frame
  • Concurrent session isolation verified: run-scoped session names and output directories prevent collisions
  • Windows VM control tested via virsh send-key with live KVM/QEMU Windows 11 VM
  • macOS VM control tested via QEMU monitor sendkey with live OSX-KVM VM
  • Plugin structure conforms to Factory plugin spec (.factory-plugin/plugin.json)
  • Coherence review: all atom cross-references verified, all routing table entries point to existing atoms
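
The color passthrough check in the list above can be approximated with a pattern match for 24-bit SGR sequences (38;2;R;G;B for foreground, 48;2;R;G;B for background). This is an illustrative sketch rather than the procedure actually used for this PR; the has_truecolor_sgr name is hypothetical. The pattern anchors on the `[` that follows the escape character, so it matches both raw terminal output and recordings where the escape byte is JSON-escaped.

```shell
#!/bin/sh

# Succeed if the file contains at least one truecolor SGR sequence,
# i.e. "[" followed by optional parameters and then 38;2;R;G;B or 48;2;R;G;B.
has_truecolor_sgr() {
  grep -Eq '\[([0-9]+;)*[34]8;2;[0-9]+;[0-9]+;[0-9]+' "$1"
}
```

Output that only uses the 16-color or 256-color forms (such as 31 or 38;5;208) does not match, so a failing check suggests FORCE_COLOR=3 did not reach the program under test.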

Trying it out

# 1. Register the Factory plugins marketplace (if not already added)
droid plugin marketplace add https://github.com/Factory-AI/factory-plugins

# 2. Install the plugin
droid plugin install droid-control@factory-plugins --scope user

# 3. Install Remotion dependencies (one-time, only needed for video rendering)
cd ~/.factory/plugins/cache/factory-plugins/droid-control/*/remotion && npm install

Alternatively, use the /plugins UI: open the Browse tab, select droid-control, and install.

factory-ain3sh and others added 3 commits April 9, 2026 22:25
Publish the droid-control plugin in the public marketplace so it can be installed directly from factory-plugins.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
… halftone texture locally, trim internal references

- Replace hard keyword gate for effects with three-tier model: utilitarian
  (default for demos), full (default for showcase), none (explicit opt-out only)
- Defer specific effect selection to compose stage where recordings exist
- Bundle bg-halftone-rotor.jpg locally instead of fetching from vercel at render time
- Remove monorepo path references (apps/cli/, packages/logging) from skill docs
- Update verify atom to check effects against committed tier

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- README: plugin overview, command examples, quick start, prerequisites
- ARCHITECTURE: design rationale -- UX for droids, waterfall routing,
  task delegation, orthogonal routing, hybrid handoffs, platform isolation
- NOTICES: third-party dependency attribution and license terms

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- agg is not on crates.io; use cargo install --git from asciinema/agg
- Replace brittle glob path for Remotion install with droid plugin list

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

Labels

enhancement New feature or request
