spectatty (spectate-tty)

A toolkit for building TUI applications with AI agents. Agents can spawn real terminal sessions, interact with them like a human would, screenshot the rendered output, and share live sessions with you - so they can actually see and debug the UI as they build it, rather than guessing from raw text.

Exposed as an MCP server (and CLI) so any MCP-capable agent can use it out of the box.

The problem

Nowadays, everyone is building a terminal application with AI (myself included), but it's very common to get stuck in a loop of: prompt for changes -> test the UI manually -> tell the agent what's wrong -> repeat. The agent is essentially flying blind.

When building web applications, this isn't a problem - tools like Playwright let agents inspect and interact with the rendered UI as they build it. There's no equivalent for TUI applications. Here's why that's hard.

What agents currently have access to for viewing terminal output:

tmux capture-pane - strips all color and styling, returns raw text:

htop	opencode

Meanwhile, this is what the UI actually looks like from a human perspective:

htop	opencode

What spectatty gives agents:

Using spectatty, agents can actually render and screenshot the UI as it looks from the human perspective, complete with spacing, colors, fonts, etc. This allows it to debug UI bugs that are not even visible in the text-only view (for example, you might be using colors to indicate a boundary in your UI, or the spacing might be subtly off between components).

htop	opencode

Demo

Prompt given to the agent:

My CLI currently shows dim and not dim text for thinking and non-thinking output. Please change this to use blue for thinking and green for text instead. Record a before and after mp4 showing the full scope of the changes.

The agent makes the code changes, then autonomously records before and after MP4s to show the result. In the video below, the left terminal is the agent working in Claude Code. The right terminal is a live view of the agent's spectatty session via spectatty attach -- all the typing there is done by the agent, not the user.

spectatty-demo-v2-hq.mp4

The before and after MP4s the agent produced:

Before	After
before.mp4	after.mp4

Additional problems that spectatty solves

Spawning a real terminal - a bash tool is usually a subprocess with piped stdio, not a full PTY. TUI apps that rely on terminal size, alternate screen, mouse input, or raw keystrokes break in a bash tool. spectatty spawns a genuine PTY backed by a headless terminal emulator, so you can test a full suite of terminal behavior.
Interaction at the right level of abstraction - writing raw escape sequences is flexible but unreadable. If the agent tells you it ran \x1b[A\x1b[Acontinue\r you have no idea what it did. spectatty exposes discrete human-like actions (terminal_type, terminal_key, terminal_ctrl, etc.) so agent traces are readable and reproducible. Raw escape access is still available via terminal_write as a last resort - if the agent has to use it, that usually means a gap in the action set or a problem with the app.
Human collaboration - two parts. First, how does the user see what the agent is doing? spectatty attach <sessionId> lets you watch a live agent-driven session in your own terminal, in real time. terminal_screenshot accepts a savePath so PNGs get written and viewed by the user. Second, how does the user share context back? Jump into the attached session yourself with spectatty attach <sessionId>, reproduce a bug, and the agent sees the result the next time it screenshots.

How it works

bun-pty (spawn process in a real PTY)
  -> @xterm/headless (parse escape sequences into virtual screen buffer)
  -> @napi-rs/canvas (render cell grid to PNG)
  -> MCP server (expose as tools over stdio)

The MCP server delegates to a long-running daemon that owns the PTY sessions. Every byte the program writes passes through xterm.js in-process, so the rendered screen is always current without polling. A Unix socket per session lets a human attach and watch live.

Tools

Input

Tool	Description
`terminal_type`	Type text, exactly as a user would. Add `submit: true` to press Enter after. Preferred over `terminal_write` for text input.
`terminal_key`	Press a named key: `enter`, `backspace`, `tab`, `escape`, `up`/`down`/`left`/`right`, `page_up`/`page_down`, `home`/`end`, `f1`--`f12`. Supports `times` to repeat.
`terminal_ctrl`	Send a Ctrl+key combination: `c` (interrupt), `d` (EOF), `z` (suspend), `l` (clear), etc.
`terminal_send_scroll`	Send scroll input up or down.
`terminal_mouse`	Send a mouse event (click, move, down, up) at a specific column/row position.
`terminal_write`	Send raw input with escape sequences (`\r`, `\x03`, `\e`). Last resort - prefer the tools above.

Observation

Tool	Description
`terminal_screenshot`	Capture the current screen as PNG, plain text, or both. `savePath` writes the PNG to disk. `viewportTop` scrolls to a specific line before capturing.
`terminal_wait_for`	Wait for a regex pattern to appear in the terminal text. Polls until matched or timeout.
`terminal_list`	List all active sessions with their dimensions and exit status.

Session lifecycle

Tool	Description
`terminal_spawn`	Spawn a new terminal session. Returns a `sessionId` and the path to the attach socket.
`terminal_resize`	Resize a session to new dimensions.
`terminal_kill`	Kill a session and clean up all resources.

Recording

Tool	Description
`terminal_record_start`	Start recording terminal output as an asciicast v2 `.cast` file.
`terminal_record_stop`	Stop recording. Saves the `.cast` file and automatically writes a sidecar `.tape.json` alongside it (same path, `.cast` extension replaced with `.tape.json`).
`terminal_to_gif`	Convert a `.cast` recording to an animated GIF (requires `agg`).
`terminal_to_mp4`	Convert a `.cast` recording to an MP4 video (requires `ffmpeg`).
`terminal_export_tape`	Export the session's interaction log as a replayable `.tape.json` file.
`terminal_replay_tape`	Replay a `.tape.json` into a fresh session and return the live session ID.

CLI

spectatty <subcommand>

Subcommand	Description
`mcp`	Start the MCP server on stdio
`server start/stop/status`	Manage the background daemon
`ctl <subcommand>`	Control terminal sessions (mirrors all MCP tools - see below)
`attach <sessionId>`	Attach your terminal to a live session. Uses Ctrl+A as a prefix key for control commands - press Ctrl+A then: `d` to detach, `l` to acquire the agent lock (blocks agent writes so you can type freely), `u` to release the lock, `s` to take a screenshot while locked. Send a literal Ctrl+A by pressing Ctrl+A twice. If you get stuck, Ctrl+A then `d` always exits.
`tail <file.cast>`	Live-tail an asciicast recording as it's being written
`to-gif <input.cast> <output.gif>`	Convert a recording to an animated GIF (requires `agg`)
`to-mp4 <input.cast> <output.mp4>`	Convert a recording to MP4 (requires `ffmpeg`)
`replay-cast <file.cast>`	Play back a `.cast` file in the current terminal with original timing
`replay-tape <file.tape.json>`	Replay a tape file. Produces a `.cast` by default; `--live` replays into the current terminal.

spectatty ctl exposes every MCP tool as a subcommand for scripting and debugging: spawn, list, type, key, ctrl, write, screenshot, resize, kill, scroll, mouse, wait-for, record-start, record-stop, export-tape, replay-tape.

Install

Requires Bun (v1.0+). The to-gif and to-mp4 commands also require agg and ffmpeg respectively.

bun install -g spectatty
spectatty mcp

Usage

MCP client configuration

Add to your MCP client config (e.g. Claude Desktop, Cursor, etc.):

{
  "mcpServers": {
    "spectatty": {
      "command": "spectatty",
      "args": ["mcp"]
    }
  }
}

Note: If spectatty is not on your MCP client's PATH, use the full path (e.g. ~/.bun/bin/spectatty).

Development

git clone https://github.com/dayvidwang/spectatty
cd spectatty
bun install
bun run src/cli.ts mcp

Example interaction

1. terminal_spawn(cols: 220, rows: 50)                        -> { sessionId: "term-1", attachSocket: "/tmp/spectatty-*.sock" }
2. terminal_type(sessionId: "term-1", text: "vim file.txt", submit: true)
3. terminal_wait_for(sessionId: "term-1", pattern: "vim")
4. terminal_screenshot(sessionId: "term-1")                   -> PNG + text of vim loaded
5. terminal_key(sessionId: "term-1", key: "i")               -> enter insert mode
6. terminal_type(sessionId: "term-1", text: "hello world")
7. terminal_key(sessionId: "term-1", key: "escape")
8. terminal_type(sessionId: "term-1", text: ":wq", submit: true)
9. terminal_screenshot(sessionId: "term-1")                   -> confirm file saved
10. terminal_kill(sessionId: "term-1")

Meanwhile, a human can run spectatty attach term-1 to watch the session live.

Stack

Layer	Package
PTY	bun-pty
Terminal emulation	@xterm/headless
PNG rendering	@napi-rs/canvas
HTML serialization	@xterm/addon-serialize
MCP server	@modelcontextprotocol/sdk

Design notes

Why not compose tmux + asciinema?

The obvious alternative is to shell out to existing tools: tmux for session management, asciinema for recording, VHS as a reference for scripted interaction.

The core issue: tmux capture-pane is snapshot-only (you poll it). tmux pipe-pane streams raw PTY bytes, but to produce a screenshot from those bytes you'd still need to parse them through a terminal emulator yourself. You'd end up reimplementing the rendering pipeline anyway, with an extra subprocess hop in the middle.

The fundamental property spectatty needs is rendered state always current, zero polling. That requires owning the PTY directly so xterm.js can process every byte in-process as it arrives.

VHS is the closest spiritual predecessor - it drives a headless terminal and renders output. The difference is that VHS is a batch tool (script in -> video out), while spectatty is interactive (agent drives the session live, takes screenshots at arbitrary points, waits for patterns, keeps the session alive indefinitely).

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 70 Commits
assets		assets
benchmark		benchmark
src		src
.gitignore		.gitignore
README.md		README.md
SKILL.md		SKILL.md
TODO.md		TODO.md
bun.lock		bun.lock
mise.toml		mise.toml
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

spectatty (spectate-tty)

The problem

Demo

Additional problems that spectatty solves

How it works

Tools

Input

Observation

Session lifecycle

Recording

CLI

Install

Usage

MCP client configuration

Development

Example interaction

Stack

Design notes

Why not compose tmux + asciinema?

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

spectatty (spectate-tty)

The problem

Demo

Additional problems that spectatty solves

How it works

Tools

Input

Observation

Session lifecycle

Recording

CLI

Install

Usage

MCP client configuration

Development

Example interaction

Stack

Design notes

Why not compose tmux + asciinema?

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages