The terminal equivalent of agent-browser
Terminal automation CLI for AI agents
Control vim, htop, lazygit, dialog, and other TUI applications programmatically
Note: Built with AI, for AI. This project was built with the support of an AI agent, planned thoroughly with a tight feedback loop, and reviewed at each step. While it has been tested extensively, edge cases may still exist. Use it in production with the level of caution you apply to any automation that can drive terminal software.
agent-terminal lets AI agents interact with terminal applications through a simple command-line interface. It manages pseudo-terminal (PTY) sessions with VT100 terminal emulation, captures screen state, and provides keyboard and mouse input for navigating terminal user interfaces. Think of it as headless terminal automation for AI workflows.
Note: Origin: agent-terminal is derived from the earlier pilotty project, and the mascot artwork also comes from pilotty.
- PTY session management: Spawn and manage terminal applications in background sessions.
- Terminal emulation: VT100 emulation for accurate screen capture and state tracking.
- Render modes: basic, styled, and color snapshot fidelity.
- ANSI text output: snapshot --format text with styled/color render modes emits ANSI SGR sequences that recreate terminal appearance.
- Keyboard-first interaction: Drive TUIs with press, type, wait, scroll, and click commands.
- AI-friendly output: Structured JSON responses with actionable suggestions on errors.
- Multi-session support: Run multiple terminal apps simultaneously in isolated sessions.
- Zero-config daemon: The daemon auto-starts on first use and auto-stops after 5 minutes idle.
agent-browser gives AI agents a browser automation surface. agent-terminal does the same for terminal apps, which makes it useful for TUIs, curses-style dashboards, interactive CLIs, and editor workflows that do not expose a web UI.
Download the agent-terminal-<target>.tar.gz asset for your platform from the latest GitHub Release, then extract and install the binary somewhere on your PATH:
tar -xzf agent-terminal-<target>.tar.gz
install -m 0755 agent-terminal ~/.local/bin/agent-terminal
The release also includes agent-terminal-completions.tar.gz if you prefer to install generated shell completions from the published artifacts.
Building from source requires Rust 1.70+.
git clone <repository-url> agent-terminal
cd agent-terminal
cargo build --release -p agent-terminal-cli
install -m 0755 target/release/agent-terminal ~/.local/bin/agent-terminal
Generate completions from the installed binary:
agent-terminal completions bash > ~/.local/share/bash-completion/completions/agent-terminal
agent-terminal completions zsh > ~/.zfunc/_agent-terminal
agent-terminal completions fish > ~/.config/fish/completions/agent-terminal.fish
npm distribution is intentionally disabled for now. The supported public install paths are GitHub release tarballs, shell completion artifacts, or building the binary from source.
agent-terminal --version
agent-terminal --help
| Platform | Architecture | Status |
|---|---|---|
| macOS | x64 (Intel) | ✅ |
| macOS | arm64 (Apple Silicon) | ✅ |
| Linux | x64 | ✅ |
| Linux | arm64 | ✅ |
| Windows | - | ❌ Not supported |
Windows is not currently supported because the daemon/runtime model depends on Unix domain sockets and POSIX PTY APIs.
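A minimal preflight sketch for the support table above, in case an install script should stop early on an unlisted platform. The function name is_supported_platform is hypothetical, and the mapping from uname values (Darwin, Linux, x86_64, amd64, arm64, aarch64) to the table rows is an assumption, not something the project documents.

```shell
# Return 0 when the OS/architecture pair matches a supported row in the table above.
# Arguments default to what uname reports on the current machine.
# Assumption: uname strings map onto the table as Darwin = macOS,
# x86_64/amd64 = x64, arm64/aarch64 = arm64.
is_supported_platform() {
  os=${1:-$(uname -s)}
  arch=${2:-$(uname -m)}
  case "$os" in
    Darwin|Linux) ;;
    *) return 1 ;;
  esac
  case "$arch" in
    x86_64|amd64|arm64|aarch64) return 0 ;;
    *) return 1 ;;
  esac
}
```

Typical use would be a guard such as: is_supported_platform || echo "platform not listed as supported" >&2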
# Spawn a deterministic interactive shell session
agent-terminal spawn --name shell env PS1='agent-terminal> ' bash --noprofile --norc -i
# Wait for the prompt before sending input
agent-terminal wait -s shell "agent-terminal> "
# Capture a baseline before triggering a visible change
HASH=$(agent-terminal snapshot -s shell | jq -r '.content_hash')
# Run one command
agent-terminal type -s shell "printf 'hello from agent-terminal\n'"
agent-terminal press -s shell Enter
# Wait for the change to settle, then inspect the result
agent-terminal snapshot -s shell --await-change "$HASH" --settle 100
agent-terminal snapshot -s shell --format text
# Clean up
agent-terminal kill -s shell
agent-terminal stop
Compatibility spellings remain available for existing scripts: agent-terminal key ..., Ctrl+..., Alt+..., short arrows like Up, and agent-terminal wait-for .... New docs and examples prefer press with Control+..., Meta+..., Option+..., and Arrow..., and use wait for simple text/regex polling.
Use agent-terminal snapshot --await-change <content_hash> --settle <ms> when you need to wait for the screen to both change and stabilize.
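The baseline-then-await pattern can be wrapped in a small helper. The sketch below is illustrative only: snapshot_hash, run_and_settle, and the AT variable are hypothetical names, and only subcommands and flags shown in this README are used. The hash is extracted textually with sed instead of being parsed as a JSON number. This is a precaution based on the 20-digit sample hash shown in the snapshot output in this README; tools that read JSON numbers as doubles may not round-trip a value that large, and whether that matters depends on the jq version and on how the daemon compares hashes, neither of which is specified here.

```shell
# AT names the CLI to invoke (hypothetical variable; override it to use a stub in tests).
AT="${AT:-agent-terminal}"

# Print the content_hash of a session's current screen, keeping every digit.
snapshot_hash() {
  "$AT" snapshot -s "$1" |
    sed -n 's/.*"content_hash"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' |
    head -n 1
}

# Type one command, press Enter, then block until the screen changes and settles.
# Usage: run_and_settle <session> <command text> [settle_ms]
run_and_settle() {
  session=$1
  cmd_text=$2
  settle_ms=${3:-100}
  baseline=$(snapshot_hash "$session")
  # Refuse to continue without a baseline; --await-change needs one.
  [ -n "$baseline" ] || return 1
  "$AT" type -s "$session" "$cmd_text" &&
    "$AT" press -s "$session" Enter &&
    "$AT" snapshot -s "$session" --await-change "$baseline" --settle "$settle_ms"
}
```

With a session named shell already waiting at its prompt, run_and_settle shell "printf 'hello\n'" 100 would print the settled snapshot JSON.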
Broader command grammar cleanup is intentionally deferred to M009-HANDOFF.md so M008 can standardize the preferred shell lifecycle without changing runtime or protocol semantics.
agent-terminal spawn <command> # Spawn a TUI app (for example, vim file.txt)
agent-terminal spawn --name myapp <cmd> # Spawn with a custom session name
agent-terminal spawn --cwd /path cmd # Spawn in a specific working directory
agent-terminal kill # Kill default session
agent-terminal kill -s myapp # Kill specific session
agent-terminal list-sessions # List all active sessions
agent-terminal stop # Stop the daemon and all sessions
agent-terminal daemon # Manually start daemon (usually auto-starts)
agent-terminal examples # Show end-to-end workflow example
agent-terminal snapshot # Full JSON with text
agent-terminal snapshot --format compact # JSON without text field
agent-terminal snapshot --format text # Plain text with cursor indicator
agent-terminal snapshot -s editor # Snapshot a specific session
# Render modes control style/color fidelity
agent-terminal snapshot --render basic # Text only
agent-terminal snapshot --render styled # Text attributes (bold, italic, underline, dim, inverse)
agent-terminal snapshot --render color # Full color (default: style_map + color_map)
# ANSI text output recreates terminal appearance
agent-terminal snapshot --format text --render color
agent-terminal snapshot --format text --render styled
# Wait for the screen to change before returning
HASH=$(agent-terminal snapshot | jq -r '.content_hash')
agent-terminal press Enter
agent-terminal snapshot --await-change "$HASH"
agent-terminal snapshot --await-change "$HASH" --settle 100
agent-terminal type "hello"
agent-terminal press Enter
agent-terminal press Control+C
agent-terminal press Meta+F
agent-terminal press ArrowUp
agent-terminal press F1
agent-terminal press Tab
agent-terminal press Escape
agent-terminal press "Control+X m"
agent-terminal press "Escape : w q Enter"
agent-terminal press "a b c" --delay 50
agent-terminal click 10 5
agent-terminal scroll up
agent-terminal scroll down 5
Compatibility spellings: key, Ctrl+..., Alt+..., and short arrows like Up still work. Prefer press with Control+..., Meta+..., Option+..., and Arrow... when writing new examples or prompts.
agent-terminal resize 120 40
agent-terminal wait "Ready"
agent-terminal wait "Error" --regex
agent-terminal wait "Done" -t 5000
Prefer agent-terminal wait for literal text or regex polling. agent-terminal wait-for ... remains available as a compatibility alias for existing scripts.
Use agent-terminal snapshot --await-change <content_hash> --settle <ms> when you need to wait for the screen to both change and stabilize.
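When a wait does not succeed, it usually helps to see what the screen actually showed. The following sketch wraps wait with that fallback. It assumes that agent-terminal wait exits non-zero when the timeout elapses, which this README does not confirm; wait_or_dump and the AT variable are hypothetical names.

```shell
# AT names the CLI to invoke (hypothetical variable; override it to use a stub in tests).
AT="${AT:-agent-terminal}"

# Wait for text in a session; if the wait fails, print the current screen to stderr.
# Assumption: `wait` signals a timeout through a non-zero exit status.
# Usage: wait_or_dump <session> <text> [timeout_ms]
wait_or_dump() {
  session=$1
  pattern=$2
  timeout_ms=${3:-5000}
  if "$AT" wait -s "$session" "$pattern" -t "$timeout_ms"; then
    return 0
  fi
  echo "wait for '$pattern' failed in session '$session'; current screen:" >&2
  "$AT" snapshot -s "$session" --format text --render basic >&2
  return 1
}
```

The basic render mode is used for the dump so the diagnostic output stays free of ANSI sequences.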
The snapshot command returns structured data about the terminal screen:
{
"snapshot_id": 42,
"size": { "cols": 80, "rows": 24 },
"cursor": { "row": 5, "col": 10, "visible": true },
"text": "Options: [x] Enable [ ] Debug\nActions: [OK] [Cancel]",
"elements": [
{ "kind": "toggle", "row": 0, "col": 9, "width": 3, "text": "[x]", "confidence": 1.0, "checked": true },
{ "kind": "toggle", "row": 0, "col": 22, "width": 3, "text": "[ ]", "confidence": 1.0, "checked": false },
{ "kind": "button", "row": 1, "col": 9, "width": 4, "text": "[OK]", "confidence": 0.8 },
{ "kind": "button", "row": 1, "col": 14, "width": 8, "text": "[Cancel]", "confidence": 0.8 }
],
"content_hash": 12345678901234567890
}
Sessions always capture full style and color data internally, and agent-terminal snapshot defaults to --render color unless you override it per request.
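The elements array in the snapshot JSON above can be reduced to a short table of detected controls. This sketch relies on jq, which the Quick Start already uses. It assumes the field names shown in the sample (kind, row, col, text, confidence); the helper name list_elements and the 0.9 default threshold are illustrative choices.

```shell
# Read snapshot JSON on stdin and print kind, row, col, and text (tab-separated)
# for every element whose confidence is at least the given threshold.
# Usage: agent-terminal snapshot -s <session> | list_elements [min_confidence]
list_elements() {
  min_confidence=${1:-0.9}
  jq -r --argjson min "$min_confidence" \
    '.elements[]? | select(.confidence >= $min) | [.kind, .row, .col, .text] | @tsv'
}
```

Applied to the sample above with the default threshold, only the two toggle elements pass, because the buttons are reported with a confidence of 0.8.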
| Mode | --format full (JSON) | --format text |
|---|---|---|
| basic | text + elements | Plain text with cursor indicator |
| styled | text + elements + style_map | ANSI text with bold/italic/underline |
| color (default) | text + elements + style_map + color_map | ANSI text with full color |
agent-terminal snapshot
agent-terminal snapshot --render styled
agent-terminal snapshot --render basic
With --format text and --render styled or --render color, the output contains ANSI SGR escape sequences that recreate the terminal's visual appearance:
agent-terminal spawn ls --color
agent-terminal snapshot --format text --render color
agent-terminal snapshot -s myapp --format text --render color | less -R
agent-terminal uses a daemon architecture similar to agent-browser:
┌────────────────┐     Unix Socket      ┌─────────────────┐
│      CLI       │ ──────────────────▶  │     Daemon      │
│ agent-terminal │      JSON-line       │ (auto-started)  │
└────────────────┘                      └─────────────────┘
                                                 │
                                        ┌────────┴────────┐
                                        ▼                 ▼
                                  ┌───────────┐     ┌───────────┐
                                  │  Session  │     │  Session  │
                                  │  (htop)   │     │   (vim)   │
                                  └───────────┘     └───────────┘
- Auto-start: The daemon starts automatically on the first command.
- Auto-stop: The daemon shuts down after 5 minutes with no active sessions.
- Session cleanup: Sessions are removed when their child process exits.
- Shared state: Multiple CLI invocations can address the same named session.
- Clean shutdown: agent-terminal stop gracefully terminates all sessions.
The daemon socket directory is resolved in this priority order:
1. $AGENT_TERMINAL_SOCKET_DIR/{session}.sock
2. $XDG_RUNTIME_DIR/agent-terminal/{session}.sock
3. ~/.agent-terminal/{session}.sock
4. /tmp/agent-terminal/{session}.sock
Related public environment variables:
| Variable | Description |
|---|---|
| AGENT_TERMINAL_SESSION | Default session name |
| AGENT_TERMINAL_SOCKET_DIR | Override socket directory |
| RUST_LOG | Logging level (for example debug, info) |
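The socket priority order and the variables above can be mirrored in a short shell function, for example to check which path a script should expect. This is a sketch under stated assumptions, not the daemon's actual resolution code: it treats a variable as set only when it is non-empty, falls back to /tmp only when HOME is empty, and uses the literal name default when no session is given. None of those details are documented here, and resolve_socket_path is a hypothetical name.

```shell
# Print the socket path that the documented priority order would select.
# Assumptions (not confirmed by the project): empty variables are skipped,
# /tmp is used only when HOME is empty, and "default" is a placeholder session name.
# Usage: resolve_socket_path [session]
resolve_socket_path() {
  session=${1:-${AGENT_TERMINAL_SESSION:-default}}
  if [ -n "${AGENT_TERMINAL_SOCKET_DIR:-}" ]; then
    dir=$AGENT_TERMINAL_SOCKET_DIR
  elif [ -n "${XDG_RUNTIME_DIR:-}" ]; then
    dir=$XDG_RUNTIME_DIR/agent-terminal
  elif [ -n "${HOME:-}" ]; then
    dir=$HOME/.agent-terminal
  else
    dir=/tmp/agent-terminal
  fi
  printf '%s/%s.sock\n' "$dir" "$session"
}
```

Passing the session name explicitly takes precedence over AGENT_TERMINAL_SESSION in this sketch.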
# Spawn a named shell session with an explicit prompt
agent-terminal spawn --name shell env PS1='agent-terminal> ' bash --noprofile --norc -i
# Wait for the prompt before sending input
agent-terminal wait -s shell "agent-terminal> "
# Capture a baseline hash, then trigger visible output
HASH=$(agent-terminal snapshot -s shell | jq -r '.content_hash')
agent-terminal type -s shell "printf 'hello from agent-terminal\n'"
agent-terminal press -s shell Enter
# Wait for the output to change and settle, then inspect it
agent-terminal snapshot -s shell --await-change "$HASH" --settle 100
agent-terminal snapshot -s shell --format text
# Clean up the session and daemon explicitly
agent-terminal kill -s shell
agent-terminal stop
Most coding agents can work directly from the CLI help:
Use agent-terminal to drive one shell session end to end. Start with agent-terminal examples or --help, then follow the lifecycle: spawn a named shell with an explicit prompt, wait for readiness, type a command, press Enter, snapshot --await-change --settle after the visible change, and clean up with kill then stop.
## Terminal Automation
Use `agent-terminal` for TUI automation. Run `agent-terminal examples` or `agent-terminal --help` before the first interaction.
Preferred shell lifecycle:
1. `agent-terminal spawn --name shell env PS1='agent-terminal> ' bash --noprofile --norc -i` - Start a deterministic shell session.
2. `agent-terminal wait -s shell "agent-terminal> "` - Wait for the prompt before sending input.
3. `HASH=$(agent-terminal snapshot -s shell | jq -r '.content_hash')` - Capture a baseline before the next visible change.
4. `agent-terminal type -s shell "printf 'hello from agent-terminal\n'"` then `agent-terminal press -s shell Enter` - Trigger one command.
5. `agent-terminal snapshot -s shell --await-change "$HASH" --settle 100` - Wait for the screen to change and stabilize.
6. `agent-terminal snapshot -s shell --format text` - Read the updated terminal state.
7. `agent-terminal kill -s shell` then `agent-terminal stop` - Clean up explicitly when done.
Compatibility spellings remain available for existing scripts: `agent-terminal key ...`, `Ctrl+...`, `Alt+...`, short arrows like `Up`, and `agent-terminal wait-for ...`.
Deferred broader grammar work lives in [M009-HANDOFF.md](./M009-HANDOFF.md).
For AI-powered terminal apps that stream responses:
agent-terminal spawn --name ai opencode
agent-terminal wait -s ai "Ask anything" -t 15000
HASH=$(agent-terminal snapshot -s ai | jq -r '.content_hash')
agent-terminal type -s ai "write a haiku about rust"
agent-terminal press -s ai Enter
agent-terminal snapshot -s ai --await-change "$HASH" --settle 3000 -t 60000 --format text
agent-terminal scroll -s ai up 10
agent-terminal snapshot -s ai --format text
agent-terminal kill -s ai
Preferred contract: use agent-terminal press and spell modifiers/arrows as Control+..., Meta+..., Option+..., and Arrow... in new docs, prompts, and scripts.
Compatibility spellings: key, Ctrl+..., Alt+..., and short arrows like Up still work for existing scripts.
| Surface | Preferred examples | Compatibility notes |
|---|---|---|
| Named keys | Enter, Tab, Escape, Space, Backspace | Case-insensitive aliases like Return = Enter and Esc = Escape still work |
| Arrow keys | ArrowUp, ArrowDown, ArrowLeft, ArrowRight | Short forms like Up, Down, Left, Right still parse |
| Navigation | Home, End, PageUp, PageDown, Insert, Delete | Also: PgUp, PgDn, Ins, Del |
| Function keys | F1 - F12 | Unchanged |
| Modifier combos | Control+C, Meta+F, Option+F, Shift+A | Ctrl+C and Alt+F remain supported |
| Combined modifiers | Control+Option+C | Short-form combinations like Ctrl+Alt+C still parse |
| Special | Plus | Literal + character |
| Sequences | "Control+X m", "Escape : w q Enter" | Space-separated keys |
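For migrating existing scripts, the compatibility spellings in the table can be rewritten mechanically. The sketch below is a standalone shell helper, not part of agent-terminal; preferred_key_name and preferred_key_sequence are hypothetical names. It matches case-sensitively (the CLI itself is described as case-insensitive for aliases), and it rewrites Alt as Option, following the Ctrl+Alt+C and Control+Option+C pairing in the table. The table also lists Meta as a preferred spelling, so that choice is an assumption.

```shell
# Map one key or modifier token to its preferred spelling; unknown tokens pass through.
preferred_key_name() {
  case "$1" in
    Up|Down|Left|Right) printf 'Arrow%s' "$1" ;;
    Return) printf 'Enter' ;;
    Esc) printf 'Escape' ;;
    PgUp) printf 'PageUp' ;;
    PgDn) printf 'PageDown' ;;
    Ins) printf 'Insert' ;;
    Del) printf 'Delete' ;;
    Ctrl) printf 'Control' ;;
    Alt) printf 'Option' ;;   # assumption: Meta would be equally valid
    *) printf '%s' "$1" ;;
  esac
}

# Rewrite a space-separated key sequence such as "Ctrl+X m" into preferred spellings.
preferred_key_sequence() {
  remaining=$1
  out=
  while [ -n "$remaining" ]; do
    # Split off the next space-separated combo without triggering glob expansion.
    case "$remaining" in
      *" "*) combo=${remaining%% *}; remaining=${remaining#* } ;;
      *) combo=$remaining; remaining= ;;
    esac
    [ -n "$combo" ] || continue
    rewritten=
    while [ -n "$combo" ]; do
      # Split the combo on "+" and rewrite each part.
      case "$combo" in
        *+*) part=${combo%%+*}; combo=${combo#*+} ;;
        *) part=$combo; combo= ;;
      esac
      rewritten="${rewritten:+${rewritten}+}$(preferred_key_name "$part")"
    done
    out="${out:+${out} }$rewritten"
  done
  printf '%s\n' "$out"
}
```

A migrated call would then look like: agent-terminal press "$(preferred_key_sequence "Ctrl+X m")"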
Contributions are welcome. Before opening a change, run the project checks that match your scope (for example cargo fmt, cargo clippy --all --all-features, targeted tests, and any README/runtime audits touched by your changes).
MIT
- Inspired by agent-browser by Vercel Labs
- Built with vt100 for terminal emulation
- Built with portable-pty for PTY management
