A production-grade Model Context Protocol (MCP) server that hands Windsurf IDE (or any MCP-compatible agent) total vision, mouse, and keyboard control of your Windows, macOS, or Linux machine — with hard-interrupt safety you can trigger at any moment.
Built for the Director. Controlled by the Director. You interrupt, it stops.
- 38 MCP tools covering screenshots, live recording, mouse, keyboard, zoom, session control, runtime interrupt toggles, and hot-reloadable config
- Anthropic computer-use schema compatible — tool names match so Cascade / Claude recognize them natively
- Two independent kill switches:
  - Press Escape anywhere on your system
  - Move your mouse more than N pixels (configurable threshold)
- Live screen recording with a circular memory buffer and auto-save on interrupt
- Multi-monitor fast capture via `mss`
- Massive `config.json`: 9 sections, 50+ knobs, hot-reloadable
- Human-like input: configurable jitter, easing, timing variance
- Safety guards — rate limits, blocked regions, dangerous-key confirmation, session duration watchdog
- DPI-aware on Windows out of the box
- stdio transport — zero port conflicts, spawns directly from Windsurf

Install with pip:

    git clone https://github.com/ZannyTornadoCoding/Z-ComputerUse-MCP-Server
    cd Z-ComputerUse-MCP-Server
    python -m venv .venv
    .\.venv\Scripts\Activate.ps1
    pip install -e .
    copy config.example.json config.json

With optional extras (OCR + template matching + interrupt sound):

    pip install -e ".[full]"

Or install with uv:

    git clone https://github.com/ZannyTornadoCoding/Z-ComputerUse-MCP-Server
    cd Z-ComputerUse-MCP-Server
    uv venv
    .\.venv\Scripts\Activate.ps1
    uv pip install -e ".[full]"
    copy config.example.json config.json

Or run the install script:

    .\scripts\install.ps1

The commands above are written for Windows PowerShell. On macOS or Linux, activate the environment with `source .venv/bin/activate` and use `cp` instead of `copy`.

Add this entry to `~/.codeium/windsurf/mcp_config.json` (or `~/.codeium/windsurf-next/mcp_config.json` for Windsurf Next):

    {
      "mcpServers": {
        "z-computeruse": {
          "command": "python",
          "args": ["-m", "z_computeruse"],
          "env": {
            "Z_COMPUTERUSE_CONFIG": "<ABSOLUTE_PATH_TO_REPO>/config.json"
          }
        }
      }
    }

Replace `<ABSOLUTE_PATH_TO_REPO>` with the full path to your cloned repository (e.g. `C:/Users/you/code/Z-ComputerUse-MCP-Server` on Windows, `/home/you/code/Z-ComputerUse-MCP-Server` on Linux).
Then press the refresh button in Windsurf's MCP panel. Cascade should pick up 38 tools under the z-computeruse server.
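
If you would rather not hand-edit the path, the entry can be generated. The following is a minimal sketch, not part of the repository: `build_mcp_entry` is a hypothetical helper that resolves the repository directory to an absolute path and prints the JSON to paste into `mcp_config.json`.

```python
import json
from pathlib import Path


def build_mcp_entry(repo_dir: str, python_cmd: str = "python") -> dict:
    """Build the mcpServers entry with an absolute, forward-slash config path."""
    config_path = Path(repo_dir).expanduser().resolve() / "config.json"
    return {
        "mcpServers": {
            "z-computeruse": {
                "command": python_cmd,
                "args": ["-m", "z_computeruse"],
                "env": {"Z_COMPUTERUSE_CONFIG": config_path.as_posix()},
            }
        }
    }


# Run from inside the cloned repository; "." resolves to its absolute path.
entry = build_mcp_entry(".")
print(json.dumps(entry, indent=2))
```

When the package is installed in a virtual environment, passing that environment's interpreter path as `python_cmd` is one way to make sure Windsurf spawns the server with the right interpreter.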
Screenshots and vision:

| Tool | Purpose |
|---|---|
| `screenshot` | Full-screen capture with auto-scaling |
| `screenshot_region` | Capture a rectangle |
| `zoom_region` | Full-resolution crop for fine inspection |
| `get_screen_info` | Monitor list, resolutions, active primary |
| `find_text_on_screen` | OCR-based text locator (requires vision extras) |
| `find_image_on_screen` | Template-matching locator (requires vision extras) |

Live recording:

| Tool | Purpose |
|---|---|
| `start_recording` | Begin live frame capture to ring buffer |
| `stop_recording` | Stop + optionally save to MP4 |
| `get_recent_frames` | Retrieve N most recent frames |
| `get_recording_status` | Running state, frame count, duration |

Mouse:

| Tool | Purpose |
|---|---|
| `mouse_move` | Move cursor to (x, y) |
| `mouse_position` | Get current cursor position |
| `left_click` / `right_click` / `middle_click` / `double_click` / `triple_click` | Click variants |
| `left_click_drag` | Click-drag between two points |
| `scroll` | Wheel scroll with direction + clicks |
| `mouse_down` / `mouse_up` | Fine-grained click control |

Keyboard:

| Tool | Purpose |
|---|---|
| `type_text` | Type a string with human-like variance |
| `key` | Press key or combo (`ctrl+s`, `alt+tab`) |
| `hold_key` | Hold a key for N seconds |
| `key_down` / `key_up` | Fine-grained key control |

Session control:

| Tool | Purpose |
|---|---|
| `wait` | Pause for N seconds (interruptible) |
| `get_status` | Interrupt state, session info, safety stats |
| `reset_interrupt` | Clear the interrupt flag so the agent can resume |
| `pause_session` / `resume_session` | Temporarily halt all actions |
| `end_session` | Finalize session, optionally save recording |

Config and interrupt toggles:

| Tool | Purpose |
|---|---|
| `get_config` | Fetch current config (whole or by section) |
| `reload_config` | Hot-reload `config.json` without restart; propagates interrupt changes live |
| `save_current_config` | Persist in-memory config back to disk |
| `set_mouse_interrupt` | Toggle mouse-movement kill switch on/off at runtime |
| `set_escape_interrupt` | Toggle escape-key kill switch on/off at runtime |
| `set_interrupts_enabled` | Master switch for the entire interrupt manager |
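
For a sense of what an agent sends when it invokes one of these tools, here is a rough sketch of MCP `tools/call` requests as they would travel over the stdio transport (one JSON-RPC message per line). The `tool_call` helper is hypothetical, and the argument names `x` and `y` are assumptions for illustration; check the tool schemas the server reports for the real parameter names.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per connection


def tool_call(name: str, **arguments) -> str:
    """Serialize one MCP tools/call request as a single JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names below are assumed, not taken from the server's schema.
move_request = tool_call("mouse_move", x=640, y=360)
status_request = tool_call("get_status")
print(move_request)
print(status_request)
```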
Z-ComputerUse runs two independent global listeners from the moment it starts:
- Keyboard listener: watches for a single press of `Esc` (configurable) anywhere on the system.
- Mouse listener: watches for any movement more than `mouse_movement_threshold_px` pixels away from where the bot expected the cursor to be.

When either fires:
- A shared `threading.Event` gets set.
- The current tool call aborts (raises `InterruptedError`, caught at the tool boundary).
- All subsequent tool calls return `{"interrupted": true, "reason": "escape_key" | "mouse_movement"}` until you explicitly call `reset_interrupt`.
- If `save_on_interrupt` is on, the last `buffer_seconds` of screen recording is flushed to disk so you can review what it was doing.
- If `sound_on_interrupt` is on, a sound plays.
To resume: tell Cascade something like "I interrupted because X. The situation is now Y. Call reset_interrupt and continue with Z." Cascade then calls the tool, the flag clears, and it proceeds.
The bot can't self-reset without Cascade calling the tool, and Cascade can't call the tool unless you tell it to — so you always hold the kill switch.
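
The flow described above (a shared `threading.Event`, an `InterruptedError` caught at the tool boundary, and blocked calls until `reset_interrupt`) can be sketched roughly as follows. This is an illustration of the pattern, not the server's actual code: `InterruptFlag` and `guarded` are hypothetical names, and the timer stands in for the real keyboard listener.

```python
import functools
import threading
from typing import Optional


class InterruptFlag:
    """Shared flag that listeners set and tools check (hypothetical helper)."""

    def __init__(self) -> None:
        self._event = threading.Event()
        self._lock = threading.Lock()
        self.reason: Optional[str] = None

    def is_set(self) -> bool:
        return self._event.is_set()

    def trigger(self, reason: str) -> None:
        """Called from a listener thread; the first reason wins."""
        with self._lock:
            if not self._event.is_set():
                self.reason = reason
                self._event.set()

    def reset(self) -> None:
        with self._lock:
            self.reason = None
            self._event.clear()

    def sleep(self, seconds: float) -> None:
        """Interruptible wait: raises as soon as the flag is set."""
        if self._event.wait(timeout=seconds):
            raise InterruptedError(self.reason)


def guarded(flag: InterruptFlag):
    """Tool boundary: refuse to start when flagged, and catch mid-call aborts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if flag.is_set():
                return {"interrupted": True, "reason": flag.reason}
            try:
                return func(*args, **kwargs)
            except InterruptedError:
                return {"interrupted": True, "reason": flag.reason}
        return wrapper
    return decorator


flag = InterruptFlag()


@guarded(flag)
def wait(seconds: float) -> dict:
    flag.sleep(seconds)
    return {"ok": True}


# Simulate an Escape press 50 ms into a 5-second wait.
threading.Timer(0.05, flag.trigger, args=("escape_key",)).start()
first = wait(5)    # aborted mid-call
second = wait(0)   # refused: flag is still set
flag.reset()       # what reset_interrupt would do
third = wait(0)    # runs normally again
print(first, second, third)
```

Blocking on `Event.wait` rather than `time.sleep` is what lets a long pause end the moment the flag is set.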
All options live in `config.json`. Sections: `server`, `display`, `recording`, `mouse`, `keyboard`, `interrupt`, `safety`, `vision`, `session`, `advanced`.
Key knobs to tune:
| Key | Effect |
|---|---|
| `interrupt.mouse_movement_threshold_px` | Lower = more sensitive (default 35 px) |
| `interrupt.mouse_movement_grace_ms` | Time window where programmatic moves don't trigger interrupt |
| `mouse.move_duration_seconds` | Slower = more human-looking |
| `mouse.human_like_movement` | Adds easing + jitter |
| `display.screenshot_scale_max_edge_px` | 1568 is LLM-safe; higher = sharper but more tokens per screenshot |
| `recording.fps` | 5 is a good balance of quality vs. memory |
| `safety.max_actions_per_minute` | Rate limiter |
| `safety.max_session_duration_minutes` | Hard session cap |
| `safety.blocked_regions` | List of `[x, y, w, h]` rectangles the bot can't click |
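
To make a few of these knobs concrete, here is a small sketch of how they could be interpreted. The function names are hypothetical and the exact semantics are assumptions (Euclidean distance for the threshold, a time window after the last programmatic move for the grace period, half-open rectangles in screen coordinates for blocked regions, longest-edge downscaling for screenshots); the server's own logic may differ.

```python
import math


def exceeds_threshold(expected, actual, threshold_px=35.0):
    """True when the cursor is farther than threshold_px from where the bot left it."""
    return math.dist(expected, actual) > threshold_px


def in_grace_window(last_programmatic_move_s, now_s, grace_ms):
    """True while a recent programmatic move should still suppress the mouse interrupt."""
    return (now_s - last_programmatic_move_s) * 1000.0 < grace_ms


def is_blocked(x, y, blocked_regions):
    """True when (x, y) falls inside any [x, y, w, h] rectangle."""
    return any(
        rx <= x < rx + w and ry <= y < ry + h
        for rx, ry, w, h in blocked_regions
    )


def scaled_size(width, height, max_edge_px=1568):
    """Downscale so the longest edge is at most max_edge_px, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_edge_px:
        return width, height
    return (
        max(1, round(width * max_edge_px / longest)),
        max(1, round(height * max_edge_px / longest)),
    )


print(exceeds_threshold((100, 100), (130, 100)))    # 30 px away
print(exceeds_threshold((100, 100), (130, 130)))    # about 42 px away
print(in_grace_window(10.00, 10.05, grace_ms=150))  # 50 ms after a bot move
print(is_blocked(50, 50, [[0, 0, 100, 100]]))
print(scaled_size(3840, 2160))
```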
Environment variable Z_COMPUTERUSE_CONFIG overrides the default config path.
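
The override can be pictured with the sketch below. It assumes the default is a `config.json` in the working directory, which this README does not state, and the helper names are hypothetical; `get_section` mirrors the "whole or by section" behaviour described for `get_config`.

```python
import json
import os
import tempfile
from pathlib import Path

DEFAULT_CONFIG_PATH = Path("config.json")  # assumed default location


def resolve_config_path(environ=os.environ) -> Path:
    """Prefer Z_COMPUTERUSE_CONFIG when it is set and non-empty."""
    override = environ.get("Z_COMPUTERUSE_CONFIG")
    return Path(override).expanduser() if override else DEFAULT_CONFIG_PATH


def load_config(path: Path) -> dict:
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def get_section(config: dict, section=None):
    """Return the whole config, or one top-level section such as 'interrupt'."""
    return config if section is None else config[section]


# Usage with a throwaway file standing in for the real config.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(
        json.dumps({"interrupt": {"mouse_movement_threshold_px": 35}}),
        encoding="utf-8",
    )
    resolved = resolve_config_path({"Z_COMPUTERUSE_CONFIG": str(demo_path)})
    config = load_config(resolved)
    print(get_section(config, "interrupt"))
```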
For testing outside Windsurf:

    python -m z_computeruse

It will listen on stdio. Pair with MCP Inspector:

    npx -y @modelcontextprotocol/inspector python -m z_computeruse

Requirements:

- Python 3.11+
- Windows 10/11, macOS 12+, or Linux with X11
- `mss`, `pyautogui`, `pynput` (installed automatically)
- Optional: Tesseract OCR binary (for `find_text_on_screen`); ffmpeg is bundled automatically via `imageio-ffmpeg`

This tool gives an AI direct, unchecked control of your mouse, keyboard, and screen. It is designed for power users who understand the risks. Always:
- Keep one hand near the mouse / Escape key
- Start with `safety.max_actions_per_minute` low until you trust the agent
- Review recordings after every session
- Never run on a machine with sensitive credentials exposed to the agent
MIT — see LICENSE.