Z-ComputerUse V1.0.0

A production-grade Model Context Protocol (MCP) server that hands Windsurf IDE (or any MCP-compatible agent) total vision, mouse, and keyboard control of your Windows, macOS, or Linux machine — with hard-interrupt safety you can trigger at any moment.

Built for the Director. Controlled by the Director. You interrupt, it stops.


Highlights

  • 38 MCP tools covering screenshots, live recording, mouse, keyboard, zoom, session control, runtime interrupt toggles, and hot-reloadable config
  • Anthropic computer-use schema compatible — tool names match so Cascade / Claude recognize them natively
  • Two independent kill switches:
    • Press Escape anywhere on your system
    • Move your mouse more than N pixels (configurable threshold)
  • Live screen recording with a circular memory buffer and auto-save on interrupt
  • Multi-monitor fast capture via mss
  • Massive config.json: 10 sections, 50+ knobs, hot-reloadable
  • Human-like input — configurable jitter, easing, timing variance
  • Safety guards — rate limits, blocked regions, dangerous-key confirmation, session duration watchdog
  • DPI-aware on Windows out of the box
  • stdio transport — zero port conflicts, spawns directly from Windsurf

Installation

Option A: pip (simplest)

git clone https://github.com/ZannyTornadoCoding/Z-ComputerUse-MCP-Server
cd Z-ComputerUse-MCP-Server
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -e .
copy config.example.json config.json

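The commands above are written for Windows PowerShell. On macOS or Linux, activate the environment with source .venv/bin/activate and use cp instead of copy.

With optional extras (OCR, template matching, and interrupt sound):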

pip install -e ".[full]"

Option B: uv (fast, recommended)

git clone https://github.com/ZannyTornadoCoding/Z-ComputerUse-MCP-Server
cd Z-ComputerUse-MCP-Server
uv venv
.\.venv\Scripts\Activate.ps1
uv pip install -e ".[full]"
copy config.example.json config.json

Option C: one-shot installer

.\scripts\install.ps1

Windsurf Integration

Add this entry to ~/.codeium/windsurf/mcp_config.json (or ~/.codeium/windsurf-next/mcp_config.json for Windsurf Next):

{
    "mcpServers": {
        "z-computeruse": {
            "command": "python",
            "args": ["-m", "z_computeruse"],
            "env": {
                "Z_COMPUTERUSE_CONFIG": "<ABSOLUTE_PATH_TO_REPO>/config.json"
            }
        }
    }
}

Replace <ABSOLUTE_PATH_TO_REPO> with the full path to your cloned repository (e.g. C:/Users/you/code/Z-ComputerUse-MCP-Server on Windows, /home/you/code/Z-ComputerUse-MCP-Server on Linux).
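
If you would rather not edit the file by hand, the entry can be merged in with a short script. This is a minimal sketch, not part of the repository: the helper name add_server_entry is hypothetical, and it assumes the config file is plain JSON with a top-level mcpServers object as shown above.

```python
import json
from pathlib import Path


def add_server_entry(mcp_config_path: Path, repo_dir: Path) -> dict:
    """Merge the z-computeruse entry into an MCP config file and return the result."""
    # Start from the existing file so other configured servers are preserved.
    if mcp_config_path.exists():
        config = json.loads(mcp_config_path.read_text(encoding="utf-8"))
    else:
        config = {}

    servers = config.setdefault("mcpServers", {})
    servers["z-computeruse"] = {
        "command": "python",
        "args": ["-m", "z_computeruse"],
        "env": {
            # as_posix() writes forward slashes, which need no escaping in JSON.
            "Z_COMPUTERUSE_CONFIG": (repo_dir.resolve() / "config.json").as_posix()
        },
    }

    mcp_config_path.parent.mkdir(parents=True, exist_ok=True)
    mcp_config_path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

Typical use from the repository root would be add_server_entry(Path.home() / ".codeium/windsurf/mcp_config.json", Path(".")).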

Then press the refresh button in Windsurf's MCP panel. Cascade should pick up 38 tools under the z-computeruse server.


Tools Exposed

Screen & Vision

| Tool | Purpose |
| --- | --- |
| screenshot | Full-screen capture with auto-scaling |
| screenshot_region | Capture a rectangle |
| zoom_region | Full-resolution crop for fine inspection |
| get_screen_info | Monitor list, resolutions, active primary |
| find_text_on_screen | OCR-based text locator (requires vision extras) |
| find_image_on_screen | Template-matching locator (requires vision extras) |

Recording

| Tool | Purpose |
| --- | --- |
| start_recording | Begin live frame capture to ring buffer |
| stop_recording | Stop + optionally save to MP4 |
| get_recent_frames | Retrieve N most recent frames |
| get_recording_status | Running state, frame count, duration |
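
The ring buffer behind start_recording and get_recent_frames can be pictured as a bounded deque. This is a sketch of the general technique only: the class name and the 30-second default are assumptions, and the server's actual buffer may differ.

```python
from collections import deque


class FrameRingBuffer:
    """Keep only the most recent fps * buffer_seconds frames in memory."""

    def __init__(self, fps: int = 5, buffer_seconds: int = 30) -> None:
        # buffer_seconds=30 is an assumed default, not taken from config.json.
        self._frames = deque(maxlen=fps * buffer_seconds)

    def push(self, frame: bytes) -> None:
        # Once the deque is full, appending silently drops the oldest frame.
        self._frames.append(frame)

    def recent(self, n: int) -> list:
        """Return up to n of the most recent frames, oldest first."""
        if n <= 0:
            return []
        return list(self._frames)[-n:]

    def __len__(self) -> int:
        return len(self._frames)
```

Memory use is bounded by fps multiplied by the buffer length, which is why a modest recording.fps keeps the footprint small.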

Mouse

| Tool | Purpose |
| --- | --- |
| mouse_move | Move cursor to (x, y) |
| mouse_position | Get current cursor position |
| left_click / right_click / middle_click / double_click / triple_click | Click variants |
| left_click_drag | Click-drag between two points |
| scroll | Wheel scroll with direction + clicks |
| mouse_down / mouse_up | Fine-grained click control |

Keyboard

| Tool | Purpose |
| --- | --- |
| type_text | Type a string with human-like variance |
| key | Press key or combo (ctrl+s, alt+tab) |
| hold_key | Hold a key for N seconds |
| key_down / key_up | Fine-grained key control |

Control & Session

| Tool | Purpose |
| --- | --- |
| wait | Pause for N seconds (interruptible) |
| get_status | Interrupt state, session info, safety stats |
| reset_interrupt | Clear the interrupt flag so the agent can resume |
| pause_session / resume_session | Temporarily halt all actions |
| end_session | Finalize session, optionally save recording |

Config & Runtime Toggles

| Tool | Purpose |
| --- | --- |
| get_config | Fetch current config (whole or by section) |
| reload_config | Hot-reload config.json without restart; propagates interrupt changes live |
| save_current_config | Persist in-memory config back to disk |
| set_mouse_interrupt | Toggle mouse-movement kill switch on/off at runtime |
| set_escape_interrupt | Toggle escape-key kill switch on/off at runtime |
| set_interrupts_enabled | Master switch for the entire interrupt manager |

The Interrupt System (Total Control)

Z-ComputerUse runs two independent global listeners from the moment it starts:

  1. Keyboard listener — watches for a single press of Esc (configurable) anywhere on the system.
  2. Mouse listener — watches for any movement more than mouse_movement_threshold_px pixels away from where the bot expected the cursor to be.
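
The mouse check can be thought of as a distance test with a grace window. The sketch below assumes Euclidean distance and a 150 ms grace period; only the 35 px default comes from this README, and is_user_movement is a hypothetical name.

```python
import math


def is_user_movement(expected, actual, ms_since_bot_move,
                     threshold_px=35, grace_ms=150):
    """Return True when the cursor drifted far enough to count as a human override."""
    # Inside the grace window, movement is attributed to the bot's own action.
    if ms_since_bot_move < grace_ms:
        return False
    # Assumption: distance is measured as a straight line from the expected point.
    distance = math.hypot(actual[0] - expected[0], actual[1] - expected[1])
    return distance > threshold_px
```

With these defaults, a 50 px drift well after the last programmatic move counts as an interrupt, while a 20 px drift does not.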

When either fires:

  1. A shared threading.Event gets set.
  2. The current tool call aborts: it raises InterruptedError, which is caught at the tool boundary.
  3. All subsequent tool calls return {"interrupted": true, "reason": "escape_key" | "mouse_movement"} until you explicitly call reset_interrupt.
  4. If save_on_interrupt is on, the last buffer_seconds of screen recording is flushed to disk so you can review what it was doing.
  5. If sound_on_interrupt is on, a sound plays.
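
Steps 1 to 3 above follow a common pattern: a shared flag, a check inside long-running work, and a wrapper at the tool boundary. The following is a minimal sketch of that pattern, not the server's actual code. Every name except threading.Event, InterruptedError, and reset_interrupt is hypothetical.

```python
import threading

interrupt_event = threading.Event()
interrupt_reason = {"value": None}  # filled in by whichever listener fires first
_lock = threading.Lock()


def trigger_interrupt(reason):
    """Called from a listener thread; the first reason wins."""
    with _lock:
        if not interrupt_event.is_set():
            interrupt_reason["value"] = reason
            interrupt_event.set()


def check_interrupt():
    """Long-running tools call this between steps so they can abort promptly."""
    if interrupt_event.is_set():
        raise InterruptedError(interrupt_reason["value"])


def run_tool(tool, *args, **kwargs):
    """Tool boundary: refuse new work while interrupted, turn aborts into a result."""
    if interrupt_event.is_set():
        return {"interrupted": True, "reason": interrupt_reason["value"]}
    try:
        return tool(*args, **kwargs)
    except InterruptedError:
        return {"interrupted": True, "reason": interrupt_reason["value"]}


def reset_interrupt():
    """Clear the flag so later tool calls run again."""
    with _lock:
        interrupt_reason["value"] = None
        interrupt_event.clear()
    return {"interrupted": False}
```

Because the flag stays set until reset_interrupt runs, every call through run_tool is refused in the meantime, which matches the behaviour described in step 3.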

To resume: tell Cascade something like "I interrupted because X. The situation is now Y. Call reset_interrupt and continue with Z." Cascade then calls the tool, the flag clears, and it proceeds.

The bot can't self-reset without Cascade calling the tool, and Cascade can't call the tool unless you tell it to — so you always hold the kill switch.


Config Highlights

All options live in config.json. Sections: server, display, recording, mouse, keyboard, interrupt, safety, vision, session, advanced.

Key knobs to tune:

| Key | Effect |
| --- | --- |
| interrupt.mouse_movement_threshold_px | Lower = more sensitive (default 35 px) |
| interrupt.mouse_movement_grace_ms | Time window where programmatic moves don't trigger interrupt |
| mouse.move_duration_seconds | Slower = more human-looking |
| mouse.human_like_movement | Adds easing + jitter |
| display.screenshot_scale_max_edge_px | 1568 is LLM-safe; higher = sharper but more tokens |
| recording.fps | 5 is a good balance of quality vs. memory |
| safety.max_actions_per_minute | Rate limiter |
| safety.max_session_duration_minutes | Hard session cap |
| safety.blocked_regions | List of [x, y, w, h] rectangles the bot can't click |
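
Two of the safety knobs are easy to picture in code. This sketch assumes blocked regions are half-open rectangles and that the rate limit is a sliding 60-second window; the README does not say how the server implements either, and both names are illustrative.

```python
import time
from collections import deque
from typing import Optional


def in_blocked_region(x, y, blocked_regions):
    """True if (x, y) falls inside any [x, y, w, h] rectangle."""
    return any(
        rx <= x < rx + rw and ry <= y < ry + rh
        for rx, ry, rw, rh in blocked_regions
    )


class ActionRateLimiter:
    """Sliding-window limiter: at most max_actions in any 60-second span."""

    def __init__(self, max_actions_per_minute: int) -> None:
        self.max_actions = max_actions_per_minute
        self._stamps = deque()  # timestamps of recently allowed actions

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Forget actions that are 60 seconds old or older.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_actions:
            return False
        self._stamps.append(now)
        return True
```

A click handler would check in_blocked_region first and then limiter.allow() before touching the mouse.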

Environment variable Z_COMPUTERUSE_CONFIG overrides the default config path.
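
The override rule amounts to "use the environment variable if it is set, otherwise fall back to a default". A hedged sketch follows, in which DEFAULT_CONFIG_PATH, resolve_config_path, and load_config are illustrative names and the real default location is an assumption.

```python
import json
import os
from pathlib import Path
from typing import Optional

# Assumed default; the README does not document where the server looks by default.
DEFAULT_CONFIG_PATH = Path("config.json")


def resolve_config_path() -> Path:
    """Prefer Z_COMPUTERUSE_CONFIG when it is set and non-empty."""
    override = os.environ.get("Z_COMPUTERUSE_CONFIG")
    return Path(override) if override else DEFAULT_CONFIG_PATH


def load_config(section: Optional[str] = None) -> dict:
    """Read the whole config, or one top-level section such as "interrupt"."""
    config = json.loads(resolve_config_path().read_text(encoding="utf-8"))
    return config if section is None else config[section]
```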


Running Standalone

For testing outside Windsurf:

python -m z_computeruse

It will listen on stdio. Pair with MCP Inspector:

npx -y @modelcontextprotocol/inspector python -m z_computeruse

Requirements

  • Python 3.11+
  • Windows 10/11, macOS 12+, or Linux with X11
  • mss, pyautogui, pynput (installed automatically)
  • Optional: Tesseract OCR binary (for find_text_on_screen); ffmpeg is bundled automatically via imageio-ffmpeg

Safety & Disclaimer

This tool gives an AI direct, unchecked control of your mouse, keyboard, and screen. It is designed for power users who understand the risks. Always:

  • Keep one hand near the mouse / Escape key
  • Start with safety.max_actions_per_minute low until you trust the agent
  • Review recordings after every session
  • Never run on a machine with sensitive credentials exposed to the agent

License

MIT — see LICENSE.
