Da-Coder-Jr/Computer-Use-MCP

Computer-Use-MCP

Computer-Use-MCP is a working MCP server that communicates over stdio, built for computer-control experiments. It gives an MCP client a QEMU-isolated desktop target to control instead of the user's real computer.

The project is intentionally small:

  • MCP over stdio.
  • QEMU for the sandbox computer.
  • No Electron app.
  • No browser SaaS.
  • No model API keys.
  • No host mouse, keyboard, screenshot, browser, or shell control.

Why This Exists

Computer-use agents are powerful and risky. The safest default is to make the "computer" a disposable target that can be started, observed, clicked, typed into, and stopped without touching the host desktop.

This server exposes that target through MCP tools so clients can build approval flows, run replays, and test computer-control policies around a concrete local sandbox.

Requirements

  • Python 3.10+
  • QEMU, available as qemu-system-x86_64
  • An MCP client that supports stdio servers

Optional environment variables:

export COMPUTER_USE_MCP_QEMU=/absolute/path/to/qemu-system-x86_64
export COMPUTER_USE_MCP_STATE_DIR=/tmp/computer-use-mcp
export COMPUTER_USE_MCP_QMP_PORT=39731
export COMPUTER_USE_MCP_HMP_PORT=39732
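
As an illustration of how these variables might be consumed, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, and the fallback values simply mirror the example exports above; the server's real defaults are not stated in this README.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    qemu_path: str
    state_dir: Path
    qmp_port: int
    hmp_port: int


def load_settings(env=None) -> Settings:
    """Read the optional COMPUTER_USE_MCP_* variables from a mapping."""
    env = os.environ if env is None else env

    def port(name: str, fallback: int) -> int:
        # Fallbacks are assumptions copied from the example exports.
        value = int(env.get(name, fallback))
        if not 1 <= value <= 65535:
            raise ValueError(f"{name} must be a TCP port, got {value}")
        return value

    return Settings(
        qemu_path=env.get("COMPUTER_USE_MCP_QEMU", "qemu-system-x86_64"),
        state_dir=Path(env.get("COMPUTER_USE_MCP_STATE_DIR", "/tmp/computer-use-mcp")),
        qmp_port=port("COMPUTER_USE_MCP_QMP_PORT", 39731),
        hmp_port=port("COMPUTER_USE_MCP_HMP_PORT", 39732),
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in unit tests.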

Install

python3 -m venv .venv
. .venv/bin/activate
python -m pip install -e .

For development without installing:

PYTHONPATH=src python3 -m computer_use_mcp.mcp_stdio

MCP Server

Run the stdio server:

./scripts/run-server.sh

Example MCP client config:

{
  "mcpServers": {
    "Computer-Use-MCP": {
      "command": "/absolute/path/to/Computer-Use-MCP/scripts/run-server.sh"
    }
  }
}

This repo also includes .mcp.json for local clients that can discover project MCP config.
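
For readers writing their own client, the sketch below shows roughly what travels over stdio once the server is launched. MCP's stdio transport carries JSON-RPC 2.0 messages, one per line. The helper names, the client name, and the protocol revision string are assumptions for illustration; the revision this server actually negotiates is not stated here.

```python
import json
from itertools import count

_ids = count(1)


def request(method, params=None):
    """Build a JSON-RPC 2.0 request (expects a response)."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method, params=None):
    """Build a JSON-RPC 2.0 notification (no id, no response)."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def encode(msg) -> bytes:
    # json.dumps escapes newlines inside strings, so each message
    # occupies exactly one line on the server's stdin.
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")


def handshake():
    """Messages a client would typically send before calling tools."""
    return [
        request("initialize", {
            "protocolVersion": "2024-11-05",  # assumed revision
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.0"},
        }),
        notification("notifications/initialized"),
        request("tools/list"),
    ]


def call_tool(name, arguments=None):
    return request("tools/call", {"name": name, "arguments": arguments or {}})
```

A client would write `encode(msg)` for each message to the server process's stdin and read one JSON line per response from its stdout, for example `encode(call_tool("vm_status"))`.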

Tools

The server exposes:

  • vm_status: report QEMU path, running state, local monitor endpoints, VNC URL, display size, and network mode.
  • start_vm: start the isolated QEMU computer target.
  • stop_vm: stop the target, optionally removing runtime state.
  • screenshot: return a PNG screenshot as MCP image content plus structured metadata.
  • move_mouse: move the pointer inside the VM.
  • click: click VM coordinates.
  • drag: drag between VM coordinates.
  • scroll: send wheel events.
  • type_text: type bounded text through QEMU key events.
  • press_key: press a key or key combination.
  • launch_app: reserved for future guest-agent images; currently returns an explicit unsupported error.

Tool definitions include JSON Schemas, structured output schemas, and MCP annotations such as readOnlyHint, destructiveHint, idempotentHint, and openWorldHint.
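
To make the shape of such a definition concrete, here is a hypothetical entry for `click` as it might appear in a `tools/list` result. The field names (`inputSchema`, `outputSchema`, `annotations`) follow the MCP tool format, but the parameter names, output fields, and hint values are assumptions, not copied from this server.

```python
# Hypothetical tool definition; parameters and hint values are illustrative.
CLICK_TOOL = {
    "name": "click",
    "description": "Click VM coordinates.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "x": {"type": "integer", "minimum": 0},
            "y": {"type": "integer", "minimum": 0},
            "button": {"type": "string", "enum": ["left", "middle", "right"]},
        },
        "required": ["x", "y"],
        "additionalProperties": False,
    },
    "outputSchema": {
        "type": "object",
        "properties": {"ok": {"type": "boolean"}},
        "required": ["ok"],
    },
    "annotations": {
        "readOnlyHint": False,    # changes state inside the VM
        "destructiveHint": True,  # a click may trigger irreversible UI actions
        "idempotentHint": False,  # repeating a click can have further effects
        "openWorldHint": False,   # the target is the local sandbox only
    },
}


def needs_confirmation(tool: dict) -> bool:
    """Treat every tool that is not explicitly read-only as needing approval."""
    return not tool.get("annotations", {}).get("readOnlyHint", False)
```

Annotations are hints, so a client should treat them as input to its own policy rather than as a guarantee.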

QEMU Target

Basic VM workflow:

./scripts/vm status
./scripts/vm start
./scripts/vm screenshot --only-meta
./scripts/vm stop

The controller starts QEMU with:

  • -display none
  • VNC bound to 127.0.0.1
  • QMP bound to 127.0.0.1
  • HMP bound to 127.0.0.1
  • -nic none by default
  • USB tablet pointer input
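
The options listed above could translate into a command line along these lines. This is a sketch using standard QEMU flags, not the controller's actual argv; `build_qemu_argv`, the VNC display number, and the `network` switch are hypothetical.

```python
def build_qemu_argv(
    qemu="qemu-system-x86_64",
    qmp_port=39731,
    hmp_port=39732,
    vnc_display=0,
    network=False,
):
    """Assemble a QEMU command line with every endpoint bound to loopback."""
    argv = [
        qemu,
        "-display", "none",                       # no host window
        "-vnc", f"127.0.0.1:{vnc_display}",       # VNC on loopback only
        "-qmp", f"tcp:127.0.0.1:{qmp_port},server=on,wait=off",
        "-monitor", f"tcp:127.0.0.1:{hmp_port},server=on,wait=off",  # HMP
        "-usb", "-device", "usb-tablet",          # absolute pointer input
    ]
    # Networking stays off unless the caller opts in explicitly.
    argv += ["-nic", "user"] if network else ["-nic", "none"]
    return argv
```

Building the argv as a list, rather than a shell string, lets it be passed directly to `subprocess.Popen` without shell quoting concerns.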

The default VM is a bare display target. It is useful for protocol and control-path validation, but launching real guest apps needs a configured guest image or guest agent.
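
As an example of the control path being validated, a click can be expressed as QMP `input-send-event` commands. Whether this server uses QMP or HMP for pointer input is not stated, so treat the following as one possible approach. It assumes the tablet's absolute axes span 0 to 0x7FFF; the function names are hypothetical.

```python
ABS_MAX = 0x7FFF  # assumed range of QEMU's absolute pointer axes


def to_abs(pixel: int, size: int) -> int:
    """Scale a pixel coordinate to the absolute axis range."""
    if size < 2:
        raise ValueError("display dimension must be at least 2 pixels")
    if not 0 <= pixel < size:
        raise ValueError(f"coordinate {pixel} outside 0..{size - 1}")
    return pixel * ABS_MAX // (size - 1)


def click_commands(x, y, width, height, button="left"):
    """Return QMP commands: move the pointer, press, then release."""
    move = {
        "execute": "input-send-event",
        "arguments": {"events": [
            {"type": "abs", "data": {"axis": "x", "value": to_abs(x, width)}},
            {"type": "abs", "data": {"axis": "y", "value": to_abs(y, height)}},
        ]},
    }

    def btn(down: bool):
        return {
            "execute": "input-send-event",
            "arguments": {"events": [
                {"type": "btn", "data": {"down": down, "button": button}},
            ]},
        }

    return [move, btn(True), btn(False)]
```

Each command would be sent as one JSON line over the QMP socket after the `qmp_capabilities` handshake. Rejecting out-of-range coordinates up front keeps bad input from reaching the VM.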

Safety Model

The server does not use:

  • macOS Accessibility APIs
  • host cursor movement
  • host keyboard events
  • host screenshots
  • host browser automation
  • host shell execution
  • Docker
  • cloud credentials

The MCP client should still require user approval before risky actions inside the VM, especially typing, clicks on unknown UI, stopping with state removal, or any action involving accounts, credentials, payments, messages, uploads, downloads, personal data, or external network access.
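
A client-side gate for this could look like the sketch below. The tool names come from the Tools list, but the argument names `remove_state` and `network` are hypothetical, and the policy itself is only one conservative reading of the guidance above.

```python
# Tools that change what is on screen or send input into the VM.
INPUT_TOOLS = {"type_text", "click", "drag", "press_key"}


def requires_approval(tool: str, arguments: dict) -> bool:
    """Decide whether the client should ask the user before calling a tool."""
    if tool in INPUT_TOOLS:
        return True
    # Stopping the VM is routine; deleting its state is not.
    if tool == "stop_vm" and arguments.get("remove_state", False):
        return True
    # Any request for networking other than "none" leaves the sandbox.
    if tool == "start_vm" and arguments.get("network", "none") != "none":
        return True
    return False
```

For instance, `requires_approval("screenshot", {})` is `False`, while `requires_approval("stop_vm", {"remove_state": True})` is `True`. A real client would likely also inspect the text being typed and the current screen contents before deciding.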

See docs/security-model.md for the full model.

Testing

Run the unit tests:

python3 -m unittest discover -s tests -v

Run the MCP protocol test:

./scripts/protocol-test.py

If local VM execution is allowed in your environment, test the real QEMU path:

./scripts/vm start
./scripts/vm screenshot --only-meta
./scripts/vm stop

Some sandboxes block localhost socket binding or daemonized QEMU processes. In that case the unit and protocol tests still validate the MCP contract, but a real VM test needs a less restricted local shell or CI runner with QEMU support.
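
A quick way to tell whether the environment allows the loopback sockets QEMU needs is to try binding one. This is a minimal probe, not part of the repo's test suite; `can_bind_localhost` is a hypothetical helper, and it does not check whether QEMU itself can run.

```python
import socket


def can_bind_localhost() -> bool:
    """Return True if a TCP socket can bind and listen on 127.0.0.1."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
            sock.listen(1)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print("loopback bind allowed:", can_bind_localhost())
```

If this reports `False`, the live VM commands are likely to fail for the same reason, and only the unit and protocol tests will be meaningful.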

GitHub Ready Checklist

Before pushing:

git status --short
find . -name ".env" -o -name "*.pem" -o -name "*.key" -o -name "*.crt" -o -name "id_rsa*"
python3 -m unittest discover -s tests -v
./scripts/protocol-test.py

This repo intentionally ignores .env, local state, screenshots, workspace data, caches, build outputs, and secret-like file patterns.
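
The `find` check in the checklist can also be written in Python, which makes it easier to skip directories such as `.git` and `.venv`. The sketch below uses the same name patterns; the skip list and the function name are assumptions, and unlike the `find` command it reports regular files only.

```python
from fnmatch import fnmatch
from pathlib import Path

PATTERNS = (".env", "*.pem", "*.key", "*.crt", "id_rsa*")
SKIP_DIRS = {".git", ".venv"}  # assumed; adjust to the local layout


def find_secret_like(root="."):
    """List files under root whose names match a secret-like pattern."""
    hits = []
    for path in Path(root).rglob("*"):
        # Ignore anything inside a skipped directory.
        if SKIP_DIRS.intersection(path.relative_to(root).parts):
            continue
        if path.is_file() and any(fnmatch(path.name, p) for p in PATTERNS):
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for hit in find_secret_like():
        print(hit)
```

An empty result means no matching files were found outside the skipped directories; it does not prove that no secrets are present in other files.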

Known Limitations

  • There is no prebuilt OS image in the repo.
  • launch_app is not implemented yet; it returns an explicit unsupported error until a guest-agent image is available.
  • Network is disabled by default with -nic none.
  • Live VM testing requires QEMU privileges that may not be available inside restricted coding sandboxes.
  • Cloud VM testing is not included because it would require user-provided cloud credentials and cost controls.
