Ghost MCP Server

The Ghost in the Machine - An MCP (Model Context Protocol) server that exposes OS-level UI automation capabilities to AI clients.

License: Free for personal and hobby use (PolyForm Noncommercial). Commercial use requires a license — contact Peter Isberg.

Overview

Ghost MCP allows AI assistants like Claude to control your computer's mouse, keyboard, and screen. This enables automation of legacy applications, GUI testing, and any task that requires interacting with applications that don't have APIs.

Features

🖱️ Mouse Control: Move cursor, click, double-click, scroll
⌨️ Keyboard Control: Type text, press individual keys
📸 Screen Capture: Take screenshots with optional region selection
🔍 OCR: Read text and word positions from the screen with Tesseract (always built in)
🔐 Token Authentication: Requires a secret token before the server will start
📋 Audit Logging: Tamper-evident JSON Lines log of every tool call, auth failure, and lifecycle event
🛡️ Failsafe: Emergency shutdown by moving mouse to top-left corner (0,0)
📝 Proper Logging: All logs go to stderr, keeping stdout clean for MCP protocol
🌐 Dual Transport: stdio (default) or HTTP/SSE for web-based clients

Available Tools

Core Tools

Tool	Description	Parameters
`get_screen_size`	Get screen resolution and DPI scale factor.	None
`move_mouse`	Move mouse cursor to absolute coordinates. Origin (0,0) is top-left.	`x`, `y`
`click`	Click mouse at current cursor position. Use `move_mouse` first.	`button` (left/right/middle)
`click_at`	Move mouse to coordinates and click in one operation. Preferred over separate move+click.	`x`, `y`, `button` (optional, default left)
`double_click`	Move mouse to coordinates and double-click. Use for opening files or activating items.	`x`, `y`
`scroll`	Move mouse to coordinates and scroll the wheel.	`x`, `y`, `direction` (up/down/left/right), `amount` (optional, default 3)
`type_text`	Type text via keyboard. Use for input fields.	`text`
`press_key`	Press a single key. Use for Enter, Tab, shortcuts, etc.	`key`
`take_screenshot`	Capture screen as base64 PNG. Optional region parameters.	`x`, `y`, `width`, `height`, `quality`

OCR Tools ⭐ (Require Tesseract)

Tool	Description	Parameters
`read_screen_text`	Read text from screen using OCR. Returns text and word positions.	`x`, `y`, `width`, `height`, `grayscale`
`find_and_click`	Find text on screen with OCR and click its center.	`text`, `button`, `nth`, `x`, `y`, `width`, `height`
`find_and_click_all` ⭐	Click multiple buttons in ONE atomic operation.	`texts` (array), `button`, `delay_ms`
`wait_for_text` ⭐	Wait for text to appear or disappear. Verify UI changes.	`text`, `visible`, `timeout_ms`, `x`, `y`, `width`, `height`
`find_elements` ⭐	Discover all clickable text elements with coordinates. Fast alternative to screenshots.	`x`, `y`, `width`, `height`

Note: OCR tools require Tesseract to be installed and TESSDATA_PREFIX to be set. The installation scripts handle this automatically. See Prerequisites.
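To make "click its center" concrete, here is a minimal Go sketch of how a word's bounding box could be turned into a click point. It assumes word boxes shaped like the `words` entries in the `read_screen_text` example response, with `x`/`y` as the top-left corner; the `Word` type and the `center` and `findNth` helpers are hypothetical names for illustration, not Ghost MCP's actual code.

```go
package main

import "fmt"

// Word mirrors one entry of the "words" array in the read_screen_text
// example response. Treating X and Y as the top-left corner of the
// bounding box is an assumption.
type Word struct {
	Text       string
	X, Y       int
	Width      int
	Height     int
	Confidence float64
}

// center returns the midpoint of the word's bounding box.
func center(w Word) (int, int) {
	return w.X + w.Width/2, w.Y + w.Height/2
}

// findNth returns the nth (1-based) word whose text equals target,
// similar in spirit to the nth parameter of find_and_click.
func findNth(words []Word, target string, nth int) (Word, bool) {
	seen := 0
	for _, w := range words {
		if w.Text == target {
			seen++
			if seen == nth {
				return w, true
			}
		}
	}
	return Word{}, false
}

func main() {
	words := []Word{{Text: "Save", X: 940, Y: 530, Width: 40, Height: 20, Confidence: 98.5}}
	if w, ok := findNth(words, "Save", 1); ok {
		x, y := center(w)
		fmt.Println(x, y) // 960 540
	}
}
```

With the box from the example response (x 940, y 530, 40 by 20), the midpoint is (960, 540), which matches the coordinates in the `find_and_click` example response.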

⭐ New Tools for AI Accuracy

find_and_click_all - Click 3 buttons with ONE call instead of 3+ verification loops (75% faster)
wait_for_text - Properly verify UI state changes instead of guessing with screenshots
find_elements - Discover all clickable elements 10x faster than taking screenshots
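The idea behind wait_for_text can be sketched as a polling loop with a deadline. This is only an illustration under stated assumptions: `waitForText` and its `check` callback (a stand-in for one OCR pass that reports whether the text is on screen) are hypothetical, and the real tool's polling interval and internals are not described here.

```go
package main

import (
	"fmt"
	"time"
)

// waitForText polls check until it reports the wanted visibility or the
// timeout elapses. check is a hypothetical stand-in for a single OCR scan.
func waitForText(check func() bool, wantVisible bool, timeoutMs, pollMs int) bool {
	deadline := time.Now().Add(time.Duration(timeoutMs) * time.Millisecond)
	for {
		if check() == wantVisible {
			return true
		}
		if !time.Now().Before(deadline) {
			return false // timed out without reaching the wanted state
		}
		time.Sleep(time.Duration(pollMs) * time.Millisecond)
	}
}

func main() {
	scans := 0
	// Simulates text that shows up on the third scan.
	appearsOnThirdScan := func() bool {
		scans++
		return scans >= 3
	}
	fmt.Println(waitForText(appearsOnThirdScan, true, 1000, 10)) // true
}
```

Passing `wantVisible` as false covers the "wait for text to disappear" case with the same loop.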

Prerequisites

The installation scripts (install.ps1 for Windows, install.sh for Linux) automatically install all required dependencies.

If building manually, you'll need:

Go 1.24+ - Download Go
C Compiler - For RobotGo (GUI automation library)
Tesseract OCR + Leptonica - Required for OCR tools (read_screen_text, find_and_click); always built in

Platform-Specific Dependencies

Windows

Required: MinGW-w64 GCC Compiler

Install via Chocolatey (recommended):

# Run as Administrator
choco install mingw -y

Tesseract OCR is required (always built in). Install via vcpkg (the install script does this automatically):

# Install vcpkg and Tesseract (MinGW-compatible triplet)
git clone https://github.com/Microsoft/vcpkg.git $env:USERPROFILE\vcpkg
cd $env:USERPROFILE\vcpkg
.\bootstrap-vcpkg.bat -disableMetrics
$env:CC  = "C:\ProgramData\mingw64\mingw64\bin\gcc.exe"
$env:CXX = "C:\ProgramData\mingw64\mingw64\bin\g++.exe"
.\vcpkg install tesseract:x64-mingw-dynamic leptonica:x64-mingw-dynamic

# Download English language data for Tesseract
$tessdata = "$env:USERPROFILE\vcpkg\installed\x64-mingw-dynamic\share\tessdata"
Invoke-WebRequest "https://github.com/tesseract-ocr/tessdata_fast/raw/main/eng.traineddata" -OutFile "$tessdata\eng.traineddata"

# Set required environment variables (persist across reboots)
[System.Environment]::SetEnvironmentVariable("TESSDATA_PREFIX", "$env:USERPROFILE\vcpkg\installed\x64-mingw-dynamic\share\tessdata", "User")
[System.Environment]::SetEnvironmentVariable("CGO_CPPFLAGS", "-I$env:USERPROFILE/vcpkg/installed/x64-mingw-dynamic/include", "User")
[System.Environment]::SetEnvironmentVariable("CGO_LDFLAGS",  "-L$env:USERPROFILE/vcpkg/installed/x64-mingw-dynamic/lib", "User")

macOS

# Install Xcode Command Line Tools
xcode-select --install

# Tesseract OCR (required)
brew install tesseract leptonica

# Grant accessibility permissions (required for automation)
# System Settings → Privacy & Security → Accessibility → Add Terminal/your IDE

Linux

Ubuntu/Debian:

# X11 automation libraries
sudo apt-get install libx11-dev xorg-dev libxtst-dev libpng-dev
# Tesseract OCR (required)
sudo apt-get install tesseract-ocr libleptonica-dev libtesseract-dev

Fedora:

sudo dnf install libX11-devel libXtst-devel libpng-devel
sudo dnf install tesseract tesseract-devel leptonica-devel

Arch Linux:

sudo pacman -S libx11 libxtst libpng
sudo pacman -S tesseract tesseract-data-eng

Installation

Quick Install (Recommended)

We provide automated installation scripts that handle all dependencies and configuration:

Windows

# Run PowerShell as Administrator
.\install.ps1

The installer will:

Install MinGW (GCC compiler) via Chocolatey
Install Tesseract OCR via vcpkg (x64-mingw-dynamic triplet)
Download English language data (eng.traineddata) and set TESSDATA_PREFIX
Build the binary and copy runtime DLLs
Generate auth token
Configure environment variables
Display MCP client configuration

Linux

chmod +x install.sh
./install.sh

The installer will:

Detect your package manager (apt/dnf/yum/pacman/zypper)
Install GCC, X11 development libraries, and Tesseract OCR
Build the binary
Generate auth token
Save environment configuration to ~/.config/ghost-mcp/env
Display MCP client configuration

OCR Support

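OCR support (read_screen_text and the other OCR tools) is always compiled into the binary; Tesseract itself must still be installed on the machine. The installer sets up Tesseract automatically, including downloading the English language data file and setting TESSDATA_PREFIX.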

Manual Build

If you prefer to build manually:

# Clone or navigate to the project directory
cd ghost-mcp

# Download dependencies
go mod download

# Build (Windows — requires MinGW and Tesseract via vcpkg, see Prerequisites)
go build -o ghost-mcp.exe ./cmd/ghost-mcp/

# Build (macOS/Linux)
go build -o ghost-mcp ./cmd/ghost-mcp/

On Windows, set CGO_CPPFLAGS and CGO_LDFLAGS before building (see the Windows prerequisites section above). Also copy the runtime DLLs next to the binary:

Copy-Item "$env:USERPROFILE\vcpkg\installed\x64-mingw-dynamic\bin\*.dll" . -Force

Configuration

MCP Client Configuration (mcp.json)

Add this to your MCP client configuration to connect to Ghost MCP:

For Claude Desktop (Windows)

{
  "mcpServers": {
    "ghost-mcp": {
      "command": "C:\\path\\to\\ghost-mcp.exe",
      "args": [],
      "env": {
        "GHOST_MCP_TOKEN": "your-secret-token-here",
        "TESSDATA_PREFIX": "C:\\Users\\<you>\\vcpkg\\installed\\x64-mingw-dynamic\\share\\tessdata"
      }
    }
  }
}

For Claude Desktop (macOS/Linux)

{
  "mcpServers": {
    "ghost-mcp": {
      "command": "/absolute/path/to/ghost-mcp",
      "args": [],
      "env": {
        "GHOST_MCP_TOKEN": "your-secret-token-here",
        "TESSDATA_PREFIX": "/usr/share/tesseract-ocr/5/tessdata"
      }
    }
  }
}

Replace your-secret-token-here with the same value you generated for GHOST_MCP_TOKEN.

Configuration File Locations

Platform	Config Location
Windows	`%APPDATA%\Claude\mcp.json`
macOS	`~/Library/Application Support/Claude/mcp.json`
Linux	`~/.config/Claude/mcp.json`

Environment Variables

Variable	Description	Default
`GHOST_MCP_TOKEN`	Required. Secret authentication token. Server refuses to start without it.	(none — must be set)
`TESSDATA_PREFIX`	Required for OCR. Directory that directly contains `eng.traineddata` (NOT its parent).	(none — OCR will fail if unset)
`GHOST_MCP_AUDIT_LOG`	Directory for audit log files. Created automatically if absent.	`<UserConfigDir>/ghost-mcp/audit/`
`GHOST_MCP_DEBUG`	Enable debug logging (`1` = on)	`0` (disabled)
`GHOST_MCP_VISUAL`	Show visual cursor pulse on mouse actions (`1` = on)	`0` (disabled)
`GHOST_MCP_KEEP_SCREENSHOTS`	Keep screenshot files on disk after use (`1` = keep, default deletes them)	`0` (deleted after use)
`GHOST_MCP_SCREENSHOT_DIR`	Directory where screenshot files are written	System temp directory
`GHOST_MCP_TRANSPORT`	Transport mode: `stdio` or `http`	`stdio`
`GHOST_MCP_HTTP_ADDR`	Listen address for HTTP/SSE mode	`localhost:8080`
`GHOST_MCP_HTTP_BASE_URL`	Public base URL advertised to SSE clients	`http://<addr>`
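
As an illustration of how these variables and defaults fit together, the following Go sketch loads a subset of them. The `Config` type and `loadConfig` function are hypothetical names, not Ghost MCP's actual code; only the variable names, the defaults, and the "refuse to start without a token" rule come from the table above.

```go
package main

import (
	"errors"
	"fmt"
)

// Config holds a subset of the settings from the table above.
type Config struct {
	Token     string
	Transport string
	HTTPAddr  string
	Debug     bool
}

// loadConfig reads settings through getenv (os.Getenv in a real program)
// and applies the documented defaults.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		Token:     getenv("GHOST_MCP_TOKEN"),
		Transport: getenv("GHOST_MCP_TRANSPORT"),
		HTTPAddr:  getenv("GHOST_MCP_HTTP_ADDR"),
		Debug:     getenv("GHOST_MCP_DEBUG") == "1",
	}
	if cfg.Token == "" {
		return Config{}, errors.New("GHOST_MCP_TOKEN must be set")
	}
	if cfg.Transport == "" {
		cfg.Transport = "stdio"
	}
	if cfg.Transport != "stdio" && cfg.Transport != "http" {
		return Config{}, fmt.Errorf("unsupported GHOST_MCP_TRANSPORT %q", cfg.Transport)
	}
	if cfg.HTTPAddr == "" {
		cfg.HTTPAddr = "localhost:8080"
	}
	return cfg, nil
}

func main() {
	// A fake environment keeps the example self-contained.
	env := map[string]string{"GHOST_MCP_TOKEN": "example-token"}
	cfg, err := loadConfig(func(key string) string { return env[key] })
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(cfg.Transport, cfg.HTTPAddr, cfg.Debug) // stdio localhost:8080 false
}
```

Injecting `getenv` instead of calling `os.Getenv` directly is a choice made for this sketch so the defaults can be exercised without touching the real environment.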

Security note: GHOST_MCP_TOKEN must be set to a random secret. Generate one with:

# Linux/macOS
export GHOST_MCP_TOKEN=$(openssl rand -hex 32)

# Windows (PowerShell); x2 pads each byte to two hex digits so the token is always 64 characters
$env:GHOST_MCP_TOKEN = -join ((1..32) | % { '{0:x2}' -f (Get-Random -Max 256) })

Then add it to the env block of your MCP client configuration (see the examples above).

Usage

Starting the Server

By default the server communicates over stdio and is typically started by your MCP client automatically. To test manually:

# Run the server (will wait for stdin/stdout communication)
./ghost-mcp

# With debug logging
GHOST_MCP_DEBUG=1 ./ghost-mcp

Example Tool Calls

Once connected, AI clients can use the tools like this:

// Find text on screen and click it — start here for any click task
{
  "tool": "find_and_click",
  "arguments": {"text": "Save"}
}
// Response: {"success": true, "found": "Save", "x": 960, "y": 540, "button": "left", "occurrence": 1}

// Move and click in one call
{
  "tool": "click_at",
  "arguments": {"x": 960, "y": 540, "button": "left"}
}
// Response: {"success": true, "button": "left", "x": 960, "y": 540}

// Double-click to open a file
{
  "tool": "double_click",
  "arguments": {"x": 960, "y": 540}
}
// Response: {"success": true, "x": 960, "y": 540}

// Scroll down in a list
{
  "tool": "scroll",
  "arguments": {"x": 960, "y": 540, "direction": "down", "amount": 5}
}
// Response: {"success": true, "x": 960, "y": 540, "direction": "down", "amount": 5}

// Type text into the focused field
{
  "tool": "type_text",
  "arguments": {"text": "Hello, World!"}
}
// Response: {"success": true, "characters_typed": 13}

// Press Enter key
{
  "tool": "press_key",
  "arguments": {"key": "enter"}
}
// Response: {"success": true, "key": "enter"}

// Take screenshot — returns JSON metadata + PNG image content
{
  "tool": "take_screenshot"
}
// Response: {"success": true, "filepath": "/tmp/ghost-mcp-screenshot-....png", "width": 1920, "height": 1080}
// (image data returned as a separate image/png content block)

// Read all text from screen with word positions
{
  "tool": "read_screen_text"
}
// Response: {"success": true, "text": "...", "words": [{"text": "Save", "x": 940, "y": 530, "width": 40, "height": 20, "confidence": 98.5}, ...]}
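
The examples above use a simplified shape. On the wire, an MCP tool call is a JSON-RPC 2.0 request with method `tools/call` and the tool name and arguments under `params`. The Go sketch below builds such a request; `buildToolCall` and the struct names are hypothetical helpers for illustration, and a real client would normally rely on an MCP SDK instead.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the params object of an MCP tools/call request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments,omitempty"`
}

// rpcRequest is a minimal JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// buildToolCall serializes a tools/call request for the given tool.
func buildToolCall(id int, tool string, args map[string]any) (string, error) {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	msg, err := buildToolCall(1, "find_and_click", map[string]any{"text": "Save"})
	if err != nil {
		panic(err)
	}
	fmt.Println(msg)
	// {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"find_and_click","arguments":{"text":"Save"}}}
}
```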

Security

🔐 Token Authentication

Ghost MCP controls your mouse, keyboard, and screen — so it requires authentication before it will serve any requests.

How it works:

Set GHOST_MCP_TOKEN to a random secret string in your shell and in your MCP client's env config block.
The server reads the token at startup and refuses to start if it is not set.
Every MCP request is validated against the startup token via a server hook. If the environment variable is cleared or altered after startup, all subsequent requests are rejected.

What this protects against:

Running the server without prior configuration (no token → no service)
Unauthorized processes that can execute the binary but don't know the token value

Generating a token:

# Linux/macOS
openssl rand -hex 32

# Windows (PowerShell)
-join ((1..32) | % { '{0:x2}' -f (Get-Random -Max 256) })

Use the output as the value for GHOST_MCP_TOKEN in both your shell environment and your MCP client's env block.
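
If you would rather generate the token from Go than from a shell, a minimal sketch using the standard library's cryptographic random source looks like this. `generateToken` is a hypothetical helper, not part of Ghost MCP; it produces the same kind of value as `openssl rand -hex 32` (32 random bytes, hex encoded).

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateToken returns nBytes of cryptographically secure randomness,
// hex encoded (two characters per byte).
func generateToken(nBytes int) (string, error) {
	buf := make([]byte, nBytes)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	token, err := generateToken(32)
	if err != nil {
		panic(err)
	}
	fmt.Println(token) // 64 hex characters, different on every run
}
```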

📋 Audit Logging

Ghost MCP writes a tamper-evident audit trail for every event: tool invocations, authentication failures, client connections, and server lifecycle events.

Log format: JSON Lines (one JSON object per line), one file per UTC day.

Default location:

Platform	Default path
Windows	`%AppData%\ghost-mcp\audit\ghost-mcp-audit-YYYY-MM-DD.jsonl`
macOS	`~/Library/Application Support/ghost-mcp/audit/ghost-mcp-audit-YYYY-MM-DD.jsonl`
Linux	`~/.config/ghost-mcp/audit/ghost-mcp-audit-YYYY-MM-DD.jsonl`

Override with GHOST_MCP_AUDIT_LOG=/your/dir.

Example log entries:

// Server startup
{"seq":1,"timestamp":"2025-01-15T09:00:00.123Z","event":"SERVER_START","params":{"platform":"linux/amd64","version":"1.0.0"},"prev_hash":"0000...","hash":"a3f2..."}

// Client connected (from MCP initialize handshake)
{"seq":2,"timestamp":"2025-01-15T09:00:01.456Z","event":"CLIENT_CONNECTED","client_id":"claude-desktop","params":{"client_name":"claude-desktop","client_version":"1.0"},"prev_hash":"a3f2...","hash":"b7c1..."}

// Tool invocation with parameters
{"seq":3,"timestamp":"2025-01-15T09:00:05.789Z","event":"TOOL_CALL","tool":"type_text","client_id":"claude-desktop","params":{"text":"Hello world"},"prev_hash":"b7c1...","hash":"c9d4..."}

// Screenshot audit trail
{"seq":4,"timestamp":"2025-01-15T09:00:06.012Z","event":"SCREENSHOT_REQUESTED","tool":"take_screenshot","client_id":"claude-desktop","prev_hash":"c9d4...","hash":"d2e5..."}

// Authentication failure
{"seq":5,"timestamp":"2025-01-15T09:00:10.345Z","event":"AUTH_FAILURE","error":"invalid or missing GHOST_MCP_TOKEN","prev_hash":"d2e5...","hash":"e8f3..."}

Tamper detection: Each entry carries:

hash — SHA-256 of the entry's own content
prev_hash — hash of the previous entry (hash chain)

If any entry is modified, deleted, or reordered, the chain breaks. Verify a log file with:

# Verify integrity of a specific log file
# (built-in to the ghost-mcp source — see VerifyLogFile in audit.go)
go run . verify-log /path/to/ghost-mcp-audit-2025-01-15.jsonl
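
To show why a hash chain detects edits, deletions, and reordering, here is a small self-contained Go sketch. It is not the project's `VerifyLogFile`: the `Entry` type, the `|`-joined encoding that gets hashed, and the `genesisHash` placeholder are all assumptions made for illustration, since this README does not spell out the exact serialization.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// genesisHash is a placeholder for the first entry's prev_hash; the real
// value used by Ghost MCP is not spelled out in this README.
const genesisHash = "GENESIS"

// Entry is a reduced audit record with only the fields needed here.
type Entry struct {
	Seq       int
	Timestamp string
	Event     string
	PrevHash  string
	Hash      string
}

// entryHash hashes a hypothetical canonical encoding of the entry.
// Including PrevHash in the hashed content is an assumption.
func entryHash(e Entry) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%d|%s|%s|%s", e.Seq, e.Timestamp, e.Event, e.PrevHash)))
	return hex.EncodeToString(sum[:])
}

// appendEntry links a new entry to the current tail of the chain.
func appendEntry(entries []Entry, ts, event string) []Entry {
	prev := genesisHash
	if len(entries) > 0 {
		prev = entries[len(entries)-1].Hash
	}
	e := Entry{Seq: len(entries) + 1, Timestamp: ts, Event: event, PrevHash: prev}
	e.Hash = entryHash(e)
	return append(entries, e)
}

// verifyChain checks sequence numbers, links, and content hashes in order.
func verifyChain(entries []Entry) error {
	prev := genesisHash
	for i, e := range entries {
		if e.Seq != i+1 {
			return fmt.Errorf("entry %d: unexpected seq %d", i+1, e.Seq)
		}
		if e.PrevHash != prev {
			return fmt.Errorf("entry %d: prev_hash does not match previous entry", e.Seq)
		}
		if entryHash(e) != e.Hash {
			return fmt.Errorf("entry %d: content does not match hash", e.Seq)
		}
		prev = e.Hash
	}
	return nil
}

func main() {
	var entries []Entry
	entries = appendEntry(entries, "2025-01-15T09:00:00.123Z", "SERVER_START")
	entries = appendEntry(entries, "2025-01-15T09:00:01.456Z", "CLIENT_CONNECTED")
	entries = appendEntry(entries, "2025-01-15T09:00:05.789Z", "TOOL_CALL")
	fmt.Println(verifyChain(entries)) // <nil>

	entries[1].Event = "TOOL_SUCCESS" // tamper with the middle entry
	fmt.Println(verifyChain(entries)) // entry 2: content does not match hash
}
```

Because each entry's hash covers its own content and the previous hash, changing any earlier entry invalidates it and cannot be repaired without rewriting every later entry as well.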

Logged events:

Event	Trigger
`SERVER_START`	Server starts
`SERVER_STOP`	Server shuts down
`CLIENT_CONNECTED`	MCP initialize handshake completes
`TOOL_CALL`	Tool invoked (includes sanitized parameters)
`TOOL_SUCCESS`	Tool completed successfully
`TOOL_FAILURE`	Tool returned an error result
`SCREENSHOT_REQUESTED`	`take_screenshot` tool succeeded
`AUTH_FAILURE`	Request rejected due to missing/invalid token
`REQUEST_ERROR`	MCP-level error (unknown tool, malformed request)

HTTP/SSE Transport

By default Ghost MCP communicates over stdio, which is the standard MCP transport used by Claude Desktop. Set GHOST_MCP_TRANSPORT=http to start an HTTP/SSE server instead.

Configuration

{
  "mcpServers": {
    "ghost-mcp": {
      "command": "C:\\path\\to\\ghost-mcp.exe",
      "env": {
        "GHOST_MCP_TOKEN": "your-secret-token-here",
        "GHOST_MCP_TRANSPORT": "http",
        "GHOST_MCP_HTTP_ADDR": "localhost:8080",
        "GHOST_MCP_HTTP_BASE_URL": "http://localhost:8080"
      }
    }
  }
}

Endpoints

Endpoint	Method	Description
`/sse`	GET	Open SSE stream (MCP connection)
`/message`	POST	Send MCP messages

Authentication

HTTP mode uses the same GHOST_MCP_TOKEN as Bearer authentication. Every request must include:

Authorization: Bearer <your-token>

Requests without a valid Bearer token are rejected with HTTP 401 and logged as AUTH_FAILURE events in the audit log.
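
A Bearer check of this kind can be sketched as HTTP middleware in Go. This is an illustration, not Ghost MCP's actual handler: `checkBearer` and `requireBearer` are hypothetical names, the constant-time comparison is a choice made for this sketch, and the audit log write on failure is left out.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// checkBearer reports whether header has the form "Bearer <token>" with
// the expected, non-empty token.
func checkBearer(header, token string) bool {
	got, ok := strings.CutPrefix(header, "Bearer ")
	if !ok || token == "" {
		return false
	}
	// Constant-time comparison avoids leaking how many characters matched.
	return subtle.ConstantTimeCompare([]byte(got), []byte(token)) == 1
}

// requireBearer rejects requests without a valid token with HTTP 401.
func requireBearer(token string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !checkBearer(r.Header.Get("Authorization"), token) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	handler := requireBearer("s3cret", ok)

	for _, auth := range []string{"", "Bearer wrong", "Bearer s3cret"} {
		req := httptest.NewRequest(http.MethodGet, "/sse", nil)
		if auth != "" {
			req.Header.Set("Authorization", auth)
		}
		rec := httptest.NewRecorder()
		handler.ServeHTTP(rec, req)
		fmt.Println(rec.Code) // 401, 401, then 200
	}
}
```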

Example with curl

# Open SSE stream (keep this running in a terminal)
curl -N -H "Authorization: Bearer $GHOST_MCP_TOKEN" http://localhost:8080/sse

# Send an MCP initialize message
curl -X POST \
  -H "Authorization: Bearer $GHOST_MCP_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","clientInfo":{"name":"test","version":"1.0"},"capabilities":{}}}' \
  "http://localhost:8080/message?sessionId=<session-id-from-sse>"

Safety Features

🛡️ Failsafe Mechanism

Ghost MCP includes an emergency shutdown feature:

Trigger: Move your mouse to the absolute top-left corner of the screen (coordinates 0,0)
Effect: The server will log an error and initiate graceful shutdown
Purpose: Prevents runaway AI automation loops from causing damage
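
The failsafe can be pictured as a small watcher that polls the cursor position and cancels a shared context when the cursor reaches (0,0). The Go sketch below makes assumptions throughout: `atFailsafeCorner`, `watchFailsafe`, and the `position` callback (a stand-in for a real cursor lookup) are hypothetical, and the polling interval is arbitrary.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// atFailsafeCorner reports whether the cursor is at the top-left corner.
func atFailsafeCorner(x, y int) bool {
	return x == 0 && y == 0
}

// watchFailsafe polls position every pollMs milliseconds and calls cancel
// once the cursor reaches (0,0). position is a hypothetical stand-in for
// a real cursor lookup.
func watchFailsafe(ctx context.Context, position func() (int, int), pollMs int, cancel context.CancelFunc) {
	ticker := time.NewTicker(time.Duration(pollMs) * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if atFailsafeCorner(position()) {
				cancel() // signals the rest of the server to shut down
				return
			}
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	polls := 0
	// Simulated cursor: reaches the corner on the third poll.
	position := func() (int, int) {
		polls++
		if polls >= 3 {
			return 0, 0
		}
		return 500, 400
	}

	go watchFailsafe(ctx, position, 5, cancel)
	<-ctx.Done()
	fmt.Println("failsafe triggered, shutting down")
}
```

Using a cancellable context lets in-flight tool handlers observe the shutdown and stop cleanly, which fits the "graceful shutdown" behavior described above.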

⚠️ WARNING: Do not move your mouse to (0,0) during normal operation!

Logging Safety

All application logs are written to stderr, never to stdout. This ensures:

MCP JSON-RPC protocol on stdout remains clean
No corruption of messages between client and server
Debug output doesn't interfere with tool responses

Troubleshooting

Common Issues

"Permission Denied" or "Access Denied"

macOS: Grant accessibility permissions in System Settings
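Linux: Ensure the server can access your X11 display. For a quick test, xhost + works, but it disables X access control for all hosts, so restore it with xhost - afterwards.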
Windows: Run as Administrator if needed

"RobotGo initialization failed"

Verify platform-specific dependencies are installed (see Prerequisites)
Check that no other application is blocking accessibility APIs

MCP Client Can't Connect

Verify the binary path in mcp.json is correct
Ensure the binary is executable (chmod +x ghost-mcp on Unix)
In HTTP/SSE mode, check that no firewall or port conflict is blocking the listen address (stdio mode is not affected by firewalls)

Screenshots Fail

Ensure sufficient disk space in temp directory
Verify write permissions to system temp folder

OCR Fails / `read_screen_text` Returns Error

TESSDATA_PREFIX must point to the directory that directly contains eng.traineddata (not its parent)
On Windows with vcpkg: TESSDATA_PREFIX = %USERPROFILE%\vcpkg\installed\x64-mingw-dynamic\share\tessdata
On Linux: TESSDATA_PREFIX = /usr/share/tesseract-ocr/5/tessdata (verify with find /usr/share -name eng.traineddata)
Verify: ls "$TESSDATA_PREFIX/eng.traineddata" should find the file (in PowerShell: Test-Path "$env:TESSDATA_PREFIX\eng.traineddata")
Check the server startup log — it prints the value of TESSDATA_PREFIX at start
Set GHOST_MCP_KEEP_SCREENSHOTS=1 to keep the image files OCR is processing for manual inspection

Debug Mode

Enable verbose logging for troubleshooting:

{
  "mcpServers": {
    "ghost-mcp": {
      "command": "/path/to/ghost-mcp",
      "env": {
        "GHOST_MCP_DEBUG": "1"
      }
    }
  }
}

Development

Running Tests

Ghost MCP ships with unit and integration tests, plus a test fixture GUI used by the integration tests.

# Windows
test_runner.bat              # Unit tests only
test_runner.bat integration  # Integration tests (requires GCC)
test_runner.bat all          # All tests
test_runner.bat fixture      # Start fixture server

# macOS/Linux
chmod +x test_runner.sh
./test_runner.sh
./test_runner.sh integration
./test_runner.sh all
./test_runner.sh fixture

# Direct go test
go test -v -short ./...                 # Unit tests
set "INTEGRATION=1" && go test -v ./...   # Integration tests (Windows cmd; the quotes keep a trailing space out of the value)
export INTEGRATION=1 && go test -v ./... # Integration tests (Unix)

For detailed testing documentation, see TESTING.md.

Building for Release

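# Note: cgo is disabled by default when cross-compiling, so run each command on
# its target platform (or set CGO_ENABLED=1 and CC to a suitable cross-compiler).

# Windows (from a Unix-style shell such as Git Bash)
GOOS=windows GOARCH=amd64 go build -o ghost-mcp.exe -ldflags="-s -w" ./cmd/ghost-mcp/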

# macOS
GOOS=darwin GOARCH=amd64 go build -o ghost-mcp -ldflags="-s -w" ./cmd/ghost-mcp/

# Linux
GOOS=linux GOARCH=amd64 go build -o ghost-mcp -ldflags="-s -w" ./cmd/ghost-mcp/

Architecture

See ARCHITECTURE.md for detailed information about:

Server structure and components
MCP protocol implementation
Tool handling flow
Failsafe mechanism design

Documentation

Document	Description
README.md	Getting started and usage
USAGE.md	Interactive examples and API reference
ARCHITECTURE.md	System design and internals
TESTING.md	Testing guide and fixture docs

License

See LICENSE file for details.

Contributing

Contributions welcome! Please ensure:

All tests pass (go test ./...)
Code follows Go formatting (go fmt ./...)
No linting errors (go vet ./...)
Documentation is updated for new features

Acknowledgments

MCP SDK - Model Context Protocol implementation
RobotGo - Cross-platform automation library
