A Model Context Protocol (MCP) server that exposes Win32 APIs for desktop application automation, window enumeration, and process management. This server enables AI assistants and automation tools to interact with native Windows applications through a standardized protocol. It features a system tray icon, a configurable agentic mode, permission dialogs, and comprehensive activity tracking.
- System Tray Integration: Always-accessible tray icon with context menu
- Activity Monitor: Real-time GUI showing all API operations and their status
- Settings Dialog: Comprehensive configuration with tabbed interface
- Permission Dialogs: User-friendly permission requests for API operations
- Configurable Automation: Toggle between automatic and manual API execution
- Permission System: Fine-grained control over API access with user prompts
- Activity Tracking: Complete history of all operations with filtering and search
- Security Controls: Configurable restrictions on elevated process access
- Window Enumeration: List all visible desktop applications and their windows
- Child Window Discovery: Enumerate child windows and controls within applications
- Window Information: Get detailed window properties (title, class, position, state)
- Message Posting: Send messages to controls (click buttons, input text)
- Native Win32 Messaging: Direct interaction with Windows controls
- Focus Management: Set focus to specific windows and controls
- Process Enumeration: List all running processes with detailed information
- Application Type Detection: Identify console, Win32, and .NET applications
- Process Metadata: Get executable paths, command lines, and resource usage
- MCP Protocol Layer
  - JSON-RPC 2.0 communication
  - Tool registration and discovery
  - Request/response handling
- Win32 API Wrappers
  - WindowManager: Window enumeration and information
  - ProcessManager: Process discovery and analysis
  - MessageSender: Control interaction and automation
  - SecurityManager: Safe API access and validation
- Data Models
  - Window information structures
  - Process metadata
  - Control interaction schemas
- .NET 8.0 SDK or later
- Windows 10/11 (required for Win32 APIs)
- Visual Studio 2022 or VS Code (recommended)
- Clone the repository
  git clone <your-repo-url>
  cd WinAPIMCP
- Build the project
  cd src
  dotnet build
- Run the MCP server
  dotnet run
The server supports various configuration options:
dotnet run -- --port 3000 --log-level Debug --allow-elevated
Options:
- --port: Server port (default: 3000)
- --log-level: Logging level (Debug, Info, Warn, Error)
- --allow-elevated: Allow interaction with elevated processes
- --config: Path to configuration file
To connect an MCP client to the Windows API MCP Server, use the following configuration:
Recommended: JSON-RPC over HTTP (Multiple Clients)
First, start the Windows API MCP Server:
cd D:\Development\Personal\WinAPIMCP\src
dotnet run -- --port 3000
Then add this to your claude_desktop_config.json:
{
"mcpServers": {
"windows-api": {
"command": "dotnet",
"args": [
"run",
"--project",
"D:\\Development\\Personal\\WinAPIMCP\\src",
"--",
"--port",
"3000"
],
"env": {
"DOTNET_ENVIRONMENT": "Production"
}
}
}
}
Alternative: STDIO Transport (Single Client)
For single client connections using STDIO:
{
"mcpServers": {
"windows-api": {
"command": "dotnet",
"args": [
"run",
"--project",
"D:\\Development\\Personal\\WinAPIMCP\\src"
]
}
}
}
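The client entries above differ only in the project path and the optional server arguments, so they can be generated rather than hand-edited. The following is a minimal sketch, not part of the project: the helper names `build_server_entry` and `build_config` are hypothetical, and the path is the example path used above. Serializing through `json.dumps` takes care of doubling the backslashes in Windows paths.

```python
import json


def build_server_entry(project_path, port=None):
    """Return an mcpServers entry that launches the server with `dotnet run`."""
    args = ["run", "--project", project_path]
    if port is not None:
        # Everything after "--" is passed to the server rather than to dotnet.
        args += ["--", "--port", str(port)]
    return {"command": "dotnet", "args": args}


def build_config(project_path, port=None, name="windows-api"):
    """Wrap a single server entry in the top-level mcpServers object."""
    return {"mcpServers": {name: build_server_entry(project_path, port)}}


# Example path taken from the configuration above; adjust for your checkout.
config = build_config(r"D:\Development\Personal\WinAPIMCP\src", port=3000)
config_json = json.dumps(config, indent=2)
print(config_json)
```

Omitting `port` produces the shorter STDIO-style entry with no server arguments.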
{
"name": "windows-api-mcp",
"description": "Windows API automation server",
"transport": {
"type": "stdio",
"command": "dotnet",
"args": [
"run",
"--project",
"path/to/WinAPIMCP/src",
"--",
"--port", "3000",
"--log-level", "Info"
]
},
"capabilities": {
"tools": [
"enumerate_windows",
"enumerate_child_windows",
"get_window_info",
"set_window_focus",
"show_window",
"find_windows_by_title",
"find_windows_by_class",
"enumerate_processes",
"get_process_info",
"find_processes_by_name",
"click_at_coordinates",
"click_control",
"send_text",
"send_keys",
"get_control_text",
"set_control_text",
"find_elements_by_text",
"take_screenshot",
"get_cursor_position",
"move_cursor",
"drag_from_to",
"scroll_window"
]
}
}
For the VS Code MCP extension, add the following to your settings:
{
"mcp.servers": {
"windows-api": {
"command": "dotnet",
"args": [
"run",
"--project",
"D:\\Development\\Personal\\WinAPIMCP\\src"
]
}
}
}
- Ensure .NET Runtime: The client machine must have .NET 8.0 runtime installed
- Windows Platform: Server only runs on Windows 10/11
- Permissions: Some operations may require elevated privileges
- Firewall: Ensure the configured port (default 3000) is accessible
When connecting clients, consider the agentic mode setting:
- Agentic Mode ON: API calls execute automatically without user prompts
- Agentic Mode OFF: User will see permission dialogs for each API operation
You can toggle this through:
- System tray context menu
- Settings dialog (accessible from tray)
- Main application window toolbar
The server exposes HTTP endpoints for testing and direct tool access:
# Check server status and basic information
Invoke-RestMethod -Uri "http://localhost:3000/" -Method Get
# Test endpoint with sample MCP requests
Invoke-RestMethod -Uri "http://localhost:3000/test" -Method Get
# Get comprehensive tool list with documentation
Invoke-RestMethod -Uri "http://localhost:3000/mcp/tools" -Method Get
# Standard MCP JSON-RPC tool call
$body = @{
    jsonrpc = "2.0"
    method  = "tools/call"
    id      = 1
    params  = @{
        name      = "enumerate_windows"
        arguments = @{
            include_minimized = $true
        }
    }
} | ConvertTo-Json -Depth 4
Invoke-RestMethod -Uri "http://localhost:3000/" -Method Post -Body $body -ContentType "application/json"
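For clients that are not written in PowerShell, the same JSON-RPC request can be sent from Python using only the standard library. This is a sketch under the assumption that the server accepts the POST at the root URL exactly as in the PowerShell call above; the function names `build_tool_call` and `call_tool` are illustrative and not part of the server.

```python
import json
import urllib.request


def build_tool_call(name, arguments=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request body as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def call_tool(name, arguments=None, base_url="http://localhost:3000/", timeout=10.0):
    """POST a tool call to the server and return the decoded JSON reply.

    Assumes the server is already running on the given URL.
    """
    payload = json.dumps(build_tool_call(name, arguments)).encode("utf-8")
    request = urllib.request.Request(
        base_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the body does not need a running server.
request_body = build_tool_call("enumerate_windows", {"include_minimized": True})
print(json.dumps(request_body, indent=2))
```

With the server running, `call_tool("enumerate_windows", {"include_minimized": True})` sends the same request as the PowerShell example.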
Lists all visible desktop windows.
{
"name": "enumerate_windows",
"arguments": {
"include_minimized": false,
"filter_by_title": "optional_regex_pattern"
}
}
Response:
{
"windows": [
{
"handle": "0x12345678",
"title": "Notepad",
"class_name": "Notepad",
"process_id": 1234,
"is_visible": true,
"bounds": { "x": 100, "y": 100, "width": 800, "height": 600 }
}
]
}
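A client usually needs to pick one window out of this response before calling other tools. The sketch below works on the response shape shown above; the helper names are hypothetical, and it assumes handles are always hexadecimal strings such as "0x12345678".

```python
import re

# Sample data copied from the response shown above.
sample_response = {
    "windows": [
        {
            "handle": "0x12345678",
            "title": "Notepad",
            "class_name": "Notepad",
            "process_id": 1234,
            "is_visible": True,
            "bounds": {"x": 100, "y": 100, "width": 800, "height": 600},
        }
    ]
}


def find_windows(response, title_pattern):
    """Return the windows whose title matches the regex (case-insensitive)."""
    pattern = re.compile(title_pattern, re.IGNORECASE)
    return [
        window
        for window in response.get("windows", [])
        if pattern.search(window.get("title", ""))
    ]


def handle_to_int(handle):
    """Convert a hexadecimal handle string to an integer."""
    return int(handle, 16)


matches = find_windows(sample_response, r"^note")
for window in matches:
    print(window["handle"], window["title"], window["process_id"])
```

The handle string can be passed back unchanged as `window_handle` or `parent_handle` in later calls; the integer form is only useful for comparisons or logging.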
Enumerates child windows and controls within a parent window.
{
"name": "enumerate_child_windows",
"arguments": {
"parent_handle": "0x12345678",
"include_all_descendants": true
}
}
Gets detailed information about a specific window.
{
"name": "get_window_info",
"arguments": {
"window_handle": "0x12345678"
}
}
Clicks at specific screen coordinates.
{
"name": "click_at_coordinates",
"arguments": {
"x": 100,
"y": 200,
"button": "Left",
"click_count": 1
}
}
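Click coordinates often come from a window's `bounds` as returned by `enumerate_windows`. The sketch below computes the center of a window and builds the matching `click_at_coordinates` call. It assumes that `bounds` and the click coordinates share the same screen coordinate space, which the text above does not state explicitly; the helper names are hypothetical.

```python
def center_of(bounds):
    """Return the (x, y) center point of a bounds rectangle."""
    return (
        bounds["x"] + bounds["width"] // 2,
        bounds["y"] + bounds["height"] // 2,
    )


def click_center_call(window, button="Left", click_count=1):
    """Build a click_at_coordinates tool call aimed at the window's center."""
    x, y = center_of(window["bounds"])
    return {
        "name": "click_at_coordinates",
        "arguments": {
            "x": x,
            "y": y,
            "button": button,
            "click_count": click_count,
        },
    }


# Bounds taken from the enumerate_windows sample response.
window = {"bounds": {"x": 100, "y": 100, "width": 800, "height": 600}}
call = click_center_call(window)
print(call)
```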
Sends text input to a window or control.
{
"name": "send_text",
"arguments": {
"window_handle": "0x12345678",
"text": "Hello, World!",
"control_handle": "0x87654321"
}
}
Captures a screenshot of a window or the full screen.
{
"name": "take_screenshot",
"arguments": {
"window_handle": "0x12345678"
}
}
Lists running processes with type detection.
{
"name": "enumerate_processes",
"arguments": {
"include_system": false,
"filter_by_name": "optional_pattern"
}
}
Response:
{
"processes": [
{
"id": 1234,
"name": "notepad.exe",
"type": "win32",
"executable_path": "C:\\Windows\\System32\\notepad.exe",
"command_line": "notepad.exe document.txt",
"window_count": 1,
"is_elevated": false
}
]
}
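The `window_count` and `is_elevated` fields make it straightforward to narrow the process list to candidates for UI automation. The following sketch filters a response of the shape shown above; the function name is hypothetical, and the second sample entry is invented purely to exercise the filter.

```python
sample_response = {
    "processes": [
        {
            "id": 1234,
            "name": "notepad.exe",
            "type": "win32",
            "window_count": 1,
            "is_elevated": False,
        },
        {
            # Hypothetical entry, added only to show the elevated case.
            "id": 4321,
            "name": "elevated-tool.exe",
            "type": "win32",
            "window_count": 2,
            "is_elevated": True,
        },
    ]
}


def automation_candidates(response, allow_elevated=False):
    """Return processes that own at least one window.

    Elevated processes are skipped unless allow_elevated is True.
    """
    return [
        process
        for process in response.get("processes", [])
        if process.get("window_count", 0) > 0
        and (allow_elevated or not process.get("is_elevated", False))
    ]


candidates = automation_candidates(sample_response)
print([process["name"] for process in candidates])
```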
Gets detailed information about a specific process.
{
"name": "get_process_info",
"arguments": {
"process_id": 1234
}
}
- Privilege Management: The server respects Windows security boundaries
- Process Elevation: Elevated processes require explicit permission
- Input Validation: All Win32 API calls are validated and sanitized
- Access Control: Configurable restrictions on system-level operations
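To make the interaction between these controls and agentic mode concrete, here is one possible decision policy written as a small function. It is an illustration only: the README does not specify how the server combines the settings, so the ordering of the checks and the `Decision` names are assumptions, not the server's confirmed behavior.

```python
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"  # run the API call immediately
    PROMPT = "prompt"    # show a permission dialog first
    DENY = "deny"        # refuse the call


def decide(agentic_mode, target_is_elevated, allow_elevated):
    """Assumed policy: elevation is checked first, then agentic mode."""
    if target_is_elevated and not allow_elevated:
        # Assumption: elevated targets are refused unless explicitly allowed.
        return Decision.DENY
    if agentic_mode:
        # Agentic mode ON: calls execute without user prompts.
        return Decision.EXECUTE
    # Agentic mode OFF: the user approves each operation.
    return Decision.PROMPT


print(decide(agentic_mode=True, target_is_elevated=False, allow_elevated=False))
print(decide(agentic_mode=False, target_is_elevated=False, allow_elevated=False))
print(decide(agentic_mode=True, target_is_elevated=True, allow_elevated=False))
```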
WinAPIMCP/
├── src/ # Main application source
│ ├── Configuration/ # Settings and app configuration
│ ├── MCP/ # MCP protocol implementation
│ ├── Models/ # Data models and DTOs
│ ├── Services/ # Core business logic services
│ ├── UI/ # Windows Forms UI components
│ └── Win32/ # Win32 API P/Invoke declarations
└── docs/ # Documentation
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Architecture Overview - Detailed system design and component documentation
The application runs in the system tray with the following features:
- Tray Icon: Look for the "W" icon in your system tray
- Context Menu: Right-click the tray icon for quick actions
- Double-Click: Opens the main activity monitor window
Control how the server handles API requests:
- From Tray: Right-click tray icon → "Agentic Mode" (checkmark when enabled)
- From Settings: Right-click tray icon → "Settings..." → General tab
- From Main Window: Toolbar button or Tools menu
View real-time API operations:
- Open Monitor: Double-click tray icon or select "Show Main Window"
- Activity List: Real-time display of all API calls with color-coded status
- Details View: Double-click any activity for full details
- Filtering: Use the View menu to filter activities by type or date
Access comprehensive settings:
- Open Settings: Right-click tray icon → "Settings..."
- General Tab: Agentic mode, notifications, startup options
- Security Tab: Elevated process access, permission settings
- Advanced Tab: Activity history limits, file locations
- Enable agentic mode for automatic API execution
- Monitor activities in real time through the main window
- Use debug logging for detailed operation tracking
- Disable agentic mode for manual approval
- Configure notifications for API requests
- Review activity history regularly
- Tray Icon Not Visible: Check the system tray overflow area or restart the application
- Permission Denied: Run as administrator for elevated process access
- Port Already in Use: Change the port with the --port argument or through Settings
- Win32 API Failures: Check Windows compatibility and permissions
- Agentic Mode Confusion: Check tray icon tooltip or main window status bar
- Settings Not Persisting: Ensure write access to %AppData%\WinAPIMCP\
Enable debug logging for troubleshooting:
dotnet run -- --log-level Debug
Logs are written to:
- File: logs/winapimcp-[date].log (in application directory)
- Activity Monitor: Real-time display in the main GUI window
- Console: Only when running from command line
You can also change log level through Settings → General → Log Level.
This project is licensed under the MIT License - see the LICENSE file for details.
- PInvoke.net for Win32 API signatures
- MCP Specification for protocol documentation
- Microsoft Windows API documentation
Note: This server requires Windows 10/11 with .NET 8.0 runtime. The GUI components use Windows Forms and require a desktop environment. Some features may require administrator privileges depending on target applications.
Agentic Mode: When disabled, users will see permission dialogs for each API operation. Enable for automated workflows or disable for manual control and security.