| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +WebDriverIO MCP Server is a Model Context Protocol (MCP) server that lets Claude Desktop drive web browsers through WebDriverIO. It is published as the npm package `webdriverio-mcp` and communicates over the stdio transport. |
| 8 | + |
| 9 | +## Development Commands |
| 10 | + |
| 11 | +### Build and Package |
| 12 | +```bash |
| 13 | +npm run bundle # Clean, build with tsup, make executable, and create .tgz package |
| 14 | +npm run prebundle # Clean lib directory and .tgz files (npm runs this automatically before bundle) |
| 15 | +npm run postbundle # Create npm package tarball (npm runs this automatically after bundle) |
| 16 | +``` |
| 17 | + |
| 18 | +### Run Server |
| 19 | +```bash |
| 20 | +npm run dev # Run development server with tsx (no build) |
| 21 | +npm start # Run built server from lib/server.js |
| 22 | +``` |
| 23 | + |
| 24 | +## Architecture |
| 25 | + |
| 26 | +### Core Components |
| 27 | + |
| 28 | +**Server Entry Point** (`src/server.ts`) |
| 29 | +- Initializes MCP server using `@modelcontextprotocol/sdk` |
| 30 | +- Redirects console output to stderr so that log messages cannot corrupt the MCP protocol stream on stdout (Chrome also writes to stdout) |
| 31 | +- Registers all tool handlers with the MCP server |
| 32 | +- Uses StdioServerTransport for communication with Claude Desktop |
| 33 | + |
| 34 | +**Browser State Management** (`src/tools/browser.tool.ts`) |
| 35 | +- Maintains global state with `browsers` Map and `currentSession` tracking |
| 36 | +- `getBrowser()` helper retrieves the current active browser instance |
| 37 | +- `startBrowserTool` creates new WebDriverIO remote session with configurable options: |
| 38 | + - Headless mode support |
| 39 | + - Custom window dimensions (400-3840 width, 400-2160 height) |
| 40 | + - Chrome-specific arguments (sandbox, security, media stream, etc.) |
| 41 | +- `closeSessionTool` properly cleans up browser sessions |
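The session state described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `BrowserLike`, `registerBrowser`, and `closeCurrentSession` are hypothetical names, and `BrowserLike` stands in for the WebdriverIO browser object so the sketch runs without a real browser.

```typescript
// Hypothetical stand-in for a WebdriverIO browser object (assumption for this sketch).
interface BrowserLike {
  sessionId: string;
  deleteSession(): Promise<void>;
}

// Global state: every open session, plus the id of the one that tools operate on.
const browsers = new Map<string, BrowserLike>();
let currentSession: string | null = null;

// Hypothetical helper: store a newly started session and make it the current one.
function registerBrowser(browser: BrowserLike): void {
  browsers.set(browser.sessionId, browser);
  currentSession = browser.sessionId;
}

// Return the active browser, or fail loudly if no session has been started.
function getBrowser(): BrowserLike {
  const browser = currentSession ? browsers.get(currentSession) : undefined;
  if (!browser) {
    throw new Error('No active browser session');
  }
  return browser;
}

// Hypothetical helper: end the current session and drop it from the tracked state.
async function closeCurrentSession(): Promise<void> {
  const browser = getBrowser();
  await browser.deleteSession();
  browsers.delete(browser.sessionId);
  currentSession = null;
}
```

Keeping a single `currentSession` pointer means tools never need a session argument, at the cost of operating on only one browser at a time.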
| 42 | + |
| 43 | +**Tool Pattern** |
| 44 | +All tools follow a consistent pattern: |
| 45 | +1. Export Zod schema for arguments validation (e.g., `navigateToolArguments`) |
| 46 | +2. Export ToolCallback function (e.g., `navigateTool`) |
| 47 | +3. Use `getBrowser()` to access current session |
| 48 | +4. Return `CallToolResult` with text content |
| 49 | +5. Wrap operations in try-catch and return errors as text content |
| 50 | + |
| 51 | +**Browser Script Execution** (`src/scripts/get-interactable-elements.ts`) |
| 52 | +- Returns a function that executes in the browser context (not Node.js) |
| 53 | +- `getInteractableElements()` finds all visible, interactable elements on the page |
| 54 | +- Uses modern `element.checkVisibility()` API with fallback for older browsers |
| 55 | +- Generates CSS selectors using IDs, classes, or nth-child path-based selectors |
| 56 | +- Returns element metadata: tagName, type, id, className, textContent, value, placeholder, href, ariaLabel, role, cssSelector, isInViewport |
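The selector strategy listed above (ID first, then classes, then an nth-child path) might look something like the sketch below. It is an assumption-laden illustration rather than the script's actual logic: `ElementLike` and `buildSelector` are hypothetical names, the structural type is meant to mirror the DOM `Element` properties it uses, and the sketch skips escaping and uniqueness checks that real code would likely need.

```typescript
// Minimal structural type mirroring the DOM Element properties used here (assumption).
interface ElementLike {
  tagName: string;
  id: string;
  className: string;
  parentElement: ElementLike | null;
  children: ArrayLike<ElementLike>;
}

// Hypothetical helper: prefer an ID, then classes, then a path of nth-child steps.
function buildSelector(el: ElementLike): string {
  if (el.id) {
    return `#${el.id}`; // real code should escape the id, e.g. with CSS.escape
  }
  const tag = el.tagName.toLowerCase();
  const classes = el.className.split(/\s+/).filter(Boolean);
  if (classes.length > 0) {
    return `${tag}.${classes.join('.')}`; // not guaranteed to be unique on the page
  }
  const parent = el.parentElement;
  if (!parent) {
    return tag;
  }
  // nth-child is 1-based and counts all element siblings
  const index = Array.from(parent.children).indexOf(el) + 1;
  return `${buildSelector(parent)} > ${tag}:nth-child(${index})`;
}
```

Because the script runs inside the page, a helper like this would have to be defined within the returned function rather than imported.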
| 57 | + |
| 58 | +### Build Configuration |
| 59 | + |
| 60 | +**TypeScript** (`tsconfig.json`) |
| 61 | +- Target: ES2022, Module: ESNext |
| 62 | +- Source: `src/`, Output: `build/` (not used for distribution; the published bundle is built by tsup into `lib/`) |
| 63 | +- Strict mode disabled |
| 64 | +- Includes types for Node.js and `@wdio/types` |
| 65 | + |
| 66 | +**Bundler** (`tsup.config.ts`) |
| 67 | +- Entry: `src/server.ts` |
| 68 | +- Output: `lib/` directory (ESM format only) |
| 69 | +- Generates declaration files and sourcemaps |
| 70 | +- Externalizes `zod` dependency |
| 71 | +- The shebang `#!/usr/bin/env node` in server.ts is preserved for CLI execution |
| 72 | + |
| 73 | +### Selector Syntax |
| 74 | + |
| 75 | +The project uses WebDriverIO selector strategies: |
| 76 | +- CSS selectors: `button.my-class`, `#element-id` |
| 77 | +- XPath: `//button[@class='my-class']` |
| 78 | +- Text matching: `button=Exact text` (exact match), `a*=Link containing` (partial match) |
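As a small illustration of the text-matching syntax, the helpers below build the selector strings shown in the list. `exactText` and `partialText` are hypothetical convenience functions, not part of WebDriverIO or this project; the resulting strings would be passed to WebDriverIO's `$` as usual.

```typescript
// Hypothetical helper: "<tag>=<text>" matches an element whose text equals `text`.
function exactText(tag: string, text: string): string {
  return `${tag}=${text}`;
}

// Hypothetical helper: "<tag>*=<text>" matches an element whose text contains `text`.
function partialText(tag: string, text: string): string {
  return `${tag}*=${text}`;
}

// One example per strategy from the list above.
const exampleSelectors: string[] = [
  'button.my-class',                  // CSS selector
  "//button[@class='my-class']",      // XPath
  exactText('button', 'Exact text'),  // exact text match
  partialText('a', 'Link containing') // partial text match
];
```

With a live session, a call such as `await browser.$(exactText('button', 'Submit')).click()` would then target the button by its visible text.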
| 79 | + |
| 80 | +### Key Implementation Details |
| 81 | + |
| 82 | +1. **Console Output Redirection**: All console methods (log, info, warn, debug) are redirected to stderr. The stdio transport carries MCP messages on stdout, so any stray output there (Chrome also writes to stdout) would corrupt the protocol. |
| 83 | + |
| 84 | +2. **Element Visibility**: The `get-interactable-elements.ts` script runs in the browser and must be completely self-contained (no external dependencies). It keeps only visible, non-disabled elements and returns all of them, including those outside the viewport; each result carries an `isInViewport` flag. |
| 85 | + |
| 86 | +3. **Scroll Behavior**: Click operations default to scrolling elements into view (`scrollIntoView` with center alignment) before clicking. |
| 87 | + |
| 88 | +4. **Session Management**: The server maintains a Map of browser sessions keyed by sessionId, but only tracks one `currentSession` at a time. All tools operate on the current session. |
| 89 | + |
| 90 | +5. **Error Handling**: Tools catch errors and return them as text content rather than throwing, ensuring the MCP protocol remains stable. |
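The console redirection from item 1 could be approximated as shown here. This is a sketch under the assumption that routing each method through `console.error` is acceptable; `redirectConsoleToStderr` is a hypothetical name and the actual server may implement the redirect differently.

```typescript
// Console methods named in the text; each is re-pointed at stderr.
const redirectedMethods = ['log', 'info', 'warn', 'debug'] as const;

// Hypothetical helper: after this runs, ordinary logging can no longer reach stdout,
// which stays reserved for MCP protocol messages.
function redirectConsoleToStderr(): void {
  for (const method of redirectedMethods) {
    // console.error writes to stderr; it is looked up at call time.
    console[method] = (...args: unknown[]) => console.error(...args);
  }
}

// Run once at startup, before any other code has a chance to log.
redirectConsoleToStderr();
```

Doing this before the transport connects matters, since a single stray line on stdout is enough to break message framing for the client.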
| 91 | + |
| 92 | +## Adding New Tools |
| 93 | + |
| 94 | +To add a new tool: |
| 95 | + |
| 96 | +1. Create a new file in `src/tools/` (e.g., `my-tool.tool.ts`) |
| 97 | +2. Define Zod schema for arguments: `export const myToolArguments = { ... }` |
| 98 | +3. Implement the tool callback: `export const myTool: ToolCallback = async ({ param }) => { ... }` |
| 99 | +4. Import and register in `src/server.ts`: `server.tool('my_tool', 'description', myToolArguments, myTool)` |
| 100 | + |
| 101 | +Example: |
| 102 | +```typescript |
| 103 | +import { getBrowser } from './browser.tool'; |
| 104 | +import { z } from 'zod'; |
| 105 | +import { ToolCallback } from '@modelcontextprotocol/sdk/server/mcp'; |
| 106 | + |
| 107 | +export const myToolArguments = { |
| 108 | + param: z.string().describe('Description of parameter'), |
| 109 | +}; |
| 110 | + |
| 111 | +export const myTool: ToolCallback = async ({ param }: { param: string }) => { |
| 112 | + try { |
| 113 | + const browser = getBrowser(); |
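| 114 | + const result = await browser.getTitle(); // ... implementation (getTitle() is only a placeholder so that `result` is defined) |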
| 115 | + return { |
| 116 | + content: [{ type: 'text', text: `Success: ${result}` }], |
| 117 | + }; |
| 118 | + } catch (e) { |
| 119 | + return { |
| 120 | + content: [{ type: 'text', text: `Error: ${e}` }], |
| 121 | + }; |
| 122 | + } |
| 123 | +}; |
| 124 | +``` |