Commit f2f075f

chore: Adding CLAUDE.md
1 parent 2b3c7c7 commit f2f075f

1 file changed

Lines changed: 124 additions & 0 deletions

CLAUDE.md

@@ -0,0 +1,124 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

WebDriverIO MCP Server is a Model Context Protocol (MCP) server that enables Claude Desktop to interact with web browsers using WebDriverIO for browser automation. The server is published as an npm package (`webdriverio-mcp`) and runs via stdio transport.

## Development Commands

### Build and Package
```bash
npm run bundle      # Clean, build with tsup, make executable, and create .tgz package
npm run prebundle   # Clean lib directory and .tgz files (npm runs this automatically before bundle)
npm run postbundle  # Create npm package tarball (npm runs this automatically after bundle)
```

### Run Server
```bash
npm run dev         # Run development server with tsx (no build)
npm start           # Run built server from lib/server.js
```

## Architecture

### Core Components

**Server Entry Point** (`src/server.ts`)
- Initializes the MCP server using `@modelcontextprotocol/sdk`
- Redirects console output to stderr to avoid interfering with the MCP protocol (Chrome writes to stdout)
- Registers all tool handlers with the MCP server
- Uses `StdioServerTransport` for communication with Claude Desktop

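The console redirection described above could look roughly like the following. This is a minimal sketch, not the code in `src/server.ts`: the `redirectConsoleToStderr` function and the `ConsoleLike` interface are hypothetical names introduced here so the idea can be shown without any dependencies.

```typescript
type LogFn = (...args: unknown[]) => void;

// Hypothetical minimal view of the console; the global `console` satisfies it.
interface ConsoleLike {
  log: LogFn;
  info: LogFn;
  warn: LogFn;
  debug: LogFn;
  error: LogFn;
}

// Route log/info/warn/debug through `error`, which Node.js writes to stderr,
// so stdout stays reserved for MCP protocol messages.
// Returns a function that restores the original methods.
function redirectConsoleToStderr(target: ConsoleLike = console): () => void {
  const original = {
    log: target.log,
    info: target.info,
    warn: target.warn,
    debug: target.debug,
  };
  const toStderr: LogFn = (...args) => target.error(...args);
  target.log = toStderr;
  target.info = toStderr;
  target.warn = toStderr;
  target.debug = toStderr;
  return () => {
    Object.assign(target, original);
  };
}
```

Calling `redirectConsoleToStderr()` once at startup, before any tool is registered, would be enough; the returned restore function is mainly useful in tests.
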
**Browser State Management** (`src/tools/browser.tool.ts`)
- Maintains global state with `browsers` Map and `currentSession` tracking
- `getBrowser()` helper retrieves the current active browser instance
- `startBrowserTool` creates new WebDriverIO remote session with configurable options:
  - Headless mode support
  - Custom window dimensions (400-3840 width, 400-2160 height)
  - Chrome-specific arguments (sandbox, security, media stream, etc.)
- `closeSessionTool` properly cleans up browser sessions

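As an illustration of this state model, the sketch below keeps a `browsers` Map plus a single `currentSession` pointer. Only the names `browsers`, `currentSession`, and `getBrowser` come from the description above; `BrowserLike`, `registerBrowser`, `closeCurrentSession`, and the error message are assumptions made for the example, and a real implementation would hold WebDriverIO browser objects and end the remote session on close.

```typescript
// Hypothetical stand-in for a WebDriverIO browser object.
interface BrowserLike {
  sessionId: string;
}

// All known sessions, keyed by sessionId.
const browsers = new Map<string, BrowserLike>();
// The one session that every tool operates on.
let currentSession: string | null = null;

// Hypothetical helper: store a new session and make it the current one.
function registerBrowser(browser: BrowserLike): void {
  browsers.set(browser.sessionId, browser);
  currentSession = browser.sessionId;
}

// Return the current browser, or throw if no session is active.
function getBrowser(): BrowserLike {
  const browser = currentSession ? browsers.get(currentSession) : undefined;
  if (!browser) {
    throw new Error("No active browser session");
  }
  return browser;
}

// Hypothetical helper: forget the current session.
function closeCurrentSession(): void {
  if (currentSession) {
    browsers.delete(currentSession);
    currentSession = null;
  }
}
```

Throwing from `getBrowser()` when nothing is active fits the tool pattern, since each tool catches the error and reports it as text.
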
**Tool Pattern**
All tools follow a consistent pattern:
1. Export Zod schema for arguments validation (e.g., `navigateToolArguments`)
2. Export ToolCallback function (e.g., `navigateTool`)
3. Use `getBrowser()` to access current session
4. Return `CallToolResult` with text content
5. Wrap operations in try-catch and return errors as text content

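Steps 4 and 5 of the pattern can be illustrated with a small wrapper. The sketch below is self-contained and uses a simplified `ToolResult` type in place of the SDK's `CallToolResult`; `textResult`, `withErrorHandling`, and `echoTool` are hypothetical names, not part of the repository or the SDK.

```typescript
// Simplified stand-in for the SDK's CallToolResult (assumption: text content only).
interface ToolResult {
  content: Array<{ type: "text"; text: string }>;
}

// Hypothetical helper: wrap a string in the result shape.
function textResult(text: string): ToolResult {
  return { content: [{ type: "text", text }] };
}

// Hypothetical wrapper: run the tool body and turn any thrown error
// into text content instead of letting it propagate.
function withErrorHandling<Args>(
  body: (args: Args) => Promise<string>,
): (args: Args) => Promise<ToolResult> {
  return async (args) => {
    try {
      return textResult(await body(args));
    } catch (e) {
      const message = e instanceof Error ? e.message : String(e);
      return textResult(`Error: ${message}`);
    }
  };
}

// Made-up tool body that fails on empty input.
const echoTool = withErrorHandling(async ({ text }: { text: string }) => {
  if (text.length === 0) throw new Error("text must not be empty");
  return `Echo: ${text}`;
});
```

Whether the repository wraps tools this way or repeats the try-catch in each file is not stated here; the wrapper only shows the behavior the pattern asks for.
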
**Browser Script Execution** (`src/scripts/get-interactable-elements.ts`)
- Returns a function that executes in the browser context (not Node.js)
- `getInteractableElements()` finds all visible, interactable elements on the page
- Uses modern `element.checkVisibility()` API with fallback for older browsers
- Generates CSS selectors using IDs, classes, or nth-child path-based selectors
- Returns element metadata: tagName, type, id, className, textContent, value, placeholder, href, ariaLabel, role, cssSelector, isInViewport

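One way the selector generation above could work is sketched below. The ordering (ID first, then tag plus classes if unique, then an nth-child path) is an assumption based on the bullet, not the script's actual logic. `ElementLike`, `buildCssSelector`, and the `isUnique` callback are hypothetical; the interface mirrors a few real DOM properties so the function can run outside a browser.

```typescript
// Minimal structural view of a DOM element (a real Element satisfies it).
interface ElementLike {
  tagName: string;
  id: string;
  className: string;
  parentElement: ElementLike | null;
  children: ArrayLike<ElementLike>;
}

// Hypothetical selector builder. `isUnique` reports whether a selector
// matches exactly one element on the page.
function buildCssSelector(
  el: ElementLike,
  isUnique: (selector: string) => boolean,
): string {
  // 1. Prefer an ID when the element has one.
  if (el.id) return `#${el.id}`;

  // 2. Try tag plus classes, but only if that is unambiguous.
  const tag = el.tagName.toLowerCase();
  const classes = el.className.split(/\s+/).filter(Boolean);
  if (classes.length > 0) {
    const candidate = `${tag}.${classes.join(".")}`;
    if (isUnique(candidate)) return candidate;
  }

  // 3. Fall back to an nth-child path, stopping at the nearest ancestor with an ID.
  const segments: string[] = [];
  let node: ElementLike | null = el;
  while (node) {
    if (node.id) {
      segments.unshift(`#${node.id}`);
      break;
    }
    const parent: ElementLike | null = node.parentElement;
    const nodeTag = node.tagName.toLowerCase();
    if (!parent) {
      segments.unshift(nodeTag);
      break;
    }
    const index = Array.from(parent.children).indexOf(node) + 1;
    segments.unshift(`${nodeTag}:nth-child(${index})`);
    node = parent;
  }
  return segments.join(" > ");
}
```

In a browser, `isUnique` could be `(s) => document.querySelectorAll(s).length === 1`. IDs or class names containing special characters would also need `CSS.escape`, which the sketch leaves out.
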
### Build Configuration

**TypeScript** (`tsconfig.json`)
- Target: ES2022, Module: ESNext
- Source: `src/`, Output: `build/` (not used for distribution; the published bundle is built into `lib/` by tsup)
- Strict mode disabled
- Includes types for Node.js and `@wdio/types`

**Bundler** (`tsup.config.ts`)
- Entry: `src/server.ts`
- Output: `lib/` directory (ESM format only)
- Generates declaration files and sourcemaps
- Externalizes `zod` dependency
- The shebang `#!/usr/bin/env node` in server.ts is preserved for CLI execution

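The bundler settings listed above might translate into a `tsup.config.ts` along these lines. This is a sketch reconstructed from the bullet points, not the repository's actual file, which may set further options.

```typescript
import { defineConfig } from 'tsup';

export default defineConfig({
  entry: ['src/server.ts'], // single entry point
  outDir: 'lib',            // bundle output directory
  format: ['esm'],          // ESM only
  dts: true,                // emit declaration files
  sourcemap: true,          // emit sourcemaps
  external: ['zod'],        // leave zod out of the bundle
});
```
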
### Selector Syntax

The project uses WebDriverIO selector strategies:
- CSS selectors: `button.my-class`, `#element-id`
- XPath: `//button[@class='my-class']`
- Text matching: `button=Exact text` (exact match), `a*=Link containing` (partial match)

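To make the three families concrete, the sketch below classifies a selector string by its shape. It is a simplified illustration and not WebDriverIO's own resolution logic; `classifySelector` and `SelectorKind` are hypothetical names, and the rules cover only the forms listed above.

```typescript
type SelectorKind =
  | { strategy: "xpath"; expression: string }
  | { strategy: "text"; tag: string | null; text: string; partial: boolean }
  | { strategy: "css"; expression: string };

// Hypothetical classifier for the selector forms listed above.
function classifySelector(selector: string): SelectorKind {
  // XPath expressions start with a path step or a parenthesised group.
  if (/^(\/|\(|\.\/|\.\.\/)/.test(selector)) {
    return { strategy: "xpath", expression: selector };
  }
  // Text matching: optional tag name, optional "*" for partial match, then "=".
  const match = /^([a-zA-Z][\w-]*)?(\*)?=(.+)$/.exec(selector);
  if (match) {
    const [, tag, star, text = ""] = match;
    return { strategy: "text", tag: tag ?? null, text, partial: star === "*" };
  }
  // Everything else is treated as a CSS selector.
  return { strategy: "css", expression: selector };
}
```

A CSS attribute selector such as `input[name=q]` still falls through to the CSS branch, because the text pattern only allows a bare tag name before the `=`.
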
### Key Implementation Details

1. **Console Output Redirection**: All console methods (log, info, warn, debug) are redirected to stderr because Chrome writes to stdout, which would corrupt the MCP stdio protocol.

2. **Element Visibility**: The `get-interactable-elements.ts` script runs in the browser and must be completely self-contained (no external dependencies). It filters for visible, non-disabled elements and returns all of them regardless of viewport status.

3. **Scroll Behavior**: Click operations default to scrolling elements into view (`scrollIntoView` with center alignment) before clicking.

4. **Session Management**: The server maintains a Map of browser sessions keyed by sessionId, but only tracks one `currentSession` at a time. All tools operate on the current session.

5. **Error Handling**: Tools catch errors and return them as text content rather than throwing, ensuring the MCP protocol remains stable.

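The filtering rule in item 2 can be sketched as follows: hidden and disabled elements are dropped, while off-screen elements are kept and only flagged. The types and the `describeInteractable` function are hypothetical, and treating "in viewport" as "fully inside the viewport" is an assumption; in the browser, `visible` would come from `element.checkVisibility()` (or its fallback) and `rect` from `getBoundingClientRect()`.

```typescript
interface Rect { top: number; left: number; bottom: number; right: number }
interface Viewport { width: number; height: number }

// Hypothetical pre-measured view of an element.
interface CandidateElement {
  tagName: string;
  visible: boolean;
  disabled?: boolean;
  rect: Rect;
}

// Assumption: an element counts as in the viewport only if it is fully inside it.
function isInViewport(rect: Rect, viewport: Viewport): boolean {
  return (
    rect.top >= 0 &&
    rect.left >= 0 &&
    rect.bottom <= viewport.height &&
    rect.right <= viewport.width
  );
}

// Keep visible, non-disabled elements; report viewport status instead of filtering on it.
function describeInteractable(elements: CandidateElement[], viewport: Viewport) {
  return elements
    .filter((el) => el.visible && !el.disabled)
    .map((el) => ({
      tagName: el.tagName.toLowerCase(),
      isInViewport: isInViewport(el.rect, viewport),
    }));
}
```

Keeping off-screen elements in the result pairs with item 3: a click tool can still target them because it scrolls the element into view first.
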
## Adding New Tools

To add a new tool:

1. Create a new file in `src/tools/` (e.g., `my-tool.tool.ts`)
2. Define Zod schema for arguments: `export const myToolArguments = { ... }`
3. Implement the tool callback, which receives the validated arguments as its first parameter: `export const myTool: ToolCallback = async (args) => { ... }`
4. Import and register in `src/server.ts`: `server.tool('my_tool', 'description', myToolArguments, myTool)`

Example:
```typescript
import { getBrowser } from './browser.tool';
import { z } from 'zod';
import { ToolCallback } from '@modelcontextprotocol/sdk/server/mcp';

// Zod schema describing the tool's arguments
export const myToolArguments = {
  param: z.string().describe('Description of parameter'),
};

export const myTool: ToolCallback = async ({ param }: { param: string }) => {
  try {
    const browser = getBrowser();
    // ... implementation using `browser` and `param`
    const result = param; // placeholder: replace with the real outcome
    return {
      content: [{ type: 'text', text: `Success: ${result}` }],
    };
  } catch (e) {
    // Report failures as text content instead of throwing
    return {
      content: [{ type: 'text', text: `Error: ${e}` }],
    };
  }
};
```
