BrowserGenie MCP Server

An MCP (Model Context Protocol) server that gives AI models full control over a Chrome browser. It pairs with the BrowserGenie Extension to expose 50+ browser automation tools over stdio — navigation, clicking, typing, screenshots, touch gestures, macro recording, and complete DevTools access.

Two-repo setup: This is the server half. The Chrome extension lives in a separate repository. Both are required.

How It Works

AI Client (Claude, Cursor, etc.)
    │  stdio  (JSON-RPC / MCP)
    ▼
MCP Server  ◄── this repo
    │  WebSocket  ws://localhost:7890
    ▼
Chrome Extension
    ├── chrome.tabs         → Navigation, tab management
    ├── chrome.debugger     → DevTools Protocol (CDP)
    ├── chrome.scripting    → Content script injection
    ├── chrome.cookies      → Cookie management
    └── Content Scripts      → Real DOM event simulation

The MCP server bridges your AI client (over stdio) and the Chrome extension (over WebSocket). Every MCP tool call is forwarded to the extension, executed in the browser, and the result is returned to the AI client.

Requirements

Node.js 18+
npm
The companion BrowserGenie Extension installed in Chrome

Installation

Option A — npx (recommended)

No installation required. Run directly from npm:

npx browser-genie-mcp-server

Or install globally:

npm install -g browser-genie-mcp-server
browser-genie-mcp-server

Option B — Run from source

For development or to use the latest unreleased changes:

git clone https://github.com/BrowserGenie/mcp.git
cd mcp
npm install
npm run build
node dist/index.js

Configuration

Add the server to your AI client's MCP configuration.

Claude Desktop

File: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
File: %APPDATA%\Claude\claude_desktop_config.json (Windows)

{
  "mcpServers": {
    "browser-genie": {
      "command": "npx",
      "args": ["browser-genie-mcp-server"]
    }
  }
}

Claude Code

File: ~/.claude/settings.json or project-level .mcp.json:

{
  "mcpServers": {
    "browser-genie": {
      "command": "npx",
      "args": ["browser-genie-mcp-server"]
    }
  }
}

Cursor / other MCP clients

{
  "mcpServers": {
    "browser-genie": {
      "command": "npx",
      "args": ["browser-genie-mcp-server"]
    }
  }
}

If you installed the package globally with npm install -g browser-genie-mcp-server, you can use "command": "browser-genie-mcp-server" with no args instead.

After saving the config, restart your AI client.

Environment Variables

Variable	Default	Description
`WEBSOCKET_PORT`	`7890`	Port the WebSocket server listens on. The extension must use the same port.

Pass via the MCP config env block:

{
  "mcpServers": {
    "browser-genie": {
      "command": "npx",
      "args": ["browser-genie-mcp-server"],
      "env": { "WEBSOCKET_PORT": "8080" }
    }
  }
}

If you change the port, also update WEBSOCKET_URL in constants.ts inside the extension repo.

Verifying the Connection

Once the server is running and the extension is loaded in Chrome, click the extension icon. The popup should show a green Connected indicator. If it shows Disconnected, ensure the MCP server process is running.

MCP Tools Reference

Every tool accepts an optional apiKey (string) when authentication is enabled in the extension popup, and an optional tabId (number) to target a specific tab (defaults to the active tab).

Tip: If you see "No active tab found" or "Cannot access a chrome:// URL", use list_tabs to find a valid tab ID, or navigate to a regular web page first.

Navigation

Tool	Description	Key Parameters
`navigate_to_url`	Navigate to a URL	`url` (required)
`navigate_back`	Go back in history	—
`navigate_forward`	Go forward in history	—
`navigate_reload`	Reload the page	`ignoreCache` (bool)

Tab Management

Tool	Description	Key Parameters
`list_tabs`	List all open tabs	—
`select_tab`	Focus a tab	`tabId` (required)
`new_tab`	Open a new tab	`url` (optional)
`close_tab`	Close a tab	`tabId` (required)
`get_tab_state`	Capture URL, title, and DOM hash for state comparison	—
`assert_tabs_match`	Verify two tabs have identical state	`tabIdA`, `tabIdB`
`test_storage_sync`	Test cross-tab localStorage sync	`tabIdA`, `tabIdB`, `key`, `value`

Keyboard

Tool	Description	Key Parameters
`press_key`	Press a key with optional modifiers	`key`, `modifiers[]`
`type_text`	Type text character by character	`text`, `delay` (ms)

Mouse & Interaction

Tool	Description	Key Parameters
`click_element`	Click via coordinates, CSS, or XPath with human-like Bézier curve movement and randomized delays	`target.type`, `target.value`, `button`, `doubleClick`
`input_and_type`	Click a field, optionally clear it, then type text with per-character jitter	`selector`, `text`, `clearFirst`, `submit`
`drag_and_drop`	Drag from one point to another	`from`, `to`
`hover_element`	Hover to trigger CSS states / tooltips with randomized dwell time	`target`

Click behavior: doubleClick: true fires two rapid clicks at the same position. The element's own handlers determine focus/select behavior — the tool does not automatically select text or set focus beyond what the browser does natively.

Touch Gestures

Tool	Description	Key Parameters
`swipe`	Touch swipe from point A to B with configurable duration	`from`, `to`, `duration`
`long_press`	Long-press on an element or coordinates	`target`, `duration`
`pinch`	Pinch zoom with two-finger convergence/divergence	`center`, `startRadius`, `endRadius`
`double_tap`	Double-tap for mobile interactions (zoom, edit)	`target`, `interval`

Ensure the viewport is set to mobile with touch: true via resize_viewport or emulate_device before using touch gestures.

Screenshots

Tool	Description	Key Parameters
`screenshot_viewport`	Capture visible viewport	`format`, `quality`
`screenshot_full_page`	Capture full scrollable page	`format`, `quality`

Both return an image content block.

DevTools — Sources

Tool	Description	Key Parameters
`read_page_html`	Full outerHTML of the page	—
`read_stylesheets`	CSS stylesheet sources	`url` (filter)
`read_scripts`	JavaScript sources	`url` (filter)
`read_page_resources`	List all resources with URLs & sizes	`type` filter
`find_in_source`	Search regex pattern across HTML and all loaded scripts	`pattern`, `contextLines`

DevTools — Modify

Tool	Description	Key Parameters
`modify_html`	Live DOM mutation	`selector`, `action`, `value`, `attributeName`
`modify_css`	Set inline styles	`selector`, `styles` (object)

DevTools — Network

Tool	Description	Key Parameters
`get_network_logs`	Get collected request/response logs	`filter.urlPattern`, `filter.method`, `filter.statusCode`
`get_network_request_detail`	Full details of one request	`requestId`, `includeBody`
`clear_network_logs`	Clear collected logs	—
`get_network_errors`	Get only failed/errored requests (4xx, 5xx, failed)	`clear`

Network logs are collected from when the debugger attaches. Call any DevTools tool first to trigger attachment before the traffic you want to capture.

DevTools — Storage

Tool	Description
`get_cookies` / `set_cookie` / `delete_cookie`	Cookie CRUD
`get_local_storage` / `set_local_storage` / `remove_local_storage`	localStorage
`get_session_storage` / `set_session_storage` / `remove_session_storage`	sessionStorage

DevTools — Console

Tool	Description	Key Parameters
`get_console_logs`	Retrieve console messages by level	`level`, `clear`
`execute_javascript`	Run JS in the page and return result	`expression`

Accessibility & Auditing

Tool	Description	Key Parameters
`browser_snapshot`	Text-based accessibility tree snapshot	—
`get_accessibility_tree`	Raw accessibility tree as JSON	`selector` (optional filter)
`run_accessibility_audit`	Run axe-core WCAG audit	`selector`, `tags`
`check_color_contrast`	Check text contrast ratios	`selector`
`get_tab_order`	Static list of focusable elements in tab order with unique selectors	—
`record_focus_path`	Interactive — simulate Tab presses and record where focus lands, flagging invisible targets	`steps`
`get_performance_metrics`	Single snapshot of navigation timing, LCP, CLS, FID, memory	—
`record_performance_timeline`	Start/stop/get timeline recording of memory, LCP, CLS over time	`action`, `interval`
`check_font_loading`	Verify web font loading status	—
`audit_broken_resources`	Find broken images, stylesheets, fonts	—
`check_security_headers`	Inspect CSP, HSTS, X-Frame-Options, etc.	`url`
`detect_cookie_banners`	Detect cookie consent banners and CMP patterns	—

When to use get_tab_order vs record_focus_path: Use get_tab_order for a one-time snapshot of all focusable elements and their order. Use record_focus_path when you want to verify the actual focus behavior during keyboard navigation — it presses Tab repeatedly and records where focus lands, catching invisible or hidden focus traps.

Element Inspection

Tool	Description	Key Parameters
`find_element`	Find by text, role, aria-label, CSS, or XPath	`text`, `role`, `css`, `xpath`, `nth`
`get_element_state`	Get exists, visible, enabled, focused, checked, etc.	`selector`, `selectorType`
`query_shadow_dom`	Query inside a single shadow root	`hostSelector`, `innerSelector`
`deep_query_shadow_dom`	Query through nested shadow roots by host path	`hostPath[]`, `innerSelector`
`get_shadow_dom_tree`	Return full shadow DOM tree as JSON	`hostSelector`, `maxDepth`
`get_computed_styles`	Get computed CSS styles for an element	`selector`, `properties`, `pseudoElement`

QA & Assertions

Tool	Description	Key Parameters
`assert_element`	Assert conditions on an element (exists, visible, text, etc.)	`assertion`, `selector`
`assert_no_console_errors`	Assert zero console errors/warnings	`level`, `clear`
`assert_no_network_errors`	Assert zero failed network requests	`clear`
`assert_css_property`	Assert computed style value	`selector`, `property`, `expected`
`assert_network_request_made`	Assert a matching request was made	`urlPattern`, `method`, `minCount`
`assert_page_load_time`	Assert navigation timing is within threshold	`threshold`
`check_form_validity`	Check HTML5 form validation state	`selector`, `checkAll`
`tab_to_next`	Simulate Tab key and track focus movement	`direction`, `shift`
`set_input_files`	Set files on a file input	`selector`, `files`
`emulate_network_conditions`	Simulate slow/offline network	`offline`, `latency`, `throughput`
`intercept_requests`	Block, modify, or allow network requests	`action`, `urlPattern`
`snapshot_page_state`	Capture full page state (HTML, storage, cookies)	—
`restore_page_state`	Restore from a snapshot	`snapshot`
`wait_for_condition`	Poll a JS expression until true or timeout	`expression`, `timeout`, `interval`
`stress_test_refresh`	Refresh page N times and run assertion each time	`iterations`, `assertionScript`, `bypassCache`
`get_all_issues`	Unified diagnostic — console + network + resource errors in one call	`includeConsole`, `includeNetwork`, `includeResources`

Interaction Inspection

Tool	Description	Key Parameters
`hover_and_inspect`	Hover and capture DOM/style changes	`target`, `captureChanges`
`force_pseudo_state`	Force hover/focus/active and read computed styles	`selector`, `pseudoState`
`get_tooltip_text`	Extract tooltip text from title, aria-describedby, aria-labelledby, or CSS	`target`, `waitForTooltip`

Macro Recording

Tool	Description	Key Parameters
`start_recording_macro`	Start recording clicks, typing, and changes	—
`stop_recording_macro`	Stop recording and return events JSON	—
`replay_macro`	Replay recorded events with speed multiplier	`events`, `speed`

Visual Regression

Tool	Description	Key Parameters
`compare_screenshots`	Pixel-by-pixel diff of two base64 PNGs using pixelmatch	`beforeImage`, `afterImage`, `threshold`

Project Structure

browser-genie-mcp-server/
├── src/
│   ├── index.ts              # Entry point — stdio MCP transport
│   ├── server.ts             # McpServer setup & tool registration
│   ├── websocket-bridge.ts   # WebSocket server + request/response correlation
│   ├── auth.ts               # API key validation helper
│   ├── types.ts              # Shared types & constants (port, message shapes)
│   └── tools/                # One file per tool category
│       ├── navigation.ts
│       ├── tab-management.ts
│       ├── click.ts
│       ├── input.ts
│       ├── keyboard.ts
│       ├── hover.ts
│       ├── drag-drop.ts
│       ├── screenshot.ts
│       ├── gestures.ts
│       ├── macros.ts
│       ├── visual-regression.ts
│       ├── devtools-sources.ts
│       ├── devtools-modify.ts
│       ├── devtools-network.ts
│       ├── devtools-storage.ts
│       ├── devtools-console.ts
│       ├── accessibility.ts
│       ├── emulation.ts
│       ├── elements.ts
│       ├── audit.ts
│       ├── interaction.ts
│       ├── monitoring.ts
│       └── qa.ts
├── dist/                     # Compiled output (after npm run build)
├── package.json
├── tsconfig.json
├── .gitignore
└── LICENSE

Development

# Watch mode — recompiles on every file save
npm run dev

# One-shot build
npm run build

# Run the compiled server directly
npm start

Important Notes

Debugger Banner

When any DevTools feature is first used on a tab, Chrome shows an "Extension is debugging this browser" banner. This is a Chrome security requirement and cannot be suppressed. The debugger attaches lazily — only when a DevTools tool is first called for that tab.

Service Worker Lifecycle

Chrome MV3 service workers terminate after ~30 seconds of inactivity. The extension uses chrome.alarms to keep the WebSocket alive, with automatic exponential-backoff reconnection (1 s → 2 s → 4 s → … → 30 s max).

Human-Like Interactions

All mouse and keyboard interactions include randomized delays and natural movement patterns to avoid bot detection. Click uses Bézier curves, typing has per-character jitter, and hover has randomized dwell time.

Contributing

Contributions are welcome! Please open an issue first to discuss what you'd like to change, then submit a pull request.

Fork the repository
Create a feature branch: git checkout -b feat/my-feature
Commit your changes: git commit -m 'feat: add my feature'
Push and open a Pull Request

License

Apache License 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json

Folders and files

Latest commit

History

Repository files navigation

BrowserGenie MCP Server

How It Works

Requirements

Installation

Option A — npx (recommended)

Option B — Run from source

Configuration

Claude Desktop

Claude Code

Cursor / other MCP clients

Environment Variables

Verifying the Connection

MCP Tools Reference

Navigation

Tab Management

Keyboard

Mouse & Interaction

Touch Gestures

Screenshots

DevTools — Sources

DevTools — Modify

DevTools — Network

DevTools — Storage

DevTools — Console

Accessibility & Auditing

Element Inspection

QA & Assertions

Interaction Inspection

Macro Recording

Visual Regression

Project Structure

Development

Important Notes

Debugger Banner

Service Worker Lifecycle

Human-Like Interactions

Contributing

Related

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages