A lightweight MCP server that exposes the CAMEL framework's HybridBrowserToolkit as MCP-compatible tools.
This project provides an MCP (Model Context Protocol) interface for CAMEL's HybridBrowserToolkit, enabling browser automation through a standardized protocol. It allows LLM-based applications to control web browsers, navigate pages, interact with elements, and capture screenshots.
Key features:
- Full browser automation capabilities (click, type, navigate, etc.)
- Screenshot capture with visual element identification
- Multi-tab management
- JavaScript execution in browser console
- Async operation support
You can install the package directly from source:
git clone https://github.com/yourusername/hybrid-browser-mcp.git
cd hybrid-browser-mcp
pip install -e .
Or using pip:
pip install hybrid-browser-mcp
To use this MCP server with Claude Desktop, add it to your configuration file.
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
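If you want to locate the file from a script, the three locations above can be resolved with a short helper. This is only a sketch: the function name `claude_config_path` is made up for illustration, and the Windows fallback to `AppData/Roaming` when `%APPDATA%` is unset is an assumption.

```python
import os
import sys
from pathlib import Path


def claude_config_path(platform: str = sys.platform) -> Path:
    """Return the expected claude_desktop_config.json location for a platform."""
    if platform == "darwin":  # macOS
        base = Path.home() / "Library" / "Application Support" / "Claude"
    elif platform.startswith("win"):  # Windows
        # Assumption: fall back to the usual Roaming folder if %APPDATA% is unset.
        appdata = os.environ.get("APPDATA")
        root = Path(appdata) if appdata else Path.home() / "AppData" / "Roaming"
        base = root / "Claude"
    else:  # Linux and other Unix-like systems
        base = Path.home() / ".config" / "Claude"
    return base / "claude_desktop_config.json"


if __name__ == "__main__":
    path = claude_config_path()
    print(f"{path} (exists: {path.exists()})")
```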
Add the following to your claude_desktop_config.json:
{
  "mcpServers": {
    "hybrid-browser": {
      "command": "python",
      "args": [
        "-m",
        "hybrid_browser_mcp.server"
      ]
    }
  }
}
Make sure to:
- Use the correct path to your Python interpreter (you can find it with which python)
- Ensure the package is installed in that Python environment
- Restart Claude Desktop completely after updating the configuration
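If the configuration file already lists other servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an in-memory dictionary; `add_hybrid_browser` and the `other-server` entry are hypothetical names used only for illustration, and the snippet deliberately does not touch the real file.

```python
import json


def add_hybrid_browser(config: dict, python_cmd: str = "python") -> dict:
    """Return a copy of config with the hybrid-browser server registered."""
    merged = dict(config)
    # Copy the existing server map so other entries are preserved.
    servers = dict(merged.get("mcpServers", {}))
    servers["hybrid-browser"] = {
        "command": python_cmd,
        "args": ["-m", "hybrid_browser_mcp.server"],
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # "other-server" is a made-up entry standing in for an existing setup.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_hybrid_browser(existing), indent=2))
```

Pass the full interpreter path as `python_cmd` if plain `python` does not resolve to the environment where the package is installed.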
After restarting Claude Desktop:
- Click the 🔌 (plug icon) in the conversation interface
- You should see "hybrid-browser" listed among available tools
- The browser automation tools will be available (browser_open, browser_click, etc.)
Configuration success example (screenshot): claude_desktop_config.json with the hybrid-browser MCP server configured.
Browser tools in action (screenshot): using browser automation tools in Claude Desktop to interact with web pages.
The browser behavior is configured through hybrid_browser_mcp/config.py. You can modify this file to customize the browser settings:
BROWSER_CONFIG = {
    "headless": False,        # Run browser in headless mode
    "stealth": True,          # Enable stealth mode
    "viewport_limit": False,  # Include all elements in snapshots
    "cache_dir": "tmp/",      # Cache directory for screenshots
    "enabled_tools": [        # List of enabled browser tools
        "browser_open", "browser_close", "browser_visit_page",
        "browser_back", "browser_forward", "browser_get_som_screenshot",
        "browser_click", "browser_type", "browser_select",
        "browser_scroll", "browser_enter", "browser_mouse_control",
        "browser_mouse_drag", "browser_press_key", "browser_switch_tab",
        # Uncomment to enable additional tools:
        # "browser_get_page_snapshot",
        # "browser_close_tab",
        # "browser_console_view",
        # "browser_console_exec",
    ],
}
Option | Description | Default | Type |
---|---|---|---|
headless | Run browser in headless mode (no window) | False | bool |
stealth | Enable stealth mode to avoid detection | False | bool |
viewport_limit | Only include elements in current viewport in snapshots | False | bool |
cache_dir | Directory for storing cache files | "tmp/" | str |
enabled_tools | List of enabled tools | None * | list or None |
*When enabled_tools is None, these default tools are enabled: browser_open, browser_close, browser_visit_page, browser_back, browser_forward, browser_click, browser_type, browser_switch_tab
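The fallback described in the note above can be pictured with a small function. This is a minimal sketch of the rule as documented here, not the toolkit's actual code; `DEFAULT_TOOLS` and `resolve_enabled_tools` are hypothetical names.

```python
from typing import List

# Default tool set listed in the note above.
DEFAULT_TOOLS = [
    "browser_open", "browser_close", "browser_visit_page",
    "browser_back", "browser_forward", "browser_click",
    "browser_type", "browser_switch_tab",
]


def resolve_enabled_tools(config: dict) -> List[str]:
    """Return the tools that would be enabled for a given config dict."""
    tools = config.get("enabled_tools")
    if tools is None:
        # Missing or None: fall back to the documented defaults.
        return list(DEFAULT_TOOLS)
    return list(tools)


if __name__ == "__main__":
    print(resolve_enabled_tools({"headless": True}))
    print(resolve_enabled_tools({"enabled_tools": ["browser_open", "browser_close"]}))
```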
1. Headless mode for automation:
USER_BROWSER_CONFIG = {
    "headless": True,
}
2. Stealth mode with visible browser:
USER_BROWSER_CONFIG = {
    "headless": False,
    "stealth": True,
}
3. Limited tools for safety:
USER_BROWSER_CONFIG = {
    "enabled_tools": [
        "browser_open",
        "browser_visit_page",
        "browser_get_page_snapshot",
        "browser_close",
    ],
}
4. Enable all available tools:
USER_BROWSER_CONFIG = {
    "enabled_tools": [
        "browser_open", "browser_close", "browser_visit_page",
        "browser_back", "browser_forward", "browser_get_page_snapshot",
        "browser_get_som_screenshot", "browser_click", "browser_type",
        "browser_select", "browser_scroll", "browser_enter",
        "browser_switch_tab", "browser_close_tab", "browser_get_tab_info",
        "browser_mouse_control", "browser_mouse_drag", "browser_press_key",
        "browser_wait_user", "browser_console_view", "browser_console_exec",
    ],
}
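A typo in a tool name or a wrongly typed option is easy to miss when editing these dictionaries by hand. The following sketch checks a config against the option types in the table and the tool names from the "all available tools" example. `validate_config` is a hypothetical helper, not part of the package, and it assumes that list of tools is complete.

```python
from typing import List

# Tool names taken from the "Enable all available tools" example.
ALL_TOOLS = {
    "browser_open", "browser_close", "browser_visit_page",
    "browser_back", "browser_forward", "browser_get_page_snapshot",
    "browser_get_som_screenshot", "browser_click", "browser_type",
    "browser_select", "browser_scroll", "browser_enter",
    "browser_switch_tab", "browser_close_tab", "browser_get_tab_info",
    "browser_mouse_control", "browser_mouse_drag", "browser_press_key",
    "browser_wait_user", "browser_console_view", "browser_console_exec",
}

# Option types taken from the options table.
EXPECTED_TYPES = {"headless": bool, "stealth": bool, "viewport_limit": bool, "cache_dir": str}


def validate_config(config: dict) -> List[str]:
    """Return human-readable problems; an empty list means the config looks fine."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key in config and not isinstance(config[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    tools = config.get("enabled_tools")
    if tools is not None:
        if not isinstance(tools, list):
            problems.append("enabled_tools should be a list or None")
        else:
            for name in tools:
                if name not in ALL_TOOLS:
                    problems.append(f"unknown tool: {name}")
    return problems


if __name__ == "__main__":
    print(validate_config({"headless": "yes", "enabled_tools": ["browser_open", "browser_fly"]}))
```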
The server exposes the following browser control tools:
- browser_open(): Opens a new browser session
- browser_close(): Closes the browser session
- browser_visit_page(url): Navigates to a specific URL
- browser_back(): Goes back in browser history
- browser_forward(): Goes forward in browser history

- browser_click(ref): Clicks on an element by its reference ID
- browser_type(ref, text, inputs): Types text into input fields
- browser_select(ref, value): Selects an option in a dropdown
- browser_scroll(direction, amount): Scrolls the page
- browser_enter(): Presses the Enter key
- browser_press_key(keys): Presses specific keyboard keys

- browser_get_page_snapshot(): Gets a textual snapshot of interactive elements
- browser_get_som_screenshot(read_image, instruction): Captures a screenshot with element annotations
- list_browser_functions(): Lists all available browser functions

- browser_switch_tab(tab_id): Switches to a different browser tab
- browser_close_tab(tab_id): Closes a specific tab
- browser_get_tab_info(): Gets information about all open tabs

- browser_console_view(): Views console logs
- browser_console_exec(code): Executes JavaScript in the browser console
- browser_mouse_control(control, x, y): Controls mouse actions at coordinates
- browser_mouse_drag(from_ref, to_ref): Drags elements
- browser_wait_user(timeout_sec): Waits for user input
# Open browser and navigate
await browser_open()
await browser_visit_page("https://www.google.com")
# Get page snapshot to see available elements
snapshot = await browser_get_page_snapshot()
print(snapshot)
# Interact with elements
await browser_type(ref="search-input", text="CAMEL AI framework")
await browser_enter()
# Take a screenshot
await browser_get_som_screenshot()
# Close browser
await browser_close()
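On the wire, an MCP client sends each of these calls as a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds such request payloads to show their shape; it does not connect to the server, and `tool_call_request` is a hypothetical helper name.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # JSON-RPC request ids, incremented per request


def tool_call_request(name: str, arguments: Optional[dict] = None) -> dict:
    """Build the JSON-RPC 2.0 payload for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


if __name__ == "__main__":
    for request in (
        tool_call_request("browser_open"),
        tool_call_request("browser_visit_page", {"url": "https://www.google.com"}),
        tool_call_request("browser_close"),
    ):
        print(json.dumps(request))
```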
The server works by:
- Wrapping CAMEL's HybridBrowserToolkit with async support
- Exposing toolkit methods as MCP-compatible tools
- Managing a singleton browser instance per session
- Handling WebSocket communication for real-time browser control
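The singleton-plus-async-wrapper idea from the list above can be illustrated in a few lines. This is a sketch under stated assumptions, not the server's real code: `FakeToolkit` is a stand-in for HybridBrowserToolkit so the example runs without a browser, and `get_toolkit` is a hypothetical accessor.

```python
import asyncio
from typing import Optional


class FakeToolkit:
    """Stand-in for HybridBrowserToolkit (assumption: real methods are async)."""

    def __init__(self, **config):
        self.config = config

    async def browser_open(self) -> str:
        return "opened"


_toolkit: Optional[FakeToolkit] = None


async def get_toolkit(**config) -> FakeToolkit:
    """Create the toolkit on first use and reuse it afterwards."""
    global _toolkit
    if _toolkit is None:
        # There is no await between the check and the assignment, so tasks
        # sharing one event loop cannot create two instances.
        _toolkit = FakeToolkit(**config)
    return _toolkit


async def browser_open() -> str:
    """Tool wrapper: delegate to the shared toolkit instance."""
    toolkit = await get_toolkit(headless=True)
    return await toolkit.browser_open()


async def _demo() -> None:
    first, second = await asyncio.gather(get_toolkit(), get_toolkit())
    print("same instance:", first is second)
    print(await browser_open())


if __name__ == "__main__":
    asyncio.run(_demo())
```

Every tool wrapper goes through the same accessor, which is what lets separate tool calls act on one browser session.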
To set up a development environment:
pip install -e ".[dev]"
Run tests:
pytest
- Check if the package is installed correctly:
  which hybrid-browser-mcp  # Should output the path to the executable
- Test the server manually:
  hybrid-browser-mcp  # Should start without errors; press Ctrl+C to stop
- Check Claude Desktop logs for errors:
  # macOS
  tail -f ~/Library/Logs/Claude/mcp*.log
  # Windows
  Get-Content "$env:APPDATA\Claude\logs\mcp*.log" -Tail 20 -Wait
- Verify the configuration file:
  # macOS
  cat ~/Library/Application\ Support/Claude/claude_desktop_config.json
  # Windows
  type %APPDATA%\Claude\claude_desktop_config.json
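Reading the configuration file by eye can miss a stray comma or a command that is not on the PATH. The sketch below automates the last check for a config passed in as a string; `check_config` is a hypothetical helper, and the checks it performs are an assumption about what commonly goes wrong rather than an exhaustive list.

```python
import json
import shutil
import sys
from typing import List


def check_config(text: str) -> List[str]:
    """Return a list of problems found in a claude_desktop_config.json string."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(config, dict):
        return ["top level should be a JSON object"]
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or "hybrid-browser" not in servers:
        return ['no "hybrid-browser" entry under "mcpServers"']
    entry = servers["hybrid-browser"]
    command = entry.get("command") if isinstance(entry, dict) else None
    problems = []
    if not isinstance(command, str) or not command:
        problems.append('"command" is missing')
    elif shutil.which(command) is None:
        # shutil.which accepts both bare names and full paths.
        problems.append(f"command not found or not executable: {command}")
    return problems


if __name__ == "__main__":
    sample = json.dumps({
        "mcpServers": {
            "hybrid-browser": {
                "command": sys.executable,
                "args": ["-m", "hybrid_browser_mcp.server"],
            }
        }
    })
    print(check_config(sample) or "no problems found")
```

To check the real file, read its contents and pass the text to `check_config`.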
Solution: Use the full path to your Python interpreter in your configuration (replace /usr/bin/python3 with your own path):
{
  "mcpServers": {
    "hybrid-browser": {
      "command": "/usr/bin/python3",
      "args": ["-m", "hybrid_browser_mcp.server"]
    }
  }
}
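One way to get the right path is to ask the interpreter itself. Run from the environment where the package is installed, the sketch below prints a server entry that uses `sys.executable` as the command; `server_entry` is a hypothetical helper name.

```python
import json
import sys


def server_entry(python_path: str = sys.executable) -> dict:
    """Build the mcpServers block using an explicit interpreter path."""
    return {
        "mcpServers": {
            "hybrid-browser": {
                "command": python_path,
                "args": ["-m", "hybrid_browser_mcp.server"],
            }
        }
    }


if __name__ == "__main__":
    # Prints JSON that can be merged into claude_desktop_config.json.
    print(json.dumps(server_entry(), indent=2))
```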
Solution: The HybridBrowserToolkit uses a TypeScript-based browser controller that runs on Node.js. It will automatically download and manage browser binaries. If you encounter issues:
- Ensure Node.js is installed on your system
- The TypeScript server will start automatically when needed
- Browser binaries will be downloaded on first use
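To confirm the first point, you can check from Python whether a `node` executable is visible on the PATH. This is a small sketch; `node_version` is a hypothetical helper, and it only reports what `node --version` prints without judging whether that version is recent enough.

```python
import shutil
import subprocess
from typing import Optional


def node_version() -> Optional[str]:
    """Return the output of `node --version`, or None if Node.js is not usable."""
    node = shutil.which("node")
    if node is None:
        return None
    try:
        result = subprocess.run(
            [node, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = node_version()
    print(f"Node.js found: {version}" if version else "Node.js not found on PATH")
```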
To see detailed logs, you can run the server with debug output:
python -m hybrid_browser_mcp.server 2> debug.log
Then check debug.log for any error messages.
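For a long log, a quick filter can surface the lines worth reading first. The sketch below is only a rough aid: the log format is not documented here, so the keywords it looks for are a guess, and `suspicious_lines` is a hypothetical helper.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed markers of trouble; not a documented log format.
KEYWORDS = ("error", "traceback", "exception")


def suspicious_lines(lines: Iterable[str]) -> List[str]:
    """Return lines that contain any keyword, compared case-insensitively."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(keyword in line.lower() for keyword in KEYWORDS)
    ]


if __name__ == "__main__":
    log = Path("debug.log")
    if log.exists():
        with log.open(encoding="utf-8", errors="replace") as handle:
            for line in suspicious_lines(handle):
                print(line)
    else:
        print("debug.log not found; run the server with 2> debug.log first")
```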
This project is licensed under the MIT License - see the LICENSE file for details.