A Model Context Protocol (MCP) server implementation for Selenium WebDriver, allowing AI agents to control web browsers.
npm install -g @angiejones/mcp-selenium
Start the server by running:
mcp-selenium
Or use with NPX in your MCP configuration:
{
"mcpServers": {
"selenium": {
"command": "npx",
"args": [
"-y",
"@angiejones/mcp-selenium"
]
}
}
}
Launches a browser session.
Parameters:
browser
(required): Browser to launch- Type: string
- Enum: ["chrome", "firefox"]
options
: Browser configuration options- Type: object
- Properties:
headless
: Run browser in headless mode- Type: boolean
arguments
: Additional browser arguments- Type: array of strings
Example:
{
"tool": "start_browser",
"parameters": {
"browser": "chrome",
"options": {
"headless": true,
"arguments": ["--no-sandbox"]
}
}
}
Navigates to a URL.
Parameters:
url
(required): URL to navigate to- Type: string
Example:
{
"tool": "navigate",
"parameters": {
"url": "https://www.example.com"
}
}
Finds an element on the page.
Parameters:
by
(required): Locator strategy- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the locator strategy- Type: string
timeout
: Maximum time to wait for element in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "find_element",
"parameters": {
"by": "id",
"value": "search-input",
"timeout": 5000
}
}
Clicks an element.
Parameters:
by
(required): Locator strategy- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the locator strategy- Type: string
timeout
: Maximum time to wait for element in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "click_element",
"parameters": {
"by": "css",
"value": ".submit-button"
}
}
Sends keys to an element (typing).
Parameters:
by
(required): Locator strategy- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the locator strategy- Type: string
text
(required): Text to enter into the element- Type: string
timeout
: Maximum time to wait for element in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "send_keys",
"parameters": {
"by": "name",
"value": "username",
"text": "testuser"
}
}
Gets the text() of an element.
Parameters:
by
(required): Locator strategy- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the locator strategy- Type: string
timeout
: Maximum time to wait for element in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "get_element_text",
"parameters": {
"by": "css",
"value": ".message"
}
}
Moves the mouse to hover over an element.
Parameters:
by
(required): Locator strategy- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the locator strategy- Type: string
timeout
: Maximum time to wait for element in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "hover",
"parameters": {
"by": "css",
"value": ".dropdown-menu"
}
}
Drags an element and drops it onto another element.
Parameters:
by
(required): Locator strategy for source element- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the source locator strategy- Type: string
targetBy
(required): Locator strategy for target element- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
targetValue
(required): Value for the target locator strategy- Type: string
timeout
: Maximum time to wait for elements in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "drag_and_drop",
"parameters": {
"by": "id",
"value": "draggable",
"targetBy": "id",
"targetValue": "droppable"
}
}
Performs a double click on an element.
Parameters:
by
(required): Locator strategy- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the locator strategy- Type: string
timeout
: Maximum time to wait for element in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "double_click",
"parameters": {
"by": "css",
"value": ".editable-text"
}
}
Performs a right click (context click) on an element.
Parameters:
by
(required): Locator strategy- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the locator strategy- Type: string
timeout
: Maximum time to wait for element in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "right_click",
"parameters": {
"by": "css",
"value": ".context-menu-trigger"
}
}
Simulates pressing a keyboard key.
Parameters:
key
(required): Key to press (e.g., 'Enter', 'Tab', 'a', etc.)- Type: string
Example:
{
"tool": "press_key",
"parameters": {
"key": "Enter"
}
}
Uploads a file using a file input element.
Parameters:
by
(required): Locator strategy- Type: string
- Enum: ["id", "css", "xpath", "name", "tag", "class"]
value
(required): Value for the locator strategy- Type: string
filePath
(required): Absolute path to the file to upload- Type: string
timeout
: Maximum time to wait for element in milliseconds- Type: number
- Default: 10000
Example:
{
"tool": "upload_file",
"parameters": {
"by": "id",
"value": "file-input",
"filePath": "/path/to/file.pdf"
}
}
Captures a screenshot of the current page.
Parameters:
outputPath
(optional): Path where to save the screenshot. If not provided, returns base64 data.- Type: string
Example:
{
"tool": "take_screenshot",
"parameters": {
"outputPath": "/path/to/screenshot.png"
}
}
- Chrome
- Firefox
- Start browser sessions with customizable options
- Navigate to URLs
- Find elements using various locator strategies
- Click, type, and interact with elements
- Perform mouse actions (hover, drag and drop)
- Handle keyboard input
- Take screenshots
- Upload files
- Support for headless mode
To work on this project:
- Clone the repository
- Install dependencies:
npm install
- Run the server:
npm start
MIT