An LM Studio plugin that gives LLMs browser automation capabilities through Selenium WebDriver.
This plugin enables large language models in LM Studio to control web browsers (Chrome, Firefox, Edge, Safari) through a set of powerful automation tools. It provides full browser interaction capabilities including navigation, element finding, clicking, typing, scrolling, and more.
- Multi-Browser Support - Chrome, Firefox, Edge, and Safari
- Tab Management - Open, close, and switch between multiple tabs
- Element Interaction - Click, type, find elements by various selectors
- Console Logs - Read browser developer console logs
- Screenshots - Take screenshots for AI vision analysis
- JavaScript Execution - Run custom JavaScript on pages
- Navigation - Navigate, go back/forward, refresh pages
| Tool | Description |
|---|---|
| Browser Start | Launch a browser instance (chrome/firefox/edge/safari) with optional driver and binary paths |
| Browser Close | Close a browser session |
| Browser Navigate | Navigate to a specific URL |
| Browser New Tab | Open a new tab (optionally with URL) |
| Browser Close Tab | Close the current tab |
| Browser Switch Tab | Switch to a specific tab by index |
| Browser Get Tabs | List all open tabs with titles and URLs |
| Browser Scroll | Scroll page by pixels or to element |
| Browser Click | Click at specific coordinates |
| Browser Click Element | Click an element by CSS/XPath/ID |
| Browser Type | Type text into element or at cursor |
| Browser Find Element | Find elements by selector |
| Browser Find Text | Search for text on page |
| Browser Get Logs | Get browser console logs |
| Browser Screenshot | Save screenshot to file |
| Browser Read Screenshot | Get screenshot as base64 for AI vision |
| Browser Execute JS | Execute custom JavaScript |
| Browser Get Page Source | Get page HTML |
| Browser Get Info | Get page title and URL |
| Browser Wait | Wait for specified milliseconds |
| Browser Refresh | Refresh the current page |
| Browser Go Back/Forward | Navigate browser history |
| Browser List | List all active browser sessions |
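
The selector-based tools (Browser Click Element, Browser Find Element, Browser Type) accept CSS, XPath, or ID selectors. As a rough illustration of how a `selectorType` value might be turned into a WebDriver locator, here is a minimal sketch. The `Locator` type and `toLocator` function are hypothetical and are not taken from the plugin's source. The strategy names `"css selector"` and `"xpath"` come from the W3C WebDriver specification, which has no dedicated ID strategy, so the sketch expresses ID lookups as a CSS attribute selector.

```typescript
// Hypothetical helper for illustration; not the plugin's actual code.
type SelectorType = "css" | "xpath" | "id";

interface Locator {
  using: "css selector" | "xpath"; // W3C WebDriver location strategy
  value: string;                   // selector passed to the driver
}

function toLocator(selector: string, selectorType: SelectorType = "css"): Locator {
  switch (selectorType) {
    case "css":
      return { using: "css selector", value: selector };
    case "xpath":
      return { using: "xpath", value: selector };
    case "id": {
      // Escape backslashes and double quotes so the ID is safe inside the quoted value.
      const escaped = selector.replace(/(["\\])/g, "\\$1");
      return { using: "css selector", value: `[id="${escaped}"]` };
    }
    default:
      // Guards against untyped input, e.g. arguments produced by a model.
      throw new Error(`Unknown selectorType: ${String(selectorType)}`);
  }
}
```

Defaulting to `"css"` here is an assumption made for the sketch; check the tool's parameter description for the plugin's real default.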
```bash
# Install dependencies
npm install
```

This plugin supports multiple ways to configure browsers:
If no driver path is specified, the plugin automatically downloads the latest compatible WebDriver:
- Chrome: ChromeDriver (latest version matching your Chrome)
- Firefox: GeckoDriver (latest version)
- Edge: EdgeDriver (latest version matching your Edge)
- Safari: WebDriver (built into Safari)
You can specify a custom path to the browser driver:
```
Browser Start: { browser: "chrome", driverPath: "/path/to/chromedriver" }
```
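
The choice between an automatically downloaded driver and a custom `driverPath` can be pictured as a small decision function. The sketch below is illustrative only: `resolveDriver`, `DriverSource`, and `DRIVER_NAMES` are hypothetical names, and the rule that an explicit path always wins (even for Safari) is an assumption rather than documented plugin behavior.

```typescript
// Hypothetical sketch; names are illustrative, not the plugin's API.
type BrowserName = "chrome" | "firefox" | "edge" | "safari";

type DriverSource =
  | { kind: "custom"; path: string }   // user supplied driverPath
  | { kind: "builtin" }                // Safari ships its own WebDriver
  | { kind: "auto"; driver: string };  // a matching driver is downloaded

// Driver names as listed in this README.
const DRIVER_NAMES: Record<Exclude<BrowserName, "safari">, string> = {
  chrome: "ChromeDriver",
  firefox: "GeckoDriver",
  edge: "EdgeDriver",
};

function resolveDriver(browser: BrowserName, driverPath?: string): DriverSource {
  // Assumption: an explicit path takes priority over any automatic behavior.
  if (driverPath) return { kind: "custom", path: driverPath };
  if (browser === "safari") return { kind: "builtin" };
  return { kind: "auto", driver: DRIVER_NAMES[browser] };
}
```

For example, `resolveDriver("firefox")` yields an `auto` source naming GeckoDriver, while `resolveDriver("chrome", "/path/to/chromedriver")` yields a `custom` source carrying that path.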
You can also specify a custom path to the browser executable:
```
Browser Start: { browser: "firefox", binaryPath: "/usr/bin/firefox" }
Browser Start: { browser: "chrome", binaryPath: "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe" }
```

```
Browser Start: {
  browser: "chrome",
  headless: false,
  driverPath: "/path/to/chromedriver",
  binaryPath: "/usr/bin/google-chrome",
  url: "https://example.com"
}
```
By default, the plugin uses Firefox if no browser is specified:
```
Browser Start: { }
# Equivalent to:
Browser Start: { browser: "firefox" }
```
During development, use the LM Studio CLI to run the plugin in development mode:
```bash
npm run dev
# or directly:
lms dev
```

This command runs `lms dev`, which:
- Compiles TypeScript to JavaScript
- Starts the plugin in LM Studio's development mode
- Enables hot-reloading for quick iteration
- Allows testing the plugin directly within LM Studio
When ready to share your plugin, push it to the LM Studio Hub:
```bash
npm run push
# or directly:
lms push
```

This command runs `lms push`, which:
- Builds the plugin for production
- Publishes the plugin to the LM Studio registry
- Makes it available for other users to install
Note: Before pushing, ensure your `manifest.json` has the correct owner and name values configured.
- Node.js
- LM Studio
- Selenium WebDriver
- Browser drivers (ChromeDriver, GeckoDriver, EdgeDriver)
```
# Start Firefox (default)
Browser Start: { }

# Start a Chrome browser
Browser Start: { browser: "chrome", url: "https://example.com" }

# Start Firefox with custom binary path
Browser Start: { browser: "firefox", binaryPath: "/usr/bin/firefox" }

# Start Chrome with custom driver and binary
Browser Start: { browser: "chrome", driverPath: "/path/to/chromedriver", binaryPath: "/usr/bin/google-chrome" }

# Start in headless mode
Browser Start: { browser: "edge", headless: true, url: "https://example.com" }

# Find and click a button
Browser Click Element: { selector: "#submit-button", selectorType: "css" }

# Type into a search box
Browser Type: { selector: "input[name='search']", text: "query", pressEnter: true }

# Take a screenshot for AI analysis
Browser Read Screenshot: { }

# Get console logs
Browser Get Logs: { logLevel: "all" }

# Execute custom JavaScript
Browser Execute JS: { script: "return document.title;" }
```
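
Browser Read Screenshot returns the image as base64 so that a vision-capable model can inspect it. If you need to pass that result to something that expects an image URL, a data URL is one common wrapping. The sketch below assumes the tool returns a bare base64-encoded PNG string (Selenium's own screenshot call produces base64 PNG data); the exact shape of the plugin's return value is not documented here, and `toPngDataUrl` is a hypothetical helper.

```typescript
// Hypothetical helper; assumes the input is raw base64 PNG data with no prefix.
function toPngDataUrl(base64Png: string): string {
  // Drop any whitespace or line breaks that may have been introduced in transit.
  const compact = base64Png.replace(/\s+/g, "");
  // Standard base64 alphabet, optional padding, length a multiple of four.
  const looksValid = /^[A-Za-z0-9+/]+={0,2}$/.test(compact) && compact.length % 4 === 0;
  if (!looksValid) {
    throw new Error("Input does not look like base64 data");
  }
  return `data:image/png;base64,${compact}`;
}
```

The check is deliberately shallow: it confirms the string is plausibly base64 but does not verify that the decoded bytes are a valid PNG.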
| Option | Type | Default | Description |
|---|---|---|---|
| `browser` | string | "firefox" | Browser to use: chrome, firefox, edge, or safari |
| `headless` | boolean | false | Run browser without visible window |
| `url` | string | - | Initial URL to navigate to |
| `driverPath` | string | auto | Path to WebDriver executable. If not specified, latest driver is downloaded automatically |
| `binaryPath` | string | system default | Path to browser binary/executable |
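
For readers who think in types, the option table can be restated as a TypeScript interface with the documented defaults applied. This is only a sketch: `BrowserStartOptions`, `SupportedBrowser`, and `withDefaults` are illustrative names, and the plugin's real definitions in `src/` may differ.

```typescript
// Illustrative types; the plugin's real definitions may differ.
type SupportedBrowser = "chrome" | "firefox" | "edge" | "safari";

interface BrowserStartOptions {
  browser?: SupportedBrowser; // defaults to "firefox"
  headless?: boolean;         // defaults to false
  url?: string;               // no default: no initial navigation
  driverPath?: string;        // omitted: driver is downloaded automatically
  binaryPath?: string;        // omitted: system default browser binary
}

type ResolvedStartOptions = BrowserStartOptions & {
  browser: SupportedBrowser;
  headless: boolean;
};

const SUPPORTED_BROWSERS: readonly SupportedBrowser[] = ["chrome", "firefox", "edge", "safari"];

function withDefaults(options: BrowserStartOptions = {}): ResolvedStartOptions {
  const browser: SupportedBrowser = options.browser === undefined ? "firefox" : options.browser;
  // Runtime check, since tool arguments produced by a model are not type-checked.
  if (SUPPORTED_BROWSERS.indexOf(browser) === -1) {
    throw new Error(`Unsupported browser: ${String(browser)}`);
  }
  const headless = options.headless === undefined ? false : options.headless;
  return { ...options, browser, headless };
}
```

With these definitions, `withDefaults()` resolves to Firefox with a visible window, matching the empty `Browser Start: { }` call.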
```
├── src/
│   ├── index.ts           # Plugin entry point
│   ├── config.ts          # Plugin configuration
│   └── toolsProvider.ts   # Browser automation tools implementation
├── manifest.json          # Plugin manifest
├── package.json           # NPM package configuration
├── tsconfig.json          # TypeScript configuration
└── thumbnail.png          # Plugin thumbnail
```
MIT
Egor Levin