LevPro/lm_studio_selenium

LM Studio Browser Automation Plugin

An LM Studio plugin that gives LLMs browser automation capabilities through Selenium WebDriver.

Overview

This plugin lets large language models in LM Studio control web browsers (Chrome, Firefox, Edge, Safari) through a set of automation tools. The tools cover navigation, finding elements, clicking, typing, scrolling, tab management, screenshots, console logs, and JavaScript execution.

Features

  • Multi-Browser Support - Chrome, Firefox, Edge, and Safari
  • Tab Management - Open, close, and switch between multiple tabs
  • Element Interaction - Click, type, and find elements by CSS selector, XPath, or ID
  • Console Logs - Read browser developer console logs
  • Screenshots - Take screenshots for AI vision analysis
  • JavaScript Execution - Run custom JavaScript on pages
  • Navigation - Navigate, go back/forward, refresh pages
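The Tab Management feature switches tabs by index. Below is a minimal sketch of how an index might be resolved to a WebDriver window handle. It assumes zero-based indices, which this README does not confirm, and handleForTabIndex is a hypothetical helper, not a function taken from the plugin's source.

```typescript
// Hypothetical helper (not from the plugin): maps a tab index to a window handle.
// Assumption: indices are zero-based and follow the order of the handle list.
function handleForTabIndex(handles: readonly string[], index: number): string {
  if (!Number.isInteger(index)) {
    throw new RangeError(`Tab index must be an integer, got ${index}`);
  }
  const handle = handles[index];
  if (index < 0 || handle === undefined) {
    throw new RangeError(
      `Tab index ${index} is out of range; ${handles.length} tab(s) are open`
    );
  }
  return handle;
}
```

With selenium-webdriver, the handle list would typically come from driver.getAllWindowHandles() and the result would be passed to driver.switchTo().window(handle). Whether the plugin does exactly this is an assumption.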

Available Tools

| Tool | Description |
| --- | --- |
| Browser Start | Launch a browser instance (chrome/firefox/edge/safari) with optional driver and binary paths |
| Browser Close | Close a browser session |
| Browser Navigate | Navigate to a specific URL |
| Browser New Tab | Open a new tab (optionally with URL) |
| Browser Close Tab | Close the current tab |
| Browser Switch Tab | Switch to a specific tab by index |
| Browser Get Tabs | List all open tabs with titles and URLs |
| Browser Scroll | Scroll page by pixels or to element |
| Browser Click | Click at specific coordinates |
| Browser Click Element | Click an element by CSS/XPath/ID |
| Browser Type | Type text into element or at cursor |
| Browser Find Element | Find elements by selector |
| Browser Find Text | Search for text on page |
| Browser Get Logs | Get browser console logs |
| Browser Screenshot | Save screenshot to file |
| Browser Read Screenshot | Get screenshot as base64 for AI vision |
| Browser Execute JS | Execute custom JavaScript |
| Browser Get Page Source | Get page HTML |
| Browser Get Info | Get page title and URL |
| Browser Wait | Wait for specified milliseconds |
| Browser Refresh | Refresh the current page |
| Browser Go Back/Forward | Navigate browser history |
| Browser List | List all active browser sessions |
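Several tools take a selector plus a selector type (CSS, XPath, or ID). The sketch below shows one way such a pair could be translated into a WebDriver location strategy. The names toLocator and Locator are hypothetical, and defaulting to "css" is an assumption; none of this is taken from the plugin's source.

```typescript
// Hypothetical types and helper for illustration only.
type SelectorType = "css" | "xpath" | "id";

interface Locator {
  using: "css selector" | "xpath"; // W3C WebDriver location strategy names
  value: string;
}

// Assumption: "css" is the default selector type when none is given.
function toLocator(selector: string, selectorType: SelectorType = "css"): Locator {
  switch (selectorType) {
    case "css":
      return { using: "css selector", value: selector };
    case "xpath":
      return { using: "xpath", value: selector };
    case "id": {
      // The W3C protocol has no "id" strategy, so express the ID as an
      // attribute selector, escaping quotes and backslashes in the value.
      const escaped = selector.replace(/(["\\])/g, "\\$1");
      return { using: "css selector", value: `[id="${escaped}"]` };
    }
  }
}
```

An attribute selector is used for IDs instead of #id so that IDs containing characters that are not valid in a CSS identifier still match.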

Installation

# Install dependencies
npm install

Browser Configuration

This plugin supports multiple ways to configure browsers:

Automatic Driver Download (Default)

If no driver path is specified, the plugin automatically downloads the latest compatible WebDriver:

  • Chrome: ChromeDriver (latest version matching your Chrome)
  • Firefox: GeckoDriver (latest version)
  • Edge: EdgeDriver (latest version matching your Edge)
  • Safari: safaridriver (ships with Safari on macOS, so nothing is downloaded)

Custom Driver Path

You can specify a custom path to the browser driver:

Browser Start: { browser: "chrome", driverPath: "/path/to/chromedriver" }

Custom Browser Binary

You can also specify a custom path to the browser executable:

Browser Start: { browser: "firefox", binaryPath: "/usr/bin/firefox" }
Browser Start: { browser: "chrome", binaryPath: "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe" }

Full Configuration Example

Browser Start: {
  browser: "chrome",
  headless: false,
  driverPath: "/path/to/chromedriver",
  binaryPath: "/usr/bin/google-chrome",
  url: "https://example.com"
}

Default Browser

By default, the plugin uses Firefox if no browser is specified:

Browser Start: { }
# Equivalent to:
Browser Start: { browser: "firefox" }

Development Mode

During development, use the LM Studio CLI to run the plugin in development mode:

npm run dev
# or directly:
lms dev

The npm run dev script runs lms dev, which:

  • Compiles TypeScript to JavaScript
  • Starts the plugin in LM Studio's development mode
  • Enables hot-reloading for quick iteration
  • Allows testing the plugin directly within LM Studio

Publishing to LM Studio Hub

When ready to share your plugin, push it to the LM Studio Hub:

npm run push
# or directly:
lms push

The npm run push script runs lms push, which:

  • Builds the plugin for production
  • Publishes the plugin to the LM Studio registry
  • Makes it available for other users to install

Note: Before pushing, ensure your manifest.json has correct owner and name values configured.

Requirements

  • Node.js
  • LM Studio
  • Selenium WebDriver
  • Browser drivers (ChromeDriver, GeckoDriver, EdgeDriver)

Usage Example

# Start Firefox (default)
Browser Start: { }

# Start a Chrome browser
Browser Start: { browser: "chrome", url: "https://example.com" }

# Start Firefox with custom binary path
Browser Start: { browser: "firefox", binaryPath: "/usr/bin/firefox" }

# Start Chrome with custom driver and binary
Browser Start: { browser: "chrome", driverPath: "/path/to/chromedriver", binaryPath: "/usr/bin/google-chrome" }

# Start in headless mode
Browser Start: { browser: "edge", headless: true, url: "https://example.com" }

# Find and click a button
Browser Click Element: { selector: "#submit-button", selectorType: "css" }

# Type into a search box
Browser Type: { selector: "input[name='search']", text: "query", pressEnter: true }

# Take a screenshot for AI analysis
Browser Read Screenshot: { }

# Get console logs
Browser Get Logs: { logLevel: "all" }

# Execute custom JavaScript
Browser Execute JS: { script: "return document.title;" }
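The Browser Get Logs call above passes logLevel: "all". The sketch below shows how a level filter of that kind could work. Only the "all" value appears in this README; the other level names, the ConsoleLogEntry shape, and the filterLogs function are hypothetical and may not match the plugin.

```typescript
// Hypothetical log model for illustration; only "all" is documented in the README.
type LogLevel = "error" | "warning" | "info" | "debug";

interface ConsoleLogEntry {
  level: LogLevel;
  message: string;
}

// Returns a new array holding every entry for "all", or only the entries
// whose level matches exactly otherwise. The input is never mutated.
function filterLogs(
  entries: readonly ConsoleLogEntry[],
  logLevel: LogLevel | "all" = "all"
): ConsoleLogEntry[] {
  if (logLevel === "all") {
    return [...entries];
  }
  return entries.filter((entry) => entry.level === logLevel);
}
```

This version matches levels exactly. A real implementation might instead treat the level as a minimum severity.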

Configuration Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| browser | string | "firefox" | Browser to use: chrome, firefox, edge, or safari |
| headless | boolean | false | Run browser without visible window |
| url | string | - | Initial URL to navigate to |
| driverPath | string | auto | Path to WebDriver executable. If not specified, latest driver is downloaded automatically |
| binaryPath | string | system default | Path to browser binary/executable |
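The options and defaults above can be expressed as a small TypeScript type with a defaulting function. This is a sketch only: the names BrowserStartOptions, ResolvedStartOptions, and resolveStartOptions are hypothetical and are not taken from src/config.ts or src/toolsProvider.ts.

```typescript
// Hypothetical types mirroring the documented options.
type BrowserName = "chrome" | "firefox" | "edge" | "safari";

interface BrowserStartOptions {
  browser?: BrowserName;
  headless?: boolean;
  url?: string;
  driverPath?: string;
  binaryPath?: string;
}

interface ResolvedStartOptions {
  browser: BrowserName;
  headless: boolean;
  url?: string;        // undefined: no initial navigation
  driverPath?: string; // undefined: driver is downloaded automatically
  binaryPath?: string; // undefined: system default browser binary
}

// Applies the documented defaults: Firefox, with a visible window.
function resolveStartOptions(options: BrowserStartOptions = {}): ResolvedStartOptions {
  return {
    ...options,
    browser: options.browser ?? "firefox",
    headless: options.headless ?? false,
  };
}
```

Calling resolveStartOptions() with no arguments corresponds to Browser Start: { }. Paths left unset stay undefined so that the caller can fall back to the automatic driver download or the system browser.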

Project Structure

├── src/
│   ├── index.ts       # Plugin entry point
│   ├── config.ts      # Plugin configuration
│   └── toolsProvider.ts # Browser automation tools implementation
├── manifest.json      # Plugin manifest
├── package.json       # NPM package configuration
├── tsconfig.json      # TypeScript configuration
└── thumbnail.png      # Plugin thumbnail

License

MIT

Author

Egor Levin
