XBrowser - AI Browser Automation

An open-source Chrome extension for AI-powered browser automation driven by natural language commands, similar in spirit to OpenAI's Operator and Atlas.

Features

  • 🤖 Multi-Agent System - Planner, Navigator, and Validator agents work together
  • 💬 Natural Language Commands - Just tell it what you want to do
  • 🎯 Floating Toolbar - Quick access from any web page
  • 📱 Side Panel UI - Full chat interface with action history
  • 🔌 Multiple LLM Providers - OpenAI, Anthropic, Gemini, Ollama
  • 🔒 Privacy First - Your API keys, your data

Screenshots

Coming soon

Installation

From Source

  1. Clone the repository:

    git clone https://github.com/jasimea/xbrowser.git
    cd xbrowser
  2. Install dependencies:

    npm install
  3. Build the extension:

    npm run build
  4. Load in Chrome:

    • Go to chrome://extensions/
    • Enable "Developer mode"
    • Click "Load unpacked"
    • Select the dist/ folder

Development

# Start development server with hot reload
npm run dev

# Type check
npm run type-check

# Build for production
npm run build

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Task Executor                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐     │
│   │   Planner    │───>│  Navigator   │───>│  Validator   │     │
│   │    Agent     │    │    Agent     │    │    Agent     │     │
│   └──────────────┘    └──────────────┘    └──────────────┘     │
│         │                    │                    │              │
│   Strategy &            Execute            Verify              │
│   Planning              Actions            Results              │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
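The diagram can be read as a three-stage pipeline. The following is a minimal TypeScript sketch of that flow, not the project's actual code: the `Step`, `StepResult`, and `Agents` types, the `runTask` function, and the stub agents are all hypothetical names chosen for illustration. Real agents would call an LLM asynchronously; the stubs here return canned values so the example runs on its own.

```typescript
// Hypothetical types: none of these names come from the XBrowser source.
interface Step {
  description: string;
}

interface StepResult {
  step: Step;
  ok: boolean;
}

interface Agents {
  plan: (task: string) => Step[];               // Planner: strategy and planning
  execute: (step: Step) => StepResult;          // Navigator: execute actions
  validate: (results: StepResult[]) => boolean; // Validator: verify results
}

// Run one task through Planner -> Navigator -> Validator.
function runTask(
  task: string,
  agents: Agents
): { results: StepResult[]; valid: boolean } {
  const steps = agents.plan(task);
  const results = steps.map((step) => agents.execute(step));
  const valid = agents.validate(results);
  return { results, valid };
}

// Stub agents standing in for LLM-backed ones.
const stubAgents: Agents = {
  plan: (task: string) => [
    { description: `locate target for: ${task}` },
    { description: "perform action" },
  ],
  execute: (step: Step) => ({ step, ok: true }),
  validate: (results: StepResult[]) =>
    results.length > 0 && results.every((r) => r.ok),
};

const outcome = runTask("Click the login button", stubAgents);
console.log(outcome.valid, outcome.results.length);
```

Keeping the three roles behind one small interface means each agent could be backed by a different model, which matches the per-agent model assignment described under Configuration.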

Tech Stack

  • TypeScript - Type-safe code
  • React 18 - UI components
  • Tailwind CSS - Styling
  • Vite - Build tool
  • LangChain.js - AI framework
  • Zustand - State management
  • Zod - Schema validation

Configuration

  1. Click the XBrowser extension icon
  2. Go to the Settings tab
  3. Enter the API keys for the LLM providers you want to use
  4. Assign a model to each agent (Planner, Navigator, Validator)
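As a rough illustration of what these settings amount to, here is one possible shape for them in TypeScript. This is a sketch under assumptions: the `Settings` and `ModelAssignment` types, the `modelId` placeholder strings, and the `findMissingKeys` helper are hypothetical and not taken from the extension. Only the provider and agent names come from this README. The project lists Zod for schema validation, but the sketch uses plain TypeScript so it runs without dependencies.

```typescript
// Provider and agent names are from this README; everything else is assumed.
type Provider = "openai" | "anthropic" | "gemini" | "ollama";
type AgentName = "planner" | "navigator" | "validator";

interface ModelAssignment {
  provider: Provider;
  modelId: string; // placeholder value, not a real model name
}

interface Settings {
  apiKeys: Partial<Record<Provider, string>>;
  agents: Record<AgentName, ModelAssignment>;
}

// Return the agents whose assigned provider has no API key configured.
// Ollama is assumed here to run locally and need no key.
function findMissingKeys(settings: Settings): AgentName[] {
  const names = Object.keys(settings.agents) as AgentName[];
  return names.filter((name) => {
    const { provider } = settings.agents[name];
    return provider !== "ollama" && !settings.apiKeys[provider];
  });
}

const example: Settings = {
  apiKeys: { openai: "sk-placeholder" },
  agents: {
    planner: { provider: "openai", modelId: "example-planner-model" },
    navigator: { provider: "anthropic", modelId: "example-navigator-model" },
    validator: { provider: "ollama", modelId: "example-local-model" },
  },
};

// Only the navigator is flagged: its provider (anthropic) has no key set.
console.log(findMissingKeys(example));
```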

Usage

Quick Commands (Toolbar)

Type directly in the floating toolbar at the top of any page:

  • "Click the login button"
  • "Fill the form with test data"
  • "Extract all product prices"
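To give a feel for what might sit between a command like these and the page, the sketch below models a structured action as a TypeScript discriminated union. The `Action` type and `describeAction` function are hypothetical: the extension's real action set is not documented in this README, so the three variants are purely illustrative.

```typescript
// Hypothetical action shapes, chosen to mirror the three example commands.
type Action =
  | { kind: "click"; target: string }
  | { kind: "fill"; target: string; value: string }
  | { kind: "extract"; target: string };

// Render an action as a one-line summary, e.g. for an action-history view.
function describeAction(action: Action): string {
  switch (action.kind) {
    case "click":
      return `click ${action.target}`;
    case "fill":
      return `fill ${action.target} with "${action.value}"`;
    case "extract":
      return `extract ${action.target}`;
  }
}

// One plausible translation of "Click the login button".
const loginClick: Action = { kind: "click", target: "login button" };
console.log(describeAction(loginClick));
```

A tagged union like this lets the compiler check that every action kind is handled, which becomes more useful as the number of actions grows.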

Side Panel

Click the extension icon to open the side panel for:

  • Full conversation history
  • Action history and replay
  • Detailed settings

Roadmap

  • Phase 2: Storage & Configuration Layer
  • Phase 3: LLM Provider Factory
  • Phase 4: Browser Automation Layer
  • Phase 5: Action System (20+ actions)
  • Phase 6: Multi-Agent System
  • Phase 7: Message Passing & Service Worker
  • Phase 8: Side Panel UI
  • Phase 9: Polish & Testing

Contributing

Contributions are welcome! Please read our contributing guidelines first.

License

MIT License - see LICENSE for details.

Acknowledgments
