ClawBridge

Bridge Open-Source AI Agents to Your Desktop & Browser

ClawBridge is a local-first AI agent platform that unifies multiple automation engines — browser-use, OpenClaw, and Anthropic computer-use — into a single dashboard with task management, live streaming, and safety controls.

Submit a task, pick an engine (or let Auto choose), and watch it run. Everything stays on your machine — or bridge it to the cloud.

Version: 0.3.5 | Website | Changelog | Discord


Repository

GitHub: NickRomanek/clawbridge

Installation

Windows Installer (Recommended)

Download ClawBridge-Setup.exe and run it. The installer:

  1. Installs ClawBridge to C:\Program Files\ClawBridge (or user folder)
  2. Bundles Python 3.12, Playwright, and all dependencies
  3. Creates Start Menu shortcuts
  4. Optionally installs OpenClaw engine (can also install later from dashboard)
  5. Creates .env from template on first run
  6. Shows a progress bar during post-install setup (Playwright download, engine config)

Optional installer tasks:

  • Desktop shortcut: Quick access from your desktop
  • Start with Windows: Auto-launch on login
  • Install OpenClaw: Adds memory & skills support (recommended for power users)

After installation, launch ClawBridge from the Start Menu and open http://127.0.0.1:8765 in your browser.

Quick Start (Single File)

The monolith clawbridge.py is the primary entry point — one file, no package structure needed:

pip install fastapi uvicorn pydantic python-dotenv httpx websockets anthropic pyautogui mss pillow pywinauto pynput
# Copy .env.example to .env and add at least one API key
cp .env.example .env
python clawbridge.py

This starts the ClawBridge Dashboard at http://127.0.0.1:8765.

Quick Start (Docker)

git clone <repo-url>
cd clawbridge
cp .env.example .env
# Edit .env -- add at least one API key
docker-compose up

Open http://localhost:8765 in your browser.

Quick Start (Manual Package Install)

Requires: Python 3.11+, Node 22+ (for OpenClaw)

cp .env.example .env
# Edit .env -- add at least one API key
pip install -e .
python -m clawbridge

Getting Started

When you first open ClawBridge, you'll see a Getting Started checklist to help you set up:

1. Configure an API Key

ClawBridge requires at least one LLM provider API key. Get yours here:

Provider    Get Key                              Used By
Anthropic   console.anthropic.com/settings/keys  browser-use, computer-use
OpenAI      platform.openai.com/api-keys         browser-use
OpenRouter  openrouter.ai/keys                   All engines (proxy)

Add your key in the dashboard's Config panel or edit .env directly.

2. Set Your Identity (Optional)

Customize the agent's personality by editing workspace/memory/IDENTITY.md. This gives the AI context about who it's helping.

3. Run Your First Task

Type a task in the chat input and press Enter:

  • Web task: "Search Google for ClawBridge AI"
  • Desktop task: "Open Notepad and write Hello World"
  • Research task: "What are the top 3 news stories today?"

4. Launch Browser Engine

For authenticated web tasks (tasks that need your logins), click Launch Chrome Session in the Config panel. This opens Chrome with a persistent profile where you can sign into your accounts once.


Architecture

ClawBridge has two deployment forms that share the same logic:

Form      File                           Use Case
Monolith  clawbridge.py (~10,400 lines)  Primary. Single file, easy to share/deploy
Package   clawbridge/ directory          Modular. For development, testing, extensibility

How It Works

User → Dashboard (http://127.0.0.1:8765)
         ↓
    Task Manager (routes, queues, concurrency)
         ↓
    Engine Selection (auto or manual)
         ↓
  ┌──────┼──────────┐
  ↓      ↓          ↓
browser-use  computer-use  OpenClaw
(Playwright)  (pyautogui)   (Node.js CDP)
                 ↑
           Perception Layer
           (screenshot + UIA accessibility)
                 ↑
           Recorder / Replay
           (pynput capture → adaptive replay)
  ↓      ↓          ↓
  └──────┼──────────┘
         ↓
    Live View (WebSocket screenshots)
    Audit Log (SQLite)
    Result Synthesis

Engines

Engine        Technology                                   Best For                                    Status
browser-use   Python + Playwright                          Web automation, extraction, form filling    Working
computer-use  Anthropic API + pyautogui + mss + pywinauto  Full desktop control — any app, any window  Working (accessibility-first navigation)
OpenClaw      Node.js + Chrome DevTools Protocol           AI agent with persistent memory & skills    Requires separate install (npm i -g openclaw)

Engine Selection (Smart Auto Routing)

ClawBridge automatically routes each task to the most suitable engine:

Task Type      Routes To     Detection
Web tasks      browser-use   URLs, "search", "browse", "navigate", web domains
Desktop tasks  computer-use  App names (notepad, excel, telegram), "click", "open app", "desktop"
General tasks  openclaw      Fallback for conversational/research tasks

  • Auto mode (default): Smart selection based on URL patterns and keyword detection
  • Manual mode: User picks engine explicitly from dropdown
  • Economy mode: Toggle Performance/Economy to use cheaper models (gpt-4o-mini for browser-use, Haiku for replay steps)
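The keyword-based auto-routing described above can be sketched as a small classifier. The hint lists come from the table in this section, but the function name and exact matching logic are illustrative assumptions, not ClawBridge's actual code:

```python
import re

# Illustrative hint lists drawn from the routing table above.
WEB_HINTS = ("search", "browse", "navigate", "http://", "https://", ".com")
DESKTOP_HINTS = ("notepad", "excel", "telegram", "click", "open app", "desktop")

def route_task(prompt: str) -> str:
    """Pick an engine name from simple keyword/URL heuristics (sketch only)."""
    text = prompt.lower()
    if any(h in text for h in WEB_HINTS) or re.search(r"\bwww\.", text):
        return "browser_use"
    if any(h in text for h in DESKTOP_HINTS):
        return "computer_use"
    # Conversational / research fallback.
    return "openclaw"
```

For example, route_task("Search Google for ClawBridge AI") returns "browser_use", while the Notepad example from Getting Started routes to "computer_use".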

Computer-Use Engine Details

The computer-use engine controls the full Windows desktop via screenshots + mouse/keyboard. Key features:

  • Accessibility-first navigation: Uses Windows UIA (via pywinauto) to enumerate interactive elements. Model clicks by element ID instead of guessing pixel coordinates — far more reliable.
  • Dual screenshot strategy: Sends full screen (for coordinates) + zoomed crop of foreground window (for reading text)
  • Auto-focus: Detects target app from prompt and brings it to foreground before starting
  • DPI-aware: Calls SetProcessDPIAware() so all coordinate systems (pyautogui, mss, GetWindowRect) are consistent
  • Forced reasoning: Model must follow [OBSERVE]/[GOAL]/[PLAN]/[ACTION] protocol before every action
  • Stale detection: Perceptual hash comparison warns when screenshots don't change after an action
  • Hybrid mechanical + AI execution: Deterministic actions (app launch, URL navigation, typing) handled programmatically at zero AI cost. AI only invoked when visual reasoning is needed.
  • Mechanical pre-navigation: Extracts URLs from prompts and navigates deterministically via webbrowser.open_new() or Ctrl+L hotkeys before AI engagement — saves an entire round of LLM reasoning.
  • Vision fallback: When UIA tree returns < 5 elements (Electron apps, games, custom UIs), a fast vision model identifies UI elements from the screenshot. Results merged with UIA elements via 30-pixel deduplication.
  • Dual API path: Direct Anthropic API uses native computer_20250124/computer_20251124 tools; OpenRouter uses function-tool schema. Configurable via COMPUTER_USE_API setting.
  • Smart model routing: Uses Haiku for routine replay steps, Sonnet for complex ones — ~50% cost savings
  • Prompt caching: System prompt cached after first API call in multi-step tasks — 50-90% input token savings
  • Workflow recording & replay: Record desktop actions via pynput, save as workflows, replay adaptively with confidence-tiered execution (mechanical/verification/AI), element matching, and LLM fallback (see Workflow Recording)
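Stale detection via perceptual hashing might look like the following average-hash sketch. It assumes screenshots have already been downscaled to an 8×8 grayscale grid; the engine's real hash function and threshold are not specified in this README:

```python
def average_hash(pixels):
    """Hash an 8x8 grid of grayscale values: one bit per above-average pixel."""
    flat = [p for row in pixels for p in row]
    avg = sum(flat) / len(flat)
    return sum(1 << i for i, p in enumerate(flat) if p > avg)

def hamming(a: int, b: int) -> int:
    """Count differing bits between two hashes."""
    return bin(a ^ b).count("1")

def screen_is_stale(prev_hash: int, new_hash: int, threshold: int = 3) -> bool:
    """Treat the screen as unchanged if the hashes differ by few bits."""
    return hamming(prev_hash, new_hash) <= threshold
```

When screen_is_stale returns True after an action, the engine can warn that the click or keystroke likely had no visible effect.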

Browser-Use Engine Details

  • Headless mode: Runs Chromium in background, no visible browser
  • CDP mode: Connects to an existing Chrome via --remote-debugging-port=9222
  • User Data Dir mode: Persistent Chrome profile with stored logins
  • Launch Chrome Session: Dashboard button launches Chrome with persistent profile at %LOCALAPPDATA%\ClawBridge\ChromeProfile

Workflow Recording & Replay

ClawBridge can record your desktop actions and replay them adaptively — even when UI elements move or change.

Recording

From Chat (recommended):

  1. Click Record in the chat input bar (or type /record)
  2. Perform your desktop actions (clicks, typing, keyboard shortcuts)
  3. Click Stop (or type /stop) — a save card appears with a pre-filled name
  4. Click Save or customize the name first

From Workflows Tab:

  1. Navigate to the Workflows tab
  2. Click Start Recording — a red indicator and timer appear
  3. Perform your desktop actions
  4. Click Stop Recording — enter a name and save

Replay

  • Click Replay on any saved workflow, or type /replay Workflow Name in chat
  • Confidence-tiered execution: Each action scored automatically:
    • >= 0.95: Pure mechanical replay (free, instant)
    • 0.7 - 0.95: Mechanical + visual verification (window title, perceptual hash, LLM check)
    • < 0.7: AI-powered replay via LLM with screenshot context
  • Element matching via accessibility tree comparison:
    1. automation_id exact match (confidence 1.0)
    2. name + type + parent (0.95)
    3. name + type (0.85)
    4. type + proximity (0.6)
  • Adaptive timing: Polls UIA tree stability instead of fixed delays
  • Outcome learning: Tracks success/failure per step. After 3+ mechanical successes, promotes to high confidence. After repeated failures, demotes to AI replay.
  • Auto-detects target app from recorded window titles, process names, and known app signatures
  • Handles app launch patterns (Win key, search, Enter)
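The tiered element-matching strategy maps naturally to a scoring function. This sketch uses the confidence values listed above; the snapshot fields are simplified and the proximity check in the lowest tier is omitted, so treat it as an illustration rather than the project's implementation:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Snapshot:
    """Simplified stand-in for a recorded UIA element snapshot."""
    automation_id: Optional[str]
    name: str
    control_type: str
    parent_name: str

def match_confidence(recorded: Snapshot, candidate: Snapshot) -> float:
    """Score a candidate element against a recorded one, tier by tier."""
    if recorded.automation_id and recorded.automation_id == candidate.automation_id:
        return 1.0   # automation_id exact match
    if (recorded.name, recorded.control_type, recorded.parent_name) == \
       (candidate.name, candidate.control_type, candidate.parent_name):
        return 0.95  # name + type + parent
    if (recorded.name, recorded.control_type) == (candidate.name, candidate.control_type):
        return 0.85  # name + type
    if recorded.control_type == candidate.control_type:
        return 0.6   # type + proximity (proximity check omitted in this sketch)
    return 0.0
```

The score then selects the execution tier: above 0.95 replays mechanically, 0.7 to 0.95 adds verification, and below 0.7 falls back to AI-powered replay.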

Parameterized Replay

  • After recording, ClawBridge can detect typed text that varies between runs (search queries, filenames, etc.)
  • Save parameter defaults and run with custom values each time
  • Dashboard shows parameter input form with Save/Run buttons
  • Safety-scanned per parameter value
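Parameter substitution during replay could work roughly like this. The {name} placeholder syntax and the action schema here are assumptions for illustration; the README does not specify the stored workflow format:

```python
def substitute_params(actions: list[dict], params: dict[str, str]) -> list[dict]:
    """Fill {name} placeholders in recorded 'type' actions (illustrative schema)."""
    out = []
    for action in actions:
        if action.get("kind") == "type":
            text = action["text"]
            for key, value in params.items():
                text = text.replace("{" + key + "}", value)
            action = {**action, "text": text}
        out.append(action)
    return out
```

A workflow recorded with a search query, for example, could be replayed with a different query each run by passing a new value for that parameter.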

Perception Layer

The recording system is backed by a standalone perception module (clawbridge/perception/):

  • Screenshot utilities: Async full-screen and window-crop capture, perceptual similarity comparison
  • Accessibility tree: Enhanced pywinauto UIA wrapper with ElementSnapshot dataclass, multi-strategy element matching
  • A11y enrichment at record time: Click events enriched with element metadata from UIA tree while correct window is in focus

Dashboard

The web dashboard at http://127.0.0.1:8765 provides:

  • Chat interface: Submit tasks, see results in a message-bubble layout with inline cost/duration info
  • Engine selector: Chip bar (Auto / Browser / Desktop / Chat) with tooltips
  • Slash commands: Type / for autocomplete dropdown — /record, /stop, /replay <name>, /browser, /computer, /chat
  • Stop button: Send button swaps to red Stop while a task is running — always visible, one-click cancel
  • Workflow recording from chat: Click Record, perform actions, click Stop — save card appears with pre-filled name
  • Live View: Real-time screenshot stream from browser or desktop
  • Engine status: See which engines are available/running/errored
  • Config panel: API key management, browser session controls, machine ID
  • Activity feed: Audit trail of every action taken
  • Workflows tab: Record, save, and replay desktop workflows
  • Soul/Memory tabs: Edit agent personality and view memory logs

Configuration

All configuration lives in .env. See .env.example for the full list.

API Keys (BYOK)

You need at least one:

Key                 Provider            Used By
ANTHROPIC_API_KEY   Anthropic (direct)  browser-use, computer-use
OPENAI_API_KEY      OpenAI              browser-use
OPENROUTER_API_KEY  OpenRouter (proxy)  computer-use, browser-use

Key Settings

# Server
CLAWBRIDGE_HOST=127.0.0.1
CLAWBRIDGE_PORT=8765

# Engines
ENABLED_ENGINES=browser_use,computer_use    # comma-separated
DEFAULT_MODEL=openai/gpt-4o                 # for browser-use

# Computer-Use
COMPUTER_USE_MODEL=anthropic/claude-sonnet-4.5   # primary model
COMPUTER_USE_MODEL_FAST=anthropic/claude-haiku-4-5  # cheap model for routine replay
COMPUTER_USE_API=auto                            # auto | direct | openrouter
COMPUTER_USE_MAX_SCREEN_WIDTH=1920
COMPUTER_USE_MAX_SCREEN_HEIGHT=1080
COMPUTER_USE_ACTION_DELAY_MS=500

# Economy Mode
ECONOMY_MODEL=                                   # optional: google/gemini-flash-2.0

# Recording
RECORDING_SCREENSHOTS=true
RECORDING_INTENT_EXTRACTION=true
SCREENPIPE_INTEGRATION=true

# Browser
BROWSER_HEADLESS=true
BROWSER_MODE=default                        # default | cdp | user_data_dir
BROWSER_CDP_URL=http://localhost:9222
BROWSER_USER_DATA_DIR=

# Policy
POLICY_MODE=guarded                         # guarded | permissive | strict
MAX_CONCURRENT_TASKS=3
MAX_ACTIONS_PER_TASK=50

# Remote Bridge (beta)
REMOTE_BRIDGE_URL=
REMOTE_AUTH_TOKEN=
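A .env file like the one above is a series of KEY=VALUE lines with # comments. The project lists python-dotenv as a dependency for loading it; the tiny parser below only illustrates the format and is not the project's loader (it skips quoting and escaping rules):

```python
def parse_env(text: str) -> dict[str, str]:
    """Minimal .env sketch: KEY=VALUE per line, '#' starts a comment."""
    env = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop inline/full-line comments
        if "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env
```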

Automation Modes

ClawBridge supports two automation modes to balance speed vs. safety:

Mode                  Behavior                                      Best For
Supervised (default)  Pauses for approval before high-risk actions  Financial tasks, unfamiliar workflows, production systems
Autonomous            Runs without interruption                     Trusted tasks, development, demos

Supervised Mode Features

When running in Supervised mode, ClawBridge automatically detects and pauses for:

Sensitive Domains (banking, shopping, cloud admin):

  • Banking: chase.com, bankofamerica.com, wellsfargo.com, paypal.com, etc.
  • Shopping: amazon.com, ebay.com, walmart.com checkout pages
  • Cloud: console.aws.amazon.com, portal.azure.com, console.cloud.google.com
  • Email: gmail.com, outlook.com (compose/send actions)

High-Risk Actions:

  • Purchases and payments (buy, purchase, checkout, pay)
  • Form submissions (submit, confirm, send)
  • Deletions (delete, remove, clear)
  • Account changes (password, settings, account)

When a high-risk action is detected:

  1. Task pauses and shows an approval modal in the dashboard
  2. You see exactly what action the AI wants to take
  3. Click Approve to proceed or Deny to block
  4. A 2-minute timeout auto-denies the action if you don't respond

Changing Modes

From Dashboard: Use the Automation Mode toggle in the Config panel

From .env:

AUTOMATION_MODE=supervised   # or: autonomous

Tip: Start with Supervised mode until you're comfortable with the AI's behavior, then switch to Autonomous for trusted workflows.


Security

  • All data stays on your machine in local mode. No cloud egress.
  • API keys are never logged or transmitted.
  • Dashboard authentication: Token-based auth with HttpOnly cookie, CSRF token protection on all state-changing endpoints.
  • WebSocket authentication: Token verified before connection is accepted.
  • Actions classified as safe/sensitive/high-risk with configurable policy.
  • Sensitive domain detection (banking, cloud consoles) auto-elevates risk level.
  • Credential and PII detection with automatic redaction before memory storage.
  • Prompt injection pattern detection and filtering in stored memory.
  • Path traversal protection on personality file endpoints.
  • Remote bridge requires HTTPS for non-localhost URLs.
  • XSS protection via DOMPurify with safe fallback.
  • Full audit trail in SQLite database.

API Endpoints

Method  Path                                       Description
GET     /                                          Dashboard UI
GET     /health                                    Health check
POST    /api/tasks                                 Create task
GET     /api/tasks                                 List all tasks
GET     /api/tasks/{id}                            Get single task
PATCH   /api/tasks/{id}                            Pause/resume/cancel
DELETE  /api/tasks/{id}                            Remove task
DELETE  /api/tasks                                 Clear all tasks
GET     /api/tasks/{id}/steps                      Get step-by-step replay
GET     /api/engines                               List engines + status
POST    /api/engines/openclaw/install              Install OpenClaw engine
GET     /api/config                                Get config (keys redacted, includes version)
POST    /api/config/keys                           Save API keys to .env
POST    /api/config/automation                     Set automation mode (supervised/autonomous)
POST    /api/browser/launch                        Launch Chrome with CDP
GET     /api/browser/status                        Check Chrome connection
GET     /api/schedules                             List task schedules
POST    /api/schedules                             Create recurring schedule
DELETE  /api/schedules/{id}                        Delete schedule
GET     /api/templates                             List task templates
POST    /api/templates                             Create task template
GET     /api/workflows                             List saved workflows
GET     /api/workflows/{id}                        Get workflow details
POST    /api/workflows                             Create workflow from recorded actions
DELETE  /api/workflows/{id}                        Delete workflow
POST    /api/workflows/{id}/replay                 Trigger workflow replay
POST    /api/workflows/{id}/replay-parameterized   Replay with parameter substitution
POST    /api/workflows/{id}/save-params            Save parameter defaults
POST    /api/workflows/{id}/extract-intent         Trigger intent extraction
POST    /api/config/model-tier                     Switch Performance/Economy mode
POST    /api/config/computer-use-api               Switch API path (Auto/Direct/OpenRouter)
POST    /api/recording/start                       Start desktop recording
POST    /api/recording/stop                        Stop recording, return actions
POST    /api/auth/login                            Authenticate and set HttpOnly session cookie
WS      /ws                                        WebSocket (tasks, frames, audit, approvals, workflows)

WebSocket Events

Event Type         Direction        Description
task_update        Server → Client  Task status change
browser_frame      Server → Client  Screenshot stream (base64)
audit_event        Server → Client  Audit log entry
approval_request   Server → Client  High-risk action needs approval
approval_response  Client → Server  User approves/denies action
approval_ack       Server → Client  Confirmation of approval processing
recording_status   Server → Client  Recording started/stopped status
recording_result   Server → Client  Recorded actions after stop
workflow_update    Server → Client  Workflow list changed
workflow_saved     Server → Client  Workflow saved confirmation
replay_started     Server → Client  Workflow replay task created
recording_event    Server → Client  Live action during recording
engine_status      Server → Client  Engine status/model info changed
config_update      Server → Client  Configuration setting changed
safety_warning     Server → Client  Safety scan flag detected
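A client consuming these events typically dispatches on an event-type field. The envelope shape below ("type" plus "payload") is an assumption for illustration; check the actual WebSocket messages for the real field names:

```python
import json

HANDLERS = {}

def on(event_type):
    """Register a handler for one event type."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register

@on("task_update")
def handle_task_update(payload):
    # Example handler: summarize a task status change.
    return f"task {payload['id']} -> {payload['status']}"

def dispatch(raw: str):
    """Route one raw WebSocket message to its registered handler, if any."""
    msg = json.loads(raw)
    handler = HANDLERS.get(msg.get("type"))
    return handler(msg.get("payload", {})) if handler else None
```

Unknown event types fall through harmlessly, which keeps the client forward-compatible as new events are added.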

Remote Bridge (Beta)

ClawBridge can connect to a remote orchestration service:

  • Set REMOTE_BRIDGE_URL and REMOTE_AUTH_TOKEN in .env
  • Local instance polls remote for tasks every 10 seconds
  • Each machine identified by persistent clawbridge.id (UUID)
  • Remote tasks execute locally, results flow back
  • Dashboard shows "Bridge Online/Offline" status

This enables the bridge architecture: local machines provide the "hands" (desktop/browser access), remote service provides the "brain" (task orchestration, hosted engines).
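The poll-and-execute cycle can be sketched as below. fetch_tasks and execute are hypothetical stand-ins for the HTTP calls to the remote service and the local engine run; the 10-second default mirrors the interval stated above:

```python
import time

def poll_remote(fetch_tasks, execute, interval: float = 10, max_cycles: int = 1) -> list:
    """Repeatedly pull remote tasks, run them locally, and collect results."""
    results = []
    for cycle in range(max_cycles):
        for task in fetch_tasks():       # e.g. GET tasks from REMOTE_BRIDGE_URL
            results.append(execute(task))  # run locally, result flows back
        if cycle < max_cycles - 1:
            time.sleep(interval)         # wait before the next poll
    return results
```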

Project Structure

clawbridge.py                 # Monolith — primary entry point (~10,400 lines)
clawbridge_mcp.py             # MCP server (stdio/HTTP proxy to REST API)
clawbridge/
  config.py                   # Settings & BYOK key management
  engines/
    base.py                   # EngineBase abstract interface
    browser_use_engine.py     # Playwright-based web automation
    computer_use_engine.py    # Desktop control via Anthropic API
    openclaw_engine.py        # Node.js CDP agent
  perception/                 # Perception layer (v0.2.0)
    screenshot.py             # Async screenshot utilities
    accessibility.py          # Enhanced UIA wrapper + element matching
  recorder/                   # Workflow recording (v0.2.0)
    capture.py                # pynput mouse/keyboard capture
    processor.py              # Raw event → enriched action processing
  orchestrator/
    manager.py                # Task lifecycle, engine routing
  server/
    app.py                    # FastAPI app factory
    routes/
      tasks.py                # Task CRUD endpoints
      engines.py              # Engine status endpoints
      config_routes.py        # Config & key management
      ws.py                   # WebSocket streaming
  policy/
    safety.py                 # Action classification, injection detection
  telemetry/
    logger.py                 # Audit logging to SQLite
  shared/
    schemas.py                # Pydantic models
build.py                      # Portable Windows build system
installer.iss                 # Inno Setup installer script
.env.example                  # Configuration template
.mcp.json                     # MCP server registration for Claude Code

MCP Server

ClawBridge exposes its full API as an MCP (Model Context Protocol) server, enabling integration with Claude Code, Cursor, and other MCP-compatible tools.

# Register with Claude Code
claude mcp add clawbridge -- python clawbridge_mcp.py

# Or run standalone with HTTP transport
python clawbridge_mcp.py --http

15 tools available: run_task, get_task_status, list_tasks, cancel_task, list_engines, get_task_steps, get_task_audit, search_memory, get_agent_context, append_memory, list_schedules, create_schedule, get_config, get_license_info, list_workflows

See .mcp.json for project-level registration.


Roadmap

Completed

  • Dashboard UI overhaul (chat interface, activity feed, live view)
  • Smart engine selection with URL/keyword auto-routing
  • OpenClaw one-click install from dashboard
  • Windows installer via Inno Setup with progress bar
  • Onboarding checklist for first-time users
  • Task scheduling and templates
  • Personality/memory system
  • Step-level streaming and task replay
  • Dashboard authentication (token-based)
  • Real-time cost tracking
  • Error recovery with exponential backoff
  • MCP server mode (15 tools, stdio + HTTP)
  • Supervised/Autonomous automation modes
  • Workflow recording & replay with perception layer
  • Security hardening Phase 1 & 2 (CSRF, path traversal, XSS, auth, key combo blocklist)
  • Slash command autocomplete with workflow name suggestions
  • Always-visible Stop button during task execution
  • Computer-use focus verification (retry + LLM feedback)
  • Ultrawide monitor support (active window crop as primary screenshot)
  • Browser-use extraction-aware prompting (page content fallback)
  • Chat-integrated workflow save (record from chat, save with one click)
  • E2E test suite (33 tests covering dashboard, cancel, engines, replay)
  • Smart model routing (Haiku for routine steps, Sonnet for complex)
  • Economy mode (Performance/Economy toggle, gpt-4o-mini for browser-use)
  • Prompt caching (50-90% input token savings on multi-step tasks)
  • Workflows tab in sidebar
  • Enhanced recording (live action feed, a11y enrichment, screenshots, intent extraction)
  • Direct Anthropic API for computer-use (dual API path with tool versioning)
  • AI-powered replay (confidence-tiered execution, visual verification, outcome learning)
  • Workflow parameterization (variable detection, parameter inputs, save defaults)
  • Model details panel and API path toggle in dashboard
  • Licensing & activation system with Stripe integration
  • Apache 2.0 license and Contributor License Agreement

Phase 2: Reliability (Next)

Goal: >90% success rate on recorded workflow replays

  • Self-verification loops for live computer-use tasks (screenshot after each action, verify success, retry on failure)
  • Set-of-Mark (SoM) visual prompting (overlay numbered markers on screenshots using UIA element positions)
  • OmniParser V2 visual fallback (when UIA tree returns < 5 elements, use vision-based element detection with IoU dedup)
  • Increase UIA element limit (40 -> 80, make depth configurable)
  • Cross-workflow outcome learning (opt-in, share action fingerprints across workflows for same app)
  • Expand economy mode (Haiku for browser-use, Gemini Flash via ECONOMY_MODEL)

Phase 3: Distribution

Goal: 1,000 active users

  • macOS full support (AXUIElement accessibility, AppleScript app control)
  • Auto-update mechanism in installer
  • Bundled API key option (OpenRouter partnership for zero-config users)
  • Template/workflow gallery (pre-built automations for common tasks)
  • ProductHunt + HackerNews launch
  • Code signing certificate for Windows installer

Phase 4: Monetization

Goal: First paying customers

  • Cloud sync service (optional workflow sync + remote replay)
  • Team workflow sharing
  • Pro tier launch ($29/mo)
  • Workflow marketplace (community-shared templates)
  • pip install clawbridge one-command setup
  • macOS .dmg packaging via GitHub Actions

Contributing

We welcome contributions! Here's how:

  1. Fork the repo
  2. Create a feature branch (git checkout -b feat/my-feature)
  3. Make your changes and test them
  4. Commit (git commit -m "Add my feature")
  5. Push and open a PR

Please open an issue first for large changes so we can discuss the approach.

License

Apache License 2.0 — see LICENSE.

Copyright (c) 2026 RomaTek AI.