Skip to content
/ wbac Public
forked from kortix-ai/wbac

API Server to control Cloud Browser with natural language instructions using AI.

License

Notifications You must be signed in to change notification settings

lblwb/wbac

 
 

Repository files navigation

Web Browser AI-Control API Server

A hosted API server for AI agents to control web browsers in the cloud using Browserbase, Stagehand, and Playwright.

Overview

This server provides a REST API for AI agents to:

  • Create and manage browser sessions in the cloud
  • Control browsers using natural language commands
  • Extract structured data from web pages
  • Monitor browser activity and errors
  • Take screenshots and inspect DOM state
  • Execute custom Playwright automation scripts

Built on top of:

  • Browserbase for cloud browser infrastructure
  • Stagehand for AI-powered browser control
  • Playwright for low-level browser automation
  • Express.js for the API server

API Reference

For complete API documentation, visit wbac-api-docs.netlify.app

Session Management

  • POST /api/sessions/create-session - Create new browser session
  • POST /api/sessions/stop-session/:sessionId - Stop session
  • GET /api/sessions/running-sessions - List active sessions
  • GET /api/sessions/session/:sessionId - Get session information
  • GET /api/sessions/debug/:sessionId - Get session debug URLs

Browser Control

  • POST /api/browser/navigate/:sessionId - Navigate to URL
  • POST /api/browser/act/:sessionId - Perform action via natural language
  • POST /api/browser/extract/:sessionId - Extract structured data
  • POST /api/browser/observe/:sessionId - Get possible actions

Monitoring

  • GET /api/browser/console-logs/:sessionId - Get console logs
  • GET /api/browser/network-logs/:sessionId - Get network logs
  • GET /api/browser/dom-state/:sessionId - Get DOM state
  • POST /api/browser/screenshot/:sessionId - Take screenshot
  • POST /api/browser/clear-logs/:sessionId - Clear logs

Key Features

Browser Session Management

  • Create new browser sessions
  • Resume existing sessions
  • List running sessions
  • Stop/cleanup sessions

AI-Powered Browser Control

  • Natural language actions via act()
  • Structured data extraction via extract()
  • Page observation via observe()
  • Vision-based interaction support

Monitoring & Debugging

  • Console log monitoring
  • Network request/response logging
  • Error tracking
  • Screenshot capture
  • DOM state inspection

Getting Started

Prerequisites

  • Node.js 16+
  • Browserbase account and credentials
  • OpenAI or Anthropic API key for AI features

Installation

git clone https://github.com/kortix-ai/wbac
cd wbac
npm i

Configuration

Create a .env file with:

BROWSERBASE_API_KEY=your_api_key
BROWSERBASE_PROJECT_ID=your_project_id
OPENAI_API_KEY=your_openai_key 
ANTHROPIC_API_KEY=your_anthropic_key

Running the Server

npm start

UI Interface

A Streamlit-based UI is included for testing and debugging:

pip install streamlit
streamlit run streamlit_ui.py

Use Cases

  • AI agents that need web browsing capabilities
  • Automated web testing with AI assistance
  • Web scraping with natural language commands
  • Browser automation monitoring and debugging

Contributing

Contributions welcome! Please read our contributing guidelines and submit pull requests.

License

MIT License - see LICENSE file for details

Acknowledgements

About

API Server to control Cloud Browser with natural language instructions using AI.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • JavaScript 53.8%
  • Python 45.9%
  • Shell 0.3%