browser-flow-ai/browser-flow-TS

Browser-Flow

A modern browser automation integration toolkit based on Stagehand, designed for developers and automation enthusiasts.

中文版 (Chinese version) | English

Project Overview

Browser-Flow is a modular browser automation solution that breaks complex browser operations down into independent, composable components. Each component can be used on its own or combined with the others, giving developers flexible and powerful browser automation capabilities.

Core Components

  • browser-common - Base dependency library providing common browser operation interfaces. It serves as the shared dependency layer and currently mainly handles LLM calls and logging.
  • browser-control - Browser operation engine based on large language models. It wraps page operations such as clicks, key presses, text input, and scrolling.
  • browser-flow - Customizable browser workflow orchestration system. It handles action orchestration, operation flow planning, and result aggregation.
  • browser-wrapper - Browser wrapper layer that encapsulates web page operations, keeps the browser stable on dynamic pages, and ensures the accessibility tree (A11y Tree) is retrieved successfully.

Features

  • 🚀 Modular Architecture: Each component can be used independently or combined
  • 🤖 AI-Powered: Intelligent browser operations using large language models
  • 🔧 Highly Customizable: Flexible workflow orchestration system
  • 📱 Accessibility Support: Built-in A11y Tree support for better automation
  • 🛠️ Developer Friendly: TypeScript support with comprehensive type definitions
  • 📦 Monorepo Structure: Easy to maintain and extend

Quick Start

Prerequisites

  • Node.js 18+
  • pnpm (recommended) or npm

Environment Variables

Browser-Flow currently supports the DeepSeek and Qwen large language models. Set the following environment variables:

AGENTQL_API_KEY="your_key_here"
DASHSCOPE_API_KEY="your_key_here"
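
A missing key tends to surface only when the first LLM call fails, so it can help to check the environment before starting a workflow. The following is a minimal sketch and not part of Browser-Flow: `findMissingEnvVars` and `REQUIRED_ENV_VARS` are hypothetical names, and the sketch assumes both variables listed above are needed in your setup.

```typescript
// Hypothetical helper, not part of Browser-Flow.
// The variable names are the two listed above; trim the list if you use one provider only.
const REQUIRED_ENV_VARS: readonly string[] = ["AGENTQL_API_KEY", "DASHSCOPE_API_KEY"];

// Returns the names of required variables that are unset or blank in `env`.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_ENV_VARS
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js entry point, this could be called as `findMissingEnvVars(process.env)`, throwing an error that lists the missing names if the returned array is not empty.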

Installation

# Clone the repository
git clone <repository-url>
cd browser-flow-TS

# Install dependencies
pnpm install

# Build all packages
pnpm build

Basic Usage

Using BrowserFlow (Recommended)

import { BrowserFlow } from '@browser-flow/browser-flow';

async function example() {
  const workflow = new BrowserFlow({
    maxSteps: 10,
    verbose: true
  });
  
  const result = await workflow.execute("Open https://example.com and click login");
  
  if (result.success) {
    console.log("Workflow completed successfully!");
    console.log(`Steps executed: ${result.stepsExecuted}`);
    console.log(`Execution time: ${result.executionTime}ms`);
  } else {
    console.error("Workflow failed:", result.error);
  }
  
  await workflow.dispose();
}
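
In the example above, `dispose()` is skipped if `execute()` throws, which may leave a browser process running. One way to guard against that is a small `try`/`finally` wrapper. This is a sketch under assumptions: `withResource` and `DisposableResource` are hypothetical names, and the only thing assumed about the workflow object is the async `dispose()` method shown above.

```typescript
// Hypothetical helper, not part of Browser-Flow.
// Minimal shape assumed: any object exposing an async dispose() method.
interface DisposableResource {
  dispose(): Promise<void>;
}

// Runs `task` with the resource and always disposes it afterwards,
// whether the task resolves or rejects.
async function withResource<R extends DisposableResource, T>(
  resource: R,
  task: (resource: R) => Promise<T>
): Promise<T> {
  try {
    return await task(resource);
  } finally {
    await resource.dispose();
  }
}
```

With the `BrowserFlow` class from the example, a call might look like `await withResource(new BrowserFlow({ maxSteps: 10 }), (wf) => wf.execute("Open https://example.com and click login"))`.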

Using AgentHand (Low-level)

import { AgentHand } from '@browser-flow/browser-control';

async function example() {
  const hand = new AgentHand("");
  await hand.init();
  
  await hand.goto("https://example.com");
  await hand.act("Click the login button");
  
  await hand.close();
}
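
Low-level actions on dynamic pages can fail intermittently, for example when an element has not rendered yet. The sketch below shows a generic retry wrapper that could be placed around a call such as `hand.act(...)`. It is not part of Browser-Flow: `retry`, `maxAttempts`, and `delayMs` are hypothetical names, and whether `act` signals failure by throwing is an assumption.

```typescript
// Hypothetical helper, not part of Browser-Flow.
// Calls `action` until it resolves, up to `maxAttempts` times (expected to be >= 1),
// waiting `delayMs` milliseconds between attempts. Rethrows the last error on failure.
async function retry<T>(
  action: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await action();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Pause before the next attempt to give the page time to settle.
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

A possible use is `await retry(() => hand.act("Click the login button"))`. Retrying is only safe for actions that can be repeated without side effects, so submitting a form twice, for instance, may not be acceptable.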

Advanced Usage

import { BrowserFlow } from '@browser-flow/browser-flow';

// Sequential execution
async function sequentialExample() {
  const workflow = new BrowserFlow();
  
  const instructions = [
    "Open https://example.com",
    "Click on the login button",
    "Enter username and password"
  ];
  
  const results = await workflow.executeSequence(instructions);
  
  results.forEach((result, i) => {
    console.log(`Workflow ${i + 1}: ${result.success ? 'Success' : 'Failed'}`);
  });
  
  await workflow.dispose();
}

// Parallel execution
async function parallelExample() {
  const workflow = new BrowserFlow();
  
  const instructions = [
    "Open https://google.com and search for 'TypeScript'",
    "Open https://github.com and search for 'browser automation'"
  ];
  
  const results = await workflow.executeParallel(instructions);
  
  console.log(`Executed ${results.length} workflows in parallel`);
  
  await workflow.dispose();
}
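
Both `executeSequence` and `executeParallel` return one result per instruction, so a short summary step makes failures easier to spot. The following is a sketch only: `summarizeResults`, `WorkflowResultLike`, and `RunSummary` are hypothetical names, and the result shape is assumed from the fields used in the Basic Usage example (`success`, `error`, `stepsExecuted`, `executionTime`).

```typescript
// Hypothetical types and helper, not part of Browser-Flow.
// Result shape assumed from the fields used in the Basic Usage example.
interface WorkflowResultLike {
  success: boolean;
  error?: unknown;
  stepsExecuted?: number;
  executionTime?: number; // milliseconds
}

interface RunSummary {
  total: number;
  succeeded: number;
  failedIndexes: number[]; // positions of failed instructions in the input array
  totalExecutionTimeMs: number;
}

// Aggregates a list of workflow results into counts and a summed execution time.
function summarizeResults(results: readonly WorkflowResultLike[]): RunSummary {
  const failedIndexes: number[] = [];
  let totalExecutionTimeMs = 0;
  results.forEach((result, i) => {
    if (!result.success) failedIndexes.push(i);
    totalExecutionTimeMs += result.executionTime ?? 0;
  });
  return {
    total: results.length,
    succeeded: results.length - failedIndexes.length,
    failedIndexes,
    totalExecutionTimeMs,
  };
}
```

For parallel runs, the summed time is the combined work across workflows and not the wall-clock duration, since the workflows overlap.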

Package Structure

packages/
├── browser-common/     # Base utilities and shared interfaces
├── browser-control/    # AI-powered browser automation
├── browser-flow/       # Workflow orchestration system
└── browser-wrapper/    # Browser wrapper with A11y support

Development

Building

# Build all packages
pnpm build

# Build specific package
pnpm --filter @browser-flow/browser-control build

Testing

# Run all tests
pnpm test

# Run tests for specific package
pnpm --filter @browser-flow/browser-control test

Contributing

We welcome contributions! Please see our Contributing Guide for details.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Roadmap

  • Enhanced AI model support
  • More browser automation features
  • Performance optimizations
  • Additional workflow templates

About

The TypeScript version of The AI Browser Automation Framework
