A modern browser automation integration toolkit based on Stagehand, designed for developers and automation enthusiasts.
Browser-Flow is a modular browser automation solution that breaks complex browser operations into independent, composable components. Each component can be used on its own or combined with the others.
- browser-common - Base library that provides common browser operation interfaces and serves as the shared dependency layer. It currently covers mainly LLM calls and logging.
- browser-control - Browser operation engine driven by large language models. It wraps page operations such as clicks, key presses, text input, and scrolling.
- browser-flow - Customizable workflow orchestration system that handles action orchestration, operation flow planning, and result aggregation.
- browser-wrapper - Browser wrapper layer that encapsulates web page operations, keeps browser operation stable at runtime, and ensures the A11y Tree is retrieved successfully.
- Modular Architecture: Each component can be used independently or combined
- AI-Powered: Intelligent browser operations using large language models
- Highly Customizable: Flexible workflow orchestration system
- Accessibility Support: Built-in A11y Tree support for better automation
- Developer Friendly: TypeScript support with comprehensive type definitions
- Monorepo Structure: Easy to maintain and extend
- Node.js 18+
- pnpm (recommended) or npm
Browser-Flow currently supports the DeepSeek and Qwen large language models. Set the following environment variables:
AGENTQL_API_KEY="your_key_here"
DASHSCOPE_API_KEY="your_key_here"
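A missing key usually surfaces only when the first LLM call fails, so it can help to check the environment at startup. The sketch below is a minimal example of such a check. `missingEnvVars` and `assertEnvConfigured` are hypothetical helpers, not part of Browser-Flow, and the sketch assumes both variables listed above are required; trim the list if you only use one provider.

```typescript
// Hypothetical helper (not part of Browser-Flow's API):
// returns the names of required variables that are unset or blank.
function missingEnvVars(
  required: readonly string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Variable names taken from the list above; assumed to both be required.
const REQUIRED_KEYS = ["AGENTQL_API_KEY", "DASHSCOPE_API_KEY"] as const;

// Throws a single error naming every missing variable.
function assertEnvConfigured(env: Record<string, string | undefined>): void {
  const missing = missingEnvVars(REQUIRED_KEYS, env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js entry point this would typically be called as `assertEnvConfigured(process.env)` before any workflow is created.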
# Clone the repository
git clone <repository-url>
cd browser-flow-TS
# Install dependencies
pnpm install
# Build all packages
pnpm build
import { BrowserFlow } from '@browser-flow/browser-flow';

async function example() {
  const workflow = new BrowserFlow({
    maxSteps: 10,
    verbose: true
  });

  const result = await workflow.execute("Open https://example.com and click login");

  if (result.success) {
    console.log("Workflow completed successfully!");
    console.log(`Steps executed: ${result.stepsExecuted}`);
    console.log(`Execution time: ${result.executionTime}ms`);
  } else {
    console.error("Workflow failed:", result.error);
  }

  await workflow.dispose();
}
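In the example above, `dispose()` is skipped if `execute()` throws, which could leave a browser instance running. One way to guard against that is a small wrapper built on `try`/`finally`. This is a sketch under stated assumptions: `withWorkflow` and `DisposableWorkflow` are hypothetical names, and the only thing assumed about the workflow object is the `dispose()` method shown above.

```typescript
// Minimal shape this helper relies on; the real BrowserFlow class may expose more.
interface DisposableWorkflow {
  dispose(): Promise<void>;
}

// Hypothetical helper: runs `task`, then always disposes the workflow,
// whether `task` resolved or threw.
async function withWorkflow<W extends DisposableWorkflow, T>(
  workflow: W,
  task: (workflow: W) => Promise<T>,
): Promise<T> {
  try {
    return await task(workflow);
  } finally {
    await workflow.dispose();
  }
}
```

With the package installed, usage might look like `await withWorkflow(new BrowserFlow({ maxSteps: 10 }), (w) => w.execute("Open https://example.com"))`.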
import { AgentHand } from '@browser-flow/browser-control';

async function example() {
  const hand = new AgentHand("");
  await hand.init();

  await hand.goto("https://example.com");
  await hand.act("Click the login button");

  await hand.close();
}
import { BrowserFlow } from '@browser-flow/browser-flow';

// Sequential execution
async function sequentialExample() {
  const workflow = new BrowserFlow();

  const instructions = [
    "Open https://example.com",
    "Click on the login button",
    "Enter username and password"
  ];

  const results = await workflow.executeSequence(instructions);

  results.forEach((result, i) => {
    console.log(`Workflow ${i + 1}: ${result.success ? 'Success' : 'Failed'}`);
  });

  await workflow.dispose();
}

// Parallel execution
async function parallelExample() {
  const workflow = new BrowserFlow();

  const instructions = [
    "Open https://google.com and search for 'TypeScript'",
    "Open https://github.com and search for 'browser automation'"
  ];

  const results = await workflow.executeParallel(instructions);
  console.log(`Executed ${results.length} workflows in parallel`);

  await workflow.dispose();
}
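Both `executeSequence` and `executeParallel` return an array of results, so a batch run is easier to inspect with a small summary step. The sketch below is one possible shape for that. `summarizeResults`, `WorkflowResultLike`, and `BatchSummary` are hypothetical names; the result shape is inferred from the `success` and `error` fields used in the examples above, and the helper assumes results come back in the same order as the instructions.

```typescript
// Minimal result shape, inferred from the fields used in the examples above.
interface WorkflowResultLike {
  success: boolean;
  error?: unknown;
}

interface BatchSummary {
  total: number;
  succeeded: number;
  failures: { instruction: string; error: unknown }[];
}

// Hypothetical helper: pairs each result with its instruction by index
// (assumes results are returned in instruction order).
function summarizeResults(
  instructions: readonly string[],
  results: readonly WorkflowResultLike[],
): BatchSummary {
  const failures: BatchSummary["failures"] = [];
  results.forEach((result, i) => {
    if (!result.success) {
      // Fall back to a positional label if no matching instruction exists.
      failures.push({ instruction: instructions[i] ?? `#${i + 1}`, error: result.error });
    }
  });
  return {
    total: results.length,
    succeeded: results.length - failures.length,
    failures,
  };
}
```

Calling `summarizeResults(instructions, results)` after either execution mode gives a single object that can be logged or used to decide whether to retry the failed instructions.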
packages/
├── browser-common/   # Base utilities and shared interfaces
├── browser-control/  # AI-powered browser automation
├── browser-flow/     # Workflow orchestration system
└── browser-wrapper/  # Browser wrapper with A11y support
# Build all packages
pnpm build
# Build specific package
pnpm --filter @browser-flow/browser-control build
# Run all tests
pnpm test
# Run tests for specific package
pnpm --filter @browser-flow/browser-control test
We welcome contributions! Please see our Contributing Guide for details.
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation
- Issue Tracker
- Discussions
- Enhanced AI model support
- More browser automation features
- Performance optimizations
- Additional workflow templates