A comprehensive AI-powered web browser with autonomous navigation, conversational interface, vision capabilities, and programmatic API control. Built with Claude AI, Electron, and Playwright.
- Navigate websites automatically
- Click buttons, fill forms, and complete complex tasks
- Execute multi-step workflows autonomously
- Smart decision-making based on page content
- Content summarization
- Intelligent information extraction
- Visual page analysis with AI vision
- Context-aware assistance
- Chat with AI about current page
- Natural language navigation commands
- Ask questions and get instant answers
- Context-aware conversations
- RESTful API for browser automation
- Headless browsing capabilities
- Session management
- Perfect for automation and testing
- Node.js 18+ and npm
- Anthropic API key
# Clone the repository
git clone <your-repo-url>
cd ai-web-browser
# Install dependencies
npm install
# Install Playwright browsers
npx playwright install chromium
# Set up environment variables
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY

Edit .env file:
ANTHROPIC_API_KEY=your_api_key_here
API_PORT=3000
API_HOST=localhost
HEADLESS=false
DEFAULT_VIEWPORT_WIDTH=1280
DEFAULT_VIEWPORT_HEIGHT=720

# Build the application
npm run build
# Start the Electron app
npm start

Features in the Desktop App:
- Visual browser with AI sidebar
- Chat with AI about any webpage
- Execute autonomous tasks
- Summarize, analyze, and extract information
- Real-time screenshot updates
# Start the API server
npm run server

The API will be available at http://localhost:3000
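
The values in `.env` arrive as strings, so a small loader can turn them into typed settings and derive the API base URL from `API_HOST` and `API_PORT`. The sketch below is illustrative only: `loadConfig`, `toInt`, and `BrowserAppConfig` are hypothetical names that do not come from this project, and the defaults simply mirror the sample `.env` shown earlier.

```typescript
// Hypothetical helper (not part of this project): converts the string values
// from .env into typed settings, falling back to the sample defaults.
interface BrowserAppConfig {
  apiHost: string;
  apiPort: number;
  headless: boolean;
  viewport: { width: number; height: number };
  baseUrl: string;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer; use the fallback when the value is missing or invalid.
function toInt(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Env): BrowserAppConfig {
  const apiHost = (env["API_HOST"] ?? "").trim() || "localhost";
  const apiPort = toInt(env["API_PORT"], 3000);
  return {
    apiHost,
    apiPort,
    // Only the literal string "true" (any casing) enables headless mode.
    headless: (env["HEADLESS"] ?? "false").trim().toLowerCase() === "true",
    viewport: {
      width: toInt(env["DEFAULT_VIEWPORT_WIDTH"], 1280),
      height: toInt(env["DEFAULT_VIEWPORT_HEIGHT"], 720),
    },
    baseUrl: `http://${apiHost}:${apiPort}`,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` once the `.env` file has been loaded.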
import { AutonomousAgent } from './src/ai/autonomous-agent';

async function example() {
  // Initialize the agent
  const agent = new AutonomousAgent(process.env.ANTHROPIC_API_KEY);
  await agent.initialize();

  // Execute an autonomous task
  const result = await agent.executeTask({
    task: 'Search for latest AI news on Google and summarize the top 3 articles',
    url: 'https://google.com'
  });
  console.log(result.finalResult);
  console.log('Steps taken:', result.steps);

  // Chat about a page
  await agent.getBrowserController().navigate('https://example.com');
  const response = await agent.chat('What is this page about?');
  console.log(response);

  // Summarize content
  const summary = await agent.summarizeCurrentPage();
  console.log(summary);

  // Extract specific information
  const info = await agent.extractInformation('What are the main contact details?');
  console.log(info);

  // Analyze with vision
  const analysis = await agent.analyzeCurrentPage('What elements are visible on this page?');
  console.log(analysis);

  await agent.close();
}

POST /api/sessions
Response:
{
  "success": true,
  "sessionId": "session_123abc",
  "message": "Browser session created"
}

POST /api/sessions/:sessionId/navigate
Content-Type: application/json
{
  "url": "https://example.com"
}

POST /api/sessions/:sessionId/task
Content-Type: application/json
{
  "task": "Search for information about AI",
  "url": "https://google.com",
  "maxSteps": 20
}

POST /api/sessions/:sessionId/chat
Content-Type: application/json
{
  "message": "What is this page about?"
}

GET /api/sessions/:sessionId/summarize

POST /api/sessions/:sessionId/extract
Content-Type: application/json
{
  "query": "Extract all email addresses"
}

POST /api/sessions/:sessionId/analyze
Content-Type: application/json
{
  "query": "What products are shown on this page?"
}

GET /api/sessions/:sessionId/content

GET /api/sessions/:sessionId/screenshot
Returns base64-encoded PNG image.
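
Because the screenshot endpoint returns a base64-encoded PNG, a client usually wants to check the payload and wrap it in a data URL before displaying it. The following is a minimal sketch under assumptions: `toPngDataUrl` is a hypothetical helper, and since the response field that carries the image is not documented here, the function takes the raw base64 string directly.

```typescript
// A base64-encoded PNG always begins with this prefix: it is the encoding of
// the 8-byte PNG signature followed by the first (zero) byte of the IHDR length.
const PNG_BASE64_PREFIX = "iVBORw0KGgo";

// Hypothetical helper: validates a base64 PNG payload and returns a data URL
// that can be assigned to an <img> src.
function toPngDataUrl(base64Png: string): string {
  // Drop whitespace and line breaks that some transports insert.
  const cleaned = base64Png.replace(/\s+/g, "");
  if (!cleaned.startsWith(PNG_BASE64_PREFIX)) {
    throw new Error("Payload does not look like a base64-encoded PNG");
  }
  return `data:image/png;base64,${cleaned}`;
}
```

The prefix check is only a cheap sanity test; it does not prove the rest of the image is intact.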
DELETE /api/sessions/:sessionId

GET /api/sessions

const task = {
  task: 'Book a flight from NYC to LAX on October 15th',
  url: 'https://airline-website.com'
};
const result = await agent.executeTask(task);

const task = {
  task: 'Research the top 5 competitors in the AI browser space and create a summary',
  url: 'https://google.com'
};
const result = await agent.executeTask(task);

await agent.getBrowserController().navigate('https://news-site.com');
const articles = await agent.extractInformation(
  'Extract all article titles and their summaries'
);

const task = {
  task: 'Find the price of iPhone 15 Pro on this e-commerce site',
  url: 'https://shop.example.com'
};
const result = await agent.executeTask(task);

const task = {
  task: 'Fill out the contact form with: Name: John Doe, Email: john@example.com, Message: Hello!',
  url: 'https://example.com/contact'
};
const result = await agent.executeTask(task);

┌─────────────────────────────────────────────────────────┐
│                   Electron Desktop UI                   │
│  ┌────────────┐  ┌──────────────┐  ┌─────────────────┐  │
│  │  Browser   │  │   AI Chat    │  │  Task Executor  │  │
│  │    View    │  │  Interface   │  │                 │  │
│  └────────────┘  └──────────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                 Autonomous Agent Layer                  │
│  ┌───────────────────────────────────────────────────┐  │
│  │  • Task Planning     • Conversation Management    │  │
│  │  • Vision Analysis   • Content Extraction         │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘
           │                             │
           ▼                             ▼
┌──────────────────────┐    ┌──────────────────────────┐
│   Claude AI Engine   │    │    Browser Controller    │
│                      │    │       (Playwright)       │
│  • Vision Analysis   │    │                          │
│  • Task Planning     │    │  • Navigation            │
│  • Content Analysis  │    │  • DOM Interaction       │
│  • Summarization     │    │  • Screenshots           │
└──────────────────────┘    └──────────────────────────┘
           │                             │
           └──────────────┬──────────────┘
                          ▼
┌─────────────────────────────────────────────────────────┐
│                     REST API Server                     │
│                (Express.js - Port 3000)                 │
└─────────────────────────────────────────────────────────┘
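
The diagram suggests an observe, plan, act cycle between the AI engine and the browser controller, bounded by a step limit such as `maxSteps`. The sketch below shows one way such a loop could look. It is not the project's implementation: `Planner`, `Controller`, `Action`, `TaskOutcome`, and `runTask` are hypothetical names, and the real `claude-engine.ts` and `browser-controller.ts` may expose different methods.

```typescript
// Hypothetical types for illustration; they are assumptions, not the project's API.
type Action =
  | { kind: "navigate"; url: string }
  | { kind: "click"; selector: string }
  | { kind: "done"; result: string };

interface Planner {
  // Decide the next action from the task, the current page text, and past steps.
  nextAction(task: string, pageText: string, history: Action[]): Promise<Action>;
}

interface Controller {
  perform(action: Action): Promise<void>;
  pageText(): Promise<string>;
}

interface TaskOutcome {
  success: boolean;
  finalResult: string | null;
  steps: Action[];
}

async function runTask(
  planner: Planner,
  controller: Controller,
  task: string,
  maxSteps = 20
): Promise<TaskOutcome> {
  const steps: Action[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const observation = await controller.pageText(); // observe
    const action = await planner.nextAction(task, observation, steps); // plan
    steps.push(action);
    if (action.kind === "done") {
      return { success: true, finalResult: action.result, steps };
    }
    await controller.perform(action); // act
  }
  // Step budget exhausted before the planner declared the task finished.
  return { success: false, finalResult: null, steps };
}
```

Capping the loop keeps a confused planner from running indefinitely; a caller can retry with a larger budget when `success` is false.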
ai-web-browser/
├── src/
│   ├── ai/
│   │   ├── claude-engine.ts        # AI engine (Claude API)
│   │   ├── browser-controller.ts   # Playwright browser automation
│   │   └── autonomous-agent.ts     # Main agent orchestrator
│   ├── api/
│   │   └── server.ts               # REST API server
│   ├── main/
│   │   └── main.ts                 # Electron main process
│   ├── renderer/
│   │   ├── index.html              # UI
│   │   └── renderer.ts             # UI logic
│   └── shared/
│       └── types.ts                # Shared TypeScript types
├── dist/                           # Compiled JavaScript
├── package.json
├── tsconfig.json
└── README.md
# Install dependencies
npm install
# Development mode (watch mode)
npm run dev
# Build
npm run build
# Run tests
npm test
# Start API server
npm run server
# Start Electron app
npm start

const agent = new AutonomousAgent(apiKey);
const browserController = new BrowserController({
  headless: true,
  viewport: { width: 1920, height: 1080 },
  userAgent: 'Custom User Agent'
});

const result = await agent.executeTask({
  task: 'Your task description',
  url: 'https://starting-url.com',
  context: 'Additional context for the AI',
  maxSteps: 30 // Maximum number of automation steps
});

Contributions are welcome! Please feel free to submit a Pull Request.
License: MIT
This is an AI-powered browser automation tool. Please ensure you:
- Comply with websites' Terms of Service
- Respect robots.txt files
- Use responsibly and ethically
- Don't use for malicious purposes
- Be mindful of rate limiting
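
One way to stay mindful of rate limiting is to enforce a minimum gap between requests to the same host. The sketch below is an assumption about how that could be done, not something this project ships: `HostThrottle` and `reserve` are hypothetical names, and the caller is expected to sleep for the returned delay before navigating.

```typescript
// Hypothetical helper: tracks the last scheduled request per host and tells
// the caller how long to wait before the next one.
class HostThrottle {
  private readonly minGapMs: number;
  private readonly scheduledAt = new Map<string, number>();

  constructor(minGapMs: number) {
    this.minGapMs = minGapMs;
  }

  // Returns the delay in milliseconds before `host` may be contacted, and
  // reserves that slot so the following call is scheduled after it.
  reserve(host: string, nowMs: number): number {
    const last = this.scheduledAt.get(host);
    const earliest =
      last === undefined ? nowMs : Math.max(nowMs, last + this.minGapMs);
    this.scheduledAt.set(host, earliest);
    return earliest - nowMs;
  }
}
```

For example, with `new HostThrottle(1000)`, a second request to the same host 200 ms after the first would be told to wait 800 ms, while a request to a different host could go out immediately.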
Issue: "Browser not initialized"
- Ensure Playwright browsers are installed:
npx playwright install chromium
Issue: "API Key not found"
- Make sure you've set ANTHROPIC_API_KEY in your .env file
Issue: "Screenshot not loading"
- Check if the page has fully loaded
- Try increasing wait times in browser configuration
Issue: "Task execution timeout"
- Increase maxSteps in task configuration
- Check your internet connection
- Verify the website is accessible
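
For the "API Key not found" issue in particular, a fail-fast check at startup can give a clearer error than a failed request later on. This is a minimal sketch, assuming the key is read from the environment as shown in the `.env` example; `requireApiKey` is a hypothetical helper and not part of this project.

```typescript
// Hypothetical startup check: returns the trimmed key or throws a descriptive error.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = (env["ANTHROPIC_API_KEY"] ?? "").trim();
  // Treat the sample placeholder from the .env example as "not set".
  if (key === "" || key === "your_api_key_here") {
    throw new Error(
      "ANTHROPIC_API_KEY is missing or still set to the placeholder; add it to your .env file"
    );
  }
  return key;
}
```

In a Node.js process this could be called as `requireApiKey(process.env)` before the agent is constructed.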
Check out the examples/ directory for more usage examples:
- Web scraping
- Automated testing
- Data extraction
- Form automation
- Research workflows
Built with ❤️ using Claude AI, Electron, and Playwright