RipyD/Claude-code

🤖 AI Web Browser

An AI-powered web browser with autonomous navigation, a conversational interface, vision capabilities, and programmatic API control. Built with Claude AI, Electron, and Playwright.

✨ Features

1. Autonomous AI Browser Agent

  • Navigate websites automatically
  • Click buttons, fill forms, and complete complex tasks
  • Execute multi-step workflows autonomously
  • Smart decision-making based on page content

2. AI-Enhanced Traditional Browser

  • Content summarization
  • Intelligent information extraction
  • Visual page analysis with AI vision
  • Context-aware assistance

3. Conversational Web Browsing

  • Chat with the AI about the current page
  • Natural language navigation commands
  • Ask questions and get instant answers
  • Context-aware conversations

4. Programmatic API Control

  • RESTful API for browser automation
  • Headless browsing capabilities
  • Session management
  • Perfect for automation and testing

🚀 Quick Start

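Prerequisites

  • Node.js with npm (the commands below use npm and npx)
  • An Anthropic API key (set as ANTHROPIC_API_KEY in .env)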

Installation

# Clone the repository
git clone <your-repo-url>
cd ai-web-browser

# Install dependencies
npm install

# Install Playwright browsers
npx playwright install chromium

# Set up environment variables
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY

Configuration

Edit the .env file:

ANTHROPIC_API_KEY=your_api_key_here
API_PORT=3000
API_HOST=localhost
HEADLESS=false
DEFAULT_VIEWPORT_WIDTH=1280
DEFAULT_VIEWPORT_HEIGHT=720
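
A minimal sketch of how these settings could be read into a typed object. The names `BrowserConfig`, `loadConfig`, and `parseIntOr` are hypothetical, and the fallback values simply mirror the sample .env above; the project's own loader may work differently.

```typescript
// Hypothetical typed view of the .env settings shown above.
interface BrowserConfig {
  apiKey: string;
  apiPort: number;
  apiHost: string;
  headless: boolean;
  viewport: { width: number; height: number };
}

// Parse a positive integer, falling back when the value is missing or invalid.
function parseIntOr(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Build the config from an environment-like map (for example process.env).
function loadConfig(env: Record<string, string | undefined>): BrowserConfig {
  const apiKey = env.ANTHROPIC_API_KEY;
  if (!apiKey) {
    throw new Error("ANTHROPIC_API_KEY is not set");
  }
  return {
    apiKey,
    apiPort: parseIntOr(env.API_PORT, 3000),
    apiHost: env.API_HOST || "localhost",
    // Only the literal string "true" enables headless mode (an assumption).
    headless: env.HEADLESS === "true",
    viewport: {
      width: parseIntOr(env.DEFAULT_VIEWPORT_WIDTH, 1280),
      height: parseIntOr(env.DEFAULT_VIEWPORT_HEIGHT, 720),
    },
  };
}
```

Taking the environment as a parameter rather than reading it globally keeps the function easy to test and lets a missing key fail early with a clear message.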

💻 Usage

Desktop Application

# Build the application
npm run build

# Start the Electron app
npm start

Features in the Desktop App:

  • Visual browser with AI sidebar
  • Chat with AI about any webpage
  • Execute autonomous tasks
  • Summarize, analyze, and extract information
  • Real-time screenshot updates

API Server

# Start the API server
npm run server

With the default configuration shown above, the API is available at http://localhost:3000

Programmatic Usage

import { AutonomousAgent } from './src/ai/autonomous-agent';

async function example() {
  // Initialize the agent
  const agent = new AutonomousAgent(process.env.ANTHROPIC_API_KEY);
  await agent.initialize();

  // Execute an autonomous task
  const result = await agent.executeTask({
    task: 'Search for latest AI news on Google and summarize the top 3 articles',
    url: 'https://google.com'
  });

  console.log(result.finalResult);
  console.log('Steps taken:', result.steps);

  // Chat about a page
  await agent.getBrowserController().navigate('https://example.com');
  const response = await agent.chat('What is this page about?');
  console.log(response);

  // Summarize content
  const summary = await agent.summarizeCurrentPage();
  console.log(summary);

  // Extract specific information
  const info = await agent.extractInformation('What are the main contact details?');
  console.log(info);

  // Analyze with vision
  const analysis = await agent.analyzeCurrentPage('What elements are visible on this page?');
  console.log(analysis);

  await agent.close();
}

// Run the example and report any failure
example().catch(console.error);

📡 API Documentation

Create Session

POST /api/sessions

Response:

{
  "success": true,
  "sessionId": "session_123abc",
  "message": "Browser session created"
}
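
As a rough illustration, a client could create a session and read back the ID like this. The `HttpFn` and `HttpResponse` types and the `createSession` helper are hypothetical; only the endpoint and the response fields come from the documentation above.

```typescript
// Minimal request/response shapes so the sketch does not depend on a specific HTTP library.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type HttpFn = (
  url: string,
  init: { method: string; headers?: Record<string, string>; body?: string }
) => Promise<HttpResponse>;

// Shape of the documented "Create Session" response.
interface CreateSessionResponse {
  success: boolean;
  sessionId: string;
  message: string;
}

// POST /api/sessions and return the new session ID.
async function createSession(baseUrl: string, http: HttpFn): Promise<string> {
  const res = await http(`${baseUrl}/api/sessions`, { method: "POST" });
  if (!res.ok) {
    throw new Error(`Create session failed with HTTP ${res.status}`);
  }
  const data = (await res.json()) as Partial<CreateSessionResponse>;
  if (!data.success || typeof data.sessionId !== "string") {
    throw new Error("Unexpected response from /api/sessions");
  }
  return data.sessionId;
}
```

Where a global `fetch` exists (recent Node.js versions or a browser), it can be passed directly as the `http` argument, for example `createSession('http://localhost:3000', fetch)`.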

Navigate

POST /api/sessions/:sessionId/navigate
Content-Type: application/json

{
  "url": "https://example.com"
}

Execute Autonomous Task

POST /api/sessions/:sessionId/task
Content-Type: application/json

{
  "task": "Search for information about AI",
  "url": "https://google.com",
  "maxSteps": 20
}
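
The request above could be assembled and validated before it is sent, as in the sketch below. `TaskRequestBody`, `PreparedRequest`, and `buildTaskRequest` are hypothetical names, and treating `url` and `maxSteps` as optional is an assumption rather than documented behavior.

```typescript
// Body fields shown for POST /api/sessions/:sessionId/task.
interface TaskRequestBody {
  task: string;
  url?: string; // assumed optional
  maxSteps?: number; // assumed optional
}

interface PreparedRequest {
  method: "POST";
  path: string;
  headers: Record<string, string>;
  body: string;
}

// Validate the body and assemble the request without sending it.
function buildTaskRequest(sessionId: string, body: TaskRequestBody): PreparedRequest {
  if (body.task.trim() === "") {
    throw new Error("task must not be empty");
  }
  if (body.maxSteps !== undefined && (!Number.isInteger(body.maxSteps) || body.maxSteps < 1)) {
    throw new Error("maxSteps must be a positive integer");
  }
  return {
    method: "POST",
    // Encode the ID so unusual characters cannot alter the path.
    path: `/api/sessions/${encodeURIComponent(sessionId)}/task`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}
```

Separating request construction from sending makes the validation rules testable without a running server.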

Chat

POST /api/sessions/:sessionId/chat
Content-Type: application/json

{
  "message": "What is this page about?"
}

Summarize Page

GET /api/sessions/:sessionId/summarize

Extract Information

POST /api/sessions/:sessionId/extract
Content-Type: application/json

{
  "query": "Extract all email addresses"
}

Analyze with Vision

POST /api/sessions/:sessionId/analyze
Content-Type: application/json

{
  "query": "What products are shown on this page?"
}

Get Page Content

GET /api/sessions/:sessionId/content

Take Screenshot

GET /api/sessions/:sessionId/screenshot

Returns a base64-encoded PNG image.
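
The exact response envelope is not documented here, so the sketch below assumes the base64 string has already been pulled out of the response. It shows one way to sanity-check the payload and turn it into a data URL for display; `looksLikePng` and `toPngDataUrl` are hypothetical helpers.

```typescript
// The 8-byte PNG signature (89 50 4E 47 0D 0A 1A 0A) fixes the first
// ten characters of any base64-encoded PNG.
const PNG_BASE64_PREFIX = "iVBORw0KGg";

// Cheap check that a base64 payload starts with the PNG signature.
function looksLikePng(base64: string): boolean {
  return base64.startsWith(PNG_BASE64_PREFIX);
}

// Wrap the payload in a data URL that an <img> element can display.
function toPngDataUrl(base64: string): string {
  if (!looksLikePng(base64)) {
    throw new Error("Payload does not start with a PNG signature");
  }
  return `data:image/png;base64,${base64}`;
}
```

The prefix check only confirms the signature; it does not validate the rest of the image.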

Close Session

DELETE /api/sessions/:sessionId

List Sessions

GET /api/sessions

🎯 Use Cases

1. Web Automation

const task = {
  task: 'Book a flight from NYC to LAX on October 15th',
  url: 'https://airline-website.com'
};
const result = await agent.executeTask(task);

2. Research Assistant

const task = {
  task: 'Research the top 5 competitors in the AI browser space and create a summary',
  url: 'https://google.com'
};
const result = await agent.executeTask(task);

3. Content Extraction

await agent.getBrowserController().navigate('https://news-site.com');
const articles = await agent.extractInformation(
  'Extract all article titles and their summaries'
);

4. Price Monitoring

const task = {
  task: 'Find the price of iPhone 15 Pro on this e-commerce site',
  url: 'https://shop.example.com'
};
const result = await agent.executeTask(task);
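
For monitoring, the answer usually needs to become a number. The sketch below assumes the agent's answer is free-form text containing a US-dollar amount; `extractUsdPrice` is a hypothetical helper and not part of the agent.

```typescript
// Extract the first US-dollar amount (such as "$1,099.00") from free-form text.
// Returns null when no dollar amount is present.
function extractUsdPrice(text: string): number | null {
  const match = text.match(/\$\s?(\d[\d,]*(?:\.\d{1,2})?)/);
  if (!match) {
    return null;
  }
  // Drop thousands separators before converting to a number.
  return Number.parseFloat(match[1].replace(/,/g, ""));
}
```

A caller could pass the task's text answer to this helper and compare the number against a threshold or a previously stored value.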

5. Form Filling

const task = {
  task: 'Fill out the contact form with: Name: John Doe, Email: john@example.com, Message: Hello!',
  url: 'https://example.com/contact'
};
const result = await agent.executeTask(task);

🏗️ Architecture

┌─────────────────────────────────────────────────────────┐
│                   Electron Desktop UI                   │
│  ┌────────────┐  ┌──────────────┐  ┌─────────────────┐  │
│  │  Browser   │  │   AI Chat    │  │  Task Executor  │  │
│  │   View     │  │  Interface   │  │                 │  │
│  └────────────┘  └──────────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                 Autonomous Agent Layer                  │
│  ┌───────────────────────────────────────────────────┐  │
│  │  • Task Planning      • Conversation Management   │  │
│  │  • Vision Analysis    • Content Extraction        │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘
           │                                │
           ▼                                ▼
┌──────────────────────┐      ┌───────────────────────────┐
│   Claude AI Engine   │      │  Browser Controller       │
│                      │      │  (Playwright)             │
│  • Claude Vision     │      │                           │
│  • Task Planning     │      │  • Navigation             │
│  • Content Analysis  │      │  • DOM Interaction        │
│  • Summarization     │      │  • Screenshots            │
└──────────────────────┘      └───────────────────────────┘
           │                                │
           └─────────────────┬──────────────┘
                             ▼
┌─────────────────────────────────────────────────────────┐
│                     REST API Server                     │
│                (Express.js - Port 3000)                 │
└─────────────────────────────────────────────────────────┘

📁 Project Structure

ai-web-browser/
├── src/
│   ├── ai/
│   │   ├── claude-engine.ts      # AI engine (Claude API)
│   │   ├── browser-controller.ts # Playwright browser automation
│   │   └── autonomous-agent.ts   # Main agent orchestrator
│   ├── api/
│   │   └── server.ts             # REST API server
│   ├── main/
│   │   └── main.ts               # Electron main process
│   ├── renderer/
│   │   ├── index.html            # UI
│   │   └── renderer.ts           # UI logic
│   └── shared/
│       └── types.ts              # Shared TypeScript types
├── dist/                         # Compiled JavaScript
├── package.json
├── tsconfig.json
└── README.md

🛠️ Development

# Install dependencies
npm install

# Development mode (watch mode)
npm run dev

# Build
npm run build

# Run tests
npm test

# Start API server
npm run server

# Start Electron app
npm start

🔧 Advanced Configuration

Custom Browser Configuration

const agent = new AutonomousAgent(apiKey);
const browserController = new BrowserController({
  headless: true,
  viewport: { width: 1920, height: 1080 },
  userAgent: 'Custom User Agent'
});

Task Execution Options

const result = await agent.executeTask({
  task: 'Your task description',
  url: 'https://starting-url.com',
  context: 'Additional context for the AI',
  maxSteps: 30  // Maximum number of automation steps
});

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📝 License

MIT

⚠️ Disclaimer

This is an AI-powered browser automation tool. Please ensure you:

  • Comply with websites' Terms of Service
  • Respect robots.txt files
  • Use responsibly and ethically
  • Don't use for malicious purposes
  • Be mindful of rate limiting

🐛 Troubleshooting

Common Issues

Issue: "Browser not initialized"

  • Ensure Playwright browsers are installed: npx playwright install chromium

Issue: "API Key not found"

  • Make sure you've set ANTHROPIC_API_KEY in your .env file

Issue: "Screenshot not loading"

  • Check if the page has fully loaded
  • Try increasing wait times in browser configuration

Issue: "Task execution timeout"

  • Increase maxSteps in task configuration
  • Check your internet connection
  • Verify the website is accessible
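
When a task appears to hang, a wall-clock limit on the calling side can make the failure explicit. The sketch below is a generic promise timeout; `withTimeout` is a hypothetical helper, and it only stops waiting for the task without cancelling the underlying browser work.

```typescript
// Reject if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label = "operation"): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards
  // so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  });
}
```

A call such as `withTimeout(agent.executeTask(task), 120000, 'task')` would then reject after two minutes instead of waiting indefinitely; the two-minute figure is only an example.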

📚 Additional Resources

🎉 Examples

Check out the examples/ directory for more usage examples:

  • Web scraping
  • Automated testing
  • Data extraction
  • Form automation
  • Research workflows

Built with ❤️ using Claude AI, Electron, and Playwright
