🤖 WebRouter (Web-enable AI Agent)

A powerful AI-powered web automation tool that enables natural language interaction with web browsers. This project combines the capabilities of various Large Language Models (LLMs) accessed via OpenRouter with Playwright for sophisticated web automation and interaction.

🌟 Features

🧠 Advanced AI-powered web navigation and interaction
💬 Natural language understanding and processing
🎯 Precise web element identification and interaction
🖥️ Support for multiple browser automation features
📊 Rich visual feedback through Gradio interface
🔄 Real-time browser state observation
🎨 Beautiful and intuitive user interface

🛠️ Technical Stack

LLM Providers: Various Models via OpenRouter
AI Models: Multiple LLMs (e.g., OpenAI GPT-4o, OpenAI o3/o4, Google Gemini 2.5 Flash/Pro, DeepSeek V3/R1)
Web Automation: Playwright
User Interface: Gradio
Accessibility: Built-in support for AXTree, DOM, and screenshot analysis
Action Execution: Performs a wide range of actions, including:
- Filling forms (fill)
- Clicking elements (click, dblclick)
- Selecting options (select_option)
- Navigating between pages (goto, go_back, go_forward)
- Opening and closing tabs (new_tab, tab_close)
- Scrolling (scroll)
- Mouse interactions (mouse_move, mouse_click, mouse_drag_and_drop)
- Keyboard interactions (keyboard_type, keyboard_press)
- File uploads (upload_file)
- And more!

📋 Prerequisites

Python 3.8 or higher
OpenRouter API Key
Modern web browser

⚙️ Installation

Clone the repository:

git clone <repository-url>
cd web-agent

Install dependencies:

pip install -r requirements.txt

Install Playwright browsers:

playwright install chromium

🚀 Quick Start

Set up your OpenRouter API Key: You can either set it as an environment variable:
```
OPENROUTER_API_KEY="your-openrouter-api-key"
```
Or enter it directly into the application's UI.
Launch the application:

python gradio_app.py

Configure the agent in the UI:
- Enter your OpenRouter API Key (if not set as env variable)
- Select your preferred model from the dropdown
- Configure additional settings as needed
- Click "Initialize Agent"
Start using the agent:
- Enter a URL to navigate
- Interact with the agent using natural language
- View real-time browser feedback in the interface

💡 Usage Examples

Here are some examples of what you can do with WebRouter:

"Navigate to google.com and search for latest news"
"Fill out this contact form with my information"
"Find the best price for this product across different tabs"
"Log into my account using these credentials"

🔧 Configuration Options

Model Selection: Choose from a wide range of models available through OpenRouter (e.g., GPT-4o, Gemini 2.5 Pro/Flash, DeepSeek V3/R1)
Observation Settings:
- HTML parsing
- Accessibility Tree analysis
- Screenshot capture
Browser Options:
- Headless mode
- Custom viewport settings
- Network conditions

🏗️ Project Structure

./
├── action/               # Action handling and execution
├── agent/               # Core agent implementation
├── browser/             # Browser automation and observation
├── gradio_app.py        # Gradio UI implementation
└── requirements.txt     # Project dependencies

🔐 Security Considerations

Never store sensitive credentials in plain text
Use environment variables for sensitive configuration
Be cautious when granting web automation permissions
Review and validate all automated actions
Monitor automated sessions for security

🤝 Contributing

Contributions are welcome! Please feel free to submit pull requests. For major changes, please open an issue first to discuss what you would like to change.

Fork the repository
Create your feature branch
Commit your changes
Push to the branch
Create a Pull Request

🙏 Acknowledgments

Huge inspiration from BrowserGym works
OpenRouter team for API access
OpenAI, Google, DeepSeek, and other model providers
Playwright team for browser automation
Gradio team for the UI framework
All contributors and supporters

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🚀 Future Plans

Support for additional AI models (easily added via OpenRouter)
Enhanced multi-tab coordination
Advanced workflow automation
Improved error handling and recovery
Extended browser compatibility

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🤖 WebRouter (Web-enable AI Agent)

🌟 Features

🛠️ Technical Stack

📋 Prerequisites

⚙️ Installation

🚀 Quick Start

💡 Usage Examples

🔧 Configuration Options

🏗️ Project Structure

🔐 Security Considerations

🤝 Contributing

🙏 Acknowledgments

📝 License

🚀 Future Plans

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
action		action
agent		agent
asset		asset
browser		browser
.DS_Store		.DS_Store
LICENSE		LICENSE
README.md		README.md
gradio_app.py		gradio_app.py
requirements.txt		requirements.txt

License

hqm7/web-router

Folders and files

Latest commit

History

Repository files navigation

🤖 WebRouter (Web-enable AI Agent)

🌟 Features

🛠️ Technical Stack

📋 Prerequisites

⚙️ Installation

🚀 Quick Start

💡 Usage Examples

🔧 Configuration Options

🏗️ Project Structure

🔐 Security Considerations

🤝 Contributing

🙏 Acknowledgments

📝 License

🚀 Future Plans

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages