
Automate any computer task with natural language
Website • Docs • Discord • Twitter
Bytebot is a self-hosted AI desktop agent that transforms how you interact with computers. By combining powerful AI with a containerized Linux desktop, Bytebot can perform complex computer tasks. Think of it as your virtual employee that can actually use a computer: clicking, typing, browsing, and completing workflows just like a human would.
- Complete Privacy: Your tasks and data never leave your infrastructure
- Full Control: Customize the desktop environment and installed applications
- No Usage Limits: Use your own LLM API keys without platform restrictions
- Secure Isolation: Each desktop runs in its own container, isolated from your host
- Click the Deploy Now button in the Bytebot Railway template.
- Paste your ANTHROPIC_API_KEY into the single required environment variable.
- Press Deploy. Railway will spin up the Desktop, Agent, UI, and Postgres services using pre-built container images, connect them via private networking, and expose only the UI publicly.
- In about two minutes your agent will be live at your project's public URL.
For an in-depth guide see here.
- Docker ≥ 20.10
- Docker Compose
- AI API key from one of these providers:
- Anthropic (get one here) - Claude models
- OpenAI (get one here) - GPT models
- Google (get one here) - Gemini models
- Clone and configure:
git clone https://github.com/bytebot-ai/bytebot.git
cd bytebot
# Configure your AI provider (choose one):
echo "ANTHROPIC_API_KEY=your_api_key_here" > docker/.env # For Claude
# echo "OPENAI_API_KEY=your_api_key_here" > docker/.env # For OpenAI
# echo "GOOGLE_API_KEY=your_api_key_here" > docker/.env # For Gemini
- Start the agent stack:
docker-compose -f docker/docker-compose.yml up -d
- Open the chat interface:
http://localhost:9992
That's it! Start chatting with your AI desktop agent. Watch it work in real-time through the embedded desktop viewer.
- "Research the top 5 competitors for [product] and create a comparison spreadsheet"
- "Fill out this web form with the data from my CSV file"
- "Check my email and summarize important messages"
- "Download all PDFs from this website and organize them by date"
- "Monitor this webpage and alert me when the price drops below $50"
Bytebot supports multiple AI providers to power your desktop agent:
- Anthropic Claude: Claude 3.5 Sonnet (default) - Best for complex reasoning and visual tasks
- OpenAI: GPT-4, GPT-4o - Excellent for general automation tasks
- Google Gemini: Gemini 1.5 Pro, Flash - Fast and efficient for routine tasks
Choose the model that best fits your needs and budget. Simply set the appropriate API key in your environment configuration.
Bytebot consists of four main components working together:
┌─────────────────────────────────────────────────────────────┐
│                        Your Browser                         │
│                    http://localhost:9992                    │
└─────────────────────┬───────────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────────┐
│                    Bytebot UI (Next.js)                     │
│   • Task interface                                          │
│   • Desktop viewer (VNC)                                    │
│   • Task management                                         │
└─────────────────────┬───────────────────────────────────────┘
                      │ WebSocket
┌─────────────────────▼───────────────────────────────────────┐
│                   Bytebot Agent (NestJS)                    │
│   • Multi-LLM integration (Claude/GPT/Gemini)               │
│   • Task orchestration                                      │
│   • Action planning                                         │
└─────────────────────┬───────────────────────────────────────┘
                      │ REST API
┌─────────────────────▼───────────────────────────────────────┐
│               Bytebot Desktop (Ubuntu + XFCE)               │
│   • Full Linux desktop                                      │
│   • Browser, email, office apps                             │
│   • Automation daemon (bytebotd)                            │
└─────────────────────────────────────────────────────────────┘
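The diagram, together with the URLs used elsewhere in this README, implies a port layout: the UI on 9992, the task API on 9991, and the desktop daemon on 9990. A minimal sketch of that layout follows. The `SERVICES` mapping and the `service_url` helper are hypothetical names, and assigning port 9991 to the agent is inferred from the task API example rather than stated by the diagram.

```python
# Hypothetical helper: ports are taken from the URLs used in this README.
SERVICES = {
    "ui": 9992,       # chat interface and desktop viewer
    "agent": 9991,    # task API (inferred from the /tasks example)
    "desktop": 9990,  # computer-use and MCP endpoints
}


def service_url(name: str, path: str = "", host: str = "localhost") -> str:
    """Build a URL for one of the services listed above."""
    port = SERVICES[name]  # raises KeyError for unknown service names
    return f"http://{host}:{port}/{path.lstrip('/')}"


print(service_url("agent", "tasks"))
print(service_url("desktop", "/computer-use"))
```

Keeping the ports in one place makes it easier to adjust scripts if you remap them in your own Compose file.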
- Natural Language Control: Just describe what you want done
- Visual Feedback: Watch the AI work in real-time
- Task History: Review and replay previous automations
- Browser-Based: No software to install on your machine
- REST API: Integrate desktop automation into your applications
- Extensible: Add custom tools and applications to the desktop
- Scriptable: Create complex workflows with the automation API
- Observable: Full logging and debugging capabilities
- Container-Based: Easy deployment with Docker
- Resource Efficient: Minimal overhead compared to VMs
- Network Isolated: Secure by default with customizable access
- Scalable: Run multiple instances for team use
- 2 CPU cores
- 4GB RAM
- 10GB storage
- Docker & Docker Compose
- 4+ CPU cores
- 8GB+ RAM
- 20GB+ storage
- Linux host OS for best performance
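The minimum figures above can be checked mechanically before starting the stack. The following is a rough sketch and not part of Bytebot: `shortfalls` and `probe_host` are hypothetical names, and the RAM probe relies on `os.sysconf`, which is available on Linux but not on every platform.

```python
import os
import shutil

# Minimum figures from the list above; the recommended values are higher.
MIN_CPUS = 2
MIN_RAM_GB = 4
MIN_DISK_GB = 10


def shortfalls(cpus: int, ram_gb: float, disk_gb: float) -> list[str]:
    """Return a readable list of minimum requirements the host does not meet."""
    problems = []
    if cpus < MIN_CPUS:
        problems.append(f"CPU cores: {cpus} < {MIN_CPUS}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f}GB < {MIN_RAM_GB}GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"Free disk: {disk_gb:.1f}GB < {MIN_DISK_GB}GB")
    return problems


def probe_host(path: str = ".") -> tuple[int, float, float]:
    """Measure CPU count, physical RAM, and free disk (RAM probe: Linux)."""
    cpus = os.cpu_count() or 1
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk_bytes = shutil.disk_usage(path).free
    return cpus, ram_bytes / 1024**3, disk_bytes / 1024**3
```

Running `shortfalls(*probe_host())` on the Docker host returns an empty list when all three minimums are met. Reported RAM is usually slightly below the nominal figure, so treat a narrow miss as a warning rather than a hard failure.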
Create docker/.env:
# Required - Choose one of these AI providers:
ANTHROPIC_API_KEY=sk-ant-... # For Claude models
# OPENAI_API_KEY=sk-... # For OpenAI models
# GOOGLE_API_KEY=... # For Google Gemini models
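Before starting the stack, it can help to confirm that the file actually sets a provider key. The sketch below is illustrative only: the key names come from the example above, while `parse_env` and `configured_providers` are hypothetical helpers, and the inline-comment handling is a simplification that may not match how Docker Compose parses the file.

```python
# Key names come from the example above; everything else is illustrative.
PROVIDER_KEYS = {
    "ANTHROPIC_API_KEY": "Claude",
    "OPENAI_API_KEY": "OpenAI",
    "GOOGLE_API_KEY": "Gemini",
}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "sk-... # For Claude".
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


def configured_providers(text: str) -> list[str]:
    """Return the providers whose API key is set to a non-empty value."""
    env = parse_env(text)
    return [name for key, name in PROVIDER_KEYS.items() if env.get(key)]


sample = """
# Required - Choose one of these AI providers:
ANTHROPIC_API_KEY=sk-ant-example # For Claude models
# OPENAI_API_KEY=sk-example
"""
print(configured_providers(sample))  # ['Claude']
```

To check a real file, pass it the contents of docker/.env, for example `configured_providers(open("docker/.env").read())`.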
Add applications or configurations by extending the Dockerfile:
# docker/desktop/Dockerfile.custom
FROM bytebot/desktop:latest
# Install additional software
RUN apt-get update && apt-get install -y \
libreoffice \
gimp \
your-custom-app
# Copy custom configs
COPY configs/.config /home/user/.config
- API Keys: Keep your AI provider API keys secure and never commit them
- Network: By default, services are only accessible from localhost
- VNC: Change the default VNC password for production use
- Updates: Regularly update the container images for security patches
- Email management and responses
- Calendar scheduling
- Document organization
- Web research and data collection
- Form filling and data entry
- Report generation
- Competitive analysis
- Customer support tasks
- UI testing automation
- Cross-browser testing
- API integration testing
- Documentation screenshots
# View logs
docker-compose -f docker/docker-compose.yml logs -f
# Stop the stack
docker-compose -f docker/docker-compose.yml down
# Update to the latest images
docker-compose -f docker/docker-compose.yml pull
docker-compose -f docker/docker-compose.yml up -d
# Reset everything, including volumes
docker-compose -f docker/docker-compose.yml down -v
Control Bytebot via REST API:
import requests

# Create a task
response = requests.post(
    'http://localhost:9991/tasks',
    json={'description': 'Search for flights from NYC to London next month'},
    timeout=30,
)
response.raise_for_status()  # fail loudly on HTTP errors
task_id = response.json()['id']

# Check task status
status = requests.get(f'http://localhost:9991/tasks/{task_id}', timeout=30)
status.raise_for_status()
print(status.json())
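A single status check rarely suffices, since tasks take time to run. One way to wait for completion is to poll the task endpoint, sketched below. This is not Bytebot's client: `wait_for_task` is a hypothetical helper, and both the `status` field name and the terminal status values are assumptions about the response shape that should be checked against the API documentation.

```python
import time
from typing import Any, Callable

# Hypothetical status values; check the API documentation for the real ones.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_task(
    fetch: Callable[[], dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call fetch() until the task reports a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch()
        # The "status" field name is an assumption about the response shape.
        if str(task.get("status", "")).lower() in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval)
```

With `requests`, the `fetch` argument could be `lambda: requests.get(f'http://localhost:9991/tasks/{task_id}', timeout=30).json()`. Passing the fetcher in keeps the polling logic independent of the HTTP library and easy to exercise with canned responses.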
Use the computer control API for precise automation. Example requests appear after the MCP configuration below.
The core container also exposes an MCP endpoint. Connect your MCP client to http://localhost:9990/mcp to invoke these tools over SSE.
{
  "mcpServers": {
    "bytebot": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "http://127.0.0.1:9990/mcp",
        "--transport",
        "http-first"
      ]
    }
  }
}
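If your MCP client already has a configuration with other servers, the entry above needs to be merged in rather than pasted over the file. A small sketch of that merge follows; `add_bytebot_server` is a hypothetical helper, and the `other` server in the usage lines is a made-up placeholder.

```python
import json

# Server entry copied from the configuration above.
BYTEBOT_SERVER = {
    "command": "npx",
    "args": ["mcp-remote", "http://127.0.0.1:9990/mcp", "--transport", "http-first"],
}


def add_bytebot_server(config: dict, name: str = "bytebot") -> dict:
    """Return a copy of an MCP client config with the Bytebot entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is untouched
    servers[name] = BYTEBOT_SERVER
    merged["mcpServers"] = servers
    return merged


# "other" stands in for whatever servers the client already has configured.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_bytebot_server(existing), indent=2))
```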
// Take screenshot
POST http://localhost:9990/computer-use
{
"action": "screenshot"
}
// Click at coordinates
POST http://localhost:9990/computer-use
{
"action": "click_mouse",
"coordinate": [500, 300]
}
// Type text
POST http://localhost:9990/computer-use
{
"action": "type_text",
"text": "Hello, Bytebot!"
}
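The three requests above share one shape: a POST to the same endpoint with an `action` field plus action-specific parameters. A minimal Python sketch of that pattern follows, using only the standard library. The `build_request` helper is hypothetical, the `Content-Type` header is an assumption about what the endpoint expects, and the payload fields are copied from the examples above.

```python
import json
import urllib.request

COMPUTER_USE_URL = "http://localhost:9990/computer-use"


def build_request(action: str, **params) -> urllib.request.Request:
    """Build (but do not send) a POST request for one computer-use action."""
    body = json.dumps({"action": action, **params}).encode("utf-8")
    return urllib.request.Request(
        COMPUTER_USE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The three actions shown above, expressed as requests.
requests_to_send = [
    build_request("screenshot"),
    build_request("click_mouse", coordinate=[500, 300]),
    build_request("type_text", text="Hello, Bytebot!"),
]

for req in requests_to_send:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To send for real: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it lets you inspect or log the exact payload before it reaches the desktop.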
We welcome contributions! Whether it's bug fixes, new features, or documentation improvements:
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Discord: Join our community server for help and discussions
- Documentation: Comprehensive guides at docs.bytebot.ai
- Issues: Report bugs on GitHub
Built with amazing open source projects:
- nutjs - Desktop automation framework
- Anthropic Claude - AI reasoning engine
- OpenAI - GPT models for automation
- Google AI - Gemini models for efficient tasks
- noVNC - Browser-based VNC client
- Inspired by Anthropic's computer-use demo
Apache-2.0 license © 2025 Tantl Labs, Inc.
Start with the Quick Start guide above or dive into the full documentation.