The Open All-in-One Multimodal AI Agent Stack connecting Cutting-edge AI Models and Agent Infra.
Your AI Operator for Web, Android, Automation & Testing.
c/ua is the Docker Container for Computer-Use AI Agents.
The most reliable AI agent framework that supports MCP.
Agent S: an open agentic framework that uses computers like a human
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
An open-source, end-to-end VLM-based GUI Agent
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
Bytebot is the container for desktop agents.
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
A framework to enable autonomous Android and computer use using any LLM (local or remote)
Desktop app powered by Claude’s computer use capability to control your computer
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
Build, evaluate and run General Multi-Agent Assistance with ease
Browser Operator - The Chromium browser with built-in Multi-Agent
Open source virtual desktops for AI agents