trycua/cua

cua ("koo-ah") is Docker for Computer-Use Agents - it enables AI agents to control full operating systems in virtual containers and deploy them locally or to the cloud.

With the Computer SDK, you can:

With the Agent SDK, you can:

  • run computer-use models with a consistent output format
  • run composed agents using UI grounding models and any LLM
  • use any liteLLM provider (openai/, openrouter/, etc.) or our included local providers (huggingface-local/, mlx/)
  • quickly evaluate new UI agent models and UI grounding models
    • anthropic/claude-opus-4-1-20250805 (using Computer-Use Models)
    • openai/computer-use-preview
    • openrouter/z-ai/glm-4.5v
    • huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B
    • omniparser+{any LLM} (using Composed Agents)
    • huggingface-local/HelloKKMe/GTA1-7B+{any LLM}
    • huggingface/HelloKKMe/GTA1-32B+{any LLM}
    • vllm_hosted/HelloKKMe/GTA1-72B+{any LLM}
    • human/human (using Human-in-the-Loop)
  • benchmark on OSWorld-Verified, SheetBench-V2, and more with a single line of code using HUD (Notebook)
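
The model strings above follow two visible patterns: a single `provider/model` path, or a composed `grounding+LLM` pair joined by `+`. The sketch below shows one way to split such strings for logging or configuration checks. `ModelSpec`, `parse_model_string`, and `split_provider` are hypothetical names made up for this illustration; they are not part of the cua API, and the library's own parsing may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ModelSpec:
    """Illustrative container, not a cua type."""
    grounding: Optional[str]  # UI grounding model, or None for a single model
    model: str                # the computer-use model or the LLM half of a pair


def parse_model_string(model: str) -> ModelSpec:
    """Split 'grounding+llm' into its two halves; pass single models through."""
    if "+" in model:
        grounding, llm = model.split("+", 1)
        return ModelSpec(grounding=grounding, model=llm)
    return ModelSpec(grounding=None, model=model)


def split_provider(model: str) -> Tuple[Optional[str], str]:
    """Separate the leading provider prefix (text before the first '/')."""
    provider, sep, name = model.partition("/")
    if not sep:
        # No slash at all, e.g. 'omniparser'
        return None, model
    return provider, name
```

For example, `parse_model_string("huggingface-local/HelloKKMe/GTA1-7B+anthropic/claude-3-5-sonnet-20241022")` yields the GTA1 path as `grounding` and the Anthropic path as `model`, while `split_provider("openrouter/z-ai/glm-4.5v")` keeps everything after the first slash as the model name.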

Missing a model? Raise a feature request or contribute!


Quick Start


Usage (Docs)

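pip install "cua-agent[all]"

import asyncio

from agent import ComputerAgent
from computer import Computer


async def main():
    # The agent needs a running computer to act on; this reuses the cloud
    # container settings shown in the Computer section below.
    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-container-name",
        api_key="your-api-key",
    ) as computer:
        agent = ComputerAgent(
            model="anthropic/claude-3-5-sonnet-20241022",
            tools=[computer],
            max_trajectory_budget=5.0,
        )

        messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]

        # agent.run() is an async generator; print the text of each message item
        async for result in agent.run(messages):
            for item in result["output"]:
                if item["type"] == "message":
                    print(item["content"][0]["text"])


asyncio.run(main())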

Output format (OpenAI Agent Responses Format):

{ 
  "output": [
    # user input
    {
        "role": "user",
        "content": "go to trycua on gh"
    },
    # first agent turn adds the model output to the history
    {
        "summary": [
            {
                "text": "Searching Firefox for Trycua GitHub",
                "type": "summary_text"
            }
        ],
        "type": "reasoning"
    },
    {
        "action": {
            "text": "Trycua GitHub",
            "type": "type"
        },
        "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
        "status": "completed",
        "type": "computer_call"
    },
    # second agent turn adds the computer output to the history
    {
        "type": "computer_call_output",
        "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
        "output": {
            "type": "input_image",
            "image_url": "data:image/png;base64,..."
        }
    },
    # final agent turn adds the agent output text to the history
    {
        "type": "message",
        "role": "assistant",
        "content": [
          {
            "text": "Success! The Trycua GitHub page has been opened.",
            "type": "output_text"
          }
        ]
    }
  ], 
  "usage": {
      "prompt_tokens": 150,
      "completion_tokens": 75,
      "total_tokens": 225,
      "response_cost": 0.01
  }
}
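
As a rough illustration of consuming this structure, the sketch below walks the `output` list, pairs each `computer_call` with its `computer_call_output` via `call_id`, and sums `response_cost` across results. It relies only on the field names in the example above; `summarize_output` and `total_cost` are hypothetical helpers, not functions provided by cua.

```python
from typing import Any, Dict, Iterable, List


def summarize_output(result: Dict[str, Any]) -> List[str]:
    """Return one human-readable line per reasoning, action, or assistant item."""
    items = result.get("output", [])
    # Index screenshots by call_id so each action can report whether one came back
    outputs_by_call = {
        item["call_id"]: item
        for item in items
        if item.get("type") == "computer_call_output"
    }
    lines: List[str] = []
    for item in items:
        kind = item.get("type")  # user input items carry no "type" key
        if kind == "reasoning":
            for part in item.get("summary", []):
                lines.append("reasoning: " + part["text"])
        elif kind == "computer_call":
            answered = item["call_id"] in outputs_by_call
            lines.append(
                f"action: {item['action']['type']} (screenshot returned: {answered})"
            )
        elif kind == "message" and item.get("role") == "assistant":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    lines.append("assistant: " + part["text"])
    return lines


def total_cost(results: Iterable[Dict[str, Any]]) -> float:
    """Sum usage.response_cost over results, treating missing values as 0."""
    return sum(r.get("usage", {}).get("response_cost", 0.0) for r in results)
```

Calling `summarize_output(result)` inside the `async for` loop gives a compact trace of each turn, and keeping a running `total_cost` is one way to see how spending compares with the `max_trajectory_budget` value, assuming that budget is expressed in the same units as `response_cost`.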

Computer (Docs)

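pip install "cua-computer[all]"

import asyncio

from computer import Computer


async def main():
    # Connect to a cloud-hosted Linux container; replace the placeholders
    # with your own container name and API key.
    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-container-name",
        api_key="your-api-key",
    ) as computer:
        # Take a screenshot
        screenshot = await computer.interface.screenshot()

        # Click at (100, 100), then type into the focused element
        await computer.interface.left_click(100, 100)
        await computer.interface.type("Hello!")


asyncio.run(main())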

Resources

Modules

  • Lume: VM management for macOS/Linux using Apple's Virtualization.framework
    • Install: curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash
  • Lumier: Docker interface for macOS and Linux VMs
    • Install: docker pull trycua/lumier:latest
  • Computer (Python): Python interface for controlling virtual machines
    • Install: pip install "cua-computer[all]"
  • Computer (TypeScript): TypeScript interface for controlling virtual machines
    • Install: npm install @trycua/computer
  • Agent: AI agent framework for automating tasks
    • Install: pip install "cua-agent[all]"
  • MCP Server: MCP server for using Cua with Claude Desktop
    • Install: pip install cua-mcp-server
  • SOM: Set-of-Mark library for Agent
    • Install: pip install cua-som
  • Computer Server: Server component for Computer
    • Install: pip install cua-computer-server
  • Core (Python): Python core utilities
    • Install: pip install cua-core
  • Core (TypeScript): TypeScript core utilities
    • Install: npm install @trycua/core

Community

Join our Discord community to discuss ideas, get assistance, or share your demos!

License

Cua is open-sourced under the MIT License - see the LICENSE file for details.

Microsoft's OmniParser, which is used in this project, is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0) - see the OmniParser LICENSE file for details.

Contributing

We welcome contributions to Cua! Please refer to our Contributing Guidelines for details.

Trademarks

Apple, macOS, and Apple Silicon are trademarks of Apple Inc. Ubuntu and Canonical are registered trademarks of Canonical Ltd. Microsoft is a registered trademark of Microsoft Corporation. This project is not affiliated with, endorsed by, or sponsored by Apple Inc., Canonical Ltd., or Microsoft Corporation.

Stargazers

Thank you to all our supporters!

Contributors

  • f-trycua 💻
  • Pedro Piñera Buendía 💻
  • Amit Kumar 💻
  • Dung Duc Huynh (Kaka) 💻
  • Zayd Krunz 💻
  • Prashant Raj 💻
  • Leland Takamine 💻
  • ddupont 💻
  • Ethan Gutierrez 💻
  • Ricter Zheng 💻
  • Rahul Karajgikar 💻
  • trospix 💻
  • Evan smith 💻