## Agent

This notebook demonstrates how to use Cua's Agent to run a workflow in a virtual sandbox on Apple Silicon Macs.

### Installation

In [None]:
!pip uninstall -y cua-agent

In [None]:
!pip install "cua-agent[all]"

# Or install individual agent loops:
# !pip install "cua-agent[openai]"
# !pip install "cua-agent[anthropic]"
# !pip install "cua-agent[uitars]"
# !pip install "cua-agent[omni]"

In [None]:
# To build and install from a local checkout of the repository, use this instead:
import os

os.chdir('../libs/agent')
!poetry install
!poetry build

!pip uninstall cua-agent -y
!pip install ./dist/cua_agent-0.1.0-py3-none-any.whl --force-reinstall

## Initialize a Computer Agent

Agent lets you run an agentic workflow in a virtual sandbox instance on Apple Silicon. Here's a basic example:

In [None]:
from computer import Computer, VMProviderType
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider

In [2]:
import os

# Get API keys from environment or prompt user
anthropic_key = os.getenv("ANTHROPIC_API_KEY") or input("Enter your Anthropic API key: ")
openai_key = os.getenv("OPENAI_API_KEY") or input("Enter your OpenAI API key: ")

os.environ["ANTHROPIC_API_KEY"] = anthropic_key
os.environ["OPENAI_API_KEY"] = openai_key
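The key-loading cell above echoes whatever is typed at the `input()` prompt into the notebook output. A minimal sketch of an alternative, assuming you would rather hide the typed value: the helper name `ensure_api_key` is hypothetical (it is not part of cua-agent) and relies only on the standard library's `getpass`.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_api_key(name: str, prompt: Callable[[str], str] = getpass) -> str:
    """Return the key stored in environment variable `name`.

    If the variable is unset or blank, ask for it once via `prompt`
    (getpass by default, so the typed value is not echoed) and store
    the result back into os.environ for later cells.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        key = prompt(f"Enter a value for {name}: ").strip()
    if not key:
        raise ValueError(f"{name} is required but was left empty")
    os.environ[name] = key
    return key


# Example usage (hypothetical, mirrors the cell above):
# ensure_api_key("ANTHROPIC_API_KEY")
# ensure_api_key("OPENAI_API_KEY")
```

Passing the prompt function as a parameter keeps the helper easy to exercise without a terminal, for example by supplying a lambda in place of `getpass`.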

Similar to Computer, you can either use the async context manager pattern or initialize the ComputerAgent instance directly.
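As a generic illustration of those two lifecycle styles, here is a small sketch in plain Python. `FakeSandbox` and its `start`/`stop` methods are hypothetical stand-ins invented for this example; they are not Cua classes, and the real `Computer` and `ComputerAgent` method names may differ.

```python
import asyncio


class FakeSandbox:
    """Hypothetical resource with an explicit start/stop lifecycle (not part of Cua)."""

    def __init__(self) -> None:
        self.running = False

    async def start(self) -> None:
        self.running = True

    async def stop(self) -> None:
        self.running = False

    async def __aenter__(self) -> "FakeSandbox":
        await self.start()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.stop()


async def demo() -> tuple:
    # Style 1: async context manager, cleanup happens automatically on exit.
    async with FakeSandbox() as managed:
        inside = managed.running

    # Style 2: direct initialization, the caller is responsible for cleanup.
    direct = FakeSandbox()
    await direct.start()
    try:
        during = direct.running
    finally:
        await direct.stop()

    return inside, managed.running, during, direct.running
```

The context manager form is usually the safer choice because the resource is released even if the body raises; direct initialization suits notebooks where the object has to outlive a single cell.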

Let's start by creating an agent that relies on the OpenAI API computer-use-preview model.

In [None]:
import logging
from pathlib import Path

computer = Computer(verbosity=logging.INFO, provider_type=VMProviderType.LUME)

# Create agent with the OpenAI loop and provider
agent = ComputerAgent(
        computer=computer,
        loop=AgentLoop.OPENAI,
        model=LLM(provider=LLMProvider.OPENAI),
        save_trajectory=True,
        trajectory_dir=str(Path("trajectories")),
        only_n_most_recent_images=3,
        verbosity=logging.INFO
    )

tasks = [
    "Look for a repository named trycua/cua on GitHub.",
    "Check the open issues, open the most recent one and read it.",
    "Clone the repository in users/lume/projects if it doesn't exist yet.",
    "Open the repository with an app named Cursor (on the dock, black background and white cube icon).",
    "From Cursor, open Composer if not already open.",
    "Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.",
]

for i, task in enumerate(tasks):
    print(f"\nExecuting task {i+1}/{len(tasks)}: {task}")
    async for result in agent.run(task):
        # print(result)
        pass

    print(f"\n✅ Task {i+1}/{len(tasks)} completed: {task}")
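The loop above relies on the notebook's support for top-level `async for`. Outside a notebook, the same pattern needs an event loop. The sketch below shows one way to structure it, using a hypothetical `StubAgent` in place of `ComputerAgent` so that it runs without a sandbox or API keys; the only assumption carried over from the notebook is that `agent.run(task)` is an async generator.

```python
import asyncio
from typing import AsyncIterator, List


class StubAgent:
    """Hypothetical stand-in for ComputerAgent: yields two results per task."""

    async def run(self, task: str) -> AsyncIterator[dict]:
        for step in range(2):
            await asyncio.sleep(0)  # give control back to the event loop
            yield {"task": task, "step": step}


async def run_tasks(agent, tasks: List[str]) -> List[int]:
    """Run tasks sequentially; return how many results each task produced."""
    counts = []
    for i, task in enumerate(tasks, start=1):
        print(f"Executing task {i}/{len(tasks)}: {task}")
        produced = 0
        async for _result in agent.run(task):
            produced += 1
        counts.append(produced)
        print(f"Task {i}/{len(tasks)} completed")
    return counts


if __name__ == "__main__":
    # In a plain script, asyncio.run creates the event loop.
    asyncio.run(run_tasks(StubAgent(), ["first task", "second task"]))
```

Inside a notebook an event loop is already running, so `asyncio.run` raises a `RuntimeError` there; use `await run_tasks(agent, tasks)` in a cell instead.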

Or using the Omni Agent Loop:

In [None]:
import logging
from pathlib import Path
from computer import Computer
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider

computer = Computer(verbosity=logging.INFO)

# Create agent with the Omni loop; pick one of the model providers below
agent = ComputerAgent(
        computer=computer,
        loop=AgentLoop.OMNI,
        # model=LLM(provider=LLMProvider.ANTHROPIC, name="claude-3-7-sonnet-20250219"),
        # model=LLM(provider=LLMProvider.OPENAI, name="gpt-4.5-preview"),
        model=LLM(provider=LLMProvider.OLLAMA, name="gemma3:12b-it-q4_K_M"),
        save_trajectory=True,
        trajectory_dir=str(Path("trajectories")),
        only_n_most_recent_images=3,
        verbosity=logging.INFO
    )

tasks = [
    "Look for a repository named trycua/cua on GitHub.",
    "Check the open issues, open the most recent one and read it.",
    "Clone the repository in users/lume/projects if it doesn't exist yet.",
    "Open the repository with an app named Cursor (on the dock, black background and white cube icon).",
    "From Cursor, open Composer if not already open.",
    "Focus on the Composer text area, then write and submit a task to help resolve the GitHub issue.",
]

for i, task in enumerate(tasks):
    print(f"\nExecuting task {i+1}/{len(tasks)}: {task}")
    async for result in agent.run(task):
        # print(result)
        pass

    print(f"\n✅ Task {i+1}/{len(tasks)} completed: {task}")

## Using the Gradio UI

The agent includes a Gradio-based user interface for easy interaction. To use it:

In [4]:
import os

# Get API keys from environment or prompt user
anthropic_key = os.getenv("ANTHROPIC_API_KEY") or input("Enter your Anthropic API key: ")
openai_key = os.getenv("OPENAI_API_KEY") or input("Enter your OpenAI API key: ")

os.environ["ANTHROPIC_API_KEY"] = anthropic_key
os.environ["OPENAI_API_KEY"] = openai_key

In [None]:
from agent.ui.gradio.app import create_gradio_ui

app = create_gradio_ui()
app.launch(share=False)