Tell your computer what to do. Holo gets it done. holo-desktop-cli is the open-source client for Holo3, H Company's open-weight vision-language model. It launches the agent and fronts it as a CLI, MCP, ACP, and A2A surface. Use the hosted API, or run everything on your own machine for full privacy.
Docs: The HoloDesktop CLI docs cover setup guides, run examples, debugging advice, integration guides, and the full CLI reference.
Holo is three parts:
- This repo,
holo-desktop-cli, is the Apache-2.0-licensed client: the CLI plus the MCP / ACP / A2A surfaces. It launches the agent and drives it over loopback. - The agent runs inside H Company's
hai-agent-runtimebinary. That binary is closed-source and downloads itself on first run (sha256-verified). - The contract between them is the open
hai-agent-apipackage, so what the client sends is fully inspectable.
Point it at the hosted Holo3 models, or at your own server where nothing leaves your machine.
holo-desktop-cli is not (yet) on PyPI; run it from a checkout:
git clone https://github.com/hcompai/holo-desktop-cli
cd holo-desktop-cli && uv sync
uv run holo run "Catch me up on Slack"On first run:
- The
hai-agent-runtimebinary downloads itself to~/.holo/runtime/(sha256-verified). Developers can skip this by puttinghai-agent-runtime(or a wrapper script) onPATH. - Your browser opens to sign in at portal.hcompany.ai. Skip with
--base-urlfor a local model. - macOS only: grant the agent runtime Accessibility and Screen Recording in System Settings → Privacy & Security when prompted.
| Surface | Command | When |
|---|---|---|
| CLI | holo run "task" |
One-shot tasks from your terminal |
| MCP | holo install, or holo mcp in your host's config |
Delegate from Claude Code, Cursor, Codex, ... |
| ACP | holo acp |
ACP hosts (Hermes, OpenClaw, Zed, ...) |
| A2A | holo serve |
An A2A HTTP server on 127.0.0.1 for your own agents |
See the CLI reference or holo run --help for all flags.
Once the agent is driving the screen it's hard to take back control. Holo gives you an out-of-band panic stop: press Esc twice quickly and the current turn pauses, then cancels.
| Where you're running | What watches for the double-Esc |
|---|---|
holo run (interactive terminal) |
A listener embedded in the run; armed automatically (first use prompts for macOS Input Monitoring). |
holo mcp / holo acp / holo serve (headless) |
The always-on holo guard, installed by holo install and launched by the OS so it has its own permission identity. |
You can also stop without the keyboard:
holo stop # ask the running turn to pause then cancel (same as double-Esc)
holo stop --force # additionally SIGKILL the runtime — instant, but ends the session outright
holo guard # run the listener yourself in the foreground (e.g. if you skipped holo install)Good to know:
- The stop is step-bounded. It halts the next action; the runtime still finishes the action already in flight.
holo stop --forceis the only instant stop. holo stop --forcekills the runtime but leaves a headless host running with a dead backend.holo serve/holo mcp/holo acpspawn their runtime once at startup and keep pointing at it, so after a force-kill the host process stays up but every later task fails (its requests hit a runtime that no longer exists) until you restart the host. Prefer plainholo stopthere; reserve--forceforholo runor a wedged runtime.- The guard only inspects
Esctiming, never keystroke content — but it does hold Input Monitoring continuously while installed. Disable the embedded listener for a single run withholo run --no-kill-switch. holo stop --forcereads a pid file and does not yet verify process identity. A runtime that exited uncleanly can leave a stale~/.holo/agent-pid-<port>behind, so a force-kill may target a recycled pid. Robust start-time matching is tracked as follow-up.- Wayland (Linux) has no global key listener. Use
holo stopinstead, bound to a compositor hotkey.
The trigger and the lever are decoupled through a single one-line file, ~/.holo/stop, holding a wall-clock timestamp:
- Writing it (the trigger):
holo stop, theholo runlistener, andholo guardall writetime.time()to that file. They're separate processes, so a wall-clock value is what lets them and the running turn agree on ordering. - Reading it (the lever): every turn records its own
started_atand, while running, polls the file ~4×/s. It acts only if the file's timestamp is newer than itsstarted_at— then it pauses, then cancels at the next action boundary. - Clearing it: the file is never deleted. It's cleared by time — the next turn starts later, so a leftover request is automatically stale and can't kill it. This is also why a
holo stopfired before a run begins is ignored: nothing was running to stop.
holo stop --force is the exception to the step-bounded model: it reads the runtime's pid file (~/.holo/agent-pid-<port>) and SIGKILLs the process directly, so it doesn't wait for an action boundary.
holo_desktop.agent_client is the same client every CLI surface is built on: it spawns (or attaches to) the hai-agent-runtime binary on loopback and drives sessions over the agent API.
import asyncio
from holo_desktop.agent_client import AgentApiClient, SpawnConfig, ensure_running
from holo_desktop.agent_client.requests import build_session_request
async def main() -> None:
daemon = await ensure_running(SpawnConfig(port=18795))
try:
async with AgentApiClient(daemon.base_url, daemon.token) as client:
request = build_session_request(
task="Tell me how many unread emails I have", max_steps=None, max_time_s=None
)
stream = client.stream(await client.create_session(request))
async for event in stream.events():
print(event.type)
print(stream.answer)
finally:
await daemon.aclose()
asyncio.run(main())AgentApiClient also exposes pause / resume / cancel and mid-run send_message for interactive embedding.
Holo defaults to the H Company Models API. Your first holo run opens your browser, signs you in at portal.hcompany.ai, and saves a key to ~/.holo/.env. Run holo login to do this ahead of time. Holo3-35B is on the free tier; the 122B requires a paid plan.
To run on your own hardware instead, point --base-url at any OpenAI-compatible server. No holo login needed, and no screenshots, keystrokes, or app content leave your machine.
holo run --base-url http://localhost:8000/v1 "Open Safari and go to hcompany.ai"Hardware notes and ready-to-run vLLM and llama.cpp configs are in docs/self-hosting.md.
Holo runs as a sub-agent of Claude Code, Cursor, Codex, and other MCP / ACP hosts. When your main agent needs to read a screen or click through an app, it delegates to Holo and gets the answer back.
One command wires Holo into every supported host on your machine:
holo install # everything detected
holo install cursor # one host
holo install list # see what's availableEach host gets the MCP server in its config, plus a Skill (where supported) that teaches the parent when to delegate to Holo.
Interrupting a running task: over MCP a Holo task blocks until it finishes — stopping the turn in the host (Cursor, Codex, ...) does not abort the run already executing on your machine; it keeps clicking until it completes, times out, or the host kills the server process. Use ACP if you need a host-driven cancel.
| id | host | skill auto-load |
|---|---|---|
antigravity |
Antigravity (Google) | — |
claude-code |
Claude Code | ~/.claude/skills/ |
claude-desktop |
Claude Desktop | — |
codex |
Codex | ~/.agents/skills/ |
copilot |
GitHub Copilot CLI | — |
cursor |
Cursor | — |
hermes |
Hermes | — |
nemoclaw |
NemoClaw (sandbox bridge) | — |
openclaw |
OpenClaw | ~/.openclaw/skills/ |
opencode |
OpenCode | ~/.config/opencode/skills/ |
Beta. ACP support is still stabilising — interfaces and behaviour may change.
holo acpprints this notice to stderr on startup.
holo acp runs Holo as an ACP sub-agent over stdio. Unlike MCP, ACP hosts can cancel an in-flight task.
Hermes (NousResearch):
delegate_task(acp_command="holo acp", task="Open Authy and grab my AWS 2FA code")OpenClaw — ~/.openclaw/openclaw.json:
{ "runtimes": { "holo": { "runtime": "acp-standard", "command": "holo", "args": ["acp"] } } }All dependencies resolve from PyPI (the agent-API wire types come from hai-agent-api), so a plain checkout is all you need:
git clone https://github.com/hcompai/holo-desktop-cli && cd holo-desktop-cli
make setup
make check # ruff + mypy + pytestSee CONTRIBUTING.md.
The holo-desktop-cli client (this repository) is Apache-2.0-licensed. The hai-agent-runtime binary it downloads and drives is closed-source and distributed under H Company's own terms; the wire contract between the two is the open hai-agent-api package.
- Models: Holo3-35B-A3B · Holo3-122B-A10B
- Docs: Quickstart · Models API
- H Company