AI Overview

ClusterSpace ships with an optional AI co-pilot that can read from and write to every pane, drive browsers, orchestrate multi-pane work, and pursue goals autonomously. It is provider-agnostic: any OpenAI-compatible endpoint works (Claude via an OpenAI-compatibility layer, OpenAI directly, Ollama, LM Studio, vLLM).

If you don't want it, ignore it. The whole subsystem is opt-in — you have to configure a provider before the chat panel does anything.
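To make "OpenAI-compatible" concrete, here is a minimal sketch of how a request could be assembled for any such endpoint. The ProviderSketch shape and the buildChatRequest helper are hypothetical and are not ClusterSpace's actual settings schema; the URL path and the model, messages, and stream body fields follow the OpenAI chat-completions convention. The Ollama base URL is its usual local default, and the model name is only an example.

```typescript
// Hypothetical provider record; the real settings schema may differ.
interface ProviderSketch {
  id: string;
  baseUrl: string; // e.g. an OpenAI-compatible root ending in /v1
  model: string;
  apiKey?: string; // local servers such as Ollama typically need none
}

interface WireMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Builds the pieces of a streaming chat-completions request.
function buildChatRequest(provider: ProviderSketch, messages: WireMessage[]) {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (provider.apiKey) headers["Authorization"] = `Bearer ${provider.apiKey}`;
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${provider.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST" as const,
    headers,
    body: JSON.stringify({ model: provider.model, messages, stream: true }),
  };
}

// Example: a local Ollama server (model name is illustrative).
const localOllama: ProviderSketch = {
  id: "local-ollama",
  baseUrl: "http://localhost:11434/v1",
  model: "llama3.1",
};
const exampleRequest = buildChatRequest(localOllama, [{ role: "user", content: "hello" }]);
```

Because only the base URL, model, and optional key vary, switching between a hosted and a local provider is a matter of swapping one record.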

What the AI can do

Read and write any pane — read_terminal_output, write_to_terminal, wait_for_output, plus the full browser-tool catalog
Move around the UI — focus a pane, maximize it, capture a screenshot, create a workspace
Drive browsers — navigate, click, type, select, upload, scroll, run JavaScript, inspect the accessibility tree, save PDFs, set cookies
Verify visually — take a screenshot and ask a vision model "did the expected thing happen?" (see Vision-Verification)
Coordinate multi-pane work — assign tasks to per-pane agents, wait for one agent to finish before another starts, share context between them (see Agent-Orchestration)
Pursue goals autonomously — start a goal, walk away, come back when it's verified done (see Goal-Runner-Overview)

Full tool catalog: AI-Tools-Reference.

How it talks to the model

Renderer (AI Chat Panel)
    ↓ IPC: aiChatStream
AIManager (main process)
    ↓ POST /chat/completions  (OpenAI-compatible, stream=true)
LLM Provider (Claude / OpenAI / Ollama / …)
    ↑ SSE chunks: text deltas + tool_calls
    ↓ on [DONE]: emit AI_STREAM_END
Renderer receives final assistant message
    → dispatches each tool_call via AIManager.executeTool
    → tool results appended back to messages
    → next streamMessage call
    → repeat until model emits no more tool_calls (auto-loop, max 20 turns)
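
The SSE step in the diagram can be sketched as follows. This is an illustration under assumptions, not the code in ai-manager.ts: the type and function names are hypothetical, while the chunk shape (choices[0].delta with content and index-keyed tool_calls fragments, terminated by a [DONE] sentinel) follows the OpenAI streaming format.

```typescript
// Shapes follow the OpenAI streaming format; all names here are illustrative.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}
interface StreamChunk {
  choices: { delta: { content?: string; tool_calls?: ToolCallDelta[] } }[];
}
interface AssembledToolCall { id: string; name: string; arguments: string }
interface AssembledMessage { content: string; toolCalls: AssembledToolCall[] }

// Folds a list of SSE lines into one assistant message.
function assembleStream(sseLines: string[]): AssembledMessage {
  const out: AssembledMessage = { content: "", toolCalls: [] };
  for (const line of sseLines) {
    if (!line.startsWith("data:")) continue; // skip blank separator lines
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as StreamChunk;
    const delta = chunk.choices[0]?.delta;
    if (!delta) continue;
    if (delta.content) out.content += delta.content;
    for (const part of delta.tool_calls ?? []) {
      // Fragments of one tool call share an index; arguments arrive as
      // pieces of a JSON string and must be concatenated before parsing.
      let slot = out.toolCalls[part.index];
      if (slot === undefined) {
        slot = { id: "", name: "", arguments: "" };
        out.toolCalls[part.index] = slot;
      }
      if (part.id) slot.id = part.id;
      if (part.function?.name) slot.name += part.function.name;
      if (part.function?.arguments) slot.arguments += part.function.arguments;
    }
  }
  return out;
}
```

The arguments string is only valid JSON once the stream has ended, which is one reason to dispatch tool calls after [DONE] rather than per chunk.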

Backend: src/main/ai-manager.ts.
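
The auto-loop at the bottom of the diagram can be sketched like this. The function and type names are hypothetical, and the two callbacks are synchronous stand-ins for the streamed completion and for AIManager.executeTool, which are asynchronous in practice. Only the 20-turn cap and the "stop when there are no more tool_calls" rule come from the description above.

```typescript
interface LoopToolCall { id: string; name: string; args: Record<string, unknown> }
interface LoopMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  toolCalls?: LoopToolCall[];
  toolCallId?: string;
}
// Stand-ins (assumptions): one model turn, and one tool execution.
type StreamMessageFn = (messages: LoopMessage[]) => LoopMessage;
type ExecuteToolFn = (call: LoopToolCall) => string;

const MAX_TURNS = 20;

function runAutoLoop(
  initial: LoopMessage[],
  streamMessage: StreamMessageFn,
  executeTool: ExecuteToolFn
): LoopMessage[] {
  const transcript = [...initial];
  for (let turn = 0; turn < MAX_TURNS; turn++) {
    const assistant = streamMessage(transcript);
    transcript.push(assistant);
    const calls = assistant.toolCalls ?? [];
    if (calls.length === 0) break; // no tool calls: the model is finished
    for (const call of calls) {
      // Each result is appended so the next turn can see it.
      transcript.push({ role: "tool", content: executeTool(call), toolCallId: call.id });
    }
  }
  return transcript;
}
```

The turn cap is what keeps a model that never stops requesting tools from looping indefinitely.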

Token budgeting

Each request is budgeted against a 16k-token window, with 4k of that reserved for the system prompt and tool definitions. When the conversation exceeds the budget, trimMessagesToFit:

Walks backwards from the most recent message, keeping as many as fit
Inserts a synthetic [CONTEXT TRIMMED: ...] system message documenting what was dropped
Always preserves the original user message as an anchor — re-inserted after the trim if it got elided — because strict OpenAI-compat endpoints reject requests with no user role
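
A minimal sketch of the trimming steps above, not the actual trimMessagesToFit. The 4-characters-per-token estimate, the default budget of 16k minus the 4k reserve, the wording inside the marker, and the choice not to charge the marker or anchor against the budget are all assumptions made for illustration.

```typescript
interface TrimMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Rough heuristic (assumption): about 4 characters per token.
const estimateTokens = (m: TrimMessage): number => Math.ceil(m.content.length / 4);

function trimMessagesToFitSketch(
  messages: TrimMessage[],
  budget: number = 16_000 - 4_000
): TrimMessage[] {
  const kept: TrimMessage[] = [];
  let used = 0;
  // Walk backwards from the newest message, keeping as many as fit.
  for (const m of [...messages].reverse()) {
    const cost = estimateTokens(m);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(m);
  }
  const dropped = messages.slice(0, messages.length - kept.length);
  if (dropped.length === 0) return kept; // everything fit, nothing to mark

  const head: TrimMessage[] = [
    { role: "system", content: `[CONTEXT TRIMMED: ${dropped.length} earlier message(s) dropped]` },
  ];
  // Re-insert the original user message if the trim removed it, so the
  // request still contains a user role.
  const anchor = messages.find((m) => m.role === "user");
  if (anchor && dropped.includes(anchor)) head.push(anchor);
  return [...head, ...kept];
}
```

Walking from the newest message backwards favors recent context, and the anchor keeps the original request visible even after a long tool-heavy exchange.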

For long-running goal conversations, the runner relies on persistent step logs and the critic to keep relevant context fresh. The chat panel's normal conversation auto-saves with a 1-second debounce and is recoverable from the history dropdown.

Per-pane conversation isolation

When the AI is driving a specific pane, conversations are keyed by (providerId, workspaceId, paneId). So:

Two panes in the same workspace get two independent conversations
Switching providers gives you a fresh conversation
Switching workspaces leaves each workspace's per-pane conversation history intact, so it is still there when you switch back

This prevents the Builder persona in pane A from polluting the Tester persona in pane B with its context.
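
The keying scheme can be illustrated with a small in-memory store. This is a sketch under assumptions: the class and helper names are hypothetical, and the real store in ai-memory-store.ts may persist and encode keys differently. Only the (providerId, workspaceId, paneId) triple comes from the text above.

```typescript
interface ConversationKey {
  providerId: string;
  workspaceId: string;
  paneId: string;
}

// Encoding the triple as a JSON array avoids collisions that a plain
// delimiter join could produce if an ID contained the delimiter.
const keyOf = (k: ConversationKey): string =>
  JSON.stringify([k.providerId, k.workspaceId, k.paneId]);

class ConversationStoreSketch {
  private conversations = new Map<string, string[]>();

  append(k: ConversationKey, message: string): void {
    const id = keyOf(k);
    const existing = this.conversations.get(id) ?? [];
    this.conversations.set(id, [...existing, message]);
  }

  get(k: ConversationKey): string[] {
    return this.conversations.get(keyOf(k)) ?? [];
  }
}

// Two panes in the same workspace keep separate histories.
const demoStore = new ConversationStoreSketch();
demoStore.append({ providerId: "p", workspaceId: "w1", paneId: "A" }, "builder context");
demoStore.append({ providerId: "p", workspaceId: "w1", paneId: "B" }, "tester context");
```

Because every part of the triple participates in the key, changing any one of them (provider, workspace, or pane) selects a different conversation without touching the others.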

Source: src/main/ai-memory-store.ts:54-78.

What the model sees

The system prompt (per-provider, with a 700-line default) describes:

The environment (pane IDs, types, terminal modes)
Every available tool
The step protocol (declare_step → action → verify_step) personas are expected to follow
Web automation loop patterns
Pagination conventions for tool results
Agent orchestration tools and when to use them
Vision verification (when DOM checks aren't enough)

Override per-provider in the AI Settings dialog. The default lives in src/shared/types.ts:DEFAULT_AI_SYSTEM_PROMPT.

Per-pane agent state

Every pane has an agentState record: idle / working / blocked / complete / error. The label above each pane shows a status dot, a role badge (if assigned), and the current task. The Fleet Dashboard shows the full fleet at a glance.

The AI tools that affect this state are set_agent_role, assign_task, complete_task, fail_task, wait_for_agent. The state surface enables multi-pane orchestration (one pane waits for another, panes share context, the dashboard tracks progress).
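
One way to picture how these tools act on agentState is a small reducer. The five status values and the tool names come from this page; the record fields, the specific transitions (assign_task to working, complete_task to complete, fail_task to error), and the "settled" check for wait_for_agent are assumptions made for illustration, and the real implementation may differ.

```typescript
type AgentStatus = "idle" | "working" | "blocked" | "complete" | "error";

// Hypothetical record shape; the real agentState may carry more fields.
interface AgentStateSketch {
  status: AgentStatus;
  role?: string;
  task?: string;
}

type AgentToolCall =
  | { tool: "set_agent_role"; role: string }
  | { tool: "assign_task"; task: string }
  | { tool: "complete_task" }
  | { tool: "fail_task" };

// Assumed transitions, returned as a new record rather than mutated in place.
function applyAgentTool(state: AgentStateSketch, call: AgentToolCall): AgentStateSketch {
  switch (call.tool) {
    case "set_agent_role":
      return { ...state, role: call.role };
    case "assign_task":
      return { ...state, status: "working", task: call.task };
    case "complete_task":
      return { ...state, status: "complete" };
    case "fail_task":
      return { ...state, status: "error" };
  }
}

// In this sketch, wait_for_agent would return once the target has settled.
const hasSettled = (s: AgentStateSketch): boolean =>
  s.status === "complete" || s.status === "error";
```

Modeling the state as a plain record with explicit transitions is what lets a dashboard render every pane's status from the same data the orchestration tools update.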

See Agent-Orchestration and Fleet-Dashboard.

Safety: approval gates and policy

Two layers guard risky actions:

Approval gates (src/main/browser-approval.ts) — intercept browser tool calls that match sensitive patterns (password fields, checkout URLs, file uploads) and prompt the user.
Goal policy (src/main/goal-policy.ts) — when a goal is running, every tool call is checked against the goal's declared risk ceiling and sandbox dir. Tools beyond the policy prompt the user.

Which layer applies depends on whether a goal is running. With no active goal (free-form chat from the panel), the legacy regex-based gates run. When a goal is active, policy takes over.
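
The split between the two layers can be sketched as a single decision function. Everything named here is hypothetical: the risk levels, the policy fields, the browser tool name, the URL pattern, and the naive sandbox prefix check are illustrative stand-ins, not the contents of browser-approval.ts or goal-policy.ts.

```typescript
type RiskLevel = "low" | "medium" | "high"; // hypothetical levels
const RISK_ORDER: RiskLevel[] = ["low", "medium", "high"];

interface GoalPolicySketch { riskCeiling: RiskLevel; sandboxDir: string }
interface GateCall {
  tool: string; // browser tool names used below are hypothetical
  risk: RiskLevel;
  url?: string;
  fieldType?: string;
  path?: string;
}
type GateDecision = "allow" | "prompt";

const SENSITIVE_URL = /checkout/i; // illustrative pattern only

// Naive containment check: does not resolve ".." or symlinks.
const isInside = (path: string, dir: string): boolean =>
  path === dir || path.startsWith(dir.endsWith("/") ? dir : dir + "/");

function decideGate(call: GateCall, activeGoal?: GoalPolicySketch): GateDecision {
  if (activeGoal) {
    // Goal running: check the call against the declared ceiling and sandbox.
    const overCeiling =
      RISK_ORDER.indexOf(call.risk) > RISK_ORDER.indexOf(activeGoal.riskCeiling);
    const outsideSandbox =
      call.path !== undefined && !isInside(call.path, activeGoal.sandboxDir);
    return overCeiling || outsideSandbox ? "prompt" : "allow";
  }
  // No goal: pattern-based gate on sensitive browser actions.
  const sensitive =
    call.tool === "browser_upload" ||
    call.fieldType === "password" ||
    (call.url !== undefined && SENSITIVE_URL.test(call.url));
  return sensitive ? "prompt" : "allow";
}
```

In both branches the outcome of a failed check is a prompt rather than a silent block, matching the description that tools beyond the policy prompt the user.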

See Goal-Policy-and-Risk-Levels.

Defaults shipped

Providers

None — you configure your own.

Personas (6)

admin, builder, monitor, reviewer, tester, claude-code-expert. See Personas.

Skills (2)

terminal-automation, claude-code-interaction. See Skills.

Task templates (2)

feature-development, deployment. See Task-Templates.

Tools (~52)

Step protocol (2), pane (6), terminal (4), orchestration (8), browser navigation (7), browser interaction T1 (9), browser interaction T2 (9), browser advanced (9), browser vision (2). Full reference: AI-Tools-Reference.
