AI Overview
ClusterSpace ships with an optional AI co-pilot that can read from and write to every pane, drive browsers, orchestrate multi-pane work, and pursue goals autonomously. It is provider-agnostic — any OpenAI-compatible endpoint works (Claude via compat, OpenAI directly, Ollama, LM Studio, vLLM).
If you don't want it, ignore it. The whole subsystem is opt-in — you have to configure a provider before the chat panel does anything.
- Read and write any pane: read_terminal_output, write_to_terminal, wait_for_output, plus the full browser-tool catalog
- Move around the UI: focus a pane, maximize it, capture a screenshot, create a workspace
- Drive browsers — navigate, click, type, select, upload, scroll, run JavaScript, intercept the accessibility tree, save PDFs, set cookies
- Verify visually — take a screenshot and ask a vision model "did the expected thing happen?" (see Vision-Verification)
- Coordinate multi-pane work — assign tasks to per-pane agents, wait for one agent to finish before another starts, share context between them (see Agent-Orchestration)
- Pursue goals autonomously — start a goal, walk away, come back when it's verified done (see Goal-Runner-Overview)
Full tool catalog: AI-Tools-Reference.
Renderer (AI Chat Panel)
↓ IPC: aiChatStream
AIManager (main process)
↓ POST /chat/completions (OpenAI-compatible, stream=true)
LLM Provider (Claude / OpenAI / Ollama / …)
↑ SSE chunks: text deltas + tool_calls
↓ on [DONE]: emit AI_STREAM_END
Renderer receives final assistant message
→ dispatches each tool_call via AIManager.executeTool
→ tool results appended back to messages
→ next streamMessage call
→ repeat until model emits no more tool_calls (auto-loop, max 20 turns)
Backend: src/main/ai-manager.ts.
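The auto-loop in the flow above can be sketched as follows. This is a minimal illustration, not the code in ai-manager.ts: the message shapes, the `runAutoLoop` name, and the callback signatures are assumptions, and the callbacks are synchronous only to keep the sketch short (the real calls stream over IPC and HTTP).

```typescript
// Hypothetical message shapes, loosely following the OpenAI chat format.
interface ToolCall { id: string; name: string; args: string }
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
}

const MAX_TURNS = 20; // turn cap stated in the flow above

// streamMessage and executeTool are stand-ins for the real streaming call
// and tool dispatcher; both are injected so the loop itself stays testable.
function runAutoLoop(
  messages: ChatMessage[],
  streamMessage: (history: ChatMessage[]) => ChatMessage,
  executeTool: (call: ToolCall) => string,
  maxTurns: number = MAX_TURNS,
): ChatMessage[] {
  const history = [...messages];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = streamMessage(history);
    history.push(reply);
    const calls = reply.tool_calls ?? [];
    if (calls.length === 0) break; // no tool calls: the model is done
    for (const call of calls) {
      // Append each tool result so the next turn can see it.
      history.push({ role: "tool", tool_call_id: call.id, content: executeTool(call) });
    }
  }
  return history;
}
```

The turn cap matters because a model that keeps requesting tools would otherwise loop indefinitely; in this sketch the loop simply stops after `maxTurns` replies.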
Each request fits in a 16k token window (reserving 4k for the system prompt + tool definitions). When the conversation exceeds the budget, trimMessagesToFit:
- Walks backwards from the most recent message, keeping as many as fit
- Inserts a synthetic [CONTEXT TRIMMED: ...] system message documenting what was dropped
- Always preserves the original user message as an anchor (re-inserted after the trim if it got elided), because strict OpenAI-compat endpoints reject requests with no user role
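The three trimming rules above can be sketched roughly as below. This is an illustration under stated assumptions, not the real trimMessagesToFit: the function name `trimMessagesToFitSketch`, the four-characters-per-token estimate, and the wording of the trim notice are all hypothetical, and the notice and anchor are not counted against the budget here.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Assumption: a crude length-based estimate stands in for a real tokenizer.
const estimateTokens = (m: ChatMessage): number => Math.ceil(m.content.length / 4);

function trimMessagesToFitSketch(messages: ChatMessage[], budget: number): ChatMessage[] {
  // 1. Walk backwards from the newest message, keeping as many as fit.
  const kept: ChatMessage[] = [];
  let used = 0;
  for (const msg of [...messages].reverse()) {
    const cost = estimateTokens(msg);
    if (used + cost > budget) break;
    kept.unshift(msg);
    used += cost;
  }

  const dropped = messages.slice(0, messages.length - kept.length);
  if (dropped.length === 0) return kept; // everything fit, nothing to report

  // 2. Record what was dropped in a synthetic system message.
  const notice: ChatMessage = {
    role: "system",
    content: `[CONTEXT TRIMMED: ${dropped.length} earlier message(s) dropped]`,
  };

  // 3. Re-insert the original user message if the trim removed it.
  //    `dropped` is a prefix of the conversation, so its first user message
  //    (if any) is the conversation's first user message.
  const anchor = dropped.find((m) => m.role === "user");
  return anchor ? [notice, anchor, ...kept] : [notice, ...kept];
}
```

For a four-message conversation where only the last two messages fit, the result is the notice, the first user message, and then the two retained messages.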
For long-running goal conversations, the runner relies on persistent step logs and the critic to keep relevant context fresh. The chat panel's normal conversation auto-saves with a 1-second debounce and is recoverable from the history dropdown.
When the AI is driving a specific pane, conversations are keyed by (providerId, workspaceId, paneId). So:
- Two panes in the same workspace get two independent conversations
- Switching providers gives you a fresh conversation
- Switching workspaces preserves per-pane conversation history per workspace
This prevents the Builder persona in pane A from polluting the Tester persona in pane B with its context.
Source: src/main/ai-memory-store.ts:54-78.
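To illustrate the keying scheme described above, here is a small sketch of a store indexed by (providerId, workspaceId, paneId). The `ConversationKey` interface, `keyOf` helper, and `ConversationStoreSketch` class are hypothetical names for illustration; the actual layout in ai-memory-store.ts may differ.

```typescript
interface ConversationKey {
  providerId: string;
  workspaceId: string;
  paneId: string;
}

// Serialize the triple as a JSON array so that IDs containing a
// separator character cannot collide with each other.
const keyOf = (k: ConversationKey): string =>
  JSON.stringify([k.providerId, k.workspaceId, k.paneId]);

class ConversationStoreSketch {
  private conversations = new Map<string, string[]>();

  // Return the conversation for this key, creating an empty one on first use.
  get(key: ConversationKey): string[] {
    const id = keyOf(key);
    let conversation = this.conversations.get(id);
    if (!conversation) {
      conversation = [];
      this.conversations.set(id, conversation);
    }
    return conversation;
  }
}
```

Under this scheme, changing any one of the three IDs yields a separate conversation, which is what keeps pane A's context out of pane B.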
The system prompt (per-provider, with a 700-line default) describes:
- The environment (pane IDs, types, terminal modes)
- Every available tool
- The step protocol (declare_step → action → verify_step) personas are expected to follow
- Web automation loop patterns
- Pagination conventions for tool results
- Agent orchestration tools and when to use them
- Vision verification (when DOM checks aren't enough)
Override per-provider in the AI Settings dialog. The default lives in src/shared/types.ts:DEFAULT_AI_SYSTEM_PROMPT.
Every pane has an agentState record: idle / working / blocked / complete / error. The label above each pane shows a status dot, a role badge (if assigned), and the current task. The Fleet Dashboard shows the full fleet at a glance.
The AI tools that affect this state are set_agent_role, assign_task, complete_task, fail_task, wait_for_agent. The state surface enables multi-pane orchestration (one pane waits for another, panes share context, the dashboard tracks progress).
See Agent-Orchestration and Fleet-Dashboard.
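A rough model of the per-pane state record might look like the sketch below. Only the five status values and the tool names come from the text above; the field names, the mapping from each tool to a status change, and the `isFinished` helper are assumptions made for illustration.

```typescript
type AgentStatus = "idle" | "working" | "blocked" | "complete" | "error";

// Hypothetical record shape: status plus the role badge and current task
// shown in the pane label.
interface AgentStateSketch {
  status: AgentStatus;
  role?: string;
  task?: string;
}

// Assumed effect of each tool on the record (pure functions, no side effects).
const setAgentRole = (s: AgentStateSketch, role: string): AgentStateSketch => ({ ...s, role });
const assignTask = (s: AgentStateSketch, task: string): AgentStateSketch => ({ ...s, status: "working", task });
const completeTask = (s: AgentStateSketch): AgentStateSketch => ({ ...s, status: "complete" });
const failTask = (s: AgentStateSketch): AgentStateSketch => ({ ...s, status: "error" });

// A wait_for_agent-style check: a waiter can proceed once the other
// agent has reached a terminal status.
const isFinished = (s: AgentStateSketch): boolean =>
  s.status === "complete" || s.status === "error";
```

For example, a tester pane could poll `isFinished` on the builder pane's record before starting its own task.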
Two layers guard risky actions:
- Approval gates (src/main/browser-approval.ts): intercept browser tool calls that match sensitive patterns (password fields, checkout URLs, file uploads) and prompt the user.
- Goal policy (src/main/goal-policy.ts): when a goal is running, every tool call is checked against the goal's declared risk ceiling and sandbox dir. Tools beyond the policy prompt the user.
When both are absent (free-form chat from the panel with no active goal), the legacy regex-based gates run. When a goal is active, policy takes over.
See Goal-Policy-and-Risk-Levels.
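As a sketch of how a risk ceiling and sandbox directory check could work, consider the following. It is not the logic in goal-policy.ts: the risk level names, the `GoalPolicySketch` and `ToolRequest` shapes, and the assumption of absolute POSIX-style paths are all hypothetical.

```typescript
// Hypothetical risk levels, ordered from least to most risky.
const RISK_ORDER = ["low", "medium", "high"] as const;
type Risk = (typeof RISK_ORDER)[number];

interface GoalPolicySketch { riskCeiling: Risk; sandboxDir: string }
interface ToolRequest { tool: string; risk: Risk; path?: string }

// Resolve "." and ".." segments so "/sandbox/../secrets" cannot slip through.
function normalizePath(p: string): string {
  const out: string[] = [];
  for (const part of p.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

function isInsideDir(dir: string, target: string): boolean {
  const d = normalizePath(dir);
  const t = normalizePath(target);
  return t === d || t.startsWith(d === "/" ? "/" : d + "/");
}

// "allow" runs the tool directly; "prompt" asks the user first.
function checkPolicy(policy: GoalPolicySketch, req: ToolRequest): "allow" | "prompt" {
  if (RISK_ORDER.indexOf(req.risk) > RISK_ORDER.indexOf(policy.riskCeiling)) return "prompt";
  if (req.path !== undefined && !isInsideDir(policy.sandboxDir, req.path)) return "prompt";
  return "allow";
}
```

Comparing against `d + "/"` rather than the bare directory name keeps a sibling such as /work/sandbox2 from being treated as inside /work/sandbox.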
- Providers: none; you configure your own.
- Personas: admin, builder, monitor, reviewer, tester, claude-code-expert. See Personas.
- Skills: terminal-automation, claude-code-interaction. See Skills.
- Task templates: feature-development, deployment. See Task-Templates.
- Tools by category: step protocol (2), pane (6), terminal (4), orchestration (8), browser navigation (7), browser interaction T1 (9), browser interaction T2 (9), browser advanced (9), browser vision (2). Full reference: AI-Tools-Reference.
- AI-Providers — set up your first provider
- AI-Chat-Panel — the UI surface
- AI-Tools-Reference — full tool catalog
- Goal-Runner-Overview — autonomous mode
- Architecture-Overview — where AIManager sits in the app