Goal Runner Internals
Deep dive for contributors. If you just want to use the goal runner, read Goal-Runner-Overview instead.
Source: src/main/goal-runner.ts, src/main/goal-store.ts, src/main/goal-policy.ts.
class GoalRunner {
  constructor(
    window: BrowserWindow,
    aiManager: AIManager,
    aiMemoryStore: AIMemoryStore,
    aiStore: AIStore,
    agentStore: AgentStore,
    goalStore: GoalStore
  )
  start(input: StartGoalInput): Promise<{ goalId: string; error?: string }>
  abort(goalId: string): boolean
  status(goalId: string): { status, step, lastStep? } | null
}

StartGoalInput:
{
  paneId: string
  goal: string
  successCriterion: SuccessCriterion
  policy: GoalPolicy
  providerId?: string
  personaId?: string
  wallClockMs?: number // default 1 hour
  criticIntervalSteps?: number // default 5
  criticProviderId?: string // defaults to main provider
}

interface RuntimeGoal {
  checkpoint: GoalCheckpoint
  state:
    | { kind: 'running'; abortRequested: boolean; pauseRequested: boolean }
    | { kind: 'done' }
  startedAt: number
  wallClockMs: number
  criticIntervalSteps: number
  criticProviderId?: string
  stepsSinceLastCritic: number
  pendingClaim?: { rationale: string }
  pendingAbort?: { reason: string; report: string }
}

pendingClaim is set by the claim_complete transient tool; pendingAbort by abort_with_report. The loop body picks them up on the next iteration.
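To illustrate how the optional StartGoalInput fields could turn into the concrete RuntimeGoal values, here is a minimal sketch that applies the documented defaults. The helper name resolveRunSettings and the trimmed-down RunTuning type are hypothetical; only the default values (1 hour, 5 steps, critic falling back to the main provider) come from the text above.

```typescript
// Hypothetical subset of StartGoalInput: only the optional tuning fields.
interface RunTuning {
  providerId?: string
  wallClockMs?: number
  criticIntervalSteps?: number
  criticProviderId?: string
}

interface ResolvedRunSettings {
  wallClockMs: number
  criticIntervalSteps: number
  criticProviderId?: string
}

const DEFAULT_WALL_CLOCK_MS = 60 * 60 * 1000 // 1 hour
const DEFAULT_CRITIC_INTERVAL_STEPS = 5

// Assumed helper, not a function from goal-runner.ts.
function resolveRunSettings(input: RunTuning): ResolvedRunSettings {
  return {
    wallClockMs: input.wallClockMs ?? DEFAULT_WALL_CLOCK_MS,
    // `??` (not `||`) so an explicit 0 is preserved rather than replaced by 5.
    criticIntervalSteps: input.criticIntervalSteps ?? DEFAULT_CRITIC_INTERVAL_STEPS,
    // The critic falls back to the main provider when none is given.
    criticProviderId: input.criticProviderId ?? input.providerId
  }
}
```

Using `??` matters here because the loop only runs the critic when criticIntervalSteps is greater than 0, so an explicit 0 would plausibly be the way to switch the critic off.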
registerTransientTools() registers two tools globally (into the regular toolRegistry):
claim_complete(rationale)
abort_with_report(reason, what_was_learned)
When called outside an active goal, they no-op with {success: false, message: "claim_complete called outside an active goal run."} — so they're safe to leave registered all the time.
When called inside an active goal, they set pendingClaim / pendingAbort on the most recently started runtime goal:
private findActiveForCurrentCaller(): RuntimeGoal | undefined {
  let latest: RuntimeGoal | undefined
  for (const g of this.running.values()) {
    if (g.state.kind === 'running' && (!latest || g.startedAt > latest.startedAt)) {
      latest = g
    }
  }
  return latest
}

Only one concurrent goal is supported for now. Multi-goal disambiguation is a roadmap item; it would require correlating each tool call back to the AIManager invocation that issued it.
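The two behaviors described above (no-op outside a goal, set pendingClaim inside one) can be sketched as a single handler. This is an illustration under stated assumptions, not the actual tool registration: MiniRuntimeGoal, ToolOutcome, findLatestRunning and handleClaimComplete are hypothetical names, and the wording of the success message is invented.

```typescript
// Trimmed-down runtime record; the real RuntimeGoal has more fields.
interface MiniRuntimeGoal {
  state: { kind: 'running' } | { kind: 'done' }
  startedAt: number
  pendingClaim?: { rationale: string }
}

interface ToolOutcome {
  success: boolean
  message: string
}

// Same selection rule as findActiveForCurrentCaller: the newest running goal wins.
function findLatestRunning(goals: MiniRuntimeGoal[]): MiniRuntimeGoal | undefined {
  let latest: MiniRuntimeGoal | undefined
  for (const g of goals) {
    if (g.state.kind === 'running' && (!latest || g.startedAt > latest.startedAt)) {
      latest = g
    }
  }
  return latest
}

// Hypothetical handler body for claim_complete.
function handleClaimComplete(goals: MiniRuntimeGoal[], rationale: string): ToolOutcome {
  const active = findLatestRunning(goals)
  if (!active) {
    // Documented no-op response when no goal is running.
    return { success: false, message: 'claim_complete called outside an active goal run.' }
  }
  active.pendingClaim = { rationale }
  // The success message wording is an assumption.
  return { success: true, message: 'Claim recorded; the runner will verify it.' }
}
```

The handler only records the claim. Verification still happens in the loop body, which keeps the tool itself cheap and side-effect free apart from the one field it sets.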
The main loop, simplified:
async runLoop(runtime, provider, apiKey, initialMessages) {
  aiManager.setActivePolicy(runtime.checkpoint.policy)
  const goalPrompt = {
    id: uuidv4(),
    role: 'user',
    content: buildGoalPrompt(runtime.checkpoint), // includes contract
    timestamp: Date.now()
  }
  const messages = [...initialMessages, goalPrompt]
  while (true) {
    // 1. Wall-clock cap
    if (Date.now() - runtime.startedAt > runtime.wallClockMs) {
      return endGoal(runtime, 'failed', 'Wall-clock cap exceeded')
    }
    // 2. External abort
    if (runtime.state.kind === 'running' && runtime.state.abortRequested) {
      return endGoal(runtime, 'aborted', 'User aborted')
    }
    // 3. Pause
    if (runtime.state.kind === 'running' && runtime.state.pauseRequested) {
      await sleep(500)
      continue
    }
    // 4. One model turn
    const assistant = await aiManager.streamMessage(messages, provider, apiKey)
    if (!assistant) return endGoal(runtime, 'failed', 'No message from model')
    messages.push(assistant)
    const toolCalls = assistant.toolCalls ?? []
    if (toolCalls.length === 0) {
      // Nudge the model: it's not allowed to stop without claim_complete / abort_with_report
      messages.push(makeUserNudge('You must continue the loop. Call a tool…'))
      continue
    }
    // 5. Dispatch each tool call
    let nonTransientSteps = 0
    for (const tc of toolCalls) {
      const result = await aiManager.executeTool(tc) // policy enforced inside
      goalStore.appendStep(runtime.checkpoint.id, {
        tool: tc.name, args: tc.arguments,
        resultPreview: previewResult(result.result),
        ok: !result.error,
        elapsedMs: Date.now() - runtime.startedAt
      })
      emitEvent({type: 'step', goalId: runtime.checkpoint.id, tool: tc.name, ok: !result.error, preview: ...})
      messages.push(makeToolResultMessage(result))
      if (tc.name !== 'claim_complete' && tc.name !== 'abort_with_report') {
        nonTransientSteps++
      }
    }
    runtime.stepsSinceLastCritic += nonTransientSteps
    // 6. Handle runner-driven exits
    if (runtime.pendingAbort) {
      const { reason, report } = runtime.pendingAbort
      return endGoal(runtime, 'aborted', `${reason}\n\nWhat was learned:\n${report}`)
    }
    if (runtime.pendingClaim) {
      const claim = runtime.pendingClaim
      runtime.pendingClaim = undefined
      const verdict = await verifySuccessCriterion(criterion, claim.rationale, provider, apiKey)
      if (verdict.verified) {
        return endGoal(runtime, 'completed', verdict.detail)
      }
      messages.push(makeSystemMessage(`Verification failed: ${verdict.detail}. Keep working…`))
      emitEvent({type: 'verification_failed', goalId: runtime.checkpoint.id, detail: verdict.detail})
    }
    // 7. Critic check
    if (runtime.criticIntervalSteps > 0 && runtime.stepsSinceLastCritic >= runtime.criticIntervalSteps) {
      runtime.stepsSinceLastCritic = 0
      await runCritic(runtime, provider, apiKey, messages)
    }
  }
}

private async verifySuccessCriterion(c, rationale, provider, apiKey): Promise<{verified, detail?}> {
  switch (c.type) {
    case 'shell':
      // spawn(c.command, {shell: true}); check exit code; surface stderr/stdout tail on fail
    case 'manual':
      // accept rationale verbatim
    case 'model_question':
      // sendMessage with strict YES/NO judge prompt; first token decides
    case 'json_predicate':
      // TODO: accepted with note for now
  }
}

See Success-Criteria for the user-facing description.
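For the model_question case, "first token decides" could look roughly like the sketch below. The helper name parseJudgeReply is hypothetical, and treating anything other than an explicit YES as a failed verification is an assumption, chosen so that an ambiguous judge reply never completes a goal by accident.

```typescript
interface Verdict {
  verified: boolean
  detail?: string
}

// Hypothetical helper: interpret a judge reply where the first token decides.
function parseJudgeReply(reply: string): Verdict {
  const text = reply.trim()
  // First whitespace-delimited token, stripped of punctuation such as "YES." or "**NO**".
  const firstToken = (text.split(/\s+/)[0] ?? '').replace(/[^A-Za-z]/g, '').toUpperCase()
  if (firstToken === 'YES') {
    return { verified: true, detail: text }
  }
  // Assumption: anything other than an explicit YES counts as not verified.
  return { verified: false, detail: text || 'Judge returned an empty reply' }
}
```

Comparing the whole token rather than a prefix means a reply that starts with "Yesterday" is not mistaken for YES.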
runCritic(runtime, mainProvider, mainApiKey, messages):
- Pick critic provider (alt provider via criticProviderId, else main)
- Take the last criticIntervalSteps × 2 steps from runtime.checkpoint.steps
- Build a strict judge prompt ("Reply on first line with PROGRESSING / STUCK / ACHIEVED / MISLED. Second line: reason.")
- aiManager.sendMessage(judgePrompt, criticProvider, criticKey) (non-streaming)
- Parse first token of first line; uppercase
- Emit critic event over goal:event IPC
- Switch on verdict:
  - PROGRESSING: no-op
  - STUCK: inject system message asking for a different approach
  - ACHIEVED: set pendingClaim with synthetic rationale
  - MISLED: inject re-anchor on original goal
Full details: Critic-and-Replan.
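The "parse first token of first line; uppercase" step from the list above might be implemented along these lines. This is a sketch, not the code in goal-runner.ts: parseCriticReply and CriticResult are hypothetical names, and returning null for an unrecognized verdict (leaving the caller to ignore it) is an assumption.

```typescript
type CriticVerdict = 'PROGRESSING' | 'STUCK' | 'ACHIEVED' | 'MISLED'

const VERDICTS: readonly CriticVerdict[] = ['PROGRESSING', 'STUCK', 'ACHIEVED', 'MISLED']

interface CriticResult {
  verdict: CriticVerdict
  reason: string
}

// Hypothetical parser for the two-line judge reply.
function parseCriticReply(reply: string): CriticResult | null {
  const lines = reply.trim().split(/\r?\n/)
  // First token of the first line, with punctuation removed, then uppercased.
  const firstToken = (lines[0] ?? '').trim().split(/\s+/)[0] ?? ''
  const candidate = firstToken.replace(/[^A-Za-z]/g, '').toUpperCase()
  const verdict = VERDICTS.find((v) => v === candidate)
  // Assumption: unrecognized replies yield null and the caller skips them.
  if (!verdict) return null
  // Everything after the first line is treated as the reason.
  const reason = lines.slice(1).join(' ').trim()
  return { verdict, reason }
}
```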
The goal store is backed by electron-store at clusterspace-goals.json.
Schema:
{ goals: GoalCheckpoint[], maxGoals: 50, maxStepsPerGoal: 500 }

Behavior:
- create() unshifts the new goal to position 0 (most recent first); trims to maxGoals, always keeping in-flight goals
- appendStep() pushes onto goal.steps; when count > 500, keeps head-50 + tail-450 (planning context preserved; recent context preserved; middle dropped)
- update() patches the goal record and bumps updatedAt
- prune() removes goals matching a predicate (default: any non-active status)
- listResumable() returns goals with status running or paused, which are eligible for resume on next launch (resume logic itself is roadmap)
The store is intentionally separate from ai-memory-store so goal-lifecycle changes can't pollute conversation history.
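The head-50 + tail-450 rule for appendStep() can be written as a small pure function. The name trimSteps and its parameters are hypothetical; the limits are the ones stated above.

```typescript
// Hypothetical standalone version of the trimming rule described for appendStep().
function trimSteps<T>(steps: T[], maxSteps = 500, headKeep = 50): T[] {
  if (steps.length <= maxSteps) return steps
  const tailKeep = maxSteps - headKeep // 450 with the documented limits
  // Keep the earliest steps (planning context) and the latest ones (recent context).
  return [...steps.slice(0, headKeep), ...steps.slice(steps.length - tailKeep)]
}
```

If trimming runs on every append once the cap is reached, each call drops exactly one step: the oldest one outside the protected head.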
Runner events are emitted to the renderer via window.webContents.send('goal:event', event).
Types:
type GoalRunnerEvent =
  | { type: 'started'; goalId }
  | { type: 'step'; goalId; tool; ok; preview }
  | { type: 'verification_failed'; goalId; detail }
  | { type: 'critic'; goalId; verdict; reason }
  | { type: 'ended'; goalId; status; finalReport }

The Goal-Dashboard subscribes and updates its UI in response.
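As a sketch of how a subscriber might consume these events, the function below turns each one into a log line. The payload field types are assumptions (the listing above gives only field names), and describeEvent is a hypothetical helper rather than part of the dashboard.

```typescript
// Field types are assumptions; the source lists only the field names.
type GoalRunnerEvent =
  | { type: 'started'; goalId: string }
  | { type: 'step'; goalId: string; tool: string; ok: boolean; preview: string }
  | { type: 'verification_failed'; goalId: string; detail: string }
  | { type: 'critic'; goalId: string; verdict: string; reason: string }
  | { type: 'ended'; goalId: string; status: string; finalReport: string }

// Hypothetical one-line formatter a dashboard log could use.
function describeEvent(e: GoalRunnerEvent): string {
  switch (e.type) {
    case 'started':
      return `[${e.goalId}] started`
    case 'step':
      return `[${e.goalId}] ${e.ok ? 'ok' : 'FAIL'} ${e.tool}: ${e.preview}`
    case 'verification_failed':
      return `[${e.goalId}] verification failed: ${e.detail}`
    case 'critic':
      return `[${e.goalId}] critic ${e.verdict}: ${e.reason}`
    case 'ended':
      return `[${e.goalId}] ended (${e.status})`
  }
}
```

In an Electron renderer the events would typically arrive through an ipcRenderer.on('goal:event', ...) listener, often exposed via a preload script; that wiring is omitted here.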
AIManager.executeTool consults this.activePolicy (set by GoalRunner.start, cleared by endGoal):
async executeTool(toolCall) {
  // 1. Policy gate (if active)
  const policy = this.activePolicy
  if (policy) {
    const perm = getPermissions(toolCall.name, toolCall.arguments)
    const verdict = evaluate(toolCall.name, perm, policy)
    if (!verdict.allow) {
      if (verdict.needsApproval) {
        const approved = await requestApproval(toolCall, verdict.reason)
        if (!approved) return { error: verdict.reason, ... }
      } else {
        return { error: verdict.reason, ... }
      }
    }
  } else {
    // 2. Legacy regex approval gate (chat panel, no goal)
    // ...
  }
  // 3. Dispatch through registry
  const ctx = this.buildToolContext()
  const result = await toolRegistry.dispatch(toolCall.name, toolCall.arguments, ctx)
  // 4. Action log (browser tools)
  if (toolCall.name.startsWith('browser_')) appendActionLog(...)
  // 5. Truncate result if too large
  return formatAsToolResult(result)
}

To debug a goal that is stuck or misbehaving:
- Open the Goal-Dashboard and watch the step log live
- Look at the critic rail for verdicts. If you see critic:stuck repeatedly, the critic is detecting the stall but the model isn't pivoting; check the system message that was injected
- Look at the verification rail for verify:fail entries and read the failure detail. Often the issue is in the success criterion, not the AI
- Click into the step log to inspect specific tool args / results
- Check the goal-store file directly: <userData>/clusterspace-data/clusterspace-goals.json contains the full step history even after the dashboard closes
To add a new success criterion type:
- Extend SuccessCriterion in src/shared/types.ts with the new variant
- Add a case to verifySuccessCriterion in goal-runner.ts
- Add a case to humanizeCriterion (for the dashboard display)
- Add a UI tab to GoalCreateDialog.tsx with input fields
- Type-check
The json_predicate type is a good stub to extend (currently TODO).
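As a rough illustration of the first and third steps, the sketch below fleshes out a json_predicate variant and a matching humanizeCriterion case. Only the type discriminator and the shell variant's command field appear on this page; the question, path and equals fields, and all display strings, are hypothetical.

```typescript
// Only `type` and the shell `command` field come from the source; other fields are hypothetical.
type SuccessCriterion =
  | { type: 'shell'; command: string }
  | { type: 'manual' }
  | { type: 'model_question'; question: string }
  | { type: 'json_predicate'; path: string; equals: unknown }

// Sketch of a display helper; the wording of each label is invented.
function humanizeCriterion(c: SuccessCriterion): string {
  switch (c.type) {
    case 'shell':
      return `Shell command succeeds: ${c.command}`
    case 'manual':
      return 'Manual confirmation'
    case 'model_question':
      return `Model answers YES to: ${c.question}`
    case 'json_predicate':
      return `JSON value at ${c.path} equals ${JSON.stringify(c.equals)}`
    default: {
      // Exhaustiveness check: adding a variant without a case fails type-checking here.
      const unreachable: never = c
      return String(unreachable)
    }
  }
}
```

The never assignment in the default branch is what makes the final "Type-check" step useful: forgetting a case for a new variant becomes a compile error instead of a silent fallthrough.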
See also:
- Goal-Runner-Overview: user-facing description
- Goal-Policy-and-Risk-Levels: what activePolicy does
- Critic-and-Replan: the runCritic deep dive
- Tool-Registry: what toolRegistry.dispatch does
- Data-Storage-and-Migration: goal-store on disk