Goal Runner Internals

nick3 edited this page May 28, 2026 · 1 revision

Deep dive for contributors. If you just want to use the goal runner, read Goal-Runner-Overview instead.

Source: src/main/goal-runner.ts, src/main/goal-store.ts, src/main/goal-policy.ts.


Class shape

class GoalRunner {
  constructor(
    window: BrowserWindow,
    aiManager: AIManager,
    aiMemoryStore: AIMemoryStore,
    aiStore: AIStore,
    agentStore: AgentStore,
    goalStore: GoalStore
  )

  start(input: StartGoalInput): Promise<{ goalId: string; error?: string }>
  abort(goalId: string): boolean
  status(goalId: string): { status, step, lastStep? } | null
}

StartGoalInput:

{
  paneId: string
  goal: string
  successCriterion: SuccessCriterion
  policy: GoalPolicy
  providerId?: string
  personaId?: string
  wallClockMs?: number          // default 1 hour
  criticIntervalSteps?: number  // default 5
  criticProviderId?: string     // defaults to main provider
}
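As a rough illustration of how the two documented defaults might be applied, here is a minimal sketch. `resolveTuning`, `RunTuning`, and the constant names are hypothetical; the real runner may well apply these defaults inline in `start()`.

```typescript
// Hypothetical subset of StartGoalInput: only the fields with documented defaults.
interface RunTuning {
  wallClockMs?: number
  criticIntervalSteps?: number
}

interface ResolvedTuning {
  wallClockMs: number
  criticIntervalSteps: number
}

const DEFAULT_WALL_CLOCK_MS = 60 * 60 * 1000 // 1 hour, per the comment on wallClockMs
const DEFAULT_CRITIC_INTERVAL_STEPS = 5      // per the comment on criticIntervalSteps

function resolveTuning(input: RunTuning): ResolvedTuning {
  return {
    wallClockMs: input.wallClockMs ?? DEFAULT_WALL_CLOCK_MS,
    // `??` rather than `||` so that an explicit 0 is preserved;
    // the simplified loop only runs the critic when criticIntervalSteps > 0.
    criticIntervalSteps: input.criticIntervalSteps ?? DEFAULT_CRITIC_INTERVAL_STEPS
  }
}
```

Under this sketch, `resolveTuning({})` yields a one-hour cap and a critic every 5 steps, while `resolveTuning({ criticIntervalSteps: 0 })` keeps the 0 and so leaves the critic off.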

Per-run state (RuntimeGoal)

interface RuntimeGoal {
  checkpoint: GoalCheckpoint
  state:
    | { kind: 'running'; abortRequested: boolean; pauseRequested: boolean }
    | { kind: 'done' }
  startedAt: number
  wallClockMs: number
  criticIntervalSteps: number
  criticProviderId?: string
  stepsSinceLastCritic: number
  pendingClaim?: { rationale: string }
  pendingAbort?: { reason: string; report: string }
}

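pendingClaim is set by the claim_complete transient tool; pendingAbort by abort_with_report. The loop body checks both right after it has dispatched the turn's tool calls (step 6 of runLoop below), so a claim or abort made by the model is handled in the same iteration. A pendingClaim set by the critic (ACHIEVED verdict) is picked up at step 6 of the following iteration.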


Transient tools

registerTransientTools() registers two tools globally (into the regular toolRegistry):

claim_complete(rationale)
abort_with_report(reason, what_was_learned)

When called outside an active goal, they no-op with {success: false, message: "claim_complete called outside an active goal run."} — so they're safe to leave registered all the time.

When called inside an active goal, they set pendingClaim / pendingAbort on the most recently started runtime goal:

private findActiveForCurrentCaller(): RuntimeGoal | undefined {
  let latest: RuntimeGoal | undefined
  for (const g of this.running.values()) {
    if (g.state.kind === 'running' && (!latest || g.startedAt > latest.startedAt)) {
      latest = g
    }
  }
  return latest
}

For now, only one goal can run at a time. Multi-goal disambiguation is a roadmap item; it would require correlating each tool call back to the AIManager invocation that issued it.
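The two behaviours of a transient tool (no-op outside a goal, set a pending flag inside one) can be sketched as a standalone function. This is an illustration only: `RuntimeGoalLike`, `ToolOutcome`, `handleClaimComplete`, and the success message are assumptions, not the actual handler in goal-runner.ts.

```typescript
// Hypothetical, trimmed-down stand-in for RuntimeGoal.
interface RuntimeGoalLike {
  state: { kind: 'running' } | { kind: 'done' }
  startedAt: number
  pendingClaim?: { rationale: string }
}

interface ToolOutcome {
  success: boolean
  message: string
}

// Same selection rule as findActiveForCurrentCaller: the latest-started running goal.
function findLatestRunning(goals: RuntimeGoalLike[]): RuntimeGoalLike | undefined {
  let latest: RuntimeGoalLike | undefined
  for (const g of goals) {
    if (g.state.kind === 'running' && (!latest || g.startedAt > latest.startedAt)) {
      latest = g
    }
  }
  return latest
}

function handleClaimComplete(goals: RuntimeGoalLike[], rationale: string): ToolOutcome {
  const active = findLatestRunning(goals)
  if (!active) {
    // No running goal: harmless no-op, so the tool can stay registered globally.
    return { success: false, message: 'claim_complete called outside an active goal run.' }
  }
  // Only record the claim; verification happens later in the loop.
  active.pendingClaim = { rationale }
  return { success: true, message: 'Claim recorded.' } // wording is an assumption
}
```

Note that the handler never ends the goal itself. It only leaves a flag for the loop, which keeps verification in one place.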


The loop (runLoop)

Simplified:

async runLoop(runtime, provider, apiKey, initialMessages) {
  aiManager.setActivePolicy(runtime.checkpoint.policy)

  const goalPrompt = {
    id: uuidv4(),
    role: 'user',
    content: buildGoalPrompt(runtime.checkpoint),  // includes contract
    timestamp: Date.now()
  }
  const messages = [...initialMessages, goalPrompt]

  while (true) {
    // 1. Wall-clock cap
    if (Date.now() - runtime.startedAt > runtime.wallClockMs) {
      return endGoal(runtime, 'failed', 'Wall-clock cap exceeded')
    }

    // 2. External abort
    if (runtime.state.kind === 'running' && runtime.state.abortRequested) {
      return endGoal(runtime, 'aborted', 'User aborted')
    }

    // 3. Pause
    if (runtime.state.kind === 'running' && runtime.state.pauseRequested) {
      await sleep(500)
      continue
    }

    // 4. One model turn
    const assistant = await aiManager.streamMessage(messages, provider, apiKey)
    if (!assistant) return endGoal(runtime, 'failed', 'No message from model')
    messages.push(assistant)

    const toolCalls = assistant.toolCalls ?? []
    if (toolCalls.length === 0) {
      // Nudge the model — it's not allowed to stop without claim_complete / abort_with_report
      messages.push(makeUserNudge('You must continue the loop. Call a tool…'))
      continue
    }

    // 5. Dispatch each tool call
    let nonTransientSteps = 0
    for (const tc of toolCalls) {
      const result = await aiManager.executeTool(tc)  // policy enforced inside
      goalStore.appendStep(runtime.checkpoint.id, {
        tool: tc.name, args: tc.arguments,
        resultPreview: previewResult(result.result),
        ok: !result.error,
        elapsedMs: Date.now() - runtime.startedAt
      })
      emitEvent({type: 'step', goalId: runtime.checkpoint.id, tool: tc.name, ok: !result.error, preview: ...})
      messages.push(makeToolResultMessage(result))
      if (tc.name !== 'claim_complete' && tc.name !== 'abort_with_report') {
        nonTransientSteps++
      }
    }
    runtime.stepsSinceLastCritic += nonTransientSteps

    // 6. Handle runner-driven exits
    if (runtime.pendingAbort) {
      const { reason, report } = runtime.pendingAbort
      return endGoal(runtime, 'aborted', `${reason}\n\nWhat was learned:\n${report}`)
    }
    if (runtime.pendingClaim) {
      const claim = runtime.pendingClaim
      runtime.pendingClaim = undefined
      const verdict = await verifySuccessCriterion(criterion, claim.rationale, provider, apiKey)
      if (verdict.verified) {
        return endGoal(runtime, 'completed', verdict.detail)
      }
      messages.push(makeSystemMessage(`Verification failed: ${verdict.detail}. Keep working…`))
      emitEvent({type: 'verification_failed', goalId: runtime.checkpoint.id, detail: verdict.detail})
    }

    // 7. Critic check
    if (runtime.criticIntervalSteps > 0 && runtime.stepsSinceLastCritic >= runtime.criticIntervalSteps) {
      runtime.stepsSinceLastCritic = 0
      await runCritic(runtime, provider, apiKey, messages)
    }
  }
}
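The three guards at the top of each iteration (steps 1 to 3) can be read as one pure decision function, which makes their precedence easier to see and to unit-test. This is a sketch only; `preTurnDecision` and `PreTurnDecision` are hypothetical names, and the real loop performs these checks inline.

```typescript
type LoopState =
  | { kind: 'running'; abortRequested: boolean; pauseRequested: boolean }
  | { kind: 'done' }

// Hypothetical result type: what the loop should do before asking the model for a turn.
type PreTurnDecision = 'timeout' | 'abort' | 'wait' | 'proceed'

function preTurnDecision(
  state: LoopState,
  startedAt: number,
  wallClockMs: number,
  now: number
): PreTurnDecision {
  // 1. Wall-clock cap is checked first, so it wins over everything else.
  if (now - startedAt > wallClockMs) return 'timeout'
  // 2. External abort beats pause.
  if (state.kind === 'running' && state.abortRequested) return 'abort'
  // 3. Paused: the caller sleeps briefly and re-evaluates.
  if (state.kind === 'running' && state.pauseRequested) return 'wait'
  return 'proceed'
}
```

If the real loop keeps the order shown in the simplified version, two consequences follow: a paused goal can still hit the wall-clock cap, and an abort requested during a pause takes effect on the next 500 ms poll.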

Success-criterion verification

private async verifySuccessCriterion(c, rationale, provider, apiKey): Promise<{verified, detail?}> {
  switch (c.type) {
    case 'shell':
      // spawn(c.command, {shell: true}); check exit code; surface stderr/stdout tail on fail
    case 'manual':
      // accept rationale verbatim
    case 'model_question':
      // sendMessage with strict YES/NO judge prompt; first token decides
    case 'json_predicate':
      // TODO — accepted with note for now
  }
}

See Success-Criteria for the user-facing description.
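For the model_question case, "first token decides" could look roughly like the following. This is a sketch under assumptions: `parseJudgeReply` and the `Verdict` shape are hypothetical, and the punctuation stripping is a guess at how lenient the real parser is.

```typescript
interface Verdict {
  verified: boolean
  detail: string
}

function parseJudgeReply(reply: string): Verdict {
  const trimmed = reply.trim()
  // First whitespace-delimited token, e.g. "Yes," or "NO."
  const firstToken = trimmed.split(/\s+/)[0]
  // Drop punctuation and normalise case so "yes." and "YES" compare equal.
  const normalized = firstToken.replace(/[^A-Za-z]/g, '').toUpperCase()
  // Anything other than an exact YES is treated as not verified (fail closed).
  return { verified: normalized === 'YES', detail: trimmed }
}
```

Failing closed means an empty or rambling judge reply keeps the goal running rather than completing it by accident.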


Critic

runCritic(runtime, mainProvider, mainApiKey, messages):

  1. Pick critic provider (alt provider via criticProviderId, else main)
  2. Take the last criticIntervalSteps × 2 steps from runtime.checkpoint.steps
  3. Build a strict judge prompt ("Reply on first line with PROGRESSING / STUCK / ACHIEVED / MISLED. Second line: reason.")
  4. aiManager.sendMessage(judgePrompt, criticProvider, criticKey) (non-streaming)
  5. Parse first token of first line; uppercase
  6. Emit critic event over goal:event IPC
  7. Switch on verdict:
    • PROGRESSING: no-op
    • STUCK: inject system message asking for a different approach
    • ACHIEVED: set pendingClaim with synthetic rationale
    • MISLED: inject re-anchor on original goal

Full details: Critic-and-Replan.
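Steps 2 and 5 of the critic flow (selecting the step window and parsing the reply) are pure and can be sketched in isolation. The names `criticWindow`, `parseCriticReply`, and `isVerdict` are hypothetical, and the fallback to PROGRESSING for an unrecognised verdict is an assumption; the real runner may handle that case differently.

```typescript
type CriticVerdict = 'PROGRESSING' | 'STUCK' | 'ACHIEVED' | 'MISLED'

const VERDICTS: readonly string[] = ['PROGRESSING', 'STUCK', 'ACHIEVED', 'MISLED']

function isVerdict(token: string): token is CriticVerdict {
  return VERDICTS.indexOf(token) !== -1
}

// Step 2: the last criticIntervalSteps × 2 steps.
function criticWindow<T>(steps: readonly T[], criticIntervalSteps: number): T[] {
  const size = criticIntervalSteps * 2
  // Guard needed because slice(-0) would return the whole array.
  return size > 0 ? steps.slice(-size) : []
}

// Step 5: first token of the first line, uppercased; second line onward is the reason.
function parseCriticReply(reply: string): { verdict: CriticVerdict; reason: string } {
  const lines = reply.trim().split(/\r?\n/)
  const firstLine = lines[0] ?? ''
  const token = firstLine.trim().split(/\s+/)[0].replace(/[^A-Za-z]/g, '').toUpperCase()
  return {
    // Assumption: an unparseable verdict is treated as the no-op PROGRESSING.
    verdict: isVerdict(token) ? token : 'PROGRESSING',
    reason: lines.slice(1).join(' ').trim()
  }
}
```

For example, a reply of `"stuck:\nSame command retried three times"` parses to verdict STUCK with the second line as the reason.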


Persistence (GoalStore)

Backed by electron-store at clusterspace-goals.json.

Schema:

{ goals: GoalCheckpoint[], maxGoals: 50, maxStepsPerGoal: 500 }

Behavior:

  • create() unshifts the new goal to position 0 (most recent first), then trims to maxGoals, always keeping in-flight goals
  • appendStep() pushes onto goal.steps; when the count exceeds 500 (maxStepsPerGoal), it keeps the first 50 and the last 450 steps, so planning context and recent context survive and the middle is dropped
  • update() patches the goal record and bumps updatedAt
  • prune() removes goals matching a predicate (default: any non-active status)
  • listResumable() returns goals with status running or paused, which are eligible for resume on next launch (the resume logic itself is a roadmap item)

The store is intentionally separate from ai-memory-store so goal-lifecycle changes can't pollute conversation history.
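The two trimming rules from the Behavior list can be sketched as pure functions. This is not the GoalStore code: `trimSteps`, `trimGoals`, and `GoalLike` are hypothetical names, and treating running and paused as "in-flight" is an assumption borrowed from listResumable().

```typescript
// appendStep() rule: past the cap, keep the first `head` steps and the most recent remainder.
function trimSteps<T>(steps: readonly T[], maxSteps = 500, head = 50): T[] {
  if (steps.length <= maxSteps) return steps.slice()
  const tail = maxSteps - head // 450 with the documented defaults
  return steps.slice(0, head).concat(steps.slice(steps.length - tail))
}

interface GoalLike {
  id: string
  status: string
}

// create() rule: goals are stored most recent first; drop the oldest finished goals
// until the list fits, but never drop an in-flight goal.
function trimGoals<G extends GoalLike>(goals: readonly G[], maxGoals = 50): G[] {
  let excess = goals.length - maxGoals
  if (excess <= 0) return goals.slice()
  const kept: G[] = []
  for (let i = goals.length - 1; i >= 0; i--) {
    const g = goals[i]
    const inFlight = g.status === 'running' || g.status === 'paused' // assumption
    if (excess > 0 && !inFlight) {
      excess--
      continue
    }
    kept.unshift(g)
  }
  return kept
}
```

One consequence of the "keep in-flight" rule in this sketch: if more than maxGoals goals are in flight, the list stays over the cap rather than losing live state.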


Events (goal:event)

Emitted via window.webContents.send('goal:event', event).

Types:

type GoalRunnerEvent =
  | { type: 'started'; goalId }
  | { type: 'step'; goalId; tool; ok; preview }
  | { type: 'verification_failed'; goalId; detail }
  | { type: 'critic'; goalId; verdict; reason }
  | { type: 'ended'; goalId; status; finalReport }

The Goal-Dashboard subscribes and updates its UI in response.
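A subscriber typically switches on `type` to decide how to render each event. The sketch below fills in field types that the listing above leaves implicit (all strings except `ok`), which is an assumption, and `describeEvent` is a hypothetical formatter rather than the dashboard's real rendering code. The `critic:` and `verify:fail` labels mirror those mentioned in the debugging section.

```typescript
// Field types are assumed; the source listing only names the fields.
type GoalRunnerEvent =
  | { type: 'started'; goalId: string }
  | { type: 'step'; goalId: string; tool: string; ok: boolean; preview: string }
  | { type: 'verification_failed'; goalId: string; detail: string }
  | { type: 'critic'; goalId: string; verdict: string; reason: string }
  | { type: 'ended'; goalId: string; status: string; finalReport: string }

function describeEvent(e: GoalRunnerEvent): string {
  switch (e.type) {
    case 'started':
      return `started ${e.goalId}`
    case 'step':
      return `${e.ok ? 'ok' : 'fail'} ${e.tool}: ${e.preview}`
    case 'verification_failed':
      return `verify:fail ${e.detail}`
    case 'critic':
      return `critic:${e.verdict.toLowerCase()} ${e.reason}`
    case 'ended':
      return `ended ${e.status}`
    default: {
      // Compile-time exhaustiveness check: adding an event type without a case fails here.
      const unhandled: never = e
      return unhandled
    }
  }
}
```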


How policy enforcement plugs into dispatch

AIManager.executeTool consults this.activePolicy (set via setActivePolicy when a goal run begins, cleared by endGoal):

async executeTool(toolCall) {
  // 1. Policy gate (if active)
  const policy = this.activePolicy
  if (policy) {
    const perm = getPermissions(toolCall.name, toolCall.arguments)
    const verdict = evaluate(toolCall.name, perm, policy)
    if (!verdict.allow) {
      if (verdict.needsApproval) {
        const approved = await requestApproval(toolCall, verdict.reason)
        if (!approved) return { error: verdict.reason, ... }
      } else {
        return { error: verdict.reason, ... }
      }
    }
  } else {
    // 2. Legacy regex approval gate (chat panel, no goal)
    // ...
  }

  // 3. Dispatch through registry
  const ctx = this.buildToolContext()
  const result = await toolRegistry.dispatch(toolCall.name, toolCall.arguments, ctx)

  // 4. Action log (browser tools)
  if (toolCall.name.startsWith('browser_')) appendActionLog(...)

  // 5. Truncate result if too large
  return formatAsToolResult(result)
}

How to debug a goal that's stuck

  1. Open the Goal-Dashboard — watch the step log live
  2. Look at the critic rail for verdicts. If you see critic:stuck repeatedly, the critic is detecting the stall correctly but the model isn't pivoting; check the system message that was injected in response
  3. Look at the verification rail for verify:fail entries — what's the failure detail? Often the issue is in the success criterion, not the AI
  4. Click into the step log to inspect specific tool args / results
  5. Check the goal-store file directly: <userData>/clusterspace-data/clusterspace-goals.json — contains the full step history even after the dashboard closes

Adding a new success-criterion type

  1. Extend SuccessCriterion in src/shared/types.ts with the new variant
  2. Add a case to verifySuccessCriterion in goal-runner.ts
  3. Add a case to humanizeCriterion (for the dashboard display)
  4. Add a UI tab to GoalCreateDialog.tsx with input fields
  5. Type-check

The json_predicate type is a good stub to extend (currently TODO).
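Steps 1, 3, and 5 above fit together through the type checker: once the union grows, any switch with an exhaustiveness guard stops compiling until the new case is handled. The sketch below is illustrative only. The `file_exists` variant is hypothetical, and the fields other than `command` on shell are assumptions about the real SuccessCriterion type in src/shared/types.ts.

```typescript
type SuccessCriterion =
  | { type: 'shell'; command: string }
  | { type: 'manual' }
  | { type: 'model_question'; question: string }  // field name assumed
  | { type: 'json_predicate'; predicate: string } // field name assumed
  | { type: 'file_exists'; path: string }         // hypothetical new variant (step 1)

// Stand-in for the dashboard's humanizeCriterion (step 3); wording is illustrative.
function humanizeCriterion(c: SuccessCriterion): string {
  switch (c.type) {
    case 'shell':
      return `Shell command succeeds: ${c.command}`
    case 'manual':
      return 'Manual confirmation'
    case 'model_question':
      return `Model answers YES to: ${c.question}`
    case 'json_predicate':
      return `JSON predicate holds: ${c.predicate}`
    case 'file_exists':
      return `File exists: ${c.path}`
    default: {
      // If a variant is added to the union but not handled above,
      // this assignment is a compile error, which is what step 5 catches.
      const missing: never = c
      return missing
    }
  }
}
```

Adding the same `never` guard to the switch in verifySuccessCriterion would give step 2 the same safety net, assuming the real function does not already have one.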

