Starting a Goal

nick3 edited this page May 28, 2026 · 1 revision

Walkthrough of the New Goal dialog. Open it with Ctrl+Shift+G, then click + New Goal.

Component: src/renderer/components/GoalCreateDialog.tsx.


Fields

Pane

Pick the pane the goal runs in. Dropdown lists every pane in the current workspace with its label and type (terminal / browser).

The pane's existing PTY or browser webview is what the goal will interact with — no new pane is created. For browser-driving goals, point at a browser pane; for shell-driving goals, point at a terminal pane.

Goal

Free-form description of what you want done. Be specific — the AI takes this verbatim as its instruction.

| Good | Vague |
| --- | --- |
| "Install nginx, configure to serve /var/www/html on port 80, verify with curl -sf http://localhost" | "Set up the server" |
| "On https://dashboard.example/reports, click 'Download CSV', save the file to ~/Downloads, then mv it to /tmp/report.csv" | "Get the report" |
| "Run npm test, fix any failures, repeat until exit 0" | "Make the tests pass" |

A concrete goal lets you pick a concrete success criterion.

Success Criterion

How the runner knows the goal is done. Four types, picked via tabs:

  • shell — exit code of a shell command (default: 0). Most reliable; use whenever you can express success as a command.
  • model_question — a yes/no question a judge model answers about the AI's rationale. Use for visual/subjective outcomes.
  • json_predicate — JSON expression. Currently a documentation-only field (predicate evaluator deferred). Use model_question instead until evaluator ships.
  • manual — trust the rationale. The runner accepts whatever the AI says; you review later in the dashboard.

See Success-Criteria for examples of each.
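The four criterion types can be modelled as a discriminated union. The following is a minimal sketch, not the runner's actual types: the field names (`command`, `expectedExitCode`, `question`, `expression`), the `Observation` shape, and the `isSatisfied` helper are all hypothetical and chosen only to mirror the descriptions above.

```typescript
// Hypothetical shapes; only the four type names come from this page.
type SuccessCriterion =
  | { type: "shell"; command: string; expectedExitCode?: number }
  | { type: "model_question"; question: string }
  | { type: "json_predicate"; expression: string }
  | { type: "manual" };

// Hypothetical record of what was observed when the goal stopped.
interface Observation {
  exitCode?: number;          // exit code of the shell command, if one ran
  judgeAnswer?: "yes" | "no"; // the judge model's answer, if one was asked
}

function isSatisfied(criterion: SuccessCriterion, obs: Observation): boolean {
  switch (criterion.type) {
    case "shell":
      // Expected exit code defaults to 0, as described above.
      return obs.exitCode === (criterion.expectedExitCode ?? 0);
    case "model_question":
      return obs.judgeAnswer === "yes";
    case "json_predicate":
      // The evaluator is deferred, so this sketch refuses to guess.
      throw new Error("json_predicate evaluator is not available yet");
    case "manual":
      // Trust the rationale; a human reviews it later.
      return true;
  }
}
```

A shell criterion is the only one here that is decided by a mechanical comparison, which is why it is the most reliable choice when success can be expressed as a command.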

Policy — risk ceiling

Radio buttons:

  • Read-only — only inspect state. No mutations.
  • Write local — run commands, edit files, type in browser. No network forms.
  • Network GET — navigate to URLs, read pages.
  • Network write — submit forms, set cookies, POST.
  • Spends money — checkout / payment / banking. Use sparingly.

Each option has a one-line help text showing what's actually allowed.

The model is constrained at runtime: tools beyond this risk level prompt the user (via the approval modal) before executing. See Goal-Policy-and-Risk-Levels for the full per-tool table.
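The runtime gate can be pictured as a comparison against an ordered list of risk levels. This is a sketch under two assumptions: that the levels are ordered exactly as the radio buttons are listed, and that the identifiers follow the `write_local` / `network_write` pattern used in the examples on this page (`read_only`, `network_get`, and `spends_money` are guessed). `needsApproval` is a hypothetical helper, not a function from the codebase.

```typescript
// Assumed order: lowest risk first, matching the radio-button list.
const RISK_LEVELS = [
  "read_only",
  "write_local",
  "network_get",
  "network_write",
  "spends_money",
] as const;

type RiskLevel = (typeof RISK_LEVELS)[number];

// A tool whose risk exceeds the goal's ceiling triggers the approval modal.
function needsApproval(toolRisk: RiskLevel, ceiling: RiskLevel): boolean {
  return RISK_LEVELS.indexOf(toolRisk) > RISK_LEVELS.indexOf(ceiling);
}
```

For example, with a `write_local` ceiling, `needsApproval("network_write", "write_local")` is `true`, while a tool at or below the ceiling runs without a prompt. The real per-tool mapping is in Goal-Policy-and-Risk-Levels.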

Sandbox dir (optional)

For goals that touch the filesystem, restrict file-touching tools (browser_save_pdf, browser_save_html, browser_set_files) to paths inside this directory. Tools wanting to touch paths outside the sandbox prompt the user.
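A containment check of this kind is usually done by resolving both paths and inspecting the relative path between them. The sketch below uses Node's `path` module; `isInsideSandbox` is a hypothetical name, and resolving relative candidates against the sandbox directory is an assumption of this example, not documented behaviour.

```typescript
import * as path from "node:path";

// Returns true when `candidate` resolves to the sandbox dir or a path below it.
function isInsideSandbox(sandboxDir: string, candidate: string): boolean {
  const root = path.resolve(sandboxDir);
  // Assumption: relative candidates are interpreted relative to the sandbox.
  const target = path.resolve(root, candidate);
  const rel = path.relative(root, target);
  if (rel === "") return true; // the sandbox directory itself
  // Outside if the relative path climbs out ("..") or lands on another root.
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}
```

With a sandbox of `/tmp`, `/tmp/today.csv` is inside, while `/tmp/../etc/passwd` and `/tmpfoo/x` are not. Note that `path.resolve` is purely lexical and does not follow symlinks; a stricter check would resolve real paths first.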

Wall clock (minutes)

Hard cap on how long the goal can run. Default 60 (= 1 hour). If exceeded, the goal ends as failed with a wall-clock-exceeded message.

For long automations, increase it to 120 or 240. For short tasks, decrease it to 5 or 10 so that a runaway loop cannot burn through your budget.
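The cap amounts to a deadline check against the goal's start time. A minimal sketch, assuming the runner tracks a start timestamp in milliseconds; `wallClockExceeded` and its parameter names are hypothetical.

```typescript
const DEFAULT_WALL_CLOCK_MINUTES = 60; // default stated on this page

// True once more than `limitMinutes` have elapsed since the goal started.
function wallClockExceeded(
  startedAtMs: number,
  nowMs: number,
  limitMinutes: number = DEFAULT_WALL_CLOCK_MINUTES,
): boolean {
  return nowMs - startedAtMs > limitMinutes * 60_000;
}
```

A loop would call this before each step, for example `wallClockExceeded(startedAt, Date.now(), 15)`, and end the goal as failed when it returns `true`.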

Critic fires every N steps

The critic is a sibling model call that judges progress every N tool calls. 0 = disabled. Default 5. See Critic-and-Replan.

Lower it (for example, 3) for tight oversight; raise it (for example, 10) for fewer interruptions; set it to 0 if you trust the model and want to save the cost of the critic calls.
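The interval can be read as a simple modulo check on the tool-call counter. This is a sketch of that reading, not the runner's code; `criticDue` is a hypothetical helper, and counting from 1 is an assumption.

```typescript
// True when the critic should run after the Nth tool call.
// An interval of 0 (or less) disables the critic entirely.
function criticDue(toolCallCount: number, interval: number = 5): boolean {
  if (interval <= 0) return false;
  return toolCallCount > 0 && toolCallCount % interval === 0;
}
```

With the default of 5, the critic would fire after tool calls 5, 10, 15, and so on; with an interval of 3 it fires after calls 3, 6, 9.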


Click Start Goal

The runner:

  1. Creates a GoalCheckpoint in the goal store (status pending)
  2. Sets the policy on AIManager (gates all subsequent tool calls)
  3. Marks the agent on the chosen pane as working
  4. Kicks off the async loop (returns immediately with the goal ID)
  5. The dashboard auto-selects the new goal so you can watch it run

If anything's invalid (no pane, empty goal text, malformed criterion), the dialog shows a red error band at the bottom and the goal is not created.
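The three validation failures named above can be sketched as a function that collects error messages before any goal is created. The `GoalFormInput` shape, the field names, and the message strings are hypothetical; the real checks live in GoalCreateDialog.tsx and may differ.

```typescript
// Hypothetical form shape; not the component's actual props or state.
interface GoalFormInput {
  paneId: string | null;
  goalText: string;
  criterion: { type: string; command?: string; question?: string };
}

// Returns a list of problems; an empty list means the goal can be created.
function validateGoalForm(form: GoalFormInput): string[] {
  const errors: string[] = [];
  if (!form.paneId) errors.push("Select a pane.");
  if (form.goalText.trim() === "") errors.push("Goal text must not be empty.");

  const { type, command, question } = form.criterion;
  if (type === "shell") {
    if (!command || command.trim() === "") errors.push("Shell criterion needs a command.");
  } else if (type === "model_question") {
    if (!question || question.trim() === "") errors.push("Model question must not be empty.");
  } else if (type !== "json_predicate" && type !== "manual") {
    errors.push(`Unknown criterion type: ${type}`);
  }
  return errors;
}
```

A dialog built this way would show the returned messages in the error band and skip goal creation whenever the list is non-empty.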


Three concrete examples

Example 1: Install nginx

  • Pane: a local terminal pane
  • Goal: "Install nginx, start it, configure to serve /var/www/html. Verify with curl."
  • Success criterion: shell curl -sf http://localhost, exit 0
  • Policy: write_local
  • Sandbox: (empty)
  • Wall clock: 15
  • Critic interval: 5

Example 2: Download CSV report

  • Pane: a browser pane (must be logged in to dashboard first)
  • Goal: "On https://dashboard.example/reports/today, click 'Download CSV' and save the file as /tmp/today.csv"
  • Success criterion: shell test -f /tmp/today.csv, exit 0
  • Policy: network_write
  • Sandbox: /tmp
  • Wall clock: 5
  • Critic interval: 3

Example 3: Fix failing tests

  • Pane: a terminal pane in your repo
  • Goal: "Run npm test. If it fails, identify the failing test, fix the underlying issue, and re-run until the suite passes."
  • Success criterion: shell npm test, exit 0
  • Policy: write_local
  • Sandbox: (empty — or your repo dir if you want to be strict)
  • Wall clock: 30
  • Critic interval: 5

After clicking Start Goal

The dialog closes; the dashboard shows the new goal at the top of the list. The step log fills in live as the AI takes actions. The critic rail at the bottom shows verdicts and verification results.

To watch progress, leave the dashboard open. The Goals pill in the status bar pulses while any goal is running.

