Ultraswarm is a durable multi-worker coding orchestrator for Codex, Claude Code, Cursor Agent, Grok, and shell usage. One standalone Node runner owns decomposition, worker routing, process supervision, isolated Git worktrees, adaptive review, transactional integration, approvals, recovery, and reporting.
- agent worker: the Cursor CLI (agent -p --force) as a headless shell worker for isolated worktree execution. See Cursor Agent Worker.
- Cursor agent host skill: install with scripts/install-cursor-skill.sh so Cursor sessions can orchestrate via the standalone runner. See Cursor Agent.
- small-harness worker: SmallHarness as a built-in worker with MCP integration and multi-backend support. See SmallHarness Worker.
- User-defined harness aliases: register your own CLI entries under a new top-level aliases config key. Each alias extends a built-in (inheriting its binary, timeout, effort flags, and capabilities), overrides only its specialty, models, and invocation, and can cap routing with maxTier. This generalizes the previously hardcoded pi-local and is strictly opt-in. See Harness Aliases.
- pi worker: the provider-agnostic pi coding CLI (Anthropic Claude spread by default). See Local / Private Models.
- pi-local worker: an always-on local/private worker that drives Ollama models through the same pi binary for fully offline-capable runs.
- Per-task effort levels: the decomposition brain assigns reasoning effort per task, independent of model tier, defaulting to low, with effort-first QA escalation. See Effort Levels.
- SQLite state and append-only events under .ultraswarm/state.sqlite
- Capability and repository-metric worker routing with explanations
- Supervised worker process groups, timeouts, cancellation, redacted bounded logs
- Executable task contracts and forbidden-path policy
- Integration branches that do not modify the checked-out branch
- Separate plan and merge approvals
- Crash/status/log/export commands and stale-base recovery
- Generated Claude, Codex, Grok, and Cursor agent skills from one provenance-locked contract
Node 22 or newer is required because ultraswarm uses the built-in node:sqlite
API.
git clone https://github.com/fubak/ultraswarm.git ~/projects/ultraswarm
cd ~/projects/ultraswarm
npm install
bash scripts/install-codex-skill.sh
This creates:
~/.agents/skills/ultraswarm -> ~/projects/ultraswarm/hosts/codex/skills/ultraswarm
Restart Codex and invoke $ultraswarm.
For Claude Code, install the plugin:
/plugin marketplace add fubak/ultraswarm
/plugin install ultraswarm@ultraswarm
Invoke /ultraswarm.
bash scripts/install-cursor-skill.sh
This creates:
~/.cursor/skills/ultraswarm -> ~/projects/ultraswarm/hosts/agent/skills/ultraswarm
Restart Cursor and invoke the ultraswarm skill. The host prepares plans and
delegates execution to bin/ultraswarm.mjs; it does not implement feature work
directly.
Install the Cursor CLI separately if you also want agent as a worker:
curl https://cursor.com/install -fsS | bash
agent --version
Run node ~/projects/ultraswarm/bin/ultraswarm.mjs .... The generated Grok host
contract is at hosts/grok/skills/ultraswarm/SKILL.md.
- A Git repository
- Node 22+
- At least two authenticated worker CLIs from codex, gemini, grok, agy, droid, opencode, pi, pi-local, small-harness, and agent
- An authenticated claude CLI for the default QA/decomposition brain, or ANTHROPIC_API_KEY with ULTRASWARM_BRAIN=anthropic-api
Check readiness:
node ~/projects/ultraswarm/bin/ultraswarm.mjs doctor
node ~/projects/ultraswarm/bin/ultraswarm.mjs workers
Create a plan:
{
  "tasks": [
    {
      "id": "api-tests",
      "description": "Add regression coverage for the API",
      "files": ["test/api.test.mjs"],
      "complexity_score": 25,
      "risk": "routine",
      "effort": "low",
      "dependencies": [],
      "prompt": "Add focused regression tests for invalid request handling.",
      "contract": {
        "commands": ["npm test"],
        "assertions": ["Invalid requests return 400"],
        "allowed_paths": ["test"]
      }
    }
  ]
}
cli, model_tier, and effort are optional. When cli/model_tier are omitted,
ultraswarm ranks healthy workers using capability fit and repository-local pass, latency,
and cost history. When effort is omitted it defaults to low (see Effort Levels).
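
The defaulting rules above can be sketched as a small normalization step. This is an illustration only, not ultraswarm's own code: normalizeTask is a hypothetical helper, treating id, description, and prompt as required is an assumption, and null is used here simply to mean "let the router choose".

```javascript
// Hypothetical helper, not part of ultraswarm. It mirrors the documented
// defaults: effort falls back to "low"; cli and model_tier may be omitted.
const EFFORT_LEVELS = ['off', 'low', 'medium', 'high', 'xhigh'];

function normalizeTask(task) {
  // Assumption: these three string fields are treated as required.
  for (const key of ['id', 'description', 'prompt']) {
    if (typeof task[key] !== 'string' || task[key].length === 0) {
      throw new Error(`task is missing required field: ${key}`);
    }
  }
  const effort = task.effort ?? 'low'; // documented default
  if (!EFFORT_LEVELS.includes(effort)) {
    throw new Error(`unknown effort level: ${effort}`);
  }
  return {
    ...task,
    effort,
    cli: task.cli ?? null,               // null: leave the choice to routing
    model_tier: task.model_tier ?? null, // null: leave the choice to routing
    dependencies: task.dependencies ?? [],
  };
}

const normalizedExample = normalizeTask({
  id: 'api-tests',
  description: 'Add regression coverage for the API',
  prompt: 'Add focused regression tests for invalid request handling.',
});
console.log(normalizedExample.effort, normalizedExample.cli); // low null
```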
Preview without executing:
node ~/projects/ultraswarm/bin/ultraswarm.mjs run \
  --plan-file .ultraswarm-plan.json
Approve the plan and execute:
node ~/projects/ultraswarm/bin/ultraswarm.mjs run \
  --plan-file .ultraswarm-plan.json \
  --approve-plan
The run finishes in awaiting_merge. Your checked-out branch has not changed.
After reviewing status and logs, provide the separate merge approval:
node ~/projects/ultraswarm/bin/ultraswarm.mjs status <run-id>
node ~/projects/ultraswarm/bin/ultraswarm.mjs logs <run-id>
node ~/projects/ultraswarm/bin/ultraswarm.mjs merge <run-id> --approve
The final merge is fast-forward only. If the target branch moved, the run enters
stale_base; recover it with:
node ~/projects/ultraswarm/bin/ultraswarm.mjs resume <run-id>

| Command | Purpose |
|---|---|
| run | Preview or execute a plan |
| merge <id> --approve | Approve and fast-forward integrated work |
| status [id] | List runs or inspect durable state |
| logs <id> | Read append-only events |
| cancel <id> | Terminate worker process trees |
| resume <id> | Recover awaiting-merge or stale-base state |
| doctor | Validate policy, gates, and worker health |
| workers | Show worker health and capabilities |
| explain-routing <task> | Explain worker rankings |
| export <id> | Export run provenance as JSON |

Legacy --plan-file ... --yes syntax remains as a v2 compatibility shim.
--yes maps only to plan approval; it never approves the final merge.
Exit codes are 0 success, 1 runtime failure, 2 usage error, 3 approval
required, and 4 blocked or stale state.
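
For scripts that wrap the runner, the exit codes above can be turned into a lookup. The sketch below is an assumption-laden illustration: describeExit is a hypothetical helper, and the suggested next steps are inferred from the commands described in this README rather than taken from ultraswarm itself.

```javascript
// Exit-code meanings as listed above.
const EXIT_CODES = {
  0: 'success',
  1: 'runtime failure',
  2: 'usage error',
  3: 'approval required',
  4: 'blocked or stale state',
};

// Hypothetical helper: maps a code to its meaning plus a suggested follow-up.
// The follow-up hints are assumptions, not behavior guaranteed by the runner.
function describeExit(code) {
  const meaning = EXIT_CODES[code] ?? 'unknown exit code';
  const hints = {
    3: 'grant the pending approval (--approve-plan, or merge <id> --approve)',
    4: 'inspect with status <id>, then consider resume <id>',
  };
  return { code, meaning, nextStep: hints[code] ?? null };
}

console.log(describeExit(3).meaning); // approval required
```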
Add policy to ultraswarm.config.json:
{
  "enabled": ["codex", "gemini"],
  "workerEnvAllowlist": ["OPENAI_API_KEY"],
  "policy": {
    "minimumHealthyWorkers": 2,
    "maxParallelWorkers": 4,
    "requireCompetitionForRisk": ["high"],
    "approvals": {
      "beforeExecution": true,
      "beforeMerge": true
    },
    "forbiddenPaths": [".env", ".env.*", "infra/prod/**"],
    "maxCostUsd": 10,
    "isolation": "native",
    "containerImage": null,
    "network": "allow"
  }
}
Project configuration overrides the global
~/.claude/ultraswarm.config.json. For container isolation, set containerImage to an image containing the selected worker CLIs. Network denial requires container
isolation and is rejected when configured with native isolation.
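
The isolation rule can be checked before a run with something like the sketch below. It is not ultraswarm's validator: validatePolicy is hypothetical, and the value spellings "container" and "deny" are assumptions, since only "native" and "allow" appear in the example config.

```javascript
// Hypothetical pre-flight check for the policy block shown above.
// Assumed spellings: isolation "container" and network "deny".
function validatePolicy(policy) {
  const errors = [];
  const isolation = policy.isolation ?? 'native';
  const network = policy.network ?? 'allow';

  // Container isolation needs an image that contains the worker CLIs.
  if (isolation === 'container' && !policy.containerImage) {
    errors.push('container isolation requires containerImage');
  }
  // Network denial is only valid together with container isolation.
  if (network === 'deny' && isolation !== 'container') {
    errors.push('network denial requires container isolation');
  }
  return errors;
}

console.log(validatePolicy({ isolation: 'native', network: 'deny' }));
// [ 'network denial requires container isolation' ]
```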
Beyond the built-in CLIs, you can register your own named entries under aliases. An alias
extends a built-in (inheriting its binary, timeout, effort flags, and capabilities) and
overrides only what differs — its specialty, its model tiers, and its invocation. This is how
you run several local models, each tuned for a job, through one CLI binary:
{
  "enabled": ["codex", "pi-qwen-coder"],
  "aliases": {
    "pi-qwen-coder": {
      "extends": "pi",
      "specialty": "local coding, small refactors, unit tests",
      "maxTier": "moderate",
      "models": {
        "simple": { "model": "qwen3-coder:7b", "invocation": "pi -p --provider ollama --model qwen3-coder:7b --config ~/.pi/lean.json \"$(cat .ultraswarm-prompt.txt)\"" }
      }
    }
  }
}
- Lean harness: put whatever makes a CLI's harness leaner directly in the invocation (a --config pointing at a stripped-down profile, fewer flags, etc.). Local models often do better with less wrapping.
- maxTier: caps the tiers an alias will accept. A task above the cap is clamped down (e.g. an expert task on a maxTier: moderate alias runs at moderate), so a small local model is never handed work it can't do.
- Opt-in only: nothing is auto-generated. An alias exists only if you declare it, and is active only when it appears in enabled (or when enabled is omitted entirely).
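
The clamping and inheritance described above can be sketched as follows. Both functions are hypothetical illustrations: the tier order simple < moderate < complex < expert is inferred from the tier names used in this README, the shallow override in resolveAlias is an assumption about how inheritance works, and the built-in pi entry uses made-up field values.

```javascript
// Assumed tier order, lowest to highest.
const ALIAS_TIERS = ['simple', 'moderate', 'complex', 'expert'];

// Clamp a requested tier down to the alias cap; no cap means no change.
function clampTier(requested, maxTier) {
  if (!maxTier) return requested;
  const r = ALIAS_TIERS.indexOf(requested);
  const m = ALIAS_TIERS.indexOf(maxTier);
  if (r === -1 || m === -1) throw new Error('unknown tier');
  return ALIAS_TIERS[Math.min(r, m)];
}

// Assumption: an alias shallowly overrides the built-in it extends, so
// binary, timeout, and capabilities are inherited unless redeclared.
function resolveAlias(builtins, name, alias) {
  const base = builtins[alias.extends];
  if (!base) throw new Error(`alias ${name} extends unknown worker: ${alias.extends}`);
  return { ...base, ...alias, name };
}

// Hypothetical built-in entry; the field values are illustrative only.
const exampleBuiltins = { pi: { binary: 'pi', timeoutMs: 600000, capabilities: ['edit'] } };
const qwenAlias = resolveAlias(exampleBuiltins, 'pi-qwen-coder', {
  extends: 'pi',
  specialty: 'local coding, small refactors, unit tests',
  maxTier: 'moderate',
});
console.log(qwenAlias.binary, clampTier('expert', qwenAlias.maxTier)); // pi moderate
```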
SmallHarness is a terminal-first coding agent written in Rust that supports multiple AI backends (OpenAI, OpenRouter, Ollama, LM Studio, MLX, llama.cpp). As an ultraswarm worker it brings:
- Multi-backend routing: switch between cloud and local models per-task via overrides
- MCP integration: native Model Context Protocol support for extended tool sets
- Cost tracking: real-time per-turn and session cost accounting
SmallHarness must be installed separately:
cargo install small-harness
Add small-harness to enabled to activate it. The built-in defaults use the OpenAI backend for simple tasks and OpenRouter (Claude) for moderate/complex/expert. Backend and model are passed via environment variables: SmallHarness reads BACKEND and AGENT_MODEL from the environment, not CLI flags.
To route simple tasks through a local Ollama model instead, override in ultraswarm.config.json:
{
  "enabled": ["codex", "small-harness"],
  "overrides": {
    "small-harness": {
      "models": {
        "simple": {
          "model": "qwen3-coder:7b",
          "invocation": "BACKEND=ollama AGENT_MODEL=qwen3-coder:7b small-harness --allow-tools --print \"$(cat .ultraswarm-prompt.txt)\""
        }
      }
    }
  }
}
Tool approval: ultraswarm always passes --allow-tools so SmallHarness auto-approves tool calls in one-shot mode. Do not omit this flag in custom invocations or the worker will silently deny every tool call and produce no file changes.
API keys: SmallHarness inherits only the variables in workerEnvAllowlist. The built-in defaults need OPENAI_API_KEY (simple tier) and OPENROUTER_API_KEY (moderate/complex/expert). Add both to your config:
{ "workerEnvAllowlist": ["OPENAI_API_KEY", "OPENROUTER_API_KEY"] }
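
To make the allowlist behavior concrete, here is a minimal sketch of building a worker environment from an allowlist. buildWorkerEnv is a hypothetical helper, and the always-passed baseline of PATH and HOME is an assumption; ultraswarm's actual baseline is not documented here.

```javascript
// Hypothetical sketch: copy only allowlisted variables into the worker env.
// Assumption: PATH and HOME are always passed so the CLI can be located.
function buildWorkerEnv(parentEnv, allowlist, baseline = ['PATH', 'HOME']) {
  const env = {};
  for (const name of [...baseline, ...allowlist]) {
    if (parentEnv[name] !== undefined) env[name] = parentEnv[name];
  }
  return env;
}

const workerEnvExample = buildWorkerEnv(
  { PATH: '/usr/bin', OPENAI_API_KEY: 'sk-test', AWS_SECRET_ACCESS_KEY: 'secret' },
  ['OPENAI_API_KEY', 'OPENROUTER_API_KEY'],
);
// AWS_SECRET_ACCESS_KEY is dropped; OPENROUTER_API_KEY is absent because it was never set.
console.log(Object.keys(workerEnvExample)); // [ 'PATH', 'OPENAI_API_KEY' ]
```

A missing key such as OPENROUTER_API_KEY in this example would leave the moderate/complex/expert tiers without credentials, which is why both keys belong in the allowlist and in the parent environment.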
The Cursor CLI (agent) runs headless tasks via agent -p --force in isolated worktrees.
Ultraswarm uses the same ShellWorkerAdapter as every other worker — no custom interface.
Install the CLI:
curl https://cursor.com/install -fsS | bash
agent --version
Add agent to enabled to activate it. Built-in tier mapping: simple →
composer-2.5-fast; moderate → gpt-5.4; complex/expert → Claude Sonnet 4.6 /
Opus 4.8. Override models in ultraswarm.config.json via the standard overrides key.
File writes: ultraswarm always passes --force so the agent applies edits in one-shot mode. Without --force, the CLI only proposes changes and the task fails with no_changes.
API key: headless runs need CURSOR_API_KEY. Add it to workerEnvAllowlist:
{ "workerEnvAllowlist": ["CURSOR_API_KEY"] }
When Cursor is both host and worker, keep at least one other worker enabled so high-risk tasks can satisfy competition policy.
pi and pi-local are both backed by the pi
CLI. pi runs a provider-agnostic Anthropic Claude spread; pi-local is an always-on
worker that routes through Ollama for fully local, private, offline-capable runs.
Ollama is a model backend, not an agentic worker — it cannot edit files or run commands on
its own. pi-local is the harness that drives local models with tool-calling inside an
isolated worktree.
To use pi-local:
- Install and run Ollama.
- Pull the models you want, e.g. ollama pull qwen3-coder:7b and ollama pull qwen3-coder:30b.
- Register an ollama provider and those models in ~/.pi/agent/models.json (Pi reads provider entries with baseUrl: http://localhost:11434/v1, api: openai-completions).
- Override the default model IDs in ultraswarm.config.json to match the models you pulled (see ultraswarm.config.advanced.json).
doctor and workers probe the pi binary, so a green pi-local means "pi is
installed" — not "Ollama is running." If Ollama is down, pi-local tasks fail at execution
time and are reported and retried like any other worker failure.
Local-model requirement: pi-local only works with a local model that emits structured tool-calls through Pi's provider endpoint. Many small local models (and the OpenAI-completions compatibility path) will describe an edit as plain text instead of calling the write/edit tool. Pi then has nothing to execute and no file is produced, so the task fails its contract. Choose a local model with reliable tool-calling, and treat the default qwen3-coder IDs as examples to override. Frontier-hosted providers (the pi worker) do not have this limitation.
Reasoning effort is a per-task dial, independent of model tier. The decomposition brain
assigns effort (off/low/medium/high/xhigh) to each task and defaults to low —
most routine tasks produce the same result at low effort, far faster and cheaper. High effort is
reserved for genuinely hard reasoning.
Effort is injected per CLI for the workers that expose the dial (codex, droid, pi); other
workers ignore it. On QA failure, ultraswarm escalates effort first (low → medium → high)
before spending more — the cheapest correction rung first. Routine tasks climb effort within
their model tier; high-risk and complex tasks use the full ladder, stepping up the model tier
only after effort tops out.
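
The escalation order can be sketched as a function that picks the next rung after a QA failure. This is an interpretation, not ultraswarm's implementation: nextAttempt is hypothetical, automatic escalation is assumed to stop at high (xhigh is treated as explicit-only), and effort is assumed to stay where it is when the model tier steps up.

```javascript
// Assumed ladders. "off" is placed below "low"; "xhigh" is left out on the
// assumption that automatic escalation tops out at "high".
const EFFORT_LADDER = ['off', 'low', 'medium', 'high'];
const TIER_LADDER = ['simple', 'moderate', 'complex', 'expert'];

// fullLadder is true for high-risk and complex tasks, false for routine ones.
function nextAttempt({ effort, tier, fullLadder }) {
  const e = EFFORT_LADDER.indexOf(effort);
  // Cheapest rung first: raise effort within the current model tier.
  if (e !== -1 && e < EFFORT_LADDER.length - 1) {
    return { effort: EFFORT_LADDER[e + 1], tier };
  }
  // Effort has topped out: only full-ladder tasks may step up the tier.
  const t = TIER_LADDER.indexOf(tier);
  if (fullLadder && t !== -1 && t < TIER_LADDER.length - 1) {
    return { effort, tier: TIER_LADDER[t + 1] };
  }
  return null; // nothing left to escalate
}

console.log(nextAttempt({ effort: 'low', tier: 'simple', fullLadder: false }));
// { effort: 'medium', tier: 'simple' }
```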
Set effort explicitly on a task in your plan JSON to override, or override effortFlags per CLI
in ultraswarm.config.json (see ultraswarm.config.advanced.json).
Behavior note: because effort defaults to low, an expert-tier task runs the expert model at low effort and escalates on failure. It is no longer pinned to high effort. Pin it with effort: "high" if you need maximum reasoning up front.
- Worker attempts run in separate worktrees and process groups.
- Accepted task commits are squash-integrated into ultraswarm/run-<run-id>, not the checked-out branch.
- Worker environments use an allowlist rather than inheriting secrets.
- Logs redact common credential assignments and rotate at the output limit.
- Task contracts run commands and reject changes outside allowed_paths.
- .ultraswarm/ is ignored by Git and contains SQLite state plus worker logs.
- v2 JSONL journals remain readable files but cannot be resumed as v3 runs.
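
As an illustration of the allowed_paths rule in the list above, the sketch below flags changed files that fall outside a contract's allowed directories. It is a simplified stand-in, not ultraswarm's checker: findViolations and its helpers are hypothetical, paths are assumed to be repository-relative with forward slashes, and glob patterns are not handled.

```javascript
// Normalize a repository-relative path: drop "." and empty segments and
// resolve ".." so "test/../src/a.mjs" cannot sneak past a "test" allowlist.
function normalizeRepoPath(p) {
  const parts = [];
  for (const seg of p.split('/')) {
    if (seg === '' || seg === '.') continue;
    if (seg === '..') {
      if (parts.length === 0) throw new Error(`path escapes the repository: ${p}`);
      parts.pop();
      continue;
    }
    parts.push(seg);
  }
  return parts.join('/');
}

// True when file is the allowed path itself or sits underneath it.
function isInsidePath(file, allowed) {
  const f = normalizeRepoPath(file);
  const d = normalizeRepoPath(allowed);
  if (d === '') return true; // assumption: "." allows the whole repository
  return f === d || f.startsWith(d + '/');
}

// Hypothetical check: list the changed files that no allowed path covers.
function findViolations(changedFiles, allowedPaths) {
  return changedFiles.filter(
    (file) => !allowedPaths.some((allowed) => isInsidePath(file, allowed)),
  );
}

console.log(findViolations(['test/api.test.mjs', 'src/server.mjs'], ['test']));
// [ 'src/server.mjs' ]
```

Comparing with a trailing slash keeps a sibling directory such as tests/ from matching an allowed path of test.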
npm test
bash scripts/validate.sh
node scripts/generate-host-skills.mjs --check
Edit hosts/host-contract.json or scripts/generate-host-skills.mjs, then run
node scripts/generate-host-skills.mjs. Do not hand-edit generated host skills.
Host install scripts:
- Codex: bash scripts/install-codex-skill.sh
- Cursor: bash scripts/install-cursor-skill.sh
A pre-commit hook (in .githooks/, auto-enabled by npm install via the prepare script)
blocks commits that introduce host-skill drift — the generated SKILL.md files must stay in
sync with hosts/host-contract.json. Enable it manually with
git config core.hooksPath .githooks. CI (.github/workflows/validate.yml) runs
validate.sh and the full test suite on every PR, and main requires a passing CI run
through a pull request before merge.
MIT