A local coding-agent orchestrator. Give it a high-level task and it:
- decomposes the task into a DAG of subtasks,
- fans out one auto-approved, headless Claude Code session per subtask, each isolated on its own git worktree/branch,
- shows live progress: both a terminal `rich` table and a live `progress.md` document that opens in VSCode and auto-reloads as agents work,
- runs each subtask's acceptance checks (retrying once on failure),
- merges the green branches and opens the changed files in VSCode,
- reports a table of measurable speed/accuracy objectives.
Validated end to end: serial baseline 35.7s → orchestrated 6.6s (5.40× speedup), 100% acceptance, 0 merge conflicts, 0 human-wait — see RESULTS.md.
It is built fully autonomously after a single plan approval, and re-prompts only on significant deviations.
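
To make the "DAG of subtasks" idea concrete, here is a minimal sketch of grouping subtasks into waves, where every subtask in a wave depends only on earlier waves. It uses the standard-library `graphlib`; the function name `plan_waves` and the subtask names are hypothetical, and this is not necessarily how sidekick's orchestrator is written.

```python
from graphlib import TopologicalSorter


def plan_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group subtasks into waves.

    `deps` maps each subtask to the set of subtasks it depends on.
    Every subtask in a wave has all of its dependencies in earlier waves,
    so the members of one wave can run in parallel.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # unlocks the subtasks that depended on this wave
    return waves


# Hypothetical subtask names, for illustration only.
example = {
    "schema": set(),
    "handler": {"schema"},
    "tests": {"schema"},
    "docs": {"handler", "tests"},
}
print(plan_waves(example))  # [['schema'], ['handler', 'tests'], ['docs']]
```

Each wave could then be handed to a bounded worker pool (for example a `ThreadPoolExecutor` whose `max_workers` matches `--concurrency`), with the next wave starting once the current one finishes.
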
| Source | What sidekick takes from it |
|---|---|
| Nous Hermes-Agent | skills/memory learning loop, pluggable execution backend (git worktrees here), isolated subagent delegation for parallel workstreams |
| Raschka, "The Six Components of a Coding Agent" | (1) live repo context, (2) cache-shaped prompts, (3) bounded/structured tool use, (4) context-bloat control, (5) structured session memory, (6) bounded subagent delegation |
| Module | Role | Lineage |
|---|---|---|
| `repo_context.py` | workspace summary (branch, tree, docs) | Raschka #1 |
| `prompts/` | stable system prefix + dynamic suffix for cache reuse | Raschka #2 |
| `agent_session.py` + `approval.py` | headless `claude -p` wrapper, auto-approval policy | Raschka #3 |
| `context_budget.py` | output clipping + tiered transcript reduction | Raschka #4 |
| `memory.py` | transcript + working memory as JSON on disk | Raschka #5 |
| `orchestrator.py` | DAG waves, bounded parallel agents, merge | Raschka #6 / Hermes |
| `worktree.py` | git worktree+branch per agent | Hermes backend |
| `skills.py` | distill + recall reusable skills | Hermes learning loop |
| `dashboard.py` | live `rich` table + live `progress.md` | - |
| `vscode.py` | open progress doc + changed files in VSCode | - |
| `metrics.py` | objective computation + gate | OBJECTIVES.md |
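
As an illustration of the "output clipping" role listed for `context_budget.py`, the sketch below keeps the head and tail of a long tool output and drops the middle. The function name, the 2048-character default, and the marker text are assumptions made for this example, not the module's actual API.

```python
def clip_output(text: str, max_chars: int = 2048,
                marker: str = "\n[... clipped ...]\n") -> str:
    """Return `text` unchanged if it fits, else its head and tail joined by
    `marker`, with a total length of exactly `max_chars`."""
    if max_chars <= len(marker):
        raise ValueError("max_chars must exceed the marker length")
    if len(text) <= max_chars:
        return text
    budget = max_chars - len(marker)  # characters left for real content
    head = budget // 2
    tail = budget - head              # tail gets the extra char if budget is odd
    return text[:head] + marker + text[len(text) - tail:]


clipped = clip_output("x" * 10_000, max_chars=200)
print(len(clipped))  # 200
```

Keeping the tail matters for build and test logs, where the failure summary usually sits at the end.
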
sidekick drives the Claude Code native binary (`$CLAUDE_CODE_EXECPATH`) in headless mode:

    claude -p "<prompt>" --output-format stream-json --verbose \
      --permission-mode acceptEdits \
      --allowedTools "Edit Write Read Grep Glob Bash(uv *) Bash(pytest *) …" \
      --session-id <uuid> --max-turns N

`acceptEdits` plus an explicit `--allowedTools` allowlist auto-approves edits and a scoped set of build/test/lint/vcs commands while still refusing unlisted or dangerous operations. The result is zero human prompts (objective S4 = 0) without granting open shell access. Each session runs inside its own worktree, so parallel agents never collide.

Three approval levels (`--approval`): `accept_edits_allowlist` (default), `bypass` (`--allow-dangerously-skip-permissions`), `edits_no_bash`.
The spawned agents are headless Claude Code subprocesses — that is what makes auto-approval and parallel fan-out possible, and it means they do not appear as interactive sessions in the VSCode sidebar (that sidebar session is the one you use to drive sidekick). Progress is surfaced in the editor instead:
- The dashboard writes a live `progress.md` (`.sidekick/runs/<id>/progress.md`): a per-agent table of status / current action / edits / turns / tokens / elapsed, plus a result footer. sidekick opens it with `code -r`; VSCode auto-reloads the tab on every update, so you watch the whole fan-out from one editor pane.
- On completion, the changed files of each accepted subtask are opened for review.
- Detection is automatic (the `code` CLI on PATH); force with `--vscode` / `--no-vscode` or `SIDEKICK_VSCODE=1|0`. The same `progress.md` works in any editor or `tail`.
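
A per-agent progress table like the one described above can be rendered with a few lines of Python. The sketch below is illustrative only: `AgentRow`, `render_progress`, and `write_atomic` are hypothetical names, and the column set simply mirrors the fields listed in the first bullet. Writing to a temporary file and renaming it is one way to keep an editor from ever reloading a half-written document.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass
class AgentRow:
    name: str
    status: str
    action: str
    edits: int
    turns: int
    tokens: int
    elapsed_s: float


def render_progress(rows: list[AgentRow]) -> str:
    """Render the per-agent table as Markdown."""
    lines = [
        "| Agent | Status | Current action | Edits | Turns | Tokens | Elapsed |",
        "|---|---|---|---|---|---|---|",
    ]
    for r in rows:
        action = r.action.replace("|", "\\|")  # keep pipes from breaking the table
        lines.append(
            f"| {r.name} | {r.status} | {action} | {r.edits} | {r.turns} "
            f"| {r.tokens} | {r.elapsed_s:.1f}s |"
        )
    return "\n".join(lines) + "\n"


def write_atomic(path: Path, text: str) -> None:
    """Write to a sibling temp file, then rename it over the target."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(text, encoding="utf-8")
    os.replace(tmp, path)  # atomic when both paths are on the same filesystem


rows = [AgentRow("agent-1", "running", "editing upload.py", 2, 5, 1200, 3.5)]
print(render_progress(rows))
```
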
You can run sidekick from VSCode's integrated terminal (or its Claude Code extension
terminal) and keep progress.md open beside it.
First, put sidekick on PATH so VSCode tasks and terminals can call it:

    cd /path/to/sidekick && uv tool install --editable .   # or: pipx install -e .

The one limitation: the Claude Code sidebar session cannot be transparently rerouted through sidekick. The extension runs the agent in-process and exposes no reroute hook, so nothing can sit invisibly underneath it. sidekick drives headless agents; you keep using the sidebar to drive sidekick. With that understood, there are three usage patterns:

Copy `examples/vscode/tasks.json` → `.vscode/tasks.json` in any repo. Opening the folder auto-starts `sidekick repl` in a dedicated terminal (VSCode asks once to "Allow Automatic Tasks"). Then, every time:

    sidekick> add input validation to the upload handler

→ sidekick plans → fans out auto-approved agents on worktrees → merges green branches, with `progress.md` live in an editor tab beside you. This is the closest thing to "VSCode always runs through sidekick."

Use the interactive Claude Code sidebar normally; when a task wants parallel fan-out, tell it to run sidekick, e.g. "run `sidekick run "refactor X across these 4 modules" --yes`". The sidebar stays your control surface; sidekick owns the parallel execution and reports back via `progress.md` + opened diffs.

- Integrated-terminal alias (add to `~/.bashrc`):

      cc() { sidekick run "$@" --yes; }   # then: cc "add unit tests for parser.py"

  Recursion-safe: sidekick invokes the Claude binary by its absolute `$CLAUDE_CODE_EXECPATH`, never the shell `claude`.
- Hotkey: `examples/vscode/keybindings.json` binds `Ctrl+Alt+L` to a "run task" prompt (uses the second task in `tasks.json`).

| Surface | When | Where |
|---|---|---|
| Live `progress.md` (per-agent table) | during the run | editor tab, auto-reloads |
| `rich` dashboard | during the run | the sidekick terminal |
| Changed files of each accepted subtask | on completion | opened for review |
| Objective table (S1–S4 / A1–A4 / E1–E2) | on completion | the sidekick terminal |
Toggle the editor integration with --vscode / --no-vscode or SIDEKICK_VSCODE=1|0
(auto-detected from the code CLI). A per-agent VSCode window on each worktree is not
opened by default — set it up with a code <worktree> step if you want one window per agent.
Speak a task instead of typing it — works in the VSCode integrated terminal (it uses the
OS mic via ffmpeg/arecord, which the terminal process can access).

    sidekick voice                     # press Enter, speak, sidekick plans → fans out → merges
    sidekick voice --transcribe-only   # just print what it heard
    sidekick repl --voice              # voice-driven interactive loop (great for the VSCode task)

Speech-to-text goes through an OpenAI-compatible `/audio/transcriptions` endpoint, so it is independent of the coding model's provider (shared by the claude, kimi, … branches):

| Var | Default |
|---|---|
| `SIDEKICK_STT_BASE_URL` | `$OPENAI_BASE_URL` or `https://api.openai.com/v1` |
| `SIDEKICK_STT_API_KEY` | `$OPENAI_API_KEY` |
| `SIDEKICK_STT_MODEL` | `whisper-1` |
| `SIDEKICK_AUDIO_INPUT` | auto (`pulse:default` / `alsa:default`) |
| `SIDEKICK_AUDIO_SECONDS` | `8` |

Requires ffmpeg or arecord plus an STT key; degrades gracefully with a clear message
if either is missing.

    just setup                                   # uv venv + editable install
    sidekick plan "add input validation"         # see the subtask DAG
    sidekick run "add input validation" --yes    # fan out, auto-approve, merge, report
    sidekick repl                                # interactive task loop (VSCode auto-launch)
    sidekick voice                               # speak a task; sidekick runs it
    sidekick repl --voice                        # voice-driven loop
    sidekick metrics                             # objective table from .sidekick/metrics.jsonl
    sidekick status                              # last run's working memory
    sidekick bench                               # serial baseline vs orchestrated (proves S2)
    sidekick run "..." --vscode                  # force-open progress + diffs in VSCode
    sidekick run "..." --no-vscode               # terminal dashboard only
    sidekick run "..." --concurrency 5           # wider fan-out

Run sidekick from inside the target git repository (changes are made to that repo's branches and merged into its current branch).

See OBJECTIVES.md. sidekick bench and sidekick metrics compute S1–S4
(speed), A1–A4 (accuracy), E1–E2 (efficiency) from metrics.jsonl. Every optimization
(prompt shape, context budget, concurrency, approval policy) is judged by its effect on
this table.
- Claude Code native binary on PATH or `$CLAUDE_CODE_EXECPATH` (set inside Claude Code).
- `git`, Python ≥ 3.12.
- `rich` (declared dep) for the live dashboard.
- Auth via the running Claude Code session's credentials (or `ANTHROPIC_API_KEY`).
- Optional: the VSCode `code` CLI for the in-editor progress doc + diffs (auto-detected; sidekick runs fine without it).

sidekick can be invoked from inside another Claude Code session so the
parent session never spends its own context on the work. Two pieces:
- Install once, globally:

      uv tool install /mnt/backup/projects/sidekick

  This puts `sidekick` on `PATH` for every shell, so every Claude Code session (current or future) can shell out to it.
- The user-level skill at `~/.claude/skills/delegate-to-sidekick/SKILL.md` (also in this branch; see `examples/delegate-to-sidekick.SKILL.md` for the canonical copy) tells any Claude Code session how and when to invoke it. The skill is auto-discovered by Claude Code; the session sees `delegate-to-sidekick` in its available-skills list and can invoke it via the `Skill` tool.

The contract is one command:

    sidekick --repo /abs/path/to/repo run "<task>" --json --no-vscode

`--json` emits a single envelope on stdout (schema documented in the SKILL.md); `--no-vscode` keeps the live progress doc from popping a window into the parent session's IDE. The envelope carries `ok`, `n_accepted`/`n_total`, `n_merged`, per-subtask `branch`, and the last 2 KB of every acceptance check's transcript, which is enough for the parent session to summarize and decide whether to follow up.
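
A parent process could consume the envelope along these lines. The sketch assumes the four summary fields named above sit at the top level of a single JSON object; the authoritative schema is in the SKILL.md, and `summarize_envelope` and the sample payload are made up for illustration.

```python
import json


def summarize_envelope(stdout: str) -> str:
    """Turn the `--json` envelope into a one-line summary.

    Assumes `ok`, `n_accepted`, `n_total` and `n_merged` are top-level keys.
    """
    env = json.loads(stdout)
    status = "ok" if env["ok"] else "FAILED"
    return (f"{status}: {env['n_accepted']}/{env['n_total']} accepted, "
            f"{env['n_merged']} merged")


# Made-up payload, not real sidekick output.
sample = '{"ok": true, "n_accepted": 4, "n_total": 4, "n_merged": 4}'
print(summarize_envelope(sample))  # ok: 4/4 accepted, 4 merged
```
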

The SKILL.md hard rules cover the gotchas (`--repo` must be absolute, never shell out to `uv run sidekick` from inside the source tree, etc.).

To upgrade the global install after pulling new changes on this branch:

    uv tool install --reinstall /mnt/backup/projects/sidekick