Planner-first coding agent for real repositories. The planner handles clarification, discovery, goal sequencing, and final next steps. The worker executes concrete repository actions.
Install dependencies:
pip install openai google-genai anthropic
Set an API key with environment variables or a local .env file:
export OPENAI_API_KEY=...
export GEMINI_API_KEY=...
export ANTHROPIC_API_KEY=...
Planner-first mode:
python main.py --provider openai --model gpt-5.4 --root /your/project
python main.py --provider anthropic --model claude-sonnet-4-6 --root /your/project
python main.py --provider local --model gemma4 --root /your/project
Direct worker mode:
python main.py --provider openai --model gpt-5.4 --root /your/project --worker-mode
python main.py --provider anthropic --model claude-sonnet-4-6 --root /your/project --worker-mode
python main_v2.py --provider gemini --model gemini-3-flash-preview --root /your/project --worker-mode
python main.py --provider local --model gemma4 --root /your/project --worker-mode
Optional runtime tuning:
python main.py --provider openai --model gpt-5.4 --root /your/project --max-parallel-workers 6
Live runtime switching in the CLI:
/runtime anthropic claude-sonnet-4-6
/model claude-sonnet-4-6
/runtime-show
/providers
/models
/models gemini
/providers lists supported runtimes. /models [provider] shows the current provider by default and prints suggested model names for any supported provider. On startup, the backend now does one best-effort live model refresh for providers with installed SDKs and credentials, then falls back to the built-in April 2026 catalog if a provider cannot be queried. Custom model strings are still allowed.
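The refresh-then-fallback behavior described above can be pictured with a minimal sketch. The names BUILTIN_CATALOG, resolve_models, and the fetch_live callable are hypothetical and are not taken from the backend; the model names are only the ones already used in the commands above, not the real catalog contents.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for the built-in catalog; the real contents are not shown here.
BUILTIN_CATALOG: Dict[str, List[str]] = {
    "openai": ["gpt-5.4"],
    "anthropic": ["claude-sonnet-4-6"],
    "gemini": ["gemini-3-flash-preview"],
    "local": ["gemma4"],
}


def resolve_models(
    provider: str,
    fetch_live: Optional[Callable[[str], List[str]]] = None,
) -> List[str]:
    """Return suggested models, preferring one best-effort live query."""
    if fetch_live is not None:
        try:
            live = fetch_live(provider)
            if live:
                return live
        except Exception:
            # Best effort: any SDK, network, or credential failure falls through.
            pass
    return list(BUILTIN_CATALOG.get(provider, []))


def failing_fetch(provider: str) -> List[str]:
    """Simulates a provider that cannot be queried."""
    raise RuntimeError("provider cannot be queried")


print(resolve_models("anthropic", failing_fetch))  # falls back to the built-in list
```

Because custom model strings are still allowed, a caller would treat this list as suggestions rather than as a validation whitelist.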
An initial desktop VS Code extension shell is available under vscode-extension/.
What it currently provides:
- launches the Python planner/worker runtime as a background bridge process
- renders planner state, worker runtime state, transcript history, and current-run facts in a webview panel
- turns planner and worker suggested_next_actions into clickable buttons for plan approval, rejection, discovery selection, validation, review, and recovery flows
- surfaces backend-generated diagnostics in the panel and mirrors them into the VS Code Problems view, including file-targeted checks that also work in pure CLI mode
- opens file paths surfaced from runtime state directly in the editor and can open review reports plus working-tree-vs-HEAD file diffs
Extension development setup:
cd vscode-extension
npm install
npm run compile
npm test
npm run test:integration
Then open vscode-extension/ as the extension development workspace and run the Run Python Agent Extension launch configuration.
Extension settings:
- skillzAgent.provider
- skillzAgent.model
- skillzAgent.pythonPath
- skillzAgent.backendScript
To launch the beta TreeLoop planner bridge from the extension, set skillzAgent.backendScript to main_v2.py. Leave it as main.py to keep using the stable planner/worker backend.
Changing skillzAgent.provider or skillzAgent.model while the extension backend is running now hot-updates the active runtime without killing the process.
Backend requirements:
- Python 3.13 is the current development target; the extension will also work with a compatible Python interpreter that can run main.py and agent_tools.py.
- Install Python dependencies for the selected provider before launching the extension: openai for OpenAI mode, anthropic for Anthropic mode, google-genai for Gemini mode.
- Set provider credentials in the environment seen by VS Code, such as OPENAI_API_KEY, ANTHROPIC_API_KEY, or GEMINI_API_KEY.
- Keep git available on PATH; review, diff, and file comparison flows rely on repository commands.
- skillzAgent.pythonPath should point at the interpreter or virtual environment you want the extension backend to use.
- The local provider targets the existing localhost OpenAI-compatible endpoint at http://127.0.0.1:5051/v1, which can be used for models such as Gemma 4.
- Node.js and npm are required only for extension development inside vscode-extension/, not for the Python backend itself.
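The backend requirements above can be checked before launching the extension. The following is a minimal sketch of such a preflight check, not part of the project: the preflight function and PROVIDER_KEYS table are hypothetical, and treating the local provider as needing no API key is an assumption.

```python
import os
import shutil
import sys
from typing import Dict, List, Mapping, Optional

# Environment variable names come from the requirements list above.
PROVIDER_KEYS: Dict[str, Optional[str]] = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "local": None,  # assumption: the localhost endpoint needs no key
}


def preflight(provider: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """Return human-readable problems; an empty list means ready to launch."""
    if provider not in PROVIDER_KEYS:
        return [f"unknown provider: {provider}"]
    problems: List[str] = []
    key = PROVIDER_KEYS[provider]
    if key and not env.get(key):
        problems.append(f"{key} is not set")
    # Review, diff, and file comparison flows shell out to git.
    if shutil.which("git") is None:
        problems.append("git is not on PATH")
    return problems


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for problem in preflight("openai"):
        print("problem:", problem)
```

Running it with the interpreter that skillzAgent.pythonPath points at shows what the extension backend would actually see, which can differ from an interactive shell.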
The extension currently targets desktop VS Code APIs and uses the Python runtime as the source of truth for planner/worker behavior.
- Starts in planner mode by default.
- Asks clarification questions when the request is materially underspecified.
- Offers a discovery phase when repo inspection is needed before planning.
- Supports Quick Scan, Moderate Scan, and Deep Scan discovery depths.
- Produces a plan that must be approved before execution.
- Delegates goals one at a time to the worker.
- Can execute dependency-ready read-only or validation-only goals concurrently when the planner marks them safe to parallelize.
- After discovery, pushes discovered files, constraints, and risks into delegation so goals are concrete rather than vague.
- Ends with specific next steps tied to the executed work.
- Opens an issue-scoped execution context when an approved plan starts, closes it on full success, and can explicitly reopen recent issues for follow-up work.
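The combination of one-at-a-time delegation and optional concurrency for safe goals can be sketched as a small batch selector. This is an illustration under stated assumptions, not the planner's implementation: the Goal fields, the parallel_safe flag, and next_batch are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Goal:
    goal_id: str
    depends_on: List[str] = field(default_factory=list)
    parallel_safe: bool = False  # hypothetical flag for read-only or validation-only goals


def next_batch(goals: List[Goal], done: Set[str]) -> List[Goal]:
    """Pick the goals to run next, in plan order."""
    ready = [
        g for g in goals
        if g.goal_id not in done and all(d in done for d in g.depends_on)
    ]
    if not ready:
        return []
    if not ready[0].parallel_safe:
        return [ready[0]]  # default behavior: one goal at a time
    # Otherwise take the leading run of dependency-ready, parallel-safe goals.
    batch: List[Goal] = []
    for g in ready:
        if not g.parallel_safe:
            break
        batch.append(g)
    return batch


goals = [
    Goal("goal-1"),
    Goal("goal-2", depends_on=["goal-1"], parallel_safe=True),
    Goal("goal-3", depends_on=["goal-1"], parallel_safe=True),
    Goal("goal-4", depends_on=["goal-2", "goal-3"]),
]

done: Set[str] = set()
order: List[List[str]] = []
while len(done) < len(goals):
    batch = next_batch(goals, done)
    if not batch:
        break  # unsatisfiable dependencies; stop rather than loop forever
    order.append([g.goal_id for g in batch])
    done.update(g.goal_id for g in batch)

print(order)  # [['goal-1'], ['goal-2', 'goal-3'], ['goal-4']]
```

In a real run the size of each batch would also be capped, for example by the --max-parallel-workers setting.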
Planner commands:
- /approve executes the pending plan.
- /reject rejects the pending plan.
- /plan shows the current pending plan.
- /discover shows the current discovery offer.
- /providers lists supported runtime providers.
- /models [provider] lists suggested models for the current or specified provider.
- /reset clears planner state.
- /worker enters direct worker debug mode.
- /quit exits.
Example request:
When opening a routine, do a 10 second countdown with speech and an indicator before the first drill starts.
Typical planner-first flow:
planner> When opening a routine, do a 10 second countdown with speech and an indicator before the first drill starts.
Discovery suggested: The request depends on the current routine start flow and UI entrypoints.
Choose a discovery depth:
1. Quick Scan [budget: 6 tool calls]
2. Moderate Scan (recommended) [budget: 12 tool calls]
3. Deep Scan [budget: 15 tool calls]
planner> 2
Discovery complete: Moderate Scan
Worker result: Discovery found the routine entry flow in src/app.py and the immediate start behavior in src/routine.py.
Tool budget: 7/12
Plan summary: Fix routine start flow
Discovery basis: Discovery found the routine entry flow in src/app.py and the immediate start behavior in src/routine.py.
Goals:
1. Implement countdown before first drill [goal-1] - preserve_context=false
Goal: Update the routine startup flow to show a 10 second countdown, play countdown speech, and begin the first drill only after countdown completion.
Why next: Discovery already identified the startup flow and the files controlling routine start behavior.
Delegation: Primary discovered files: src/app.py, src/routine.py; Use the discovery findings directly rather than repeating broad discovery.
Success signals: The worker reports a concrete completed outcome tied to the discovered flow, not additional broad discovery.
planner> approve
Executing confirmed plan.
Goal 1/1 completed: Implement countdown before first drill
Worker result: Updated the startup flow and added countdown behavior before the first drill begins.
Specific next steps:
1. Validate the countdown timing and speech cadence in the routine UI.
2. Verify the first drill starts only after countdown completion.
What this example shows:
- The planner offers discovery when repo structure matters.
- Discovery findings are carried into the plan rather than discarded.
- Goal delegation names concrete files, outcomes, and success signals.
- Approval is explicit before worker execution begins.
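The "Tool budget: 7/12" line in the transcript suggests a simple per-scan counter. A minimal sketch of that bookkeeping follows; the budgets are the ones printed in the example, while the ToolBudget class and its methods are hypothetical and may not match how the runtime tracks usage.

```python
from dataclasses import dataclass

# Budgets as printed in the example transcript.
DISCOVERY_BUDGETS = {"Quick Scan": 6, "Moderate Scan": 12, "Deep Scan": 15}


@dataclass
class ToolBudget:
    depth: str
    used: int = 0

    @property
    def limit(self) -> int:
        return DISCOVERY_BUDGETS[self.depth]

    def spend(self) -> bool:
        """Record one tool call; return False once the budget is exhausted."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    def __str__(self) -> str:
        return f"Tool budget: {self.used}/{self.limit}"


budget = ToolBudget("Moderate Scan")
for _ in range(7):
    budget.spend()
print(budget)  # Tool budget: 7/12
```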
The worker supports focused repository actions instead of a generic shell-first workflow.
Core file and search actions:
- list_files with recursive listing, max depth, and glob filters.
- read_file with optional line windows.
- inspect_files for batched multi-file reads.
- summarize_files for dependency-aware file summaries.
- grep scoped by path and glob, with ripgrep when available.
- find_files scoped by path and glob.
- symbol_search for Python and JS/TS symbols, including imports/exports and Python methods.
Change and git actions:
- write_file and patch_file with verification-aware follow-up.
- git_status with parsed entries and counts.
- git_diff with staged, stat, and name-only modes.
- review_changes with risk and validation summaries.
- git_add, git_restore, git_commit, git_log, and git_branch.
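As an illustration of what "parsed entries and counts" could look like for git_status, here is a minimal sketch that parses the text produced by git status --porcelain (version 1 format). The function name, the entry shape, and the count categories are assumptions; the worker's real output format is not documented here. Quoted paths are not handled, and the sample paths are illustrative.

```python
from collections import Counter
from typing import Dict, List, Tuple


def parse_porcelain(output: str) -> Tuple[List[Dict[str, str]], Dict[str, int]]:
    """Parse `git status --porcelain` (v1) text into entries and counts."""
    entries: List[Dict[str, str]] = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # need two status characters, a space, and a path
        index_state, worktree_state, path = line[0], line[1], line[3:]
        if index_state in "RC" and " -> " in path:
            path = path.split(" -> ", 1)[1]  # keep the new name of a rename or copy
        entries.append({"index": index_state, "worktree": worktree_state, "path": path})

    counts: Counter = Counter()
    for entry in entries:
        if entry["index"] == "?":
            counts["untracked"] += 1
            continue
        if entry["index"] != " ":
            counts["staged"] += 1
        if entry["worktree"] != " ":
            counts["unstaged"] += 1
    return entries, dict(counts)


# Illustrative input; real input would come from running git in the target repository.
sample = " M src/app.py\nM  src/routine.py\n?? notes.txt\n"
entries, counts = parse_porcelain(sample)
print(entries)
print(counts)
```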
Execution and context actions:
- diagnose for backend file-targeted diagnostics on .ts, .tsx, .js, .jsx, and .py files without relying on VS Code.
- run_shell for validation, formatting, or targeted inspection.
- meta and show_diff for repository context.
- history_expand and memory_expand for compact context recovery.
- drop_context and finish for execution control.
Playground OS skills:
- Bundled skills live under skills/*.md with front matter for name, description, optional args_schema, optional tags, optional category, and optional priority.
- Both the stable runtime and the beta TreeLoop runtime auto-load bundled skills from this repo and workspace-local skills from <target-repo>/skills/*.md.
- In the stable runtime, use the skill action to list skills or load a named skill payload.
- Use skill to list them and skill <name> to invoke a cached Markdown skill payload.
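To make the skill file layout concrete, here is a minimal sketch of splitting a skill file into front matter and body. It assumes the front matter is delimited by --- lines and holds simple key: value pairs, which the description above does not confirm; split_front_matter and the sample skill are hypothetical. Nested values such as a structured args_schema would need a real YAML parser.

```python
from typing import Dict, Tuple


def split_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a skill file into (metadata, Markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at all
    meta: Dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the file as plain Markdown


# Illustrative skill file; the field names follow the list above.
sample = """---
name: example-skill
description: Illustrative skill used only in this sketch
priority: 10
---
Skill body goes here.
"""

meta, body = split_front_matter(sample)
print(meta)
print(body)
```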
- Durable facts in repo_facts.md are now schema-versioned and stored in an issue-aware ledger instead of a flat list.
- architecture facts are cross-issue repo memory and remain available for unrelated future work.
- goal facts are issue-local memory and return only while the issue is active or when that issue is explicitly reopened.
- Approved plan execution opens an issue automatically; successful completion closes it.
- The planner and extension can surface recent closed issues as explicit reopen actions instead of silently leaking old goal facts into new requests.
- The planner is designed to reduce repeated exploration and push the worker toward concrete execution once enough evidence exists.
- Successful writes and patches require read-based verification before the worker treats them as complete.
- Discovery is intended to improve delegation quality, not become a substitute for execution.
- The host can prefetch discovery probes in parallel and run parallel post-write validation, while repository writes remain serialized behind runtime locks.
- The backend now exposes a structured runtime catalog for supported providers and suggested models, so the CLI and VS Code extension can reuse the same source of truth instead of hardcoding separate lists.
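The visibility rule for the issue-aware fact ledger (architecture facts always available, goal facts only for an active or explicitly reopened issue) can be sketched as a filter. The Fact shape, the visible_facts function, and the issue identifiers are hypothetical; the actual schema of repo_facts.md is not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Fact:
    kind: str  # "architecture" or "goal"
    text: str
    issue_id: Optional[str] = None  # set only for goal facts


def visible_facts(
    facts: List[Fact],
    active_issue: Optional[str],
    reopened: Optional[Set[str]] = None,
) -> List[Fact]:
    """Return the facts a new request is allowed to see."""
    allowed = set(reopened or ())
    if active_issue is not None:
        allowed.add(active_issue)
    return [
        f for f in facts
        if f.kind == "architecture"
        or (f.kind == "goal" and f.issue_id in allowed)
    ]


ledger = [
    Fact("architecture", "Routine entry flow lives in src/app.py"),
    Fact("goal", "Countdown must finish before the first drill", issue_id="issue-1"),
]

# With no active issue, only cross-issue memory is returned.
print([f.text for f in visible_facts(ledger, active_issue=None)])
# Explicitly reopening the issue brings its goal facts back.
print([f.text for f in visible_facts(ledger, active_issue=None, reopened={"issue-1"})])
```

Keeping the filter explicit is what prevents old goal facts from silently leaking into unrelated requests.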