Self-host AI coding agent. Tier-bounded execution. Atomic safety. BYOK transparency.
Existing coding agents (Cursor, Codex, Claude Code) are powerful but:
- Subscription lock-in ($20–200/mo with usage caps and rolling windows)
- No cost predictability (premium-request quotas, opaque tier ladders)
- Closed source (your prompts go through their infrastructure)
- Editor-bound (not for terminal / web / multi-machine workflows)
Zone is different:
- BYOK: Your OpenAI or Anthropic key. You pay your provider directly — no markup, no subscription floor.
- Tier-bounded: Simple tasks from ~$0.001, complex tasks up to ~$5.00 (provider-dependent). Per-tier token and iteration caps make cost predictable per dispatch.
- Atomic safety: Patches stage in memory, typecheck before flush, auto-rollback on failure. Your working tree never enters a half-broken state.
- Self-host: AGPL-3.0. No SaaS lock-in, no telemetry, no vendor dependency beyond your LLM provider.
- Web UI: Browser-based. Use it from anywhere — laptop, remote dev box, tablet — without an editor extension.
Anthropic prompt caching is enabled by default — empirically 50%+ input savings on multi-iter runs (hit ratio reaches 90–96% by iter 3+). A typical Zone run costs $0.001–$5.00 depending on provider and tier — see Cost below.
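A back-of-envelope sketch of why caching pays off, assuming Anthropic's published pricing multipliers (cache reads billed at roughly 0.1x the base input price, cache writes at roughly 1.25x); the multipliers are from Anthropic's pricing docs, not Zone's code, so verify against current pricing:

```typescript
// Blended input-cost multiplier for a run where `hitRatio` of input tokens
// are served from cache. Assumes cache reads cost 0.1x base input price and
// cache writes 1.25x (Anthropic's published multipliers; verify before relying).
function blendedInputMultiplier(hitRatio: number, writeRatio = 0): number {
  const missRatio = 1 - hitRatio - writeRatio;
  return hitRatio * 0.1 + writeRatio * 1.25 + missRatio * 1.0;
}

// At a 90% hit ratio (typical by iteration 3), input cost drops to ~0.19x,
// i.e. ~81% savings; even at a 50% hit ratio you save ~45%.
const savingsAt90 = 1 - blendedInputMultiplier(0.9); // ~0.81
const savingsAt50 = 1 - blendedInputMultiplier(0.5); // ~0.45
```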
```sh
git clone https://github.com/BedreddinErkan/zone.git
cd zone
npm install
cp .env.example .env   # add your OpenAI or Anthropic key
npm run build && npm run serve
# Open http://localhost:3000
```

Add your key in Settings → API Keys, point at any local repo, and dispatch a task.
Tasks are classified by a small upstream model into simple / medium / complex tiers before dispatch. Each tier has its own caps:
| Tier | Token budget | Iter cap | Subagent quota |
|---|---|---|---|
| simple | 400 K | 15 | 0 |
| medium | 600 K | 25 | 1 |
| complex | 800 K | 40 | 2 |
User-tunable via Settings → Tier Limits, or overridable per dispatch with the `forceTier` API field.
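Conceptually, the caps above amount to a per-tier budget check before each iteration. A minimal sketch, where the `TierLimits` shape and function names are illustrative rather than Zone's actual internals:

```typescript
type Tier = "simple" | "medium" | "complex";

interface TierLimits {
  tokenBudget: number;  // total tokens per dispatch
  iterCap: number;      // max agent-loop iterations
  subagentQuota: number;
}

// Default caps from the table above (user-tunable in Settings → Tier Limits).
const DEFAULT_LIMITS: Record<Tier, TierLimits> = {
  simple:  { tokenBudget: 400_000, iterCap: 15, subagentQuota: 0 },
  medium:  { tokenBudget: 600_000, iterCap: 25, subagentQuota: 1 },
  complex: { tokenBudget: 800_000, iterCap: 40, subagentQuota: 2 },
};

// True if another iteration is allowed under the tier's caps.
function withinBudget(tier: Tier, tokensUsed: number, iter: number): boolean {
  const { tokenBudget, iterCap } = DEFAULT_LIMITS[tier];
  return tokensUsed < tokenBudget && iter < iterCap;
}
```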
Every patch goes through:
- In-memory staging — no disk writes until verified
- Typecheck pass — `tsc --noEmit` on the staged tree
- Atomic flush — all-or-nothing write on success
- Auto-rollback — disk restored to pre-apply state on failure, with a "what was attempted" diff rendered under the rollback banner
Your code never enters broken state because the agent guessed wrong.
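The flow reduces to stage, verify, then flush-or-rollback. This is an illustrative in-memory model of that invariant, not Zone's actual implementation in src/:

```typescript
type FileMap = Map<string, string>; // path -> contents

// Apply staged patches atomically: verify the staged tree first; on success
// write everything at once, on failure leave the original untouched and
// report which paths were attempted (the "what was attempted" diff).
function atomicApply(
  disk: FileMap,
  patches: FileMap,
  typecheck: (staged: FileMap) => boolean,
): { ok: boolean; attempted?: string[] } {
  // 1. Stage in memory: a copy of disk with patches overlaid, no disk writes.
  const staged: FileMap = new Map(disk);
  for (const [path, contents] of patches) staged.set(path, contents);

  // 2. Typecheck the staged tree (stands in for `tsc --noEmit`).
  if (!typecheck(staged)) {
    // 4. Auto-rollback: disk was never touched; surface what was attempted.
    return { ok: false, attempted: [...patches.keys()] };
  }

  // 3. Atomic flush: all-or-nothing write on success.
  for (const [path, contents] of staged) disk.set(path, contents);
  return { ok: true };
}
```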
After UI patches, Zone takes a Playwright screenshot at your configured dev URL. Thumbnail renders inline in the reply bubble; click to expand into a full-size lightbox. Configure viewport / path / URL in Settings → Visual, or via ZONE_DEV_SERVER_URL.
- OpenAI: `gpt-5.4`, `gpt-5.4-mini`
- Anthropic: `claude-opus-4-7`, `claude-sonnet-4-6`, `claude-haiku-4-5`
- Vision-capable models accept image attachments
- Switch provider/model per dispatch from the footer picker
Click the paperclip → upload PNG / JPEG / WebP / GIF. Limits: ≤5 MB per image, ≤16 MB total, max 4 per dispatch. Vision-capable models analyze the attached images alongside the text prompt; the provider adapter routes them to OpenAI's `image_url` format or Anthropic's `image.source.base64` format.
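The two provider formats differ only in shape. A minimal sketch of the routing, where the `Attachment` type and `toProviderImage` name are illustrative (the wire shapes themselves follow OpenAI's and Anthropic's documented image formats):

```typescript
interface Attachment {
  mediaType: "image/png" | "image/jpeg" | "image/webp" | "image/gif";
  base64: string;
}

// OpenAI chat expects a data URL inside an image_url block; Anthropic expects
// the raw base64 plus media type inside an image/source block.
function toProviderImage(a: Attachment, provider: "openai" | "anthropic") {
  if (provider === "openai") {
    return {
      type: "image_url",
      image_url: { url: `data:${a.mediaType};base64,${a.base64}` },
    };
  }
  return {
    type: "image",
    source: { type: "base64", media_type: a.mediaType, data: a.base64 },
  };
}
```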
Settings → Variance surfaces per-task cost variance, p50/p95 latency, outlier detection (>2σ), and the last 50 runs. Powered by the sweep CSV — useful for catching regressions in token efficiency across versions.
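The dashboard's statistics are standard. A sketch of how p50/p95 and the >2σ outlier flag could be computed from a run-cost series (illustrative, not the dashboard's actual code):

```typescript
// Nearest-rank percentile over a sorted copy of the series.
function percentile(xs: number[], p: number): number {
  const sorted = [...xs].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[Math.max(0, idx)];
}

// Flag runs more than 2 standard deviations from the mean.
function outliers(xs: number[]): number[] {
  const mean = xs.reduce((s, x) => s + x, 0) / xs.length;
  const sd = Math.sqrt(xs.reduce((s, x) => s + (x - mean) ** 2, 0) / xs.length);
  return xs.filter((x) => Math.abs(x - mean) > 2 * sd);
}
```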
Empirical per-task cost by provider × tier (USD), measured from drift sweep runs:
| Provider | Simple | Medium | Complex |
|---|---|---|---|
| gpt-5.4 | $0.005 | $0.05 | $0.20 |
| gpt-5.4-mini | $0.001 | $0.01 | $0.05 |
| Sonnet 4.6 | $0.05 | $0.30 | $1.50 |
| Haiku 4.5 | $0.01 | $0.05 | $0.20 |
| Opus 4.7 | $0.20 | $1.00 | $5.00 |
Monthly estimate for ~50 mixed-tier dispatches:
- gpt-5.4: ~$5–10
- gpt-5.4-mini: ~$1–3
- Sonnet 4.6: ~$15–30
- Haiku 4.5: ~$3–5
- Opus 4.7: ~$30–80
Numbers are approximate per-task averages — your variance dashboard shows real history from your runs.
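How such an estimate comes together, assuming (purely as an illustration) a 40/40/20 simple/medium/complex mix over 50 dispatches; the mix ratio is an assumption, not a measured figure:

```typescript
// Per-task costs from the table above (USD).
const SONNET_46 = { simple: 0.05, medium: 0.30, complex: 1.50 };

// Blend 50 dispatches at an assumed 40/40/20 tier mix.
function monthlyEstimate(costs: { simple: number; medium: number; complex: number }): number {
  const dispatches = { simple: 20, medium: 20, complex: 10 };
  return (
    dispatches.simple * costs.simple +
    dispatches.medium * costs.medium +
    dispatches.complex * costs.complex
  );
}

// Sonnet 4.6: 20*0.05 + 20*0.30 + 10*1.50 = $22, inside the $15–30 band.
const sonnetMonthly = monthlyEstimate(SONNET_46);
```

Swap in any row of the cost table to get the corresponding provider's blended figure; a heavier complex-tier mix pushes toward the top of each band.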
- Single-file edits ("add this comment", "fix this bug", "rename this variable")
- Scoped refactors inside a 1–3 file blast radius
- Read-only investigation ("why does this test fail?")
- Visual verification ("change the chat background to dark gray, show me the result")
- Tier-classified execution (small tasks stay cheap, large tasks get the budget they need)
- Multi-feature coordination ("fix these 3 unrelated bugs in one run")
- Cross-cutting refactors that span 10+ files
- Editor integration — web UI only (VSCode extension on roadmap)
- Active framework dev on Zone itself — use Codex or Claude Code for that; Zone's atomic flow is tuned for single-feature task execution
Community-driven priorities:
- Phase O — Plan-First mode: structured plan rendered before dispatch, checkpoint resume between iterations
- Phase P — Context compaction: enables 30+ minute execution windows
- Phase Q — Organic subagent dispatch: multi-task delegation with cost propagation
- VSCode extension: editor integration via Zone API
┌─────────────────────────────────────────────┐
│ Browser UI (vanilla TS, single page) │
│ src/ui/index.html │
└───────────────┬─────────────────────────────┘
│ HTTP + SSE
┌───────────────▼─────────────────────────────┐
│ Express server │
│ src/api/server.ts │
│ ├─ /api/chat (investigation flow) │
│ ├─ /api/patch (atomic patch flow) │
│ └─ /api/sweep-results, /api/usage, ... │
└───────────────┬─────────────────────────────┘
│
┌───────────────▼─────────────────────────────┐
│ Agent loop (src/llm/agentLoop.ts) │
│ ├─ LLM adapters: OpenAI + Anthropic │
│ ├─ Tool executor + AST validator │
│ ├─ Tier-bounded budget enforcement │
│ └─ Atomic staging + rollback │
└───────────────┬─────────────────────────────┘
│
┌───────────────▼─────────────────────────────┐
│ File-based storage │
│ ~/.zone/, .zone/, dist/ │
└─────────────────────────────────────────────┘
TypeScript end-to-end. AST validator + tool executor + agent loop in src/. Storage is file-based (~/.zone/ for user-level settings, .zone/ for repo-level state). No database required.
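A sketch of what file-based storage means in practice. The file name (`settings.json`) and schema here are hypothetical, chosen only to illustrate the no-database design:

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// User-level settings live under ~/.zone/; repo-level state under .zone/.
// File name and shape are illustrative, not Zone's actual schema.
const ZONE_DIR = join(homedir(), ".zone");
const SETTINGS_PATH = join(ZONE_DIR, "settings.json");

function loadSettings(): Record<string, unknown> {
  if (!existsSync(SETTINGS_PATH)) return {};
  return JSON.parse(readFileSync(SETTINGS_PATH, "utf8"));
}

function saveSettings(settings: Record<string, unknown>): void {
  mkdirSync(ZONE_DIR, { recursive: true }); // auto-created on first use
  writeFileSync(SETTINGS_PATH, JSON.stringify(settings, null, 2));
}
```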
- Node.js 22+
- npm 10+
- ~/.zone/ directory (auto-created on first launch)
- Optional: Playwright Chromium for visual verification (`npx playwright install chromium`)
```sh
# .env (minimum: at least one provider key)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...

# Optional
ZONE_DEV_SERVER_URL=http://localhost:3000   # visual verification target
ZONE_ENABLE_MESSAGE_CACHE=1                 # Anthropic prompt caching (default on)
ZONE_VERBOSE_LOGS=0                         # 1 for diagnostic output
PORT=3000
```

All keys can also be set through Settings → API Keys after first launch — no .env file required for personal use.
Add a comment to a file (simple tier, ~5 sec, ~$0.005 on gpt-5.4 / ~$0.05 on Sonnet 4.6)
Add a brief header comment to src/utils/files.ts describing what it exports
Investigate a failing test (medium tier, read-only, ~$0.05 on gpt-5.4 / ~$0.30 on Sonnet 4.6)
Why is buildDecisionTrace.test.ts failing on case 3?
Refactor a function signature (complex tier, atomic verification, ~$0.20 on gpt-5.4 / ~$1.50 on Sonnet 4.6)
Change normalizeSignals to accept an options object with a strict flag
**"Plan 0/5" stuck in chat** — server restart resets plan state; refresh the page.

**Branch dropdown grayed out** — that branch is in use by another git worktree; run `git worktree list` to inspect.

**Vision model rejected with "model_no_vision"** — switch to `gpt-5.4` or `claude-sonnet-4-6` (vision-capable); `gpt-5.4-mini` and `claude-haiku-4-5` are text-only.

**Sweep CSV missing / changes wiped** — the sweep runner resets the worktree (`git reset --hard HEAD`) before each task for idempotent baselines; commit your work before running a sweep.

**Verbose logs** — run `ZONE_VERBOSE_LOGS=1 npm run serve` to see request flow, cache hits, tool calls, and performance traces — useful when filing a bug report.
See CONTRIBUTING.md. Issues and PRs welcome at github.com/BedreddinErkan/zone.
Note: Zone's atomic-staging flow is tuned for single-feature task dispatches. For multi-task framework development on Zone itself, Codex or Claude Code is a better fit — Zone's tier caps and rollback strictness aren't designed for sprawling refactors of its own codebase.
AGPL-3.0-or-later. Free for personal use, modification, and distribution. Forks must release their source under the same terms; closed-source SaaS forks are not permitted.
Commercial license available for teams that need single-tenant deployment, SSO, audit logs, or custom support. Contact: [contact: your email].
Bedreddin Erkan (@BedreddinErkan) — solo dev, ~6 months. Built with TypeScript, Node.js, OpenAI/Anthropic SDKs, and Playwright. Inspired by Cursor; frustrated by SaaS lock-in.
- Codex, Claude Code, Cursor — different philosophies, same problem space
- Anthropic — vision API, prompt caching that actually pays off in iterative agent loops
- OpenAI — gpt-5.4 reasoning models
- Solo dev community on Twitter / HN — feedback throughout development