Contract authority for AI-assisted codebases. Define your schemas and API contracts once in .arc.yaml. Arc generates types in Python, TypeScript, and Zod, fails CI when hand-written code drifts from those contracts, and exposes structured drift context to any coding agent via MCP.
The one-liner: Define contracts once. Arc generates the code, breaks the build on drift, and gives your AI agents verified truth.
Without Arc, an API contract lives in four places: the Pydantic model on the backend, the TypeScript type on the frontend, the Zod validator at the edge, and the OpenAPI schema in CI. Agents reading the codebase have no anchor — each file looks like a hand-typed copy of the others. When they drift, agents make plausible-but-wrong edits.
With Arc, the contract is one file. Everything else is generated. An agent that touches the codebase first reads the contract, then verifies its own diff against the contract before committing. Arc owns truth; agents own inference and editing.
```
           before                             after

┌──────────────────────────┐     ┌─────────────────────────────┐
│ backend/models/task.py   │     │ .arc/schemas/task.arc.yaml  │  ← the contract
│ frontend/types/task.ts   │     │                             │
│ shared/schemas/task.zod  │  →  │ (everything else generated) │
│ docs/openapi/task.yaml   │     │                             │
└──────────────────────────┘     └─────────────────────────────┘
 four files, hand-maintained       one file, source of truth
```
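For a feel of what the contract file holds, here is a minimal sketch of a task schema. The keys below are illustrative assumptions, not the authoritative grammar — `docs/format.md` defines the real format:

```yaml
# .arc/schemas/task.arc.yaml — illustrative sketch only; see docs/format.md
schema: Task
fields:
  id:       { type: string, required: true }
  title:    { type: string, required: true }
  status:   { type: string, enum: [open, in_progress, done] }
  due_date: { type: string, nullable: true }
```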
Arc helps three kinds of teams:
- Teams using AI coding agents (Claude Code, Cursor, Cline, Continue, etc.). Your agent reads the Arc contract, verifies its own diff with `arc_validate_patch`, and stops making plausible-but-wrong edits. The smoke benchmark shows Haiku going from 67% → 100% task success when Arc tools are attached — and the failures it prevents are the canonical "I think I fixed it" false positives.
- Solo devs maintaining typed APIs. Replace the four-files-per-contract grind (Pydantic + TS interface + Zod + OpenAPI) with one source of truth. `arc compile` regenerates downstream artifacts; `arc check` fails CI when anything drifts.
- Open-source maintainers reviewing PRs. Run `arc check` in CI; reviewers see structured drift output instead of having to manually cross-check that a backend change kept the TS types and Zod validators consistent. `arc diff` flags contract-weakening explicitly (`nullable: false → true`, required → optional, enum value drops) so silent API regressions can't slip past review.
If your codebase has typed cross-language API boundaries and AI agents read it, Arc is for you.
```sh
npm i -g arc-contract   # or pnpm add -g arc-contract
```

The binary is `arc`. Confirm with `arc --version`.

The Node-side codegen toolchain (json-schema-to-typescript, json-schema-to-zod, prettier) ships bundled — `arc compile` works out of the box for TypeScript / Zod outputs.
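As a rough illustration of what that codegen emits — the real output comes from json-schema-to-typescript, and this sketch assumes the hypothetical `Task` contract above — a generated TypeScript artifact would look something like:

```typescript
// frontend/types/task.ts — ILLUSTRATIVE sketch; actual output is produced by
// json-schema-to-typescript and carries the headers Arc later verifies:
//   Source-Hash: <hash of task.arc.yaml>
//   Generator-Hash: <hash of the codegen toolchain>
export interface Task {
  id: string;
  title: string;
  status?: "open" | "in_progress" | "done";
  due_date?: string | null;
}
```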
If you generate Pydantic models or want Python formatting, also install the Python tools. After `arc init` scaffolds your project, a pinned `.arc/requirements.txt` is included for one-line setup:

```sh
pip install -r .arc/requirements.txt

# or, for isolated installs (recommended):
pipx install datamodel-code-generator
pipx install ruff
```

Commit `.arc/requirements.txt` alongside the rest of `.arc/` — all team members run `pip install -r .arc/requirements.txt` once. Arc emits a clear hint pointing at the same file if either tool is missing when you run `arc compile`.
Note: v0.1 ships with version `0.1.0-dev` until the first public release. Until then, clone the repo and `pnpm build && pnpm link --global` for local use.
In an existing project:

```sh
arc init      # scaffold .arc/ with one example schema + component
arc compile   # generate Python / TypeScript / Zod / JSON Schema artifacts
arc check     # verify the working tree against the contract
```

The GIF above is rendered by `vhs` from `scripts/demo.tape`. To regenerate it deterministically (and avoid vhs's flaky Windows ↔ ttyd bridge), use the bundled Dockerfile:

```sh
docker build -f scripts/demo.Dockerfile -t arc-demo-vhs .
docker run --rm -v "$PWD:/work" -w /work arc-demo-vhs scripts/demo.tape
```
`arc check --json` emits a machine-readable `{ drifts: Drift[] }` payload — the same shape the MCP tool returns. Exit code is non-zero when any error-severity drift is present.
Arc compares the contract against the working tree and emits structured drift on:
- Bond resolution — every declared handler / client / test symbol exists.
- FastAPI signature — handler parameter and return types match the schema.
- Route matching — `@app.<method>("<path>")` matches the contract's method + path.
- Import scope — `forbidden_paths` and `depends_on` are honored.
- Generated files current — `Source-Hash` and `Generator-Hash` headers match the contract + toolchain.
- Reference resolution — every `ref:` points at a real schema and field.
Each drift carries a stable id, a kind, the bond it touches, expected vs actual descriptions, and a fix hint. See `docs/case-study-fastapi-react.md` for a worked example against a real OSS demo target.
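To make that concrete, here is an illustrative `arc check --json` payload. The field names below are hypothetical (the authoritative wire shape is the snapshot-locked `Drift` type); the content mirrors the fields described above:

```jsonc
{
  "drifts": [
    {
      "id": "drift.route-matching.tasks.create",   // hypothetical id format
      "kind": "route-matching",
      "bond": "tasks.create",
      "severity": "error",
      "expected": "@app.post(\"/tasks\")",
      "actual": "@app.post(\"/task\")",
      "fixHint": "update the route decorator to match the contract's method + path"
    }
  ]
}
```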
Arc's main value is structured drift context for agents. The MCP server exposes five tools:
| Tool | Purpose |
|---|---|
| `arc_check` | run all drift checks, return the structured list |
| `arc_explain` | given a drift id, return source/code excerpts, related schemas, dependency chain, why-it-matters |
| `arc_propose_fix` | given a drift id, return a structured fix descriptor (strategy + target file/symbol + guidance) — not a patch |
| `arc_validate_patch` | given a unified diff, run all checks against an isolated overlay; never mutates the working tree |
| `arc_scope` | given a list of files, return the components, schemas, bonds, forbidden paths, and tests in scope |
Start the server with `arc mcp --root <project>` (stdio). Setup guides for the common clients:

- VS Code forks (Cursor / Windsurf / Void / Trae and stock VS Code 1.99+): `docs/integrations/vscode-forks.md`
- Cline: `docs/integrations/cline.md`
- Continue: `docs/integrations/continue.md`
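Most of these clients also accept a generic JSON MCP entry. A minimal sketch — the server name `arc` is arbitrary, and the config file location varies by client (see the guides above):

```json
{
  "mcpServers": {
    "arc": {
      "command": "arc",
      "args": ["mcp", "--root", "."]
    }
  }
}
```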
Two skills ship with the npm package, both at `<install-dir>/skills/`:

- `arc-fix` — drive the `arc_check → arc_explain → arc_propose_fix → arc_validate_patch` loop to a clean check. Use after Arc is set up and the contract exists.
- `arc-adopt` — read an existing codebase and draft `.arc/` contracts matching its current types. Use to introduce Arc to a brownfield project.
To enable either in Claude Code, copy the skill directory into `~/.claude/skills/`. The exact command depends on your package manager and OS:

POSIX (npm):

```sh
cp -r "$(npm root -g)/arc-contract/skills/." ~/.claude/skills/
```

POSIX (pnpm):

```sh
cp -r "$(pnpm root -g)/arc-contract/skills/." ~/.claude/skills/
```

POSIX (yarn classic):

```sh
cp -r "$(yarn global dir)/node_modules/arc-contract/skills/." ~/.claude/skills/
```

Windows (PowerShell, any package manager):

```powershell
$src = (& npm root -g) + "\arc-contract\skills"   # or pnpm root -g
Copy-Item -Recurse "$src\*" "$env:USERPROFILE\.claude\skills\"
```

Both skills install in one command (the `/.` / `\*` trailing pattern copies all subdirectories under `skills/`). Claude Code surfaces each by name when its description matches the user's task.
For environments without Node, scripts in `scripts/build-binary.*` produce a standalone executable via Bun (primary) or Node SEA (fallback). See those scripts' header comments — the WASM tree-sitter grammars need slightly different handling depending on platform.
The v0.1 benchmark question: does Arc make agents produce fewer contract-violation defects than the same model without Arc?
Three conditions × 10 fixed tasks × 3 model tiers = 90 runs. The headline credibility metric is the B − C delta — Arc vs. a structured-prompt control. B − A is the easy delta (just adding a workflow helps); B − C is the real test (does the tooling help beyond the prompt?).
A 9-run smoke (3 tasks × 3 conditions × Claude Haiku 4.5) against the fastapi-react case-study target:
| Condition | Success | Failure mode |
|---|---|---|
| B (Arc-enabled) | 3/3 = 100% | none |
| A (no Arc, generic prompt) | 2/3 = 67% | 1 false-positive completion |
| C (no Arc, structured workflow prompt) | 2/3 = 67% | 1 false-positive completion |
Both no-Arc failures were the same false positive on the `read_root` task: the agent edited the return value but not the return type annotation, then self-assessed as success. The Arc-enabled agent re-ran `arc check` after editing, caught the gap, and fixed it. The B − C delta of +33pp shows the tooling matters, not just the prompt.
Full breakdown: `benchmarks/SMOKE_RESULTS.md`.
The smoke is N=1 per cell. The publish run will fill out the matrix per `plan/arc_benchmarking_plan.md`:
- 10 distinct drift kinds, not 3 instances of one
- 3 model tiers (Opus / Sonnet / Haiku)
- ≥3 reps per cell for variance
- Real MCP server (not Bash shell-out)
- Full `arc_validate_patch` convergence loop
Harness at `benchmarks/`. Tasks and prompts are frozen — don't tweak between runs. See `benchmarks/README.md` for the operator quick-start.
Three doors in, picked by what you already have:
- Greenfield project, cross-language types from day one. Run `arc init` to scaffold `.arc/` with one example schema + component. Read `docs/format.md`. Write your first schema (typically 10–20 lines of YAML). Run `arc compile`. Add `arc check` to CI (a sketch of such a CI job follows this list).
- Existing codebase with implicit contracts (Pydantic + TS interfaces + Zod + maybe OpenAPI). Run `arc init`, then invoke the `arc-adopt` skill in Claude Code — it reads your existing types, drafts matching `.arc/` contracts, and runs `arc check` to surface whatever drift the four-files-per-contract sprawl was hiding. Manual review takes ~10–30 min for a typical service; the fastapi-react case study walks through a real example where Arc immediately surfaced 6 drifts on the as-is code. (Automated `arc init --infer` is on the v0.3 roadmap.)
- You have an AI agent and want it to be less wrong. Skip to the agent integration section above. The `arc-fix` skill drives the `arc_check → arc_explain → arc_propose_fix → arc_validate_patch` loop to a clean check.
Arc's MCP server is plain JSON-RPC over stdio — any client that speaks the Model Context Protocol can connect. The five tool wire shapes are snapshot-locked in `tests/mcp/__snapshots__/snapshots.test.ts.snap`; a new client just needs to wire those into its prompt/tool surface.
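Concretely, after the standard MCP `initialize` handshake a client invokes each tool with a `tools/call` request over stdio. A sketch — the `arguments` shape below is hypothetical; the snapshot file is the locked source of truth:

```jsonc
// one JSON-RPC message on the server's stdin
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "arc_explain",
    "arguments": { "driftId": "drift.route-matching.tasks.create" }   // hypothetical argument shape
  }
}
```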
If you want to extend Arc itself — new drift checks, new language extractors, new fix strategies — see CONTRIBUTING.md for the dev workflow and the standing review-agent sweep policy.
v0.1 is feature-complete. Two parallel 8-agent kaava sweeps (mid-phase and pre-ship) returned no critical findings; all high-severity findings have been addressed. 303 tests pass; `tsc` clean. See `TODO.md` for known-deferred polish items and `plan/arc-v0.1-plan.md` for the locked v0.1 contract.
