
Arc

Contract authority for AI-assisted codebases. Define your schemas and API contracts once in .arc.yaml. Arc generates types in Python, TypeScript, and Zod, fails CI when hand-written code drifts from those contracts, and exposes structured drift context to any coding agent via MCP.

The one-liner: Define contracts once. Arc generates the code, breaks the build on drift, and gives your AI agents verified truth.


Why

Without Arc, an API contract lives in four places: the Pydantic model on the backend, the TypeScript type on the frontend, the Zod validator at the edge, and the OpenAPI schema in CI. Agents reading the codebase have no anchor — each file looks like a hand-typed copy of the others. When they drift, agents make plausible-but-wrong edits.

With Arc, the contract is one file. Everything else is generated. An agent that touches the codebase first reads the contract, then verifies its own diff against the contract before committing. Arc owns truth; agents own inference and editing.

              before                                  after
   ┌──────────────────────────┐         ┌─────────────────────────────┐
   │ backend/models/task.py   │         │ .arc/schemas/task.arc.yaml  │  ←  the contract
   │ frontend/types/task.ts   │         │                             │
   │ shared/schemas/task.zod  │   →     │ (everything else generated) │
   │ docs/openapi/task.yaml   │         │                             │
   └──────────────────────────┘         └─────────────────────────────┘
   four files, hand-maintained          one file, source of truth
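As an illustrative sketch only (the real contract format lives in docs/format.md; every field name below is an assumption, not Arc's actual syntax), a one-file contract for the task example might look like:

```yaml
# Hypothetical .arc/schemas/task.arc.yaml — field names are illustrative.
# See docs/format.md for the real format.
schema: Task
fields:
  id:    { type: string, required: true }
  title: { type: string, required: true }
  done:  { type: boolean, default: false }
  due:   { type: string, format: date-time, nullable: true }
targets:
  - python      # Pydantic model
  - typescript  # TS interface
  - zod         # runtime validator
```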

Who this is for

Arc helps three kinds of teams:

  • Teams using AI coding agents (Claude Code, Cursor, Cline, Continue, etc.). Your agent reads the Arc contract, verifies its own diff against arc_validate_patch, and stops making plausible-but-wrong edits. The smoke benchmark shows Haiku going from 67% → 100% task success when Arc tools are attached — and the failures it prevents are the canonical "I think I fixed it" false positives.
  • Solo devs maintaining typed APIs. Replace the four-files-per-contract grind (Pydantic + TS interface + Zod + OpenAPI) with one source of truth. arc compile regenerates downstream artifacts; arc check fails CI when anything drifts.
  • Open-source maintainers reviewing PRs. Run arc check in CI; reviewers see structured drift output instead of having to manually cross-check that a backend change kept the TS types and Zod validators consistent. arc diff flags contract-weakening explicitly (nullable: false → true, required → optional, enum value drops) so silent API regressions can't slip past review.

If your codebase has typed cross-language API boundaries and AI agents read it, Arc is for you.


Install

npm i -g arc-contract     # or pnpm add -g arc-contract

The binary is arc. Confirm with arc --version.

The Node-side codegen toolchain (json-schema-to-typescript, json-schema-to-zod, prettier) ships bundled — arc compile works out of the box for TypeScript / Zod outputs.

If you generate Pydantic models or want Python formatting, also install the Python tools. After arc init scaffolds your project, a pinned .arc/requirements.txt is included for one-line setup:

pip install -r .arc/requirements.txt
# or, for isolated installs (recommended):
pipx install datamodel-code-generator
pipx install ruff

Commit .arc/requirements.txt alongside the rest of .arc/ — all team members run pip install -r .arc/requirements.txt once. Arc emits a clear hint pointing at the same file if either tool is missing when you run arc compile.

Note: v0.1 is versioned 0.1.0-dev until the first public release. Until then, clone the repo and run pnpm build && pnpm link --global for local use.

Quick start

(demo GIF: arc init → compile → drift → check → recompile)

In an existing project:

arc init                                    # scaffold .arc/ with one example schema + component
arc compile                                 # generate Python / TypeScript / Zod / JSON Schema artifacts
arc check                                   # verify the working tree against the contract

The GIF above is rendered by vhs from scripts/demo.tape. To regenerate it deterministically (and avoid vhs's flaky Windows ↔ ttyd bridge), use the bundled Dockerfile:

docker build -f scripts/demo.Dockerfile -t arc-demo-vhs .
docker run --rm -v "$PWD:/work" -w /work arc-demo-vhs scripts/demo.tape

arc check --json emits a machine-readable { drifts: Drift[] } payload — the same shape the MCP tool returns. Exit code is non-zero when any error-severity drift is present.
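As a sketch of consuming that payload in a custom CI script (the "drifts", "severity", "id", "kind", and "fix_hint" field names are assumptions inferred from this README's description of the payload, not a documented API):

```python
import json
import subprocess

def failing_drifts(payload: dict) -> list[dict]:
    """Drifts that should fail the build (error severity).

    Field names here are assumptions based on this README's
    description of the `arc check --json` payload shape.
    """
    return [d for d in payload.get("drifts", []) if d.get("severity") == "error"]

def gate(root: str = ".") -> int:
    """Run `arc check --json` and return a CI exit code."""
    # arc itself exits non-zero on error-severity drift, so capture
    # output without raising on the exit code.
    proc = subprocess.run(
        ["arc", "check", "--json"], cwd=root, capture_output=True, text=True
    )
    drifts = failing_drifts(json.loads(proc.stdout))
    for d in drifts:
        print(f"{d.get('id')}: {d.get('kind')} (hint: {d.get('fix_hint')})")
    return 1 if drifts else 0
```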

Six v0.1 drift checks

Arc compares the contract against the working tree and emits structured drift on:

  1. Bond resolution — every declared handler / client / test symbol exists.
  2. FastAPI signature — handler parameter and return types match the schema.
  3. Route matching — @app.<method>("<path>") matches the contract's method + path.
  4. Import scope — forbidden_paths and depends_on are honored.
  5. Generated files current — Source-Hash and Generator-Hash headers match the contract + toolchain.
  6. Reference resolution — every ref: points at a real schema and field.

Each drift carries a stable id, a kind, the bond it touches, expected vs actual descriptions, and a fix hint. See docs/case-study-fastapi-react.md for a worked example against a real OSS demo target.
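For a feel of check 5, a generated artifact's header might look like the sketch below. The header names come from the check above, but the exact layout and comment style are assumptions, not Arc's actual output:

```python
# Generated by Arc from .arc/schemas/task.arc.yaml — do not edit by hand.
# Source-Hash: <hash of the compiled contract>
# Generator-Hash: <hash of the codegen toolchain>
```

Editing the contract without re-running arc compile leaves the recorded Source-Hash stale, which is exactly what the check flags.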


Agent integration

Arc's main value is structured drift context for agents. The MCP server exposes five tools:

Tool                 Purpose
arc_check            run all drift checks, return the structured list
arc_explain          given a drift id, return source/code excerpts, related schemas, dependency chain, why-it-matters
arc_propose_fix      given a drift id, return a structured fix descriptor (strategy + target file/symbol + guidance) — not a patch
arc_validate_patch   given a unified diff, run all checks against an isolated overlay; never mutates the working tree
arc_scope            given a list of files, return the components, schemas, bonds, forbidden paths, and tests in scope
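Since the server speaks standard MCP (JSON-RPC 2.0, newline-delimited over stdio), any client can frame a tool call itself. A minimal sketch of building one request; the "drift_id" argument name is an assumption for illustration, not Arc's documented parameter:

```python
import json

def mcp_tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single
    newline-delimited JSON-RPC 2.0 message (MCP stdio framing)."""
    msg = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(msg) + "\n"

# Written to the stdin of a server started with `arc mcp --root <project>`.
# "drift_id" is an illustrative argument name.
line = mcp_tool_call(1, "arc_explain", {"drift_id": "..."})
```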

Start the server with arc mcp --root <project> (stdio). Setup guides for the common clients:

Claude Code skills

Two skills ship with the npm package, both at <install-dir>/skills/:

  • arc-fix — drive the arc_check → arc_explain → arc_propose_fix → arc_validate_patch loop to a clean check. Use after Arc is set up and the contract exists.
  • arc-adopt — read an existing codebase and draft .arc/ contracts matching its current types. Use to introduce Arc to a brownfield project.

To enable either in Claude Code, copy the skill directory into ~/.claude/skills/. The exact command depends on your package manager and OS:

POSIX (npm):

cp -r "$(npm root -g)/arc-contract/skills/." ~/.claude/skills/

POSIX (pnpm):

cp -r "$(pnpm root -g)/arc-contract/skills/." ~/.claude/skills/

POSIX (yarn classic):

cp -r "$(yarn global dir)/node_modules/arc-contract/skills/." ~/.claude/skills/

Windows (PowerShell, any package manager):

$src = (& npm root -g) + "\arc-contract\skills"   # or pnpm root -g
Copy-Item -Recurse "$src\*" "$env:USERPROFILE\.claude\skills\"

Either command installs both skills: the trailing /. (POSIX) or \* (PowerShell) copies every subdirectory under skills/. Claude Code surfaces each skill by name when its description matches the user's task.


Single-binary distribution

For environments without Node, scripts in scripts/build-binary.* produce a standalone executable via Bun (primary) or Node SEA (fallback). See those scripts' header comments — the WASM tree-sitter grammars need slightly different handling depending on platform.


Benchmark

The v0.1 benchmark question: does Arc make agents produce fewer contract-violation defects than the same model without Arc?

Three conditions × 10 fixed tasks × 3 model tiers = 90 runs. The headline credibility metric is the B − C delta — Arc vs. a structured-prompt control. B − A is the easy delta (just adding a workflow helps); B − C is the real test (does the tooling help beyond the prompt?).

Smoke results (current)

A 9-run smoke (3 tasks × 3 conditions, all on Claude Haiku 4.5) against the fastapi-react case-study target:

Condition                                 Success      Failure mode
B (Arc-enabled)                           3/3 = 100%   none
A (no Arc, generic prompt)                2/3 = 67%    1 false-positive completion
C (no Arc, structured workflow prompt)    2/3 = 67%    1 false-positive completion

Both no-Arc failures were the same false positive on the read_root task: the agent edited the return value but not the return type annotation, then self-assessed as success. The Arc-enabled agent re-ran arc check after editing, caught the gap, and fixed it. The B − C delta of +33pp shows the tooling matters, not just the prompt.

Full breakdown: benchmarks/SMOKE_RESULTS.md.

Publish-quality run (coming)

The smoke is N=1 per cell. The publish run will fill out the matrix per plan/arc_benchmarking_plan.md:

  • 10 distinct drift kinds, not 3 instances of one
  • 3 model tiers (Opus / Sonnet / Haiku)
  • ≥3 reps per cell for variance
  • Real MCP server (not Bash shell-out)
  • Full arc_validate_patch convergence loop

Harness at benchmarks/. Tasks and prompts are frozen — don't tweak between runs. See benchmarks/README.md for the operator quick-start.


Using Arc in your project

Three ways in, depending on what you already have:

  1. Greenfield project, cross-language types from day one. Run arc init to scaffold .arc/ with one example schema + component. Read docs/format.md. Write your first schema (typically 10–20 lines of YAML). Run arc compile. Add arc check to CI.
  2. Existing codebase with implicit contracts (Pydantic + TS interfaces + Zod + maybe OpenAPI). Run arc init, then invoke the arc-adopt skill in Claude Code — it reads your existing types, drafts matching .arc/ contracts, and runs arc check to surface whatever drift the four-files-per-contract sprawl was hiding. Manual review takes ~10–30 min for a typical service; the fastapi-react case study walks through a real example where Arc immediately surfaced 6 drifts on the as-is code. (Automated arc init --infer is on the v0.3 roadmap.)
  3. You have an AI agent and want it to be less wrong. Skip to the agent integration section above. The arc-fix skill drives the arc_check → arc_explain → arc_propose_fix → arc_validate_patch loop to a clean check.
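For door 1's "add arc check to CI" step, a minimal GitHub Actions job might look like the sketch below. This file is not shipped by Arc; the workflow layout and action versions are assumptions, and the Python step only applies if you generate Pydantic models:

```yaml
# .github/workflows/arc.yml — illustrative sketch, not part of Arc.
name: arc-check
on: [push, pull_request]
jobs:
  contract:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npm i -g arc-contract
      - run: pip install -r .arc/requirements.txt   # only if generating Python
      - run: arc check   # non-zero exit on error-severity drift fails the job
```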

Building integrations / contributing

Arc's MCP server is plain JSON-RPC over stdio — any client that speaks the Model Context Protocol can connect. The five tool wire shapes are snapshot-locked in tests/mcp/__snapshots__/snapshots.test.ts.snap; a new client just needs to wire those into its prompt/tool surface.

If you want to extend Arc itself — new drift checks, new language extractors, new fix strategies — see CONTRIBUTING.md for the dev workflow and the standing review-agent sweep policy.


Project status

v0.1 is feature-complete. Two parallel 8-agent kaava sweeps (mid-phase and pre-ship) returned no critical findings; all high-severity findings have been addressed. 303 tests pass; tsc clean. See TODO.md for known-deferred polish items and plan/arc-v0.1-plan.md for the locked v0.1 contract.

License

Apache-2.0. See LICENSE and NOTICE.
