autodev

A tool-agnostic autonomous software-delivery system where agents implement, review, and merge, and only real decisions reach the human.

Read the design · The core loop · Three invariants · Glossary

This repo is a durable record of design and knowledge, not installable software. It describes an autonomous software-delivery system in which agents handle implementation, review, and merge; only genuine decisions are escalated to a human, and those decisions accumulate into a persistent Knowledge Graph that makes the next task smarter. The Orchestrator and its current Runtime are swappable parts above the line (the concrete implementation is quarantined to a single doc, 10-runtime-mapping); the models, contracts, and decisions captured in these documents are the durable substrate below it.

The human's role is decision-maker. They do not write code. They settle only the real decisions that remain after the machine Gauntlet has filtered everything else, and each decision becomes knowledge so the same question never reaches them twice.
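As a rough illustration of how "the same question never reaches them twice" could work, here is a minimal sketch that searches a directory of markdown decision records before escalating. The function name `find_prior_decisions`, the flat `*.md` layout, and the all-terms-must-match rule are assumptions made for this example; the actual record format is defined in the design docs, not here.

```python
from pathlib import Path
from typing import List


def find_prior_decisions(adr_dir: Path, terms: List[str]) -> List[Path]:
    """Return decision records that mention every search term (case-insensitive).

    Hypothetical helper: assumes one markdown file per decision in adr_dir.
    """
    wanted = [t.lower() for t in terms]
    hits = []
    for path in sorted(adr_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8").lower()
        if all(term in text for term in wanted):
            hits.append(path)
    return hits
```

An agent following this sketch would escalate only when the search comes back empty, and would otherwise act on the recorded decision.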

Who is this for?

Engineers designing autonomous or semi-autonomous delivery pipelines who need a tool-agnostic model rather than vendor lock-in.
Teams that want agents to ship code unattended while keeping a hard, forgery-proof boundary between machine approval and human decision.
Architects interested in an Evergreen Decision Graph as a source of truth (SoT) that compounds over time, kept in plain markdown + git.
Anyone evaluating how to let an Agent and its harness or coding tool run safely behind a Merge Gate without ceding judgment to a single model's "done" signal.

The core loop

Issue → implement (agent) → multi-model gauntlet (Claude review → other-model review → CI gate)
      → [safe lane] auto-merge  /  [real decision] escalate to human → decide → accrue as ADR
      → the next task's agent reads that ADR (compounding)

An issue is implemented by an Agent, then passes through a multi-model review Gauntlet (Claude review, then a different model's review, then the CI gate). On the safe lane it auto-merges; on a real decision it escalates to the human, who decides, and the decision is captured as an ADR. The Agent for the next task reads that ADR, which is the compounding effect.
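The loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the design's actual Orchestrator: the names `Change`, `Loop`, `Outcome`, and the `needs_decision` flag are hypothetical, and sending a failed Gauntlet stage back for rework is an assumption (the source only says loop limits exist).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Outcome(Enum):
    AUTO_MERGED = "auto-merged"
    ESCALATED = "escalated"
    REWORK = "rework"  # assumption: a failed stage sends the change back to the agent


@dataclass
class Change:
    issue: str
    needs_decision: bool = False  # hypothetical flag: a real decision remains


@dataclass
class Loop:
    # Ordered Gauntlet stages (e.g. first review, second-model review, CI gate).
    gauntlet: List[Callable[[Change], bool]]
    adrs: List[str] = field(default_factory=list)

    def run(self, change: Change, decide: Callable[[Change], str]) -> Outcome:
        # Every stage must pass before any lane is chosen.
        if not all(stage(change) for stage in self.gauntlet):
            return Outcome.REWORK
        if change.needs_decision:
            # Escalate to the human and keep the answer as an ADR.
            self.adrs.append(decide(change))
            return Outcome.ESCALATED
        return Outcome.AUTO_MERGED
```

The `adrs` list stands in for the durable record that the next task's agent would read before acting.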

Three invariants

Identity separation: the party that produces an approval (a review verdict) and the party that merges are physically separated. Merge credentials cannot forge an approval.
No guessing: when a real decision is required, the Agent stops and escalates to the human. It never proceeds on a guess.
Durable + compounding: every decision becomes a grep-able node in markdown + git, and the next task reads it before acting.
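The identity-separation invariant can be illustrated with a small check. This is a sketch of the rule only: `Approval`, `may_merge`, and the identity names are hypothetical, and comparing names does not by itself make verdicts forgery-proof. In a real setup the separation would have to be enforced by distinct credentials on the hosting platform.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Approval:
    reviewer: str  # identity that produced the verdict
    commit: str    # commit the verdict applies to


def may_merge(merger: str, commit: str, approvals: Iterable[Approval],
              reviewer_identities: FrozenSet[str]) -> bool:
    """Allow a merge only if a separate reviewer identity approved this commit."""
    # The merging identity must never be able to act as a reviewer.
    if merger in reviewer_identities:
        return False
    # At least one approval must come from a known reviewer for this exact commit.
    return any(a.commit == commit and a.reviewer in reviewer_identities
               for a in approvals)
```

For example, `may_merge("merge-bot", "abc123", [Approval("review-bot", "abc123")], frozenset({"review-bot"}))` passes, while the same call with `merger="review-bot"` is refused.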

Read the design

Document	Contents
00-overview	Vision, goals, design principles, scope
01-architecture	End-to-end flow, components, trust boundaries
02-knowledge-graph	Evergreen Decision Graph: knowledge atoms, projection, maturity, visibility
03-review-gauntlet	Multi-model review Gauntlet, forgery-proof verdicts, loop limits
04-merge-gate	Ruleset, aggregated CI checks, risk classification, CODEOWNERS, tiers
05-escalation	Trigger policy, Decision Card, channels, answer paths
06-knowledge-loop	ADR lifecycle, Scribe, compounding metrics
07-unattended-ops	External dead-man-switch, quota backoff, kill-switch, daily caps
08-view-layer	Decision inbox, rendering, notifications
09-multi-repo-docs	Repo registry / onboarding, evolving docs
10-runtime-mapping	Abstract Runtime to current implementation (tool dependence lives only here), swap guide
11-access-and-search	Access control, sharing, search, and human views over git as SoT (federation tiers)
12-agents	Concrete agent roster: roles, models, triggers, and instruction prompts (build-ready)
13-project-lifecycle	Kickoff, design-first phase, milestone checkpoints, deliverables, sub-issue decomposition
14-reporting	Operator dashboard + client report (two audiences, one source)
15-quality-gates	Pervasive machine-checked Definition-of-Done as hard blocking gates
16-self-improvement	autodev improves autodev: observe -> propose -> gate -> measure
glossary	Glossary
roadmap	Improvement backlog
adr/	Records this design's own decisions in its own ADR format (self-dogfooding)

Principles

Tools are swappable, knowledge is durable: markdown + git is the source of truth. No vendor store ever becomes canonical.
completed != correct: a task-completion signal is not a quality signal. Quality gates (CI + independent review + human) are stacked explicitly.
Headless first: under unattended operations there may be no per-tool plugin (MCP and the like). Every path must be reachable via git / REST / CLI.
Visibility tiering: personal OSS and company-confidential split on a single frontmatter flag, and several chokepoints prevent leakage.
Start small, one at a time: drive the single safest lane end to end to build trust, then expand.
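As one way to picture visibility tiering, the sketch below reads a single frontmatter flag and treats anything unmarked as confidential. The key name `visibility`, the values `public` and `confidential`, and the fail-closed default are assumptions for this example; the real flag is defined in the design docs.

```python
from typing import Dict, List


def read_visibility(markdown_text: str, default: str = "confidential") -> str:
    """Return the frontmatter visibility flag, failing closed when it is absent."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return default  # no frontmatter at all
    found = default
    for line in lines[1:]:
        if line.strip() == "---":
            return found  # frontmatter closed properly
        key, sep, value = line.partition(":")
        if sep and key.strip() == "visibility":
            found = value.strip() or default
    return default  # unterminated frontmatter: treat as confidential


def publishable(docs: Dict[str, str]) -> List[str]:
    """Names of documents explicitly marked public (one possible chokepoint)."""
    return sorted(name for name, text in docs.items()
                  if read_visibility(text) == "public")
```

Defaulting to the stricter tier means a missing or malformed flag can hide a public document but cannot leak a confidential one.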
