Aegis

Aegis ensures bad decisions don't stick.

A system for preventing silent regressions in LLM-driven workflows.

If you are an AI coding agent (Claude Code, Cursor, Aider, etc.) helping a user install or integrate Aegis, read AGENTS.md instead — it has imperative install commands, integration templates you can paste, and the framing rules you must respect. The rest of this README is for the human evaluating whether to adopt Aegis.

What it is

Aegis is a constraint-based behavior harness for LLM systems. (In the broad sense: a verification environment, not an agent driver.)

For the domain-independent framework definition, see docs/framework.md. This repository is the reference implementation on the code-change domain: Aegis applied to file-write transitions proposed by any upstream agent (Cursor / Claude Code / Aider / your own).

Aegis does not write code, and does not tell the LLM how to write code. It only judges whether the code an LLM produces is allowed to stay. The LLM (or whichever code generator you wrap) keeps full agency over what to write; Aegis exercises agency only over what passes its gates.

It enforces a local closed loop: each proposed state transition is validated, and regressions are rejected.

Instead of optimizing model behavior, Aegis enforces explicit, verifiable constraints, ensuring that invalid or regressive states do not persist.

Why it exists

LLM systems fail in three ways that current tooling does not catch:

- Multi-turn refactors accumulate regressions silently
- LLM-described actions diverge from actual tool calls
- Structural rules erode without anyone noticing

Aegis exists to make these failures visible and rejectable.

Core mechanism

Aegis controls state transitions:

Sₙ → Sₙ₊₁ is allowed only if all constraints are satisfied
and no regression is detected.

Otherwise, the system rolls back to Sₙ.

Cost-aware rollback is the only criterion applied consistently across iterations. The other checks (validation, structural constraints) act as local guards on a single transition, not as global direction signals.
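The transition rule above can be sketched in a few lines of Rust. This is an illustrative sketch, not Aegis's actual code: the `State` struct, its `valid` and `cost` fields, and the `step` function are hypothetical names chosen for the example, and a real state would be a workspace snapshot rather than two fields.

```rust
/// Hypothetical stand-in for a workspace state S_n.
#[derive(Debug, Clone, PartialEq)]
pub struct State {
    /// Assumed: true when every local guard (syntax, plan validation) passed.
    pub valid: bool,
    /// Assumed: an aggregate of the structural signals for this state.
    pub cost: u64,
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    Accepted,
    RolledBack(&'static str),
}

/// Decide whether S_n -> S_{n+1} may stand.
/// Returns the state the system ends up in, plus the reason.
pub fn step(current: &State, proposed: State) -> (State, Outcome) {
    if !proposed.valid {
        // A constraint failed: the proposed state never replaces S_n.
        return (current.clone(), Outcome::RolledBack("constraint violated"));
    }
    if proposed.cost > current.cost {
        // Signals got worse: treat it as a regression and keep S_n.
        return (current.clone(), Outcome::RolledBack("cost regression"));
    }
    (proposed, Outcome::Accepted)
}
```

Note that the function never proposes a better state; it only accepts or rejects the one it is handed, which is the judge-only role described under What it is.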

What actually gets checked (V1.10, six layers)

| Layer | Checks | Where it runs |
| --- | --- | --- |
| Ring 0: syntax | tree-sitter parse; ERROR / MISSING node → BLOCK | `aegis check`, MCP, pipeline |
| Ring 0.5: structural signals | `fan_out` (unique imports), `max_chain_depth` (longest method chain); numeric only | `aegis check`, MCP, pipeline |
| Cost regression | `sum(signals_after) > sum(signals_before)` → BLOCK / ROLLBACK | MCP (when `old_content` given), pipeline (every iter) |
| PlanValidator | path safety / scope / dangerous_path / virtual-FS simulation | `aegis pipeline run` only |
| Executor + Snapshot | atomic apply with backup-dir rollback | `aegis pipeline run` only |
| Stalemate / Thrashing detector | sequence-level; halts the loop with a named reason | `aegis pipeline run` only |

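aegis check and the MCP server cover the single-file layers: Ring 0 and Ring 0.5 always, plus cost regression in the MCP server when old_content is supplied. The multi-turn pipeline runs those same checks on every iteration and adds the last three layers (cross-iteration loop control).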
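As a rough illustration of the cost-regression row, the sketch below compares the sum of the two Ring 0.5 signals before and after a change. The signal names `fan_out` and `max_chain_depth` come from the table; the `Signals` struct, the `Decision` enum, and the `cost_regression` function are assumptions made for this example and may not match the real crate's types.

```rust
/// The two numeric Ring 0.5 signals named in the table.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Signals {
    pub fan_out: u32,
    pub max_chain_depth: u32,
}

impl Signals {
    /// Plain sum, mirroring `sum(signals_after) > sum(signals_before)`.
    pub fn total(&self) -> u32 {
        self.fan_out + self.max_chain_depth
    }
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Pass,
    Block,
}

/// `before` is optional because a single-file check may have no earlier
/// version to compare against (the MCP tool only has one when
/// `old_content` is supplied).
pub fn cost_regression(before: Option<Signals>, after: Signals) -> Decision {
    match before {
        Some(b) if after.total() > b.total() => Decision::Block,
        _ => Decision::Pass,
    }
}
```

Because the comparison is on the sum, a change that raises `fan_out` by one while lowering `max_chain_depth` by one would pass under this sketch; only a net increase is treated as a regression.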

Design principles

- Do not write code; only judge code that is written
- Do not tell the model what to write; only what cannot stay
- Only reject what is verifiably bad
- No automatic learning
- No objective optimization

The first two are the load-bearing ones. They are why Aegis can wrap any code-generating agent (Cursor, Claude Code, Aider, your own pipeline) without becoming a competing one — Aegis exercises judge agency; the wrapped agent keeps author agency.

Guarantees

- Regressions are detected and rolled back when constraints are defined
- Invalid states are blocked at the validation layer
- All decisions are recorded in a machine-readable trace

Non-goals

Aegis is not:

- An AI agent
- An optimizer
- A self-improving system

These are deliberate design choices, not future work. Aegis will not evolve into any of these. See docs/post_launch_discipline.md for the explicit deferral list.

Scope

Aegis enforces correctness within a single execution loop, but does not adapt behavior across executions.

The loop is local, not global.

Quickstart

As of V1.10, Aegis is a single Rust workspace producing two binaries — aegis (CLI) and aegis-mcp (MCP stdio server). Zero Python at runtime.

Install

# Prerequisites: git + a Rust toolchain.
# If you don't have Rust:
#   curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
#   source "$HOME/.cargo/env"

git clone https://github.com/wei9072/aegis && cd aegis
cargo build --release --workspace

Install the binaries into Cargo's bin directory (~/.cargo/bin by default) so aegis / aegis-mcp end up on $PATH:

cargo install --path crates/aegis-cli
cargo install --path crates/aegis-mcp     # optional — MCP server

Cross-platform release artifacts (Linux x86_64/aarch64, macOS x86_64/aarch64, Windows x86_64) are slated to ship via GitHub Releases. The packaging is templated under packaging/, and its activation is the V2.0 milestone in docs/v1_rust_port_plan.md.

Static analysis (no LLM, no API key)

aegis check runs Ring 0 (syntax) + Ring 0.5 (structural signals — fan-out, max chain depth) on any supported source file:

aegis languages                       # list supported languages
aegis check path/to/file.py           # human-readable signals
aegis check path/to/file.py --json    # machine-readable

The intent: drop this into a pre-commit hook or CI gate so bad diffs never make it past the boundary you care about. See docs/integrations/ for paste-ready examples.

LLM-backed multi-turn pipeline

aegis pipeline run drives the Planner → Validator → Executor loop against your workspace. Provider configuration comes from environment variables, and any OpenAI-compatible endpoint is supported (OpenAI, OpenRouter, Groq):

export AEGIS_PROVIDER=openai          # or: openrouter | groq
export AEGIS_MODEL=gpt-4o-mini        # any model the provider exposes
export OPENAI_API_KEY=sk-...          # or AEGIS_API_KEY (provider-agnostic)

aegis pipeline run \
  --task "rename the foo helper to bar everywhere" \
  --root . \
  --max-iters 3

Every iteration prints a one-line summary (iter 0 [abc12345] plan=continuing patches=2 applied=true rolled_back=false); add --json for a machine-readable summary at the end. The loop stops when the planner declares done, signals stalemate, signals thrashing, or hits --max-iters. Cost-aware regression rollback fires automatically if structural signals get worse mid-loop.
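The four stop conditions listed above can be pictured as a small loop skeleton. This is a sketch under assumptions: `IterOutcome`, `StopReason`, and `run_loop` are hypothetical names, and the closure stands in for one full Planner → Validator → Executor iteration, which the real pipeline implements very differently.

```rust
/// Hypothetical result of a single pipeline iteration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum IterOutcome {
    Continuing,
    Done,
    Stalemate,
    Thrashing,
}

/// Why the loop ended; one variant per stop condition in the text.
#[derive(Debug, PartialEq)]
pub enum StopReason {
    Done,
    Stalemate,
    Thrashing,
    MaxIters,
}

/// Run at most `max_iters` iterations and report how many ran and why
/// the loop stopped. `run_iter` receives the zero-based iteration index.
pub fn run_loop(
    max_iters: usize,
    mut run_iter: impl FnMut(usize) -> IterOutcome,
) -> (usize, StopReason) {
    for iter in 0..max_iters {
        match run_iter(iter) {
            IterOutcome::Continuing => continue,
            IterOutcome::Done => return (iter + 1, StopReason::Done),
            IterOutcome::Stalemate => return (iter + 1, StopReason::Stalemate),
            IterOutcome::Thrashing => return (iter + 1, StopReason::Thrashing),
        }
    }
    // Budget exhausted without any earlier stop condition firing.
    (max_iters, StopReason::MaxIters)
}
```

The point of the shape is that every exit carries a named reason, matching the "halts the loop with a named reason" wording in the layer table.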

MCP server (Cursor / Claude Code / your own agent)

aegis-mcp     # stdio JSON-RPC, MCP protocol 2025-06-18

Configure your MCP client per docs/integrations/mcp_design.md; the agent then calls validate_change(path, new_content, old_content?) mid-loop and gets a {decision, reasons, signals_after, …} verdict back. The server is purely observational: it never coaches the agent (see Design principles).
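For readers writing their own agent, the sketch below builds the JSON-RPC message a client might send to invoke the tool. The tool name and the `path`, `new_content`, and `old_content` argument names come from the signature above, and the `tools/call` envelope follows the MCP specification; the helper names `json_escape` and `validate_change_request` are hypothetical. A real client would also complete the MCP `initialize` handshake first and would normally use a JSON library rather than string formatting.

```rust
/// Escape a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build one `tools/call` request line for the `validate_change` tool.
/// `old_content` is omitted from the arguments when it is `None`.
pub fn validate_change_request(
    id: u64,
    path: &str,
    new_content: &str,
    old_content: Option<&str>,
) -> String {
    let mut args = format!(
        r#""path":"{}","new_content":"{}""#,
        json_escape(path),
        json_escape(new_content)
    );
    if let Some(old) = old_content {
        args.push_str(&format!(r#","old_content":"{}""#, json_escape(old)));
    }
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"validate_change","arguments":{{{}}}}}}}"#,
        id, args
    )
}
```

Passing `old_content` is what lets the server run the cost-regression comparison; without it, only the single-state checks can apply.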

Integrations

You're already using Cursor / Claude Code / Aider / Copilot / your own agent. Aegis is meant to be a side-channel enforcement layer that doesn't ask you to switch tools.

| Boundary | Path | Status |
| --- | --- | --- |
| Commit | Git pre-commit hook | ✅ ready (5-line bash) |
| PR / merge | GitHub Action / CI gate | ✅ ready (10-line YAML) |
| Agent decision | MCP server | ✅ `validate_change` ready (`cargo install --path crates/aegis-mcp && aegis-mcp`) |

Pick whichever boundary fits your workflow; you can stack them. Index + per-path detail: docs/integrations/.

Status

| Layer | State | Notes |
| --- | --- | --- |
| Execution Engine | ✅ | Pipeline + Executor + cost-aware regression rollback. Native Rust loop in `aegis-runtime::native_pipeline`. |
| Static analysis | ✅ | Ring 0 (syntax) + Ring 0.5 (`fan_out`, `max_chain_depth`) shared by `aegis check` + `aegis pipeline run` + `aegis-mcp validate_change`. |
| Decision Trace | ✅ | `DecisionTrace` + 10-value `DecisionPattern` + 5-value `TaskVerdict`; Python-era cross-model evidence in `docs/v1_validation.md`. Rust re-validation is gated on LLM API budget (V1.8). |
| MCP server | ✅ | `aegis-mcp`: hand-rolled JSON-RPC 2.0 over stdio; one tool, `validate_change`, per `docs/integrations/mcp_design.md`. |
| Cross-model sweep harness | 🟡 | `aegis pipeline run` works scenario-by-scenario; batch sweep (`aegis sweep`) is V1.8 backlog, gated on API budget. |
| Feedback Layer | ❌ | Out of scope by design; see Non-goals and Critical Principle. Structurally enforced by `crates/aegis-decision/tests/contract.rs`. |

Supported source languages (Ring 0 + Ring 0.5 signals)

Tier 2 multi-language support landed in V1.4–V1.7 of the Rust port (see docs/v1_rust_port_plan.md). With V1.10's Python deletion, every language listed below now gets both enforcement (Ring 0 + Ring 0.5 via aegis check) and the refactor half (aegis pipeline run against an OpenAI-compatible LLM provider). Run aegis languages for the live registry.

| Language | Ring 0 syntax | Ring 0.5 fan-out | Ring 0.5 chain-depth | Extensions |
| --- | --- | --- | --- | --- |
| Python | ✅ | ✅ | ✅ | `.py`, `.pyi` |
| TypeScript | ✅ | ✅ | ✅ | `.ts`, `.tsx`, `.mts`, `.cts` |
| JavaScript | ✅ | ✅ | ✅ | `.js`, `.mjs`, `.cjs`, `.jsx` |
| Go | ✅ | ✅ | ✅ | `.go` |
| Java | ✅ | ✅ | 🟡 | `.java` |
| C# | ✅ | ✅ | ✅ | `.cs` |
| PHP | ✅ | ✅ | ✅ | `.php`, `.phtml`, `.php5`, `.php7`, `.phps` |
| Swift | ✅ | ✅ | ✅ | `.swift` |
| Kotlin | ✅ | ✅ | ✅ | `.kt`, `.kts` |
| Dart | ✅ | ✅ | 🟡 | `.dart` |
| Rust | ✅ | ✅ | ✅ | `.rs` |

🟡 = the default chain-depth walker under-counts on this language's AST shape; per-language overrides are the planned fix path (LanguageAdapter::max_chain_depth).

Adding a language is one Cargo dep + one adapter file under crates/aegis-core/src/ast/languages/ + one .scm query — checklist in docs/multi_language_plan.md#per-language-work-checklist.

Philosophy

If Aegis starts learning automatically, it has violated its own design.

License

MIT — see LICENSE.

V1.10 — Rust workspace, zero Python at runtime. Cross-platform release artifacts (Homebrew, npm, GitHub Releases) are templated under packaging/; activation is the V2.0 milestone in docs/v1_rust_port_plan.md.

