
Aegis

Agent Execution Governance with Incremental Scoring

Aegis is an open-source sensitivity-gating framework for Claude Code. It scores every tool action using three independent signals — the agent's own self-assessment, the skill/MCP author's declared risk, and context-aware historical user preferences — and requires confirmation when the combined score exceeds a threshold.

Who is this for?

Skill Builders

If you build Claude Code skills or MCP servers with real-world side effects (sending emails, executing trades, modifying infrastructure), using the Aegis framework gives your users granular, adaptive permission control without them having to choose between "confirm everything" and "approve everything."

Users

If you use Claude Code skills or MCP servers that take actions on your behalf, Aegis learns from your contextual feedback over time — approving routine actions faster while continuing to check in on the ones you find sensitive. You can also take skills you're already using and make them Aegis-compatible with the provided script!

Quick Start

git clone https://github.com/veezbo/aegis.git
cd aegis
./install.sh   # requires uv (https://docs.astral.sh/uv/)

This registers two hooks in ~/.claude/settings.json, installs the /flag feedback skill, and optionally configures the user preference signal if ANTHROPIC_API_KEY is available.
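
For reference, the registered hooks land in `~/.claude/settings.json` in roughly the shape below. This is an illustrative sketch of the standard Claude Code hooks schema, not necessarily the exact matchers or paths `install.sh` writes (`/path/to/aegis` is a placeholder):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "*",
        "hooks": [
          { "type": "command", "command": "uv run /path/to/aegis/hooks/pre_tool_use.py" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "*",
        "hooks": [
          { "type": "command", "command": "uv run /path/to/aegis/hooks/post_tool_use.py" }
        ]
      }
    ]
  }
}
```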

To uninstall: ./uninstall.sh

Demo

Try Aegis with a mock email management skill in 2 minutes:

cd demo
./start.sh

Then open Claude Code in the demo/ directory and try things like "any emails from alice?" or "reply to the cfo letting him know ill have the data by eod". Low-risk actions auto-approve; high-risk ones trigger confirmation. As you give feedback, you can watch Aegis start auto-approving more actions or asking for confirmation more often. See demo/README.md for details.

How It Works

Three independent signals are computed for every tool call and summed into a composite score:

| Signal | Source | What it captures |
| --- | --- | --- |
| S1 — Agent Self-Assessment | `[SENS:...]` in Bash `description` | Claude's own risk rating for this specific call |
| S2 — Skill-Declared Sensitivity | `action-sensitivity` in SKILL.md | The skill author's risk rating per action |
| S3 — Historical User Preferences | Embed + KNN history search + Haiku | Personalized, contextual scoring from your past decisions |

Decision logic:

  • Composite ≥ threshold (default 1.0) → require confirmation
  • Otherwise → auto-approve
| Example | S1 | S2 | S3 | Composite | Decision | Why |
| --- | --- | --- | --- | --- | --- | --- |
| Search emails | 0.0 | 0.0 | 0.0 | 0.0 | allow | Read-only, no risk from any signal |
| Read portfolio | 0.1 | 0.1 | 0.0 | 0.2 | allow | Low-risk read; agent and skill agree |
| Reply to email | 0.2 | 0.5 | 0.3 | 1.0 | ask | Skill and agent say this reply is moderate risk; user has previously indicated long replies are somewhat sensitive |
| Reply to email | 0.1 | 0.5 | 0.1 | 0.7 | allow | Skill says emails are always somewhat risky, but user has previously indicated similar quick replies to friends should be auto-approved |
| Transfer funds | 0.1 | 0.6 | 0.1 | 0.8 | allow | Agent thinks it's routine, skill marks transfers as risky, but user has expressed auto-approving transfers to the bills checking account |
| Transfer funds | 0.1 | 0.6 | 0.8 | 1.5 | ask | Agent thinks it's routine, skill marks transfers as risky, and user has expressed requiring confirmation for transfers away from the savings account |

S3 is skipped when no skill is active (plain Read/Edit/Bash calls) or when S1+S2 already resolve the decision. It requires ANTHROPIC_API_KEY; without it, S3 returns 0.0 and scoring relies on S1+S2 only.
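
The decision logic above can be sketched as follows. This is a minimal illustration assuming the signals and threshold described in this section; the function names are not the actual Aegis internals in hooks/pre_tool_use.py:

```python
# Composite gate sketch: S1 and S2 are cheap, S3 (embed + KNN + Haiku) is
# API-backed, so it is passed as a lazy callable and only invoked when it
# could still change the outcome.
ASK_THRESHOLD = 1.0  # [gate].ask_threshold in aegis.toml

def decide(s1: float, s2: float, s3_fn=None) -> tuple[str, float]:
    """Return ("ask" | "allow", composite score).

    s3_fn is called lazily, so the API-backed S3 signal is skipped when
    S1 + S2 already push the score over the threshold (or when no skill
    is active and s3_fn is None).
    """
    composite = s1 + s2
    if composite < ASK_THRESHOLD and s3_fn is not None:
        composite += s3_fn()  # only pay for S3 when needed
    return ("ask" if composite >= ASK_THRESHOLD else "allow"), composite
```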

```mermaid
flowchart TD
    TC["Tool call"] --> PLAN{"Plan mode?"}
    PLAN -->|yes| ALLOW_PLAN["allow"]
    PLAN -->|no| S["Compute S1 + S2 + S3"]
    S --> T{"composite >= threshold?"}
    T -->|yes| ASK["ASK"]
    T -->|no| ALLOW["ALLOW"]
    ASK --> LOG["Log + index"]
    ALLOW --> LOG
    LOG --> POST["PostToolUse: feedback context"]
```

Making Your Skill Aegis-Compatible

Run the converter on any existing SKILL.md:

./aegis_convert.py path/to/SKILL.md

This analyzes your skill, adds an action-sensitivity map to the frontmatter, and appends a standard Aegis section that tells Claude how to tag its calls. See examples/skills/deploy-manager/ for a before/after comparison on an infrastructure management skill.

You can also do it manually:

1. Add action-sensitivity to your frontmatter:

---
name: email-manager
description: Manage emails via API
action-sensitivity:
  search_emails: 0.0
  read_email: 0.1
  reply_email: 0.5
  send_email: 0.6
  delete_email: 0.6
  block_sender: 0.5
default-sensitivity: 0.5
---

These are custom fields — Claude Code ignores them, and Aegis reads them directly.

2. Add the Aegis section to the skill body (copy-paste as-is):

## Aegis

For every Bash tool call, prefix your `description` field with a sensitivity tag:

`[SENS:<score>|<action_name>|<reason>] <normal description>`

- `score`: 0.0-1.0, how risky this specific call is given its actual parameters
- `action_name`: the matching action name from the list above
- `reason`: what makes this instance more or less risky

Example: `[SENS:0.7|send_email|sending to external recipient] Send partnership proposal`

See demo/.claude/skills/email-manager/SKILL.md for a full working example.
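
On the receiving side, the tag format above can be split apart with a single regular expression. This is an illustrative sketch, not necessarily how the Aegis hook itself parses the prefix:

```python
# Parse the "[SENS:<score>|<action_name>|<reason>] <description>" prefix
# described above from a Bash tool call's description field.
import re

SENS_RE = re.compile(r"^\[SENS:([\d.]+)\|([^|\]]+)\|([^\]]*)\]\s*(.*)$")

def parse_sens_tag(description: str):
    """Return (score, action_name, reason, rest) or None if no tag is present."""
    m = SENS_RE.match(description)
    if not m:
        return None  # untagged call: S1 contributes nothing
    score, action, reason, rest = m.groups()
    return float(score), action, reason, rest
```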

Feedback Loop

Aegis learns from two feedback mechanisms:

Borderline auto-approvals (composite 0.7–0.99): The PostToolUse hook injects context so Claude can mention /flag if the user later indicates the decision was wrong.

/flag command: Report wrong decisions directly. Select which recent actions were incorrectly handled and explain why. This feedback is recorded in history and retrieved by S3 for future scoring of similar actions.

> /flag

Configuration

~/.claude/aegis.toml overrides config/default.toml:

[gate]
ask_threshold = 1.0       # composite sum that triggers "ask"

[signal3]
enabled = true            # requires ANTHROPIC_API_KEY
api_key = "sk-ant-..."
model = "claude-haiku-4-5"

[embeddings]
db_path = "~/.claude/aegis-embeddings.db"
retrieval_k = 10

[history]
path = "~/.claude/aegis-history.jsonl"
max_entries = 10000
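
The `retrieval_k` setting controls how many past decisions S3 pulls from the embeddings database. Conceptually it is a cosine-similarity nearest-neighbour lookup, sketched below with illustrative names (the actual storage and lookup live in hooks/lib/):

```python
# Rank stored (entry, embedding) pairs by cosine similarity to a query
# embedding and keep the top k, as the retrieval_k setting suggests.
import math

def top_k(query: list[float],
          history: list[tuple[str, list[float]]],
          k: int = 10) -> list[tuple[str, list[float]]]:
    """Return the k history entries most similar to the query embedding."""
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0
    ranked = sorted(history, key=lambda item: cosine(query, item[1]), reverse=True)
    return ranked[:k]
```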

Project Structure

aegis/
  hooks/
    pre_tool_use.py         # Signals 1-3, scoring, history logging
    post_tool_use.py        # Feedback context injection
    flag_cli.py             # CLI for /flag skill
    lib/                    # Scoring, skill parsing, embeddings, LLM assessor
  skills/
    flag/SKILL.md           # /flag feedback command
  examples/skills/
    example-trading/        # Demo skill with action-sensitivity
    deploy-manager/         # Before/after aegis_convert.py conversion
  demo/                     # Try Aegis with a mock email skill
  config/default.toml       # Default thresholds
  aegis_convert.py          # Skill migration script
  install.sh / uninstall.sh
  tests/

License

MIT
