Governed action runtime for AI coding agents.
Ships with a 26-agent autonomous development swarm.
AgentGuard intercepts AI agent tool calls, enforces policies and invariants, and produces a verifiable execution trail. Traditional AI safety focuses on model behavior — AgentGuard enforces safety at the execution layer through deterministic governance of every action.
agent proposes action → policy evaluated → invariants checked → allow/deny → execute → events emitted
Install and activate governance in 30 seconds:
# 1. Install AgentGuard
npm install -g @red-codes/agentguard
# 2. (Optional) Install RTK for 60-90% token savings
# Homebrew (macOS/Linux):
brew install rtk
# Quick install (macOS/Linux):
curl -fsSL https://raw.githubusercontent.com/rtk-ai/rtk/refs/heads/master/install.sh | sh
# Windows: download from https://github.com/rtk-ai/rtk/releases (rtk-x86_64-pc-windows-msvc.zip)
# 3. Set up Claude Code hooks
agentguard claude-init
# 4. Check governance status
agentguard status
# ✓ Claude Code hooks
# ✓ Policy file (agentguard.yaml)
# ⚡ Token optimization rtk v0.30.0 (60-90% token savings)

That's it. Start a Claude Code session and every tool call is now governed. Dangerous actions (push to main, write to .env, rm -rf, force push) are blocked before execution.
Try it on your own repo:
# Evaluate a sample action against the default policy
echo '{"tool":"Bash","command":"git push origin main"}' | agentguard guard --dry-run
# Start the runtime with a policy file
agentguard guard --policy agentguard.yaml
# Inspect the last run
agentguard inspect --last

AI coding agents execute file writes, shell commands, and git operations autonomously, but there's no governance layer between what an agent proposes and what actually runs. One bad tool call can push to main, leak secrets, or delete production files.
AgentGuard adds a deterministic decision point between proposal and execution:
- Safety policies — declare what agents can and cannot do in YAML
- Invariant enforcement — 21 built-in checks (secrets, protected branches, blast radius, skill/task protection, package script injection, lockfile integrity, CI/CD config, permission escalation, governance self-modification, container config, environment variables, recursive operations, large file writes, network egress, destructive migrations, transitive effect analysis, IDE socket access) run on every action
- Audit trail — every decision is recorded in structured SQLite, inspectable after the fact
- Session debugging — replay any agent session to see exactly what happened and why
AgentGuard evaluates every agent action through a governed action kernel:
- Normalize — Claude Code tool calls (Bash, Write, Edit, Read) are mapped to canonical action types (shell.exec, file.write, file.read)
- Evaluate — policies match against the action (deny git.push to main, deny destructive commands, enforce scope limits)
- Check invariants — 21 built-in safety checks run on every action
- Execute — if allowed, the action runs via adapters (file, shell, git handlers)
- Emit events — full lifecycle events sunk to SQLite for audit trail
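The stages above can be sketched as one function. This is a minimal illustration under stated assumptions, not the actual kernel: the `Action`, `Rule`, `Invariant`, and `LifecycleEvent` types, the `runKernel` name, and the event kinds are all hypothetical, normalization is assumed to have already happened, and the sketch assumes that an action with no matching rule is allowed.

```typescript
// Hypothetical types for illustration; not the @red-codes/kernel API.
type Action = { type: string; target?: string };
type Rule = { action: string; effect: "allow" | "deny"; reason: string };
type Invariant = { id: string; holds: (action: Action) => boolean };
type LifecycleEvent = { kind: string; detail: string };
type Outcome = { allowed: boolean; events: LifecycleEvent[] };

function runKernel(
  action: Action,
  rules: Rule[],
  invariants: Invariant[],
  execute: (action: Action) => void,
): Outcome {
  const events: LifecycleEvent[] = [{ kind: "proposed", detail: action.type }];

  // Evaluate: a matching deny rule blocks the action
  // (assumption: actions with no matching rule are allowed).
  const denyRule = rules.find((r) => r.action === action.type && r.effect === "deny");
  if (denyRule) events.push({ kind: "policy-denied", detail: denyRule.reason });

  // Check invariants: every invariant runs, and each violation is recorded.
  const violated = invariants.filter((inv) => !inv.holds(action));
  for (const inv of violated) events.push({ kind: "invariant-violated", detail: inv.id });

  const allowed = denyRule === undefined && violated.length === 0;
  if (allowed) {
    execute(action); // Execute: only reached when nothing objected.
    events.push({ kind: "executed", detail: action.type });
  }
  return { allowed, events };
}

// Usage: a push to main is denied by policy and also trips an invariant.
const log: string[] = [];
const outcome = runKernel(
  { type: "git.push", target: "main" },
  [{ action: "git.push", effect: "deny", reason: "Protected branch" }],
  [{ id: "protected-branch", holds: (a) => a.target !== "main" }],
  (a) => { log.push(a.type); },
);
```

Keeping execution behind a single `allowed` flag means an adapter is never invoked for an action that a policy or invariant rejected, which is the property the pipeline is meant to guarantee.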
AgentGuard Runtime Active
policy: agentguard.yaml | invariants: 21 active
✓ file.write src/auth/service.ts
✓ shell.exec npm test
✗ git.push main → DENIED (protect-main)
⚠ invariant violated: protected-branch
Policies are YAML or JSON files that declare what agents can and cannot do:
id: project-policy
name: Project Policy
severity: 4
rules:
  - action: git.push
    effect: deny
    branches: [main, master]
    reason: Protected branch
  - action: git.force-push
    effect: deny
    reason: Force push not allowed
  - action: file.write
    effect: deny
    target: .env
    reason: No secrets modification
  - action: file.read
    effect: allow
    reason: Reading is always safe

Drop an agentguard.yaml in your repo root; the CLI picks it up automatically.
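To make the rule semantics concrete, here is a rough sketch of how rules like these might be matched against an action. It is not the `@red-codes/policy` implementation: the `matches` and `evaluate` functions are hypothetical, and two behaviors are assumptions made for the example (a matching deny wins over a matching allow, and an action with no matching rule is allowed).

```typescript
// Hypothetical shapes mirroring the YAML fields shown above.
type Rule = {
  action: string;
  effect: "allow" | "deny";
  branches?: string[];
  target?: string;
  reason: string;
};
type Action = { type: string; branch?: string; target?: string };
type Decision = { effect: "allow" | "deny"; reason: string };

// A rule matches when the action type agrees and every optional
// constraint the rule declares (branches, target) is satisfied.
function matches(rule: Rule, action: Action): boolean {
  if (rule.action !== action.type) return false;
  if (rule.branches !== undefined) {
    if (action.branch === undefined || rule.branches.indexOf(action.branch) === -1) return false;
  }
  if (rule.target !== undefined && rule.target !== action.target) return false;
  return true;
}

function evaluate(rules: Rule[], action: Action): Decision {
  const matched = rules.filter((r) => matches(r, action));
  // Assumption: deny takes priority over allow among matching rules.
  const deny = matched.find((r) => r.effect === "deny");
  if (deny) return { effect: "deny", reason: deny.reason };
  const allow = matched[0];
  // Assumption: no matching rule means the action is allowed.
  return { effect: "allow", reason: allow ? allow.reason : "no matching rule" };
}

// The example policy above, expressed as data.
const rules: Rule[] = [
  { action: "git.push", effect: "deny", branches: ["main", "master"], reason: "Protected branch" },
  { action: "git.force-push", effect: "deny", reason: "Force push not allowed" },
  { action: "file.write", effect: "deny", target: ".env", reason: "No secrets modification" },
  { action: "file.read", effect: "allow", reason: "Reading is always safe" },
];

const pushToMain = evaluate(rules, { type: "git.push", branch: "main" });
const pushToFeature = evaluate(rules, { type: "git.push", branch: "feature/login" });
```

Under these assumptions `pushToMain` is a deny with reason "Protected branch", while `pushToFeature` is allowed because the `branches` constraint does not match.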
21 safety invariants run on every action evaluation:
| Invariant | Severity | Description |
|---|---|---|
| no-secret-exposure | 5 (critical) | Blocks access to .env, credentials, .pem, .key files |
| no-credential-file-creation | 5 (critical) | Blocks creation or modification of well-known credential files (SSH keys, cloud configs, auth tokens) |
| no-scheduled-task-modification | 5 (critical) | Prevents modification of scheduled task files |
| no-cicd-config-modification | 5 (critical) | Blocks writes to CI/CD pipeline configs (.github/workflows/, .gitlab-ci.yml, Jenkinsfile) |
| no-governance-self-modification | 5 (critical) | Prevents agents from modifying governance config (policy files, governance data) |
| protected-branch | 4 (high) | Prevents direct push to main/master |
| no-force-push | 4 (high) | Forbids force push |
| no-skill-modification | 4 (high) | Prevents modification of .claude/skills/ files |
| no-package-script-injection | 4 (high) | Blocks package.json modifications that alter lifecycle script entries |
| no-permission-escalation | 4 (high) | Catches chmod to world-writable, setuid/setgid, ownership changes |
| blast-radius-limit | 3 (medium) | Enforces file modification limit (default 20) |
| test-before-push | 3 (medium) | Requires tests pass before push |
| large-file-write | 3 (medium) | Enforces per-file size limit to prevent data dumps |
| no-container-config-modification | 3 (medium) | Protects Dockerfile, docker-compose.yml, .dockerignore |
| no-env-var-modification | 3 (medium) | Detects attempts to modify environment variables or shell profile files |
| no-destructive-migration | 3 (medium) | Flags writes to migration directories containing destructive DDL |
| no-network-egress | 4 (high) | Denies HTTP requests to non-allowlisted domains |
| transitive-effect-analysis | 4 (high) | Analyzes written files for downstream effects that would violate policy |
| recursive-operation-guard | 2 (low) | Flags find -exec, xargs combined with write/delete operations |
| lockfile-integrity | 2 (low) | Ensures package.json changes sync with lockfiles |
| no-ide-socket-access | 4 (high) | Blocks access to IDE socket files (vscode-ipc-*.sock) |
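As an illustration of what an invariant looks like in code, the sketch below models two rows of the table (protected-branch and blast-radius-limit) as predicates over a snapshot of system state. The `SystemState` and `Invariant` shapes and the `checkInvariants` function are hypothetical and do not reflect the `@red-codes/invariants` API; only the IDs, severities, and the default limit of 20 come from the table.

```typescript
// Hypothetical snapshot of the state an invariant inspects.
type SystemState = { filesModified: number; pushBranch?: string };
type Invariant = { id: string; severity: number; holds: (state: SystemState) => boolean };

const PROTECTED_BRANCHES = ["main", "master"];
const BLAST_RADIUS_LIMIT = 20; // default from the table above

const invariants: Invariant[] = [
  {
    id: "protected-branch",
    severity: 4,
    // Holds when there is no push, or the push targets an unprotected branch.
    holds: (s) => s.pushBranch === undefined || PROTECTED_BRANCHES.indexOf(s.pushBranch) === -1,
  },
  {
    id: "blast-radius-limit",
    severity: 3,
    holds: (s) => s.filesModified <= BLAST_RADIUS_LIMIT,
  },
];

// Returns the IDs of violated invariants, most severe first.
function checkInvariants(state: SystemState): string[] {
  return invariants
    .filter((inv) => !inv.holds(state))
    .sort((a, b) => b.severity - a.severity)
    .map((inv) => inv.id);
}

const violations = checkInvariants({ filesModified: 25, pushBranch: "main" });
```

Because every invariant is a pure predicate, the whole suite can run on each action without side effects, which fits the description of invariants running on every evaluation.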
AgentGuard integrates with RTK (Rust Token Killer) to reduce token consumption by 60-90% on common CLI output. When RTK is installed, AgentGuard's shell adapter automatically rewrites commands through RTK after governance approval.
# Install RTK (optional but recommended); see the quick start for other platforms
brew install rtk
# AgentGuard detects RTK automatically
agentguard status
# ⚡ Token optimization rtk v0.30.0 (60-90% token savings)

How it works: After the kernel approves a shell command, the shell adapter passes it to rtk rewrite for a token-optimized equivalent. If RTK has a compact version (git, npm, cargo, tsc, docker, kubectl, etc.), it uses that. If not, the original command runs unchanged. This happens transparently, with no configuration needed.
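The rewrite-with-fallback behavior can be sketched as follows. This is an assumption-laden illustration rather than the shell adapter's code: `optimizeCommand` and the `Rewriter` type are hypothetical, the rewriter stands in for an invocation of `rtk rewrite`, and both the `null` return for "no compact version" and the rewritten command strings are made up for the example.

```typescript
// Stand-in for calling `rtk rewrite`; returns null when RTK has no
// compact equivalent (assumed convention for this sketch).
type Rewriter = (command: string) => string | null;

function optimizeCommand(command: string, rewrite: Rewriter, enabled: boolean): string {
  if (!enabled) return command; // e.g. the optimization was switched off
  try {
    const rewritten = rewrite(command);
    return rewritten !== null ? rewritten : command;
  } catch {
    // Assumption: a failing rewriter must never block an approved command.
    return command;
  }
}

// A fake rewriter that only knows about git commands (illustrative output).
const fakeRtk: Rewriter = (command) =>
  command.indexOf("git ") === 0 ? "rtk " + command : null;

const optimized = optimizeCommand("git status", fakeRtk, true);
const untouched = optimizeCommand("make build", fakeRtk, true);
```

The function only ever runs after approval, so the rewrite changes how output is produced, never whether the command is permitted.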
Token savings by category:
| Category | Commands | Savings |
|---|---|---|
| Tests | vitest, playwright, cargo test | 90-99% |
| Build | next, tsc, lint, prettier | 70-87% |
| Git | status, log, diff, add, commit | 59-80% |
| Package managers | pnpm, npm, npx | 70-90% |
| Infrastructure | docker, kubectl | 85% |
Control via environment variable: AGENTGUARD_RTK_ENABLED=true (default) or false to disable.
AgentGuard tracks repeated denials and invariant violations. If an agent repeatedly attempts blocked actions, the runtime escalates to lockdown — all actions denied until a human intervenes. See escalation state machine for the full detail.
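A simplified sketch of the escalation idea is shown below. The real state machine is documented separately and may have more levels; here there are only two states, and the threshold of five denials, the `EscalationState` shape, and the `gate` and `humanReset` functions are all hypothetical.

```typescript
// Hypothetical threshold; the real value is not stated here.
const LOCKDOWN_AFTER_DENIALS = 5;

type EscalationState = { denials: number; lockedDown: boolean };

function initialState(): EscalationState {
  return { denials: 0, lockedDown: false };
}

// Applies the escalation gate to a decision the kernel has already made.
function gate(
  state: EscalationState,
  kernelAllowed: boolean,
): { allowed: boolean; state: EscalationState } {
  // In lockdown every action is denied, even ones the kernel would allow.
  if (state.lockedDown) return { allowed: false, state };
  if (kernelAllowed) return { allowed: true, state };
  const denials = state.denials + 1;
  return {
    allowed: false,
    state: { denials, lockedDown: denials >= LOCKDOWN_AFTER_DENIALS },
  };
}

// Only an explicit human action clears lockdown.
function humanReset(): EscalationState {
  return initialState();
}

// Usage: five denied attempts in a row trigger lockdown.
let current = initialState();
for (let i = 0; i < 5; i++) {
  current = gate(current, false).state;
}
const afterLockdown = gate(current, true);
```

After the loop, `afterLockdown.allowed` is false even though the kernel approved the action, and it stays that way until `humanReset` is called.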
Governance overhead is measured per component using vitest bench. All numbers below are p99 latencies from the benchmark suite (pnpm bench).
| Component | p99 Latency | Description |
|---|---|---|
| Policy evaluation | < 30µs | Single action against a mixed deny/allow policy |
| Invariant check (single) | < 10µs | One invariant against system state |
| Invariant suite (21 checks) | < 300µs | All 21 built-in invariants, clean state |
| Simulation (filesystem) | < 100µs | File write/delete impact prediction |
| Simulation (git) | < 50µs | Branch delete, push impact prediction |
| Full kernel loop | < 5ms | End-to-end: propose → normalize → evaluate → emit |
Key takeaway: The full governance pipeline adds < 5ms of overhead per agent action. Policy evaluation and invariant checking are sub-millisecond. These numbers are enforced by a CI regression gate that fails if any p99 exceeds 50ms.
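For readers unfamiliar with percentile gates, the following sketch shows one way a p99 regression check could work. It is not the project's CI script: `p99` uses the nearest-rank definition, `failingComponents` is a hypothetical helper, and the sample data is invented; only the 50ms limit comes from the text above.

```typescript
// Nearest-rank p99: the smallest sample such that at least 99% of
// samples are less than or equal to it.
function p99(samplesMs: number[]): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  // Integer arithmetic avoids floating-point drift in the rank.
  const rank = Math.ceil((99 * sorted.length) / 100);
  return sorted[rank - 1];
}

// Returns the names of components whose p99 exceeds the limit.
function failingComponents(results: Record<string, number[]>, limitMs: number): string[] {
  return Object.keys(results).filter((name) => p99(results[name]) > limitMs);
}

// Invented sample data for illustration.
const failures = failingComponents(
  {
    "policy-evaluation": [0.01, 0.02, 0.03],
    "full-kernel-loop": [2, 3, 60],
  },
  50,
);
```

A CI job built on this would exit non-zero whenever `failures` is non-empty, which is what turns a benchmark into a regression gate.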
Run benchmarks locally:
pnpm bench # Run all benchmarks (77 cases across 4 suites)
pnpm bench:report # Generate markdown report from results

Install globally: npm install -g @red-codes/agentguard
# === Governance ===
agentguard guard # Start governed action runtime
agentguard guard --policy <file> # Use a specific policy file (YAML/JSON)
agentguard guard --policy a --policy b # Compose multiple policies with precedence
agentguard guard --dry-run # Evaluate without executing actions
agentguard inspect [runId] # Show action graph and decisions for a run
agentguard inspect --last # Inspect most recent run
agentguard events [runId] # Show raw event stream for a run
agentguard analytics # Analyze violation patterns across sessions
agentguard status # Show current governance session status
agentguard audit-verify # Verify tamper-resistant audit chain integrity
# === Setup & Configuration ===
agentguard claude-init # Set up Claude Code hook integration
agentguard auto-setup # Auto-detect AgentGuard and configure Claude Code hooks
agentguard config show|get|set # Manage AgentGuard configuration
agentguard init <type> # Scaffold governance extensions or storage backends
agentguard demo # Interactive governance showcase
# === Adoption & Migration ===
agentguard adoption # Adoption metrics and onboarding status
agentguard learn # Interactive tutorials and learning paths
agentguard migrate # Migrate configuration between versions
# === Replay & Debug ===
agentguard session-viewer [runId] # Generate interactive HTML dashboard
agentguard replay --last # Replay a governance session timeline
agentguard replay --last --step # Step through events interactively
agentguard diff <run1> <run2> # Compare two governance sessions side-by-side
agentguard traces [runId] # Display policy evaluation traces for a run
agentguard simulate <action-json> # Simulate action and show predicted impact
# === Portability & CI ===
agentguard export <runId> # Export a governance session to JSONL
agentguard import <file> # Import a governance session from JSONL
agentguard ci-check <session> # Verify governance session for violations
agentguard evidence-pr # Attach governance evidence summary to a PR
# === Policy ===
agentguard policy validate <file> # Validate a policy file without starting the runtime
agentguard policy-verify <file> # Verify policy file structure and rules
# === Plugins ===
agentguard plugin list # List installed plugins
agentguard plugin install <path> # Install a plugin from a local path
agentguard plugin remove <id> # Remove a plugin by ID
agentguard plugin search [query] # Search for plugins on npm
# === Cloud ===
agentguard cloud connect <api-key> # Connect to AgentGuard Cloud
agentguard cloud status # Check cloud connection status
agentguard cloud events # Query governance events from cloud
agentguard cloud runs # List governance runs from cloud
agentguard cloud summary # Cloud analytics summary
agentguard cloud disconnect # Disconnect from cloud
# === Trust & Telemetry ===
agentguard trust # Manage policy and hook trust verification
agentguard telemetry # Manage telemetry enrollment and settings
agentguard help # Show all commands

AgentGuard integrates with Claude Code via inline hooks, not a separate daemon or background process. When a Claude Code session starts, AgentGuard's hooks fire on every tool call, routing each one through the governance kernel for policy and invariant evaluation before Claude Code executes it.
This design is intentional: no daemon to crash, no ports to manage, no IPC. Each hook invocation is self-contained — load policy, evaluate, respond, exit. If anything fails, the hook exits cleanly and Claude Code continues (fail-open).
agentguard claude-init # Set up Claude Code hooks

Three hooks are installed:
| Hook | Purpose |
|---|---|
| PreToolUse | Governance enforcement: evaluates every tool call against policies and invariants, blocks denied actions |
| PostToolUse | Error monitoring: reports Bash stderr errors (informational only) |
| SessionStart | Build check + governance status display on session start |
How PreToolUse works:
Claude Code tool call → stdin (JSON) → AgentGuard kernel → stdout (deny) or silent (allow)
The kernel runs in evaluation-only mode (dryRun: true) — it checks policies and invariants but doesn't execute actions. Claude Code handles execution; AgentGuard only governs.
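The hook contract described above (JSON in, a deny message or silence out, fail-open on errors) can be sketched as a pure function. This is an illustration only: `handlePreToolUse` and `Verdict` are hypothetical names, and the shape of the deny payload is an assumption, not the format Claude Code or AgentGuard actually uses.

```typescript
type Verdict = { allowed: boolean; reason?: string };

// Returns the text to write to stdout, or null to stay silent (allow).
function handlePreToolUse(
  stdin: string,
  evaluate: (toolCall: unknown) => Verdict,
): string | null {
  try {
    const toolCall: unknown = JSON.parse(stdin);
    const verdict = evaluate(toolCall);
    if (verdict.allowed) return null; // silent allow
    // Hypothetical payload shape for a denial.
    return JSON.stringify({
      decision: "deny",
      reason: verdict.reason !== undefined ? verdict.reason : "denied by policy",
    });
  } catch {
    // Fail-open: any parsing or evaluation error lets the tool call proceed.
    return null;
  }
}

const denied = handlePreToolUse(
  '{"tool":"Bash","command":"git push origin main"}',
  () => ({ allowed: false, reason: "protect-main" }),
);
```

Treating the hook as a function of its input, with no shared state between invocations, is what makes the "no daemon, no ports, no IPC" design workable.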
Tool call mapping:
| Claude Code Tool | AgentGuard Action |
|---|---|
| Write | file.write |
| Edit | file.write |
| Read | file.read |
| Bash | shell.exec (or git.push, git.commit if git command detected) |
| Glob | file.read |
| Grep | file.read |
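The mapping table can be expressed as a small lookup. The sketch below is not the adapter's real code: `toActionType` and `ToolCall` are hypothetical, the `"unknown"` fallback is an assumption, and a simple prefix regex stands in for whatever git command detection the adapter actually performs.

```typescript
type ToolCall = { tool: string; command?: string };

// Direct mappings taken from the table above.
const TOOL_TO_ACTION: Record<string, string> = {
  Write: "file.write",
  Edit: "file.write",
  Read: "file.read",
  Glob: "file.read",
  Grep: "file.read",
};

function toActionType(call: ToolCall): string {
  if (call.tool === "Bash") {
    const command = (call.command !== undefined ? call.command : "").trim();
    // Assumption: prefix matching is a stand-in for real git detection.
    if (/^git\s+push\b/.test(command)) return "git.push";
    if (/^git\s+commit\b/.test(command)) return "git.commit";
    return "shell.exec";
  }
  const mapped = TOOL_TO_ACTION[call.tool];
  return mapped !== undefined ? mapped : "unknown"; // hypothetical fallback
}

const pushAction = toActionType({ tool: "Bash", command: "git push origin main" });
const testAction = toActionType({ tool: "Bash", command: "npm test" });
```

Here `pushAction` is `git.push` and `testAction` is `shell.exec`, so a policy written against `git.push` applies even though the agent issued the push through the generic Bash tool.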
See Hook Architecture for the full design, configuration options, and debugging guide.
Global hook installation (recommended):
For full coverage even when Claude Code starts from a parent directory, install hooks globally:
agentguard claude-init --global # Installs to ~/.claude/settings.json

Global hooks use path-aware policy resolution: they walk up from the target file to find the nearest agentguard.yaml, so governance applies regardless of working directory.
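The walk-up lookup can be sketched as below. This is a simplified illustration, not AgentGuard's resolver: `findNearestPolicy` is a hypothetical name, it handles POSIX-style absolute paths only, and the `exists` callback is injected so the example does not depend on the filesystem.

```typescript
// Walks from the target file's directory toward the root and returns
// the first policy file found, or null if none exists on the way up.
function findNearestPolicy(
  targetFile: string,
  exists: (path: string) => boolean,
  fileName: string = "agentguard.yaml",
): string | null {
  // "/repo/pkg/src/a.ts" -> ["", "repo", "pkg", "src"]
  const parts = targetFile.split("/").slice(0, -1);
  while (parts.length > 0) {
    const candidate = parts.join("/") + "/" + fileName;
    if (exists(candidate)) return candidate;
    parts.pop(); // move one directory up
  }
  return null;
}

// Usage with an in-memory stand-in for the filesystem.
const policyFiles = ["/repo/agentguard.yaml", "/repo/pkg/agentguard.yaml"];
const onDisk = (path: string): boolean => policyFiles.indexOf(path) !== -1;

const nested = findNearestPolicy("/repo/pkg/src/a.ts", onDisk);
const sibling = findNearestPolicy("/repo/lib/b.ts", onDisk);
```

In this example `nested` resolves to the package-level policy and `sibling` to the repo-level one, which shows why the result depends on the target file rather than on the directory Claude Code was started from.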
AgentGuard also supports GitHub Copilot CLI via the same hook pattern:
agentguard copilot-init # Set up Copilot hooks
agentguard copilot-init --global # Global installation
agentguard copilot-init --remove # Remove hooks

Copilot tool names (bash, view, edit, create, glob, grep) are normalized to AgentGuard's canonical action types, and all policies and invariants apply identically.
Connect to AgentGuard Cloud for centralized governance analytics across teams and repos:
agentguard cloud connect <api-key> # Store credentials
agentguard cloud status # Check connection
agentguard cloud events # Query governance events
agentguard cloud runs # List governance runs
agentguard cloud summary # Analytics summary
agentguard cloud disconnect # Remove credentials

Cloud telemetry is opt-in and configured via agentguard cloud connect. Events, runs, and analytics are queryable from the CLI or via the MCP server's cloud tools.
AgentGuard ships with pre-built compliance policy packs:
# In your agentguard.yaml:
extends:
- soc2
- hipaa
- engineering-standards

| Pack | Controls | Description |
|---|---|---|
| soc2 | CC6.1, CC6.6, CC7.1-7.2 | SOC 2 Type II access controls and change management |
| hipaa | 164.312(a)-(e) | HIPAA technical safeguards for PHI protection |
| engineering-standards | — | Balanced dev-friendly guardrails |
| ci-safe | — | Strict CI/CD pipeline protection |
| enterprise | — | Full enterprise governance |
| strict | — | Maximum restriction |
| open-source | — | OSS contribution-friendly defaults |
Policy packs are composable — list multiple in extends and they merge with your local rules taking highest precedence.
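One way to picture "local rules take highest precedence" is as an ordered list of layers. The sketch below is an assumption about how such precedence could work, not the documented merge algorithm: `composeLayers` and `decide` are hypothetical, the first layer with any rule for the action decides, and within a layer a deny wins.

```typescript
type Rule = { action: string; effect: "allow" | "deny"; reason: string };
type Policy = { id: string; rules: Rule[] };

// Layers are ordered from highest to lowest precedence:
// local rules first, then packs in the order listed under `extends`.
function composeLayers(local: Policy, packs: Policy[]): Policy[] {
  return [local].concat(packs);
}

// Assumption: the first layer with any rule for the action decides;
// within that layer a deny wins over an allow.
function decide(layers: Policy[], actionType: string): Rule | undefined {
  for (const layer of layers) {
    const matching = layer.rules.filter((r) => r.action === actionType);
    if (matching.length > 0) {
      const deny = matching.find((r) => r.effect === "deny");
      return deny !== undefined ? deny : matching[0];
    }
  }
  return undefined;
}

// Invented pack contents for illustration.
const soc2Pack: Policy = {
  id: "soc2",
  rules: [
    { action: "file.write", effect: "deny", reason: "pack: change management" },
    { action: "git.push", effect: "deny", reason: "pack: protected branch" },
  ],
};
const localPolicy: Policy = {
  id: "local",
  rules: [{ action: "file.write", effect: "allow", reason: "local override" }],
};

const layers = composeLayers(localPolicy, [soc2Pack]);
const writeDecision = decide(layers, "file.write");
const pushDecision = decide(layers, "git.push");
```

With these invented rules, the local allow overrides the pack's deny for `file.write`, while `git.push` falls through to the pack because the local policy says nothing about it.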
For enhanced secret detection beyond the built-in invariants:
npm install @red-codes/invariant-data-protection

Adds three invariants:
- no-pii-in-logs — Scans log-file writes for emails, SSNs, credit cards, phone numbers
- no-hardcoded-secrets — 3-layer detection: regex patterns → fingerprint matching → Shannon entropy analysis (18+ secret types)
- max-file-count-per-action — Limits batch operations to a configurable file count
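The Shannon entropy layer mentioned for no-hardcoded-secrets measures how unpredictable a string's characters are, in bits per character: H = -Σ p(c) · log2 p(c). The sketch below covers only that layer and is not the plugin's implementation; `looksLikeSecret`, the 4.0-bit threshold, and the 20-character minimum length are hypothetical values chosen for illustration.

```typescript
// Shannon entropy of a string in bits per character.
function shannonEntropy(text: string): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    counts[ch] = (counts[ch] !== undefined ? counts[ch] : 0) + 1;
  }
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / text.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Hypothetical thresholds: long tokens with high entropy are flagged.
function looksLikeSecret(token: string, minBits: number = 4.0, minLength: number = 20): boolean {
  return token.length >= minLength && shannonEntropy(token) >= minBits;
}

const repetitive = looksLikeSecret("aaaaaaaaaaaaaaaaaaaaaaaa");
const randomLooking = looksLikeSecret("q8ZxP2mK7vL0sT4nB9wE1rY6uC3hJ5dA");
```

Entropy alone produces false positives on things like hashes and minified code, which is presumably why it sits last in a pipeline behind regex patterns and fingerprint matching.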
AgentGuard ships with a 26-agent autonomous development swarm — the same one that builds AgentGuard itself. One command scaffolds the entire pipeline into your repo:
agentguard init swarm

This installs 39 skill definitions, governance hooks, and a configurable swarm manifest. Agents handle implementation, code review, CI triage, security audits, planning, docs sync, and more, all under full governance policy enforcement.
ROADMAP.md (you write strategy)
│
├── Planning Agent (daily) ─── reads roadmap, sets priorities
├── Coder Agent (2-hourly) ─── picks issues, implements, creates PRs
├── Code Review Agent (2h) ─── reviews PRs for quality
├── CI Triage Agent (hourly) ─ fixes failing CI
├── PR Merger Agent (2h) ───── auto-merges when gates pass
├── Security Audit (weekly) ── dependency + code scanning
├── Recovery Controller (2h) ─ self-healing, detects unhealthy state
└── ... 19 more agents across 5 tiers
Select which tiers to enable (core, governance, ops, quality, marketing), override cron schedules, and set behavioral thresholds in agentguard-swarm.yaml.
Full documentation: packages/swarm/README.md
Every action proposal, decision, and execution is recorded as JSONL:
.agentguard/events/<runId>.jsonl
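Because JSONL is one JSON object per line, a run's event file is easy to process with a few lines of code. The sketch below counts events by type; the `kind` field and the sample event names are hypothetical, since the actual event schema is defined in the Event Model document.

```typescript
// Counts events per `kind` in a JSONL string (one JSON object per line).
// The `kind` field name is an assumption for this sketch.
function countByKind(jsonl: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank and trailing lines
    const parsed = JSON.parse(line) as { kind?: unknown };
    const kind = typeof parsed.kind === "string" ? parsed.kind : "unknown";
    counts[kind] = (counts[kind] !== undefined ? counts[kind] : 0) + 1;
  }
  return counts;
}

// Invented sample content for illustration.
const sample = [
  '{"kind":"ActionAllowed","action":"file.write"}',
  '{"kind":"ActionDenied","action":"git.push"}',
  '{"kind":"ActionAllowed","action":"shell.exec"}',
  "",
].join("\n");

const summary = countByKind(sample);
```

For the invented sample, `summary` holds two allowed events and one denied event.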
Inspect with:
agentguard inspect --last # Action summary + event stream
agentguard events --last # Raw JSONL to stdout (pipe to jq, etc.)

Agent Tool Call → AgentGuard Kernel → Policy + Invariants → allow / deny
                                                  │
                         ┌────────────────────────┤
                         ▼                        ▼
                 Execution Adapter           Event Stream
                (file, shell, git)       (JSONL audit trail)
Full kernel loop detail: docs/unified-architecture.md
This is a pnpm monorepo orchestrated by Turbo. Workspace packages live in packages/, applications in apps/.
packages/
├── core/src/ # @red-codes/core — Shared types, actions, hash, rng, execution-log
├── events/src/ # @red-codes/events — Canonical event model (schema, bus, store)
├── policy/src/ # @red-codes/policy — Policy evaluation, YAML/JSON loaders, composition
├── invariants/src/ # @red-codes/invariants — 21 built-in invariant definitions + checker
├── invariant-data-protection/src/ # @red-codes/invariant-data-protection — Data protection invariant plugin
├── kernel/src/ # @red-codes/kernel — Governed action kernel (orchestrator, AAB, decisions, simulation)
├── adapters/src/ # @red-codes/adapters — Execution adapters (file, shell, git, claude-code)
├── matchers/src/ # @red-codes/matchers — Structured matchers (Aho-Corasick, globs, sets)
├── storage/src/ # @red-codes/storage — SQLite storage backend (opt-in)
├── telemetry/src/ # @red-codes/telemetry — Runtime telemetry and logging
├── plugins/src/ # @red-codes/plugins — Plugin ecosystem (discovery, registry, sandboxing)
├── renderers/src/ # @red-codes/renderers — Renderer plugin system (TUI renderer)
├── swarm/src/ # @red-codes/swarm — Shareable agent swarm templates
└── telemetry-client/src/ # @red-codes/telemetry-client — Telemetry client (identity, signing, queue, sender)
apps/
├── cli/src/ # @red-codes/agentguard — CLI (published npm package)
│ ├── bin.ts # CLI entry point
│ ├── evidence-summary.ts # Evidence summary generator for PR reports
│ └── commands/ # guard, inspect, replay, export, import, simulate, ci-check, cloud, etc.
├── mcp-server/src/ # @red-codes/mcp-server — MCP governance server (14 governance tools)
├── vscode-extension/src/ # agentguard-vscode — VS Code extension
│ ├── extension.ts # Sidebar panels, file watcher, notifications
│ ├── providers/ # Tree data providers (run status, run history, recent events)
│ └── services/ # Event reader, notification formatter, diagnostics, violation mapper
crates/
└── kernel-core/ # Rust kernel (in development)
policies/ # Policy packs (YAML: ci-safe, engineering-standards, enterprise, hipaa, open-source, soc2, strict)
AgentGuard is published as three packages on npm:
| Package | Description | Install |
|---|---|---|
| @red-codes/agentguard | CLI (the primary install for end users) | npm install -g @red-codes/agentguard |
| @red-codes/core | Shared types, action definitions, utilities | npm install @red-codes/core |
| @red-codes/events | Canonical event model (schema, bus, store) | npm install @red-codes/events |
Building integrations? Install the library packages for typed access to AgentGuard's action model and event system:
npm install @red-codes/core @red-codes/events

Work has started on a Rust implementation of the governance kernel in crates/kernel-core/. The long-term goal is to replace the TypeScript kernel with a native binary for lower latency and smaller footprint. The Rust kernel is not yet functional; the TypeScript kernel remains the production implementation.
git clone https://github.com/AgentGuardHQ/agentguard.git
cd agentguard
pnpm install # Install dependencies
pnpm build # Build all packages (turbo build)
pnpm test # Run all tests (turbo test)

| Document | Description |
|---|---|
| AgentGuard Spec | Governance runtime specification |
| Architecture | Governed action kernel model |
| Hook Architecture | Claude Code hook integration design |
| Agent Swarm | 26-agent autonomous development swarm |
| Roadmap | Technical roadmap and next steps |
| Event Model | Canonical event schema |
| Plugin API | Event sources and extension points |
| Contributing | How to contribute |
Found a bug, have a feature request, or want to contribute? Open an issue at:
github.com/AgentGuardHQ/agentguard/issues
Contributions are welcome -- see CONTRIBUTING.md for guidelines on submitting pull requests, writing invariants, creating policy packs, and building adapters.