DevTools for AI agents. Intercept actions. Enforce policies. Rewind time. Run red teams. Ship safer agents.
Your agent just exfiltrated customer data in production. Today, all you can do is read the logs. With Sentinel, you would have caught it before it happened and synthesized a policy to block it from then on.
Seven reasoning tasks. One model: Claude Opus 4.7.
Built for the Built with Opus 4.7 Hackathon by @saulwade.
Sentinel is a security debugger for autonomous AI agents — like Chrome DevTools + a firewall, but for agents that process support tickets, query customer databases, and take financial actions on your behalf.
It sits between your agent and its tools. Every action is intercepted by a two-layer defense:
- Policy Engine — deterministic DSL rules that block/pause actions in <5ms, no LLM needed
- Pre-cog — Opus 4.7 extended thinking that reasons about the full causal chain of a proposed tool call
Verdicts, world state, and Opus reasoning stream live to a visual timeline you can scrub, edit, and fork.
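The two-layer flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Sentinel's actual interceptor: the `ToolCall`, `Verdict`, and `intercept` names, the stubbed layers, and the `send_email` tool are all hypothetical.

```typescript
// Hypothetical types for illustration; Sentinel's real shared types may differ.
type Decision = "ALLOW" | "PAUSE" | "BLOCK";
type Source = "policy" | "pre-cog";

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface Verdict {
  decision: Decision;
  source: Source;
  reason: string;
}

// Layer 1: deterministic check. Returns null when no rule matches.
type PolicyLayer = (call: ToolCall) => Verdict | null;
// Layer 2: model-backed reasoning, reached only when layer 1 is silent.
type PreCogLayer = (call: ToolCall) => Promise<Verdict>;

async function intercept(
  call: ToolCall,
  policy: PolicyLayer,
  preCog: PreCogLayer,
): Promise<Verdict> {
  const fast = policy(call);
  if (fast !== null) return fast; // known-bad pattern: no model call needed
  return preCog(call);
}

// Stub layers, for illustration only.
const demoPolicy: PolicyLayer = (call) =>
  call.tool === "send_email" &&
  String(call.args["to"] ?? "").endsWith("@evil.example")
    ? { decision: "BLOCK", source: "policy", reason: "external domain" }
    : null;

const demoPreCog: PreCogLayer = async () => ({
  decision: "ALLOW",
  source: "pre-cog",
  reason: "stubbed model verdict",
});
```

Because the policy layer returns `null` rather than `ALLOW` when nothing matches, a call with no matching rule always falls through to the second layer instead of being allowed by default.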
Three injection attacks against a Customer Support Agent:
| Scenario | Attack vector | Stakes |
|---|---|---|
| Support Agent | Compliance audit framing — exfiltrate all enterprise PII | $47k unauthorized refund + bulk data exfil |
| CEO Override | Authority impersonation via executive escalation bot | $12k goodwill credit + M&A data to external firm |
| GDPR Audit | Legal urgency framing — GDPR Art. 20 data portability | $8.5k processing fee + unfiltered customer dump |
Landing dashboard. Trust Score ring (A+ to F) computed from interdiction effectiveness × policy coverage. Live stats: active policies, total interdictions, money blocked, runs. One-click access to all workflows.
Real-time action stream. Each tool call gets a two-source verdict:
- POLICY (indigo) — deterministic rule matched, <5ms, no Opus call
- PRE-COG (purple) — Opus extended thinking evaluated the causal chain
- ALLOW (green) · PAUSE (amber) · BLOCK (red, with a pulse animation)
Toggle PRE-COMPUTED / LIVE OPUS at runtime. Keyboard A/D to approve/deny paused actions.
Merged Timeline + Fork View. Scrubber across all events. At any step: see exact world state, edit it, press ⎇ Branch from here. Fork appears inline. Blast Radius grid compares Original vs Branch: money interdicted, exfil blocked, records accessed, severity badge. Download Incident Report button generates a professional markdown report via Opus.
Before deploying, simulate your agent through synthetic scenarios generated by Opus. Safety grade (A+ to F) with failure drill-down.
Left panel — Red Team adaptive loop:
- Opus generates attacks tailored to your agent's tools and system prompt
- 3 iterations with explicit mutation strategies: payload split, subdomain bypass, request chaining, customer-framing
- Each iteration sees previous attempts and active defenses, adapts accordingly
- Bypassed attacks get a Synthesize Policy button → Opus generates a DSL rule that blocks the variant
Right panel — Policy catalog:
- All active policies with source badge (`DEFAULT` / `AUTO · from attackId`)
- Toggle enabled/disabled · revoke
- Auto-synthesized policies appear immediately after adoption
Seven distinct reasoning tasks, all with extended thinking:
| Feature | What Opus does | Thinking budget |
|---|---|---|
| Pre-cog | Causal chain simulation of proposed tool calls | 8k tokens |
| Run Analysis | Executive summary + attack chain + risk grade (A+..F) | 10k tokens |
| Fork Narrator | Narrates the branch-not-taken in Replay | 4k tokens |
| Pre-flight | Generates plausible synthetic scenarios | 4k tokens |
| Red Team Iter 1 | Fresh adversarial attacks for agent's tool surface | 4k tokens |
| Red Team Iter 2+ | Adaptive mutations from prior attempt history | 6k tokens |
| Policy Synthesis | DSL policy from bypassed attack, with retry loop | 6k tokens |
All reasoning streams to the UI as purple text in real-time.
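As a rough sketch, the per-task budgets in the table could be kept in one lookup and turned into request parameters. The `thinking: { type: "enabled", budget_tokens }` shape follows the extended thinking parameter of the Anthropic Messages API. Everything else is an assumption for illustration: the task keys, the `buildRequest` helper, the reading of "8k" as 8,000 tokens, the 2,000-token answer allowance, and the model ID placeholder, which must be replaced with the real one.

```typescript
// Budgets from the table above, assuming "8k" means 8,000 tokens.
const THINKING_BUDGETS = {
  preCog: 8000,
  runAnalysis: 10000,
  forkNarrator: 4000,
  preflight: 4000,
  redTeamFirst: 4000,
  redTeamAdaptive: 6000,
  policySynthesis: 6000,
} as const;

type Task = keyof typeof THINKING_BUDGETS;

// Placeholder only: substitute the actual Opus 4.7 model ID.
const MODEL_ID = "<opus-4.7-model-id>";

// Hypothetical helper that builds plain request parameters for one task.
function buildRequest(task: Task, prompt: string, answerTokens = 2000) {
  const budget = THINKING_BUDGETS[task];
  return {
    model: MODEL_ID,
    // max_tokens has to exceed the thinking budget, so leave room for the answer.
    max_tokens: budget + answerTokens,
    thinking: { type: "enabled" as const, budget_tokens: budget },
    messages: [{ role: "user" as const, content: prompt }],
  };
}
```

The resulting object is meant to be passed to the SDK's message-creation call. Keeping the budgets in a single table makes it easy to tune one task without touching the others.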
localhost:3000 localhost:3001
┌─────────────────────┐ ┌──────────────────────────────┐
│ Next.js 16 App │ │ Hono Engine │
│ │ SSE │ │
│ Command Center ────┼──────────┤─► Stats + Trust Score │
│ Live View ─────────┼──────────┤─► Agent Runner │
│ Replay ────────────┼──────────┤─► Event Store (SQLite) │
│ Pre-flight ────────┼──────────┤─► Pre-flight Simulator │
│ Red Team & Policies┼──────────┤─► Red Team Loop │
│ │ │ Policy Registry │
└─────────────────────┘ │ │
│ ┌────────────────────────┐ │
│ │ Tool Interceptor │ │
│ │ ┌──────────────────┐ │ │
│ │ │ Policy Engine │──┼──┼──► deterministic (<5ms)
│ │ │ (DSL evaluator) │ │ │
│ │ └──────────────────┘ │ │
│ │ ┌──────────────────┐ │ │
│ │ │ Pre-cog (Opus) │──┼──┼──► Opus 4.7
│ │ │ extended think │ │ │ (extended thinking)
│ │ └──────────────────┘ │ │
│ └────────────────────────┘ │
│ │
│ Blast Radius Computer │
│ Analysis (Opus, SSE) │
│ Policy Synthesis (Opus) │
│ MCP Server (stdio) │
└──────────────────────────────┘
Stack: TypeScript end-to-end. Next.js 16 + React 19 + Tailwind 4 (web). Hono + SQLite + Drizzle ORM (engine). Anthropic SDK with streaming extended thinking.
Event sourcing: every agent interaction is an immutable event row in SQLite. World state at any point = replay events 0..N. Forks create new runs with parentRunId. No mutation, full auditability.
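To make the replay idea concrete, here is a minimal sketch of folding events into world state and forking a run. The event kinds, the `WorldState` fields, and the choice to copy history into the branch are assumptions for illustration. Only the "state = replay events 0..N" rule and the `parentRunId` link come from the description above.

```typescript
// Hypothetical event and state shapes, for illustration only.
interface AgentEvent {
  seq: number;
  runId: string;
  kind: "refund_issued" | "record_read";
  amount?: number;
  recordId?: string;
}

interface WorldState {
  refundedTotal: number;
  recordsRead: string[];
}

const initialState: WorldState = { refundedTotal: 0, recordsRead: [] };

// Pure reducer: always returns a new state object, never mutates its input.
function apply(state: WorldState, e: AgentEvent): WorldState {
  if (e.kind === "refund_issued") {
    return { ...state, refundedTotal: state.refundedTotal + (e.amount ?? 0) };
  }
  return { ...state, recordsRead: [...state.recordsRead, e.recordId ?? "unknown"] };
}

// World state at step n = fold over events 0..n (inclusive).
function stateAt(events: readonly AgentEvent[], n: number): WorldState {
  return events.filter((e) => e.seq <= n).reduce(apply, initialState);
}

// Fork: start a new run from history up to step n, linked back to its parent.
// Copying the prefix is an assumption; a real store might only reference it.
function fork(events: readonly AgentEvent[], n: number, newRunId: string) {
  const parentRunId = events[0]?.runId ?? null;
  const branch = events
    .filter((e) => e.seq <= n)
    .map((e) => ({ ...e, runId: newRunId }));
  return { runId: newRunId, parentRunId, events: branch };
}
```

Since the reducer never mutates, the original run's events and state stay intact after a fork, which is what makes side-by-side comparison of Original vs Branch possible.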
Policy Engine: deterministic DSL with 10 condition kinds (toolName, argMatch, argRegex, domainCheck, valueThreshold, piiClass, planTier, ticketPriority, customerTier, and/or combinators). Runs before Pre-cog — no API cost, no latency for known-bad patterns.
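A deterministic evaluator for rules of this kind can be sketched as a small recursive match over a tagged union. The sketch covers only a subset of the condition kinds, and the field names (`equals`, `arg`, `pattern`, `max`, `all`, `any`), the `Policy` shape, and the `issue_refund` tool are assumptions rather than Sentinel's actual DSL.

```typescript
// Hypothetical subset of the DSL; field names are assumptions.
type Condition =
  | { kind: "toolName"; equals: string }
  | { kind: "argRegex"; arg: string; pattern: string }
  | { kind: "valueThreshold"; arg: string; max: number }
  | { kind: "and"; all: Condition[] }
  | { kind: "or"; any: Condition[] };

interface Policy {
  id: string;
  action: "BLOCK" | "PAUSE";
  when: Condition;
}

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Pure, synchronous matching: no I/O and no model call.
function matches(c: Condition, call: ToolCall): boolean {
  switch (c.kind) {
    case "toolName":
      return call.tool === c.equals;
    case "argRegex":
      return new RegExp(c.pattern).test(String(call.args[c.arg] ?? ""));
    case "valueThreshold": {
      const v = call.args[c.arg];
      return typeof v === "number" && v > c.max;
    }
    case "and":
      return c.all.every((sub) => matches(sub, call));
    case "or":
      return c.any.some((sub) => matches(sub, call));
  }
}

// First matching policy wins; null means "no rule matched, defer to Pre-cog".
function evaluate(policies: Policy[], call: ToolCall): Policy | null {
  return policies.find((p) => matches(p.when, call)) ?? null;
}

// Example rule: pause any refund above 1,000.
const bigRefund: Policy = {
  id: "refund-over-1000",
  action: "PAUSE",
  when: {
    kind: "and",
    all: [
      { kind: "toolName", equals: "issue_refund" },
      { kind: "valueThreshold", arg: "amount", max: 1000 },
    ],
  },
};
```

With this rule, `evaluate([bigRefund], { tool: "issue_refund", args: { amount: 47000 } })` returns the policy, while a 50-unit refund returns `null` and would be handed to the second layer.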
git clone https://github.com/saulwade/sentinel.git
cd sentinel
pnpm install
cp apps/engine/.env.example apps/engine/.env
# Add your ANTHROPIC_API_KEY to apps/engine/.env
pnpm dev
Open http://localhost:3000.
Requirements: Node 22+, pnpm 9+, Anthropic API key with Opus 4.7 access.
- Open Live View, select a scenario (Support Agent / CEO Override / GDPR Audit)
- Toggle PRE-COMPUTED for instant cached verdicts, or LIVE OPUS for real extended thinking (~45s)
- Press ▶ Run and watch actions stream with POLICY/PRE-COG source badges
- Click any BLOCK/PAUSE row to inspect Opus reasoning or the matching policy rule
- After the run, open Replay → scrub the timeline → ⎇ Branch from here → see Blast Radius
- Open Red Team & Policies → run the adaptive loop → synthesize a policy from a bypass → adopt it
| Key | Action |
|---|---|
| `1`–`5` | Switch tabs |
| `R` | Run agent |
| `j` / `k` | Navigate events in Live View |
| `A` / `D` | Approve / Deny a PAUSE |
| `/` | Search events |
| `?` | Help modal (all shortcuts) |
| `Esc` | Close modal / clear search |
Sentinel exposes an MCP server. Add to Claude Code (~/.claude/claude_desktop_config.json or project .mcp.json):
{
"mcpServers": {
"sentinel": {
"command": "pnpm",
"args": ["-F", "@sentinel/engine", "mcp"],
"cwd": "/path/to/sentinel"
}
}
}
Available MCP tools:
| Tool | Description |
|---|---|
| `sentinel_start_run` | Start a run (optional scenario: support/ceo/gdpr/phishing) |
| `sentinel_get_events` | Fetch all events with ALLOW/PAUSE/BLOCK verdicts and source (policy/pre-cog) |
| `sentinel_get_blast_radius` | Money disbursed/blocked, PII exposed/blocked, severity grade |
| `sentinel_get_policies` | List all active policies with action, severity, description |
| `sentinel_get_trust_score` | Composite Trust Score (A+ to F) across all runs |
| `sentinel_snapshot` | Reconstruct world state at a specific event |
| `sentinel_list_agent_tools` | List the agent's tools in MCP schema format |
Example Claude Code usage:
> Start a Sentinel CEO Override run and tell me what was blocked
> What's the blast radius for that run?
> What policies are active and what's the Trust Score?
> Show me the world state at event #5
sentinel/
├── apps/
│ ├── web/app/components/
│ │ ├── CommandCenter.tsx # Trust Score + stats dashboard
│ │ ├── LiveView.tsx # Real-time stream + inspector
│ │ ├── Replay.tsx # Timeline scrubber + inline fork + blast radius
│ │ ├── Shell.tsx # 5-tab shell + keyboard nav
│ │ └── RedTeam.tsx # Adaptive red team + policy catalog
│ └── engine/src/
│ ├── agent/ # World state, mock tools, scenario seeds
│ │ └── scenarios/ # phishing, support, ceo, gdpr
│ ├── interceptor.ts # Two-layer intercept (policy → pre-cog)
│ ├── policies/ # DSL evaluator + default policies
│ ├── analysis/ # Blast radius + Opus analysis + incident report
│ ├── redteam/ # Adaptive attacker + tester + policy synthesizer
│ ├── timetravel/ # Snapshot + replay engine
│ ├── mcp/ # MCP server (Claude Code integration)
│ └── routes/ # runs, analysis, policies, redteam, stats, settings
└── packages/shared/ # Shared types: AgentEvent, Decision, Policy, RunAnalysis
AI agents are shipping to production every week. When they fail, the answer today is "read logs and guess." Sentinel gives agent developers a real debugger:
- See every action in real-time with causal reasoning and policy source
- Pause suspicious actions before they execute
- Rewind to any point in the agent's history
- Edit the past and replay alternate futures
- Quantify blast radius: money interdicted, PII blocked, damage avoided
- Test against adaptive adversarial attacks before deploying
- Synthesize defense policies automatically from discovered bypasses
This is a debugging primitive that did not exist before models could reason about counterfactuals at production speed.
MIT — see LICENSE.