Sentinel

DevTools for AI agents. Intercept actions. Enforce policies. Rewind time. Run red teams. Ship safer agents.

Your agent just exfiltrated customer data in production. Today, you read logs after the fact. With Sentinel, you catch it before it happens, then synthesize a policy that blocks the same attack from then on.

Seven reasoning tasks. One model: Claude Opus 4.7.

Built for the "Built with Opus 4.7" Hackathon by @saulwade.

What it does

Sentinel is a security debugger for autonomous AI agents — like Chrome DevTools + a firewall, but for agents that process support tickets, query customer databases, and take financial actions on your behalf.

It sits between your agent and its tools. Every action is intercepted by a two-layer defense:

Policy Engine — deterministic DSL rules that block/pause actions in <5ms, no LLM needed
Pre-cog — Opus 4.7 extended thinking that reasons about the full causal chain of a proposed tool call

Verdicts, world state, and Opus reasoning stream live to a visual timeline you can scrub, edit, and fork.
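The two-layer flow can be sketched roughly as below. This is a minimal illustration, not Sentinel's actual interceptor: the type names, the `intercept` function, the tool names (`issue_refund`, `lookup_ticket`), and the 10,000 threshold are all assumptions made for the example, and the Opus layer is replaced by a stub.

```typescript
// Hypothetical types; the real Sentinel interfaces may differ.
type Verdict = "ALLOW" | "PAUSE" | "BLOCK";
type Source = "policy" | "pre-cog";

interface ToolCall {
  toolName: string;
  args: Record<string, unknown>;
}

interface Decision {
  verdict: Verdict;
  source: Source;
  reason: string;
}

// A policy check returns a decision when it matches, or null when it has no opinion.
type PolicyCheck = (call: ToolCall) => Decision | null;
// Stand-in for the Opus-backed reasoning layer.
type PreCog = (call: ToolCall) => Promise<Decision>;

async function intercept(
  call: ToolCall,
  policies: PolicyCheck[],
  preCog: PreCog,
): Promise<Decision> {
  // Layer 1: deterministic rules. The first match wins and skips the model call.
  for (const check of policies) {
    const decision = check(call);
    if (decision !== null) return decision;
  }
  // Layer 2: no rule matched, so defer to the reasoning layer.
  return preCog(call);
}

// Example policy (assumed): block refunds above 10,000.
const blockLargeRefund: PolicyCheck = (call) => {
  const amount = call.args["amount"];
  if (call.toolName === "issue_refund" && typeof amount === "number" && amount > 10000) {
    return { verdict: "BLOCK", source: "policy", reason: "refund exceeds threshold" };
  }
  return null;
};

// Stubbed pre-cog that allows everything, for illustration only.
const stubPreCog: PreCog = async () => ({
  verdict: "ALLOW",
  source: "pre-cog",
  reason: "stub: no model call made",
});
```

With these stubs, a 47,000 refund is decided by the policy layer and never reaches `stubPreCog`, while an unmatched call such as `lookup_ticket` falls through to it.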

The demo scenarios

Three injection attacks against a Customer Support Agent:

Scenario	Attack vector	Stakes
Support Agent	Compliance audit framing — exfiltrate all enterprise PII	$47k unauthorized refund + bulk data exfil
CEO Override	Authority impersonation via executive escalation bot	$12k goodwill credit + M&A data to external firm
GDPR Audit	Legal urgency framing — GDPR Art. 20 data portability	$8.5k processing fee + unfiltered customer dump

The five tabs

1. Command Center

Landing dashboard. Trust Score ring (A+ to F) computed from interdiction effectiveness × policy coverage. Live stats: active policies, total interdictions, money blocked, runs. One-click access to all workflows.
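As a rough illustration of how a product of two ratios could map onto a letter grade, here is a small sketch. Only the "effectiveness × coverage" product and the A+ to F range come from the description above; the intermediate grades, the cutoffs, and the `trustScore` function name are assumptions and may not match the engine.

```typescript
type Grade = "A+" | "A" | "B" | "C" | "D" | "F";

// Hypothetical cutoffs, chosen only for illustration (highest first).
const GRADE_CUTOFFS: ReadonlyArray<readonly [number, Grade]> = [
  [0.95, "A+"],
  [0.85, "A"],
  [0.7, "B"],
  [0.55, "C"],
  [0.4, "D"],
];

// Keep inputs inside [0, 1] so the product stays in range.
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// effectiveness: share of attacks interdicted; coverage: share of tool surface under policy.
function trustScore(effectiveness: number, coverage: number): { score: number; grade: Grade } {
  const score = clamp01(effectiveness) * clamp01(coverage);
  for (const [cutoff, grade] of GRADE_CUTOFFS) {
    if (score >= cutoff) return { score, grade };
  }
  return { score, grade: "F" };
}
```

Because the two factors are multiplied, a weak value in either one drags the grade down: under these assumed cutoffs, strong interdiction with only half the tool surface covered still lands near the bottom of the scale.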

2. Live View

Real-time action stream. Each tool call gets a two-source verdict:

POLICY (indigo) — deterministic rule matched, <5ms, no Opus call
PRE-COG (purple) — Opus extended thinking evaluated the causal chain
ALLOW (green) · PAUSE (amber) · BLOCK (red, red pulse animation)

Toggle PRE-COMPUTED / LIVE OPUS at runtime. Keyboard A/D to approve/deny paused actions.

3. Replay

Merged Timeline + Fork View. Scrubber across all events. At any step: see exact world state, edit it, press ⎇ Branch from here. Fork appears inline. Blast Radius grid compares Original vs Branch: money interdicted, exfil blocked, records accessed, severity badge. Download Incident Report button generates a professional markdown report via Opus.

4. Pre-flight Simulator

Before deploying, simulate your agent through synthetic scenarios generated by Opus. Safety grade (A+ to F) with failure drill-down.

5. Red Team & Policies

Left panel — Red Team adaptive loop:

Opus generates attacks tailored to your agent's tools and system prompt
3 iterations with explicit mutation strategies: payload split, subdomain bypass, request chaining, customer-framing
Each iteration sees previous attempts and active defenses, adapts accordingly
Bypassed attacks get a Synthesize Policy button → Opus generates a DSL rule that blocks the variant
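The adaptive loop described in this list can be sketched as follows. The generator and tester are injected as plain functions so the control flow is visible without any model call; all names here (`Attack`, `Attempt`, `redTeamLoop`, and so on) are hypothetical and not taken from the Sentinel codebase.

```typescript
interface Attack {
  id: string;
  strategy: string; // e.g. "payload split", "request chaining"
  payload: string;
}

interface Attempt {
  attack: Attack;
  bypassed: boolean; // true when the attack got past the active defenses
}

// Stand-in for the Opus-backed attack generator.
type Generate = (
  iteration: number,
  history: readonly Attempt[],
  defenses: readonly string[],
) => Promise<Attack[]>;

// Stand-in for running an attack against the agent; resolves to true on bypass.
type TestAttack = (attack: Attack) => Promise<boolean>;

async function redTeamLoop(
  generate: Generate,
  test: TestAttack,
  defenses: readonly string[],
  iterations = 3,
): Promise<Attempt[]> {
  const history: Attempt[] = [];
  for (let i = 1; i <= iterations; i++) {
    // Each iteration sees a snapshot of all earlier attempts plus the active defenses.
    const attacks = await generate(i, [...history], defenses);
    for (const attack of attacks) {
      history.push({ attack, bypassed: await test(attack) });
    }
  }
  return history;
}
```

Attempts with `bypassed: true` are the ones that would be offered for policy synthesis.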

Right panel — Policy catalog:

All active policies with source badge (DEFAULT / AUTO · from attackId)
Toggle enabled/disabled · revoke
Auto-synthesized policies appear immediately after adoption

How Opus 4.7 is used

Seven distinct reasoning tasks, all with extended thinking:

Feature	What Opus does	Thinking budget
Pre-cog	Causal chain simulation of proposed tool calls	8k tokens
Run Analysis	Executive summary + attack chain + risk grade (A+..F)	10k tokens
Fork Narrator	Narrates the branch-not-taken in Replay	4k tokens
Pre-flight	Generates plausible synthetic scenarios	4k tokens
Red Team Iter 1	Fresh adversarial attacks for agent's tool surface	4k tokens
Red Team Iter 2+	Adaptive mutations from prior attempt history	6k tokens
Policy Synthesis	DSL policy from bypassed attack, with retry loop	6k tokens

All reasoning streams to the UI as purple text in real time.
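One way to keep the per-task budgets from the table in a single place is a lookup that builds the request parameters. The sketch below assumes the `thinking: { type: "enabled", budget_tokens }` shape documented for extended thinking in the Anthropic Messages API, where `max_tokens` must be larger than the thinking budget. The task keys, `MODEL_ID` placeholder, `ANSWER_TOKENS` value, and `buildRequest` helper are assumptions for illustration; whether Sentinel builds its requests this way is not stated here.

```typescript
type ReasoningTask =
  | "pre-cog"
  | "run-analysis"
  | "fork-narrator"
  | "pre-flight"
  | "red-team-iter-1"
  | "red-team-iter-2-plus"
  | "policy-synthesis";

// Budgets copied from the table above, in tokens.
const THINKING_BUDGET: Record<ReasoningTask, number> = {
  "pre-cog": 8000,
  "run-analysis": 10000,
  "fork-narrator": 4000,
  "pre-flight": 4000,
  "red-team-iter-1": 4000,
  "red-team-iter-2-plus": 6000,
  "policy-synthesis": 6000,
};

interface ThinkingRequest {
  model: string;
  max_tokens: number;
  thinking: { type: "enabled"; budget_tokens: number };
  messages: Array<{ role: "user"; content: string }>;
}

// Placeholder only; substitute the model ID your API key has access to.
const MODEL_ID = "<opus-model-id>";
// Hypothetical headroom reserved for the visible answer.
const ANSWER_TOKENS = 2000;

function buildRequest(task: ReasoningTask, prompt: string): ThinkingRequest {
  const budget = THINKING_BUDGET[task];
  return {
    model: MODEL_ID,
    // max_tokens covers thinking plus the answer, so it must exceed the budget.
    max_tokens: budget + ANSWER_TOKENS,
    thinking: { type: "enabled", budget_tokens: budget },
    messages: [{ role: "user", content: prompt }],
  };
}
```

The resulting object could then be passed to a streaming Messages call, with the thinking deltas forwarded to the UI.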

Architecture

                    localhost:3000                    localhost:3001
               ┌─────────────────────┐          ┌──────────────────────────────┐
               │   Next.js 16 App    │          │        Hono Engine           │
               │                     │   SSE    │                              │
               │  Command Center ────┼──────────┤─► Stats + Trust Score        │
               │  Live View ─────────┼──────────┤─► Agent Runner               │
               │  Replay ────────────┼──────────┤─► Event Store (SQLite)       │
               │  Pre-flight ────────┼──────────┤─► Pre-flight Simulator       │
               │  Red Team & Policies┼──────────┤─► Red Team Loop              │
               │                     │          │   Policy Registry            │
               └─────────────────────┘          │                              │
                                                │  ┌────────────────────────┐  │
                                                │  │    Tool Interceptor    │  │
                                                │  │  ┌──────────────────┐  │  │
                                                │  │  │  Policy Engine   │──┼──┼──► deterministic (<5ms)
                                                │  │  │  (DSL evaluator) │  │  │
                                                │  │  └──────────────────┘  │  │
                                                │  │  ┌──────────────────┐  │  │
                                                │  │  │  Pre-cog (Opus)  │──┼──┼──► Opus 4.7
                                                │  │  │  extended think  │  │  │   (extended thinking)
                                                │  │  └──────────────────┘  │  │
                                                │  └────────────────────────┘  │
                                                │                              │
                                                │  Blast Radius Computer       │
                                                │  Analysis (Opus, SSE)        │
                                                │  Policy Synthesis (Opus)     │
                                                │  MCP Server (stdio)          │
                                                └──────────────────────────────┘

Stack: TypeScript end-to-end. Next.js 16 + React 19 + Tailwind 4 (web). Hono + SQLite + Drizzle ORM (engine). Anthropic SDK with streaming extended thinking.

Event sourcing: every agent interaction is an immutable event row in SQLite. World state at any point = replay events 0..N. Forks create new runs with parentRunId. No mutation, full auditability.
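The "state = replay events 0..N" idea is a fold over the event log. A minimal sketch follows, assuming a toy world state and two made-up event kinds; the real `AgentEvent` type in packages/shared is not shown here and will differ. `DemoEvent`, `stateAt`, and `forkAt` are hypothetical names.

```typescript
interface WorldState {
  refundsByCustomer: Record<string, number>;
  recordsAccessed: number;
}

// Toy event shapes for illustration; not the real Sentinel schema.
type DemoEvent =
  | { seq: number; kind: "refund_issued"; customerId: string; amount: number }
  | { seq: number; kind: "records_read"; count: number };

const INITIAL: WorldState = { refundsByCustomer: {}, recordsAccessed: 0 };

// Pure reducer: returns a new state and never mutates the previous one.
function applyEvent(state: WorldState, event: DemoEvent): WorldState {
  switch (event.kind) {
    case "refund_issued": {
      const prior = state.refundsByCustomer[event.customerId] ?? 0;
      return {
        ...state,
        refundsByCustomer: { ...state.refundsByCustomer, [event.customerId]: prior + event.amount },
      };
    }
    case "records_read":
      return { ...state, recordsAccessed: state.recordsAccessed + event.count };
  }
}

// World state after replaying every event with seq <= n.
function stateAt(events: readonly DemoEvent[], n: number): WorldState {
  return events
    .filter((e) => e.seq <= n)
    .reduce((state, e) => applyEvent(state, e), INITIAL);
}

// A fork copies the prefix into a new run that points back at its parent.
function forkAt(
  events: readonly DemoEvent[],
  n: number,
  parentRunId: string,
): { parentRunId: string; events: DemoEvent[] } {
  return { parentRunId, events: events.filter((e) => e.seq <= n) };
}
```

Because the reducer is pure and the log is append-only, scrubbing the timeline to step N is just `stateAt(events, N)`, and the original run is untouched by any fork.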

Policy Engine: deterministic DSL with 10 condition kinds (toolName, argMatch, argRegex, domainCheck, valueThreshold, piiClass, planTier, ticketPriority, customerTier, plus and/or combinators). Runs before Pre-cog, so known-bad patterns are decided in <5ms with no API cost and no Opus round trip.
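To make the shape of such a DSL concrete, here is a small evaluator covering a subset of the condition kinds named above (toolName, argMatch, argRegex, valueThreshold, and/or). The field names inside each condition, the `Policy` shape, and the `evaluate` function are assumptions for illustration; the real DSL in apps/engine/src/policies may be structured differently.

```typescript
// Hypothetical condition shapes; only the kind names come from the README.
type Condition =
  | { kind: "toolName"; equals: string }
  | { kind: "argMatch"; arg: string; equals: unknown }
  | { kind: "argRegex"; arg: string; pattern: string }
  | { kind: "valueThreshold"; arg: string; greaterThan: number }
  | { kind: "and"; all: Condition[] }
  | { kind: "or"; any: Condition[] };

interface Call {
  toolName: string;
  args: Record<string, unknown>;
}

interface Policy {
  id: string;
  action: "BLOCK" | "PAUSE";
  when: Condition;
}

// Recursive, side-effect-free matcher: same input always gives the same result.
function matches(c: Condition, call: Call): boolean {
  switch (c.kind) {
    case "toolName":
      return call.toolName === c.equals;
    case "argMatch":
      return call.args[c.arg] === c.equals;
    case "argRegex": {
      const value = call.args[c.arg];
      return typeof value === "string" && new RegExp(c.pattern).test(value);
    }
    case "valueThreshold": {
      const value = call.args[c.arg];
      return typeof value === "number" && value > c.greaterThan;
    }
    case "and":
      return c.all.every((sub) => matches(sub, call));
    case "or":
      return c.any.some((sub) => matches(sub, call));
  }
}

// Returns the first matching policy, or null so the caller can fall through to Pre-cog.
function evaluate(policies: readonly Policy[], call: Call): Policy | null {
  return policies.find((p) => matches(p.when, call)) ?? null;
}
```

A policy written against this sketch might combine a `toolName` check with a `valueThreshold` under an `and` combinator to block refunds over a set amount.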

Quick start

git clone https://github.com/saulwade/sentinel.git
cd sentinel
pnpm install
cp apps/engine/.env.example apps/engine/.env
# Add your ANTHROPIC_API_KEY to apps/engine/.env
pnpm dev

Open http://localhost:3000.

Requirements: Node 22+, pnpm 9+, Anthropic API key with Opus 4.7 access.

Running the demo

1. Open Live View and select a scenario (Support Agent / CEO Override / GDPR Audit).
2. Toggle PRE-COMPUTED for instant cached verdicts, or LIVE OPUS for real extended thinking (~45s).
3. Press ▶ Run and watch actions stream with POLICY / PRE-COG source badges.
4. Click any BLOCK/PAUSE row to inspect the Opus reasoning or the matching policy rule.
5. After the run, open Replay → scrub the timeline → ⎇ Branch from here → see Blast Radius.
6. Open Red Team & Policies → run the adaptive loop → synthesize a policy from a bypass → adopt it.

Keyboard shortcuts

Key	Action
`1`–`5`	Switch tabs
`R`	Run agent
`j` / `k`	Navigate events in Live View
`A` / `D`	Approve / Deny a PAUSE
`/`	Search events
`?`	Help modal (all shortcuts)
`Esc`	Close modal / clear search

MCP integration (Claude Code)

Sentinel exposes an MCP server. Add to Claude Code (~/.claude/claude_desktop_config.json or project .mcp.json):

{
  "mcpServers": {
    "sentinel": {
      "command": "pnpm",
      "args": ["-F", "@sentinel/engine", "mcp"],
      "cwd": "/path/to/sentinel"
    }
  }
}

Available MCP tools:

Tool	Description
`sentinel_start_run`	Start a run (optional `scenario`: support/ceo/gdpr/phishing)
`sentinel_get_events`	Fetch all events with ALLOW/PAUSE/BLOCK verdicts and source (policy/pre-cog)
`sentinel_get_blast_radius`	Money disbursed/blocked, PII exposed/blocked, severity grade
`sentinel_get_policies`	List all active policies with action, severity, description
`sentinel_get_trust_score`	Composite Trust Score (A+ to F) across all runs
`sentinel_snapshot`	Reconstruct world state at a specific event
`sentinel_list_agent_tools`	List the agent's tools in MCP schema format

Example Claude Code usage:

> Start a Sentinel CEO Override run and tell me what was blocked
> What's the blast radius for that run?
> What policies are active and what's the Trust Score?
> Show me the world state at event #5

Project structure

sentinel/
├── apps/
│   ├── web/app/components/
│   │   ├── CommandCenter.tsx   # Trust Score + stats dashboard
│   │   ├── LiveView.tsx        # Real-time stream + inspector
│   │   ├── Replay.tsx          # Timeline scrubber + inline fork + blast radius
│   │   ├── Shell.tsx           # 5-tab shell + keyboard nav
│   │   └── RedTeam.tsx         # Adaptive red team + policy catalog
│   └── engine/src/
│       ├── agent/              # World state, mock tools, scenario seeds
│       │   └── scenarios/      # phishing, support, ceo, gdpr
│       ├── interceptor.ts      # Two-layer intercept (policy → pre-cog)
│       ├── policies/           # DSL evaluator + default policies
│       ├── analysis/           # Blast radius + Opus analysis + incident report
│       ├── redteam/            # Adaptive attacker + tester + policy synthesizer
│       ├── timetravel/         # Snapshot + replay engine
│       ├── mcp/                # MCP server (Claude Code integration)
│       └── routes/             # runs, analysis, policies, redteam, stats, settings
└── packages/shared/            # Shared types: AgentEvent, Decision, Policy, RunAnalysis

Why Sentinel exists

AI agents are shipping to production every week. When they fail, the answer today is "read logs and guess." Sentinel gives agent developers a real debugger:

See every action in real time with causal reasoning and policy source
Pause suspicious actions before they execute
Rewind to any point in the agent's history
Edit the past and replay alternate futures
Quantify blast radius: money interdicted, PII blocked, damage avoided
Test against adaptive adversarial attacks before deploying
Synthesize defense policies automatically from discovered bypasses

This is a debugging primitive that did not exist before models could reason about counterfactuals at production speed.

License

MIT — see LICENSE.
