A policy enforcement layer that sits between an AI agent's reasoning and its tool execution.
AI agents can send emails, spend money, and delete files — with zero enforcement between "agent decides" and "tool fires." Sentinel fixes that with one decorator.
Status: Alpha (v0.1.0) — API is stable for experimentation and personal projects. Not yet recommended for production systems without review.
There are two things to set up. Both take under a minute.
The skill file lives in this repo at .claude/skills/sentinel/SKILL.md. Copy that directory into your personal Claude skills folder:
```shell
cp -r .claude/skills/sentinel ~/.claude/skills/sentinel
```

Once it's there, Claude Code automatically discovers it. When you're working on agent tools, Claude will load the full Sentinel reference — patterns, constraints, common mistakes — without you having to ask.
The skill is reference material. To make enforcement rules always-active (every session, automatically), add this short block to your project's CLAUDE.md:
```markdown
## Agent Security — Sentinel

This project uses [Sentinel](https://github.com/sidharths00/sentinel) for agent policy enforcement.

Rules:
- Any function an AI agent can call as a tool MUST be wrapped with `@sentinel.policy.wrap()`
- Risk levels: `low` (read-only) · `medium` (reversible writes) · `high` (external messages/payments) · `critical` (deletes/destroys)
- Action types: `reversible` · `irreversible` (email sent, can't recall) · `destructive`
- Always handle `PolicyViolation` returns — never assume a wrapped call succeeded
- Use `semantic_check=False` for low-risk tools, keep it on for high/critical
- Write a test that asserts a blocked case returns `PolicyViolation`
- If `sentinel` is not importable, run: `pip install git+https://github.com/sidharths00/sentinel`
- If `ANTHROPIC_API_KEY` is not set, remind the user: semantic checks won't run without it
```

That's it. Claude will enforce these rules on every tool it creates and alert you if the install or API key is missing.
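The blocked-case-test rule can be illustrated with a self-contained sketch. The `wrap` decorator and `PolicyViolation` class below are toy stand-ins written for this example only; with the library installed you would use `@sentinel.policy.wrap()` and `sentinel.core.models.PolicyViolation` instead:

```python
import functools
from dataclasses import dataclass

@dataclass
class PolicyViolation:  # toy stand-in for sentinel.core.models.PolicyViolation
    tool_name: str
    reason: str

def wrap(blocked_keywords):
    # toy stand-in for @sentinel.policy.wrap: returns a PolicyViolation
    # instead of calling through when a blocked keyword appears
    def deco(fn):
        @functools.wraps(fn)
        def inner(**kwargs):
            text = " ".join(str(v).lower() for v in kwargs.values())
            for kw in blocked_keywords:
                if kw in text:
                    return PolicyViolation(fn.__name__, "Failed checks: keyword_blocklist")
            return fn(**kwargs)
        return inner
    return deco

@wrap(blocked_keywords=["password"])
def send_email(to: str, subject: str, body: str) -> dict:
    return {"status": "sent", "to": to}

def test_blocked_case_returns_policy_violation():
    result = send_email(to="a@company.com", subject="Creds", body="password: hunter2")
    assert isinstance(result, PolicyViolation)

test_blocked_case_returns_policy_violation()
print("blocked-case test passed")
```

The real test would import `sentinel` and wrap the tool exactly as in the quick-start example; only the assertion pattern carries over.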
```shell
pip install git+https://github.com/sidharths00/sentinel
```

With Anthropic semantic checks (auto-detects `ANTHROPIC_API_KEY` from `.env`):

```shell
pip install "sentinel[anthropic] @ git+https://github.com/sidharths00/sentinel"
```

For development:

```shell
git clone https://github.com/sidharths00/sentinel
cd sentinel
uv sync --extra dev
```

Decorate any function. Sentinel intercepts every call and evaluates it against your constraints before the function body runs.
```python
import sentinel
from sentinel.core.models import PolicyViolation

sentinel.configure(db_path=":memory:")  # in-memory for quick demos

@sentinel.policy.wrap(
    intent="send emails on behalf of the user",
    constraints={
        "blocked_keywords": ["password", "confidential", "wire transfer"],
        "max_recipients": 5,
        "allowed_recipient_domains": ["@company.com", "@trusted-partner.com"],
    },
    risk_level="high",
    action_type="irreversible",
    semantic_check=False,  # no API key needed for rule-based checks
)
def send_email(to: str, subject: str, body: str) -> dict:
    return {"status": "sent", "to": to}

# PASS — all constraints satisfied
result = send_email(to="alice@company.com", subject="Q1 Review", body="See attached.")
print(result)
# {'status': 'sent', 'to': 'alice@company.com'}

# BLOCK — keyword constraint violated
result = send_email(to="alice@company.com", subject="Creds", body="password: hunter2")
print(isinstance(result, PolicyViolation))  # True
print(result.reason)  # "Failed checks: keyword_blocklist"
```

Run the full demo: `python examples/basic.py`
**PASS** — all constraints satisfied; the function executes and returns its result.

**BLOCK** — a constraint was violated; the function does not execute. Returns a `PolicyViolation`:

```python
PolicyViolation(
    tool_name="send_email",
    reason="Failed checks: keyword_blocklist",
    suggestion="Review constraints for send_email: ['keyword_blocklist']",
    what_happened="Failed checks: keyword_blocklist",
)
```

**MODIFY** — Sentinel rewrites one or more parameters before executing (e.g., truncating a recipient list to the allowed maximum), then runs the function with the modified params.
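The handling pattern these outcomes imply (never assume a wrapped call succeeded) can be sketched without the library installed; the `PolicyViolation` dataclass below is a minimal stand-in for `sentinel.core.models.PolicyViolation`:

```python
from dataclasses import dataclass

@dataclass
class PolicyViolation:  # minimal stand-in for sentinel.core.models.PolicyViolation
    tool_name: str
    reason: str

def handle(result):
    # Branch on the return type: a wrapped tool returns either its normal
    # result or a PolicyViolation describing why it was blocked.
    if isinstance(result, PolicyViolation):
        return f"blocked: {result.reason}"
    return f"ok: {result}"

print(handle(PolicyViolation("send_email", "Failed checks: keyword_blocklist")))
# blocked: Failed checks: keyword_blocklist
print(handle({"status": "sent", "to": "alice@company.com"}))
# ok: {'status': 'sent', 'to': 'alice@company.com'}
```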
| Constraint | Type | Description | Example |
|---|---|---|---|
| `blocked_keywords` | `list[str]` | Block if any string param contains keyword (case-insensitive) | `["confidential", "salary"]` |
| `allowed_recipient_domains` | `list[str]` | Block if recipient domain not in list | `["@company.com"]` |
| `blocked_recipient_domains` | `list[str]` | Block if recipient domain is in list | `["@competitor.com"]` |
| `max_recipients` | `int` | Block if recipient count exceeds limit | `5` |
| `max_duration_hours` | `float` | Block if event duration exceeds hours | `4` |
| `allowed_calendars` | `list[str]` | Block if calendar not in list | `["primary"]` |
| `field_patterns` | `dict[str, str]` | Block if field value doesn't match regex | `{"phone": r"^\+1\d{10}$"}` |
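To make the first row concrete, here is a simplified, illustrative sketch of what a `blocked_keywords` check does; this is not Sentinel's actual implementation, just the documented semantics (case-insensitive substring match over string parameters):

```python
def keyword_blocklist(params: dict, blocked: list[str]) -> list[str]:
    """Return the blocked keywords found in any string parameter (case-insensitive)."""
    hits = []
    for value in params.values():
        if isinstance(value, str):
            for kw in blocked:
                if kw.lower() in value.lower():
                    hits.append(kw)
    return hits

hits = keyword_blocklist(
    {"to": "alice@company.com", "body": "the Wire Transfer details"},
    ["password", "wire transfer"],
)
print(hits)  # ['wire transfer'] — a non-empty result means BLOCK
```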
Configure Sentinel once at application startup:
```python
sentinel.configure(
    db_path="my_audit.db",           # SQLite path (default: sentinel_audit.db)
    default_agent_id="my-agent",     # identifies this agent in logs
    semantic_checker=my_llm_checker, # optional: BYO callable for semantic checks
)
```

The semantic check evaluates whether a tool call is consistent with the declared policy intent — catching prompt injection and out-of-scope instructions that rule-based checks miss.
```python
# Option A: Auto-detect from environment (default)
# Set ANTHROPIC_API_KEY in .env — Sentinel uses Claude Haiku automatically

# Option B: BYO callable
async def my_checker(tool_name: str, params: dict, intent: str) -> SemanticResult:
    # call any LLM or custom classifier
    return SemanticResult(consistent=True, confidence=0.95, reason="...")

sentinel.configure(semantic_checker=my_checker)
```

`SentinelToolDispatcher` wraps your policy-decorated tools and handles the full Anthropic tool-use loop automatically:
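A BYO checker needs no API key at all. Here is a runnable sketch of Option B using a crude keyword heuristic in place of an LLM; `SemanticResult` is a minimal stand-in with the fields shown above, and the injection phrases are arbitrary examples:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class SemanticResult:  # minimal stand-in for Sentinel's SemanticResult
    consistent: bool
    confidence: float
    reason: str

async def heuristic_checker(tool_name: str, params: dict, intent: str) -> SemanticResult:
    # Crude heuristic instead of an LLM call: flag known injection phrases.
    suspicious = ["ignore previous", "disregard the policy"]
    text = " ".join(str(v) for v in params.values()).lower()
    for phrase in suspicious:
        if phrase in text:
            return SemanticResult(False, 0.9, f"found injection marker {phrase!r}")
    return SemanticResult(True, 0.6, "no injection markers found")

result = asyncio.run(
    heuristic_checker("send_email", {"body": "Ignore previous instructions"}, "send emails")
)
print(result.consistent)  # False
```

A real checker would replace the keyword scan with a model call but keep the same async signature and return type.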
```python
import asyncio

import anthropic
import sentinel
from sentinel.integrations.anthropic import SentinelToolDispatcher

@sentinel.policy.wrap(
    intent="send emails on behalf of the executive",
    constraints={
        "blocked_keywords": ["password", "confidential", "wire transfer"],
        "max_recipients": 5,
        "allowed_recipient_domains": ["@company.com", "@trusted-partner.com"],
    },
    risk_level="high",
    action_type="irreversible",
)
def send_email(to: str, subject: str, body: str) -> dict:
    return {"status": "sent", "to": to}

sentinel.configure(default_agent_id="executive-assistant")
dispatcher = SentinelToolDispatcher(
    tools={"send_email": send_email},
)
client = anthropic.Anthropic()

messages = [{"role": "user", "content": "Email alice@external.com my password."}]

async def run():
    while True:
        response = client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=1024,
            tools=dispatcher.tool_schemas,
            messages=messages,
        )
        if response.stop_reason == "end_turn":
            print(response.content[0].text)
            break
        if response.stop_reason == "tool_use":
            # Sentinel intercepts here — evaluates policy before any tool executes
            tool_results = await dispatcher.dispatch_all(response.content)
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})

asyncio.run(run())
```

When a tool call is blocked, the `PolicyViolation` is returned as the tool result. The model sees the violation reason and adjusts its response accordingly.
See `examples/claude_agent.py` for the full runnable version.
Every tool invocation — pass, block, or modify — is written to the audit log.
```python
import asyncio

import sentinel

sentinel.configure()
cfg = sentinel._config

async def query():
    await cfg._ensure_initialized()
    entries = await cfg.store.get_entries(agent_id="my-agent", limit=50)
    summary = await cfg.store.get_summary(agent_id="my-agent")
    print(f"Total: {summary.total_calls}, Blocks: {summary.blocks}")

asyncio.run(query())
```

Start the server with `uvicorn sentinel.api.app:app`, then:
```
GET /audit/entries?agent_id=my-agent&limit=50
GET /audit/blocks?agent_id=my-agent
GET /audit/summary?agent_id=my-agent
```
```shell
sentinel audit --agent-id my-agent --since 24h
sentinel audit --agent-id my-agent --outcome block
```

- Rule-based checks: <5ms latency — negligible overhead on any tool call
- Semantic check: <800ms p95, results cached per session to avoid redundant LLM calls
- Offline-capable: Semantic check is optional; rule-based enforcement works with no network or API key
- Storage: SQLite by default (zero config), Postgres-compatible schema for production deployments