Sentinel Gateway is a secure middleware layer for AI agent deployments. It solves prompt injection, the #1 LLM security risk (OWASP 2025), by structurally separating instruction channels from data channels. Every agent action requires a signed, scoped token issued at runtime. External content can never become an instruction, regardless of what it says.
Built with Streamlit (UI) and FastAPI (agent API). Supports built-in Claude sessions, external agent integration, scheduled tasks, two-tier agent memory, key rotation, and a full audit log. Deployable on Replit with PostgreSQL or locally with SQLite.
pip install -r requirements.txt
playwright install chromium
python start.py

- Streamlit UI: http://localhost:8501
- FastAPI: http://localhost:8000
- API docs: http://localhost:8000/docs
Layer 1 — Input channel separation. The agent only sees tools that are in the token scope. Tools outside scope are never presented to the model — they don't exist from the model's perspective.
Layer 2 — Token-gated action enforcement. Every tool call is intercepted before execution. If the action is not in the token scope it is blocked at the infrastructure layer regardless of what the model decided.
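The two layers can be pictured as two independent checks against the same token scope. The sketch below is illustrative only: `ALL_TOOLS`, `visible_tools`, and `enforce` are hypothetical names chosen for this example, not the Gateway's actual internals.

```python
# Hypothetical sketch of the two enforcement layers; all names are illustrative.
ALL_TOOLS = {"file_read", "file_write", "web_read", "email_write"}


def visible_tools(token_scope: set) -> set:
    """Layer 1: the model is only offered tools that are in the token scope."""
    return ALL_TOOLS & token_scope


def enforce(action: str, token_scope: set) -> dict:
    """Layer 2: every tool call is checked again before it executes."""
    if action not in token_scope:
        return {"status": "blocked", "reason": "Not in scope"}
    return {"status": "permitted", "action": action}


scope = {"file_read", "web_read"}
print(sorted(visible_tools(scope)))   # ['file_read', 'web_read']
print(enforce("email_write", scope))  # blocked, whatever the model decided
```

The point of keeping the checks separate is that the second one still holds if the first is somehow bypassed.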
External agents are the capability providers. Sentinel Gateway is the control plane — it validates tokens and enforces scope. Agents execute authorised actions with their own tools.
- Open the Agents tab in the UI at http://localhost:8501
- Enter agent name, type, and scope ceiling
- Click Register — copy the API key shown (displayed once only)
- Use the key in all API requests:
Authorization: Bearer <api_key>
Scope ceiling is the maximum scope your agent can ever request. At runtime you can request any subset but never exceed it.
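The ceiling rule amounts to a subset check. A minimal sketch, assuming scopes are plain strings; `within_ceiling` is a hypothetical helper a client could use to avoid a 403 before calling the API, not a function the Gateway exposes.

```python
def within_ceiling(requested: list, ceiling: list) -> bool:
    """True when every requested scope item is covered by the agent's ceiling."""
    return set(requested) <= set(ceiling)


ceiling = ["file_read", "web_read", "memory_write"]
print(within_ceiling(["web_read"], ceiling))                 # True
print(within_ceiling(["web_read", "file_delete"], ceiling))  # False
```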
Agent Role Templates let you assign a persona and communication style to an agent at runtime. The template is injected at the top of the system prompt, ahead of the Gateway's operating rules, so the agent adopts the specified role without any relaxation of security enforcement.
Eight roles are seeded automatically into the agent_roles table on first run:
| Role | Purpose |
|---|---|
| Legal | Senior legal analyst — precise, clause-structured output, explicit risk flagging |
| Sales | Sales strategist — outcome-focused, value-framed communication |
| Marketing | Marketing professional — brand-aware, audience-led, channel-conscious |
| Human Resources | HR professional — empathetic, policy-aware, neutral on people matters |
| Administration | Administrative professional — structured, accuracy-first, friction-reducing |
| Customer Support | Support specialist — warm, plain-language, solution-focused |
| Software Development | Senior software engineer — technical, edge-case-aware, review-ready code |
| Analyst | Data analyst — evidence-based, uncertainty-quantified, structured findings |
- In the Run tab, select an agent from the Agent dropdown as normal.
- Select a role from the Agent Role dropdown directly below it. Select — None — to run without a role.
- Issue the token. The selected role is locked to the session and persists across all follow-up messages until the session is cleared.
The Agents tab shows a two-column layout. The right panel — Agent Role Templates — lets you browse all available roles and preview the full definition text for each one before assigning it to a session.
When a role is selected, its definition text is prepended to the base system prompt with a --- separator:
<role definition text>
---
You are an AI assistant operating inside Sentinel Gateway — a token-gated security middleware.
OPERATING RULES:
...
The Gateway operating rules always follow the role template and are never overridden by it. Security enforcement is unaffected.
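The composition described above can be sketched in a few lines. This is an illustration under assumptions: `build_system_prompt` and `SEPARATOR` are hypothetical names, and the exact whitespace around the separator in the real prompt is not documented here.

```python
from typing import Optional

SEPARATOR = "\n---\n"  # assumed layout: separator on its own line


def build_system_prompt(role_definition: Optional[str], base_prompt: str) -> str:
    """Prepend the role text, if any, so the operating rules always come last."""
    if not role_definition:
        return base_prompt
    return f"{role_definition.strip()}{SEPARATOR}{base_prompt}"


base = "You are an AI assistant operating inside Sentinel Gateway.\nOPERATING RULES:\n..."
print(build_system_prompt("You are a senior legal analyst.", base))
```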
CREATE TABLE agent_roles (
role_name TEXT PRIMARY KEY,
definition TEXT,
created_at INTEGER
);

Custom roles can be inserted directly into this table and will appear in both the UI dropdown and the role browser immediately on the next page load.
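As a rough illustration of adding a custom role, the sketch below uses Python's built-in `sqlite3` against an in-memory database that mirrors the schema above. The `add_role` helper and the "Finance" role are hypothetical, and `created_at` is assumed to be a Unix timestamp in seconds.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.execute(
    "CREATE TABLE agent_roles ("
    "role_name TEXT PRIMARY KEY, definition TEXT, created_at INTEGER)"
)


def add_role(conn: sqlite3.Connection, role_name: str, definition: str) -> None:
    """Insert a custom role; created_at is assumed to be Unix seconds."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO agent_roles (role_name, definition, created_at) "
            "VALUES (?, ?, ?)",
            (role_name, definition, int(time.time())),
        )


add_role(conn, "Finance", "You are a finance analyst. Be precise and cite figures.")
print([row[0] for row in conn.execute("SELECT role_name FROM agent_roles")])
```

Because `role_name` is the primary key, inserting a name that already exists raises `sqlite3.IntegrityError`.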
Sentinel Gateway uses a Unified Persisted Token Authority — a single Ed25519 signing key stored in the database and shared between the Streamlit UI and the FastAPI process. Tokens issued by either process are verifiable by the other.
Rotation promotes the current key to previous_key and generates a new signing key. The previous key remains valid for a 1-hour grace period so in-flight tokens are not orphaned.
Before rotation: [current_key] verifies all tokens
After rotation: [new_key] verifies new tokens
[previous_key] verifies tokens issued before rotation (grace: 60 min)
After grace ends: [previous_key] is retired — only [new_key] accepted
Token TTL and key rotation are independent checks. A token that was issued with the old key and is still within its TTL is accepted during the grace period. A token that has passed its own TTL is always rejected, regardless of the grace period.
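The interaction between TTL and the grace period can be written as a small decision function. This sketch leaves out signature verification entirely: `signed_with` stands in for "which key verified the signature", and `is_accepted` is a hypothetical name used only for illustration.

```python
from typing import Optional

GRACE_SECONDS = 3600  # 1-hour grace period, as described above


def is_accepted(signed_with: str, expires_at: int, now: int,
                rotated_at: Optional[int]) -> bool:
    """Apply the TTL check first, then the key check."""
    if now >= expires_at:
        return False  # an expired token is always rejected
    if signed_with == "current":
        return True
    if signed_with == "previous" and rotated_at is not None:
        return now < rotated_at + GRACE_SECONDS
    return False


# Rotation happened at t=1000; times are Unix-style seconds.
print(is_accepted("previous", expires_at=2000, now=1500, rotated_at=1000))  # True
print(is_accepted("previous", expires_at=9000, now=5000, rotated_at=1000))  # False
print(is_accepted("previous", expires_at=1200, now=1500, rotated_at=1000))  # False
```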
- Open the Profile tab
- Under Token Authority (admin only), click 🔄 Rotate Signing Key
- A confirmation shows the grace period expiry time
POST /v1/rotate_key
Authorization: Bearer <api_key>
{
"status": "rotated",
"grace_until": "2025-01-01 13:00:00",
"grace_seconds": 3600,
"message": "New key active. Previous key valid until 13:00:00 (60 min grace period)."
}

Rotation is logged to the audit table with the grace period expiry timestamp.
Sentinel Gateway provides a shared, persistent memory for agents — a two-tier store that all agents for an instance can read, with writes restricted to agents that hold the memory_write scope.
| | Short-term | Long-term |
|---|---|---|
| Purpose | Operational context: activity from the last 7 days, tasks scheduled for the next 7 days | Persistent goals, preferences, and ongoing state |
| Char limit | 3,000 | 10,000 |
| Read | All agents, no scope required | All agents, no scope required |
| Write | memory_write scope required | memory_write scope required |
| Write strategy | Replace: agent writes full desired content | Replace: agent writes full desired content |

When an agent calls POST /v1/issue_token, the response includes a memory_status field at no extra cost:
{
"payload": { ... },
"signature": "...",
"memory_status": {
"short_term": "has_content",
"long_term": "empty"
}
}

The agent uses this to decide whether a GET /v1/memory call is worth making. If both tiers are "empty", the agent skips the read entirely, so no tokens are wasted.
GET /v1/memory?type=short_term
GET /v1/memory?type=long_term
Authorization: Bearer <api_key>
No token required — all agents may read. type must be short_term or long_term.
{
"type": "short_term",
"content": "...",
"updated_at": 1234567890,
"updated_by": "SchedulerAgent",
"char_limit": 3000
}

POST /v1/memory
Authorization: Bearer <api_key>
{
"type": "short_term",
"content": "Full replacement content here...",
"token": { ...token object with memory_write in scope... }
}

The token must have memory_write in its scope. Each write replaces the full content, so read first if you need to preserve existing content.
{
"status": "ok",
"type": "short_term",
"chars": 842,
"char_limit": 3000
}

If the content exceeds the character limit, the write is rejected:
{
"status": "error",
"reason": "Content exceeds 3000 character limit (3247 chars)"
}

The 🧠 Memory tab shows both tiers with:
- Character usage bar (used / limit)
- Full content (read-only display)
- Last updated timestamp and the agent that wrote it
- Clear button per tier (admin and user)
When running a built-in Claude session with memory_write in scope, Claude can call the memory_write tool directly:
Tool: memory_write
type: "short_term" | "long_term"
content: "Full replacement content"
The tool confirms with the character count on success or returns an error if the limit is exceeded.
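Since every write replaces the whole tier and is rejected when it exceeds the limit, a client may want to merge new content with what it has just read and trim before writing. The sketch below shows one possible policy, dropping the oldest lines first. `fit_to_limit` is a hypothetical helper, and treating memory as newline-separated entries is an assumption of this example.

```python
def fit_to_limit(existing: str, addition: str, char_limit: int) -> str:
    """Append `addition` to `existing`, dropping the oldest lines until it fits."""
    if len(addition) > char_limit:
        raise ValueError("addition alone exceeds the tier's character limit")
    lines = [line for line in existing.splitlines() if line] + [addition]
    while len("\n".join(lines)) > char_limit:
        lines.pop(0)  # discard the oldest entry first
    return "\n".join(lines)


print(repr(fit_to_limit("a\nbb\nccc", "dd", 6)))  # 'ccc\ndd'
```

The result can then be sent as the `content` field of a memory write, with `char_limit` taken from the preceding read response.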
CREATE TABLE user_memory (
memory_type TEXT PRIMARY KEY, -- 'short_term' | 'long_term'
content TEXT DEFAULT '',
char_limit INTEGER DEFAULT 3000,
updated_at INTEGER DEFAULT 0,
updated_by TEXT DEFAULT ''
);

All endpoints require Authorization: Bearer <your_agent_api_key>.
Health check.
{"service": "Sentinel Gateway", "version": "2.0", "status": "running"}

POST /v1/issue_token
Get a signed token for a specific instruction and scope.
Request:
{
"prompt_id": "uuid-string",
"scope": ["file_read", "web_read"],
"expires_in": 600
}

Response:
{
"payload": {
"prompt_id": "...", "user_id": "...", "scope": [...],
"issued_at": 1234567890, "expires_at": 1234568490, "nonce": "..."
},
"signature": "hex-string",
"memory_status": {
"short_term": "empty",
"long_term": "has_content"
}
}

memory_status is included at no extra cost so agents can decide whether to call GET /v1/memory without a separate round-trip.

POST /v1/submit_instruction
Register a verified instruction against a token.
Request:
{
"instruction": "Read https://example.com and summarise it",
"token": { ...token object from issue_token... }
}

Response:
{"status": "accepted", "prompt_id": "..."}

POST /v1/request_action
Execute a token-gated action. Token nonce is consumed on use.
Request:
{
"prompt_id": "...",
"action_type": "web_read",
"action_params": {"url": "https://example.com"},
"token": { ...token object... }
}

Response:
{"status": "permitted", "action": "web_read", "result": "..."}

Read a memory tier. No scope restriction: all agents may read.
GET /v1/memory?type=short_term
GET /v1/memory?type=long_term
POST /v1/memory
Write a memory tier. Requires memory_write in token scope.

POST /v1/rotate_key
Rotate the signing key. The previous key stays valid for a 1-hour grace period. Admin agents only; restrict at infrastructure level if needed.
List all agent role templates.
Retrieve recent audit log entries.
Retrieve a registered prompt by ID.
import uuid

import requests

BASE = "http://localhost:8000"
API_KEY = "sgk-your-api-key-here"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# 1. Issue token (memory_status is included in the response)
prompt_id = str(uuid.uuid4())
token_resp = requests.post(f"{BASE}/v1/issue_token", headers=HEADERS, json={
    "prompt_id": prompt_id,
    "scope": ["web_read", "memory_write"],
    "expires_in": 300
}).json()
token = {"payload": token_resp["payload"], "signature": token_resp["signature"]}
memory_status = token_resp["memory_status"]

# 2. Read memory only if it has content
if memory_status["long_term"] == "has_content":
    mem = requests.get(f"{BASE}/v1/memory?type=long_term", headers=HEADERS).json()
    print("Long-term memory:", mem["content"])

# 3. Submit instruction
requests.post(f"{BASE}/v1/submit_instruction", headers=HEADERS, json={
    "instruction": "Read https://example.com and return the title",
    "token": token
})

# 4. Request action
result = requests.post(f"{BASE}/v1/request_action", headers=HEADERS, json={
    "prompt_id": prompt_id,
    "action_type": "web_read",
    "action_params": {"url": "https://example.com"},
    "token": token
}).json()
print(result["result"])

# 5. Update short-term memory
requests.post(f"{BASE}/v1/memory", headers=HEADERS, json={
    "type": "short_term",
    "content": "Fetched example.com title on 2025-01-01. Next: process report.",
    "token": token
})

const BASE = "http://localhost:8000";
const API_KEY = "sgk-your-api-key-here";
const headers = { "Authorization": `Bearer ${API_KEY}`,
"Content-Type": "application/json" };
const promptId = crypto.randomUUID();
// 1. Issue token — memory_status included in response
const tokenResp = await fetch(`${BASE}/v1/issue_token`, {
method: "POST", headers,
body: JSON.stringify({ prompt_id: promptId,
scope: ["web_read", "memory_write"], expires_in: 300 })
}).then(r => r.json());
const token = { payload: tokenResp.payload, signature: tokenResp.signature };
const memoryStatus = tokenResp.memory_status;
// 2. Read memory only if it has content
if (memoryStatus.long_term === "has_content") {
const mem = await fetch(`${BASE}/v1/memory?type=long_term`, { headers }).then(r => r.json());
console.log("Long-term memory:", mem.content);
}
// 3. Submit instruction
await fetch(`${BASE}/v1/submit_instruction`, {
method: "POST", headers,
body: JSON.stringify({ instruction: "Read https://example.com and return the title", token })
});
// 4. Request action
const result = await fetch(`${BASE}/v1/request_action`, {
method: "POST", headers,
body: JSON.stringify({ prompt_id: promptId, action_type: "web_read",
action_params: { url: "https://example.com" }, token })
}).then(r => r.json());
console.log(result.result);
// 5. Update short-term memory
await fetch(`${BASE}/v1/memory`, {
method: "POST", headers,
body: JSON.stringify({
type: "short_term",
content: "Fetched example.com title on 2025-01-01. Next: process report.",
token
})
});

| Action | Type | Requires |
|---|---|---|
| file_read | Real | Nothing |
| file_write | Real | Nothing |
| file_list | Real | Nothing |
| file_delete | Real | Nothing |
| web_read | Real | Nothing |
| web_write | Real (Claude) / Signal (agents) | playwright install chromium |
| email_read | Real | Gmail app password in Settings |
| email_write | Real | Gmail app password in Settings |
| mark_calendar | Real | Google Calendar OAuth in Settings |
| calculate | Real | simpleeval (included in requirements) |
| query_database | Real | Connection string in Settings (SELECT queries only) |
| schedule_task | Permission signal | Nothing |
| agent_talk | Real | Nothing (lookup priority: agent_url → agent_id → agent_name) |
| memory_write | Real | memory_write scope in token (reads require no scope) |

| Setting | Location | Purpose |
|---|---|---|
| Anthropic API Key | Sidebar | Required for Claude sessions and scheduled tasks |
| Gmail credentials | Settings tab | Enables email_read and email_write |
| Google Calendar OAuth | Settings tab | Enables mark_calendar |
| Database connection string | Settings tab | Enables query_database (SELECT only) |
| Screenshot save path | Settings tab | Directory for web_write screenshots — defaults to system temp directory if blank |
agent_talk connects to internal registered agents or external agents by URL. Priority order:
- agent_url: POSTs {"message": "..."} to the URL and returns the response. Use for any agent reachable over HTTP.
- agent_id: Looks up a registered agent by UUID. Works for active and inactive agents.
- agent_name: Looks up an active registered agent by name.
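The priority order can be read as "first key present wins". A minimal sketch of that selection, assuming the three keys arrive in `action_params` as shown in this document; `pick_target` is a hypothetical helper, and the actual lookup and HTTP call are left out.

```python
from typing import Optional, Tuple

PRIORITY = ("agent_url", "agent_id", "agent_name")


def pick_target(action_params: dict) -> Optional[Tuple[str, str]]:
    """Return (key, value) for the highest-priority key present, else None."""
    for key in PRIORITY:
        value = action_params.get(key)
        if value:
            return key, value
    return None


params = {"agent_name": "DataBot", "agent_url": "http://localhost:9000/run"}
print(pick_target(params))  # ('agent_url', 'http://localhost:9000/run')
```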
# External agent
result = requests.post(f"{BASE}/v1/request_action", headers=HEADERS, json={
"prompt_id": prompt_id,
"action_type": "agent_talk",
"action_params": {
"agent_url": "http://localhost:9000/run",
"message": "Summarise the latest report"
},
"token": token
}).json()
# Internal agent lookup
result = requests.post(f"{BASE}/v1/request_action", headers=HEADERS, json={
"prompt_id": prompt_id,
"action_type": "agent_talk",
"action_params": {"agent_name": "DataBot"},
"token": token
}).json()

| Error | Reason |
|---|---|
| 401 Unauthorized | Invalid or unregistered API key |
| 403 Forbidden | Requested scope exceeds agent ceiling |
| status: blocked (Invalid signature) | Token was tampered with or previous key grace period has expired |
| status: blocked (Token has expired) | TTL elapsed; issue a new token |
| status: blocked (Nonce already used) | Token was replayed; issue a new token |
| status: blocked (Not in scope) | Action not in token scope |
| status: blocked ('memory_write' not in token scope) | Agent attempted memory write without memory_write scope |
| status: error (Content exceeds N character limit) | Memory write rejected because the content is too long |
| 400: type must be 'short_term' or 'long_term' | Invalid type parameter on memory endpoints |
| 404: Prompt ID not found | submit_instruction was not called first |
| [query_database] Blocked: only SELECT queries are permitted | Write/DDL query attempted; only SELECT is allowed |
| [calculate] simpleeval not installed | Run pip install simpleeval |
| [agent_talk] External agent unreachable | URL provided but agent did not respond; check endpoint |
| [agent_talk] No registered agent found | agent_id or agent_name not in registry or agent is inactive |
| [memory_write] Content exceeds N character limit | Built-in Claude memory_write tool: content too long |

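A client can use the table above to decide when simply issuing a fresh token is the right recovery. The sketch below is an assumption-heavy illustration: it assumes blocked responses carry the message in a `reason` field, as the memory error response does, and `needs_new_token` is a hypothetical helper.

```python
# Messages for which the table above recommends issuing a new token.
REISSUE_MARKERS = ("Token has expired", "Nonce already used")


def needs_new_token(response: dict) -> bool:
    """True when a blocked response can be resolved by issuing a fresh token."""
    if response.get("status") != "blocked":
        return False
    reason = str(response.get("reason", ""))  # assumed field name
    return any(marker in reason for marker in REISSUE_MARKERS)


print(needs_new_token({"status": "blocked", "reason": "Token has expired"}))  # True
print(needs_new_token({"status": "blocked", "reason": "Not in scope"}))       # False
```

Scope errors are deliberately excluded: a new token with the same scope would be blocked again.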
© 2025 Cumhur Murat Topbas, Sentinel Gateway. All rights reserved.