Sentinel Gateway — Developer README

Sentinel Gateway is a secure middleware layer for AI agent deployments. It is designed to prevent prompt injection, ranked the #1 risk in the OWASP Top 10 for LLM Applications (2025), by structurally separating instruction channels from data channels. Every agent action requires a signed, scoped token issued at runtime, so external content cannot become an instruction regardless of what it says.

Built with Streamlit (UI) and FastAPI (agent API). Supports built-in Claude sessions, external agent integration, scheduled tasks, two-tier agent memory, key rotation, and a full audit log. Deployable on Replit with PostgreSQL or locally with SQLite.

Quick Start

pip install -r requirements.txt
playwright install chromium
python start.py

Streamlit UI: http://localhost:8501
FastAPI: http://localhost:8000
API docs: http://localhost:8000/docs

How It Works

Layer 1 — Input channel separation. The agent only sees tools that are in the token scope. Tools outside scope are never presented to the model — they don't exist from the model's perspective.

Layer 2 — Token-gated action enforcement. Every tool call is intercepted before execution. If the action is not in the token scope it is blocked at the infrastructure layer regardless of what the model decided.

External agents are the capability providers. Sentinel Gateway is the control plane — it validates tokens and enforces scope. Agents execute authorised actions with their own tools.
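
The two layers can be illustrated with a minimal sketch. It is not the Gateway's implementation: the TOOLS registry, the function names, and the "reason" field on blocked results are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry for illustration; not the Gateway's real one.
TOOLS: Dict[str, Callable[[dict], str]] = {
    "file_read": lambda params: f"contents of {params['path']}",
    "web_read": lambda params: f"body of {params['url']}",
    "email_write": lambda params: "sent",
}


def visible_tools(token_scope: List[str]) -> List[str]:
    """Layer 1: expose only the tools named in the token scope."""
    return [name for name in TOOLS if name in token_scope]


def execute(action_type: str, params: dict, token_scope: List[str]) -> dict:
    """Layer 2: check scope again at call time, whatever the model asked for."""
    if action_type not in token_scope or action_type not in TOOLS:
        return {"status": "blocked", "reason": "Not in scope"}
    result = TOOLS[action_type](params)
    return {"status": "permitted", "action": action_type, "result": result}


scope = ["web_read"]
print(visible_tools(scope))                # only web_read is offered to the model
print(execute("email_write", {}, scope))   # blocked even if the model requests it
print(execute("web_read", {"url": "https://example.com"}, scope))
```

The second check matters because it does not rely on the model behaving: an out-of-scope call is refused even if injected content convinces the model to attempt it.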

Agent Registration

1. Open the Agents tab in the UI at http://localhost:8501.
2. Enter the agent name, type, and scope ceiling.
3. Click Register and copy the API key shown (it is displayed only once).
4. Use the key in all API requests: Authorization: Bearer <api_key>

Scope ceiling is the maximum scope your agent can ever request. At runtime you can request any subset but never exceed it.
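
The ceiling rule is a subset check. The sketch below shows the idea under that assumption; excess_scope is a hypothetical helper, not a Gateway function.

```python
from typing import Iterable, Set


def excess_scope(requested: Iterable[str], ceiling: Iterable[str]) -> Set[str]:
    """Return the requested scopes that fall outside the ceiling."""
    return set(requested) - set(ceiling)


ceiling = ["file_read", "web_read", "memory_write"]  # chosen at registration

print(excess_scope(["web_read"], ceiling))                 # empty set: request is allowed
print(excess_scope(["web_read", "email_write"], ceiling))  # email_write exceeds the ceiling
```

A request with a non-empty excess corresponds to the 403 Forbidden case listed in the Error Reference.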

Agent Role Templates

Agent Role Templates let you assign a persona and communication style to an agent at runtime. The template is injected at the top of the system prompt, ahead of the Gateway's operating rules, so the agent adopts the specified role without any relaxation of security enforcement.

Built-in roles

Eight roles are seeded automatically into the agent_roles table on first run:

Role	Purpose
Legal	Senior legal analyst — precise, clause-structured output, explicit risk flagging
Sales	Sales strategist — outcome-focused, value-framed communication
Marketing	Marketing professional — brand-aware, audience-led, channel-conscious
Human Resources	HR professional — empathetic, policy-aware, neutral on people matters
Administration	Administrative professional — structured, accuracy-first, friction-reducing
Customer Support	Support specialist — warm, plain-language, solution-focused
Software Development	Senior software engineer — technical, edge-case-aware, review-ready code
Analyst	Data analyst — evidence-based, uncertainty-quantified, structured findings

Using a role in the UI

In the Run tab, select an agent from the Agent dropdown as normal.
Select a role from the Agent Role dropdown directly below it. Select — None — to run without a role.
Issue the token. The selected role is locked to the session and persists across all follow-up messages until the session is cleared.

Browsing roles

The Agents tab shows a two-column layout. The right panel — Agent Role Templates — lets you browse all available roles and preview the full definition text for each one before assigning it to a session.

How role injection works

When a role is selected, its definition text is prepended to the base system prompt with a --- separator:

<role definition text>

---

You are an AI assistant operating inside Sentinel Gateway — a token-gated security middleware.
OPERATING RULES:
...

The Gateway operating rules always follow the role template and are never overridden by it. Security enforcement is unaffected.
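
A minimal sketch of the prepend step, assuming the separator layout shown above. build_system_prompt and the placeholder base prompt are hypothetical names used only for illustration.

```python
from typing import Optional

# Placeholder for the Gateway's base prompt; the real operating rules are not reproduced here.
BASE_PROMPT = "<base system prompt with operating rules>"


def build_system_prompt(base_prompt: str, role_definition: Optional[str] = None) -> str:
    """Prepend the role definition, followed by a --- separator, to the base prompt."""
    if not role_definition:
        return base_prompt  # no role selected: base prompt is used unchanged
    return f"{role_definition.strip()}\n\n---\n\n{base_prompt}"


print(build_system_prompt(BASE_PROMPT, "You are a senior legal analyst."))
```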

Database schema

CREATE TABLE agent_roles (
    role_name  TEXT PRIMARY KEY,
    definition TEXT,
    created_at INTEGER
);

Custom roles can be inserted directly into this table and will appear in both the UI dropdown and the role browser immediately on the next page load.
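
As a sketch of adding a custom role, the example below recreates the schema in an in-memory SQLite database and inserts one row. The "Finance" role, its definition text, and the add_role helper are hypothetical, and created_at is assumed to hold Unix epoch seconds. Against a real deployment you would connect to the Gateway's own database instead.

```python
import sqlite3
import time

# In-memory database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE agent_roles (
        role_name  TEXT PRIMARY KEY,
        definition TEXT,
        created_at INTEGER
    )
""")


def add_role(conn: sqlite3.Connection, role_name: str, definition: str) -> None:
    """Insert one role; raises sqlite3.IntegrityError if the name already exists."""
    conn.execute(
        "INSERT INTO agent_roles (role_name, definition, created_at) VALUES (?, ?, ?)",
        (role_name, definition, int(time.time())),  # assumed: Unix epoch seconds
    )
    conn.commit()


add_role(conn, "Finance", "You are a finance analyst. Keep figures exact and state assumptions.")
roles = [row[0] for row in conn.execute("SELECT role_name FROM agent_roles ORDER BY role_name")]
print(roles)
```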

Token Authority & Key Rotation

Sentinel Gateway uses a Unified Persisted Token Authority — a single Ed25519 signing key stored in the database and shared between the Streamlit UI and the FastAPI process. Tokens issued by either process are verifiable by the other.

Key rotation

Rotation promotes the current key to previous_key and generates a new signing key. The previous key remains valid for a 1-hour grace period so in-flight tokens are not orphaned.

Before rotation:  [current_key] verifies all tokens
After rotation:   [new_key] verifies new tokens
                  [previous_key] verifies tokens issued before rotation (grace: 60 min)
After grace ends: [previous_key] is retired — only [new_key] accepted

Token TTL and key rotation are independent checks. A token that was issued with the old key and is still within its TTL is accepted during the grace period. A token that has passed its own TTL is always rejected, regardless of the grace period.
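
The interaction of the two checks can be sketched as follows. This models only the acceptance decision, not Ed25519 verification. The KeyState class and the use of key identifiers are assumptions for illustration; the Gateway may instead try each key in turn when verifying a signature.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_SECONDS = 3600  # the 1-hour grace period described above


@dataclass
class KeyState:
    current_key_id: str
    previous_key_id: Optional[str] = None
    rotated_at: Optional[int] = None  # Unix seconds of the last rotation


def key_is_accepted(state: KeyState, key_id: str, now: int) -> bool:
    """The current key is always accepted; the previous key only during the grace period."""
    if key_id == state.current_key_id:
        return True
    if key_id == state.previous_key_id and state.rotated_at is not None:
        return now < state.rotated_at + GRACE_SECONDS
    return False


def token_is_accepted(state: KeyState, key_id: str, expires_at: int, now: int) -> bool:
    """TTL is checked independently: an expired token fails even inside the grace period."""
    if now >= expires_at:
        return False
    return key_is_accepted(state, key_id, now)


state = KeyState(current_key_id="k2", previous_key_id="k1", rotated_at=1000)
print(token_is_accepted(state, "k1", expires_at=2000, now=1500))   # old key, inside grace and TTL
print(token_is_accepted(state, "k1", expires_at=1200, now=1300))   # TTL elapsed
print(token_is_accepted(state, "k1", expires_at=10000, now=4600))  # grace period over
```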

Rotating via the UI

1. Open the Profile tab.
2. Under Token Authority (admin only), click 🔄 Rotate Signing Key.
3. A confirmation shows the grace period expiry time.

Rotating via the API

POST /v1/rotate_key
Authorization: Bearer <api_key>

{
  "status": "rotated",
  "grace_until": "2025-01-01 13:00:00",
  "grace_seconds": 3600,
  "message": "New key active. Previous key valid until 13:00:00 (60 min grace period)."
}

Rotation is logged to the audit table with the grace period expiry timestamp.

Memory System

Sentinel Gateway provides a shared, persistent memory for agents — a two-tier store that all agents for an instance can read, with writes restricted to agents that hold the memory_write scope.

Two tiers

	Short-term	Long-term
Purpose	Operational context: activity from the last 7 days and tasks scheduled for the next 7 days	Persistent goals, preferences, and ongoing state
Char limit	3,000	10,000
Read	All agents — no scope required	All agents — no scope required
Write	`memory_write` scope required	`memory_write` scope required
Write strategy	Replace — agent writes full desired content	Replace — agent writes full desired content

How agents learn that memory exists

When an agent calls POST /v1/issue_token, the response includes a memory_status field, so no separate request is needed to find out whether memory holds anything:

{
  "payload": { ... },
  "signature": "...",
  "memory_status": {
    "short_term": "has_content",
    "long_term": "empty"
  }
}

The agent uses this to decide whether a GET /v1/memory call is worth making. If both tiers are "empty" the agent skips the read entirely — zero wasted tokens.
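
A small sketch of that decision, using the memory_status values shown above. tiers_to_read is a hypothetical helper name.

```python
from typing import Dict, List


def tiers_to_read(memory_status: Dict[str, str]) -> List[str]:
    """Return the tiers worth fetching, i.e. those reported as has_content."""
    return [
        tier
        for tier in ("short_term", "long_term")
        if memory_status.get(tier) == "has_content"
    ]


print(tiers_to_read({"short_term": "has_content", "long_term": "empty"}))
print(tiers_to_read({"short_term": "empty", "long_term": "empty"}))
```

An empty list means the agent can skip GET /v1/memory entirely.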

Reading memory

GET /v1/memory?type=short_term
GET /v1/memory?type=long_term
Authorization: Bearer <api_key>

No token required — all agents may read. type must be short_term or long_term.

{
  "type": "short_term",
  "content": "...",
  "updated_at": 1234567890,
  "updated_by": "SchedulerAgent",
  "char_limit": 3000
}

Writing memory

POST /v1/memory
Authorization: Bearer <api_key>

{
  "type": "short_term",
  "content": "Full replacement content here...",
  "token": { ...token object with memory_write in scope... }
}

The token must have memory_write in its scope. Each write replaces the full content — read first if you need to preserve existing content.

{
  "status": "ok",
  "type": "short_term",
  "chars": 842,
  "char_limit": 3000
}

If the content exceeds the character limit the write is rejected:

{
  "status": "error",
  "reason": "Content exceeds 3000 character limit (3247 chars)"
}
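
Because each write replaces the whole tier and oversized content is rejected, a client may want to merge and check content locally before sending it. The sketch below is one way to do that. The helper names are hypothetical, dropping the oldest lines first is a design choice made here rather than Gateway behaviour, and the server is assumed to count characters the same way Python's len() does.

```python
CHAR_LIMITS = {"short_term": 3000, "long_term": 10000}  # limits from the table above


def merge_for_replace(existing: str, note: str, memory_type: str) -> str:
    """Append a note to existing content, dropping the oldest lines if over the limit."""
    limit = CHAR_LIMITS[memory_type]
    lines = [line for line in existing.splitlines() if line] + [note]
    while len(lines) > 1 and len("\n".join(lines)) > limit:
        lines.pop(0)  # oldest line goes first
    return "\n".join(lines)


def validate_memory_write(memory_type: str, content: str) -> dict:
    """Build the type/content part of the request body, or raise before sending."""
    if memory_type not in CHAR_LIMITS:
        raise ValueError("type must be 'short_term' or 'long_term'")
    limit = CHAR_LIMITS[memory_type]
    if len(content) > limit:
        raise ValueError(f"Content exceeds {limit} character limit ({len(content)} chars)")
    return {"type": memory_type, "content": content}


merged = merge_for_replace("Fetched report A.", "Next: process report B.", "short_term")
print(validate_memory_write("short_term", merged))
```

The returned dictionary still needs the token field added before it is posted to /v1/memory.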

Viewing memory in the UI

The 🧠 Memory tab shows both tiers with:

Character usage bar (used / limit)
Full content (read-only display)
Last updated timestamp and the agent that wrote it
Clear button per tier (admin and user)

Built-in Claude memory_write tool

When running a built-in Claude session with memory_write in scope, Claude can call the memory_write tool directly:

Tool: memory_write
  type:    "short_term" | "long_term"
  content: "Full replacement content"

The tool confirms with the character count on success or returns an error if the limit is exceeded.

Database schema

CREATE TABLE user_memory (
    memory_type  TEXT PRIMARY KEY,   -- 'short_term' | 'long_term'
    content      TEXT DEFAULT '',
    char_limit   INTEGER DEFAULT 3000,
    updated_at   INTEGER DEFAULT 0,
    updated_by   TEXT DEFAULT ''
);

API Reference

All endpoints require Authorization: Bearer <your_agent_api_key>.

GET /

Health check.

{"service": "Sentinel Gateway", "version": "2.0", "status": "running"}

POST /v1/issue_token

Get a signed token for a specific instruction and scope.

Request:

{
  "prompt_id": "uuid-string",
  "scope": ["file_read", "web_read"],
  "expires_in": 600
}

Response:

{
  "payload": {
    "prompt_id": "...", "user_id": "...", "scope": [...],
    "issued_at": 1234567890, "expires_at": 1234568490, "nonce": "..."
  },
  "signature": "hex-string",
  "memory_status": {
    "short_term": "empty",
    "long_term": "has_content"
  }
}

memory_status is included at no extra cost so agents can decide whether to call GET /v1/memory without a separate round-trip.
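
A client can use the payload timestamps to decide when to request a fresh token. The sketch below assumes issued_at and expires_at are Unix seconds, which matches the example above (they differ by the requested expires_in of 600). The helper names and the 30-second margin are hypothetical.

```python
import time
from typing import Optional


def seconds_remaining(token: dict, now: Optional[float] = None) -> float:
    """Seconds until the token's expires_at, never negative."""
    now = time.time() if now is None else now
    return max(0, token["payload"]["expires_at"] - now)


def needs_refresh(token: dict, margin: float = 30, now: Optional[float] = None) -> bool:
    """True when the token is expired or about to expire within the margin."""
    return seconds_remaining(token, now) <= margin


token = {"payload": {"issued_at": 1234567890, "expires_at": 1234568490}, "signature": "..."}
print(seconds_remaining(token, now=1234567890))  # full lifetime left at issue time
print(needs_refresh(token, now=1234568470))      # 20 seconds left, inside the margin
```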

POST /v1/submit_instruction

Register a verified instruction against a token.

Request:

{
  "instruction": "Read https://example.com and summarise it",
  "token": { ...token object from issue_token... }
}

Response:

{"status": "accepted", "prompt_id": "..."}

POST /v1/request_action

Execute a token-gated action. Token nonce is consumed on use.

Request:

{
  "prompt_id": "...",
  "action_type": "web_read",
  "action_params": {"url": "https://example.com"},
  "token": { ...token object... }
}

Response:

{"status": "permitted", "action": "web_read", "result": "..."}
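
Single-use nonces are what make a replayed token fail. The following is a minimal in-memory sketch of the idea, not the Gateway's storage: NonceRegistry is a hypothetical class, and a real deployment would need persistence shared across processes.

```python
class NonceRegistry:
    """Tracks nonces that have already been spent."""

    def __init__(self) -> None:
        self._used = set()

    def consume(self, nonce: str) -> bool:
        """Return True the first time a nonce is seen, False on any replay."""
        if nonce in self._used:
            return False
        self._used.add(nonce)
        return True


registry = NonceRegistry()
print(registry.consume("abc123"))  # first use is accepted
print(registry.consume("abc123"))  # replay is refused
```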

GET /v1/memory

Read a memory tier. No scope restriction — all agents may read.

GET /v1/memory?type=short_term
GET /v1/memory?type=long_term

POST /v1/memory

Write a memory tier. Requires memory_write in token scope.

POST /v1/rotate_key

Rotate the signing key. The previous key remains valid for a 1-hour grace period. Admin agents only; restrict at the infrastructure level if needed.

GET /v1/roles

List all agent role templates.

GET /v1/audit

Retrieve recent audit log entries.

GET /v1/prompt/{prompt_id}

Retrieve a registered prompt by ID.

Full Flow — Python Example

import uuid

import requests

BASE    = "http://localhost:8000"
API_KEY = "sgk-your-api-key-here"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# 1. Issue token — memory_status included in response
prompt_id = str(uuid.uuid4())
token_resp = requests.post(f"{BASE}/v1/issue_token", headers=HEADERS, json={
    "prompt_id": prompt_id,
    "scope": ["web_read", "memory_write"],
    "expires_in": 300
}).json()

token         = {"payload": token_resp["payload"], "signature": token_resp["signature"]}
memory_status = token_resp["memory_status"]

# 2. Read memory only if it has content
if memory_status["long_term"] == "has_content":
    mem = requests.get(f"{BASE}/v1/memory?type=long_term", headers=HEADERS).json()
    print("Long-term memory:", mem["content"])

# 3. Submit instruction
requests.post(f"{BASE}/v1/submit_instruction", headers=HEADERS, json={
    "instruction": "Read https://example.com and return the title",
    "token": token
})

# 4. Request action
result = requests.post(f"{BASE}/v1/request_action", headers=HEADERS, json={
    "prompt_id": prompt_id,
    "action_type": "web_read",
    "action_params": {"url": "https://example.com"},
    "token": token
}).json()

print(result["result"])

# 5. Update short-term memory
requests.post(f"{BASE}/v1/memory", headers=HEADERS, json={
    "type": "short_term",
    "content": "Fetched example.com title on 2025-01-01. Next: process report.",
    "token": token
})

Full Flow — JavaScript Example

const BASE    = "http://localhost:8000";
const API_KEY = "sgk-your-api-key-here";
const headers = { "Authorization": `Bearer ${API_KEY}`,
                  "Content-Type": "application/json" };
const promptId = crypto.randomUUID();

// 1. Issue token — memory_status included in response
const tokenResp = await fetch(`${BASE}/v1/issue_token`, {
  method: "POST", headers,
  body: JSON.stringify({ prompt_id: promptId,
                         scope: ["web_read", "memory_write"], expires_in: 300 })
}).then(r => r.json());

const token        = { payload: tokenResp.payload, signature: tokenResp.signature };
const memoryStatus = tokenResp.memory_status;

// 2. Read memory only if it has content
if (memoryStatus.long_term === "has_content") {
  const mem = await fetch(`${BASE}/v1/memory?type=long_term`, { headers }).then(r => r.json());
  console.log("Long-term memory:", mem.content);
}

// 3. Submit instruction
await fetch(`${BASE}/v1/submit_instruction`, {
  method: "POST", headers,
  body: JSON.stringify({ instruction: "Read https://example.com and return the title", token })
});

// 4. Request action
const result = await fetch(`${BASE}/v1/request_action`, {
  method: "POST", headers,
  body: JSON.stringify({ prompt_id: promptId, action_type: "web_read",
                         action_params: { url: "https://example.com" }, token })
}).then(r => r.json());

console.log(result.result);

// 5. Update short-term memory
await fetch(`${BASE}/v1/memory`, {
  method: "POST", headers,
  body: JSON.stringify({
    type: "short_term",
    content: "Fetched example.com title on 2025-01-01. Next: process report.",
    token
  })
});

Actions Reference

Action	Type	Requires
`file_read`	Real	Nothing
`file_write`	Real	Nothing
`file_list`	Real	Nothing
`file_delete`	Real	Nothing
`web_read`	Real	Nothing
`web_write`	Real (Claude) / Signal (agents)	`playwright install chromium`
`email_read`	Real	Gmail app password in Settings
`email_write`	Real	Gmail app password in Settings
`mark_calendar`	Real	Google Calendar OAuth in Settings
`calculate`	Real	`simpleeval` (included in requirements)
`query_database`	Real	Connection string in Settings — SELECT queries only
`schedule_task`	Permission signal	Nothing
`agent_talk`	Real	Nothing — `agent_url` → `agent_id` → `agent_name` priority
`memory_write`	Real	`memory_write` scope in token — reads require no scope

Settings Reference

Setting	Location	Purpose
Anthropic API Key	Sidebar	Required for Claude sessions and scheduled tasks
Gmail credentials	Settings tab	Enables `email_read` and `email_write`
Google Calendar OAuth	Settings tab	Enables `mark_calendar`
Database connection string	Settings tab	Enables `query_database` (SELECT only)
Screenshot save path	Settings tab	Directory for `web_write` screenshots — defaults to system temp directory if blank

agent_talk Usage

agent_talk connects to internal registered agents or external agents by URL. Priority order:

1. agent_url: POSTs {"message": "..."} to the URL and returns the response. Use for any agent reachable over HTTP.
2. agent_id: looks up a registered agent by UUID. Works for active and inactive agents.
3. agent_name: looks up an active registered agent by name.

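# Reuses BASE, HEADERS, and prompt_id from the Full Flow Python example.
# The token must include agent_talk in its scope (the Full Flow token does not).

# External agent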
result = requests.post(f"{BASE}/v1/request_action", headers=HEADERS, json={
    "prompt_id": prompt_id,
    "action_type": "agent_talk",
    "action_params": {
        "agent_url": "http://localhost:9000/run",
        "message": "Summarise the latest report"
    },
    "token": token
}).json()

# Internal agent lookup
result = requests.post(f"{BASE}/v1/request_action", headers=HEADERS, json={
    "prompt_id": prompt_id,
    "action_type": "agent_talk",
    "action_params": {"agent_name": "DataBot"},
    "token": token
}).json()
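
The priority order listed above can be expressed as a short lookup. This is a sketch of the selection rule only. resolve_agent_target is a hypothetical helper and does not perform the HTTP call or registry lookup.

```python
from typing import Tuple

PRIORITY = ("agent_url", "agent_id", "agent_name")  # highest priority first


def resolve_agent_target(action_params: dict) -> Tuple[str, str]:
    """Return the (key, value) pair that wins under the documented priority."""
    for key in PRIORITY:
        value = action_params.get(key)
        if value:
            return key, value
    raise ValueError("action_params needs agent_url, agent_id, or agent_name")


print(resolve_agent_target({"agent_name": "DataBot", "agent_url": "http://localhost:9000/run"}))
print(resolve_agent_target({"agent_name": "DataBot"}))
```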

Error Reference

Error	Reason
401 Unauthorized	Invalid or unregistered API key
403 Forbidden	Requested scope exceeds agent ceiling
`status: blocked` — Invalid signature	Token was tampered with or previous key grace period has expired
`status: blocked` — Token has expired	TTL elapsed — issue a new token
`status: blocked` — Nonce already used	Token was replayed — issue a new token
`status: blocked` — Not in scope	Action not in token scope
`status: blocked` — 'memory_write' not in token scope	Agent attempted memory write without `memory_write` scope
`status: error` — Content exceeds N character limit	Memory write rejected — content too long
400 — type must be 'short_term' or 'long_term'	Invalid `type` parameter on memory endpoints
404 — Prompt ID not found	`submit_instruction` was not called first
`[query_database] Blocked: only SELECT queries are permitted`	Write/DDL query attempted — only SELECT allowed
`[calculate] simpleeval not installed`	Run `pip install simpleeval`
`[agent_talk] External agent unreachable`	URL provided but agent did not respond — check endpoint
`[agent_talk] No registered agent found`	`agent_id` or `agent_name` not in registry or agent is inactive
`[memory_write] Content exceeds N character limit`	Built-in Claude memory_write tool — content too long
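
A client-side sketch for turning these responses into exceptions. It assumes blocked responses carry their message in a "reason" field, as the memory error example does. The exception classes and helper names are hypothetical.

```python
class GatewayBlocked(Exception):
    """The Gateway refused the action (status: blocked)."""


class GatewayError(Exception):
    """The Gateway reported an error (status: error)."""


# Reasons the table above says are resolved by issuing a new token.
REISSUE_MARKERS = ("Token has expired", "Nonce already used")


def check_result(body: dict) -> dict:
    """Return the body unchanged on success, raise on blocked or error."""
    status = body.get("status")
    reason = body.get("reason", "")  # assumed field name for blocked responses
    if status == "blocked":
        raise GatewayBlocked(reason)
    if status == "error":
        raise GatewayError(reason)
    return body


def should_reissue_token(reason: str) -> bool:
    """True when the failure is one that a fresh token is expected to fix."""
    return any(marker in reason for marker in REISSUE_MARKERS)


try:
    check_result({"status": "blocked", "reason": "Token has expired"})
except GatewayBlocked as exc:
    print("blocked:", exc, "| reissue:", should_reissue_token(str(exc)))
```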
