Security Supervisor
scarecr0w12 edited this page Jun 18, 2026 · 1 revision
CortexPrism implements a three-layer LLM-based access control system to protect sensitive data from unauthorized agent access. The security supervisor runs alongside the existing Parallax policy validator, adding an intelligent review layer specifically for sensitive data operations.
┌─────────────────────────────────────────────────────────┐
│ Agent Tool Execution Flow │
└─────────────────────────────────────────────────────────┘
Agent requests sensitive data
│
↓
┌─────────────────────────────────────────────────────────┐
│ Layer 1: Data Classification │
│ - Check sensitivity level of requested data │
│ - Levels: PUBLIC, NORMAL, SENSITIVE, SECRET │
│ - Pattern-based detection: │
│ • SECRET: passwords, API keys, tokens, SSNs, │
│ credit cards, private keys │
│ • SENSITIVE: email, phone, addresses, │
│ confidential markers │
│ • Default: non-empty = sensitive │
└─────────────────────────────────────────────────────────┘
│
├─→ PUBLIC/NORMAL → Allow (no gate)
│
└─→ SENSITIVE/SECRET ↓
┌─────────────────────────────────────────────────────────┐
│ Layer 2: LLM Supervisor │
│ - Fast model (Gemini 2.0 Flash, GPT-4o Mini) │
│ - Decision caching (1-hour session TTL) │
│ - Confidence scoring (0.0-1.0) │
│ - Reviews: agent intent, data sensitivity, │
│ operational context, risk assessment │
│ - Automatic human escalation for low confidence │
└─────────────────────────────────────────────────────────┘
│
├─→ ALLOW (confidence > threshold) → grant access
│
└─→ DENY or low confidence ↓
┌─────────────────────────────────────────────────────────┐
│ Layer 3: Human Approval │
│ - CLI: Interactive color-coded prompt │
│ with reasoning, sample data preview │
│ - Web UI: Modal with supervisor reasoning, │
│ data preview, approve/deny buttons │
│ - Temporary grant: 1-hour TTL per session+tool │
│ - Timeout after 60s → auto-deny │
└─────────────────────────────────────────────────────────┘
│
├─→ Approve → cache grant, allow access
│
└─→ Deny (or timeout) → reject access
Automatic sensitivity detection using pattern matching:
- SECRET patterns: Passwords (`password`, `passwd`, `pwd`), API keys (`sk-`, `api_key`, `token`), SSNs (`\d{3}-\d{2}-\d{4}`), credit cards (`\d{4}[\s-]\d{4}[\s-]\d{4}[\s-]\d{4}`), private keys (`-----BEGIN.*PRIVATE KEY-----`)
- SENSITIVE patterns: Email addresses, phone numbers, physical addresses, confidential markers (`confidential`, `internal`, `proprietary`)
- Default approach: Non-empty data is assumed sensitive until classified otherwise
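
The pattern matching above can be sketched roughly as follows. This is a minimal illustration, not the CortexPrism implementation: `classify`, `PatternRule`, and `RULES` are hypothetical names, the email regex is an assumed simplification, phone and address patterns are omitted because the source does not specify them, and treating empty content as NORMAL is an assumption.

```typescript
// Sketch only: type and function names are hypothetical, not CortexPrism APIs.
type Sensitivity = "PUBLIC" | "NORMAL" | "SENSITIVE" | "SECRET";

interface PatternRule {
  level: Sensitivity;
  pattern: RegExp;
  description: string;
}

// SECRET rules come first so they take precedence over SENSITIVE matches.
const RULES: PatternRule[] = [
  { level: "SECRET", pattern: /password|passwd|pwd/i, description: "password keyword" },
  { level: "SECRET", pattern: /sk-|api_key|token/i, description: "API key or token" },
  { level: "SECRET", pattern: /\d{3}-\d{2}-\d{4}/, description: "SSN" },
  { level: "SECRET", pattern: /\d{4}[\s-]\d{4}[\s-]\d{4}[\s-]\d{4}/, description: "credit card" },
  { level: "SECRET", pattern: /-----BEGIN.*PRIVATE KEY-----/, description: "private key" },
  // Assumed, simplified email pattern; the source does not give one.
  { level: "SENSITIVE", pattern: /[^\s@]+@[^\s@]+\.[^\s@]+/, description: "email address" },
  { level: "SENSITIVE", pattern: /confidential|internal|proprietary/i, description: "confidential marker" },
];

interface Classification {
  level: Sensitivity;
  reason: string;
}

function classify(content: string): Classification {
  if (content.trim() === "") {
    // Assumption: empty content is treated as NORMAL.
    return { level: "NORMAL", reason: "empty content" };
  }
  for (const rule of RULES) {
    if (rule.pattern.test(content)) {
      return { level: rule.level, reason: rule.description };
    }
  }
  // Documented default: non-empty data is sensitive until classified otherwise.
  return { level: "SENSITIVE", reason: "default: non-empty data" };
}
```

Ordering the rules from most to least sensitive keeps the first match authoritative, so content containing both an email address and a password is classified as SECRET.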
The Layer 2 LLM supervisor reviews each request for SENSITIVE or SECRET data:
- Uses a fast, cheap model (Gemini 2.0 Flash or GPT-4o Mini) for rapid review
- Decision caching: decisions are cached per session (1-hour TTL), which prevents repeated approval prompts
- Confidence scoring: an ALLOW decision with confidence above the threshold grants access automatically; a DENY or low-confidence decision escalates to a human
- Cost optimization: cached decisions avoid repeated LLM calls
- Configurable threshold (`confidenceThreshold`, default 0.7)
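
As a rough sketch of how a TTL-bound decision cache and the threshold check might fit together: `DecisionCache`, `SupervisorDecision`, and `route` are hypothetical names, and keying the cache by session and tool is an assumption borrowed from the "per session+tool" wording in the flow diagram. The strict `>` comparison follows the diagram's "confidence > threshold".

```typescript
// Sketch only: names and cache key shape are assumptions, not CortexPrism APIs.
type Verdict = "ALLOW" | "DENY";

interface SupervisorDecision {
  verdict: Verdict;
  confidence: number; // 0.0-1.0
  reasoning: string;
}

class DecisionCache {
  private entries = new Map<string, { decision: SupervisorDecision; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be exercised without waiting an hour.
  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
  }

  get(sessionId: string, tool: string): SupervisorDecision | undefined {
    const key = `${sessionId}:${tool}`;
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired entries are dropped so the next request triggers a fresh review.
      this.entries.delete(key);
      return undefined;
    }
    return entry.decision;
  }

  set(sessionId: string, tool: string, decision: SupervisorDecision): void {
    const key = `${sessionId}:${tool}`;
    this.entries.set(key, { decision, expiresAt: this.now() + this.ttlMs });
  }
}

// ALLOW above the threshold grants access; everything else goes to a human.
function route(decision: SupervisorDecision, threshold = 0.7): "grant" | "escalate" {
  return decision.verdict === "ALLOW" && decision.confidence > threshold
    ? "grant"
    : "escalate";
}
```

A caller would check `cache.get(sessionId, tool)` before invoking the model and store the result with `cache.set(...)` afterwards, so a repeated request within the TTL costs no extra LLM call.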
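Layer 3 human approval handles requests that the supervisor denies or cannot approve with enough confidence:
- CLI: Color-coded interactive prompt showing what data is requested, why the agent wants it, and the supervisor's reasoning
- Web UI: Modal dialog with the supervisor's reasoning, a sample data preview (truncated for privacy), and approve/deny buttons
- Temporary grants: Approved access is cached per session and tool (1-hour TTL) to prevent approval fatigue
- Timeout guard: If no human responds within 60 seconds, the request is automatically denied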
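
The timeout guard can be illustrated with a small promise race. This is a sketch under assumptions: `awaitApproval` and `HumanAnswer` are hypothetical names, and the real prompt (CLI or Web UI) is represented only as a promise that resolves when the human answers.

```typescript
// Sketch only: names are hypothetical; the prompt is modeled as a plain promise.
type HumanAnswer = "approve" | "deny";

function awaitApproval(
  prompt: Promise<HumanAnswer>,
  timeoutMs = 60_000, // 60-second guard from the description above
): Promise<HumanAnswer> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<HumanAnswer>((resolve) => {
    // No answer in time resolves to "deny" instead of rejecting.
    timer = setTimeout(() => resolve("deny"), timeoutMs);
  });
  // Whichever settles first wins; the timer is cleared so it cannot keep the process alive.
  return Promise.race([prompt, timeout]).finally(() => clearTimeout(timer));
}
```

Resolving to `"deny"` on timeout, instead of throwing, lets the caller handle an explicit denial and a timeout through the same code path.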
The following tools trigger security supervisor review when accessing sensitive data:
| Tool | Gate Condition |
|---|---|
| `memory_search` | Results classified as SENSITIVE or SECRET |
| `db_query` | Query targets tables with sensitivity columns |
| `browser` | Screenshot/snapshot may capture sensitive UI |
| `computer` | Screenshot may capture sensitive desktop content |
| `web_fetch` | Fetched content matches sensitive patterns |
Sensitivity metadata is stored across all databases:
| Database | Tables with Sensitivity |
|---|---|
| `cortex.db` | `sessions`, `agents` |
| `memory.db` | `episodic_memory`, `semantic_memory`, `reflection_memory`, `graph_entities` |
| `lens.db` | `lens_events` (audit logs) |
A one-time backfill migration classifies all existing data on first run.
{
"securitySupervisor": {
"enabled": true,
"provider": "google",
"model": "gemini-2.0-flash",
"cacheTTL": 3600,
"confidenceThreshold": 0.7
},
"classification": {
"levels": ["SECRET", "SENSITIVE", "NORMAL", "PUBLIC"],
"customPatterns": [
{ "level": "SECRET", "pattern": "my-company-secret-\\d+", "description": "Internal secrets" }
]
}
}

| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `true` | Enable/disable the security supervisor |
| `provider` | string | `"google"` | LLM provider for the supervisor model |
| `model` | string | `"gemini-2.0-flash"` | Fast model for review decisions |
| `cacheTTL` | number | `3600` | Decision cache TTL in seconds (1 hour) |
| `confidenceThreshold` | number | `0.7` | Minimum confidence for auto-approval |
| `classification.levels` | string[] | Default 4 levels | Custom classification levels |
| `classification.customPatterns` | object[] | `[]` | Additional regex patterns for classification |
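
For readers wiring this configuration into their own tooling, the fields above could be modeled and validated along these lines. The field names and defaults come from the table; the interface names, `resolveSupervisorConfig`, `compileCustomPatterns`, and the validation rules (threshold within 0.0-1.0, non-negative TTL, pattern level must be a configured level) are assumptions for illustration.

```typescript
// Sketch only: helper names and validation rules are assumptions.
interface SupervisorConfig {
  enabled: boolean;
  provider: string;
  model: string;
  cacheTTL: number; // seconds
  confidenceThreshold: number; // 0.0-1.0
}

interface CustomPattern {
  level: string;
  pattern: string;
  description: string;
}

// Defaults as listed in the table above.
const SUPERVISOR_DEFAULTS: SupervisorConfig = {
  enabled: true,
  provider: "google",
  model: "gemini-2.0-flash",
  cacheTTL: 3600,
  confidenceThreshold: 0.7,
};

function resolveSupervisorConfig(overrides: Partial<SupervisorConfig> = {}): SupervisorConfig {
  const config = { ...SUPERVISOR_DEFAULTS, ...overrides };
  if (!(config.confidenceThreshold >= 0 && config.confidenceThreshold <= 1)) {
    throw new RangeError("confidenceThreshold must be within 0.0-1.0");
  }
  if (!(config.cacheTTL >= 0)) {
    throw new RangeError("cacheTTL must be a non-negative number of seconds");
  }
  return config;
}

function compileCustomPatterns(
  patterns: CustomPattern[],
  levels: string[],
): { level: string; regex: RegExp }[] {
  return patterns.map((p) => {
    if (!levels.includes(p.level)) {
      throw new Error(`unknown classification level: ${p.level}`);
    }
    // new RegExp throws a SyntaxError if the pattern string is invalid.
    return { level: p.level, regex: new RegExp(p.pattern) };
  });
}
```

Compiling custom patterns up front surfaces an invalid regex when the configuration is loaded, not when the first tool call is classified.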
The security supervisor is configurable in the Settings → Security Supervisor tab:
- Enable/disable toggle
- Provider and model selection
- Cache TTL slider
- Classification level management
- Custom pattern editor
- Cache inspection (live decision cache entries)
- Decision history browser
API endpoints:
- `GET /api/security/supervisor`: current configuration
- `PUT /api/security/supervisor`: update configuration
- `GET /api/security/supervisor/cache`: inspect decision cache
- `DELETE /api/security/supervisor/cache`: clear decision cache
- `GET /api/security/supervisor/history`: review past decisions
- `GET /api/security/classification`: classification configuration
- `PUT /api/security/classification`: update classification settings
- `POST /api/security/classification/test`: test classification on sample content
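
A client might build calls to two of these endpoints as sketched below. The paths and methods are taken from the list above; the request body shape (`{ content }`), the base URL, and the helper names are assumptions, since the page does not document request or response schemas.

```typescript
// Sketch only: body shape, base URL, and helper names are assumptions.
interface ApiCall {
  url: string;
  init: { method: string; headers: Record<string, string>; body?: string };
}

// Builds a call to the classification test endpoint for a sample string.
function classificationTestCall(baseUrl: string, content: string): ApiCall {
  return {
    url: new URL("/api/security/classification/test", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ content }), // assumed payload shape
    },
  };
}

// Builds a call that clears the supervisor's decision cache.
function clearDecisionCacheCall(baseUrl: string): ApiCall {
  return {
    url: new URL("/api/security/supervisor/cache", baseUrl).toString(),
    init: { method: "DELETE", headers: {} },
  };
}

// Usage (hypothetical host and port):
//   const { url, init } = classificationTestCall("http://localhost:3000", "pwd=abc");
//   const response = await fetch(url, init);
```

Separating request construction from the `fetch` call keeps the helpers easy to test without a running server.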
- Security — Parallax policy validator and overall security model
- Built-in Tools — Tool catalog with security gates documented
- Agent Loop — How the supervisor integrates into agent turn processing
CortexPrism — Open-source agentic AI harness · MIT License · Built with Deno 2.x + TypeScript