Security Supervisor

Security Supervisor System

Overview

CortexPrism implements a three-layer access control system (pattern-based data classification, LLM review, and human approval) to protect sensitive data from unauthorized agent access. The security supervisor runs alongside the existing Parallax policy validator and adds a review layer specifically for sensitive data operations.

Architecture

┌─────────────────────────────────────────────────────────┐
│              Agent Tool Execution Flow                   │
└─────────────────────────────────────────────────────────┘

Agent requests sensitive data
        │
        ↓
┌─────────────────────────────────────────────────────────┐
│  Layer 1: Data Classification                            │
│  - Check sensitivity level of requested data             │
│  - Levels: PUBLIC, NORMAL, SENSITIVE, SECRET             │
│  - Pattern-based detection:                              │
│    • SECRET: passwords, API keys, tokens, SSNs,          │
│      credit cards, private keys                          │
│    • SENSITIVE: email, phone, addresses,                 │
│      confidential markers                                │
│    • Default: non-empty = sensitive                      │
└─────────────────────────────────────────────────────────┘
        │
        ├─→ PUBLIC/NORMAL → Allow (no gate)
        │
        └─→ SENSITIVE/SECRET ↓
                              
┌─────────────────────────────────────────────────────────┐
│  Layer 2: LLM Supervisor                                 │
│  - Fast model (Gemini 2.0 Flash, GPT-4o Mini)           │
│  - Decision caching (1-hour session TTL)                │
│  - Confidence scoring (0.0-1.0)                         │
│  - Reviews: agent intent, data sensitivity,             │
│    operational context, risk assessment                  │
│  - Automatic human escalation for low confidence        │
└─────────────────────────────────────────────────────────┘
        │
        ├─→ ALLOW (confidence > threshold) → grant access
        │
        └─→ DENY or low confidence ↓
                                     
┌─────────────────────────────────────────────────────────┐
│  Layer 3: Human Approval                                │
│  - CLI: Interactive color-coded prompt                  │
│    with reasoning, sample data preview                  │
│  - Web UI: Modal with supervisor reasoning,             │
│    data preview, approve/deny buttons                   │
│  - Temporary grant: 1-hour TTL per session+tool         │
│  - Timeout after 60s → auto-deny                        │
└─────────────────────────────────────────────────────────┘
        │
        ├─→ Approve → cache grant, allow access
        │
        └─→ Deny (or timeout) → reject access
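
The routing in the diagram can be summarized as a small decision function. This is a minimal sketch under stated assumptions, not CortexPrism's implementation: `route_request`, `SupervisorDecision`, and the `supervise` and `ask_human` callbacks are hypothetical names, and a confidence exactly equal to the threshold is treated as insufficient, following the "confidence > threshold" label in the diagram.

```python
from dataclasses import dataclass
from typing import Callable

# Levels that pass through the gate (Layer 1 lets PUBLIC/NORMAL through).
GATED_LEVELS = {"SENSITIVE", "SECRET"}


@dataclass
class SupervisorDecision:
    verdict: str       # "ALLOW" or "DENY"
    confidence: float  # 0.0-1.0


def route_request(
    level: str,
    supervise: Callable[[], SupervisorDecision],  # hypothetical Layer 2 callback
    ask_human: Callable[[SupervisorDecision], bool],  # hypothetical Layer 3 callback
    threshold: float = 0.7,
) -> str:
    # Layer 1: ungated levels are allowed without further review.
    if level not in GATED_LEVELS:
        return "allow"
    # Layer 2: a confident ALLOW grants access without a human prompt.
    decision = supervise()
    if decision.verdict == "ALLOW" and decision.confidence > threshold:
        return "allow"
    # Layer 3: DENY or low confidence falls through to a human;
    # the callback returns False on an explicit deny or a timeout.
    return "allow" if ask_human(decision) else "deny"
```

Passing the two later layers as callbacks keeps the routing testable without a model or a user in the loop.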

Key Features

Data Classification Engine

Automatic sensitivity detection using pattern matching:

SECRET patterns: Passwords (password, passwd, pwd), API keys (sk-, api_key, token), SSNs (\d{3}-\d{2}-\d{4}), credit cards (\d{4}[\s-]\d{4}[\s-]\d{4}[\s-]\d{4}), private keys (-----BEGIN.*PRIVATE KEY-----)
SENSITIVE patterns: Email addresses, phone numbers, physical addresses, confidential markers (confidential, internal, proprietary)
Default approach: Non-empty data is assumed sensitive until classified otherwise
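
The pattern matching described above might look roughly like the following. It is a sketch only: the SECRET regexes for SSNs, credit cards, and private keys are taken from the list above, while the keyword matching, the email regex, the `classify` function name, and the choice to return NORMAL for empty input are assumptions made for illustration. Phone numbers and physical addresses are left out because the source does not give patterns for them.

```python
import re
from typing import Tuple

# SECRET patterns, checked first (the regexes for SSN, card, and key come from the docs).
SECRET_PATTERNS = [
    ("password keyword", re.compile(r"password|passwd|pwd", re.IGNORECASE)),
    ("API key or token", re.compile(r"sk-|api_key|token", re.IGNORECASE)),
    ("SSN", re.compile(r"\d{3}-\d{2}-\d{4}")),
    ("credit card", re.compile(r"\d{4}[\s-]\d{4}[\s-]\d{4}[\s-]\d{4}")),
    ("private key", re.compile(r"-----BEGIN.*PRIVATE KEY-----")),
]

# SENSITIVE patterns; the email regex is a simplified assumption.
SENSITIVE_PATTERNS = [
    ("email address", re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")),
    ("confidential marker", re.compile(r"confidential|internal|proprietary", re.IGNORECASE)),
]


def classify(content: str) -> Tuple[str, str]:
    """Return (level, reason) for a piece of text."""
    if not content.strip():
        # Assumption: empty content is not gated.
        return ("NORMAL", "empty")
    for reason, pattern in SECRET_PATTERNS:
        if pattern.search(content):
            return ("SECRET", reason)
    for reason, pattern in SENSITIVE_PATTERNS:
        if pattern.search(content):
            return ("SENSITIVE", reason)
    # Default approach: non-empty data is assumed sensitive.
    return ("SENSITIVE", "default")
```

Returning the reason alongside the level makes it possible to tell a pattern hit from the conservative default, which is useful when tuning custom patterns.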

LLM Supervisor

Uses a fast, cheap model (Gemini 2.0 Flash or GPT-4o Mini) for rapid review
Decision caching per session (1-hour TTL) prevents repeated approval prompts
Confidence scoring: an ALLOW decision with confidence above the threshold grants access automatically; a DENY or low-confidence decision escalates to human approval
Cost optimization: cached decisions avoid repeated LLM calls
Configurable threshold (confidenceThreshold, default 0.7)
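
A per-session decision cache with a TTL could be sketched as below. The `DecisionCache` class, its `(session_id, request_key)` cache key, and the injectable clock are illustrative assumptions; the source only states that decisions are cached per session with a 1-hour TTL.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DecisionCache:
    """In-memory cache of supervisor verdicts that expire after a TTL."""

    def __init__(
        self,
        ttl_seconds: float = 3600,  # matches the documented cacheTTL default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # (session_id, request_key) -> (expires_at, verdict)
        self._entries: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def put(self, session_id: str, request_key: str, verdict: str) -> None:
        self._entries[(session_id, request_key)] = (self._clock() + self._ttl, verdict)

    def get(self, session_id: str, request_key: str) -> Optional[str]:
        key = (session_id, request_key)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, verdict = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the next request is reviewed again.
            del self._entries[key]
            return None
        return verdict
```

A cache hit skips the LLM call entirely; a miss or an expired entry sends the request back through review.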

Human Approval Flows

CLI: Color-coded interactive prompts showing what data, why the agent wants it, and the supervisor's reasoning
Web UI: Modal dialog with sample data preview (truncated for privacy), approve/deny buttons
Temporary grants: Approved access is cached per session and tool (1-hour TTL) to prevent approval fatigue
Timeout guard: 60-second timeout auto-denies if no human response
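
The timeout guard amounts to "no answer means deny". One way to sketch it, assuming the CLI or Web UI pushes the human's answer onto a queue (the `wait_for_approval` name, the queue transport, and the `"approve"` string are hypothetical):

```python
import queue


def wait_for_approval(responses: "queue.Queue[str]", timeout_seconds: float = 60.0) -> bool:
    """Block until a human answers; treat silence as a denial."""
    try:
        answer = responses.get(timeout=timeout_seconds)
    except queue.Empty:
        # No response within the window: auto-deny.
        return False
    # Anything other than an explicit approval is a denial.
    return answer == "approve"
```

Defaulting to denial on every path except an explicit approval keeps the gate fail-closed.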

Gated Tools

The following tools trigger security supervisor review when accessing sensitive data:

Tool	Gate Condition
`memory_search`	Results classified as SENSITIVE or SECRET
`db_query`	Query targets tables with sensitivity columns
`browser`	Screenshot/snapshot may capture sensitive UI
`computer`	Screenshot may capture sensitive desktop content
`web_fetch`	Fetched content matches sensitive patterns

Database Sensitivity

Sensitivity metadata is stored across all databases:

Database	Tables with Sensitivity
`cortex.db`	`sessions`, `agents`
`memory.db`	`episodic_memory`, `semantic_memory`, `reflection_memory`, `graph_entities`
`lens.db`	`lens_events` (audit logs)

A one-time backfill migration classifies all existing data on first run.
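
A backfill of this kind could be sketched with SQLite as follows. The `sensitivity` column name, the `content` text column used in the usage example, and the `backfill_sensitivity` helper are assumptions for illustration; the actual schema is not described in this page.

```python
import sqlite3
from typing import Callable


def backfill_sensitivity(
    conn: sqlite3.Connection,
    table: str,        # trusted identifier, not user input
    text_column: str,  # trusted identifier, not user input
    classify: Callable[[str], str],
) -> int:
    """Classify rows that have no sensitivity yet; return how many were updated."""
    # Add the (assumed) sensitivity column if the table does not have it.
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if "sensitivity" not in columns:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN sensitivity TEXT")
    rows = conn.execute(
        f"SELECT rowid, {text_column} FROM {table} WHERE sensitivity IS NULL"
    ).fetchall()
    for rowid, text in rows:
        conn.execute(
            f"UPDATE {table} SET sensitivity = ? WHERE rowid = ?",
            (classify(text or ""), rowid),
        )
    conn.commit()
    return len(rows)
```

Because only rows with a NULL sensitivity are touched, running the function a second time is a no-op, which is what makes it safe as a one-time migration step.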

Configuration

{
  "securitySupervisor": {
    "enabled": true,
    "provider": "google",
    "model": "gemini-2.0-flash",
    "cacheTTL": 3600,
    "confidenceThreshold": 0.7
  },
  "classification": {
    "levels": ["SECRET", "SENSITIVE", "NORMAL", "PUBLIC"],
    "customPatterns": [
      { "level": "SECRET", "pattern": "my-company-secret-\\d+", "description": "Internal secrets" }
    ]
  }
}

Configuration Options

Field	Type	Default	Description
`enabled`	boolean	`true`	Enable/disable the security supervisor
`provider`	string	`"google"`	LLM provider for the supervisor model
`model`	string	`"gemini-2.0-flash"`	Fast model for review decisions
`cacheTTL`	number	`3600`	Decision cache TTL in seconds (1 hour)
`confidenceThreshold`	number	`0.7`	Minimum confidence for auto-approval
`classification.levels`	string[]	Default 4 levels	Custom classification levels
`classification.customPatterns`	object[]	`[]`	Additional regex patterns for classification
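
To show how the defaults in the table combine with a partial configuration, here is a hedged sketch of a loader. `load_supervisor_config` is a hypothetical helper, and the validation rules (threshold within 0.0-1.0, custom pattern levels must be declared) are assumptions inferred from the field descriptions rather than documented behavior.

```python
import json
import re

# Defaults as listed in the Configuration Options table.
SUPERVISOR_DEFAULTS = {
    "enabled": True,
    "provider": "google",
    "model": "gemini-2.0-flash",
    "cacheTTL": 3600,
    "confidenceThreshold": 0.7,
}
DEFAULT_LEVELS = ["SECRET", "SENSITIVE", "NORMAL", "PUBLIC"]


def load_supervisor_config(raw_json: str) -> dict:
    doc = json.loads(raw_json)
    # User-supplied values override the defaults field by field.
    supervisor = {**SUPERVISOR_DEFAULTS, **doc.get("securitySupervisor", {})}
    if not 0.0 <= supervisor["confidenceThreshold"] <= 1.0:
        raise ValueError("confidenceThreshold must be between 0.0 and 1.0")

    classification = doc.get("classification", {})
    levels = classification.get("levels", DEFAULT_LEVELS)
    compiled = []
    for entry in classification.get("customPatterns", []):
        if entry["level"] not in levels:
            raise ValueError(f"unknown level: {entry['level']}")
        # Compiling up front surfaces invalid regexes at load time.
        compiled.append((entry["level"], re.compile(entry["pattern"])))
    return {"securitySupervisor": supervisor, "levels": levels, "customPatterns": compiled}
```

Note that a regex such as `my-company-secret-\d+` needs its backslash doubled inside JSON, as in the configuration example above.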

Web UI

The security supervisor is configurable in the Settings → Security Supervisor tab:

Enable/disable toggle
Provider and model selection
Cache TTL slider
Classification level management
Custom pattern editor
Cache inspection (live decision cache entries)
Decision history browser

API endpoints:

GET /api/security/supervisor — current configuration
PUT /api/security/supervisor — update configuration
GET /api/security/supervisor/cache — inspect decision cache
DELETE /api/security/supervisor/cache — clear decision cache
GET /api/security/supervisor/history — review past decisions
GET /api/security/classification — classification configuration
PUT /api/security/classification — update classification settings
POST /api/security/classification/test — test classification on sample content
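
As a rough illustration of calling these endpoints from a script, the sketch below only builds the HTTP requests using Python's standard library. The base URL, the absence of authentication, and the assumption that PUT bodies are JSON objects mirroring the configuration fields are not confirmed by this page; `supervisor_request` is a hypothetical helper.

```python
import json
import urllib.request
from typing import Optional


def supervisor_request(
    base_url: str,  # assumed, e.g. wherever the CortexPrism web server listens
    method: str,
    path: str,
    payload: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a request for one of the endpoints listed above."""
    data = None
    headers = {}
    if payload is not None:
        # Assumption: request bodies are JSON.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=data,
        headers=headers,
        method=method,
    )


# Clear the decision cache.
clear_cache = supervisor_request(
    "http://localhost:8080", "DELETE", "/api/security/supervisor/cache"
)

# Raise the auto-approval threshold (body shape is an assumption).
update_config = supervisor_request(
    "http://localhost:8080", "PUT", "/api/security/supervisor",
    {"confidenceThreshold": 0.8},
)
```

Either request can then be sent with `urllib.request.urlopen(...)` against a running instance.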
