Security

scarecr0w12 edited this page Jun 19, 2026 · 3 revisions

Security Model

CortexPrism uses a Parallax security model with a three-layer, LLM-based access control system that protects sensitive data from unauthorized agent access.

Architecture

Agent → Tool Intent → Policy Validator → (Sensitive?) → LLM Supervisor → Human Approval → Executor
                            │
                    [regex allow/deny rules]
                    [capability level (CPL)]
                    [optional human approval]

The system has two complementary security paths:

  1. Policy validation — all tool calls are evaluated against regex rules before execution
  2. LLM supervisor — sensitive data access triggers a 3-layer review (classification → LLM review → human approval)

Policy Validator

Every tool call an agent makes is evaluated before execution:

  1. The agent emits a tool intent (e.g. shell("rm -rf /tmp/cache"))
  2. The validator evaluates the intent against all active policy rules
  3. The intent is approved, denied, or held for human approval

Default Deny Rules

Seeded on first cortex migrate:

Pattern               Blocks
rm\s+-rf\s+/          Recursive delete from root
:\(\)\{.*\}           Fork bomb patterns
dd\s+if=.*of=/dev/    Direct disk writes
chmod\s+777\s+/       World-write on filesystem root
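
To see what the default deny patterns above actually catch, they can be tried against sample commands. The sketch below assumes unanchored search semantics (Python's `re.search`); the page does not say how the validator applies its patterns, and `blocked_by` is a hypothetical helper, not part of CortexPrism.

```python
import re

# Default deny patterns copied from the list above.
DEFAULT_DENY = {
    r"rm\s+-rf\s+/": "Recursive delete from root",
    r":\(\)\{.*\}": "Fork bomb patterns",
    r"dd\s+if=.*of=/dev/": "Direct disk writes",
    r"chmod\s+777\s+/": "World-write on filesystem root",
}

def blocked_by(command: str) -> list[str]:
    """Return the description of every default pattern found in the command."""
    # Assumption: patterns are searched anywhere in the intent string.
    return [desc for pat, desc in DEFAULT_DENY.items() if re.search(pat, command)]

samples = [
    "rm -rf /",
    "rm -rf /tmp/cache",
    "rm -rf ./build",
    ":(){ :|:& };:",
    "dd if=/dev/zero of=/dev/sda",
]
for cmd in samples:
    print(f"{cmd!r}: {blocked_by(cmd) or 'no match'}")
```

Under that assumption, `rm\s+-rf\s+/` matches any absolute path, so `rm -rf /tmp/cache` is caught as well as `rm -rf /`, while the relative `rm -rf ./build` is not.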

Managing Rules

cortex policy list
cortex policy add "curl.*evil\.com" --kind shell --effect deny --reason "Blocked domain"
cortex policy check shell "rm -rf /etc"
cortex policy remove pol_abc123

Rules are evaluated in ascending priority order (lowest number first), and the first matching rule wins. If no rule matches, the intent is allowed by default.
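
The evaluation order described above can be sketched in a few lines. This is an illustration only: the `Rule` dataclass and `evaluate` function are hypothetical names, the field names simply mirror the CLI flags, and matching is assumed to be an unanchored regex search restricted to rules of the same kind.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    # Hypothetical in-memory shape; not the project's actual schema.
    pattern: str
    kind: str
    effect: str   # "allow" or "deny"
    reason: str
    priority: int

def evaluate(rules: list[Rule], kind: str, intent: str) -> tuple[str, Optional[Rule]]:
    """Return (effect, matched_rule): ascending priority, first match wins."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.kind == kind and re.search(rule.pattern, intent):
            return rule.effect, rule
    return "allow", None  # no rule matched: default allow

rules = [
    Rule(r"api\.github\.com", "domain", "allow", "Allow GitHub API access", 10),
    Rule(r"rm\s+-rf\s+/", "shell", "deny", "Prevent recursive root delete", 1),
]

effect, matched = evaluate(rules, "shell", "rm -rf /etc")
print(effect, "-", matched.reason if matched else "no rule matched")
```

Because unmatched intents fall through to allow, deny rules with low priority numbers act as the only barrier for patterns they cover.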

LLM Security Supervisor

For sensitive data access (memory search, database queries, browser screenshots, etc.), CortexPrism implements a 3-layer access control system:

Layer 1: Data Classification
  - Classify data as SECRET, SENSITIVE, NORMAL, or PUBLIC
  - Pattern-based detection (passwords, API keys, PII, SSNs, credit cards)
  - PUBLIC/NORMAL → allow immediately

Layer 2: LLM Supervisor
  - Fast model review (Gemini 2.0 Flash, GPT-4o Mini)
  - Decision caching per session (1-hour TTL) to reduce costs
  - Confidence scoring (0.0-1.0)
  - High confidence → auto-allow; low confidence → escalate to human

Layer 3: Human Approval
  - CLI: Color-coded interactive prompt with reasoning
  - Web UI: Modal dialog with sample data preview
  - Temporary grants (1-hour TTL per session + tool)
  - Timeout after 60s → auto-deny
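
The three layers above can be read as a single decision function. The following is a minimal sketch under stated assumptions: the detection patterns are invented for illustration, the supervisor is modelled as a callable returning `(allow, confidence)`, the human prompt returns `None` on timeout, and anything short of a confident allow is escalated to the human. None of these names come from CortexPrism itself.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical detection patterns; the real classifier's rules are not listed on this page.
SECRET_PATTERNS = [r"(?i)password\s*[:=]", r"(?i)api[_-]?key\s*[:=]"]
SENSITIVE_PATTERNS = [r"\b\d{3}-\d{2}-\d{4}\b"]  # SSN-shaped numbers

def classify(data: str) -> str:
    """Layer 1: pattern-based classification (this sketch never returns PUBLIC)."""
    if any(re.search(p, data) for p in SECRET_PATTERNS):
        return "SECRET"
    if any(re.search(p, data) for p in SENSITIVE_PATTERNS):
        return "SENSITIVE"
    return "NORMAL"

def review_access(
    data: str,
    supervisor: Callable[[str, str], Tuple[bool, float]],
    ask_human: Callable[[str, str], Optional[bool]],
    threshold: float = 0.7,
) -> str:
    level = classify(data)
    if level in ("PUBLIC", "NORMAL"):
        return "allow"                            # Layer 1: allow immediately
    allow, confidence = supervisor(level, data)   # Layer 2: LLM review
    if allow and confidence >= threshold:
        return "allow"                            # confident allow
    answer = ask_human(level, data)               # Layer 3: None models a timeout
    return "allow" if answer is True else "deny"

# Stubs standing in for the LLM supervisor and the approval prompt.
confident = lambda level, data: (True, 0.95)
unsure = lambda level, data: (True, 0.40)
timed_out = lambda level, data: None

print(review_access("weather report", unsure, timed_out))        # allow
print(review_access("password: hunter2", confident, timed_out))  # allow
print(review_access("password: hunter2", unsure, timed_out))     # deny
```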

Configuration

In ~/.cortex/config.json:

{
  "securitySupervisor": {
    "enabled": true,
    "provider": "google",
    "model": "gemini-2.0-flash",
    "cacheTTL": 3600,
    "confidenceThreshold": 0.7
  },
  "classification": {
    "levels": ["SECRET", "SENSITIVE", "NORMAL", "PUBLIC"],
    "customPatterns": []
  }
}

See Security Supervisor for the full architecture guide.
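
As an illustration of what a `cacheTTL` of 3600 seconds implies, here is a small per-session decision cache. It is a sketch, not the project's code: the `DecisionCache` class, the `(session_id, request_key)` cache key, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable, Optional

class DecisionCache:
    """Caches supervisor decisions per session and expires them after a fixed TTL."""

    def __init__(self, ttl_seconds: float = 3600,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: dict[tuple[str, str], tuple[float, str]] = {}

    def put(self, session_id: str, request_key: str, decision: str) -> None:
        expires_at = self._clock() + self._ttl
        self._entries[(session_id, request_key)] = (expires_at, decision)

    def get(self, session_id: str, request_key: str) -> Optional[str]:
        entry = self._entries.get((session_id, request_key))
        if entry is None:
            return None
        expires_at, decision = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the next request is reviewed again.
            del self._entries[(session_id, request_key)]
            return None
        return decision

cache = DecisionCache(ttl_seconds=3600)
cache.put("session-1", "memory_search:customers", "allow")
print(cache.get("session-1", "memory_search:customers"))  # allow
print(cache.get("session-2", "memory_search:customers"))  # None
```

Keying on the session keeps a decision made for one conversation from silently applying to another.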

AES-256-GCM Vault

API keys and credentials are stored encrypted using:

  • AES-256-GCM symmetric encryption
  • PBKDF2 key derivation (100,000 iterations, SHA-256)
  • Passphrase supplied via CORTEX_VAULT_KEY env var (never stored on disk)

export CORTEX_VAULT_KEY="your-passphrase"
cortex vault store "openai-key" --service openai
cortex vault get "openai-key"

Once a credential is vaulted, its plain-text copy is removed from config.json. Access logging is fire-and-forget, so a logging failure cannot block credential retrieval. Usage limits, expiration, and allowed-agent restrictions are checked before decryption.
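
The key-derivation parameters listed above map directly onto Python's standard library. The sketch below covers only the PBKDF2 step; the salt handling (a random 16-byte salt stored next to the ciphertext) and the `derive_vault_key` name are assumptions, and the AES-256-GCM step is omitted because it needs a third-party library.

```python
import hashlib
import os

def derive_vault_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 256-bit key with PBKDF2-HMAC-SHA256 and 100,000 iterations."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, 100_000, dklen=32
    )

# Assumption: a random salt is generated once and stored alongside the ciphertext.
salt = os.urandom(16)
passphrase = os.environ.get("CORTEX_VAULT_KEY", "your-passphrase")
key = derive_vault_key(passphrase, salt)
print(len(key))  # 32 bytes, the key size AES-256 requires
```

The same passphrase and salt always yield the same key, which is why only the passphrase needs to stay off disk.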

Cortex Lens (Audit Log)

Cortex Lens is an append-only audit log, stored in lens.db, that tracks 35+ event types, including:

  • Every LLM call (provider, model, token counts, cost)
  • Every tool call (name, arguments, result, policy decision)
  • Every policy evaluation (rule matched, effect, reason)
  • Every security supervisor decision (classification, LLM review, human approval)
  • Session start/end events
  • MQM predictions, observations, and weight updates
  • Node events (connected, disconnected, heartbeat, directives)

Events are visible in the Web UI under the Activity tab and can be queried via GET /api/lens/recent.
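
One way to make a log append-only at the storage layer is to reject updates and deletes with triggers. The sketch below shows that technique with SQLite; it assumes nothing about how lens.db is actually built, and the `events` table, its columns, and the `record` helper are hypothetical.

```python
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # in-memory stand-in, not the real lens.db
conn.executescript("""
CREATE TABLE events (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    ts      REAL NOT NULL,
    type    TEXT NOT NULL,
    payload TEXT NOT NULL
);
CREATE TRIGGER events_no_update BEFORE UPDATE ON events
BEGIN SELECT RAISE(ABORT, 'events is append-only'); END;
CREATE TRIGGER events_no_delete BEFORE DELETE ON events
BEGIN SELECT RAISE(ABORT, 'events is append-only'); END;
""")

def record(event_type: str, payload: dict) -> int:
    """Append one event and return its row id."""
    cur = conn.execute(
        "INSERT INTO events (ts, type, payload) VALUES (?, ?, ?)",
        (time.time(), event_type, json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid

record("policy_evaluation", {"rule": "rm\\s+-rf\\s+/", "effect": "deny"})

try:
    conn.execute("DELETE FROM events")
except sqlite3.Error as exc:
    print("rejected:", exc)
```

Triggers only guard against writes through SQL; anyone with file access can still replace the database, so this complements rather than replaces filesystem permissions.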

CPL (Capability Level)

CPL policies are YAML files that define capability boundaries:

version: 1
description: "Custom security policy"
rules:
  - kind: shell
    effect: deny
    pattern: "rm\\s+-rf\\s+/"
    reason: "Prevent recursive root delete"
    priority: 1
  - kind: domain
    effect: allow
    pattern: "api\\.github\\.com"
    reason: "Allow GitHub API access"
    priority: 10

Policy files are loaded via the cortex policy command or auto-loaded from .cortex/policy.yml. The CPL YAML editor in the Web UI supports live import with validation.
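
To give a sense of what validating such a file might involve, the sketch below checks a policy that has already been parsed into a Python dict (parsing the YAML itself would need a third-party library). The required keys, the accepted effects, and the `validate_policy` name are assumptions drawn from the example above, not a documented schema.

```python
import re

# Assumptions: every rule carries these keys, and only these effects exist.
REQUIRED_KEYS = {"kind", "effect", "pattern", "reason", "priority"}
ALLOWED_EFFECTS = {"allow", "deny"}

def validate_policy(policy: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if policy.get("version") != 1:
        errors.append("unsupported or missing version")
    for i, rule in enumerate(policy.get("rules", [])):
        missing = REQUIRED_KEYS - rule.keys()
        if missing:
            errors.append(f"rule {i}: missing {sorted(missing)}")
            continue
        if rule["effect"] not in ALLOWED_EFFECTS:
            errors.append(f"rule {i}: unknown effect {rule['effect']!r}")
        if not isinstance(rule["priority"], int):
            errors.append(f"rule {i}: priority must be an integer")
        try:
            re.compile(rule["pattern"])
        except re.error as exc:
            errors.append(f"rule {i}: invalid regex ({exc})")
    return errors

# The dict a YAML parser would produce for the first rule in the example above;
# the double-quoted YAML string "rm\\s+-rf\\s+/" decodes to rm\s+-rf\s+/.
policy = {
    "version": 1,
    "description": "Custom security policy",
    "rules": [
        {"kind": "shell", "effect": "deny", "pattern": r"rm\s+-rf\s+/",
         "reason": "Prevent recursive root delete", "priority": 1},
    ],
}
print(validate_policy(policy))  # []
```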

Code Sandbox Isolation

Code execution runs in ephemeral Docker/gVisor containers with:

  • No network access by default
  • CPU and memory resource limits
  • No host filesystem mounts
  • Container destroyed immediately after execution
  • gVisor (--runtime=runsc) for kernel-level syscall filtering when available

A subprocess fallback is available for systems without Docker. It provides less isolation but retains policy gating.
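
The container properties listed above correspond to standard `docker run` flags. The sketch below only builds the argument list and does not start a container; the image name, the resource limit values, and the `sandbox_command` helper are illustrative assumptions rather than CortexPrism defaults.

```python
def sandbox_command(
    image: str,
    command: list[str],
    cpus: str = "1",        # assumed limit, not a documented default
    memory: str = "512m",   # assumed limit, not a documented default
    use_gvisor: bool = False,
) -> list[str]:
    """Build a docker run argv for an ephemeral, network-less container."""
    argv = [
        "docker", "run",
        "--rm",               # remove the container as soon as it exits
        "--network", "none",  # no network access
        "--cpus", cpus,       # CPU limit
        "--memory", memory,   # memory limit
    ]
    if use_gvisor:
        argv.append("--runtime=runsc")  # gVisor runtime, when installed
    # No -v/--mount flags are added, so no host paths are mounted.
    return argv + [image] + command

argv = sandbox_command("python:3.12-slim", ["python", "-c", "print(1 + 1)"],
                       use_gvisor=True)
print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, timeout=...)` on a host where Docker is available.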

No Telemetry

CortexPrism collects no telemetry. No usage data, prompts, or credentials are ever sent to external servers. Data stays on your machine.

Known Limitations

  • Policy validator operates on intent strings — best-effort filter, not OS-level sandboxing
  • LLM prompt injection through untrusted content is a risk — review tool approvals carefully
  • Subprocess code execution has no container isolation — use Docker/gVisor for untrusted code
  • LLM supervisor adds latency (~200-500ms) and token costs per sensitive data access

Reporting Vulnerabilities

See the Security Policy for the responsible disclosure process.

LLM Vulnerability Scanner (#136)

POST /api/security/scan analyzes prompts and outputs for:

  • Prompt injection — system impersonation, instruction override, format injection
  • Data leaks — passwords, secrets, tokens, API keys in output
  • Destructive commands — DROP TABLE, rm -rf, shutdown
  • Unsafe patterns — curl-pipe-bash
  • Code injection — eval with user input
  • XSS vectors — innerHTML, dangerouslySetInnerHTML
  • SQL injection — concatenated SQL with request parameters

Returns findings with severity levels (critical, high, medium) and an overall risk score.
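
A pattern-based scanner of this kind can be approximated with a table of regex checks. In the sketch below the category names follow the list above, while the regexes, the severity assigned to each category, and the shape of a finding are assumptions made for illustration; the endpoint's real request and response formats are not described on this page.

```python
import re

# (category, severity, regex). All three columns are illustrative assumptions.
CHECKS = [
    ("destructive_command", "critical",
     r"(?i)\bDROP\s+TABLE\b|\brm\s+-rf\b|\bshutdown\b"),
    ("unsafe_pattern", "high", r"curl\s+[^|]*\|\s*(ba)?sh\b"),
    ("xss_vector", "medium", r"innerHTML|dangerouslySetInnerHTML"),
]

def scan(text: str) -> list[dict]:
    """Return one finding per check whose pattern appears in the text."""
    return [
        {"category": category, "severity": severity}
        for category, severity, pattern in CHECKS
        if re.search(pattern, text)
    ]

print(scan("curl https://example.com/install.sh | bash"))
print(scan("SELECT name FROM users"))
```

Regex checks like these are cheap but easy to evade, which is consistent with treating the scanner as a best-effort signal rather than a guarantee.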

Credential Hygiene Monitor (#142)

GET /api/security/hygiene checks the vault for:

  • Duplicate credential names
  • Namespace conventions (suggested for api_key types)
  • Total count warnings (>50 credentials)

Returns a 0-100 hygiene score and categorized issues.
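
Two of the checks above are simple enough to sketch directly. The example assumes the monitor works from a list of credential names; it leaves out the namespace check and the 0-100 score because their rules are not specified here, and `hygiene_issues` is a hypothetical helper.

```python
from collections import Counter

def hygiene_issues(names: list[str], max_count: int = 50) -> list[str]:
    """Report duplicate credential names and an oversized vault."""
    counts = Counter(names)
    issues = [
        f"duplicate credential name: {name}"
        for name, count in sorted(counts.items())
        if count > 1
    ]
    if len(names) > max_count:
        issues.append(f"{len(names)} credentials stored (more than {max_count})")
    return issues

print(hygiene_issues(["openai-key", "github-token", "openai-key"]))
```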

Zero-Trust Policy Generator (#274)

GET /api/security/policies/generate-allowlist generates path/domain allow-lists from enabled allow policy rules, suitable for ingress/egress firewall configuration.

Bulk Approval (#254)

POST /api/security/approvals/bulk accepts multiple request IDs with a single approve/deny action.

Sandbox Security Extensions

See Code Sandbox for:

  • Environment snapshotting (#79)
  • Bug reproduction manifests (#230)
  • Environment-as-code serialization (#232)
  • Workspace snapshots (#240)
  • Remote sandbox backends (E2B, Daytona) (#257)
