Code Sandbox

CortexPrism edited this page Jun 17, 2026 · 1 revision

CortexPrism executes code in isolated environments to protect the host system from potentially harmful or buggy code generated by LLMs.

Docker Runtime (Recommended)

docker run --rm \
  --network=none \
  --memory=256m \
  --cpus=0.5 \
  --pids-limit=64 \
  --security-opt=no-new-privileges \
  <image> <interpreter> /tmp/code.<ext>
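As a rough illustration, the command above could be assembled programmatically along these lines. This is only a sketch: buildDockerArgs, SandboxLimits, and DEFAULT_LIMITS are hypothetical names, not CortexPrism's actual API, and the interpreter name and file extension in the example are assumptions. How the code file reaches /tmp/code.<ext> inside the container is not covered here.

```typescript
// Hypothetical helper (not CortexPrism's real API): builds the argv for the
// `docker run` invocation shown above.
interface SandboxLimits {
  memory: string;    // value for --memory, e.g. "256m"
  cpus: number;      // value for --cpus
  pidsLimit: number; // value for --pids-limit
}

// Defaults mirror the flags documented in this section.
const DEFAULT_LIMITS: SandboxLimits = { memory: "256m", cpus: 0.5, pidsLimit: 64 };

function buildDockerArgs(
  image: string,
  interpreter: string,
  ext: string,
  limits: SandboxLimits = DEFAULT_LIMITS,
): string[] {
  return [
    "run",
    "--rm",                              // remove the container after it exits
    "--network=none",                    // no network access
    `--memory=${limits.memory}`,
    `--cpus=${limits.cpus}`,
    `--pids-limit=${limits.pidsLimit}`,
    "--security-opt=no-new-privileges",  // block privilege escalation
    image,
    interpreter,
    `/tmp/code.${ext}`,
  ];
}

// Example: argv for a Python snippet ("python" and "py" are assumed values).
const pythonArgs = buildDockerArgs("python:3-slim", "python", "py");
```

Returning an argument array rather than a single command string lets the caller pass it to a process-spawning API without going through a shell, which avoids quoting and injection problems.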

Security Properties

  • No network access (--network=none)
  • Resource limits — 256MB memory, 0.5 CPU, 64 PIDs max
  • No privilege escalation (--security-opt=no-new-privileges)
  • Ephemeral — Container destroyed immediately after execution (--rm)
  • No host mounts — No filesystem access to the host machine

Limits

  • Timeout: 30 seconds
  • Max output: 64KB
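One way the output cap could be enforced is sketched below. It assumes that 64KB means 64 * 1024 bytes and that oversized output is truncated rather than treated as a failure; neither detail is stated above. capOutput and the two constants are hypothetical names.

```typescript
// Assumed interpretation of the documented limits.
const MAX_OUTPUT_BYTES = 64 * 1024; // 64KB output cap
const TIMEOUT_MS = 30 * 1000;       // 30 second timeout

// Hypothetical helper: concatenates output chunks, keeping at most maxBytes.
function capOutput(
  chunks: Uint8Array[],
  maxBytes: number = MAX_OUTPUT_BYTES,
): { data: Uint8Array; truncated: boolean } {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const data = new Uint8Array(Math.min(total, maxBytes));
  let offset = 0;
  for (const chunk of chunks) {
    if (offset >= data.length) break;
    // subarray clamps its end index, so a short chunk is copied whole.
    const slice = chunk.subarray(0, data.length - offset);
    data.set(slice, offset);
    offset += slice.length;
  }
  return { data, truncated: total > maxBytes };
}
```

Working on bytes rather than string length keeps the cap accurate for multi-byte characters; the caller would decode the result once after capping.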

Subprocess Fallback

When Docker is not available (docker info fails), CortexPrism falls back to direct subprocess execution. This provides less isolation but retains policy gating through the security validator.

gVisor Support

When gVisor is installed, runInDocker passes --runtime=runsc for kernel-level syscall filtering. getAvailableRuntime() auto-detects whether gVisor is available, caches the result, and prefers gVisor over plain Docker.
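The selection order described in the two sections above (gVisor, then plain Docker, then subprocess) might look roughly like the following. This is not the real getAvailableRuntime(): createRuntimeDetector, RuntimeProbes, and runtimeFlags are hypothetical names, the probes are injected so the sketch stays independent of how Docker or gVisor is actually detected, and caching the whole decision (not just the gVisor check) is an assumption.

```typescript
type Runtime = "gvisor" | "docker" | "subprocess";

// Hypothetical probes; a real one might check whether `docker info` succeeds
// and whether the runsc runtime is registered with Docker.
interface RuntimeProbes {
  dockerAvailable: () => boolean;
  gvisorAvailable: () => boolean;
}

// Returns a detector that probes once and reuses the cached answer afterwards.
function createRuntimeDetector(probes: RuntimeProbes): () => Runtime {
  let cached: Runtime | undefined;
  return () => {
    if (cached !== undefined) return cached;
    let detected: Runtime;
    if (!probes.dockerAvailable()) {
      detected = "subprocess"; // weakest isolation, used only as a fallback
    } else {
      detected = probes.gvisorAvailable() ? "gvisor" : "docker";
    }
    cached = detected;
    return detected;
  };
}

// Extra `docker run` flags implied by the chosen runtime.
function runtimeFlags(runtime: Runtime): string[] {
  return runtime === "gvisor" ? ["--runtime=runsc"] : [];
}
```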

Supported Languages

Language     Docker Image
Python       python:3-slim
JavaScript   node:20-slim
TypeScript   node:20-slim (via tsx)
Bash         bash:latest
Ruby         ruby:3-slim
Go           golang:1-slim
Rust         rust:1-slim
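For illustration, the table above could be represented as a simple lookup. The image tags come straight from the table; SANDBOX_IMAGES and imageFor are hypothetical names, and the lowercase language keys are an assumption based on the "python" value used in the REST example.

```typescript
// Language key -> Docker image, copied from the table above.
const SANDBOX_IMAGES: Record<string, string> = {
  python: "python:3-slim",
  javascript: "node:20-slim",
  typescript: "node:20-slim", // executed via tsx
  bash: "bash:latest",
  ruby: "ruby:3-slim",
  go: "golang:1-slim",
  rust: "rust:1-slim",
};

// Hypothetical lookup: normalizes the name and rejects unsupported languages.
function imageFor(language: string): string {
  const key = language.trim().toLowerCase();
  // hasOwnProperty guards against inherited keys such as "constructor".
  const image = Object.prototype.hasOwnProperty.call(SANDBOX_IMAGES, key)
    ? SANDBOX_IMAGES[key]
    : undefined;
  if (image === undefined) {
    throw new Error(`Unsupported language: ${language}`);
  }
  return image;
}
```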

Auto-Fix Loop

When code execution fails, CortexPrism can automatically fix and retry:

runInSandbox(code)
  → exit != 0?
     → LLM: "Fix this error: <stderr>\n\nCode:\n<code>"
     → extract code from LLM response
     → runInSandbox(fixedCode)
     → repeat up to maxRounds (default 4)

Enable it with the --fix flag on cortex run, or configure it per session.
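The loop above can be sketched as follows. This is a minimal illustration, not CortexPrism's implementation: runWithAutoFix, extractCode, and RunResult are hypothetical names, the sandbox runner and LLM call are injected as plain synchronous callbacks for brevity (a real version would most likely be asynchronous), and the rule of taking the first fenced block from the LLM reply is an assumption.

```typescript
interface RunResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}

// Three backticks, built at runtime so this listing contains no literal fence.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "[^\\n]*\\n([\\s\\S]*?)" + FENCE);

// Assumed extraction rule: first fenced block, else the whole reply.
function extractCode(reply: string): string {
  const match = FENCED_BLOCK.exec(reply);
  return (match?.[1] ?? reply).trim();
}

// Runs the code, then asks the LLM for a fix and reruns, up to maxRounds times.
function runWithAutoFix(
  code: string,
  runInSandbox: (code: string) => RunResult,
  askLlm: (prompt: string) => string,
  maxRounds = 4,
): { result: RunResult; code: string; rounds: number } {
  let current = code;
  let result = runInSandbox(current);
  let rounds = 0;
  while (result.exitCode !== 0 && rounds < maxRounds) {
    const prompt = `Fix this error: ${result.stderr}\n\nCode:\n${current}`;
    current = extractCode(askLlm(prompt));
    result = runInSandbox(current);
    rounds += 1;
  }
  return { result, code: current, rounds };
}
```

Here maxRounds counts fix attempts after the initial run, so the default allows up to five executions in total; this matches the "up to 6 fix attempts" wording used for --max-fix 6, though the exact counting is an assumption.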

CLI

cortex run script.py                    # Docker sandbox
cortex run script.py --no-sandbox       # Subprocess mode
cortex run script.py --fix              # Auto-fix on failure
cortex run script.py --fix --max-fix 6  # Up to 6 fix attempts

Agent Tool

The code_exec tool lets agents execute code in the sandbox. The tool description explicitly warns that:

  • The sandbox has NO access to host files or workspace
  • No package managers are available in the sandbox
  • All file operations must go through the file tools

REST API

POST /api/code/exec
{
  "language": "python",
  "code": "print('hello')",
  "sandbox": true
}
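A client might prepare that request as sketched below. The body fields come from the example above; CodeExecRequest and buildExecRequest are hypothetical names, the Content-Type header is an assumption (it is the usual choice for JSON bodies), and the base URL in the comment is a placeholder. The response format is not documented here, so the sketch stops at building the request.

```typescript
// Request body shape, taken from the example above.
interface CodeExecRequest {
  language: string;
  code: string;
  sandbox: boolean;
}

// Hypothetical helper: builds options suitable for a fetch-style HTTP client.
function buildExecRequest(req: CodeExecRequest): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" }, // assumed header
    body: JSON.stringify(req),
  };
}

const execInit = buildExecRequest({
  language: "python",
  code: "print('hello')",
  sandbox: true,
});

// With a fetch-capable runtime and a placeholder base URL, the call would be:
//   const res = await fetch("http://localhost:3000/api/code/exec", execInit);
```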
