Code Sandbox

scarecr0w12 edited this page Jun 19, 2026 · 3 revisions

CortexPrism executes code in isolated environments to protect the host system from potentially harmful or buggy code generated by LLMs.

Docker Runtime (Recommended)

docker run --rm \
  --network=none \
  --memory=256m \
  --cpus=0.5 \
  --pids-limit=64 \
  --security-opt=no-new-privileges \
  <image> <interpreter> /tmp/code.<ext>

Security Properties

  • No network access (--network=none)
  • Resource limits — 256MB memory, 0.5 CPU, 64 PIDs max
  • No privilege escalation (--security-opt=no-new-privileges)
  • Ephemeral — Container destroyed immediately after execution (--rm)
  • No host mounts — No filesystem access to the host machine
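
The flags above can be assembled programmatically. The following is a minimal sketch, not CortexPrism's actual runInDocker: the names buildDockerArgs and DockerLimits are hypothetical, and the interpreter name in the usage line is an assumption. Only the flags themselves come from the command shown earlier.

```typescript
// Hypothetical option bag; the flag names mirror the documented docker run command.
interface DockerLimits {
  memory: string; // e.g. "256m"
  cpus: number; // e.g. 0.5
  pidsLimit: number; // e.g. 64
}

// Build the argument list passed to the `docker` binary.
function buildDockerArgs(
  image: string,
  command: string[],
  limits: DockerLimits,
): string[] {
  return [
    "run",
    "--rm", // container is removed after execution
    "--network=none", // no network access
    `--memory=${limits.memory}`,
    `--cpus=${limits.cpus}`,
    `--pids-limit=${limits.pidsLimit}`,
    "--security-opt=no-new-privileges", // no privilege escalation
    image,
    ...command,
  ];
}

// Usage: "python" as the interpreter is an assumption for illustration.
const args = buildDockerArgs("python:3-slim", ["python", "/tmp/code.py"], {
  memory: "256m",
  cpus: 0.5,
  pidsLimit: 64,
});
console.log(["docker", ...args].join(" "));
```

Passing the arguments as an array rather than a single shell string avoids quoting problems when the command is spawned without a shell.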

Limits

  • Timeout: 30 seconds
  • Max output: 64KB
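
One way to enforce the output cap is to clip captured output by byte length. This is a sketch under stated assumptions: truncateOutput is a hypothetical helper, and 64KB is taken to mean 64 × 1024 bytes, which the page does not specify.

```typescript
// Assumption: 64KB means 65536 bytes.
const MAX_OUTPUT_BYTES = 64 * 1024;

// Clip text to at most maxBytes of UTF-8 and report whether clipping happened.
function truncateOutput(
  text: string,
  maxBytes: number = MAX_OUTPUT_BYTES,
): { text: string; truncated: boolean } {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) {
    return { text, truncated: false };
  }
  // A multi-byte character cut in half decodes to U+FFFD instead of throwing.
  const clipped = new TextDecoder().decode(bytes.slice(0, maxBytes));
  return { text: clipped, truncated: true };
}
```

Measuring bytes rather than string length keeps the limit meaningful for non-ASCII output.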

Subprocess Fallback

When Docker is not available (docker info fails), CortexPrism falls back to direct subprocess execution. This provides less isolation but retains policy gating through the security validator.

gVisor Support

When gVisor is installed, runInDocker passes --runtime=runsc for kernel-level syscall filtering. getAvailableRuntime() auto-detects gVisor availability (cached result) and prefers it over plain Docker.
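
The detection order described here and in the fallback section (gVisor, then plain Docker, then subprocess) with a cached result could look roughly like this. It is an illustrative sketch, not the real getAvailableRuntime(): makeRuntimeDetector and the Probes interface are hypothetical, and the probes are injected so the logic stays testable without Docker installed.

```typescript
type Runtime = "gvisor" | "docker" | "subprocess";

// Hypothetical probes; a real implementation might shell out to `docker info`.
interface Probes {
  dockerAvailable: () => boolean;
  gvisorAvailable: () => boolean;
}

// Returns a detector that runs the probes once and caches the answer.
function makeRuntimeDetector(probes: Probes): () => Runtime {
  let cached: Runtime | undefined;
  return (): Runtime => {
    if (cached === undefined) {
      let detected: Runtime;
      if (!probes.dockerAvailable()) {
        detected = "subprocess"; // fallback when Docker is missing
      } else {
        detected = probes.gvisorAvailable() ? "gvisor" : "docker";
      }
      cached = detected;
      return detected;
    }
    return cached;
  };
}
```

Caching matters because probing Docker means spawning a process, which is too slow to repeat on every execution.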

Supported Languages

| Language   | Docker Image           |
|------------|------------------------|
| Python     | python:3-slim          |
| JavaScript | node:20-slim           |
| TypeScript | node:20-slim (via tsx) |
| Bash       | bash:latest            |
| Ruby       | ruby:3-slim            |
| Go         | golang:1-slim          |
| Rust       | rust:1-slim            |
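
The language table maps naturally onto a lookup structure. In this sketch the image names come from the table, while the file extensions, LanguageSpec, and resolveLanguage are assumptions added for illustration.

```typescript
interface LanguageSpec {
  image: string; // from the table above
  ext: string; // assumed extension for /tmp/code.<ext>
}

const LANGUAGES: Record<string, LanguageSpec> = {
  python: { image: "python:3-slim", ext: "py" },
  javascript: { image: "node:20-slim", ext: "js" },
  typescript: { image: "node:20-slim", ext: "ts" },
  bash: { image: "bash:latest", ext: "sh" },
  ruby: { image: "ruby:3-slim", ext: "rb" },
  go: { image: "golang:1-slim", ext: "go" },
  rust: { image: "rust:1-slim", ext: "rs" },
};

// Look up a language case-insensitively; reject anything not in the table.
function resolveLanguage(language: string): LanguageSpec {
  const spec = LANGUAGES[language.toLowerCase()];
  if (!spec) {
    throw new Error(`Unsupported language: ${language}`);
  }
  return spec;
}
```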

Auto-Fix Loop

When code execution fails, CortexPrism can automatically fix and retry:

runInSandbox(code)
  → exit != 0?
     → LLM: "Fix this error: <stderr>\n\nCode:\n<code>"
     → extract code from LLM response
     → runInSandbox(fixedCode)
     → repeat up to maxRounds (default 4)

Enable it with the --fix flag on cortex run, or configure it per session.
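
The loop above can be sketched in code. This is not CortexPrism's implementation: runWithAutoFix, extractCode, and the Runner and Fixer types are hypothetical, the sandbox and the LLM are injected as plain functions, and maxRounds is interpreted here as the maximum number of fix attempts after the first run.

```typescript
interface RunResult {
  exitCode: number;
  stdout: string;
  stderr: string;
}
type Runner = (code: string) => Promise<RunResult>; // stands in for runInSandbox
type Fixer = (prompt: string) => Promise<string>; // stands in for the LLM call

const FENCE = "`".repeat(3);

// Pull the first fenced block out of an LLM reply; fall back to the whole reply.
function extractCode(reply: string): string {
  const start = reply.indexOf(FENCE);
  if (start === -1) return reply.trim();
  const bodyStart = reply.indexOf("\n", start);
  if (bodyStart === -1) return reply.trim();
  const end = reply.indexOf(FENCE, bodyStart);
  const body = end === -1
    ? reply.slice(bodyStart + 1)
    : reply.slice(bodyStart + 1, end);
  return body.trim();
}

// Run the code, and on a non-zero exit ask the LLM for a fix and retry.
async function runWithAutoFix(
  code: string,
  run: Runner,
  askLlm: Fixer,
  maxRounds = 4,
): Promise<{ result: RunResult; code: string; rounds: number }> {
  let current = code;
  let result = await run(current);
  let rounds = 0;
  while (result.exitCode !== 0 && rounds < maxRounds) {
    const prompt = `Fix this error: ${result.stderr}\n\nCode:\n${current}`;
    current = extractCode(await askLlm(prompt));
    result = await run(current);
    rounds++;
  }
  return { result, code: current, rounds };
}
```

Returning the final code alongside the result lets the caller show the user what was actually executed after the fixes.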

CLI

cortex run script.py                    # Docker sandbox
cortex run script.py --no-sandbox       # Subprocess mode
cortex run script.py --fix              # Auto-fix on failure
cortex run script.py --fix --max-fix 6  # Up to 6 fix attempts

Agent Tool

The code_exec tool lets agents execute code in the sandbox. The tool description explicitly warns that:

  • The sandbox has NO access to host files or workspace
  • No package managers are available in the sandbox
  • File operations must go through the file tools instead

Sandbox Configuration

Configurable via ~/.cortex/config.json:

{
  "sandbox": {
    "runtime": "docker",
    "languages": ["python", "javascript", "typescript", "bash", "ruby", "go", "rust"],
    "timeout": 30000,
    "memoryLimit": "512m",
    "outputLimit": 102400
  }
}

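runtime accepts docker, gvisor (kernel-level syscall filtering via runsc), or subprocess. The timeout value appears to be in milliseconds (30000 matches the 30-second limit above) and outputLimit in bytes. Note that the memoryLimit and outputLimit in this example differ from the 256MB and 64KB values listed earlier.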
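
A typed loader can catch malformed configuration early. The sketch below is an assumption-laden illustration: SandboxConfig and parseSandboxConfig are hypothetical names, the units in the comments are inferred from the example values, and the memoryLimit pattern follows Docker's number-plus-unit format.

```typescript
type SandboxRuntime = "docker" | "gvisor" | "subprocess";

interface SandboxConfig {
  runtime: SandboxRuntime;
  languages: string[];
  timeout: number; // assumed milliseconds
  memoryLimit: string; // e.g. "512m"
  outputLimit: number; // assumed bytes
}

const RUNTIMES: SandboxRuntime[] = ["docker", "gvisor", "subprocess"];

// Parse the contents of config.json and validate the "sandbox" section.
function parseSandboxConfig(json: string): SandboxConfig {
  const sandbox = JSON.parse(json)?.sandbox;
  if (typeof sandbox !== "object" || sandbox === null) {
    throw new Error('Missing "sandbox" object');
  }
  const { runtime, languages, timeout, memoryLimit, outputLimit } = sandbox;
  if (RUNTIMES.indexOf(runtime) === -1) {
    throw new Error(`Invalid runtime: ${String(runtime)}`);
  }
  if (!Array.isArray(languages) || !languages.every((l) => typeof l === "string")) {
    throw new Error("languages must be an array of strings");
  }
  if (!Number.isInteger(timeout) || timeout <= 0) {
    throw new Error("timeout must be a positive integer");
  }
  if (typeof memoryLimit !== "string" || !/^\d+[bkmg]$/i.test(memoryLimit)) {
    throw new Error("memoryLimit must look like 512m");
  }
  if (!Number.isInteger(outputLimit) || outputLimit <= 0) {
    throw new Error("outputLimit must be a positive integer");
  }
  return { runtime, languages, timeout, memoryLimit, outputLimit };
}
```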

REST API

| Method | Path                     | Description            |
|--------|--------------------------|------------------------|
| POST   | /api/code/exec           | Execute code in sandbox |
| GET    | /api/sandbox/config      | Sandbox configuration  |
| PUT    | /api/sandbox/config      | Update sandbox config  |
| GET    | /api/sandbox/images      | Docker image list      |
| POST   | /api/sandbox/images/pull | Pull Docker image      |
| DELETE | /api/sandbox/images/:id  | Remove Docker image    |

Config persistence: PUT /api/sandbox/config persists the runtime, languages, timeout, and memory/output limits.
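
Calling the execution endpoint from a client might look like the sketch below. The request body shape (language and code fields), the base URL, and the buildExecRequest helper are all assumptions, since this page documents only the method and path.

```typescript
// Hypothetical request body; the real field names are not documented here.
interface ExecRequestBody {
  language: string;
  code: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Prepare a POST /api/code/exec request without sending it.
function buildExecRequest(baseUrl: string, body: ExecRequestBody): PreparedRequest {
  return {
    url: new URL("/api/code/exec", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage (the host and port are placeholders):
// const { url, init } = buildExecRequest("http://localhost:3000", {
//   language: "python",
//   code: "print('hi')",
// });
// const response = await fetch(url, init);
```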

See Also

Remote Backends (#257)

The sandbox supports multiple execution backends beyond Docker and subprocess:

| Backend    | Availability             | Description                      |
|------------|--------------------------|----------------------------------|
| Docker     | Always                   | Local Docker container (default) |
| Subprocess | Always                   | Native subprocess execution      |
| gVisor     | Requires install         | Container-aware sandbox          |
| E2B        | Requires E2B_API_KEY     | Cloud sandbox                    |
| Daytona    | Requires DAYTONA_API_KEY | Dev environments                 |

Backend availability is visible on the Settings page under Sandbox Backends. GET /api/sandbox/backends returns the full list with availability status.
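
The availability rules in the table can be expressed as a small function of the environment. This is a hedged sketch: listBackends and BackendStatus are hypothetical, and the actual response shape of GET /api/sandbox/backends is not documented on this page.

```typescript
interface BackendStatus {
  name: string;
  available: boolean;
  description: string;
}

// Derive availability from environment variables and a gVisor install flag.
function listBackends(
  env: Record<string, string | undefined>,
  gvisorInstalled: boolean,
): BackendStatus[] {
  // A key counts as present only if it is set and not blank.
  const hasKey = (key: string): boolean => (env[key] ?? "").trim().length > 0;
  return [
    { name: "Docker", available: true, description: "Local Docker container (default)" },
    { name: "Subprocess", available: true, description: "Native subprocess execution" },
    { name: "gVisor", available: gvisorInstalled, description: "Container-aware sandbox" },
    { name: "E2B", available: hasKey("E2B_API_KEY"), description: "Cloud sandbox" },
    { name: "Daytona", available: hasKey("DAYTONA_API_KEY"), description: "Dev environments" },
  ];
}
```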

Environment Snapshot (#79)

GET /api/sandbox/snapshot captures:

  • OS information (uname -a)
  • Deno version (deno --version)
  • Environment variables (Deno.env.toObject())

Bug Reproduction Studio (#230)

POST /api/sandbox/reproduce generates structured reproduction manifests:

{
  "manifest": {
    "kind": "reproduce",
    "issue": "description",
    "steps": ["step 1", "step 2", "step 3"],
    "sandbox": { "type": "docker", "image": "denoland/deno:latest" },
    "environment": { "deno": "deno 2.x" }
  }
}
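
A typed builder for this manifest shape might look as follows. The field names come from the JSON above; ReproManifest, buildManifest, and the validation rules are illustrative assumptions, not the endpoint's actual logic.

```typescript
interface ReproManifest {
  kind: "reproduce";
  issue: string;
  steps: string[];
  sandbox: { type: string; image: string };
  environment: Record<string, string>;
}

// Assemble a manifest, rejecting empty descriptions or step lists.
function buildManifest(
  issue: string,
  steps: string[],
  sandbox: { type: string; image: string },
  environment: Record<string, string>,
): { manifest: ReproManifest } {
  if (issue.trim() === "") {
    throw new Error("issue description is required");
  }
  if (steps.length === 0) {
    throw new Error("at least one reproduction step is required");
  }
  return {
    manifest: { kind: "reproduce", issue, steps: [...steps], sandbox, environment },
  };
}
```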

Dev Environment as Code (#232)

GET /api/sandbox/env-as-code serializes the full environment config:

  • Sandbox type and image
  • Provider configurations (with key presence, not values)
  • Web auth settings
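
Reporting key presence rather than key values can be done by mapping each provider to a boolean. In this sketch the ProviderConfig shape and the summarizeProviders name are hypothetical; only the idea of exposing presence instead of secrets comes from the list above.

```typescript
// Hypothetical provider config; real field names may differ.
interface ProviderConfig {
  apiKey?: string;
  model?: string;
}

interface ProviderSummary {
  hasKey: boolean;
  model: string | null;
}

// Replace each API key with a flag saying whether one is configured.
function summarizeProviders(
  providers: Record<string, ProviderConfig>,
): Record<string, ProviderSummary> {
  const summary: Record<string, ProviderSummary> = {};
  for (const [providerName, cfg] of Object.entries(providers)) {
    summary[providerName] = {
      hasKey: typeof cfg.apiKey === "string" && cfg.apiKey.length > 0,
      model: cfg.model ?? null,
    };
  }
  return summary;
}
```

Building a fresh object, instead of deleting fields from the original, means a newly added secret field cannot leak by accident.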

Workspace Snapshot (#240)

GET /api/sandbox/workspace-snapshot captures:

  • Working directory path
  • File tree (top 50 files with sizes and modification times)
  • Total file count and size
  • Session directory listing
  • Current git branch
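
The file-tree part of the snapshot can be sketched as a pure summary over a list of entries. FileEntry and summarizeFiles are hypothetical names, and because the page does not say how the "top 50" files are chosen, this version simply keeps the first 50 entries in the order given.

```typescript
interface FileEntry {
  path: string;
  size: number; // bytes
  mtimeMs: number; // modification time, epoch milliseconds
}

interface FileSummary {
  files: FileEntry[];
  totalCount: number;
  totalSize: number;
}

// Keep at most `limit` entries, but count and sum over all of them.
function summarizeFiles(entries: FileEntry[], limit = 50): FileSummary {
  const totalSize = entries.reduce((sum, entry) => sum + entry.size, 0);
  return {
    files: entries.slice(0, limit),
    totalCount: entries.length,
    totalSize,
  };
}
```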
