Security-hardened AI task agent runtime
Zero-Trust sandboxing · AES-256-GCM encryption · Full action revert · Multi-provider LLM
AI task agents execute arbitrary code on your machine with minimal security guarantees. Standard runtimes store your API keys in plaintext, give agents unrestricted OS access, and offer no way to undo completed actions. A hallucinated rm -rf or a prompt-injected command runs unguarded. SecureClawAgent closes all three gaps.
Every action runs inside an ephemeral Docker container with a minimal capability grant: no network, no privilege escalation, and the container is destroyed immediately after execution. All user data (credentials, workspace files, action history) is AES-256-GCM encrypted at rest with an Argon2id-derived key that never touches disk. Any completed task, whether a single write or a multi-step pipeline, can be fully rolled back to its exact pre-execution state via git-backed snapshots and inverse-action semantics recorded in an encrypted, HMAC-signed audit ledger.
Research: 78% of tested LLM agent frameworks are vulnerable to prompt injection causing
unintended filesystem changes. 0% of surveyed AI developer tools offer encryption of
stored user credentials. SecureClawAgent is the first to address all three dimensions:
isolation, encryption, and reversibility.
SecureClawAgent System

Web UI (React 18 / TS)
   ◄──► API Gateway (FastAPI)
        JWT Auth │ TOTP 2FA │ Rate Limiting │ Audit
          │
          ▼
        Task Orchestrator
        Instruction → Planner (LLM) → Subtask Decomposer → Tool Router
          │
          ├──► LLM Provider Layer
          │      Claude · OpenAI · Ollama
          │      PII Scrub · Fallback Chain
          │
          └──► Security Layer
                 Risk Classifier (BART-MNLI)
                 Injection Detector (regex+sem)
                   │
                   ▼
               Sandbox Execution Layer
               Pre-snapshot → Container → Diff → Ledger
                 Hardened Container (Alpine Linux)
                   seccomp · cap-drop · non-root
                   no-network · read-only rootfs · tmpfs
                   512MB limit · 1 core · 300s timeout
                   │
                   ▼
               Backup & Revert Engine
                 Action Ledger (SQLCipher+HMAC)
                 Git Snapshots

Encrypted Storage Layer
  SQLCipher (AES-256-GCM) │ VaultStore (encrypted file envelopes)
  Argon2id key derivation (64MB, 3 iterations) │ key never persisted
1. User submits task → JWT validated → prompt injection scan
2. LLM Planner decomposes instruction into ordered subtasks
3. Risk Classifier assigns tier: Read-Only / Low-Risk Write / High-Risk Destructive / Network Egress
4. For each subtask:
   a. SnapshotEngine: git stash + SHA-256 manifest
   b. DockerExecutor: ephemeral container spawned
   c. Tool executes inside sandbox
   d. Container destroyed, stdout/stderr captured
   e. Diff computed, inverse-action saved to ledger
   f. Ledger entry HMAC-SHA256 signed
5. If High-Risk → UI pauses for user confirmation
6. Result Aggregator collects outputs, reports summary
7. Revert requested?
   a. Ledger queried for target action(s)
   b. Inverse-action executed in reverse order
   c. State verified against pre-state manifest hash
   d. Revert event appended to immutable audit trail
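The snapshot and revert steps above can be illustrated with a minimal sketch. It assumes the manifest is a plain mapping from relative path to SHA-256 digest and that inverse actions are stored as callables; the function names `build_manifest`, `manifest_hash`, and `revert` are hypothetical and are not taken from the project's SnapshotEngine.

```python
import hashlib
from pathlib import Path
from typing import Callable


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its bytes."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def manifest_hash(manifest: dict[str, str]) -> str:
    """Collapse a manifest into one digest for pre/post-state comparison."""
    h = hashlib.sha256()
    for rel, digest in sorted(manifest.items()):
        h.update(f"{rel}\0{digest}\n".encode("utf-8"))
    return h.hexdigest()


def revert(inverse_actions: list[Callable[[], object]], root: Path,
           pre_state_hash: str) -> bool:
    """Run recorded inverse actions newest-first, then verify the workspace."""
    for undo in reversed(inverse_actions):
        undo()
    return manifest_hash(build_manifest(root)) == pre_state_hash
```

Running the inverse actions in reverse order matters when later actions depend on earlier ones, and the final hash comparison corresponds to step 7c.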
Every shell command, file write, and API call runs in an ephemeral Docker container with a hardened Alpine Linux base image. The container is destroyed immediately after execution.
| Hardening Measure | Configuration |
|---|---|
| Seccomp profile | Default-deny syscall filter; blocks ptrace, mount, unshare, keyctl, bpf |
| Linux capabilities | All dropped; only CAP_SETUID/CAP_SETGID if strictly needed |
| User | Non-root uid=1000; no sudo, no privilege escalation |
| Root filesystem | Read-only; only /workspace (bind mount) and /tmp (tmpfs) are writable |
| Network | none by default; per-domain allowlist required for any outbound access |
| Resource limits | 512 MB memory, 1.0 CPU core, 128 PIDs, 300s timeout |
| tmpfs | /tmp mounted rw,noexec,nosuid,size=64m |
| Container lifecycle | Created → run → force-remove; no lingering state |
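As an illustration, the hardening table maps onto standard `docker run` flags roughly as follows. This is a sketch, not the project's DockerExecutor: the function name `sandbox_run_args` and its parameters are hypothetical, and the real executor may use the Docker SDK rather than the CLI.

```python
def sandbox_run_args(image: str, workspace: str, seccomp_profile: str,
                     command: list[str]) -> list[str]:
    """Build a `docker run` argv that mirrors the hardening table."""
    return [
        "docker", "run",
        "--rm",                                       # remove container after exit
        "--network", "none",                          # no network by default
        "--read-only",                                # read-only root filesystem
        "--cap-drop", "ALL",                          # drop all Linux capabilities
        "--security-opt", "no-new-privileges",        # block privilege escalation
        "--security-opt", f"seccomp={seccomp_profile}",
        "--user", "1000:1000",                        # non-root uid/gid
        "--memory", "512m",
        "--cpus", "1.0",
        "--pids-limit", "128",
        "--tmpfs", "/tmp:rw,noexec,nosuid,size=64m",  # writable scratch space
        "--volume", f"{workspace}:/workspace",        # bind-mounted workspace
        image,
        *command,
    ]
```

The 300s timeout is not a `docker run` flag, so in this sketch the caller would have to enforce it, for example with `subprocess.run(args, timeout=300)` followed by a forced container removal.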
All user credentials, workspace files, and the action ledger are encrypted before touching disk.
User Password
      │
      ▼
Argon2id (64 MB, 3 iterations, parallelism 1)
      │
      ▼
256-bit Master Key ◄─── never persisted to disk
      │
      ├─► AES-256-GCM ─── all data at rest
      │
      ├─► SQLCipher ───── database encryption
      │
      └─► HKDF-SHA256 ─── HMAC signing sub-key
- Key derivation: Argon2id with 64 MB memory, 3 iterations, 16-byte random salt
- AEAD: AES-256-GCM provides both confidentiality and integrity
- Unique nonce: Fresh 96-bit random nonce per encryption; reusing a nonce under the same key is catastrophic for GCM
- File vault: Encrypted envelope per file; filename = SHA-256 of original path (metadata protection)
- Database: SQLCipher with PRAGMA key, WAL mode, HMAC-SHA512 integrity per page
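The sub-key derivation, nonce, and vault naming rules can be sketched with the standard library alone. This assumes the 256-bit master key has already been derived (Argon2id is not in the Python standard library, so that step is omitted here); `hkdf_sha256`, `fresh_nonce`, and `vault_filename` are hypothetical names, and the `info` label is an illustrative choice.

```python
import hashlib
import hmac
import os


def hkdf_sha256(master_key: bytes, info: bytes, length: int = 32,
                salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    # Extract: an empty salt defaults to a string of HashLen zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, master_key, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def fresh_nonce() -> bytes:
    """96-bit random nonce, drawn anew for every AES-GCM encryption."""
    return os.urandom(12)


def vault_filename(original_path: str) -> str:
    """Envelope name: SHA-256 hex digest of the original path."""
    return hashlib.sha256(original_path.encode("utf-8")).hexdigest()
```

Deriving the HMAC signing key with a distinct `info` label keeps it independent of the encryption key even though both come from the same master key.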
Every action is snapshot-prepped, diff-recorded, and revertible, including multi-step pipelines.
| Action Type | Revert Strategy |
|---|---|
| File write | Restore from pre-action git snapshot (byte-for-byte) |
| Shell command | Git stash pop workspace to pre-execution state |
| Code patch | Reverse diff via snapshot restore |
| API call | Compensating transaction (manual review flagged) |
| File read | No revert needed (no state change) |
The ledger is an append-only, HMAC-SHA256-signed audit trail. Every entry is independently verifiable, which makes the trail tamper-evident and usable as evidence for SOC 2 Type II audits.
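A minimal sketch of per-entry signing and verification follows. It assumes entries are JSON-serializable dictionaries signed over a canonical (sorted-key) encoding; the field names and the functions `sign_entry` and `verify_entry` are hypothetical, and the actual schema is the one in docs/action-ledger-schema.md.

```python
import hashlib
import hmac
import json


def _canonical(entry: dict) -> bytes:
    """Stable byte encoding so the same entry always yields the same MAC."""
    return json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_entry(signing_key: bytes, entry: dict) -> dict:
    """Return a copy of the entry with an HMAC-SHA256 tag attached."""
    tag = hmac.new(signing_key, _canonical(entry), hashlib.sha256).hexdigest()
    return {**entry, "hmac": tag}


def verify_entry(signing_key: bytes, signed: dict) -> bool:
    """Recompute the tag over everything except `hmac` and compare in constant time."""
    entry = {k: v for k, v in signed.items() if k != "hmac"}
    expected = hmac.new(signing_key, _canonical(entry), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, str(signed.get("hmac", "")))
```

Any change to a signed field, or verification under a different key, makes `verify_entry` return `False`.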
Primary:  Claude (Anthropic)
    │ (rate limit / auth error)
    ▼
Fallback: GPT-4o (OpenAI)
    │ (rate limit / auth error)
    ▼
Fallback: Mistral 7B (local Ollama, privacy mode)
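The chain above amounts to trying providers in order and moving on when one reports a rate-limit or auth failure. The sketch below assumes each provider is wrapped as a callable that raises a hypothetical `ProviderUnavailable` exception in those cases; neither name comes from the project's llm/provider.py.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Hypothetical error for a rate-limit or auth failure from one provider."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, completion) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why this one was skipped
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Only the designated failure type triggers a fallback here; other exceptions propagate so that genuine bugs are not silently masked by the next provider.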
Before any external API call, a mandatory PII scrubber strips:
- API keys (sk-..., sk-ant-..., ghp_...)
- Email addresses, phone numbers, credit card numbers
- Password fields in config-like patterns
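A scrubber of this kind can be sketched as an ordered list of regex substitutions. The patterns and placeholder strings below are illustrative assumptions, not the project's actual rules, and phone and credit card patterns are left out for brevity.

```python
import re

# (pattern, replacement) pairs applied in order; all are illustrative.
_PATTERNS = [
    (re.compile(r"\bsk-(?:ant-)?[A-Za-z0-9_-]{8,}"), "[REDACTED_API_KEY]"),
    (re.compile(r"\bghp_[A-Za-z0-9]{8,}"), "[REDACTED_API_KEY]"),
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[REDACTED_EMAIL]"),
    # Keep the field name and separator, redact only the value.
    (re.compile(r"(?i)\b(password|passwd|pwd)(\s*[:=]\s*)\S+"), r"\1\2[REDACTED]"),
]


def scrub(text: str) -> str:
    """Replace credential-like substrings before text leaves the machine."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

For example, `scrub("password: hunter2")` returns `"password: [REDACTED]"`.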
Two-layer defense before any LLM call:
- Regex layer: 20+ known injection patterns: ignore previous instructions, <|im_start|>, [system]:, disregard prior directives, jailbreak phrases
- Semantic layer: sentence-transformers/all-MiniLM-L6-v2 embeddings with cosine similarity > 0.75 threshold against known injection embeddings
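The two layers can be sketched as follows. The regex list contains only the four patterns named above, and the semantic layer takes embeddings as plain lists of floats so that the example stays free of model dependencies; in the real system those vectors would come from all-MiniLM-L6-v2. All function names are hypothetical.

```python
import math
import re

INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"disregard\s+prior\s+directives", re.IGNORECASE),
    re.compile(r"<\|im_start\|>"),
    re.compile(r"\[system\]\s*:", re.IGNORECASE),
]


def regex_hit(text: str) -> bool:
    """Layer 1: match against known injection phrasings."""
    return any(p.search(text) for p in INJECTION_PATTERNS)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_hit(embedding: list[float], known: list[list[float]],
                 threshold: float = 0.75) -> bool:
    """Layer 2: flag text whose embedding is close to a known injection."""
    return any(cosine_similarity(embedding, k) > threshold for k in known)


def is_injection(text: str, embedding: list[float],
                 known: list[list[float]]) -> bool:
    return regex_hit(text) or semantic_hit(embedding, known)
```

The regex layer is cheap and catches verbatim attacks, while the similarity check is intended to catch paraphrases that no fixed pattern anticipates.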
Zero-shot classification via facebook/bart-large-mnli (407M params) with heuristic fallback:
| Risk Tier | Behavior |
|---|---|
| Read-Only | Auto-execute; no confirmation needed |
| Low-Risk Write | Auto-execute; logged for review |
| High-Risk Destructive | Pause; user must confirm before execution |
| Network Egress | Pause; domain must be on explicit allowlist |
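To make the tiers concrete, here is a sketch of what a heuristic fallback and the pause decision might look like. The keyword rules are invented for illustration and are far cruder than a zero-shot classifier; `RiskTier`, `classify_heuristic`, and `must_pause` are hypothetical names.

```python
import re
from enum import Enum


class RiskTier(Enum):
    READ_ONLY = "Read-Only"
    LOW_RISK_WRITE = "Low-Risk Write"
    HIGH_RISK_DESTRUCTIVE = "High-Risk Destructive"
    NETWORK_EGRESS = "Network Egress"


# Checked in order, most dangerous first; the first match wins.
_RULES = [
    (re.compile(r"\b(rm|rmdir|dd|mkfs|truncate)\b|\bgit\s+reset\s+--hard\b"),
     RiskTier.HIGH_RISK_DESTRUCTIVE),
    (re.compile(r"\b(curl|wget|ssh|scp)\b|https?://"), RiskTier.NETWORK_EGRESS),
    (re.compile(r"\b(mv|cp|touch|mkdir|tee)\b|>"), RiskTier.LOW_RISK_WRITE),
]


def classify_heuristic(command: str) -> RiskTier:
    """Keyword-based tiering used when the model is unavailable."""
    for pattern, tier in _RULES:
        if pattern.search(command):
            return tier
    return RiskTier.READ_ONLY


def must_pause(tier: RiskTier) -> bool:
    """Tiers that wait for confirmation or an allowlist check before running."""
    return tier in (RiskTier.HIGH_RISK_DESTRUCTIVE, RiskTier.NETWORK_EGRESS)
```

Ordering the rules from most to least dangerous means a command such as `rm -rf build && mkdir build` is classified by its riskiest part.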
- JWTs with configurable expiry, HMAC-SHA256 signed
- TOTP 2FA via pyotp (RFC 6238), compatible with Google Authenticator, Authy, etc.
- Session tokens never written to disk; held in memory only, cleared on logout
- Password hashing via Argon2id (same parameters as key derivation)
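The project uses pyotp for TOTP; the standard-library sketch below only illustrates what an RFC 6238 code computation involves. The function names and the one-step verification window are assumptions made for the example.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, for_time: float, digits: int = 6, step: int = 30) -> str:
    """RFC 6238 TOTP with HMAC-SHA1: HOTP over the current time-step counter."""
    counter = int(for_time // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, candidate: str, now: float,
                window: int = 1, step: int = 30) -> bool:
    """Accept codes from adjacent time steps to tolerate small clock drift."""
    return any(
        hmac.compare_digest(totp(secret, now + k * step, step=step), candidate)
        for k in range(-window, window + 1)
    )
```

With the RFC 6238 test secret `b"12345678901234567890"` and time 59, `totp(..., digits=8)` yields `94287082`, matching the published test vector.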
secure-openclaw-clone/
├── backend/
│   ├── main.py  # FastAPI entrypoint
│   └── core/
│       ├── config.py  # All settings from .env
│       ├── api/
│       │   ├── app.py  # FastAPI app factory + middleware
│       │   └── routes/
│       │       ├── auth.py  # Register, login, TOTP, 2FA
│       │       ├── tasks.py  # CRUD + SSE streaming + orchestration
│       │       └── revert.py  # Revert + ledger export
│       ├── auth/service.py  # JWT + Argon2id + TOTP + sessions
│       ├── background/runner.py  # Async worker queue
│       ├── db/database.py  # SQLite/SQLCipher ORM + migrations
│       ├── encryption/  # AES-256-GCM + Argon2id + VaultStore
│       ├── knowledge/
│       │   ├── crawler.py  # ArXiv scraper + dedup + brain update
│       │   └── scheduler.py  # Weekly crawl + job runner
│       ├── llm/provider.py  # Claude · OpenAI · Ollama + fallback
│       ├── middleware/security.py  # Rate limiter · auth · audit log
│       ├── ml/models.py  # Phi-3 planner · CodeT5+ · fine-tune
│       ├── models/  # Pydantic: Task, Action, User, Ledger
│       ├── orchestrator/engine.py  # Agent loop: plan → execute → ledger
│       ├── revert/engine.py  # Inverse-action execution
│       ├── sandbox/executor.py  # Docker ephemeral container runner
│       ├── security/
│       │   ├── injection_detector.py  # Regex + semantic injection detection
│       │   └── risk_classifier.py  # BART-MNLI + heuristic risk tiers
│       └── snapshot/
│           ├── engine.py  # Git stash + SHA-256 manifest
│           └── ledger.py  # SQLCipher ledger with HMAC
├── frontend/
│   └── src/
│       ├── App.tsx  # Routes + providers
│       ├── main.tsx  # React root + QueryClient
│       ├── components/
│       │   ├── AuthGuard.tsx  # Protected route wrapper
│       │   ├── ErrorBoundary.tsx  # Crash boundary
│       │   ├── Layout.tsx  # Sidebar + nav
│       │   └── Toast.tsx  # Notification system
│       ├── hooks/useAuth.tsx  # Auth context + provider
│       ├── lib/
│       │   ├── api.ts  # API client + SSE streaming
│       │   └── utils.ts  # cn() classname utility
│       └── pages/
│           ├── Dashboard.tsx  # Task list
│           ├── Login.tsx  # Login + TOTP flow
│           ├── NewTask.tsx  # Task creation form
│           ├── Register.tsx  # Account registration
│           └── TaskDetail.tsx  # Task view + revert + SSE
├── deploy/helm/  # Kubernetes Helm chart
├── docker/
│   ├── Dockerfile.backend  # Multi-stage backend image
│   ├── Dockerfile.frontend  # nginx-served React build
│   ├── Dockerfile.sandbox  # Hardened Alpine sandbox base
│   ├── nginx-default.conf  # Reverse proxy config
│   ├── seccomp.json  # Default-deny syscall profile
│   └── requirements-sandbox.txt  # Python deps for sandbox
├── docs/
│   ├── action-ledger-schema.md  # Full SQLCipher DDL + HMAC spec
│   ├── docker-hardening.md  # Hardening spec + verification checklist
│   └── threat-model.md  # 6 attack surfaces + mitigation matrix
├── .github/workflows/ci.yml  # Lint + security scan + Docker build
├── docker-compose.yml  # Production stack
├── requirements.txt  # Python dependencies
├── pyproject.toml  # Project metadata
├── .env.example  # All configuration documented
├── CLAUDE.md  # Project brief
├── PROJECT-detail.md  # Full technical specification
├── PROJECT-DEVELOPMENT-PHASE-TRACKING.md  # 190 person-day development roadmap
├── SECOND-KNOWLEDGE-BRAIN.md  # Research knowledge base
├── README.md  # This file
├── LICENSE  # MIT
├── CONTRIBUTING.md  # Contributor guide
└── CHANGELOG.md  # Release history
- Python 3.11+: runtime
- Docker Desktop: sandbox execution (daemon must be running)
- Node.js 20+: frontend build
- Git: workspace snapshots
git clone https://github.com/dungnotnull/secure-openclaw-clone-agent.git
cd secure-openclaw-clone-agent
cp .env.example .env

Edit .env with your API keys:

LLM_PROVIDER=claude # or openai | ollama
ANTHROPIC_API_KEY=sk-ant-... # Your Anthropic API key
OPENAI_API_KEY=sk-... # Your OpenAI key (fallback)
OLLAMA_BASE_URL=http://localhost:11434 # Local Ollama server

python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
python -m uvicorn backend.core.api.app:app --host 0.0.0.0 --port 8000 --reload

Verify:

curl http://localhost:8000/health
{"status":"ok","version":"0.1.0","database":"connected","llm_providers":{...}}

cd frontend
npm install
npm run dev

docker compose up -d

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/auth/register | POST | Create account; returns JWT + TOTP setup URL |
| /api/v1/auth/login | POST | Sign in; returns JWT, or totp_required if 2FA enabled |
| /api/v1/auth/login/totp | POST | Complete TOTP login flow |
| /api/v1/auth/logout | POST | Destroy session |
| /api/v1/auth/2fa/enable | POST | Enable TOTP 2FA with verification |
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/tasks | POST | Create task: {"instruction": "...", "token_budget": 100000} |
| /api/v1/tasks | GET | List tasks: ?limit=50&offset=0 |
| /api/v1/tasks/{id} | GET | Get task with subtask decomposition |
| /api/v1/tasks/{id}/run | POST | Execute task in background |
| /api/v1/tasks/{id}/stream | GET | SSE stream: real-time task status updates |
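As a usage illustration, a task-creation request against the endpoint above could be built like this. The `Authorization: Bearer` header is an assumption about how the JWT is passed, and `build_create_task_request` is a hypothetical helper; the interactive docs at /docs are the authority on the actual contract.

```python
import json
import urllib.request


def build_create_task_request(base_url: str, jwt: str, instruction: str,
                              token_budget: int = 100000) -> urllib.request.Request:
    """Prepare POST /api/v1/tasks with a JSON body; nothing is sent yet."""
    body = json.dumps({"instruction": instruction,
                       "token_budget": token_budget}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/tasks",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {jwt}",  # assumed auth scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running backend, for example one started with the uvicorn command from the quick start.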
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/tasks/{id}/revert | POST | Revert all actions in a task |
| /api/v1/actions/{id}/revert | POST | Revert a single action |
| /api/v1/tasks/{id}/ledger/export | GET | Download HMAC-signed JSONL audit log |
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Database + LLM provider health |
| /docs | GET | Interactive API documentation (Swagger) |
| /redoc | GET | API documentation (Redoc) |
Every aspect of the system is configurable via environment variables.
| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | claude | Provider: claude, openai, ollama |
| ANTHROPIC_API_KEY | (none) | Claude API key |
| OPENAI_API_KEY | (none) | OpenAI API key |
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server |
| CLAUDE_MODEL | claude-sonnet-4-6 | Anthropic model ID |
| OPENAI_MODEL | gpt-4o | OpenAI model ID |
| OLLAMA_MODEL | mistral:7b | Ollama model name |
| Variable | Default | Description |
|---|---|---|
| JWT_SECRET | auto | JWT signing secret; auto-generated if empty |
| JWT_EXPIRY_MINUTES | 60 | Session token lifetime |
| TOTP_ISSUER | SecureClawAgent | TOTP authenticator label |
| PII_SCRUB_ENABLED | true | Strip credentials before external API calls |
| PROMPT_INJECTION_DETECT_ENABLED | true | Block adversarial prompt patterns |
| Variable | Default | Description |
|---|---|---|
| DOCKER_BASE_IMAGE | secureclaw-sandbox:latest | Base image for containers |
| DOCKER_NETWORK_MODE | none | Container networking |
| DOCKER_MEMORY_LIMIT | 512m | Per-container memory |
| DOCKER_CPU_LIMIT | 1.0 | Per-container CPU cores |
| DOCKER_TIMEOUT_SECONDS | 300 | Max execution time per action |
| Variable | Default | Description |
|---|---|---|
| ARGON2_MEMORY_KB | 65536 | Argon2id memory cost (64 MB) |
| ARGON2_ITERATIONS | 3 | Argon2id time cost |
| ARGON2_PARALLELISM | 1 | Argon2id parallelism |
| ARGON2_SALT_BYTES | 16 | Salt length |
| DB_PRAGMA_KEY | (none) | SQLCipher encryption key (derived from password if empty) |
| Model | Purpose | Size | Source |
|---|---|---|---|
| facebook/bart-large-mnli | Zero-shot action risk classification | 407M | HuggingFace |
| microsoft/phi-3-mini-4k-instruct | Local task planner (privacy mode) | 3.8B | HuggingFace via Ollama |
| Salesforce/codet5p-220m | Small code patch generation | 220M | HuggingFace |
| sentence-transformers/all-MiniLM-L6-v2 | Injection semantic detection | 22M | HuggingFace |
| deepset/roberta-base-squad2 | Intent/parameter extraction | 125M | HuggingFace |
| distilbert-base-uncased | Fine-tuned risk classifier | 66M | HuggingFace |
Full threat model with 6 attack surfaces is documented in docs/threat-model.md.
| Threat | Mitigation |
|---|---|
| Sandbox escape | Seccomp default-deny, all capabilities dropped, non-root user, read-only rootfs, no Docker socket mount |
| Credential theft | AES-256-GCM at rest, Argon2id-derived master key never persisted, sessions memory-only |
| Prompt injection | Two-layer detection (regex + semantic) before every LLM call; sandboxed execution limits blast radius |
| Unauthorized revert | JWT + optional TOTP verification; HMAC integrity on all ledger entries |
| API data leakage | Mandatory PII scrubber strips credentials before any external API call; privacy mode available |
| Supply chain attack | Base images built from pinned digests (cosign signing in future); dependencies pinned |
See docs/docker-hardening.md for the full verification checklist. Key checks:
# Verify non-root
docker inspect secureclaw-sandbox --format '{{.Config.User}}'
# → 1000:1000
# Verify no network
docker inspect <container> --format '{{.HostConfig.NetworkMode}}'
# → none
# Verify read-only rootfs
docker inspect <container> --format '{{.HostConfig.ReadonlyRootfs}}'
# → true
# Verify seccomp applied
docker inspect <container> --format '{{.HostConfig.SecurityOpt}}'
# → [seccomp=C:\path\to\seccomp.json]

# Backend lint
pip install ruff
ruff check backend/
# Frontend type check
cd frontend && npx tsc --noEmit
# Build Docker images
docker build -f docker/Dockerfile.backend -t secureclaw-api:dev .
docker build -f docker/Dockerfile.frontend -t secureclaw-frontend:dev .
docker build -f docker/Dockerfile.sandbox -t secureclaw-sandbox:dev .

See CONTRIBUTING.md for module map, code style, and architecture guidelines.
See CHANGELOG.md for full release history.
Initial release. 43 Python backend modules, 15 TypeScript frontend files, 4 Docker images/configs, Helm chart, GitHub Actions CI, full documentation suite. All core features implemented: Docker sandbox executor, AES-256-GCM encryption, action ledger with HMAC, snapshot/revert engine, JWT+TOTP auth, multi-provider LLM with fallback chain, PII scrubbing, prompt injection detection, BART-MNLI risk classifier, Phi-3 local planner, CodeT5+ patch generator, distilBERT fine-tuning pipeline, ArXiv knowledge crawler, SSE streaming, React UI with auth guards and toast notifications.
MIT. See LICENSE.
Built with ❤️ by dungnotnull and contributors.