Deployment

Gully Burns edited this page Mar 1, 2026 · 5 revisions

This page covers Stages B and C of the A→B→C workflow: deploying OpenClaw + Alhazen as a hardened, containerized stack with Telegram access, egress filtering, and credential brokering. One script (deploy.sh) handles two targets: a Linux VPS or a Mac Mini.

Relationship to OpenClaw Configuration: That page covers manual OpenClaw setup with security hardening tiers. This page documents the automated deploy.sh approach that provisions the entire stack — including OpenClaw, TypeDB, MCP server, dashboard, LiteLLM, and Squid proxy — from scratch using Ansible.

The main production use cases for a persistent deployment are the nightly job-discovery cron runs of Skills: Jobhunt and the batch ingestion pipelines of Skills: Rare Disease. See Getting Started for Stage A (local Claude Code).


Development Workflow: A → B → C

Skills move through three environments on their way to production:

| Stage | Environment | What You Get |
|---|---|---|
| (A) Local Development | Claude Code + local TypeDB | Fast iteration, full debugger, direct file editing |
| (B) Hardened Local Testing | OpenClaw on Dedicated Mac | Telegram + LiteLLM |
| (C) Production VPS | Hardened Linux server | Full Security |

# (A) Local development — see Getting Started for step-by-step
make db-start && claude

# (B) Mac Mini — config in deploy/deploy.env
make deploy-macmini

# (C) Production VPS — config in deploy/deploy.env
make deploy-vps

Prerequisites

On your control machine (the laptop running deploy.sh):

  • ansible and ansible-playbook
  • openssl and ssh-keygen
  • SSH access to the target host
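
A quick preflight check on the control machine can confirm that the tools listed above are on PATH before running deploy.sh. The sketch below is illustrative and not part of the repository; `missing_tools` is a hypothetical helper name.

```shell
# Hypothetical preflight helper (not part of deploy.sh):
# print each required command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools named in the prerequisites list above.
missing=$(missing_tools ansible ansible-playbook openssl ssh-keygen)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All control-machine tools found."
fi
```

SSH access to the target still has to be verified separately, since it depends on your own keys and network.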

On the target host:

Target Requirements
VPS Fresh Debian/Ubuntu — Podman is installed automatically
Mac Mini Docker Desktop running, SSH enabled (Remote Login)

Quick Start

Recommended: deploy.env + make

Copy the example config and fill in your values:

cp deploy/deploy.env.example deploy/deploy.env
# Edit deploy/deploy.env — see variables below

Key variables in deploy/deploy.env:

Variable Description Example
DEPLOY_TARGET IP address or localhost 10.0.110.100
DEPLOY_TARGET_TYPE macmini or vps macmini
DEPLOY_BRANCH Git branch to deploy main
DEPLOY_PROVIDER LLM provider anthropic
DEPLOY_MODEL Model name claude-sonnet-4-6
DEPLOY_API_KEY Real API key sk-ant-...
DEPLOY_TELEGRAM_TOKEN Telegram bot token (optional) from @BotFather
DEPLOY_TELEGRAM_USER Your Telegram user ID (optional) 7365829064
DEPLOY_ASK_PASS Prompt for SSH/sudo password true

deploy.env is gitignored — never commit it. It contains real API keys.

DEPLOY_BRANCH controls which git branch is checked out on the target. Set to a feature branch to deploy unreleased skills or test changes before merging to main.
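
Before running the deploy, it can help to confirm that deploy.env actually sets the variables you expect. The following is a minimal sketch, not something the repository provides: `missing_env_keys` is a hypothetical helper, and the list of keys treated as required is an assumption (for example, Ollama needs no API key).

```shell
# Hypothetical helper: print every key that has no non-empty
# KEY=value line in the given env file.
missing_env_keys() (
  file=$1; shift
  for key in "$@"; do
    grep -Eq "^${key}=.+" "$file" || printf '%s\n' "$key"
  done
)

# Demo against a throwaway file; real use would pass deploy/deploy.env.
demo=$(mktemp)
printf 'DEPLOY_TARGET=10.0.110.100\nDEPLOY_TARGET_TYPE=macmini\nDEPLOY_API_KEY=\n' > "$demo"
missing_env_keys "$demo" DEPLOY_TARGET DEPLOY_TARGET_TYPE DEPLOY_PROVIDER DEPLOY_API_KEY
# prints:
# DEPLOY_PROVIDER
# DEPLOY_API_KEY
rm -f "$demo"
```

The check is deliberately simple: it only looks for unquoted `KEY=value` lines and does not validate the values themselves.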

Then deploy:

make deploy-macmini   # Mac Mini
make deploy-vps       # Production VPS

Advanced / One-Off: raw deploy.sh flags

cd deploy

# Interactive mode — prompts for everything
./deploy.sh

# Non-interactive VPS
./deploy.sh -t 5.78.187.158 -p anthropic -m claude-sonnet-4-6 -k "$KEY" --non-interactive

# Mac Mini with Telegram
./deploy.sh -t 10.0.110.100 --target-type macmini \
  -p anthropic -m claude-sonnet-4-6 -k "$KEY" \
  --telegram-token "$TELEGRAM_BOT_TOKEN"

# Ollama (no API key needed)
./deploy.sh -t 10.0.110.100 --target-type macmini \
  -p ollama -m "qwen2.5:0.5b" -u "http://10.0.110.1:11434"

# AWS with custom SSH user
./deploy.sh -t 54.x.x.x --ssh-user ubuntu --ssh-key ~/aws.pem \
  -p anthropic -m claude-sonnet-4-6 -k "$KEY"

deploy.sh Options

| Flag | deploy.env Variable | Description | Default |
|---|---|---|---|
| -t, --target | DEPLOY_TARGET | Target IP (or localhost) | (required) |
| --target-type | DEPLOY_TARGET_TYPE | vps or macmini | vps |
| -p, --provider | DEPLOY_PROVIDER | LLM provider: anthropic, openai, ollama, openai_compatible | anthropic |
| -m, --model | DEPLOY_MODEL | Model name | claude-sonnet-4-6 |
| -k, --key | DEPLOY_API_KEY | API key | |
| -u, --url | | Base URL (for Ollama/OpenAI-compatible) | |
| --branch | DEPLOY_BRANCH | Git branch to deploy | main |
| --project-name | | Compose project name | openclaw-docker |
| --port-offset | | Offset host ports by N (for dual-stack) | 0 |
| --ssh-user | DEPLOY_SSH_USER | SSH user on target | root |
| --ssh-key | DEPLOY_SSH_KEY | Path to SSH private key | |
| --ask-pass | DEPLOY_ASK_PASS | Prompt for SSH/sudo passwords | false |
| --telegram-token | DEPLOY_TELEGRAM_TOKEN | Telegram bot token | |
| --telegram-user | DEPLOY_TELEGRAM_USER | Telegram user ID to authorize | |
| --non-interactive | | Fail on missing args instead of prompting | false |
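
To make the --port-offset arithmetic concrete, the sketch below adds an offset to a few of the base ports documented on this page. It assumes the offset is applied uniformly to every published host port, which this page does not confirm; `offset_port` is a hypothetical helper.

```shell
# Illustration only: add an offset N to a base host port.
offset_port() {
  echo $(( $1 + $2 ))
}

# Base ports taken from the Services table on this page.
offset=100
for entry in agent:18789 litellm:4000 dashboard:3001; do
  name=${entry%%:*}
  base=${entry##*:}
  echo "$name -> $(offset_port "$base" "$offset")"
done
# prints:
# agent -> 18889
# litellm -> 4100
# dashboard -> 3101
```

Running a second stack with a different --project-name and a non-zero offset is presumably what "dual-stack" refers to.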

Post-Deploy Output

The script outputs:

  • SSH command with the generated key
  • A random 3-word hostname (persisted across re-deploys)
  • Dashboard and gateway URLs
  • LLM provider and model info

Architecture

Container Stack

Container stack diagram

Services

| Service | Container Name | Port | Purpose |
|---|---|---|---|
| openclaw | {prefix}-agent | 18789 | AI agent (Telegram, gateway) |
| litellm | {prefix}-litellm | 4000 | LLM credential broker and model routing |
| squid | {prefix}-squid | 3128 | Egress-filtered HTTP proxy |
| typedb | {prefix}-typedb | 1729 | Knowledge graph database |
| typedb-init | {prefix}-typedb-init | | One-shot schema initialization |
| alhazen-mcp | {prefix}-mcp | 3000 | MCP server (TypeDB + web tools) |
| alhazen-dashboard | {prefix}-dashboard | 3001 | Next.js dashboard UI |

Network Design

Two Docker/Podman networks isolate traffic:

Network Access Services
openclaw-internal No internet (bridge, internal: true) All services
openclaw-external Internet access LiteLLM, Squid, OpenClaw agent

Why the agent bypasses Squid: The @anthropic-ai/sdk in Node.js honors HTTP_PROXY but ignores NO_PROXY. Setting proxy env vars on the agent would route internal http://litellm:4000 calls through Squid, where the container hostname can't resolve. Instead, the agent gets direct internet via openclaw-external and talks to LiteLLM via openclaw-internal. The MCP server does use Squid because Python's requests library properly respects NO_PROXY.
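
To illustrate what "respecting NO_PROXY" means in the paragraph above, here is a simplified model of the decision a well-behaved client makes per request. It is a sketch only, not how requests or the Anthropic SDK implement it; `bypasses_proxy` is a hypothetical name, and the list is assumed to be comma-separated without spaces.

```shell
# Simplified model: succeed when the host matches an entry in a
# comma-separated NO_PROXY-style list (exact match or subdomain).
bypasses_proxy() (
  host=$1
  set -f          # no glob expansion while splitting the list
  IFS=,
  for entry in $2; do
    entry=${entry#.}               # treat ".example.com" like "example.com"
    [ -n "$entry" ] || continue
    case "$host" in
      "$entry" | *".$entry") exit 0 ;;
    esac
  done
  exit 1
)

no_proxy_example="litellm,typedb,localhost"
for h in litellm api.anthropic.com; do
  if bypasses_proxy "$h" "$no_proxy_example"; then
    echo "$h: direct"
  else
    echo "$h: via proxy"
  fi
done
# prints:
# litellm: direct
# api.anthropic.com: via proxy
```

A client that skips this check sends the `litellm` request to Squid as well, which is the failure mode described above.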

Credential Isolation

The real API key never reaches the agent container:

Agent  ──LITELLM_MASTER_KEY──▶  LiteLLM  ──REAL_API_KEY──▶  Anthropic
       (locally generated)                  (in litellm.env only)
  • litellm.env contains the real API key — mounted only by the LiteLLM container
  • The agent gets ANTHROPIC_API_KEY=${LITELLM_MASTER_KEY}, a locally-generated token
  • LiteLLM brokers the request to the real provider
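
The separation above can be sketched as two files with different contents and owner-only permissions. This is an illustration under stated assumptions, not the logic deploy.sh uses: `gen_token` and `write_env_files` are hypothetical helpers, the `sk-` prefix and token length are arbitrary choices, and the variable name inside litellm.env is assumed.

```shell
# Generate a 64-character hex token from the kernel's random source.
gen_token() {
  od -An -N32 -tx1 /dev/urandom | tr -d ' \n'
}

# Write the broker's secret file and the agent-facing env file into a directory.
write_env_files() (
  dir=$1; real_key=$2
  umask 077   # files created below are readable by the owner only
  # Real provider key: only the LiteLLM container should mount this file.
  printf 'ANTHROPIC_API_KEY=%s\n' "$real_key" > "$dir/litellm.env"
  # Locally generated token: the only credential the agent ever sees.
  printf 'LITELLM_MASTER_KEY=sk-%s\n' "$(gen_token)" > "$dir/.env"
)

demo_dir=$(mktemp -d)
write_env_files "$demo_dir" "sk-ant-REPLACE_ME"
ls -A "$demo_dir"
rm -rf "$demo_dir"
```

If the agent container is compromised, the attacker obtains only the local token, which is useless outside the internal network.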

VPS vs Mac Mini: Key Differences

Aspect VPS (Linux) Mac Mini (macOS)
Container runtime Podman (rootless) + podman-compose Docker Desktop + docker compose
User creation useradd via Ansible user module dscl (macOS Directory Service)
Home directory /home/openclaw /Users/openclaw
User group openclaw staff (for Docker socket access)
Firewall UFW (deny incoming, allow SSH + Tailscale) pf (same rules, persisted via launchd)
Brute-force protection Fail2Ban (SSH jail) — (not available on macOS)
Package management apt (Podman, Fail2Ban, UFW auto-installed) Homebrew (optional, for Tailscale)
Reboot persistence loginctl enable-linger Docker Desktop auto-starts
Monitoring systemd timer (weekly) launchd plist (weekly)
Security posture Full: UFW + Fail2Ban + SSH + rootless Podman Lighter: pf + SSH key-only

Mac Mini detail page: See Deployment: Mac Mini Native for the full native-services architecture, service management commands, config file locations, and a complete troubleshooting guide covering all known startup issues.

What the Mac Mini Deploy Does

On a Mac Mini with Docker Desktop running and SSH enabled, deploy.sh --target-type macmini will:

  1. Verify Docker Desktop and docker compose are available
  2. Create the openclaw user via dscl (finds next UID >= 500, creates home directory)
  3. Deploy an SSH key so you can ssh openclaw@<mac-mini> with the generated key
  4. Add openclaw to the staff group for Docker socket access
  5. Harden SSH (disable password auth, enable pubkey-only)
  6. Configure the pf firewall (deny all incoming except SSH and Tailscale)
  7. Clone the skillful-alhazen repo to ~openclaw/skillful-alhazen
  8. Render all configuration templates (docker-compose, Dockerfile, configs)
  9. Build and start the 7-container stack
  10. Initialize TypeDB with all schemas (idempotent — skips if database exists)
  11. Run openclaw doctor --fix inside the agent container
  12. Health-check TypeDB (waits up to 60 seconds)
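
The final health-check step amounts to polling until a probe succeeds or a deadline passes. A minimal sketch of that pattern follows; `wait_for` is a hypothetical helper, and the actual probe the playbook runs against TypeDB is not shown on this page.

```shell
# Retry a command until it succeeds or roughly `timeout` seconds have elapsed.
# Usage: wait_for <timeout-seconds> <interval-seconds> <command...>
wait_for() (
  timeout=$1; interval=$2; shift 2
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    if "$@" >/dev/null 2>&1; then exit 0; fi
    sleep "$interval"
    elapsed=$((elapsed + interval))   # counts sleep time only, so the bound is approximate
  done
  exit 1
)

# Trivial demo with a probe that always succeeds.
wait_for 5 1 test -d / && echo "ready"

# Against TypeDB, one possible probe (an assumption) is a TCP check on port 1729:
#   wait_for 60 2 nc -z localhost 1729
```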

Security Layers

Layer Implementation
SSH key-only auth Password auth disabled on deploy
Firewall UFW (Linux) / pf (macOS) — only SSH + Tailscale allowed
Fail2Ban SSH brute-force protection (Linux only)
Egress filtering Squid allowlist — only approved domains
Credential isolation LiteLLM holds real API keys; agent gets internal-only token
Read-only containers LiteLLM, MCP, Dashboard — tmpfs for /tmp
Resource limits Per-container CPU and memory caps
Rootless Podman No root privileges (Linux VPS)
Weekly security audits Automated monitoring via systemd/launchd
Tool allowlisting Agent can only execute approved commands

Configuration Templates

All templates live in deploy/roles/alhazen-setup/templates/:

Template Deployed As Purpose
docker-compose.yml.j2 docker-compose.yml Container orchestration (7 services, 2 networks)
Dockerfile.j2 Dockerfile OpenClaw agent image (Node 22 + uv + Chromium)
litellm-config.yaml.j2 litellm-config.yaml Model routing (Anthropic/OpenAI/Ollama)
litellm.env.j2 litellm.env Real API keys (only mounted by LiteLLM)
env.j2 .env Compose variable substitution
openclaw.json.j2 openclaw.json Agent config (models, channels, skills)
mcp.json.j2 mcp.json MCP server connections
tools.yaml.j2 tools.yaml Shell command allowlist
exec-approvals.json.j2 exec-approvals.json Per-agent command execution policy
allowlist.txt.j2 allowlist.txt Squid domain allowlist
squid.conf.j2 squid.conf Squid proxy configuration
monitor.sh.j2 monitor.sh Weekly security audit script
pf-alhazen.rules.j2 /etc/pf.anchors/openclaw.rules macOS pf firewall rules

Ansible Playbook Structure

deploy/
├── deploy.sh                    # Entry point — parses args, runs Ansible
├── playbook.yml                 # 3 plays: check identity, local keygen, remote deploy
├── ansible.cfg                  # Ansible settings
├── requirements.yml             # Galaxy dependency (community.general)
├── eff_large_wordlist.txt       # EFF wordlist for 3-word hostname generation
├── ssh-keys/                    # Generated SSH keys (gitignored)
└── roles/
    └── alhazen-setup/
        ├── handlers/main.yml    # Restart fail2ban, restart sshd
        ├── tasks/
        │   ├── main.yml             # Dispatcher → OS-specific + security + deploy
        │   ├── linux-system.yml     # Debian/Ubuntu: packages, user, Podman, Tailscale
        │   ├── macos-system.yml     # macOS: user (dscl), Tailscale
        │   ├── security-linux.yml   # UFW, Fail2Ban, SSH hardening
        │   ├── security-macos.yml   # pf firewall, SSH hardening
        │   ├── container-deploy.yml # Linux VPS: templates, compose up, health checks
        │   └── macos-native.yml     # macOS: launchd daemons, ~/.openclaw/ layout
        └── templates/               # Jinja2 templates (see table above)

Playbook Flow

The playbook has three plays:

  1. Check for Existing Installation (remote): Reads /etc/openclaw_identity to preserve the 3-word hostname on re-deploys.
  2. Local Setup (localhost): Generates a random hostname, SSH key pair, and self-signed SSL certificate (or reuses existing ones).
  3. Remote Deployment: Runs the alhazen-setup role — OS setup, security hardening, container deployment.
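
The hostname handling in the first two plays can be sketched as "reuse the stored name if there is one, otherwise draw three random words". The code below is an approximation, not the playbook's implementation: `random_name` and `resolve_name` are hypothetical helpers, the hyphen separator is assumed, and the real playbook reads the identity file on the remote host while the wordlist lives on the control machine.

```shell
# Pick three random words from an EFF-style wordlist, where each line is
# "<dice digits><TAB><word>". Words may repeat.
random_name() {
  awk 'BEGIN { srand() }
       NF >= 2 { words[++n] = $2 }
       END {
         if (n == 0) exit 1
         name = words[int(rand() * n) + 1]
         for (i = 2; i <= 3; i++) name = name "-" words[int(rand() * n) + 1]
         print name
       }' "$1"
}

# Reuse the stored name when the identity file exists and is non-empty.
resolve_name() {
  if [ -s "$1" ]; then cat "$1"; else random_name "$2"; fi
}

# Example call (paths as named on this page):
#   resolve_name /etc/openclaw_identity deploy/eff_large_wordlist.txt
```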

The task dispatcher (tasks/main.yml) branches on ansible_facts['os_family']:

  • Debian → linux-system.yml + security-linux.yml + container-deploy.yml
  • Darwin → macos-system.yml + security-macos.yml + macos-native.yml
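
Expressed as a shell case statement, the dispatch looks roughly like this. The real logic lives in Ansible conditionals in tasks/main.yml; `tasks_for_family` is a hypothetical name used only for illustration.

```shell
# Map an Ansible os_family value to the task files that play would include.
tasks_for_family() {
  case "$1" in
    Debian) echo "linux-system.yml security-linux.yml container-deploy.yml" ;;
    Darwin) echo "macos-system.yml security-macos.yml macos-native.yml" ;;
    *)      echo "unsupported os_family: $1" >&2; return 1 ;;
  esac
}

tasks_for_family Debian
# prints: linux-system.yml security-linux.yml container-deploy.yml
```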

Post-Deploy Operations

Telegram Setup

# Pass token at deploy time (via deploy.env or flag)
make deploy-macmini   # with DEPLOY_TELEGRAM_TOKEN set in deploy.env
# or:
./deploy.sh ... --telegram-token "$BOT_TOKEN"

# Or configure after deploy
ssh -i deploy/ssh-keys/<hostname>.pem openclaw@<target>

# macOS native deployment:
vi ~/.openclaw/openclaw.json
# Set channels.telegram.botToken
sudo launchctl kickstart -k system/com.openclaw.agent

# VPS container deployment:
vi ~/openclaw-docker/openclaw-data/openclaw.json
# Set channels.telegram.botToken
podman restart alhazen-agent

macOS native stores config in ~/.openclaw/openclaw.json. VPS containers store it in ~/openclaw-docker/openclaw-data/openclaw.json.

Managing Channel Users

Add/remove Telegram users without a full redeploy:

# Add a user
./update-channels.sh -t <ip> --target-type macmini \
  --channel telegram --add-user 7365829064

# Remove a user
./update-channels.sh -t <ip> --channel telegram --remove-user 7365829064

# List current users
./update-channels.sh -t <ip> --channel telegram --list

Updating the Squid Allowlist

# Edit the template
vim roles/alhazen-setup/templates/allowlist.txt.j2

# Push to target
./update-allowlist.sh -t <ip> --target-type macmini
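
Before pushing a change, you may want to check whether a given domain would be covered by the allowlist. The sketch below assumes the rendered allowlist.txt holds one domain per line and follows Squid's dstdomain convention, where a leading dot also matches subdomains; the actual file format in this repository is not shown here, and `domain_allowed` is a hypothetical helper.

```shell
# Succeed if the domain matches an entry in a one-domain-per-line allowlist.
# ".example.com" matches example.com and any subdomain; "example.com" matches exactly.
domain_allowed() (
  domain=$1; file=$2
  while IFS= read -r entry || [ -n "$entry" ]; do
    case "$entry" in ''|'#'*) continue ;; esac   # skip blanks and comments
    case "$entry" in
      .*) bare=${entry#.}
          case "$domain" in "$bare" | *"$entry") exit 0 ;; esac ;;
      *)  if [ "$domain" = "$entry" ]; then exit 0; fi ;;
    esac
  done < "$file"
  exit 1
)

# Demo with a throwaway allowlist.
list=$(mktemp)
printf '# example entries\n.anthropic.com\napi.telegram.org\n' > "$list"
domain_allowed api.anthropic.com "$list" && echo "api.anthropic.com: allowed"
domain_allowed example.org "$list" || echo "example.org: blocked"
rm -f "$list"
```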

Updating Skills

# Option 1: Full redeploy (preferred)
make deploy-macmini   # or: make deploy-vps

# Option 2: Fast skill-only update (no full redeploy)
# On the target host (Mac Mini paths shown; on the VPS the openclaw home is /home/openclaw):
bash /Users/openclaw/skillful-alhazen/deploy/update-skills.sh <branch> /Users/openclaw

Switching Models

Edit two files on the target, then restart:

# 1. LiteLLM config (model routing)
vi ~/openclaw-docker/litellm-config.yaml

# 2. OpenClaw config (agent model selection)
vi ~/openclaw-docker/openclaw-data/openclaw.json
# Change: models.providers.anthropic.models[0].id
# Change: agents.defaults.model.primary

# 3. Restart
docker restart alhazen-litellm alhazen-agent   # Mac Mini
podman restart alhazen-litellm alhazen-agent   # VPS

Valid Anthropic model IDs: claude-sonnet-4-6, claude-opus-4-6, claude-haiku-4-5-20251001.
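
Because the model ID has to agree across both files, a quick text search after editing can catch a mismatch before the restart. This is a rough sketch: `files_missing_model` is a hypothetical helper, and a plain substring search does not verify that the ID sits in the right field.

```shell
# Print every file that does not mention the given model ID.
files_missing_model() (
  model=$1; shift
  for f in "$@"; do
    grep -Fq -- "$model" "$f" || printf '%s\n' "$f"
  done
)

# Example call on the target (paths as given above):
#   files_missing_model claude-sonnet-4-6 \
#     ~/openclaw-docker/litellm-config.yaml \
#     ~/openclaw-docker/openclaw-data/openclaw.json
# No output means both files mention the ID.
```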


Resource Requirements

Target Min RAM Min Disk Notes
VPS 4 GB 20 GB TypeDB ~2GB, LiteLLM ~1GB
Mac Mini 8 GB 20 GB Docker Desktop overhead

Container Limits

Container Memory CPU Notes
TypeDB 2 GB 2.0 JVM heap
LiteLLM 1 GB 1.0 Was 512MB, caused OOM on startup
MCP Server 1 GB 1.0 Python + TypeDB driver
Dashboard 512 MB 1.0 Next.js

Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| LiteLLM crash-loops | OOM; needs ~700MB at idle | Increase mem_limit in docker-compose (template sets 1GB) |
| Agent can't reach LiteLLM | HTTP_PROXY set on agent container | Agent must NOT have proxy env vars; uses direct network |
| Model not found (HTTP 404) | Wrong model ID in litellm-config | Use exact IDs: claude-sonnet-4-6, etc. |
| TypeDB not ready | Takes 30-60s to start | Wait for healthcheck; check docker logs alhazen-typedb |
| openclaw can't access Docker | Not in staff group (macOS) | dscl . -append /Groups/staff GroupMembership openclaw |
| pf firewall not loading | launchd plist not loaded | sudo launchctl load /Library/LaunchDaemons/com.openclaw.pf.plist |
| Skills not found by agent | Skills not in managed dir | Re-deploy (make deploy-macmini/deploy-vps) or run update-skills.sh on target. macOS native: check ~/.openclaw/skills/. VPS: check openclaw-docker/openclaw-data/skills/ |
| SSH password rejected | Password auth disabled by deploy | Use generated key: ssh -i ssh-keys/<hostname>.pem openclaw@<ip> |
| Telegram HTTP 400 no_db_connection (macmini native) | LiteLLM needs DB bypass setting | Add allow_requests_on_db_unavailable: true to general_settings in litellm-config.yaml; restart LiteLLM |
| Telegram HTTP 500 (macmini native) | prisma module missing from LiteLLM install | sudo -u openclaw uv tool install --force --with prisma 'litellm[proxy]' |
| MCP service crash-loops silently (macmini native) | launchd scheduling issue on first boot | sudo launchctl kickstart -k system/com.alhazen.mcp |

Mac Mini native deployment has additional failure modes. See Deployment: Mac Mini Native for the full troubleshooting guide.


Directory Layout on Target

macOS Native (Mac Mini)

After deployment with make deploy-macmini, the openclaw user's home looks like:

/Users/openclaw/
├── skillful-alhazen/          # Cloned repo (skills, schemas)
├── .openclaw/                 # Agent config (macOS native)
│   ├── openclaw.json          # Agent config (mode 0600)
│   ├── mcp.json               # MCP server connections
│   ├── tools.yaml             # Shell command allowlist
│   ├── exec-approvals.json    # Command execution policy
│   ├── gateway.crt / .key     # Self-signed SSL certificate
│   └── skills/                # Skill directories (symlinked from repo)
├── workspace/                 # Agent workspace (CLAUDE.md, identity files)
├── litellm-config.yaml        # LiteLLM config (macOS native)
├── secrets.env                # LITELLM_MASTER_KEY, OPENCLAW_GATEWAY_TOKEN (mode 0600)
└── logs/                      # Service logs (openclaw, litellm, mcp)

TypeDB and alhazen-hub run as Docker containers operated separately (via Docker Desktop). All other services (openclaw, litellm, alhazen-mcp) run as launchd daemons under the openclaw user.

VPS Container Stack (Linux)

After deployment with make deploy-vps, the openclaw user's home looks like:

/home/openclaw/
├── skillful-alhazen/                # Cloned repo (for container builds + schemas)
└── openclaw-docker/                 # Compose project directory
    ├── docker-compose.yml
    ├── Dockerfile
    ├── .env                         # LITELLM_MASTER_KEY, OPENCLAW_GATEWAY_TOKEN
    ├── litellm-config.yaml
    ├── litellm.env                  # Real API key (mode 0600)
    ├── squid.conf
    ├── allowlist.txt
    ├── monitor.sh
    ├── typedb-data/                 # TypeDB persistent storage
    ├── workspace/                   # Agent workspace (CLAUDE.md, identity files)
    ├── openclaw-data/               # Agent config + state
    │   ├── openclaw.json            # Agent config (mode 0600)
    │   ├── mcp.json                 # MCP server connections
    │   ├── tools.yaml               # Shell command allowlist
    │   ├── exec-approvals.json      # Command execution policy
    │   ├── gateway.crt / .key       # Self-signed SSL certificate
    │   ├── credentials/             # Telegram pairing, etc.
    │   └── skills/                  # Copied from repo .claude/skills/
    └── openclaw-ssh/                # Agent SSH keys

Supported LLM Providers

Provider Model Examples Notes
anthropic claude-sonnet-4-6, claude-opus-4-6 Default provider
openai gpt-4o Via LiteLLM translation
ollama qwen2.5:0.5b, llama3, deepseek-r1:8b Local/network Ollama server
openai_compatible Any Custom base URL + API key

Idempotent Re-Deploys

Running deploy.sh against an already-deployed target is safe:

  • The 3-word hostname is read from /etc/openclaw_identity and reused
  • SSH keys are only generated if the local key file doesn't exist
  • Existing secrets (.env) are preserved — master key and gateway token carry forward
  • TypeDB initialization skips if the alhazen_notebook database already exists
  • User creation skips if the openclaw user already exists
  • Containers are stopped, rebuilt, and restarted cleanly
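
Most of the guarantees above follow one pattern: create an artifact only if it does not already exist. A minimal sketch of that guard is shown below; `run_once_for` is a hypothetical helper, and the ssh-keygen key type in the usage comment is an assumption rather than what deploy.sh necessarily uses.

```shell
# Run the given command only when the target path does not exist yet.
# Usage: run_once_for <path> <command...>
run_once_for() (
  target=$1; shift
  if [ -e "$target" ]; then
    echo "reusing $target"
  else
    "$@" && echo "created $target"
  fi
)

# Demo: the second call leaves the file untouched.
demo=$(mktemp -d)
run_once_for "$demo/marker" touch "$demo/marker"
run_once_for "$demo/marker" touch "$demo/marker"
rm -rf "$demo"

# Applied to key generation, this might look like:
#   run_once_for ssh-keys/<hostname>.pem \
#     ssh-keygen -t ed25519 -N "" -f ssh-keys/<hostname>.pem
```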

Known Issues and Future Work

Agent Egress Filtering

The agent currently bypasses the Squid proxy (see Network Design above). A future fix would use a transparent proxy with iptables in the container namespace, redirecting outbound port 443 to Squid's intercept port while exempting internal network traffic. This would restore domain-level egress filtering without depending on HTTP_PROXY env vars.

See deploy/README.md for detailed technical notes on the transparent proxy approach.

Upstream SDK Bug

The root cause is that @anthropic-ai/sdk honors HTTP_PROXY but ignores NO_PROXY. The issue is tracked informally. If it is fixed upstream, the original explicit-proxy architecture should work as designed.
