code-workers

Pull-based orchestration for AI coding agents inside the AI Fortress sandbox. A worker container long-polls a host-side coordinator for tasks, runs them under gVisor with a budget-capped virtual key, and posts results back. The fortress provides isolation; this layer adds a workflow.

Architecture

host shell ─POST──> coordinator (uvicorn, 127.0.0.1:7222)
                       │   queue
                       │
                       │  long-poll lease           upstream LLM
                       │                              ▲
                       │                              │ via fortress
                       ▼                              │
              ┌─ vsock relay (host) ─┐                │
              │  socat 7222          │                │
              └──────────┬───────────┘                │
                         │ AF_VSOCK                   │
              ┌──────────▼───────────┐                │
              │  coordinator-shim    │                │
              │  alias=skia in       │   ┌─────────────────────────┐
              │  sandbox_net         │   │ Bifrost @127.0.0.1:4000 │
              └──────────┬───────────┘   └─────────────▲───────────┘
                         │                             │
                         │ TCP within --internal       │
                         ▼                             │
                ┌──── worker container ──────────┐     │
                │ runsc, sandbox_net,            │     │
                │ HOME=/work, virtual key        │     │
                │ no internet, no upstream key   │     │
                │                                │     │
                │ worker.py:                     │     │
                │   GET http://skia:7222/...     │     │
                │   exec opencode run …          ──────┘ (Anthropic SDK
                │   POST result back             │      uses ANTHROPIC_BASE_URL
                │                                │      = authproxy:4000)
                └────────────────────────────────┘

Key boundaries:

  • The worker container has no direct internet egress. Only two things are reachable via the fortress's labelled vsock-shim mechanism: authproxy (LLM proxy) and skia (this coordinator).
  • The worker holds only a per-session sk-bf-* virtual key — never the real Anthropic upstream key.
  • The coordinator binds 127.0.0.1 only; the relay+shim is the only path in from sandboxes.
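
In code, the worker side of this protocol is a small lease/execute/report loop. A minimal sketch of the shape of worker.py, assuming JSON task bodies with id and prompt fields and an X-Fortress-Key auth header (the lease path is the documented one; the result path, field names, and header name are assumptions):

import os, subprocess, time, requests

COORD = "http://skia:7222"                    # reachable only via the coordinator-shim
REPO = os.environ["REPO_NAME"]
HEADERS = {"X-Fortress-Key": os.environ["FORTRESS_API_KEY"]}  # header name assumed

while True:
    try:
        # Long poll: the coordinator holds this request open until a task arrives.
        r = requests.get(f"{COORD}/tasks/lease/{REPO}", headers=HEADERS, timeout=90)
    except requests.RequestException:
        time.sleep(2)                         # relay hiccup; poll again
        continue
    if r.status_code != 200:
        continue                              # empty queue or poll window closed
    task = r.json()
    # ANTHROPIC_BASE_URL already points at Bifrost, so the agent's LLM traffic
    # rides the authproxy path with the budget-capped virtual key.
    proc = subprocess.run(["opencode", "run", task["prompt"]],
                          capture_output=True, text=True, cwd="/work")
    requests.post(f"{COORD}/tasks/{task['id']}/result", headers=HEADERS,
                  json={"exit_code": proc.returncode, "output": proc.stdout})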

Prerequisites

  • AI Fortress installed (Phase 1 + Phase 2 verified — see ai-fortress/README.md).
  • You're in the fortress group (sudo -n /usr/local/sbin/fortress-mint works without prompting).
  • Python 3.12+ on host with a venv:
    python3 -m venv .venv && .venv/bin/pip install -r requirements.txt
  • .env in this directory containing at minimum:
    FORTRESS_API_KEY=<random secret used by both coordinator and worker>
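
Both sides present this shared secret on every request, and a mismatch is what produces the 403 described under Troubleshooting. A minimal sketch of the coordinator-side check as a FastAPI dependency, assuming an X-Fortress-Key header and an illustrative /healthz route (neither is the actual wire format):

import os, secrets
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()
EXPECTED = os.environ["FORTRESS_API_KEY"]     # read from .env at startup

def require_key(x_fortress_key: str = Header(default="")) -> None:
    # Every endpoint rejects a missing or wrong key with a 403.
    if not secrets.compare_digest(x_fortress_key, EXPECTED):
        raise HTTPException(status_code=403)

@app.get("/healthz", dependencies=[Depends(require_key)])
def healthz():
    return {"ok": True}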

Install

bash install.sh

What it does:

  1. Installs and starts code-workers-coordinator.service as a user-mode systemd unit (~/.config/systemd/user/), uvicorn bound to 127.0.0.1:7222. Runs as you. Enables loginctl --linger so it survives logout. (System-mode would require relaxing SELinux to let init_t traverse user_home_t paths to read your venv.)
  2. Installs and starts code-workers-coordinator-relay.service as a system-mode unit (vsock 7222 → tcp 127.0.0.1:7222; see the Python sketch after this list).
  3. Builds the worker image on the host.
  4. Pushes the worker image into the VM via ai-fortress/push_image_to_vm.sh.
  5. Installs coordinator-shim.service in the VM (label ai-fortress.shim.alias=skia) and starts it.
  6. Removes legacy /opt/bin/worker-up if present (replaced by agent).
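
The relay in step 2 is just a byte pump between address families. A minimal Python equivalent of the socat hop, assuming a host kernel with AF_VSOCK support (the real unit runs socat; this sketch only makes the plumbing concrete):

import socket, threading

def pipe(src, dst):
    try:
        while data := src.recv(65536):
            dst.sendall(data)
    finally:
        dst.close()

srv = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
srv.bind((socket.VMADDR_CID_ANY, 7222))       # accept from any guest CID
srv.listen()
while True:
    conn, _ = srv.accept()
    tcp = socket.create_connection(("127.0.0.1", 7222))   # the coordinator
    threading.Thread(target=pipe, args=(conn, tcp), daemon=True).start()
    threading.Thread(target=pipe, args=(tcp, conn), daemon=True).start()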

Status checks afterwards:

systemctl --user status code-workers-coordinator    # user-mode
systemctl status code-workers-coordinator-relay     # system-mode
ssh ranton@<vm> systemctl status coordinator-shim   # in VM

To roll back: bash uninstall.sh.

Usage

Launch a worker session

bash run-worker rhizome

This is a one-line wrapper around the fortress launcher; equivalent to:

~/bin/agent rhizome \
  --image code-workers/worker:latest \
  --env REPO_NAME=rhizome \
  --env FORTRESS_API_KEY="$FORTRESS_API_KEY" \
  --env OPENCODE_API_KEY="$OPENCODE_API_KEY"

The container starts polling the coordinator immediately. Leave it running.

Send a task

bash send_task.sh rhizome "fix the login bug in auth.py"
bash send_task.sh --wait rhizome "add unit tests for the parser module"

The --wait flag polls and prints results when the task completes.
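
Both helpers are thin HTTP clients against the coordinator on 127.0.0.1:7222. A minimal sketch of the submit-then-wait flow, assuming a POST /tasks endpoint, a GET /tasks/<id> status endpoint, and the same assumed header and field names as the worker sketch above:

import os, sys, time, requests

COORD = "http://127.0.0.1:7222"
HEADERS = {"X-Fortress-Key": os.environ["FORTRESS_API_KEY"]}  # header name assumed

repo, prompt = sys.argv[1], sys.argv[2]
task = requests.post(f"{COORD}/tasks", headers=HEADERS,
                     json={"repo": repo, "prompt": prompt}).json()
print("queued", task["id"])

# --wait: poll status until a worker posts a result.
while True:
    status = requests.get(f"{COORD}/tasks/{task['id']}", headers=HEADERS).json()
    if status.get("state") == "done":
        print(status.get("output", ""))
        break
    time.sleep(2)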

Inspect a task

bash poll_task.sh task-1234567890-12345

Files

File                                            Role
coordinator.py                                  FastAPI app — task queue, lease/result/status endpoints, optional Paperclip webhook
worker.py                                       In-sandbox poller; long-polls GET /tasks/lease/<repo> and runs opencode run
Dockerfile.worker                               Worker image (Python + Rust + Claude Code + OpenCode + the worker.py entrypoint)
Dockerfile.python                               General Python dev sandbox (built by build_python.sh)
run-worker                                      Thin wrapper around ~/bin/agent
send_task.sh / poll_task.sh                     Host-shell helpers
install/code-workers-coordinator.service        Host systemd unit for uvicorn
install/code-workers-coordinator-relay.service  Host systemd unit for vsock 7222 → tcp 127.0.0.1:7222
install/coordinator-shim.service                VM systemd unit for the labelled coordinator-shim container
install.sh / uninstall.sh                       One-shot install + reverse
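
The lease endpoint is what makes the system pull-based: a worker's GET blocks until a task shows up or the poll window closes. A minimal sketch of that long-poll shape, assuming one in-memory asyncio.Queue per repo, a 60-second window, and a timestamp-pid id scheme matching the example under Inspect a task (all assumptions about coordinator.py's internals; auth omitted for brevity):

import asyncio, os, time
from collections import defaultdict
from fastapi import FastAPI, Response

app = FastAPI()
queues: dict[str, asyncio.Queue] = defaultdict(asyncio.Queue)

@app.post("/tasks")
async def submit(body: dict):
    body["id"] = f"task-{int(time.time())}-{os.getpid()}"  # id scheme assumed
    await queues[body["repo"]].put(body)
    return body

@app.get("/tasks/lease/{repo}")
async def lease(repo: str):
    try:
        # Hold the request open until a task arrives; the worker re-polls on 204.
        task = await asyncio.wait_for(queues[repo].get(), timeout=60)
    except asyncio.TimeoutError:
        return Response(status_code=204)
    return task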

OpenCode permissions inside the worker

Dockerfile.worker writes an OpenCode config that allows everything except the three server-side LLM tools that bypass the network sandbox at the application layer (webfetch, websearch, codesearch). The fortress can't catch those at the network or syscall layer because they ride inside legitimate Messages API calls. See ai-fortress/ARCHITECTURE.md, "LLM-as-egress channel," for the full discussion. A future improvement is a Bifrost-side request-scrub that strips these tools from any inbound request body — defense-in-depth that doesn't depend on per-image config.
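
A minimal sketch of what that scrub could look like, assuming Bifrost (or any middleware in front of it) can run a hook over the JSON body of each inbound Messages API request (the hook mechanism and the exact tool-name matching are assumptions):

# Strip the server-side LLM tools from a Messages API request body before it
# goes upstream. Tool names taken from the paragraph above; matching on both
# "name" and "type" is a defensive assumption.
BLOCKED = {"webfetch", "websearch", "codesearch"}

def scrub_tools(body: dict) -> dict:
    kept = [t for t in body.get("tools", [])
            if t.get("name") not in BLOCKED and t.get("type") not in BLOCKED]
    if kept:
        body["tools"] = kept
    else:
        body.pop("tools", None)   # drop the key entirely if nothing survives
    return body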

Troubleshooting

  • Worker logs 403 Forbidden from the coordinator → FORTRESS_API_KEY mismatch. The worker reads the key from its env (passed by run-worker); the coordinator reads it from .env at startup. They must match.
  • Worker logs Cannot connect to coordinator at http://skia:7222 → the coordinator-shim isn't running, or the host relay isn't running, or the host coordinator isn't running. Check (note the different --user flag for the coordinator):
    systemctl --user status code-workers-coordinator
    systemctl status code-workers-coordinator-relay
    ssh ranton@<vm> systemctl status coordinator-shim
  • agent says sudo: a password is required → your shell predates usermod -aG fortress. newgrp fortress (one shell) or log out and back in (everywhere).
  • Worker logs EACCES: mkdir /.local → agent-vm should set HOME=/work automatically; check that you're on the latest fortress (agent-vm includes the fix).

Hardening roadmap

plan.md lists the originally planned hardening steps (auth, structured logging, persistence). The fortress integration adds a separate set of guarantees on top: sandbox network isolation, virtual-key budgets, no upstream-key leak, runsc syscall confinement. The two layers are complementary.
