Merged
22 changes: 22 additions & 0 deletions multi-agent/deploy/README.md
@@ -0,0 +1,22 @@
# deploy

Production bring-up templates for the agents in `cmd/`. Unlike `examples/`
(which are end-to-end Go demos), each subdirectory here ships an installer
script and config templates you point at a real host.

| Path | Target |
|---|---|
| [`linux/observer`](linux/observer/) | Generic `observer-server` install. SQLite-backed HTTP daemon (default `:8090`); foreground or `--systemd`. amd64 / arm64. |
| [`linux/driver`](linux/driver/) | Generic `driver-agent` install into a Claude Code project dir (no systemd — Claude Code launches the MCP server on demand). |
| [`linux/slave`](linux/slave/) | Generic `slave-agent` install on any Linux host. Foreground smoke mode or `--systemd` for a managed service. amd64 / arm64. |
| [`linux/compose-test`](linux/compose-test/) | docker-compose end-to-end test wiring all three installers together against a local observer; surfaces the device-code "join workspace" URLs each role prints on first start. |

Pre-built binaries for each release are published at
<https://github.com/agentserver/loom/releases>. Each `install.sh` accepts
`--bin PATH` to point at a downloaded asset; otherwise it looks in `./bin/`
relative to itself.
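
The lookup order described above could be sketched as follows. This is an illustration only, not the code in any `install.sh`: the function name `resolve_bin` and its arguments are hypothetical, and the `<name>.linux-<arch>` file naming is an assumption borrowed from the release asset names used by the compose test.

```bash
#!/usr/bin/env bash
# Sketch only: resolve_bin is a hypothetical helper, not part of install.sh.
# Usage: resolve_bin SCRIPT_DIR NAME ARCH [OVERRIDE]
# Prints the binary path, or returns 1 if nothing executable is found.
resolve_bin() {
  local script_dir=$1 name=$2 arch=$3 override=${4:-}
  local candidate
  if [[ -n "$override" ]]; then
    candidate=$override                             # an explicit --bin PATH wins
  else
    candidate="$script_dir/bin/$name.linux-$arch"   # assumed layout under ./bin/
  fi
  [[ -x "$candidate" ]] || return 1
  printf '%s\n' "$candidate"
}
```

For example, `resolve_bin "$(dirname "$0")" slave-agent amd64 "$BIN_OVERRIDE"` would print the override when one is given and fall back to the `./bin/` copy otherwise (`BIN_OVERRIDE` is likewise a made-up variable name).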

For the pre-wired prod-test bundle (`driver-prod`, `slave-jetson-prod`,
`slave-local-prod` against `agent.cs.ac.cn` / `ws-prod`), see
[`../tests/prod_test/`](../tests/prod_test/) — that bundle is for the
project's own staging environment and is gitignored.
4 changes: 4 additions & 0 deletions multi-agent/deploy/linux/bin/.gitignore
@@ -0,0 +1,4 @@
# Binaries land here at deploy time. Pull pre-built ones from
# https://github.com/agentserver/loom/releases or build into this dir.
*
!.gitignore
2 changes: 2 additions & 0 deletions multi-agent/deploy/linux/compose-test/.gitignore
@@ -0,0 +1,2 @@
# Volumes / runtime state if you ever bind-mount instead of using named volumes
state/
13 changes: 13 additions & 0 deletions multi-agent/deploy/linux/compose-test/Dockerfile
@@ -0,0 +1,13 @@
FROM debian:bookworm-slim

# install.sh uses sudo (no-op as root, but the binary must exist), xxd for
# random keygen, curl for healthchecks, ca-certs for HTTPS to agent.cs.ac.cn.
RUN apt-get update && apt-get install -y --no-install-recommends \
      bash sudo ca-certificates curl python3 xxd \
    && rm -rf /var/lib/apt/lists/*

# Compose bind-mounts deploy/linux/ at /opt/loom/deploy at runtime.
# Each container's per-instance dir lives under /var/lib/loom/.
WORKDIR /var/lib/loom

ENV API_KEY="COMPOSE_TESTKEY_dont_use_in_prod"
114 changes: 114 additions & 0 deletions multi-agent/deploy/linux/compose-test/README.md
@@ -0,0 +1,114 @@
# compose-test

End-to-end smoke test for the three `deploy/linux/` templates
(observer / driver / slave). Spins up a local observer in one container,
then exercises the slave and driver installers against it in two more
containers, and surfaces the device-code "join workspace" URL each agent
prints on first start.

## What it verifies

1. `observer/install.sh` renders `observer.yaml`, stages the binary, and
   the daemon opens a TCP listener on `:8090`.
2. `slave/install.sh` renders config (with `--observer-url` /
   `--workspace`), stages the binary, and `slave-agent` boots and reaches
   the device-code OAuth step.
3. `driver/install.sh` renders the project (`config.yaml` + `.mcp.json`)
   and `driver-agent register` reaches the device-code step.
4. The same API key flows end to end: the observer's workspace bootstrap
   key ↔ slave / driver `observer.api_key` ↔ per-agent token mint.
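
Check 1 can also be probed from the host. The following is a minimal sketch, assuming the `127.0.0.1:18090` port mapping from `docker-compose.yml` and using the same bash `/dev/tcp` technique as the compose healthcheck. `port_open` and `wait_for_port` are hypothetical helper names, not part of the repository.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.
# port_open HOST PORT -> exit 0 if a TCP connect succeeds within 1 second.
port_open() {
  timeout 1 bash -c 'echo > "/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# wait_for_port HOST PORT TRIES [DELAY_SECONDS]
# Polls until the port accepts a connection or the tries are exhausted.
wait_for_port() {
  local host=$1 port=$2 tries=$3 delay=${4:-2} i
  for ((i = 0; i < tries; i++)); do
    port_open "$host" "$port" && return 0
    sleep "$delay"
  done
  return 1
}
```

With the stack running, `wait_for_port 127.0.0.1 18090 30 && echo "observer is listening"` should succeed once the observer container reports healthy.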

Steps that require human interaction (clicking the device-code URLs) are
left as the operator's job — the test surfaces the URLs prominently.

## What it does NOT cover

- `claude` CLI inside the slave container — `chat`-skill tasks won't run.
Add `claude` (and `ANTHROPIC_API_KEY` via `--anthropic-key`) if you want
to exercise that path.
- The driver's `serve-mcp` step. Driver stops at `register` because
`serve-mcp` is invoked by Claude Code's `.mcp.json`, not directly.
- Reachability to `agent.cs.ac.cn` — the tunnel registration is a hard
external dependency. If your sandbox blocks that host, both driver and
slave will stall at the device-code step.

## Prereqs

1. Docker + `docker compose` v2 (or the legacy `docker-compose` binary).
2. The three `linux-amd64` binaries dropped into `../bin/`:
   ```bash
   cd ../bin
   for n in observer-server driver-agent slave-agent; do
     curl -L -o "$n.linux-amd64" \
       "https://github.com/agentserver/loom/releases/download/v0.0.1/$n.linux-amd64"
     chmod +x "$n.linux-amd64"
   done
   ```
   Or build them with `make` / `go build` (see the per-role install.sh
   error messages for the exact `go build` commands).
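
Before running compose, a quick host-side check can confirm that all three binaries are in place. The sketch below is a convenience under stated assumptions, not something this change ships: `missing_bins` is a hypothetical name, and it simply mirrors the `-x` test that each entrypoint performs inside its container.

```bash
#!/usr/bin/env bash
# Sketch only: missing_bins is a hypothetical helper.
# Usage: missing_bins BIN_DIR
# Prints each required binary that is absent or not executable.
# Returns 1 if anything is missing, 0 otherwise.
missing_bins() {
  local dir=$1 n rc=0
  for n in observer-server driver-agent slave-agent; do
    if [[ ! -x "$dir/$n.linux-amd64" ]]; then
      printf '%s\n' "$n.linux-amd64"
      rc=1
    fi
  done
  return "$rc"
}
```

Run from `deploy/linux/compose-test`, `missing_bins ../bin` prints nothing and exits 0 when the prerequisites are met.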

## Run

```bash
cd deploy/linux/compose-test
docker compose up --build
```

Expected output (interleaved across services):

```
loom-observer | ==> creating /var/lib/loom/compose-observer
loom-observer | ==> done.
loom-observer | ==============================================
loom-observer | Observer is up. Wire other agents with:
loom-observer | observer.url: http://observer:8090
loom-observer | observer.workspace_id: ws-test
loom-observer | observer.api_key: COMPOSE_TESTKEY_dont_use_in_prod
loom-observer | ==============================================
loom-observer | 2026/05/21 17:30:00 observer-server listening on :8090
loom-slave | ==> creating /var/lib/loom/compose-slave
loom-slave | ==============================================
loom-slave | slave: deploy succeeded. Starting slave-agent.
loom-slave | ==============================================
loom-slave |
loom-slave | Open this URL to authenticate:
loom-slave |
loom-slave | https://agent.cs.ac.cn/oauth2/device/verify?user_code=VhspjQLp
loom-slave |
loom-driver | Open this URL to register "compose-driver":
loom-driver | https://agent.cs.ac.cn/oauth2/device/verify?user_code=7AUTGKNs
```
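
Because the output is interleaved, it can help to filter out just the verification URLs. The following is a rough sketch that assumes the `<container> | <message>` line shape shown above. `extract_device_urls` is a hypothetical helper name, and the pattern only looks for an `https://` URL carrying a `user_code=` parameter.

```bash
#!/usr/bin/env bash
# Sketch only: extract_device_urls is a hypothetical helper.
# Reads compose log text on stdin and prints "<container> <url>" for every
# line that contains a device-code verification URL.
extract_device_urls() {
  sed -nE 's#^([^ |]+) *[|].*(https://[^[:space:]]+user_code=[A-Za-z0-9-]+).*$#\1 \2#p'
}
```

One way to use it is `docker compose logs --no-color | extract_device_urls` from a second terminal; `--no-color` keeps ANSI escapes out of the container prefix.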

Visit each URL in a browser and approve the request. After approval:

- The driver's `register` exits 0 and the `loom-driver` container stops.
- The slave's `slave-agent` keeps running, mints an observer token, and
  publishes its capability card. You can verify with:
  ```bash
  curl -sS -H "Authorization: Bearer COMPOSE_TESTKEY_dont_use_in_prod" \
    http://127.0.0.1:18090/api/agents | python3 -m json.tool
  ```

## Tear down

```bash
docker compose down -v   # removes the containers and their per-instance dirs; -v also drops any volumes
docker image rm loom-deploy-test:latest
```

## Files

| Path | Purpose |
|---|---|
| `Dockerfile` | debian:bookworm-slim + bash/sudo/curl/ca-certs/python3/xxd |
| `docker-compose.yml` | three services bind-mounting `../` and per-role entrypoints |
| `entrypoint-observer.sh` | runs `observer/install.sh` then execs `observer-server` |
| `entrypoint-driver.sh` | runs `driver/install.sh` then execs `driver-agent register` |
| `entrypoint-slave.sh` | runs `slave/install.sh` then execs `slave-agent` |

## Tweaking the test

To use a different key, change `API_KEY` in `Dockerfile` and rebuild. To
add a second slave, duplicate the `slave:` service block with a different
`container_name` and `INSTANCE`: either edit `entrypoint-slave.sh` to read
`INSTANCE` from the environment, or copy it to `entrypoint-slave-2.sh` and
point the new service's `command` at the copy.
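
Reading `INSTANCE` from the environment could look like the sketch below. It is a hypothetical edit to the variable block at the top of `entrypoint-slave.sh`, not something this change ships; the default preserves the current `compose-slave` behavior.

```bash
#!/usr/bin/env bash
# Sketch only: a hypothetical variant of the first two variable
# assignments in entrypoint-slave.sh.
# Take INSTANCE from the environment when set; otherwise keep the default.
INSTANCE=${INSTANCE:-compose-slave}
LOOM=/var/lib/loom/$INSTANCE
```

The duplicated service block would then set `INSTANCE` (for example to `compose-slave-2`) under its `environment:` key, so both slaves can share one entrypoint script. The `${VAR:-default}` form is also safe under the script's `set -u`.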
69 changes: 69 additions & 0 deletions multi-agent/deploy/linux/compose-test/docker-compose.yml
@@ -0,0 +1,69 @@
name: loom-deploy-test

# Verifies the three deploy templates (observer / driver / slave) work
# end-to-end against a local observer. Each service runs the actual
# deploy/linux/<role>/install.sh inside a debian container, then execs
# the binary in foreground.
#
# The driver and slave hit https://agent.cs.ac.cn for tunnel registration on
# first start. Each will print a "device-code" verification URL — visit it
# in a browser and approve to advance the deploy.
#
# Bind-mounts: this file lives at deploy/linux/compose-test/, so `../` is
# deploy/linux/ and `../bin` is where the binaries are expected.
#
# Usage:
# 1. Drop pre-built or downloaded binaries into deploy/linux/bin/
# (observer-server.linux-amd64, driver-agent.linux-amd64, slave-agent.linux-amd64)
# 2. cd deploy/linux/compose-test
# 3. docker compose up --build
# 4. Watch the logs — each role prints the URL you need to visit.

services:
  observer:
    build:
      context: .
      dockerfile: Dockerfile
      # Use the host network during build so apt-get can reach deb.debian.org
      # in sandboxes where the docker daemon's default build network has no DNS.
      network: host
    image: loom-deploy-test:latest
    container_name: loom-observer
    ports:
      - "127.0.0.1:18090:8090"
    volumes:
      - ../:/opt/loom/deploy:ro
    command: ["bash", "/opt/loom/deploy/compose-test/entrypoint-observer.sh"]
    healthcheck:
      test: ["CMD-SHELL", "bash -c 'echo > /dev/tcp/127.0.0.1/8090' 2>/dev/null || exit 1"]
      interval: 2s
      timeout: 1s
      retries: 30
      start_period: 5s
    init: true

  driver:
    image: loom-deploy-test:latest
    container_name: loom-driver
    depends_on:
      observer:
        condition: service_healthy
    volumes:
      - ../:/opt/loom/deploy:ro
    command: ["bash", "/opt/loom/deploy/compose-test/entrypoint-driver.sh"]
    init: true
    stdin_open: true
    tty: true

  slave:
    image: loom-deploy-test:latest
    container_name: loom-slave
    depends_on:
      observer:
        condition: service_healthy
    volumes:
      - ../:/opt/loom/deploy:ro
    command: ["bash", "/opt/loom/deploy/compose-test/entrypoint-slave.sh"]
    init: true
    stdin_open: true
    tty: true
55 changes: 55 additions & 0 deletions multi-agent/deploy/linux/compose-test/entrypoint-driver.sh
@@ -0,0 +1,55 @@
#!/usr/bin/env bash
# Compose-test entrypoint for the driver.
# 1. Sanity-check the bind-mounted binary
# 2. Run deploy/linux/driver/install.sh to render project dir + config + .mcp.json
# 3. Run `driver-agent register` — blocks on a device-code URL printed on stdout
# 4. After approval, register exits 0; the container exits (the next step,
# `claude` opening the .mcp.json, is up to the operator)

set -euo pipefail

INSTANCE=compose-driver
PROJECT=/var/lib/loom/$INSTANCE
BIN=/opt/loom/deploy/bin/driver-agent.linux-amd64

if [[ ! -x "$BIN" ]]; then
  cat <<EOF >&2
ERROR: missing $BIN
Drop the driver binary into deploy/linux/bin/ before 'docker compose up':
  curl -L -o deploy/linux/bin/driver-agent.linux-amd64 \\
    https://github.com/agentserver/loom/releases/download/v0.0.1/driver-agent.linux-amd64
  chmod +x deploy/linux/bin/driver-agent.linux-amd64
EOF
  exit 1
fi

# install.sh's --skill-bundle default path doesn't resolve inside the
# container's mount layout; pass empty (no bundle) explicitly.
/opt/loom/deploy/driver/install.sh \
  --project "$PROJECT" \
  --name "$INSTANCE" \
  --observer-url "http://observer:8090" \
  --workspace ws-test \
  --api-key "$API_KEY" \
  --token-dir "$PROJECT" \
  --skill-bundle "" \
  --bin "$BIN"

cat <<EOF

================================================================
driver: deploy succeeded. Now running 'driver-agent register'
to mint agentserver credentials via device-code OAuth.

In a few seconds you'll see a line like:

Open this URL to register "compose-driver":
https://agent.cs.ac.cn/oauth2/device/verify?user_code=XXXXXXXX

Visit that URL, approve, and the register step will write
sandbox + tunnel + proxy tokens back into config.yaml.
================================================================

EOF

exec "$PROJECT/driver-agent" register --config "$PROJECT/config.yaml"
48 changes: 48 additions & 0 deletions multi-agent/deploy/linux/compose-test/entrypoint-observer.sh
@@ -0,0 +1,48 @@
#!/usr/bin/env bash
# Compose-test entrypoint for the observer.
# 1. Sanity-check the bind-mounted binary
# 2. Run deploy/linux/observer/install.sh to render config + stage binary
# 3. Print the workspace credentials (so the operator can wire other agents)
# 4. exec observer-server in foreground

set -euo pipefail

INSTANCE=compose-observer
LOOM=/var/lib/loom/$INSTANCE
BIN=/opt/loom/deploy/bin/observer-server.linux-amd64

if [[ ! -x "$BIN" ]]; then
  cat <<EOF >&2
ERROR: missing $BIN
Drop the observer binary into deploy/linux/bin/ before 'docker compose up':
  curl -L -o deploy/linux/bin/observer-server.linux-amd64 \\
    https://github.com/agentserver/loom/releases/download/v0.0.1/observer-server.linux-amd64
  chmod +x deploy/linux/bin/observer-server.linux-amd64
EOF
  exit 1
fi

/opt/loom/deploy/observer/install.sh \
  --name "$INSTANCE" \
  --user root \
  --loom-home "$LOOM" \
  --listen ":8090" \
  --workspace ws-test \
  --workspace-name "Compose Test Workspace" \
  --api-key "$API_KEY" \
  --bin "$BIN"

cat <<EOF

================================================================
Observer is up. Wire other agents to this workspace with:

observer.url: http://observer:8090 (inside the compose net)
http://127.0.0.1:18090 (from your host)
observer.workspace_id: ws-test
observer.api_key: $API_KEY
================================================================

EOF

exec "$LOOM/observer-server" --config "$LOOM/observer.yaml"
57 changes: 57 additions & 0 deletions multi-agent/deploy/linux/compose-test/entrypoint-slave.sh
@@ -0,0 +1,57 @@
#!/usr/bin/env bash
# Compose-test entrypoint for the slave.
# 1. Sanity-check the bind-mounted binary
# 2. Run deploy/linux/slave/install.sh to render config + stage binary
# 3. exec slave-agent — on first start it does device-code OAuth, prints a URL,
# blocks until approved, then persists creds and starts publishing its
# capability card to the observer.

set -euo pipefail

INSTANCE=compose-slave
LOOM=/var/lib/loom/$INSTANCE
BIN=/opt/loom/deploy/bin/slave-agent.linux-amd64

if [[ ! -x "$BIN" ]]; then
  cat <<EOF >&2
ERROR: missing $BIN
Drop the slave binary into deploy/linux/bin/ before 'docker compose up':
  curl -L -o deploy/linux/bin/slave-agent.linux-amd64 \\
    https://github.com/agentserver/loom/releases/download/v0.0.1/slave-agent.linux-amd64
  chmod +x deploy/linux/bin/slave-agent.linux-amd64
EOF
  exit 1
fi

/opt/loom/deploy/slave/install.sh \
  --name "$INSTANCE" \
  --user root \
  --loom-home "$LOOM" \
  --observer-url "http://observer:8090" \
  --workspace ws-test \
  --api-key "$API_KEY" \
  --tag compose --tag test \
  --bin "$BIN"

cat <<EOF

================================================================
slave: deploy succeeded. Starting slave-agent in foreground.

On first start it runs the device-code OAuth flow against
agent.cs.ac.cn — watch for:

Open this URL to authenticate:
https://agent.cs.ac.cn/oauth2/device/verify?user_code=XXXXXXXX

Visit that URL, approve, and the slave will persist its
sandbox + tunnel + proxy tokens, then connect to the observer.

Note: the 'chat' skill needs the 'claude' CLI inside the
container, which this image does NOT install. 'bash' / 'file'
/ 'register_mcp' / 'claude_permissions' work without it.
================================================================

EOF

exec "$LOOM/slave-agent" "$LOOM/config.yaml"