Merged
29 changes: 17 additions & 12 deletions OPENCLAW.md
@@ -35,12 +35,16 @@ The OpenClaw example entrypoint does the following:
- `gateway.port` from `OPENCLAW_GATEWAY_PORT` (default `8080`)
- `gateway.bind` from `OPENCLAW_GATEWAY_BIND` (default `lan`)
4. Ensures `OPENCLAW_GATEWAY_TOKEN` exists (uses provided token or generates one).
5. Writes the gateway token to a local token file for bridge use.
6. Starts an image-owned ACP compatibility bridge on `0.0.0.0:2529` unless `OPENCLAW_ACP_ENABLED=false`.
7. When gateway auth mode is `trusted-proxy`, automatically trusts loopback for the internal bridge,
injects the required trusted-proxy headers on the bridge's upstream gateway hop, and rewrites the
5. Writes the gateway token to a local token file for ACP adapter use.
6. Starts an image-owned ACP adapter on `0.0.0.0:2529` unless `OPENCLAW_ACP_ENABLED=false`.
7. Exposes:
- WebSocket ACP on `/`
- `GET /healthz`
- `GET /.well-known/spritz-acp`
8. When gateway auth mode is `trusted-proxy`, automatically trusts loopback for the internal adapter,
injects the required trusted-proxy headers on the adapter's upstream gateway hop, and rewrites the
upstream gateway `connect` handshake into a Control UI operator session without device identity.
8. Auto-starts OpenClaw when command is default (`sleep infinity`), unless `OPENCLAW_AUTO_START=false`.
9. Auto-starts OpenClaw when command is default (`sleep infinity`), unless `OPENCLAW_AUTO_START=false`.
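
The environment handling in the steps above can be sketched as follows. This is a minimal illustration, not the image's actual entrypoint: the `EntrypointConfig` type, the `resolve_config` helper, the token length, and the exact-match comparison against the string `false` are all assumptions made for the example.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class EntrypointConfig:
    """Hypothetical container for the resolved entrypoint settings."""
    gateway_port: int
    gateway_bind: str
    gateway_token: str
    acp_enabled: bool
    auto_start: bool


def resolve_config(env: dict, command: str = "sleep infinity") -> EntrypointConfig:
    """Resolve settings from environment variables using the documented defaults."""
    # Use the provided token, or generate one when it is missing or empty.
    # The 32-byte hex format is an assumption for this sketch.
    token = env.get("OPENCLAW_GATEWAY_TOKEN") or secrets.token_hex(32)
    return EntrypointConfig(
        gateway_port=int(env.get("OPENCLAW_GATEWAY_PORT", "8080")),
        gateway_bind=env.get("OPENCLAW_GATEWAY_BIND", "lan"),
        gateway_token=token,
        # The ACP adapter stays enabled unless explicitly set to "false".
        acp_enabled=env.get("OPENCLAW_ACP_ENABLED") != "false",
        # Auto-start applies only when the container command is the default.
        auto_start=(command == "sleep infinity"
                    and env.get("OPENCLAW_AUTO_START") != "false"),
    )
```

Calling `resolve_config(dict(os.environ))` would then yield one immutable settings object that the remaining startup steps could read from.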

Key implication: direct `/w/{name}` access with `bind=lan` expects real gateway auth.

@@ -61,15 +65,16 @@ Spritz will then:

This ACP path is separate from OpenClaw's dashboard and gateway UI.

Today the example image satisfies that contract with a compatibility bridge:
Today the example image satisfies that contract with a long-lived ACP adapter:

- WebSocket server inside the image listens on `2529`
- each ACP connection spawns `spritz-openclaw-acp-wrapper`
- the wrapper talks to the local OpenClaw gateway over loopback WebSocket
- if gateway auth mode is `trusted-proxy`, the bridge uses a loopback-only header injector so the
- one long-lived Node ACP server inside the image listens on `2529`
- the adapter talks to the local OpenClaw gateway over loopback WebSocket
- ACP websocket clients connect to that long-lived adapter instead of spawning a fresh runtime
- the adapter also exposes cheap HTTP health and metadata endpoints for Spritz operator discovery
- if gateway auth mode is `trusted-proxy`, the adapter uses a loopback-only header injector so the
internal ACP hop satisfies the same trusted-proxy contract as the browser route
- in trusted-proxy mode, the wrapper impersonates a Control UI-style operator profile without
device identity so OpenClaw does not force pairing for the internal ACP bridge
- in trusted-proxy mode, the adapter impersonates a Control UI-style operator profile without
device identity so OpenClaw does not force pairing for the internal ACP client

This keeps the Spritz side backend-agnostic while OpenClaw remains free to add native socket ACP
later.
2 changes: 1 addition & 1 deletion README.md
@@ -174,7 +174,7 @@ Spritz is intended to remain portable and standalone:
- `api/`: authenticated API and ACP gateway
- `ui/`: built-in web UI
- `helm/spritz/`: standalone deployment chart
- `images/examples/openclaw/`: OpenClaw example runtime and ACP bridge
- `images/examples/openclaw/`: OpenClaw example runtime and ACP adapter
- `docs/`: architecture and deployment documents

## Key docs
3 changes: 3 additions & 0 deletions crd/generated/spritz.sh_spritzes.yaml
@@ -580,6 +580,9 @@ spec:
type: object
lastError:
type: string
lastMetadataAt:
format: date-time
type: string
lastProbeAt:
format: date-time
type: string
3 changes: 3 additions & 0 deletions crd/spritz.sh_spritzes.yaml
@@ -580,6 +580,9 @@ spec:
type: object
lastError:
type: string
lastMetadataAt:
format: date-time
type: string
lastProbeAt:
format: date-time
type: string
33 changes: 25 additions & 8 deletions docs/2026-03-09-acp-port-and-agent-chat-architecture.md
@@ -30,15 +30,24 @@ Spritz reserves one internal service/container port for ACP:

If a workspace process listens there and answers ACP `initialize`, Spritz treats it as an ACP agent.

For the current Spritz ACP runtime contract, the same service should also expose:

- `GET /healthz`
- `GET /.well-known/spritz-acp`

Those HTTP endpoints exist for control-plane health and metadata discovery. Browser and API ACP
traffic still uses the WebSocket endpoint.

### Discovery ownership

The operator owns ACP discovery.

When a workspace deployment is ready, the operator:

1. connects to `ws://<spritz>.<namespace>.svc.cluster.local:2529/`
2. sends ACP `initialize`
3. normalizes the response into `status.acp`
1. checks `http://<spritz>.<namespace>.svc.cluster.local:2529/healthz`
2. fetches `http://<spritz>.<namespace>.svc.cluster.local:2529/.well-known/spritz-acp` when ACP
metadata is missing or stale
3. normalizes that response into `status.acp`
4. sets the `ACPReady` condition on the `Spritz` resource

The API does not probe ACP during user requests.
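
The discovery steps above can be sketched as a small planning function. This is an illustration under stated assumptions, not the operator's implementation: the helper names are hypothetical, and the 5-minute default mirrors the `metadataRefreshInterval` value shown in the chart configuration.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional

ACP_PORT = 2529


def acp_base_url(name: str, namespace: str) -> str:
    """Cluster-local HTTP base URL for a workspace's ACP service."""
    return f"http://{name}.{namespace}.svc.cluster.local:{ACP_PORT}"


def metadata_is_stale(last_metadata_at: Optional[datetime],
                      now: datetime,
                      refresh_interval: timedelta) -> bool:
    """Metadata counts as stale when it was never fetched or is too old."""
    if last_metadata_at is None:
        return True
    return now - last_metadata_at >= refresh_interval


def plan_discovery(name: str,
                   namespace: str,
                   last_metadata_at: Optional[datetime],
                   now: datetime,
                   refresh_interval: timedelta = timedelta(minutes=5)) -> List[str]:
    """Return the URLs to request this cycle: health always, metadata only when stale."""
    base = acp_base_url(name, namespace)
    urls = [f"{base}/healthz"]
    if metadata_is_stale(last_metadata_at, now, refresh_interval):
        urls.append(f"{base}/.well-known/spritz-acp")
    return urls
```

Keeping the decision separate from the HTTP calls makes the staleness rule easy to unit-test against a stored `lastMetadataAt` timestamp.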
@@ -88,8 +97,11 @@ acp:
enabled: true
port: 2529
path: /
healthPath: /healthz
metadataPath: /.well-known/spritz-acp
probeTimeout: 3s
refreshInterval: 30s
metadataRefreshInterval: 5m
networkPolicy:
enabled: false

@@ -117,10 +129,14 @@ Any workspace backend may be used as long as it:

OpenClaw is one example backend, not the protocol owner.

For the current OpenClaw preset, Spritz ships an image-owned compatibility bridge that exposes
WebSocket ACP on `2529` and forwards each connection into the image-owned
`spritz-openclaw-acp-wrapper` over stdio. This bridge is intentionally confined to the image so
the Spritz ACP control plane does not become OpenClaw-specific.
For the current OpenClaw preset, Spritz ships an image-owned ACP adapter that exposes:

- WebSocket ACP on `2529`
- `GET /healthz`
- `GET /.well-known/spritz-acp`

That adapter owns OpenClaw-specific session mapping and transcript replay, so the Spritz ACP
control plane stays backend-agnostic.

## UI behavior

@@ -137,7 +153,7 @@ On reconnect:
- the UI first asks Spritz API to bootstrap the selected conversation
- Spritz API loads the stored ACP session or explicitly repairs it if the
backend confirms that the session is missing
- the UI then connects through the ACP bridge using the confirmed conversation
- the UI then connects through the ACP gateway path using the confirmed conversation
binding

## Security model
@@ -165,4 +181,5 @@ A deployment is considered correct when all of these are true:

- `README.md`
- `OPENCLAW.md`
- `docs/2026-03-10-acp-adapter-and-runtime-target-architecture.md`
- `docs/2026-02-24-simplest-spritz-deployment-spec.md`
241 changes: 241 additions & 0 deletions docs/2026-03-10-acp-adapter-and-runtime-target-architecture.md
@@ -0,0 +1,241 @@
---
date: 2026-03-10
author: Onur <onur@textcortex.com>
title: ACP Adapter and Runtime Target Architecture
tags: [spritz, acp, adapter, runtime, architecture]
---

## Overview

This document defines the target runtime architecture for ACP-backed
workspaces in Spritz.

The goal is a control plane that stays simple:

- Spritz creates and manages workspaces
- Spritz routes authenticated ACP traffic to those workspaces
- the workspace exposes one stable ACP endpoint on port `2529`
- the backend behind that endpoint remains replaceable

Spritz must stay backend-agnostic. OpenClaw is one backend, not a special case
that shapes the control-plane architecture.

## Problem Statement

The current OpenClaw integration still couples ACP connection lifecycle to
backend process lifecycle too tightly.

That creates avoidable failure modes:

- short-lived ACP probes can trigger real runtime work
- disconnecting one ACP websocket can tear down backend state abruptly
- backend-specific behavior leaks into the control plane
- reconnect and replay behavior becomes harder to reason about

The target architecture separates those responsibilities cleanly.

## Desired Role Split

### Spritz

Spritz owns:

- workspace provisioning
- user authentication and authorization
- workspace discovery and metadata
- conversation records
- browser-facing ACP gatewaying

Spritz does not own:

- backend runtime internals
- backend transcript storage
- backend-specific session semantics

### ACP adapter

Each ACP-capable workspace should expose one long-lived ACP service on
port `2529`.

If the backend is not natively ACP, the workspace should run an ACP adapter.

The ACP adapter owns:

- ACP transport termination on `2529`
- HTTP health and metadata on the same port for control-plane use
- ACP session lifecycle
- deterministic mapping from ACP session id to backend runtime session
- transcript replay for `session/load`
- translation between backend-native events and ACP events
- graceful handling of upstream disconnects and shutdown

The ACP adapter must be the only backend-specific integration layer.
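
As a rough sketch of the HTTP side of that responsibility, the handler below serves `GET /healthz` and `GET /.well-known/spritz-acp` without touching any backend runtime. It is illustrative only: the response bodies are a hypothetical shape, since the metadata schema is not specified here, and the WebSocket ACP upgrade on `/` is deliberately omitted.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical metadata payload; the real response schema is an assumption.
METADATA = {"protocol": "acp", "transport": "websocket", "path": "/"}


class AdapterHTTPHandler(BaseHTTPRequestHandler):
    """Serves static health and metadata responses; starts no runtime sessions."""

    def do_GET(self):
        if self.path == "/healthz":
            self._send_json(200, {"status": "ok"})
        elif self.path == "/.well-known/spritz-acp":
            self._send_json(200, METADATA)
        else:
            self._send_json(404, {"error": "not found"})

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep routine probe traffic out of the logs.
        pass


def make_server(host="127.0.0.1", port=0):
    """Build the server; port 0 asks the OS for a free port (useful in tests)."""
    return ThreadingHTTPServer((host, port), AdapterHTTPHandler)
```

Because both handlers return constant data, repeated probes cost almost nothing and cannot create or tear down backend sessions.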

### Backend runtime

The backend runtime owns:

- actual execution
- actual transcript and session state
- tool execution
- backend-native storage and runtime semantics

The backend runtime should not be directly coupled to Spritz.

## Target Workspace Contract

Every ACP-capable workspace should provide exactly one stable internal ACP
contract:

- port: `2529`
- transport: WebSocket
- protocol: ACP JSON-RPC

That endpoint should be long-lived and safe for multiple clients over time.

Spritz should be able to assume:

- ACP `initialize` is safe and cheap
- `session/new` creates a backend session for a Spritz conversation
- `session/load` replays transcript from backend storage
- disconnecting one client does not corrupt or abruptly terminate the backend
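
To make the contract concrete, here is a minimal sketch of the JSON-RPC 2.0 envelopes a client would send over the WebSocket. Only the envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`, `error`) come from JSON-RPC 2.0 itself; the helper names and the `sessionId` parameter used in the usage note are assumptions, since this document does not spell out the parameter schema.

```python
import json
from typing import Any, Dict


def rpc_request(request_id: int, method: str, params: Dict[str, Any]) -> str:
    """Serialize one JSON-RPC 2.0 request frame."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params,
    })


class RpcError(Exception):
    """Raised when a response carries a JSON-RPC error object."""

    def __init__(self, code: int, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_response(raw: str) -> Any:
    """Return the `result` of a response frame, or raise RpcError on `error`."""
    frame = json.loads(raw)
    if "error" in frame:
        err = frame["error"]
        raise RpcError(err["code"], err["message"])
    return frame["result"]
```

For example, `rpc_request(2, "session/load", {"sessionId": "abc"})` would produce the frame that asks the adapter to replay a stored session, assuming the backend expects that parameter name.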

## Control-Plane Changes

### Operator

The operator should stop using real ACP runtime websocket sessions as the
normal liveness path.

The operator should use:

- Kubernetes readiness for basic health
- the adapter's lightweight HTTP health and metadata paths for ACP capability discovery

ACP metadata refresh should be infrequent and side-effect free. It must not create
or tear down backend runtime sessions as part of routine health checks.

### API

The Spritz API should remain the only browser-facing ACP gateway.

The path should stay:

`browser -> spritz-api -> workspace ACP endpoint`

The API should own:

- conversation bootstrap
- conversation to ACP session binding
- authenticated ACP proxying

The browser should not invent, repair, or replace ACP session bindings on its
own.

### UI

The UI should stay thin.

It should:

- select conversations by Spritz conversation id
- ask the API to bootstrap the conversation
- connect through the Spritz ACP gateway
- render replayed and live ACP updates

It should not:

- infer backend session ownership
- route by backend runtime ids
- maintain correctness-critical transcript state locally

## Adapter Requirements

The ACP adapter is the key runtime boundary.

It should be implemented as one long-lived process per workspace, not as a new
process spawned for every websocket connection.

Required behavior:

- accept many ACP client connections over time
- expose cheap HTTP health and metadata endpoints without starting real runtime sessions
- keep backend session mapping deterministic
- perform graceful shutdown of upstream resources
- isolate transport disconnects from transcript correctness
- support replay from backend transcript storage
- normalize backend errors into ACP errors instead of leaking raw HTML or
transport error pages into transcripts
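
The error-normalization requirement can be sketched as follows. This is one possible approach, not the adapter's actual logic: the function name, the HTML detection heuristic, the 200-character cap, and the `upstreamStatus` data field are assumptions. The `-32603` code is the standard JSON-RPC 2.0 "Internal error" code.

```python
from typing import Any, Dict

INTERNAL_ERROR = -32603  # JSON-RPC 2.0 "Internal error"


def normalize_backend_error(request_id: int,
                            status: int,
                            content_type: str,
                            body: str) -> Dict[str, Any]:
    """Turn a failed upstream HTTP response into a JSON-RPC error frame.

    HTML or empty bodies are replaced with a short generic message so that
    markup never ends up in a transcript.
    """
    text = body.strip()
    looks_like_html = (
        "html" in content_type.lower()
        or text.lower().startswith(("<!doctype", "<html"))
    )
    if looks_like_html or not text:
        message = f"backend request failed with HTTP {status}"
    else:
        # Keep only the first line of a plain-text error, capped in length.
        message = text.splitlines()[0][:200]
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": INTERNAL_ERROR,
            "message": message,
            "data": {"upstreamStatus": status},
        },
    }
```

A gateway error page such as a 502 HTML response would then surface to the client as a one-line error, with the upstream status preserved for debugging.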

For a backend like OpenClaw today, the adapter should talk to OpenClaw locally
over private pod-internal addresses only.

It should not route internal requests through public ingress, external CDN
hosts, or other edge paths.

## Transcript and Session Model

The stable model remains:

- `SpritzConversation.metadata.name` is the route and thread id
- `SpritzConversation.spec.sessionId` is the ACP session id
- the backend runtime session key stays internal to the adapter/backend
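
One way to keep the adapter-internal mapping deterministic is to derive the backend key from the ACP session id with a stable hash. The sketch below is an assumption for illustration: the `backend_session_key` helper, the `spritz-acp-` prefix, and the choice of SHA-256 are not taken from the actual adapter.

```python
import hashlib


def backend_session_key(workspace: str, acp_session_id: str) -> str:
    """Derive a stable backend session key from the workspace and ACP session id.

    The same inputs always produce the same key, so a reconnecting client
    lands on the same backend session without any extra lookup table.
    """
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    material = f"{workspace}\x00{acp_session_id}".encode("utf-8")
    digest = hashlib.sha256(material).hexdigest()
    return f"spritz-acp-{digest[:32]}"
```

Since the key is a pure function of its inputs, it never needs to leave the adapter, which keeps Spritz unaware of backend session naming.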

The source of truth for transcript history should be the backend session store.

The adapter must make `session/load` replay that history correctly without
depending on browser cache.

Browser-local transcript cache is allowed only as a rendering optimization.

## Cutover Rules

This architecture should be treated as a cutover, not a long-term dual-stack
migration.

Rules:

- keep one ACP runtime path
- do not keep parallel legacy and new runtime flows alive indefinitely
- remove per-connection child-process ACP runtime behavior once the adapter is
in place
- remove browser-side session repair logic once API bootstrap is authoritative
- keep one cache format for ACP thread rendering

The target is one clear path from browser to workspace ACP runtime.

## Implementation Direction

The current cutover in Spritz implements the first step of this architecture:

- the OpenClaw example image now runs one long-lived ACP server process on `2529`
- the operator now uses HTTP health and metadata instead of periodic ACP websocket probes
- workspace pod readiness and liveness use the ACP health endpoint

Remaining implementation work should focus on these changes:

1. Ensure all adapter-to-backend traffic stays pod-local or cluster-local.
2. Keep conversation bootstrap and binding ownership in the API.
3. Make transcript replay fully backend-driven through `session/load`.
4. Add soak tests that keep chat sessions open across repeated metadata refresh
intervals and verify no websocket reset churn or transcript corruption.

## Validation

The target architecture is considered correct when all of the following are
true:

- a workspace exposes one stable ACP endpoint on `2529`
- repeated readiness and metadata refresh cycles do not create runtime churn
- disconnecting one ACP client does not produce abnormal backend websocket
reset loops
- reopening a conversation restores transcript from backend replay
- Spritz UI, API, and operator remain backend-agnostic
- replacing OpenClaw with another ACP-compatible backend does not require
control-plane changes

## References

- `docs/2026-03-09-acp-port-and-agent-chat-architecture.md`
- `docs/2026-03-10-acp-conversation-storage-and-replay-model.md`
- `docs/2026-02-24-simplest-spritz-deployment-spec.md`