textcortex · onutc · Mar 10, 2026
diff --git a/OPENCLAW.md b/OPENCLAW.md
@@ -35,12 +35,16 @@ The OpenClaw example entrypoint does the following:
    - `gateway.port` from `OPENCLAW_GATEWAY_PORT` (default `8080`)
    - `gateway.bind` from `OPENCLAW_GATEWAY_BIND` (default `lan`)
 4. Ensures `OPENCLAW_GATEWAY_TOKEN` exists (uses provided token or generates one).
-5. Writes the gateway token to a local token file for bridge use.
-6. Starts an image-owned ACP compatibility bridge on `0.0.0.0:2529` unless `OPENCLAW_ACP_ENABLED=false`.
-7. When gateway auth mode is `trusted-proxy`, automatically trusts loopback for the internal bridge,
-   injects the required trusted-proxy headers on the bridge's upstream gateway hop, and rewrites the
+5. Writes the gateway token to a local token file for ACP adapter use.
+6. Starts an image-owned ACP adapter on `0.0.0.0:2529` unless `OPENCLAW_ACP_ENABLED=false`.
+7. Exposes:
+   - WebSocket ACP on `/`
+   - `GET /healthz`
+   - `GET /.well-known/spritz-acp`
+8. When gateway auth mode is `trusted-proxy`, automatically trusts loopback for the internal adapter,
+   injects the required trusted-proxy headers on the adapter's upstream gateway hop, and rewrites the
    upstream gateway `connect` handshake into a Control UI operator session without device identity.
-8. Auto-starts OpenClaw when command is default (`sleep infinity`), unless `OPENCLAW_AUTO_START=false`.
+9. Auto-starts OpenClaw when command is default (`sleep infinity`), unless `OPENCLAW_AUTO_START=false`.
 
 Key implication: direct `/w/{name}` access with `bind=lan` expects real gateway auth.
 
@@ -61,15 +65,16 @@ Spritz will then:
 
 This ACP path is separate from OpenClaw's dashboard and gateway UI.
 
-Today the example image satisfies that contract with a compatibility bridge:
+Today the example image satisfies that contract with a long-lived ACP adapter:
 
-- WebSocket server inside the image listens on `2529`
-- each ACP connection spawns `spritz-openclaw-acp-wrapper`
-- the wrapper talks to the local OpenClaw gateway over loopback WebSocket
-- if gateway auth mode is `trusted-proxy`, the bridge uses a loopback-only header injector so the
+- one long-lived Node ACP server inside the image listens on `2529`
+- the adapter talks to the local OpenClaw gateway over loopback WebSocket
+- ACP WebSocket clients connect to that long-lived adapter instead of spawning a fresh runtime
+- the adapter also exposes cheap HTTP health and metadata endpoints for Spritz operator discovery
+- if gateway auth mode is `trusted-proxy`, the adapter uses a loopback-only header injector so the
   internal ACP hop satisfies the same trusted-proxy contract as the browser route
-- in trusted-proxy mode, the wrapper impersonates a Control UI-style operator profile without
-  device identity so OpenClaw does not force pairing for the internal ACP bridge
+- in trusted-proxy mode, the adapter impersonates a Control UI-style operator profile without
+  device identity so OpenClaw does not force pairing for the internal ACP client
 
 This keeps the Spritz side backend-agnostic while OpenClaw remains free to add native socket ACP
 later.
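
The entrypoint defaults listed in the first `OPENCLAW.md` hunk can be sketched as a pure function. This is a minimal illustration, not the image's actual entrypoint: `resolve_entrypoint_config` and its return keys are hypothetical names, the token format (`secrets.token_hex`) is an assumption, and treating `false` case-insensitively is also an assumption.

```python
import secrets
from typing import Dict, List, Mapping

DEFAULT_COMMAND = ["sleep", "infinity"]


def _is_disabled(env: Mapping[str, str], name: str) -> bool:
    # Assumption: only the literal value "false" (any case) disables a feature.
    return env.get(name, "").strip().lower() == "false"


def resolve_entrypoint_config(env: Mapping[str, str], command: List[str]) -> Dict[str, object]:
    """Resolve gateway and ACP adapter settings from environment variables."""
    # Use the provided token, otherwise generate one (format is hypothetical).
    token = env.get("OPENCLAW_GATEWAY_TOKEN") or secrets.token_hex(32)
    return {
        "gateway_port": int(env.get("OPENCLAW_GATEWAY_PORT", "8080")),
        "gateway_bind": env.get("OPENCLAW_GATEWAY_BIND", "lan"),
        "gateway_token": token,
        # The adapter listens on 0.0.0.0:2529 unless explicitly disabled.
        "acp_enabled": not _is_disabled(env, "OPENCLAW_ACP_ENABLED"),
        "acp_listen": ("0.0.0.0", 2529),
        # Auto-start applies only when the container command is the default.
        "auto_start": command == DEFAULT_COMMAND
        and not _is_disabled(env, "OPENCLAW_AUTO_START"),
    }
```

Calling `resolve_entrypoint_config({}, ["sleep", "infinity"])` yields port `8080`, bind `lan`, a generated token, and both the adapter and auto-start enabled.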

diff --git a/README.md b/README.md
@@ -174,7 +174,7 @@ Spritz is intended to remain portable and standalone:
 - `api/`: authenticated API and ACP gateway
 - `ui/`: built-in web UI
 - `helm/spritz/`: standalone deployment chart
-- `images/examples/openclaw/`: OpenClaw example runtime and ACP bridge
+- `images/examples/openclaw/`: OpenClaw example runtime and ACP adapter
 - `docs/`: architecture and deployment documents
 
 ## Key docs

diff --git a/crd/generated/spritz.sh_spritzes.yaml b/crd/generated/spritz.sh_spritzes.yaml
@@ -580,6 +580,9 @@ spec:
                     type: object
                   lastError:
                     type: string
+                  lastMetadataAt:
+                    format: date-time
+                    type: string
                   lastProbeAt:
                     format: date-time
                     type: string

diff --git a/crd/spritz.sh_spritzes.yaml b/crd/spritz.sh_spritzes.yaml
@@ -580,6 +580,9 @@ spec:
                     type: object
                   lastError:
                     type: string
+                  lastMetadataAt:
+                    format: date-time
+                    type: string
                   lastProbeAt:
                     format: date-time
                     type: string

diff --git a/docs/2026-03-09-acp-port-and-agent-chat-architecture.md b/docs/2026-03-09-acp-port-and-agent-chat-architecture.md
@@ -30,15 +30,24 @@ Spritz reserves one internal service/container port for ACP:
 
 If a workspace process listens there and answers ACP `initialize`, Spritz treats it as an ACP agent.
 
+For the current Spritz ACP runtime contract, the same service should also expose:
+
+- `GET /healthz`
+- `GET /.well-known/spritz-acp`
+
+Those HTTP endpoints exist for control-plane health and metadata discovery. Browser and API ACP
+traffic still uses the WebSocket endpoint.
+
 ### Discovery ownership
 
 The operator owns ACP discovery.
 
 When a workspace deployment is ready, the operator:
 
-1. connects to `ws://<spritz>.<namespace>.svc.cluster.local:2529/`
-2. sends ACP `initialize`
-3. normalizes the response into `status.acp`
+1. checks `http://<spritz>.<namespace>.svc.cluster.local:2529/healthz`
+2. fetches `http://<spritz>.<namespace>.svc.cluster.local:2529/.well-known/spritz-acp` when ACP
+   metadata is missing or stale
+3. normalizes that response into `status.acp`
 4. sets the `ACPReady` condition on the `Spritz` resource
 
 The API does not probe ACP during user requests.
@@ -88,8 +97,11 @@ acp:
   enabled: true
   port: 2529
   path: /
+  healthPath: /healthz
+  metadataPath: /.well-known/spritz-acp
   probeTimeout: 3s
   refreshInterval: 30s
+  metadataRefreshInterval: 5m
   networkPolicy:
     enabled: false
 
@@ -117,10 +129,14 @@ Any workspace backend may be used as long as it:
 
 OpenClaw is one example backend, not the protocol owner.
 
-For the current OpenClaw preset, Spritz ships an image-owned compatibility bridge that exposes
-WebSocket ACP on `2529` and forwards each connection into the image-owned
-`spritz-openclaw-acp-wrapper` over stdio. This bridge is intentionally confined to the image so
-the Spritz ACP control plane does not become OpenClaw-specific.
+For the current OpenClaw preset, Spritz ships an image-owned ACP adapter that exposes:
+
+- WebSocket ACP on `2529`
+- `GET /healthz`
+- `GET /.well-known/spritz-acp`
+
+That adapter owns OpenClaw-specific session mapping and transcript replay, so the Spritz ACP
+control plane stays backend-agnostic.
 
 ## UI behavior
 
@@ -137,7 +153,7 @@ On reconnect:
 - the UI first asks Spritz API to bootstrap the selected conversation
 - Spritz API loads the stored ACP session or explicitly repairs it if the
   backend confirms that the session is missing
-- the UI then connects through the ACP bridge using the confirmed conversation
+- the UI then connects through the ACP gateway path using the confirmed conversation
   binding
 
 ## Security model
@@ -165,4 +181,5 @@ A deployment is considered correct when all of these are true:
 
 - `README.md`
 - `OPENCLAW.md`
+- `docs/2026-03-10-acp-adapter-and-runtime-target-architecture.md`
 - `docs/2026-02-24-simplest-spritz-deployment-spec.md`
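
The discovery steps in this file (health check first, metadata fetch only when missing or stale) reduce to a small decision. The sketch below is illustrative and is not the operator's code: `should_fetch_metadata` is a hypothetical name, the 5-minute interval mirrors `metadataRefreshInterval: 5m`, `last_metadata_at` stands in for `status.acp.lastMetadataAt`, and skipping the fetch while the adapter is unhealthy is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Mirrors `metadataRefreshInterval: 5m` from the example values.
METADATA_REFRESH_INTERVAL = timedelta(minutes=5)


def should_fetch_metadata(
    healthy: bool,
    last_metadata_at: Optional[datetime],
    now: datetime,
    interval: timedelta = METADATA_REFRESH_INTERVAL,
) -> bool:
    """Decide whether to GET /.well-known/spritz-acp on this reconcile."""
    if not healthy:
        # Assumption: do not fetch metadata from an adapter failing /healthz.
        return False
    if last_metadata_at is None:
        # Metadata is missing, so it has to be fetched.
        return True
    # Metadata is stale once the refresh interval has elapsed.
    return now - last_metadata_at >= interval


if __name__ == "__main__":
    now = datetime(2026, 3, 10, 12, 0, tzinfo=timezone.utc)
    print(should_fetch_metadata(True, now - timedelta(minutes=6), now))
```

Keeping the decision separate from the HTTP call makes it easy to test that routine health checks never trigger a metadata fetch more often than the configured interval.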
diff --git a/docs/2026-03-10-acp-adapter-and-runtime-target-architecture.md b/docs/2026-03-10-acp-adapter-and-runtime-target-architecture.md
@@ -0,0 +1,241 @@
+---
+date: 2026-03-10
+author: Onur <onur@textcortex.com>
+title: ACP Adapter and Runtime Target Architecture
+tags: [spritz, acp, adapter, runtime, architecture]
+---
+
+## Overview
+
+This document defines the target runtime architecture for ACP-backed
+workspaces in Spritz.
+
+The goal is a control plane that stays simple:
+
+- Spritz creates and manages workspaces
+- Spritz routes authenticated ACP traffic to those workspaces
+- the workspace exposes one stable ACP endpoint on port `2529`
+- the backend behind that endpoint remains replaceable
+
+Spritz must stay backend-agnostic. OpenClaw is one backend, not a special case
+that shapes the control-plane architecture.
+
+## Problem Statement
+
+The current OpenClaw integration still couples ACP connection lifecycle to
+backend process lifecycle too tightly.
+
+That creates avoidable failure modes:
+
+- short-lived ACP probes can trigger real runtime work
+- disconnecting one ACP websocket can tear down backend state abruptly
+- backend-specific behavior leaks into the control plane
+- reconnect and replay behavior becomes harder to reason about
+
+The target architecture separates those responsibilities cleanly.
+
+## Desired Role Split
+
+### Spritz
+
+Spritz owns:
+
+- workspace provisioning
+- user authentication and authorization
+- workspace discovery and metadata
+- conversation records
+- browser-facing ACP gatewaying
+
+Spritz does not own:
+
+- backend runtime internals
+- backend transcript storage
+- backend-specific session semantics
+
+### ACP adapter
+
+Each ACP-capable workspace should expose one long-lived ACP service on
+port `2529`.
+
+If the backend is not natively ACP, the workspace should run an ACP adapter.
+
+The ACP adapter owns:
+
+- ACP transport termination on `2529`
+- HTTP health and metadata on the same port for control-plane use
+- ACP session lifecycle
+- deterministic mapping from ACP session id to backend runtime session
+- transcript replay for `session/load`
+- translation between backend-native events and ACP events
+- graceful handling of upstream disconnects and shutdown
+
+The ACP adapter must be the only backend-specific integration layer.
+
+### Backend runtime
+
+The backend runtime owns:
+
+- actual execution
+- actual transcript and session state
+- tool execution
+- backend-native storage and runtime semantics
+
+The backend runtime should not be directly coupled to Spritz.
+
+## Target Workspace Contract
+
+Every ACP-capable workspace should provide exactly one stable internal ACP
+contract:
+
+- port: `2529`
+- transport: WebSocket
+- protocol: ACP JSON-RPC
+
+That endpoint should be long-lived and safe for multiple clients over time.
+
+Spritz should be able to assume:
+
+- ACP `initialize` is safe and cheap
+- `session/new` creates a backend session for a Spritz conversation
+- `session/load` replays transcript from backend storage
+- disconnecting one client does not corrupt or abruptly terminate the backend
+
+## Control-Plane Changes
+
+### Operator
+
+The operator should stop using real ACP runtime websocket sessions as the
+normal liveness path.
+
+The operator should use:
+
+- Kubernetes readiness for basic health
+- the adapter's lightweight HTTP health and metadata paths for ACP capability discovery
+
+ACP metadata refresh should be infrequent and side-effect free. It must not create
+or tear down backend runtime sessions as part of routine health checks.
+
+### API
+
+The Spritz API should remain the only browser-facing ACP gateway.
+
+The path should stay:
+
+`browser -> spritz-api -> workspace ACP endpoint`
+
+The API should own:
+
+- conversation bootstrap
+- conversation-to-ACP-session binding
+- authenticated ACP proxying
+
+The browser should not invent, repair, or replace ACP session bindings on its
+own.
+
+### UI
+
+The UI should stay thin.
+
+It should:
+
+- select conversations by Spritz conversation id
+- ask the API to bootstrap the conversation
+- connect through the Spritz ACP gateway
+- render replayed and live ACP updates
+
+It should not:
+
+- infer backend session ownership
+- route by backend runtime ids
+- maintain correctness-critical transcript state locally
+
+## Adapter Requirements
+
+The ACP adapter is the key runtime boundary.
+
+It should be implemented as one long-lived process per workspace, not as a new
+process spawned for every websocket connection.
+
+Required behavior:
+
+- accept many ACP client connections over time
+- expose cheap HTTP health and metadata endpoints without starting real runtime sessions
+- keep backend session mapping deterministic
+- perform graceful shutdown of upstream resources
+- isolate transport disconnects from transcript correctness
+- support replay from backend transcript storage
+- normalize backend errors into ACP errors instead of leaking raw HTML or
+  transport error pages into transcripts
+
+For a backend like OpenClaw today, the adapter should talk to OpenClaw locally
+over private pod-internal addresses only.
+
+It should not route internal requests through public ingress, external CDN
+hosts, or other edge paths.
+
+## Transcript and Session Model
+
+The stable model remains:
+
+- `SpritzConversation.metadata.name` is the route and thread id
+- `SpritzConversation.spec.sessionId` is the ACP session id
+- the backend runtime session key stays internal to the adapter/backend
+
+The source of truth for transcript history should be the backend session store.
+
+The adapter must make `session/load` replay that history correctly without
+depending on browser cache.
+
+Browser-local transcript cache is allowed only as a rendering optimization.
+
+## Cutover Rules
+
+This architecture should be treated as a cutover, not a long-term dual-stack
+migration.
+
+Rules:
+
+- keep one ACP runtime path
+- do not keep parallel legacy and new runtime flows alive indefinitely
+- remove per-connection child-process ACP runtime behavior once the adapter is
+  in place
+- remove browser-side session repair logic once API bootstrap is authoritative
+- keep one cache format for ACP thread rendering
+
+The target is one clear path from browser to workspace ACP runtime.
+
+## Implementation Direction
+
+The current cutover in Spritz implements the first step of this architecture:
+
+- the OpenClaw example image now runs one long-lived ACP server process on `2529`
+- the operator now uses HTTP health and metadata instead of periodic ACP websocket probes
+- workspace pod readiness and liveness probes use the ACP health endpoint
+
+Remaining implementation work should focus on these changes:
+
+1. Ensure all adapter-to-backend traffic stays pod-local or cluster-local.
+2. Keep conversation bootstrap and binding ownership in the API.
+3. Make transcript replay fully backend-driven through `session/load`.
+4. Add soak tests that keep chat sessions open across repeated metadata refresh
+   intervals and verify no websocket reset churn or transcript corruption.
+
+## Validation
+
+The target architecture is considered correct when all of the following are
+true:
+
+- a workspace exposes one stable ACP endpoint on `2529`
+- repeated readiness and metadata refresh cycles do not create runtime churn
+- disconnecting one ACP client does not produce abnormal backend websocket
+  reset loops
+- reopening a conversation restores transcript from backend replay
+- Spritz UI, API, and operator remain backend-agnostic
+- replacing OpenClaw with another ACP-compatible backend does not require
+  control-plane changes
+
+## References
+
+- `docs/2026-03-09-acp-port-and-agent-chat-architecture.md`
+- `docs/2026-03-10-acp-conversation-storage-and-replay-model.md`
+- `docs/2026-02-24-simplest-spritz-deployment-spec.md`
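
The adapter requirement to normalize backend errors instead of leaking raw HTML into transcripts could look roughly like this. It is a sketch under assumptions, not the adapter's implementation: `normalize_backend_error`, `looks_like_html`, and the `upstreamStatus` field are hypothetical, and the HTML detection is a deliberately simple heuristic. The only fixed fact used is that `-32603` is the JSON-RPC 2.0 "Internal error" code.

```python
import json
from typing import Any, Dict, Union

JSONRPC_INTERNAL_ERROR = -32603  # "Internal error" in JSON-RPC 2.0


def looks_like_html(text: str) -> bool:
    """Heuristic check for an HTML error page in an upstream response body."""
    head = text.lstrip()[:256].lower()
    return head.startswith("<!doctype html") or head.startswith("<html") or "<body" in head


def normalize_backend_error(
    request_id: Union[int, str, None], status: int, body: str
) -> Dict[str, Any]:
    """Turn an upstream failure into a JSON-RPC error object."""
    if looks_like_html(body):
        # Never forward markup; replace it with a short description.
        message = "backend returned a non-ACP error page (HTTP %d)" % status
    else:
        # Plain-text bodies are trimmed and length-limited.
        message = body.strip()[:200] or "backend error (HTTP %d)" % status
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": JSONRPC_INTERNAL_ERROR,
            "message": message,
            "data": {"upstreamStatus": status},  # hypothetical field
        },
    }


if __name__ == "__main__":
    page = "<html><body><h1>502 Bad Gateway</h1></body></html>"
    print(json.dumps(normalize_backend_error(7, 502, page)))
```

The client then sees a structured error tied to its request id, and the transcript never contains proxy or ingress markup.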