feat(gateway): W3 scaffold for /v1/llm_turn cloud-brain endpoint #24

Merged: LightDriverCS merged 4 commits into main from feat/cloud-brain-w3-llm-turn on May 6, 2026
@LightDriverCS

Summary

Phase 1B W3 of BenchAGI ADR-0002 (Cloud Brain via Relay). Adds the /v1/llm_turn endpoint: the target to which the BenchAGI cloud orchestrator (W2, BenchAGI #878) dispatches LlmTurnDirectives via the relay.

Persona-off-disk invariant: system_prompt arrives inline in the request body — this endpoint MUST NOT write the persona to disk. The legacy /v1/chat path reads ${workspace}/SOUL.md; this endpoint deliberately does not.

Scope (this PR — scaffold)

  • New file src/gateway/llm-turn-http.ts (~370 lines)
    • Request/response types (camelCase TS, snake_case wire format)
    • validateLlmTurnRequest pure function with size caps + required-field enforcement
    • handleLlmTurnHttpRequest HTTP handler (returns Promise<boolean> matching the existing dispatch contract)
    • isLlmTurnPath matcher
  • Route wiring in src/gateway/server-http.ts (3 small edits)
  • 14 unit tests in src/gateway/llm-turn-http.validate.test.ts
  • Out-of-scope lint fix in extensions/claude-code-bridge/serve.mjs:35-39 (oxlint curly-rule on a pre-existing single-line return — repo-wide pre-commit hook blocked this commit otherwise)
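
For illustration, a path matcher like isLlmTurnPath could be as small as the sketch below. This is an assumption about its shape, not the code in src/gateway/llm-turn-http.ts: the constant name, the query-string handling, and the trailing-slash tolerance are all hypothetical.

```typescript
// Hypothetical sketch; the real isLlmTurnPath may differ.
const LLM_TURN_PATH = "/v1/llm_turn";

function isLlmTurnPath(rawUrl: string | undefined): boolean {
  if (!rawUrl) {
    return false;
  }
  // Drop any query string or fragment before comparing.
  const pathname = rawUrl.split(/[?#]/, 1)[0];
  // Tolerating a single trailing slash is an assumption of this sketch.
  return pathname === LLM_TURN_PATH || pathname === `${LLM_TURN_PATH}/`;
}
```

A handler following the Promise<boolean> dispatch contract would presumably resolve to false when this matcher rejects the URL, so that the next route gets a chance to handle the request.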

NOT YET in scope (CALL_ANTHROPIC_TODO — follow-up PR)

The handler validates the request body and returns 501 llm_turn_not_implemented. The follow-up PR will:

  1. Resolve agent's local Anthropic auth profile via existing agents/auth-profiles machinery
  2. Translate request → SDK params (messages, system_prompt, tools, thinking_level)
  3. Apply existing anthropic-payload-policy.ts cache-control machinery
  4. Optional write-ahead idempotency record keyed by idempotency_key
  5. Call Anthropic SDK
  6. Translate response (camelCase → snake_case wire format)
  7. Return 200

This scaffold lets the BenchAGI relay claim path land and exercises the route registration end-to-end without burning customer tokens. When the relay POSTs to /v1/llm_turn, it gets a clean 501 and writes error.code: 'llm_turn_not_implemented' to the directive doc, which the cloud orchestrator surfaces to webchat. Once the Anthropic call lands, the same flow returns a real response.

Wire format

Request:

{
  "agent_id": "aurelius",
  "messages": [...],
  "system_prompt": "...",
  "tools": [...],
  "model": "claude-opus-4-7",
  "thinking_level": "high",
  "max_tokens": 8192,
  "cache_control": { "system": "ephemeral" },
  "idempotency_key": "..."
}
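
The snake_case wire format maps onto camelCase TypeScript types. A minimal sketch of that translation follows; the interface fields mirror the request example above, but which fields are optional, the LlmTurnRequest name, the fromWire helper, and the cacheSystemPrompt flag are assumptions made for illustration.

```typescript
// Wire shape (snake_case), mirroring the request example.
// Optionality of each field is an assumption of this sketch.
interface LlmTurnWireRequest {
  agent_id: string;
  messages: unknown[];
  system_prompt: string;
  tools?: unknown[];
  model: string;
  thinking_level?: string;
  max_tokens: number;
  cache_control?: { system?: string };
  idempotency_key?: string;
}

// Internal shape (camelCase); the name is hypothetical.
interface LlmTurnRequest {
  agentId: string;
  messages: unknown[];
  systemPrompt: string;
  tools: unknown[];
  model: string;
  thinkingLevel: string | undefined;
  maxTokens: number;
  cacheSystemPrompt: boolean;
  idempotencyKey: string | undefined;
}

// Hypothetical helper: translate an already-validated wire request.
function fromWire(wire: LlmTurnWireRequest): LlmTurnRequest {
  return {
    agentId: wire.agent_id,
    messages: wire.messages,
    systemPrompt: wire.system_prompt, // stays in memory, never written to disk
    tools: wire.tools ?? [],
    model: wire.model,
    thinkingLevel: wire.thinking_level,
    maxTokens: wire.max_tokens,
    cacheSystemPrompt: wire.cache_control?.system === "ephemeral",
    idempotencyKey: wire.idempotency_key,
  };
}
```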

Response (when Anthropic call lands):

{
  "content": [...],
  "stop_reason": "end_turn" | "tool_use" | ...,
  "model": "claude-opus-4-7-20260301",
  "usage": { "input_tokens": ..., "output_tokens": ... },
  "used_auth_profile": "anthropic-default"
}

Test plan

  • 14 unit tests on validateLlmTurnRequest (pnpm test src/gateway/llm-turn-http.validate.test.ts)
  • tsc --noEmit clean (incl. tsgo pre-commit check)
  • oxlint clean across the repo
  • Codex Stage B Gate W3 (running)
  • Integration test (full HTTP path with auth + rate limit) — DEFERRED to the Anthropic-call follow-up PR
  • Smoke test against W2's dispatch path (will exercise route end-to-end with 501 response)

Companion PRs in BenchAGI

🤖 Generated with Claude Code

Cory Shelton and others added 4 commits May 5, 2026 21:44
Phase 1B W3 of BenchAGI ADR-0002. Route registration + body validation +
auth check for the cloud-brain dispatch target. Anthropic call is TODO
(returns 501 so relay claim path can land first).

Also: minor lint-fix in extensions/claude-code-bridge/serve.mjs to
unblock the repo-wide pre-commit hook (oxlint curly-rule on a pre-
existing single-line return). Out of W3 scope but required for commit.

Spec: ~/.openclaw/wiki/main/_boards/specs/phase-1a-design-gate-2026-05-05-v2.md §7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two small Codex W3 findings:

#1 — Null-body guard. A JSON body of `null` (or a primitive/array)
would skip the LlmTurnWireRequest cast and throw on field access,
producing a 500 from the dispatch error handler instead of a clean
400. Added top-level object guard at the start of validateLlmTurnRequest
that returns invalid_field('<root>', 'request body must be a JSON object').
Renamed local var `body` → `wire` for the rest of the function so
TypeScript sees the narrowed type.

#2 — Fractional max_tokens. Original `Number.isFinite(x) && x > 0`
let `max_tokens: 0.5` pass and floor to a 0-token budget. Now require
`Number.isInteger(x) && x > 0`. Reason text updated to 'must be a positive integer'.

Tests: 14 → 16 (added fractional + non-object body cases).
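
The two guards can be sketched as below. This is a reduced illustration, not the actual validateLlmTurnRequest: the result type, the invalidField helper, and the choice to treat max_tokens as required are assumptions; the field names and reason strings come from the commit message above.

```typescript
// Hypothetical result shape for this sketch.
type ValidationResult =
  | { ok: true; wire: Record<string, unknown> }
  | { ok: false; code: "invalid_field"; field: string; reason: string };

function invalidField(field: string, reason: string): ValidationResult {
  return { ok: false, code: "invalid_field", field, reason };
}

function validateLlmTurnRequestSketch(body: unknown): ValidationResult {
  // Guard 1: reject null, primitives, and arrays before any field access.
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return invalidField("<root>", "request body must be a JSON object");
  }
  // After the guard, a narrowed local makes field access safe.
  const wire = body as Record<string, unknown>;

  // Guard 2: Number.isInteger rejects 0.5, NaN, and Infinity in one check.
  const maxTokens = wire["max_tokens"];
  if (typeof maxTokens !== "number" || !Number.isInteger(maxTokens) || maxTokens <= 0) {
    return invalidField("max_tokens", "must be a positive integer");
  }
  return { ok: true, wire };
}
```

Because the function is pure, the fractional and non-object cases can be unit-tested without an HTTP server.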

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the 501 stub with an actual Anthropic SDK call. Resolves the
credential from process.env.ANTHROPIC_API_KEY, distinguishing OAuth
tokens (sk-ant-oat*) from API keys, and surfaces the resolved profile
name back as 'used_auth_profile' so the cloud orchestrator (W2) can
decide between OAuth-mode (no aiUsageRecords) and API/coin-mode
(write aiUsageRecords) per spec §6.

NEW IN llm-turn-http.ts:

- callAnthropicForLlmTurn(request) — translates camelCase request to
  Anthropic SDK params. Handles cache_control on system prompt
  (ephemeral), thinking_level → thinking config with conservative
  budget_tokens (low=4096, medium=8192, high=16384, xhigh=32768).
  Calls messages.create, returns wire-format response (snake_case).

- resolveAnthropicCredential() — env-based credential resolution.
  Reports 'env-api-key' or 'env-oauth-token' as the profile name.

- LlmTurnWireResponse type — explicit snake_case shape per spec §7
  line 741.
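
As a sketch of the cache_control handling: the Anthropic Messages API accepts the system prompt as an array of text blocks, and a block can carry cache_control of type "ephemeral". The helper name and its boolean parameter below are hypothetical, not taken from callAnthropicForLlmTurn.

```typescript
// Shape of one system text block as accepted by the Messages API.
interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical helper: build the `system` param from the inline prompt.
function buildSystemParam(systemPrompt: string, cacheSystem: boolean): SystemTextBlock[] {
  const block: SystemTextBlock = { type: "text", text: systemPrompt };
  if (cacheSystem) {
    // Mark the prompt as cacheable; the persona itself is only held in memory.
    block.cache_control = { type: "ephemeral" };
  }
  return [block];
}
```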

ERROR PATHS:
- 'no_anthropic_credential' (500) — env not set
- 'anthropic_auth_failed' (401/403) — credential rejected
- 'anthropic_call_failed' (502 default) — SDK error

EXPLICIT NEXT-ITERATION work captured inline:

- Real auth-profile-machinery integration (loads the agent's
  configured profile via OpenClaw's auth-profile store, handles OAuth
  refresh, etc.). The env approach gets Cory's local OpenClaw to a
  working smoke test path; production-grade auth-profile resolution
  is the follow-up.

- Idempotency-key write-ahead store (lease-recovery semantics per
  spec §7 line 851).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

P1 #1 — thinking.budget_tokens < max_tokens constraint.

Anthropic rejects requests where thinking.budget_tokens >= max_tokens;
my budget map (high=16384, xhigh=32768) exceeded typical max_tokens=8192
which would 400 immediately. Fix: cap budget to (max_tokens - 1024)
reserving room for actual output. If the cap drops below Anthropic's
minimum thinking budget (1024), disable thinking entirely rather than
send an invalid request.
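
The capping rule works out as follows. The budget map, the 1024-token reserve, and the 1024-token minimum come from the commit messages; the function name and the choice to signal "thinking disabled" by returning undefined are assumptions of this sketch.

```typescript
type ThinkingLevel = "low" | "medium" | "high" | "xhigh";

const THINKING_BUDGETS: Record<ThinkingLevel, number> = {
  low: 4096,
  medium: 8192,
  high: 16384,
  xhigh: 32768,
};
const OUTPUT_RESERVE = 1024; // tokens kept free for the visible answer
const MIN_THINKING_BUDGET = 1024; // Anthropic's minimum thinking budget

interface ThinkingConfig {
  type: "enabled";
  budget_tokens: number;
}

// Hypothetical helper: undefined means "omit the thinking param".
function resolveThinking(
  level: ThinkingLevel | undefined,
  maxTokens: number,
): ThinkingConfig | undefined {
  if (!level) {
    return undefined;
  }
  // Cap so that budget_tokens stays strictly below max_tokens.
  const capped = Math.min(THINKING_BUDGETS[level], maxTokens - OUTPUT_RESERVE);
  if (capped < MIN_THINKING_BUDGET) {
    // Too little room left: disable thinking rather than send an invalid request.
    return undefined;
  }
  return { type: "enabled", budget_tokens: capped };
}
```

With max_tokens = 8192, "high" is capped to 8192 - 1024 = 7168 while "low" keeps its 4096. With max_tokens = 2000 the cap would be 976, which is below the minimum, so thinking is dropped.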

P1 #2 — ANTHROPIC_AUTH_TOKEN env var.

Anthropic SDK reads ANTHROPIC_AUTH_TOKEN for OAuth tokens and
ANTHROPIC_API_KEY for API keys. My resolver only checked
ANTHROPIC_API_KEY, so OAuth-only setups got 'no_anthropic_credential'.
Fix: read both, with ANTHROPIC_AUTH_TOKEN taking precedence (matches
SDK behavior). API_KEY-shaped OAuth tokens (sk-ant-oat*) still detected
and routed via authToken.
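
A sketch of the resolution order described above. Unlike the resolveAnthropicCredential() in the commit, this hypothetical version takes the environment as a parameter so it can be tested as a pure function; the result type and the trimming of values are also assumptions.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical result shape; profile names come from the earlier commit.
type Credential =
  | { kind: "authToken"; authToken: string; profile: "env-oauth-token" }
  | { kind: "apiKey"; apiKey: string; profile: "env-api-key" };

function resolveCredentialFromEnv(env: Env): Credential | undefined {
  // ANTHROPIC_AUTH_TOKEN takes precedence.
  const authToken = env["ANTHROPIC_AUTH_TOKEN"]?.trim();
  if (authToken) {
    return { kind: "authToken", authToken, profile: "env-oauth-token" };
  }
  const apiKey = env["ANTHROPIC_API_KEY"]?.trim();
  if (!apiKey) {
    return undefined; // caller answers with 'no_anthropic_credential'
  }
  // OAuth tokens stored in the API-key variable are still routed as auth tokens.
  if (apiKey.startsWith("sk-ant-oat")) {
    return { kind: "authToken", authToken: apiKey, profile: "env-oauth-token" };
  }
  return { kind: "apiKey", apiKey, profile: "env-api-key" };
}
```

In a handler this would presumably be called with process.env, and the resulting profile string surfaced as used_auth_profile.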

P2 (deferred) — `as never` casts. Codex flagged that imports of SDK
param types would catch shape bugs (e.g. Tool.InputSchema requires
`type: "object"`). Validation layer already enforces shape, but
upgrading to proper SDK types is a follow-up; it is not landing in this
commit, to keep the patch focused.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@LightDriverCS marked this pull request as ready for review on May 6, 2026 21:07
@LightDriverCS merged commit e514646 into main on May 6, 2026
31 of 44 checks passed