feat(gateway): W3 scaffold for /v1/llm_turn cloud-brain endpoint #24
Merged
Phase 1B W3 of BenchAGI ADR-0002. Route registration + body validation + auth check for the cloud-brain dispatch target. Anthropic call is TODO (returns 501 so relay claim path can land first).

Also: minor lint-fix in extensions/claude-code-bridge/serve.mjs to unblock the repo-wide pre-commit hook (oxlint curly-rule on a pre-existing single-line return). Out of W3 scope but required for commit.

Spec: ~/.openclaw/wiki/main/_boards/specs/phase-1a-design-gate-2026-05-05-v2.md §7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small Codex W3 findings:

#1: Null-body guard. A JSON body of `null` (or a primitive/array) would skip the LlmTurnWireRequest cast and throw on field access, producing a 500 from the dispatch error handler instead of a clean 400. Added top-level object guard at the start of validateLlmTurnRequest that returns invalid_field('<root>', 'request body must be a JSON object'). Renamed local var `body` → `wire` for the rest of the function so TypeScript sees the narrowed type.

#2: Fractional max_tokens. Original `Number.isFinite(x) && x > 0` let `max_tokens: 0.5` pass and floor to a 0-token budget. Now require `Number.isInteger(x) && x > 0`. Reason text updated to 'must be a positive integer'.

Tests: 14 → 16 (added fractional + non-object body cases).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
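The two guards described in this commit can be sketched as follows. This is a minimal illustration, not the PR's actual code: `ValidationResult`, `invalidField`, and `validateBodySketch` are hypothetical names, the result shape is assumed, and the sketch treats `max_tokens` as required, which the commit message does not state.

```typescript
// Hypothetical result type; the real return shape of validateLlmTurnRequest is not shown in this PR text.
type ValidationResult =
  | { ok: true; maxTokens: number }
  | { ok: false; error: "invalid_field"; field: string; reason: string };

// Hypothetical stand-in for the invalid_field(...) helper named in the commit message.
function invalidField(field: string, reason: string): ValidationResult {
  return { ok: false, error: "invalid_field", field, reason };
}

function validateBodySketch(body: unknown): ValidationResult {
  // Guard 1: reject null, primitives, and arrays before any field access,
  // so a bad body yields a clean 400 instead of a thrown TypeError.
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return invalidField("<root>", "request body must be a JSON object");
  }
  // From here on the value is a non-null, non-array object.
  const wire = body as Record<string, unknown>;

  // Guard 2: Number.isInteger rejects 0.5, NaN, and Infinity in one check.
  // (Assumption: max_tokens is required in this sketch.)
  const maxTokens = wire.max_tokens;
  if (typeof maxTokens !== "number" || !Number.isInteger(maxTokens) || maxTokens <= 0) {
    return invalidField("max_tokens", "must be a positive integer");
  }
  return { ok: true, maxTokens };
}
```

With these guards, `validateBodySketch(null)` and `validateBodySketch([])` both fail on `<root>`, and `{ max_tokens: 0.5 }` fails on `max_tokens` rather than flooring to zero later.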
Replaces the 501 stub with an actual Anthropic SDK call. Resolves the credential from process.env.ANTHROPIC_API_KEY, distinguishing OAuth tokens (sk-ant-oat*) from API keys, and surfaces the resolved profile name back as 'used_auth_profile' so the cloud orchestrator (W2) can decide between OAuth-mode (no aiUsageRecords) and API/coin-mode (write aiUsageRecords) per spec §6.

NEW IN llm-turn-http.ts:
- callAnthropicForLlmTurn(request): translates camelCase request to Anthropic SDK params. Handles cache_control on system prompt (ephemeral), thinking_level → thinking config with conservative budget_tokens (low=4096, medium=8192, high=16384, xhigh=32768). Calls messages.create, returns wire-format response (snake_case).
- resolveAnthropicCredential(): env-based credential resolution. Reports 'env-api-key' or 'env-oauth-token' as the profile name.
- LlmTurnWireResponse type: explicit snake_case shape per spec §7 line 741.

ERROR PATHS:
- 'no_anthropic_credential' (500): env not set
- 'anthropic_auth_failed' (401/403): credential rejected
- 'anthropic_call_failed' (502 default): SDK error

EXPLICIT NEXT-ITERATION work captured inline:
- Real auth-profile-machinery integration (loads the agent's configured profile via OpenClaw's auth-profile store, handles OAuth refresh, etc.). The env approach gets Cory's local OpenClaw to a working smoke test path; production-grade auth-profile resolution is the follow-up.
- Idempotency-key write-ahead store (lease-recovery semantics per spec §7 line 851).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
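The env-based credential resolution described in this commit might look roughly like the sketch below. It is an illustration under assumptions: `ResolvedCredential` and `resolveCredentialSketch` are hypothetical names, the return shape is invented for the example, and the environment is passed in as a parameter so the logic can be exercised without touching `process.env`. Only the `sk-ant-oat` prefix check and the two profile names come from the commit message.

```typescript
// Hypothetical return shape; the real resolveAnthropicCredential() signature is not shown in this PR text.
type ResolvedCredential =
  | { kind: "api-key"; apiKey: string; profile: "env-api-key" }
  | { kind: "oauth-token"; authToken: string; profile: "env-oauth-token" };

function resolveCredentialSketch(
  env: Record<string, string | undefined>,
): ResolvedCredential | null {
  const raw = env.ANTHROPIC_API_KEY?.trim();
  if (!raw) {
    // The caller would map this to 'no_anthropic_credential' (500).
    return null;
  }
  if (raw.startsWith("sk-ant-oat")) {
    // OAuth tokens are reported under a separate profile name so the
    // orchestrator can tell OAuth-mode from API/coin-mode.
    return { kind: "oauth-token", authToken: raw, profile: "env-oauth-token" };
  }
  return { kind: "api-key", apiKey: raw, profile: "env-api-key" };
}
```

In the handler this would presumably be called with `process.env`, and the returned `profile` echoed back as `used_auth_profile`. The sketch reads only `ANTHROPIC_API_KEY`, matching this commit's description.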
P1 #1: thinking.budget_tokens < max_tokens constraint. Anthropic rejects requests where thinking.budget_tokens >= max_tokens; my budget map (high=16384, xhigh=32768) exceeded typical max_tokens=8192 which would 400 immediately. Fix: cap budget to (max_tokens - 1024) reserving room for actual output. If the cap drops below Anthropic's minimum thinking budget (1024), disable thinking entirely rather than send an invalid request.

P1 #2: ANTHROPIC_AUTH_TOKEN env var. Anthropic SDK reads ANTHROPIC_AUTH_TOKEN for OAuth tokens and ANTHROPIC_API_KEY for API keys. My resolver only checked ANTHROPIC_API_KEY, so OAuth-only setups got 'no_anthropic_credential'. Fix: read both, with ANTHROPIC_AUTH_TOKEN taking precedence (matches SDK behavior). API_KEY-shaped OAuth tokens (sk-ant-oat*) still detected and routed via authToken.

P2 (deferred): `as never` casts. Codex flagged that imports of SDK param types would catch shape bugs (e.g. Tool.InputSchema requires `type: "object"`). Validation layer already enforces shape, but upgrading to proper SDK types is a follow-up; not landing this commit to keep the patch focused.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
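The budget cap from P1 #1 can be worked through in a few lines. This is a sketch under stated assumptions: `thinkingConfigSketch`, `BUDGET_BY_LEVEL`, `OUTPUT_RESERVE`, and `MIN_THINKING_BUDGET` are hypothetical names, and "disable thinking" is modelled here as returning `undefined` (omitting the field), which may differ from what the handler actually sends. The per-level budgets, the 1024-token reserve, and the 1024-token minimum are taken from the commit messages.

```typescript
type ThinkingLevel = "low" | "medium" | "high" | "xhigh";

// Per-level budgets as listed in the earlier commit message.
const BUDGET_BY_LEVEL: Record<ThinkingLevel, number> = {
  low: 4096,
  medium: 8192,
  high: 16384,
  xhigh: 32768,
};

const OUTPUT_RESERVE = 1024; // tokens kept free for the visible answer
const MIN_THINKING_BUDGET = 1024; // minimum thinking budget cited in the commit

type ThinkingConfig = { type: "enabled"; budget_tokens: number } | undefined;

function thinkingConfigSketch(level: ThinkingLevel, maxTokens: number): ThinkingConfig {
  // Cap the budget so that budget_tokens stays strictly below max_tokens.
  const capped = Math.min(BUDGET_BY_LEVEL[level], maxTokens - OUTPUT_RESERVE);
  if (capped < MIN_THINKING_BUDGET) {
    // Too little room: drop thinking rather than send an invalid request.
    return undefined;
  }
  return { type: "enabled", budget_tokens: capped };
}
```

For the case that motivated the fix, `thinkingConfigSketch("high", 8192)` caps 16384 down to 8192 - 1024 = 7168, while `thinkingConfigSketch("xhigh", 2000)` leaves only 976 tokens and so disables thinking.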
Summary
Phase 1B W3 of BenchAGI ADR-0002: Cloud Brain via Relay. Adds the `/v1/llm_turn` endpoint that the BenchAGI cloud orchestrator (W2, BenchAGI #878) dispatches `LlmTurnDirective`s to via the relay.

Persona-off-disk invariant: `system_prompt` arrives inline in the request body, so this endpoint MUST NOT write the persona to disk. The legacy `/v1/chat` path reads `${workspace}/SOUL.md`; this endpoint deliberately does not.

Scope (this PR: scaffold)
- `src/gateway/llm-turn-http.ts` (~370 lines)
  - `validateLlmTurnRequest` pure function with size caps + required-field enforcement
  - `handleLlmTurnHttpRequest` HTTP handler (returns `Promise<boolean>` matching the existing dispatch contract)
  - `isLlmTurnPath` matcher
- `src/gateway/server-http.ts` (3 small edits)
- `src/gateway/llm-turn-http.validate.test.ts`
- `extensions/claude-code-bridge/serve.mjs:35-39` (oxlint curly-rule on a pre-existing single-line return; repo-wide pre-commit hook blocked this commit otherwise)

NOT YET in scope (CALL_ANTHROPIC_TODO, follow-up PR)
The handler validates the request body and returns 501 `llm_turn_not_implemented`. The follow-up PR will cover:

- `agents/auth-profiles` machinery
- `anthropic-payload-policy.ts` cache-control machinery
- `idempotency_key`

This scaffold lets the BenchAGI relay claim path land and exercise the route registration end-to-end without burning customer tokens. When the relay POSTs to `/v1/llm_turn`, it gets a clean 501 and writes `error.code: 'llm_turn_not_implemented'` to the directive doc, which the cloud orchestrator surfaces to webchat. Once the Anthropic call lands, the same flow returns a real response.

Wire format
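As a reading aid for the request and response examples in this section, the two shapes can be sketched as TypeScript types. These are assumptions inferred from the examples, not the PR's actual `LlmTurnWireRequest` / `LlmTurnWireResponse` definitions: the `Sketch` names are hypothetical, which fields are optional is a guess, and `messages`, `tools`, and `content` are left as `unknown[]` because their element shapes are elided in the examples.

```typescript
type ThinkingLevel = "low" | "medium" | "high" | "xhigh";

// Request body for POST /v1/llm_turn (snake_case on the wire).
interface LlmTurnWireRequestSketch {
  agent_id: string;
  messages: unknown[]; // element shape elided in the PR example
  system_prompt: string; // arrives inline; never written to disk
  tools?: unknown[]; // assumed optional
  model: string;
  thinking_level?: ThinkingLevel; // assumed optional
  max_tokens: number; // positive integer
  cache_control?: { system: "ephemeral" }; // assumed optional
  idempotency_key: string;
}

// Response body once the Anthropic call is wired in.
interface LlmTurnWireResponseSketch {
  content: unknown[]; // element shape elided in the PR example
  stop_reason: string; // e.g. "end_turn" or "tool_use"
  model: string;
  usage: { input_tokens: number; output_tokens: number };
  used_auth_profile: string;
}
```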
Request:

    {
      "agent_id": "aurelius",
      "messages": [...],
      "system_prompt": "...",
      "tools": [...],
      "model": "claude-opus-4-7",
      "thinking_level": "high",
      "max_tokens": 8192,
      "cache_control": { "system": "ephemeral" },
      "idempotency_key": "..."
    }

Response (when Anthropic call lands):

    {
      "content": [...],
      "stop_reason": "end_turn" | "tool_use" | ...,
      "model": "claude-opus-4-7-20260301",
      "usage": { "input_tokens": ..., "output_tokens": ... },
      "used_auth_profile": "anthropic-default"
    }

Test plan
- `validateLlmTurnRequest` (`pnpm test src/gateway/llm-turn-http.validate.test.ts`)
- `tsc --noEmit` clean (incl. tsgo pre-commit check)
- `oxlint` clean across the repo

Companion PRs in BenchAGI
🤖 Generated with Claude Code