feat(server): add Anthropic Messages API ingress endpoint #343
Add a managed POST /v1/messages endpoint (plus /v1/messages/count_tokens) that accepts the Anthropic Messages API request dialect and routes it to any configured provider through the unchanged chat-completions pipeline. The Anthropic request is translated once at the ingress boundary into the canonical core.ChatRequest, so model aliases, workflow policy, budgets, failover, the response cache, usage/cost tracking, and audit logging all apply with no changes to gateway, providers, usage, or auditlog.

- New isolated internal/anthropicapi package: wire DTOs, request/response translation, OpenAI->Anthropic SSE stream conversion, error envelope.
- Anthropic-format SSE streaming with the message_start / content_block_* / message_delta / message_stop event sequence.
- count_tokens returns a provider-agnostic heuristic input-token estimate.
- Errors render in the Anthropic error envelope end to end; handleError is now dialect-aware (writeGatewayError), used by the auth middleware too.
- Map OpenAI stop -> Anthropic stop_sequences in the anthropic provider request translation (also fixes stop for /v1/chat/completions).
- Reject document/unknown content blocks with a clear 400 instead of silently dropping them.

See ADR-0007 for the design rationale and tradeoffs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
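For readers skimming the thread, the "translate once at the ingress boundary" idea can be sketched roughly as below. This is only an illustration: `MessagesRequest`, `ChatRequest`, and `toChatRequest` are simplified, hypothetical stand-ins, not the actual `internal/anthropicapi` or `core` types, and content is reduced to plain strings.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, simplified stand-ins for the Anthropic wire DTO and the
// canonical chat request; the real types carry many more fields.
type AnthropicMessage struct {
	Role    string
	Content string
}

type MessagesRequest struct {
	Model         string
	System        string
	MaxTokens     int
	StopSequences []string
	Messages      []AnthropicMessage
}

type ChatMessage struct {
	Role    string
	Content string
}

type ChatRequest struct {
	Model     string
	MaxTokens int
	Stop      []string
	Messages  []ChatMessage
}

// toChatRequest converts the Anthropic dialect into the canonical shape once,
// so everything downstream only ever sees ChatRequest.
func toChatRequest(in MessagesRequest) (ChatRequest, error) {
	if in.Model == "" {
		return ChatRequest{}, errors.New("model is required")
	}
	out := ChatRequest{Model: in.Model, MaxTokens: in.MaxTokens, Stop: in.StopSequences}
	// Anthropic carries the system prompt as a top-level field; the
	// chat-completions shape carries it as a leading system message.
	if in.System != "" {
		out.Messages = append(out.Messages, ChatMessage{Role: "system", Content: in.System})
	}
	for _, m := range in.Messages {
		out.Messages = append(out.Messages, ChatMessage{Role: m.Role, Content: m.Content})
	}
	return out, nil
}

func main() {
	req, err := toChatRequest(MessagesRequest{
		Model:     "some-model",
		System:    "Be brief.",
		MaxTokens: 64,
		Messages:  []AnthropicMessage{{Role: "user", Content: "Hello"}},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, m := range req.Messages {
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```

Because the conversion happens before dispatch, budgets, caching, and audit logging in this sketch would never need to know which dialect the client spoke.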
Greptile Summary
This PR adds a managed Anthropic Messages API ingress path. It changes:
Confidence Score: 5/5
This looks safe to merge.
Reviews (3): Last reviewed commit: "fix(server): fully reject trailing bytes..."
📝 Walkthrough
Adds managed Anthropic /v1/messages ingress: Anthropic wire types and translation to canonical chat requests, response/SSE conversion back to Anthropic shapes, server endpoints (/v1/messages, /v1/messages/count_tokens), dialect-aware error rendering, provider stop-sequence mapping, and docs/tests.
Changes: Anthropic Messages API Implementation
Sequence Diagram
sequenceDiagram
participant Client
participant Server as /v1/messages Handler
participant RequestTranslator as anthropicapi.ToChatRequest
participant Dispatcher as dispatchMessages
participant Provider
participant ResponseTranslator as anthropicapi.FromChatResponse
participant StreamConverter as anthropicapi.NewStreamConverter
Client->>Server: POST /v1/messages (Anthropic dialect)
Server->>RequestTranslator: MessagesRequest body
RequestTranslator->>RequestTranslator: validate + normalize roles
RequestTranslator->>RequestTranslator: convert tools/messages/reasoning
RequestTranslator->>Dispatcher: core.ChatRequest
Dispatcher->>Dispatcher: enforce budget constraints
alt streaming mode
Dispatcher->>Provider: StreamChatCompletion
Provider->>Dispatcher: OpenAI SSE stream
Dispatcher->>StreamConverter: wrap stream
StreamConverter->>StreamConverter: parse chunks, emit Anthropic SSE
StreamConverter->>Client: Anthropic SSE events
else non-streaming mode
Dispatcher->>Provider: ExecuteChatCompletion
Provider->>Dispatcher: core.ChatResponse
Dispatcher->>ResponseTranslator: ChatResponse
ResponseTranslator->>Client: MessagesResponse JSON
end
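The streaming branch of the diagram can be illustrated with a small sketch of the event ordering. It assumes text-only output and uses hypothetical names (`Event`, `convert`, `stopReason`); the finish-reason mapping shown is an assumption about how OpenAI values would line up with Anthropic ones, not a description of `anthropicapi.NewStreamConverter`.

```go
package main

import "fmt"

// Event is a hypothetical, simplified Anthropic SSE event (name plus payload).
type Event struct {
	Name string
	Data string
}

// stopReason maps an OpenAI finish_reason to an Anthropic stop_reason.
// Assumed mapping for illustration only.
func stopReason(finish string) string {
	switch finish {
	case "length":
		return "max_tokens"
	case "tool_calls":
		return "tool_use"
	default:
		return "end_turn"
	}
}

// convert turns a sequence of text deltas into the Anthropic event order:
// message_start, content_block_start, content_block_delta..., content_block_stop,
// message_delta, message_stop.
func convert(deltas []string, finish string) []Event {
	events := []Event{{Name: "message_start"}}
	open := false
	for _, d := range deltas {
		if d == "" {
			continue // empty chunks carry no text
		}
		if !open {
			events = append(events, Event{Name: "content_block_start"})
			open = true
		}
		events = append(events, Event{Name: "content_block_delta", Data: d})
	}
	if open {
		events = append(events, Event{Name: "content_block_stop"})
	}
	events = append(events,
		Event{Name: "message_delta", Data: stopReason(finish)},
		Event{Name: "message_stop"},
	)
	return events
}

func main() {
	for _, e := range convert([]string{"Hel", "lo"}, "stop") {
		fmt.Printf("event: %s data: %q\n", e.Name, e.Data)
	}
}
```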
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 passed | ❌ 1 failed (1 warning)
Two issues found while reviewing the Anthropic Messages ingress:

- top_k was carried through ChatRequest.ExtraFields, which MarshalJSON merges into the wire body. The OpenAI-family providers forward request fields verbatim and reject unknown ones, so an Anthropic request with the legitimate top_k field hard-failed with a 400 when routed to OpenAI, Groq, xAI, DeepSeek, or OpenRouter. Drop top_k (it has no portable OpenAI-compatible equivalent); top_p and temperature are kept.
- Anthropic server/built-in tools (web search, code execution, …) carry a versioned tool type. The Tool DTO had no Type field, so a server tool decoded as a custom tool and was mistranslated into a phantom custom function the gateway cannot execute. Add Tool.Type and reject server tools with a clear 400, consistent with how unsupported content blocks are handled.

Update ADR-0007 and the docs page accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
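A rough sketch of the two fixes described in this commit, under stated assumptions: `Tool`, `Sampling`, `validateTools`, and `portableSampling` are hypothetical names, and treating an empty or `"custom"` type as a client-defined tool is an assumption about how the real check distinguishes server tools.

```go
package main

import "fmt"

// Tool is a hypothetical, trimmed-down tool DTO. Type is empty or "custom"
// for client-defined tools and a versioned string for server tools.
type Tool struct {
	Type string
	Name string
}

// Sampling holds the Anthropic sampling knobs relevant to this sketch.
type Sampling struct {
	Temperature *float64
	TopP        *float64
	TopK        *int
}

// validateTools rejects provider-native server tools instead of letting them
// decode as phantom custom functions.
func validateTools(tools []Tool) error {
	for _, t := range tools {
		if t.Type != "" && t.Type != "custom" {
			return fmt.Errorf("tool %q: server tool type %q is not supported on this endpoint", t.Name, t.Type)
		}
	}
	return nil
}

// portableSampling keeps only the fields that OpenAI-compatible providers
// accept; top_k is deliberately not copied.
func portableSampling(s Sampling) map[string]any {
	out := map[string]any{}
	if s.Temperature != nil {
		out["temperature"] = *s.Temperature
	}
	if s.TopP != nil {
		out["top_p"] = *s.TopP
	}
	return out
}

func main() {
	k, p := 40, 0.9
	fields := portableSampling(Sampling{TopP: &p, TopK: &k})
	_, hasTopK := fields["top_k"]
	fmt.Println("top_k forwarded:", hasTopK)
	fmt.Println(validateTools([]Tool{{Type: "web_search_20250305", Name: "web_search"}}))
}
```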
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/adr/0007-anthropic-messages-ingress.md`:
- Around line 83-88: Add a language tag to the fenced code block that begins
with the lines "provider chat SSE" / "→ ObservedSSEStream(...)" so markdownlint
MD040 is satisfied; update the opening fence from ``` to ```text (or another
appropriate language) so the block is annotated (the block containing provider
chat SSE → ObservedSSEStream → anthropicapi.StreamConverter → HTTP response).
In `@internal/anthropicapi/errors_test.go`:
- Around line 17-53: Add two table-driven test cases to the existing test slice
used by TestErrorFromGateway in internal/anthropicapi/errors_test.go to cover
the special anthropicErrorType branches: one for request-too-large using
core.NewInvalidRequestErrorWithStatus(http.StatusRequestEntityTooLarge, "payload
too large", nil) with wantType "request_too_large" and wantStatus
http.StatusRequestEntityTooLarge, and one for forbidden using
core.ParseProviderError("p", http.StatusForbidden, []byte("forbidden"), nil)
with wantType "permission_error" and wantStatus http.StatusForbidden so the
anthropicErrorType mapping for 413 and 403 is exercised.
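To make the 413 and 403 branches concrete, here is a minimal sketch of a status-to-type mapping and the resulting envelope. It is a simplification: the function names are hypothetical, and the real `anthropicErrorType` is described as keying off the gateway error type, not only the HTTP status.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

type errorBody struct {
	Type    string `json:"type"`
	Message string `json:"message"`
}

// errorEnvelope mirrors the {"type":"error","error":{...}} shape.
type errorEnvelope struct {
	Type  string    `json:"type"`
	Error errorBody `json:"error"`
}

// errorTypeForStatus is a hypothetical status-only mapping to Anthropic
// error type strings.
func errorTypeForStatus(status int) string {
	switch status {
	case http.StatusBadRequest:
		return "invalid_request_error"
	case http.StatusUnauthorized:
		return "authentication_error"
	case http.StatusForbidden:
		return "permission_error"
	case http.StatusNotFound:
		return "not_found_error"
	case http.StatusRequestEntityTooLarge:
		return "request_too_large"
	case http.StatusTooManyRequests:
		return "rate_limit_error"
	default:
		return "api_error"
	}
}

// renderError serializes the envelope for a given status and message.
func renderError(status int, message string) string {
	b, err := json.Marshal(errorEnvelope{
		Type:  "error",
		Error: errorBody{Type: errorTypeForStatus(status), Message: message},
	})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(renderError(http.StatusForbidden, "forbidden"))
	fmt.Println(renderError(http.StatusRequestEntityTooLarge, "payload too large"))
}
```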
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9f811ed6-e6c7-4a4b-83d4-9339f1f0f740
📒 Files selected for processing (23)
- docs/adr/0007-anthropic-messages-ingress.md
- docs/advanced/anthropic-messages-api.mdx
- docs/docs.json
- internal/anthropicapi/errors.go
- internal/anthropicapi/errors_test.go
- internal/anthropicapi/request.go
- internal/anthropicapi/request_test.go
- internal/anthropicapi/response.go
- internal/anthropicapi/response_test.go
- internal/anthropicapi/stream.go
- internal/anthropicapi/stream_test.go
- internal/anthropicapi/types.go
- internal/core/endpoints.go
- internal/providers/anthropic/anthropic_test.go
- internal/providers/anthropic/request_translation.go
- internal/providers/anthropic/types.go
- internal/server/auth.go
- internal/server/error_support.go
- internal/server/error_support_test.go
- internal/server/http.go
- internal/server/messages_handler.go
- internal/server/messages_handler_test.go
- internal/server/translated_inference_service.go
Actionable comments posted: 1
Inline comments:
In `@internal/anthropicapi/request.go`:
- Around line 306-312: The validation error uses the shorthand path
"/p/anthropic" which is misleading; update the error text created in the tool
type check (the block referencing tool.Type and returning
core.NewInvalidRequestError) to point users to the full passthrough endpoint
"/p/anthropic/v1/messages" instead of "/p/anthropic" so the message reads
something like "... use the /p/anthropic/v1/messages passthrough for
provider-native tools".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 55180a33-2236-4a27-b214-7eae41fd87b1
📒 Files selected for processing (5)
- docs/adr/0007-anthropic-messages-ingress.md
- docs/advanced/anthropic-messages-api.mdx
- internal/anthropicapi/request.go
- internal/anthropicapi/request_test.go
- internal/anthropicapi/types.go
Address PR review feedback on the Messages ingress dialect:

- Reject messages whose role is not "user" or "assistant" instead of silently coercing every unknown role (e.g. a typo) to "user", which changed conversation semantics for malformed requests.
- Reject a present-but-malformed `system` value (wrong shape or a non-text block) instead of silently dropping it, which made the model run without the caller's instructions.
- Reject malformed or non-text `tool_result` content instead of silently converting it to an empty string, which made the downstream provider answer as if the tool returned no data.
- Reject request bodies with trailing bytes after the JSON object so a malformed body cannot look valid while audit/cache inputs disagree.
- Point validation error messages at the full passthrough path `/p/anthropic/v1/messages` rather than the `/p/anthropic` shorthand.
- Add a language tag to the ADR fenced code block (markdownlint MD040).
- Cover the request_too_large (413) and permission_error (403) branches of anthropicErrorType in TestErrorFromGateway.

Tool-result blocks are still emitted before the user text of the same message: that ordering is required by the OpenAI message contract (tool messages must immediately follow the assistant tool-call message), so the existing order is correct and left unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
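The role check and the tool-result ordering described in this commit might look roughly like the sketch below. `Block`, `OutMsg`, and `convertMessage` are hypothetical names, tool-result content is reduced to a plain string, and joining multiple text blocks with a newline is an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Block is a hypothetical, flattened Anthropic content block.
type Block struct {
	Type      string // "text" or "tool_result" in this sketch
	Text      string
	ToolUseID string
}

// OutMsg is a hypothetical chat-completions style message.
type OutMsg struct {
	Role       string
	Content    string
	ToolCallID string
}

// convertMessage rejects unknown roles and block types, and emits tool
// messages before the text of the same message so that they directly follow
// the assistant tool-call message.
func convertMessage(role string, blocks []Block) ([]OutMsg, error) {
	if role != "user" && role != "assistant" {
		return nil, fmt.Errorf("unsupported role %q", role)
	}
	var out []OutMsg
	var text []string
	for _, b := range blocks {
		switch b.Type {
		case "tool_result":
			out = append(out, OutMsg{Role: "tool", Content: b.Text, ToolCallID: b.ToolUseID})
		case "text":
			text = append(text, b.Text)
		default:
			return nil, fmt.Errorf("unsupported content block type %q", b.Type)
		}
	}
	if len(text) > 0 {
		out = append(out, OutMsg{Role: role, Content: strings.Join(text, "\n")})
	}
	return out, nil
}

func main() {
	msgs, err := convertMessage("user", []Block{
		{Type: "text", Text: "Thanks, what does that mean?"},
		{Type: "tool_result", ToolUseID: "toolu_1", Text: "42"},
	})
	fmt.Println(msgs, err)
	_, err = convertMessage("usr", nil)
	fmt.Println(err)
}
```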
Actionable comments posted: 1
Inline comments:
In `@internal/anthropicapi/request_test.go`:
- Around line 64-87: The test TestToChatRequestRejectsInvalidShapes currently
only checks that ToChatRequest returns a *core.GatewayError but does not assert
the specific error type; update the test to also assert that the returned
gatewayErr.Type equals core.ErrorTypeInvalidRequest (similar to
TestToChatRequestRejectsServerTool and
TestToChatRequestRejectsUnsupportedContentBlock) so the Anthropic error-envelope
mapping remains validated. Locate the test function
TestToChatRequestRejectsInvalidShapes and after verifying err is a
*core.GatewayError, cast it to gatewayErr and add an assertion that
gatewayErr.Type == core.ErrorTypeInvalidRequest, failing the test if it differs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: be31b772-b2d1-4a3d-903b-92e83906b59a
📒 Files selected for processing (4)
- docs/adr/0007-anthropic-messages-ingress.md
- internal/anthropicapi/errors_test.go
- internal/anthropicapi/request.go
- internal/anthropicapi/request_test.go
Address follow-up PR review feedback:

- DecodeMessagesRequest now requires a second decode to reach io.EOF instead of relying on json.Decoder.More(). More() returns false for a trailing "]" or "}", so those bytes slipped through; decoding one more value and requiring EOF rejects every trailing-byte case.
- Tighten TestToChatRequestRejectsInvalidShapes to assert the error type is core.ErrorTypeInvalidRequest, matching the sibling rejection tests so the Anthropic error-envelope mapping stays validated.

The "require tool_result content" suggestion is intentionally not applied: Anthropic's Messages API treats tool_result `content` as optional (only `tool_use_id` and `type` are required), so rejecting an omitted or empty value would make GoModel stricter than the API it emulates and break requests that work directly against Anthropic. An empty tool result faithfully represents a tool that produced no output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
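The trailing-bytes fix can be illustrated with the standard library alone. The sketch below is not the PR's `DecodeMessagesRequest`; `decodeStrict` and the one-field `MessagesRequest` are hypothetical, but the second-decode-must-hit-`io.EOF` technique is the one the commit describes.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// MessagesRequest is a hypothetical one-field stand-in for the real DTO.
type MessagesRequest struct {
	Model string `json:"model"`
}

// decodeStrict decodes exactly one JSON value and rejects anything after it.
func decodeStrict(body []byte) (MessagesRequest, error) {
	dec := json.NewDecoder(bytes.NewReader(body))
	var req MessagesRequest
	if err := dec.Decode(&req); err != nil {
		return MessagesRequest{}, err
	}
	// dec.More() would report false for a trailing "]" or "}", so it cannot be
	// used as the check. A second Decode returns io.EOF only when nothing but
	// whitespace remains; any other outcome means trailing data.
	var extra json.RawMessage
	if err := dec.Decode(&extra); !errors.Is(err, io.EOF) {
		return MessagesRequest{}, errors.New("unexpected trailing data after JSON object")
	}
	return req, nil
}

func main() {
	for _, body := range []string{
		`{"model":"m"}`,
		`{"model":"m"}}`,
		`{"model":"m"} ]`,
		`{"model":"m"}{}`,
	} {
		_, err := decodeStrict([]byte(body))
		fmt.Printf("%-18s rejected=%v\n", body, err != nil)
	}
}
```

A trailing second value (such as `{}`) decodes without error, which is why the check requires `io.EOF` specifically instead of merely "no syntax error".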
♻️ Duplicate comments (1)
internal/anthropicapi/request_test.go (1)
58-60: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Tighten `TestToChatRequestValidation` to assert `core.ErrorTypeInvalidRequest`.
This sibling test still only checks for `*core.GatewayError`, while `TestToChatRequestRejectsInvalidShapes` (lines 83-86), `TestToChatRequestRejectsServerTool` (lines 221-224), and `TestToChatRequestRejectsUnsupportedContentBlock` (lines 332-335) all verify `Type == core.ErrorTypeInvalidRequest`. The Anthropic error-envelope mapping keys off `Type`, so a future regression that surfaces these as a non-invalid type would slip past this assertion.
♻️ Proposed change
    - if _, ok := err.(*core.GatewayError); !ok {
    -     t.Fatalf("expected *core.GatewayError, got %T", err)
    + gatewayErr, ok := err.(*core.GatewayError)
    + if !ok || gatewayErr.Type != core.ErrorTypeInvalidRequest {
    +     t.Fatalf("expected invalid_request_error, got %T: %v", err, err)
      }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.
In `@internal/anthropicapi/request_test.go` around lines 58 - 60, The test TestToChatRequestValidation currently only asserts the error is a *core.GatewayError; update the assertion to also check that the error's Type equals core.ErrorTypeInvalidRequest. Locate where the test does the type check (the err.(*core.GatewayError) branch) and add an assertion that the mappedGatewayErr.Type == core.ErrorTypeInvalidRequest (or equivalent) so the test fails if the error envelope type is not ErrorTypeInvalidRequest.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 94fefead-5985-46b5-83bc-5021d3d21fae
📒 Files selected for processing (2)
- internal/anthropicapi/request.go
- internal/anthropicapi/request_test.go
Summary
Adds a managed `POST /v1/messages` endpoint (plus `POST /v1/messages/count_tokens`) that accepts the Anthropic Messages API request dialect and routes it to any configured provider, not just Anthropic.

The Anthropic request is translated once at the ingress boundary into the canonical `core.ChatRequest`, then runs through the unchanged chat-completions pipeline. Model aliases, workflow policy, budgets, failover, the response cache, usage/cost tracking, and audit logging all apply, with zero changes to `gateway`, `providers`, `usage`, or `auditlog`.

This mirrors the existing `/v1/responses` ingress dialect. See ADR-0007 for the full rationale and tradeoffs.

User-visible impact
- […] (`type: "message"`, `content` blocks, `stop_reason`, `usage`); errors use the Anthropic error envelope (`{"type":"error","error":{...}}`).
- Anthropic-format SSE streaming (`message_start` / `content_block_*` / `message_delta` / `message_stop`).
- `/v1/chat/completions` still returns the OpenAI shape.

What's included
- New isolated `internal/anthropicapi` package: wire DTOs, request/response translation, OpenAI→Anthropic SSE stream conversion, error-envelope mapping.
- `internal/server/messages_handler.go`; routes registered in `http.go`, classified in `core/endpoints.go` (dialect `anthropic`).
- `count_tokens` returns a provider-agnostic heuristic input-token estimate.
- `handleError` is now dialect-aware (`writeGatewayError`) so middleware errors (auth, workflow resolution) also render the Anthropic envelope.

Provider-specific behavior
- Map OpenAI `stop` → Anthropic `stop_sequences` in the anthropic provider request translation. This also fixes a pre-existing gap where `stop` was dropped for `/v1/chat/completions` routed to Anthropic.
- A `/v1/messages` request routed to the Anthropic provider is lossy (`cache_control`, thinking signatures, server tools do not survive the canonical hop); clients needing byte-exact fidelity should use the `/p/anthropic/v1/messages` passthrough.

Bugs found during live testing (and fixed)
Tested extensively with curl against real provider keys (Anthropic, OpenAI, Groq, Gemini): non-streaming, streaming, function calling (incl. multi-turn `tool_result`), multimodal images, extended thinking, `count_tokens`, error cases, response cache, cost tracking, audit logs.
- `/v1/messages`: auth/bad-model errors bypass the handler. Fixed via dialect-aware `handleError`, which also let us delete the duplicate error helpers.
- `stop_sequences` silently dropped when routed to the Anthropic provider: the provider never read `stop`. Fixed (see above).
- Unsupported content blocks (`document`/PDF) silently dropped: the model answered as if no attachment was sent. Now returns a clear `400`.
- `Read()` set `closed=true` on EOF, so `Close()` skipped `body.Close()` → no audit log, no usage/cost record, leaked provider connection. Fixed so `Close` always propagates.

Testing
- `go build ./...` ✓ · full `go test ./...` ✓ · `golangci-lint` 0 issues · `gofmt` clean · `make test-race` ✓ (pre-commit).
- […] `stop_sequences` mapping.
- `/v1/messages` records usage rows (with cost) and `stream=1` audit rows; cache hits recorded at `0.000000` with `cache_type=exact`.

Docs
- […] `docs/advanced/anthropic-messages-api.mdx`) with nav entry.

🤖 Generated with Claude Code
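As a footnote to the `Read()`/`Close()` bug listed under live testing, the general shape of the fix can be sketched as follows. The `converter`, `trackingBody`, and `drain` names are hypothetical and the wrapper does no SSE translation here; the only point is that reaching EOF must not prevent `Close` from reaching the upstream body.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// trackingBody is a test double that counts how often Close is called.
type trackingBody struct {
	r      *strings.Reader
	closes int
}

func newTrackingBody(s string) *trackingBody {
	return &trackingBody{r: strings.NewReader(s)}
}

func (b *trackingBody) Read(p []byte) (int, error) { return b.r.Read(p) }
func (b *trackingBody) Close() error               { b.closes++; return nil }

// converter wraps an upstream stream. Only Close flips the closed flag, so
// hitting EOF in Read can never cause Close to skip the upstream body.
type converter struct {
	body   io.ReadCloser
	closed bool
}

func (c *converter) Read(p []byte) (int, error) {
	// EOF is passed through as-is and does not mark the converter closed.
	return c.body.Read(p)
}

func (c *converter) Close() error {
	if c.closed {
		return nil // idempotent: a second Close is a no-op
	}
	c.closed = true
	return c.body.Close()
}

// drain reads the converter to EOF, as an HTTP response writer loop would.
func drain(c *converter) error {
	_, err := io.Copy(io.Discard, c)
	return err
}

func main() {
	body := newTrackingBody("event: message_stop\n\n")
	conv := &converter{body: body}
	if err := drain(conv); err != nil {
		fmt.Println("read error:", err)
		return
	}
	_ = conv.Close()
	_ = conv.Close()
	fmt.Println("upstream closes:", body.closes)
}
```

In a design like this, anything hooked onto the upstream `Close` (audit logging, usage recording, connection release) runs exactly once regardless of whether the stream was fully read first.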