feat(providers): claude on vertex (#1009 cell) #1024
Merged
Adds the Vertex AI Anthropic-partner cell to the provider matrix. Mirrors the openai+azure (#1010) and gemini+vertex (#1023) patterns: the per-provider factory derives the platform URL from PlatformConfig when the caller passes an empty BaseURL, and the registry skips the vendor default for that platform pair so the factory branch is reachable.

Vertex Anthropic specifics handled:

- URL: `{baseURL}/{model}:rawPredict` for Predict / PredictWithTools, `{baseURL}/{model}:streamRawPredict` for the streaming variants.
- Body: shared partner shape (no `model`, no `stream`, `anthropic_version` body key) with the platform-specific value `vertex-2023-10-16` (vs Bedrock's `bedrock-2023-05-31`).
- Auth: Bearer token via the GCP credential chain (the same credential.Apply path Bedrock already uses for SigV4 — neither partner uses x-api-key).
- Streaming transport: SSE (NewSSEScanner), distinct from Bedrock's binary event-stream (NewBedrockEventScanner). Both are wired through RunStreamingRequest for retry / budget / semaphore parity.
- Errors: routed through providers.ParsePlatformHTTPError so the Vertex error envelope is parsed instead of returned as raw bytes.

Code structure:

- `vertexPlatform`, `vertexVersionValue`, `vertexAnthropicEndpoint` in claude.go.
- `isVertex()`, `isPartnerHosted()`, `platformAnthropicVersion()` predicates so callers branch by capability rather than enumerating platforms inline.
- `messagesURL()` / `messagesStreamURL()` extended with a Vertex branch.
- `marshalBedrockRequest` and `marshalBedrockStreamingRequest` generalized: same body shape, version sourced from `platformAnthropicVersion()` so they serve both Bedrock and Vertex.
- `applyToolRequestHeaders` and `makeRequest` switched from `isBedrock` to `isPartnerHosted` for body-modification and auth dispatch.
- Tool-streaming gains a `streamVertexToolRequest` peer to the existing Bedrock and direct-API helpers; the SSE wiring is factored into `runSSEToolStream` so direct and Vertex paths share the engine and only differ in body bytes.
- Registry skips the `https://api.anthropic.com` default when `spec.Platform=="vertex"` (regression test added).

The Bedrock path is unchanged — the existing Bedrock integration test still passes after the refactor.

Tests:

- `vertex_unit_test.go` covers helpers (predicates, URL builders, body marshallers) and the factory URL derivation.
- `vertex_streaming_test.go` uses httptest to drive PredictStream through the new Vertex branch and asserts wire format (`:streamRawPredict` URL action, no `model`/`stream` in body, `anthropic_version=vertex-2023-10-16`, Bearer auth, no x-api-key, no anthropic-version header).
- `registry_extended_test.go` adds the claude+vertex regression test matching the openai+azure and gemini+vertex ones.

Refs #1009
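A minimal sketch of the URL-building and version-selection behavior described above. The constant values are the ones stated in this PR; the function names and signatures are illustrative stand-ins, not the actual claude.go code:

```go
package main

import "fmt"

// Platform identifiers and anthropic_version values as described in the PR.
const (
	bedrockPlatform = "bedrock"
	vertexPlatform  = "vertex"

	bedrockVersionValue = "bedrock-2023-05-31"
	vertexVersionValue  = "vertex-2023-10-16"
)

// platformAnthropicVersion picks the anthropic_version body value for a
// partner platform; the direct API sends the version as a header instead.
func platformAnthropicVersion(platform string) string {
	if platform == vertexPlatform {
		return vertexVersionValue
	}
	return bedrockVersionValue
}

// vertexMessagesURL builds the Vertex partner URL described above:
// {baseURL}/{model}:rawPredict, or :streamRawPredict for streaming.
func vertexMessagesURL(baseURL, model string, streaming bool) string {
	action := "rawPredict"
	if streaming {
		action = "streamRawPredict"
	}
	return fmt.Sprintf("%s/%s:%s", baseURL, model, action)
}

func main() {
	fmt.Println(vertexMessagesURL("https://example/publishers/anthropic/models", "claude-x", true))
	fmt.Println(platformAnthropicVersion(vertexPlatform))
}
```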
Mirrors the gemini+vertex (#1023) and bedrock integration suites: build-tag `integration`, env-gated skips when GCP_PROJECT is unset or ADC tokens are unavailable. Covers URL derivation, predict, predict-with-tools, streaming, error propagation, and cost calculation against a real Vertex Anthropic-partner deployment.

Live runs additionally require Anthropic models to be enabled in the project's Vertex Model Garden (a manual one-time terms acceptance in the GCP console, analogous to the Bedrock use-case form). Without that step the tests propagate Vertex's standard 404 "Publisher Model not found or your project does not have access", which the new ParsePlatformHTTPError integration in claude.go surfaces cleanly.

Run:

    gcloud auth application-default login
    gcloud auth application-default set-quota-project <project>
    export GCP_PROJECT=<project>
    go test -tags=integration ./runtime/providers/claude/... -run Vertex -v
chaholl added a commit that referenced this pull request on Apr 20, 2026
Adds the AWS Bedrock OpenAI-partner cell (gpt-oss family) to the provider matrix. 7th cell delivered against #1009, following: openai+azure (#1010), gemini+vertex (#1023), claude+vertex (#1024), fail-fast rejections (#1025).

Wire format (verified against live Bedrock us-west-2):

- URL: `{baseURL}/model/{model-id}/invoke` (standard Bedrock invoke)
- Auth: AWS SigV4 via credentials.AWSCredential
- Request: OpenAI Chat Completions JSON. Bedrock ignores the model field in the body (it routes by URL), so no stripping is needed — the existing request builder works as-is.
- Response: standard OpenAI Chat Completions JSON — parsed by the existing openAIResponse unmarshal path.

Code structure:

- `bedrockPlatform` constant + `isBedrock()` predicate in openai.go.
- `chatCompletionsURL()` and `responsesURL()` branch for Bedrock: the invoke URL path; the Responses API falls back to Chat Completions (mirrors the existing Azure fallback — neither partner exposes it).
- `NewProviderFromConfig` derives baseURL from PlatformConfig.Region via credentials.BedrockEndpoint when BaseURL is empty, forces APIModeCompletions, and auto-adds `max_tokens` and `top_p` to UnsupportedParams for the Bedrock default set:
  * max_tokens: gpt-oss requires `max_completion_tokens` instead (the same switch the request builder already flips for o-series).
  * top_p: gpt-oss rejects 0.0 and the framework default is 0; omitting it leaves the model default in place. Operators can override the full unsupported set via ProviderSpec.UnsupportedParams.
- registry.go: the api.openai.com/v1 default is now skipped for Platform=="bedrock" as well as "azure", so the factory's PlatformConfig URL derivation is reachable (same root cause as #1010).

Streaming (known limitation): Bedrock's `invoke-with-response-stream` returns binary event-stream framing with OpenAI-format chunks inside — distinct from the SSE the rest of the openai stream code expects. Until a dedicated scanner lands, PredictStream / PredictStreamWithTools run a non-streaming Predict and emit a single terminal StreamChunk with content + FinishReason + CostInfo. This is honest (Bedrock's plain `invoke` is non-streaming) and keeps Arena scenarios with `streaming: true` compatible.

Tests:

- bedrock_unit_test.go covers predicates, URL builders, factory URL derivation, API-mode forcing, unsupported-params auto-detection, and that explicit UnsupportedParams overrides Bedrock auto-detect.
- bedrock_integration_test.go (//go:build integration) covers URL construction, predict, predict-with-tools, streaming fallback, error propagation, and cost calculation against a real Bedrock gpt-oss deployment. All 6 pass locally against us-west-2 on the omnia-aws profile.

Refs #1009
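The single-terminal-chunk fallback described above can be sketched as follows. `StreamChunk` and the predict signature here are simplified stand-ins for the runtime's actual types:

```go
package main

import "fmt"

// StreamChunk is a reduced stand-in for the runtime's streaming chunk type.
type StreamChunk struct {
	Content      string
	FinishReason string
	Final        bool
}

// predictFn stands in for the provider's non-streaming Predict.
type predictFn func(prompt string) (content, finishReason string, err error)

// predictStreamFallback mirrors the Bedrock limitation: run a non-streaming
// Predict and emit exactly one terminal chunk, so callers that requested
// streaming keep working until a binary event-stream scanner lands.
func predictStreamFallback(predict predictFn, prompt string, out chan<- StreamChunk) error {
	defer close(out)
	content, finish, err := predict(prompt)
	if err != nil {
		return err
	}
	out <- StreamChunk{Content: content, FinishReason: finish, Final: true}
	return nil
}

func main() {
	out := make(chan StreamChunk, 1)
	predict := func(string) (string, string, error) { return "hello", "stop", nil }
	if err := predictStreamFallback(predict, "hi", out); err != nil {
		panic(err)
	}
	for chunk := range out {
		fmt.Printf("%+v\n", chunk)
	}
}
```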
chaholl added a commit that referenced this pull request on Apr 20, 2026
* feat(providers): add openai bedrock code path (#1009 cell) — commit message as above.

* test(providers): unit-test bedrock streaming fallbacks for SonarCloud coverage

The bedrock_integration_test.go suite covers the new code path end-to-end against real AWS Bedrock, but build-tag `integration` is not run in CI, so SonarCloud sees the predictStreamBedrockFallback and predictStreamWithToolsBedrockFallback bodies as uncovered (new_coverage 41.4% < 80% threshold). Adds httptest-based unit tests that mock the Bedrock invoke endpoint with canned OpenAI Chat Completions JSON and exercise:

- PredictStream Bedrock path: emits exactly one terminal chunk with Content/Delta/FinishReason=stop/CostInfo populated.
- PredictStream Bedrock error: 4xx upstream surfaces as an error from the streaming entrypoint.
- PredictStreamWithTools Bedrock path: emits exactly one terminal chunk with ToolCalls populated and FinishReason=tool_calls.
- PredictStreamWithTools Bedrock error: 5xx upstream surfaces as an error.

No production code changes.



Summary
Adds Vertex AI Anthropic-partner support to the claude provider. Third cell delivered against the #1009 matrix, following openai+azure (#1010) and gemini+vertex (#1023). Same pattern as those two: the per-provider factory derives the platform URL from PlatformConfig when the caller passes an empty BaseURL, and the registry skips the vendor default for that platform pair so the factory branch is reachable.
What changed
Code
- `runtime/providers/claude/claude.go` — `vertexPlatform`, `vertexVersionValue`, `vertexAnthropicEndpoint`. `isVertex()`, `isPartnerHosted()`, `platformAnthropicVersion()` predicates so callers branch by capability rather than enumerating platforms inline. `messagesURL()` / `messagesStreamURL()` extended with the Vertex branch (`{baseURL}/{model}:rawPredict` and `:streamRawPredict`). `marshalBedrockRequest` / `marshalBedrockStreamingRequest` generalized: same body shape for both partners, version sourced from `platformAnthropicVersion()`. `makeClaudeHTTPRequest` switched from `isBedrock` to `isPartnerHosted` for body/header decisions; Vertex errors routed through `providers.ParsePlatformHTTPError`. `NewProviderWithCredential` derives the Vertex publishers/anthropic/models URL when the caller passes an empty `BaseURL`.
- `runtime/providers/claude/claude_streaming.go` — new Vertex branch in `PredictStream` that reuses `marshalBedrockStreamingRequest` for the partner body shape, posts to `:streamRawPredict`, applies the GCP credential as a Bearer token, no `anthropic-version` header.
- `runtime/providers/claude/claude_tools.go` — `applyToolRequestHeaders` and `makeRequest` switched to `isPartnerHosted`. Tool-streaming gains `streamVertexToolRequest`. SSE wiring extracted into `runSSEToolStream` so direct and Vertex paths share the engine and only differ in body bytes (kills a `dupl` lint).
- `runtime/providers/registry.go` — skips the `https://api.anthropic.com` default when `spec.Platform=="vertex"`.

Tests
- `runtime/providers/claude/vertex_unit_test.go` — predicates, URL builders, body marshallers, factory URL derivation; verifies the direct API path is unchanged (still uses x-api-key + the anthropic-version header).
- `runtime/providers/claude/vertex_streaming_test.go` — drives `PredictStream` through the new Vertex branch via `httptest`. Asserts wire format: `:streamRawPredict` URL action, no `model`/`stream` in body, `anthropic_version=vertex-2023-10-16`, Bearer auth, no `x-api-key`/`anthropic-version` headers.
- `runtime/providers/registry_extended_test.go` — claude+vertex regression test mirroring the openai+azure and gemini+vertex ones.
- `runtime/providers/claude/vertex_integration_test.go` — `//go:build integration` suite covering URL derivation, predict, predict-with-tools, streaming, error propagation, and cost calculation against a real Vertex deployment.

Bedrock untouched
The shared `marshalBedrockRequest` / `marshalBedrockStreamingRequest` now consult `platformAnthropicVersion()` instead of hardcoding `bedrockVersionValue`, but Bedrock's value is unchanged, so the wire format on the Bedrock path is byte-identical. Verified: `BEDROCK_MODEL=us.anthropic.claude-haiku-4-5-20251001-v1:0 go test -tags=integration -run ^TestBedrock_ ./runtime/providers/claude/` — 5/5 still pass against AWS.

Pre-commit
Both commits passed locally:
Test plan
- `go test -race -count=1 ./runtime/providers/...` — green
- `go test ./runtime/providers/claude/ -run TestVertex` — 13 unit + 2 streaming tests pass
- `go test ./runtime/providers/ -run TestCreateProviderFromSpecClaudeVertexSkipsDefault` — pass
- Bedrock integration suite (`-tags=integration -run ^TestBedrock_`) — 5/5 still pass after the refactor

Out of scope
- `examples/azure-foundry-test/` and friends; trivial to add once Anthropic is enabled in Model Garden.

Refs #1009