feat(providers): claude on vertex (#1009 cell) #1024

Merged
chaholl merged 2 commits into main from feat/claude-vertex-1009
Apr 20, 2026
Conversation

@chaholl
Contributor

@chaholl chaholl commented Apr 20, 2026

Summary

Adds Vertex AI Anthropic-partner support to the claude provider. This is the
third cell delivered against the #1009 matrix, following openai+azure (#1010)
and gemini+vertex (#1023), and it follows the same pattern as those two: the
per-provider factory derives the platform URL from PlatformConfig when the
caller passes an empty BaseURL, and the registry skips the vendor default for
that platform pair so the factory branch is reachable.

What changed

Code

  • runtime/providers/claude/claude.go
    • New vertexPlatform, vertexVersionValue, vertexAnthropicEndpoint.
    • isVertex(), isPartnerHosted(), platformAnthropicVersion() predicates so callers branch by capability rather than enumerating platforms inline.
    • messagesURL() / messagesStreamURL() extended with the Vertex branch ({baseURL}/{model}:rawPredict and :streamRawPredict).
    • marshalBedrockRequest / marshalBedrockStreamingRequest generalized: same body shape for both partners, version sourced from platformAnthropicVersion().
    • makeClaudeHTTPRequest switched from isBedrock to isPartnerHosted for body/header decisions; Vertex errors routed through providers.ParsePlatformHTTPError.
    • NewProviderWithCredential derives the Vertex publishers/anthropic/models URL when caller passes empty BaseURL.
  • runtime/providers/claude/claude_streaming.go — new Vertex branch in PredictStream that reuses marshalBedrockStreamingRequest for the partner body shape, posts to :streamRawPredict, applies the GCP credential as a Bearer token, no anthropic-version header.
  • runtime/providers/claude/claude_tools.go — applyToolRequestHeaders and makeRequest switched to isPartnerHosted. Tool-streaming gains streamVertexToolRequest. SSE wiring extracted into runSSEToolStream so the direct and Vertex paths share the engine and differ only in body bytes (eliminates a dupl lint finding).
  • runtime/providers/registry.go — skips https://api.anthropic.com default when spec.Platform=="vertex".

Tests

  • runtime/providers/claude/vertex_unit_test.go — predicates, URL builders, body marshallers, factory URL derivation; verifies the direct API path is unchanged (still uses x-api-key + anthropic-version header).
  • runtime/providers/claude/vertex_streaming_test.go — drives PredictStream through the new Vertex branch via httptest. Asserts wire format: :streamRawPredict URL action, no model/stream in body, anthropic_version=vertex-2023-10-16, Bearer auth, no x-api-key/anthropic-version headers.
  • runtime/providers/registry_extended_test.go — claude+vertex regression test mirroring the openai+azure and gemini+vertex ones.
  • runtime/providers/claude/vertex_integration_test.go — //go:build integration suite covering URL derivation, predict, predict-with-tools, streaming, error propagation, and cost calculation against a real Vertex deployment.

Bedrock untouched

The shared marshalBedrockRequest / marshalBedrockStreamingRequest now consult platformAnthropicVersion() instead of hardcoding bedrockVersionValue, but Bedrock's value is unchanged so the wire format on the Bedrock path is byte-identical. Verified: BEDROCK_MODEL=us.anthropic.claude-haiku-4-5-20251001-v1:0 go test -tags=integration -run ^TestBedrock_ ./runtime/providers/claude/ — 5/5 still pass against AWS.

Pre-commit

Both commits passed locally:

  • lint: 0 issues
  • coverage on changed files: claude.go 83.1%, claude_streaming.go 88.0% (up from 78.5% pre-test), claude_tools.go 84.1%, registry.go 91.2% (all ≥80%)

Test plan

  • go test -race -count=1 ./runtime/providers/... — green
  • go test ./runtime/providers/claude/ -run TestVertex — 13 unit + 2 streaming tests pass
  • go test ./runtime/providers/ -run TestCreateProviderFromSpecClaudeVertexSkipsDefault — pass
  • Bedrock integration suite (-tags=integration -run ^TestBedrock_) — 5/5 still pass after the refactor
  • [-] Live Vertex Anthropic integration tests — skipped: account requires Anthropic models to be enabled in Vertex Model Garden (manual GCP-console terms acceptance, analogous to AWS use-case form). Structural tests (URL construction, error propagation) confirm the code path is reachable; live runs will turn green once the project enables Anthropic in Model Garden.

Out of scope

Refs #1009

chaholl added 2 commits April 20, 2026 12:43
Adds the Vertex AI Anthropic-partner cell to the provider matrix.
Mirrors the openai+azure (#1010) and gemini+vertex (#1023) patterns:
the per-provider factory derives the platform URL from PlatformConfig
when caller passes an empty BaseURL, and the registry skips the vendor
default for that platform pair so the factory branch is reachable.

Vertex Anthropic specifics handled:

- URL: {baseURL}/{model}:rawPredict for Predict / PredictWithTools,
  {baseURL}/{model}:streamRawPredict for the streaming variants.
- Body: shared partner shape (no `model`, no `stream`,
  `anthropic_version` body key) with the platform-specific value
  `vertex-2023-10-16` (vs Bedrock's `bedrock-2023-05-31`).
- Auth: Bearer token via the GCP credential chain (the same
  credential.Apply path Bedrock already uses for SigV4 — neither
  partner uses x-api-key).
- Streaming transport: SSE (NewSSEScanner), distinct from Bedrock's
  binary event-stream (NewBedrockEventScanner). Both wired through
  RunStreamingRequest for retry / budget / semaphore parity.
- Errors: routed through providers.ParsePlatformHTTPError so the
  Vertex error envelope is parsed instead of returned as raw bytes.

Code structure:

- `vertexPlatform`, `vertexVersionValue`, `vertexAnthropicEndpoint`
  in claude.go.
- `isVertex()`, `isPartnerHosted()`, `platformAnthropicVersion()`
  predicates so callers branch by capability rather than enumerating
  platforms inline.
- `messagesURL()` / `messagesStreamURL()` extended with a Vertex
  branch.
- `marshalBedrockRequest` and `marshalBedrockStreamingRequest`
  generalized: same body shape, version sourced from
  `platformAnthropicVersion()` so they serve both Bedrock and Vertex.
- `applyToolRequestHeaders` and `makeRequest` switched from
  `isBedrock` to `isPartnerHosted` for body-modification and auth
  dispatch. Tool-streaming gains a `streamVertexToolRequest` peer to
  the existing Bedrock and direct-API helpers; the SSE wiring is
  factored into `runSSEToolStream` so direct and Vertex paths share
  the engine and only differ in body bytes.
- Registry skips `https://api.anthropic.com` default when
  `spec.Platform=="vertex"` (regression test added).

The Bedrock path is unchanged — the existing Bedrock integration test
still passes after the refactor.

Tests:

- `vertex_unit_test.go` covers helpers (predicates, URL builders,
  body marshallers) and the factory URL derivation.
- `vertex_streaming_test.go` uses httptest to drive PredictStream
  through the new Vertex branch and asserts wire format
  (`:streamRawPredict` URL action, no `model`/`stream` in body,
  `anthropic_version=vertex-2023-10-16`, Bearer auth, no x-api-key,
  no anthropic-version header).
- `registry_extended_test.go` adds the claude+vertex regression test
  matching the openai+azure and gemini+vertex ones.

Refs #1009
Mirrors the gemini+vertex (#1023) and bedrock integration suites:
build-tag `integration`, env-gated skips when GCP_PROJECT is unset or
ADC tokens are unavailable. Covers URL derivation, predict,
predict-with-tools, streaming, error propagation, and cost
calculation against a real Vertex Anthropic-partner deployment.

Live runs additionally require Anthropic models to be enabled in the
project's Vertex Model Garden (a manual one-time terms-acceptance in
the GCP console, analogous to the Bedrock use-case form). Without
that step the tests propagate Vertex's standard 404
"Publisher Model not found or your project does not have access"
which the new ParsePlatformHTTPError integration in claude.go
surfaces cleanly.

Run:

  gcloud auth application-default login
  gcloud auth application-default set-quota-project <project>
  export GCP_PROJECT=<project>
  go test -tags=integration ./runtime/providers/claude/... -run Vertex -v

@chaholl chaholl merged commit d366b62 into main Apr 20, 2026
23 checks passed
@chaholl chaholl deleted the feat/claude-vertex-1009 branch April 20, 2026 11:52
chaholl added a commit that referenced this pull request Apr 20, 2026
Adds the AWS Bedrock OpenAI-partner cell (gpt-oss family) to the
provider matrix. 7th cell delivered against #1009 following:
openai+azure (#1010), gemini+vertex (#1023), claude+vertex (#1024),
fail-fast rejections (#1025).

Wire format (verified against live Bedrock us-west-2):

- URL:      {baseURL}/model/{model-id}/invoke  (standard Bedrock invoke)
- Auth:     AWS SigV4 via credentials.AWSCredential
- Request:  OpenAI Chat Completions JSON. Bedrock ignores the model
            field in the body (it routes by URL) so no stripping is
            needed — the existing request builder works as-is.
- Response: standard OpenAI Chat Completions JSON — parsed by the
            existing openAIResponse unmarshal path.

Code structure:

- `bedrockPlatform` constant + `isBedrock()` predicate in openai.go.
- `chatCompletionsURL()` and `responsesURL()` branch for Bedrock:
  the invoke URL path; Responses API falls back to Chat Completions
  (mirrors the existing Azure fallback — neither partner exposes it).
- `NewProviderFromConfig` derives baseURL from PlatformConfig.Region
  via credentials.BedrockEndpoint when BaseURL is empty, forces
  APIModeCompletions, and auto-adds `max_tokens` and `top_p` to
  UnsupportedParams for the Bedrock default set:
    * max_tokens: gpt-oss requires `max_completion_tokens` instead
      (same switch the request builder already flips for o-series).
    * top_p:      gpt-oss rejects 0.0 and the framework default is 0;
      omitting leaves the model default in place.
  Operators can override the full unsupported set via
  ProviderSpec.UnsupportedParams.
- registry.go: api.openai.com/v1 default now skipped for
  Platform=="bedrock" as well as "azure", so the factory's
  PlatformConfig URL derivation is reachable (same root cause as
  #1010).

Streaming (known limitation):

Bedrock's `invoke-with-response-stream` returns binary event-stream
framing with OpenAI-format chunks inside — distinct from the SSE the
rest of the openai stream code expects. Until a dedicated scanner
lands, PredictStream / PredictStreamWithTools run a non-streaming
Predict and emit a single terminal StreamChunk with content +
FinishReason + CostInfo. This is honest (Bedrock's plain `invoke` is
non-streaming) and keeps Arena scenarios with `streaming: true`
compatible.

Tests:

- bedrock_unit_test.go covers predicates, URL builders, factory URL
  derivation, API-mode forcing, unsupported-params auto-detection,
  and that explicit UnsupportedParams overrides Bedrock auto-detect.
- bedrock_integration_test.go (//go:build integration) covers
  URL construction, predict, predict-with-tools, streaming fallback,
  error propagation, and cost calculation against a real Bedrock
  gpt-oss deployment. All 6 pass locally against us-west-2 on the
  omnia-aws profile.

Refs #1009
chaholl added a commit that referenced this pull request Apr 20, 2026
* feat(providers): add openai bedrock code path (#1009 cell)

* test(providers): unit-test bedrock streaming fallbacks for SonarCloud coverage

The bedrock_integration_test.go suite covers the new code path
end-to-end against real AWS Bedrock, but build-tag `integration` is
not run in CI so SonarCloud sees the predictStreamBedrockFallback and
predictStreamWithToolsBedrockFallback bodies as uncovered (new_coverage
41.4% < 80% threshold).

Adds httptest-based unit tests that mock the Bedrock invoke endpoint
with canned OpenAI Chat Completions JSON and exercise:

- PredictStream Bedrock path: emits exactly one terminal chunk with
  Content/Delta/FinishReason=stop/CostInfo populated.
- PredictStream Bedrock error: 4xx upstream surfaces as error from
  the streaming entrypoint.
- PredictStreamWithTools Bedrock path: emits exactly one terminal
  chunk with ToolCalls populated and FinishReason=tool_calls.
- PredictStreamWithTools Bedrock error: 5xx upstream surfaces as
  error.

No production code changes.