feat(providers): claude on vertex (#1009 cell) #1024

Merged
chaholl merged 2 commits into main from feat/claude-vertex-1009
Apr 20, 2026
Conversation

@chaholl
Contributor

@chaholl chaholl commented Apr 20, 2026

Summary

Adds Vertex AI Anthropic-partner support to the claude provider. This is the
third cell delivered against the #1009 matrix, following openai+azure (#1010)
and gemini+vertex (#1023), and it follows the same pattern as those two: the
per-provider factory derives the platform URL from PlatformConfig when the
caller passes an empty BaseURL, and the registry skips the vendor default for
that platform pair so the factory branch is reachable.

What changed

Code

  • runtime/providers/claude/claude.go
    • New vertexPlatform, vertexVersionValue, vertexAnthropicEndpoint.
    • isVertex(), isPartnerHosted(), platformAnthropicVersion() predicates so callers branch by capability rather than enumerating platforms inline.
    • messagesURL() / messagesStreamURL() extended with the Vertex branch ({baseURL}/{model}:rawPredict and :streamRawPredict).
    • marshalBedrockRequest / marshalBedrockStreamingRequest generalized: same body shape for both partners, version sourced from platformAnthropicVersion().
    • makeClaudeHTTPRequest switched from isBedrock to isPartnerHosted for body/header decisions; Vertex errors routed through providers.ParsePlatformHTTPError.
    • NewProviderWithCredential derives the Vertex publishers/anthropic/models URL when caller passes empty BaseURL.
  • runtime/providers/claude/claude_streaming.go — new Vertex branch in PredictStream that reuses marshalBedrockStreamingRequest for the partner body shape, posts to :streamRawPredict, applies the GCP credential as a Bearer token, no anthropic-version header.
  • runtime/providers/claude/claude_tools.go — applyToolRequestHeaders and makeRequest switched to isPartnerHosted. Tool-streaming gains streamVertexToolRequest. SSE wiring extracted into runSSEToolStream so the direct and Vertex paths share the engine and differ only in body bytes (eliminates a dupl lint finding).
  • runtime/providers/registry.go — skips https://api.anthropic.com default when spec.Platform=="vertex".

Tests

  • runtime/providers/claude/vertex_unit_test.go — predicates, URL builders, body marshallers, factory URL derivation; verifies the direct API path is unchanged (still uses x-api-key + anthropic-version header).
  • runtime/providers/claude/vertex_streaming_test.go — drives PredictStream through the new Vertex branch via httptest. Asserts wire format: :streamRawPredict URL action, no model/stream in body, anthropic_version=vertex-2023-10-16, Bearer auth, no x-api-key/anthropic-version headers.
  • runtime/providers/registry_extended_test.go — claude+vertex regression test mirroring the openai+azure and gemini+vertex ones.
  • runtime/providers/claude/vertex_integration_test.go — //go:build integration suite covering URL derivation, predict, predict-with-tools, streaming, error propagation, and cost calculation against a real Vertex deployment.

Bedrock untouched

The shared marshalBedrockRequest / marshalBedrockStreamingRequest now consult platformAnthropicVersion() instead of hardcoding bedrockVersionValue, but Bedrock's value is unchanged so the wire format on the Bedrock path is byte-identical. Verified: BEDROCK_MODEL=us.anthropic.claude-haiku-4-5-20251001-v1:0 go test -tags=integration -run ^TestBedrock_ ./runtime/providers/claude/ — 5/5 still pass against AWS.

Pre-commit

Both commits passed locally:

  • lint: 0 issues
  • coverage on changed files: claude.go 83.1%, claude_streaming.go 88.0% (up from 78.5% pre-test), claude_tools.go 84.1%, registry.go 91.2% (all ≥80%)

Test plan

  • go test -race -count=1 ./runtime/providers/... — green
  • go test ./runtime/providers/claude/ -run TestVertex — 13 unit + 2 streaming tests pass
  • go test ./runtime/providers/ -run TestCreateProviderFromSpecClaudeVertexSkipsDefault — pass
  • Bedrock integration suite (-tags=integration -run ^TestBedrock_) — 5/5 still pass after the refactor
  • [-] Live Vertex Anthropic integration tests — skipped: account requires Anthropic models to be enabled in Vertex Model Garden (manual GCP-console terms acceptance, analogous to AWS use-case form). Structural tests (URL construction, error propagation) confirm the code path is reachable; live runs will turn green once the project enables Anthropic in Model Garden.

Out of scope

Refs #1009

chaholl added 2 commits April 20, 2026 12:43
Adds the Vertex AI Anthropic-partner cell to the provider matrix.
Mirrors the openai+azure (#1010) and gemini+vertex (#1023) patterns:
the per-provider factory derives the platform URL from PlatformConfig
when caller passes an empty BaseURL, and the registry skips the vendor
default for that platform pair so the factory branch is reachable.

Vertex Anthropic specifics handled:

- URL: {baseURL}/{model}:rawPredict for Predict / PredictWithTools,
  {baseURL}/{model}:streamRawPredict for the streaming variants.
- Body: shared partner shape (no `model`, no `stream`,
  `anthropic_version` body key) with the platform-specific value
  `vertex-2023-10-16` (vs Bedrock's `bedrock-2023-05-31`).
- Auth: Bearer token via the GCP credential chain (the same
  credential.Apply path Bedrock already uses for SigV4 — neither
  partner uses x-api-key).
- Streaming transport: SSE (NewSSEScanner), distinct from Bedrock's
  binary event-stream (NewBedrockEventScanner). Both wired through
  RunStreamingRequest for retry / budget / semaphore parity.
- Errors: routed through providers.ParsePlatformHTTPError so the
  Vertex error envelope is parsed instead of returned as raw bytes.

Code structure:

- `vertexPlatform`, `vertexVersionValue`, `vertexAnthropicEndpoint`
  in claude.go.
- `isVertex()`, `isPartnerHosted()`, `platformAnthropicVersion()`
  predicates so callers branch by capability rather than enumerating
  platforms inline.
- `messagesURL()` / `messagesStreamURL()` extended with a Vertex
  branch.
- `marshalBedrockRequest` and `marshalBedrockStreamingRequest`
  generalized: same body shape, version sourced from
  `platformAnthropicVersion()` so they serve both Bedrock and Vertex.
- `applyToolRequestHeaders` and `makeRequest` switched from
  `isBedrock` to `isPartnerHosted` for body-modification and auth
  dispatch. Tool-streaming gains a `streamVertexToolRequest` peer to
  the existing Bedrock and direct-API helpers; the SSE wiring is
  factored into `runSSEToolStream` so direct and Vertex paths share
  the engine and only differ in body bytes.
- Registry skips `https://api.anthropic.com` default when
  `spec.Platform=="vertex"` (regression test added).

The Bedrock path is unchanged — the existing Bedrock integration test
still passes after the refactor.

Tests:

- `vertex_unit_test.go` covers helpers (predicates, URL builders,
  body marshallers) and the factory URL derivation.
- `vertex_streaming_test.go` uses httptest to drive PredictStream
  through the new Vertex branch and asserts wire format
  (`:streamRawPredict` URL action, no `model`/`stream` in body,
  `anthropic_version=vertex-2023-10-16`, Bearer auth, no x-api-key,
  no anthropic-version header).
- `registry_extended_test.go` adds the claude+vertex regression test
  matching the openai+azure and gemini+vertex ones.

Refs #1009
Mirrors the gemini+vertex (#1023) and bedrock integration suites:
build-tag `integration`, env-gated skips when GCP_PROJECT is unset or
ADC tokens are unavailable. Covers URL derivation, predict,
predict-with-tools, streaming, error propagation, and cost
calculation against a real Vertex Anthropic-partner deployment.

Live runs additionally require Anthropic models to be enabled in the
project's Vertex Model Garden (a manual one-time terms-acceptance in
the GCP console, analogous to the Bedrock use-case form). Without
that step the tests propagate Vertex's standard 404
"Publisher Model not found or your project does not have access"
which the new ParsePlatformHTTPError integration in claude.go
surfaces cleanly.

Run:

  gcloud auth application-default login
  gcloud auth application-default set-quota-project <project>
  export GCP_PROJECT=<project>
  go test -tags=integration ./runtime/providers/claude/... -run Vertex -v

@chaholl chaholl merged commit d366b62 into main Apr 20, 2026
23 checks passed
@chaholl chaholl deleted the feat/claude-vertex-1009 branch April 20, 2026 11:52
chaholl added a commit that referenced this pull request Apr 20, 2026
Adds the AWS Bedrock OpenAI-partner cell (gpt-oss family) to the
provider matrix. 7th cell delivered against #1009 following:
openai+azure (#1010), gemini+vertex (#1023), claude+vertex (#1024),
fail-fast rejections (#1025).

Wire format (verified against live Bedrock us-west-2):

- URL:      {baseURL}/model/{model-id}/invoke  (standard Bedrock invoke)
- Auth:     AWS SigV4 via credentials.AWSCredential
- Request:  OpenAI Chat Completions JSON. Bedrock ignores the model
            field in the body (it routes by URL) so no stripping is
            needed — the existing request builder works as-is.
- Response: standard OpenAI Chat Completions JSON — parsed by the
            existing openAIResponse unmarshal path.

Code structure:

- `bedrockPlatform` constant + `isBedrock()` predicate in openai.go.
- `chatCompletionsURL()` and `responsesURL()` branch for Bedrock:
  the invoke URL path; Responses API falls back to Chat Completions
  (mirrors the existing Azure fallback — neither partner exposes it).
- `NewProviderFromConfig` derives baseURL from PlatformConfig.Region
  via credentials.BedrockEndpoint when BaseURL is empty, forces
  APIModeCompletions, and auto-adds `max_tokens` and `top_p` to
  UnsupportedParams for the Bedrock default set:
    * max_tokens: gpt-oss requires `max_completion_tokens` instead
      (same switch the request builder already flips for o-series).
    * top_p:      gpt-oss rejects 0.0 and the framework default is 0;
      omitting leaves the model default in place.
  Operators can override the full unsupported set via
  ProviderSpec.UnsupportedParams.
- registry.go: api.openai.com/v1 default now skipped for
  Platform=="bedrock" as well as "azure", so the factory's
  PlatformConfig URL derivation is reachable (same root cause as
  #1010).

Streaming (known limitation):

Bedrock's `invoke-with-response-stream` returns binary event-stream
framing with OpenAI-format chunks inside — distinct from the SSE the
rest of the openai stream code expects. Until a dedicated scanner
lands, PredictStream / PredictStreamWithTools run a non-streaming
Predict and emit a single terminal StreamChunk with content +
FinishReason + CostInfo. This is honest (Bedrock's plain `invoke` is
non-streaming) and keeps Arena scenarios with `streaming: true`
compatible.

Tests:

- bedrock_unit_test.go covers predicates, URL builders, factory URL
  derivation, API-mode forcing, unsupported-params auto-detection,
  and that explicit UnsupportedParams overrides Bedrock auto-detect.
- bedrock_integration_test.go (//go:build integration) covers
  URL construction, predict, predict-with-tools, streaming fallback,
  error propagation, and cost calculation against a real Bedrock
  gpt-oss deployment. All 6 pass locally against us-west-2 on the
  omnia-aws profile.

Refs #1009
chaholl added a commit that referenced this pull request Apr 20, 2026
* feat(providers): add openai bedrock code path (#1009 cell)

* test(providers): unit-test bedrock streaming fallbacks for SonarCloud coverage

The bedrock_integration_test.go suite covers the new code path
end-to-end against real AWS Bedrock, but build-tag `integration` is
not run in CI so SonarCloud sees the predictStreamBedrockFallback and
predictStreamWithToolsBedrockFallback bodies as uncovered (new_coverage
41.4% < 80% threshold).

Adds httptest-based unit tests that mock the Bedrock invoke endpoint
with canned OpenAI Chat Completions JSON and exercise:

- PredictStream Bedrock path: emits exactly one terminal chunk with
  Content/Delta/FinishReason=stop/CostInfo populated.
- PredictStream Bedrock error: 4xx upstream surfaces as error from
  the streaming entrypoint.
- PredictStreamWithTools Bedrock path: emits exactly one terminal
  chunk with ToolCalls populated and FinishReason=tool_calls.
- PredictStreamWithTools Bedrock error: 5xx upstream surfaces as
  error.

No production code changes.