
feat(provider): add anthropic chat format for streaming#27

Merged
bzp2010 merged 5 commits into main from bzp/feat-new-provider-anthropic-format-s
Apr 7, 2026

Conversation

Collaborator

@bzp2010 bzp2010 commented Apr 7, 2026

As the 13th part of the major provider refactoring, this PR adds the Anthropic Messages API chat format for streaming.

Summary by CodeRabbit

  • New Features

    • Hub-to-Anthropic bridging now streams complete bridged message sequences (incremental text, tool-use, images) and native responses include extracted usage details (including cache token counts).
  • Bug Fixes

    • Improved streaming event sequencing and aggregated token accounting; tighter validation surfaces rejections for unsupported cache-control or malformed tool deltas.
  • Documentation

    • Docs updated to remove the prior limitation and reflect supported streaming/usage behavior.

Copilot AI review requested due to automatic review settings April 7, 2026 09:13

coderabbitai bot commented Apr 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a0f61cfe-2f07-4a6a-befb-074688239a8f

📥 Commits

Reviewing files that changed from the base of the PR and between a8cbe7b and 02562bc.

📒 Files selected for processing (1)
  • docs/internals/llm-gateway.md
✅ Files skipped from review due to trivial changes (1)
  • docs/internals/llm-gateway.md

📝 Walkthrough


Implemented a stateful hub→Anthropic streaming bridge that incrementally converts OpenAI chat SSE chunks into Anthropic stream events, tightens hub validation, propagates aggregated usage (including prompt cache tokens), and makes non-stream native usage extraction format-driven via ChatFormat::response_usage(). The documented limitation has been removed.

Changes

  • Anthropic Messages Bridge (src/gateway/formats/anthropic_messages.rs): Replaced unit bridge state with AnthropicBridgeState; added anthropic_bridge_state_machine to incrementally emit MessageStart/ContentBlockStart/ContentBlockDelta/ContentBlockStop/MessageDelta/MessageStop; added stream_end_events; tightened SSE validation and tool-delta checks; updated tests.
  • Gateway runtime & tests (src/gateway/gateway.rs): Non-stream native responses now use F::response_usage(&response) and return ChatResponse::Complete { response, usage }; updated/added tests to assert bridged SSE sequencing and cache-aware usage aggregation.
  • Anthropic provider & transforms (src/gateway/providers/anthropic/transform.rs, src/gateway/providers/anthropic/mod.rs): Set cache_control: None on synthesized requests/blocks; added streaming usage appliers (start/delta) and cached-token accounting; loosened content-block pattern matches; small test setup fix.
  • Types & Anthropic model (src/gateway/types/anthropic.rs): Added cache_control: Option<CacheControl> (and ttl) to requests/blocks; introduced AnthropicCacheCreation; replaced InputUsage with MessageStartUsage; made DeltaUsage fields optional and added cache token fields; updated tests/fixtures.
  • Stream state & usage plumbing (src/gateway/streams/bridged.rs, src/gateway/streams/hub.rs): Bridged stream now includes cache_creation_input_tokens and cache_read_input_tokens in Usage; HubChunkStream populates cache_read_input_tokens from transformed chunk cached tokens when appropriate.
  • Traits / Native stream state (src/gateway/traits/chat_format.rs, src/gateway/traits/native.rs): Added ChatFormat::response_usage hook (defaulting to empty Usage); extended ChatStreamState with optional cache token fields; AnthropicMessagesNativeStreamState now contains a Usage field.
  • Docs (docs/internals/llm-gateway.md): Removed documented limitation that AnthropicMessagesFormat could not stream via non-native hub paths; updated description of native complete-call usage extraction to be driven by ChatFormat::response_usage().
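The bridge described above is essentially a small state machine that tracks whether a message and a content block are open, emitting lifecycle events lazily. A minimal sketch of the idea, with hypothetical names (BridgeState, StreamEvent, on_text_delta) standing in for the crate's real AnthropicBridgeState machinery:

```rust
// Illustrative sketch of a hub->Anthropic streaming bridge state machine.
// All names here are hypothetical stand-ins, not the actual implementation.

#[derive(Debug, PartialEq)]
pub enum StreamEvent {
    MessageStart,
    ContentBlockStart { index: usize },
    ContentBlockDelta { index: usize, text: String },
    ContentBlockStop { index: usize },
    MessageStop,
}

#[derive(Default)]
pub struct BridgeState {
    message_started: bool,
    block_open: bool,
    block_index: usize,
}

impl BridgeState {
    /// Convert one OpenAI-style text delta into zero or more Anthropic events,
    /// opening the message and content block lazily on first use.
    pub fn on_text_delta(&mut self, text: &str) -> Vec<StreamEvent> {
        let mut events = Vec::new();
        if !self.message_started {
            self.message_started = true;
            events.push(StreamEvent::MessageStart);
        }
        if !self.block_open {
            self.block_open = true;
            events.push(StreamEvent::ContentBlockStart { index: self.block_index });
        }
        events.push(StreamEvent::ContentBlockDelta {
            index: self.block_index,
            text: text.to_string(),
        });
        events
    }

    /// Finalize the stream: close any open block, then end the message.
    pub fn stream_end_events(&mut self) -> Vec<StreamEvent> {
        let mut events = Vec::new();
        if self.block_open {
            self.block_open = false;
            events.push(StreamEvent::ContentBlockStop { index: self.block_index });
        }
        events.push(StreamEvent::MessageStop);
        events
    }
}
```

On this sketch, the first text delta yields three events (MessageStart, ContentBlockStart, ContentBlockDelta), subsequent deltas yield a single ContentBlockDelta, and stream_end_events closes the open block before MessageStop, matching the sequencing in the diagram below.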

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Gateway
    participant Hub as OpenAI Hub (SSE)
    participant Bridge as AnthropicBridgeState
    participant ClientStream as Anthropic Client Stream

    Client->>Gateway: messages(stream: true)
    Gateway->>Hub: POST /v1/chat/completions (SSE)
    Hub-->>Gateway: chat.completion.chunk SSE events

    loop per chunk
        Gateway->>Bridge: process OpenAI chunk
        alt first relevant chunk
            Bridge->>ClientStream: MessageStart (usage incl. cache fields)
        end
        alt text delta
            Bridge->>ClientStream: ContentBlockStart (if needed)
            Bridge->>ClientStream: ContentBlockDelta (TextDelta)
        end
        alt tool/input delta
            Bridge->>ClientStream: ContentBlockStart (ToolUse)
            Bridge->>ClientStream: ContentBlockDelta (InputJsonDelta)
        end
        alt block boundary
            Bridge->>ClientStream: ContentBlockStop
        end
    end

    Gateway->>Bridge: stream_end_events()
    Bridge->>ClientStream: finalize ContentBlockStop (if open)
    Bridge->>ClientStream: MessageDelta (aggregated usage)
    Bridge->>ClientStream: MessageStop
    ClientStream-->>Client: Anthropic stream events

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 34.38%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • E2e Test Quality Review: ⚠️ Warning. The PR's E2E tests only cover server startup and basic HTTP status checks for the proxy and admin endpoints; they do not exercise any chat endpoints or the newly added Anthropic streaming formats, leaving the full business flow untested. Resolution: add end-to-end tests that simulate real chat requests through the gateway to the Anthropic Messages API, covering both native and bridged streaming paths and asserting complete SSE event sequences, usage reporting, and error handling.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'feat(provider): add anthropic chat format for streaming' accurately reflects the main change: implementing streaming support for the Anthropic Messages format via a new ChatFormat implementation with comprehensive streaming event handling.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bzp/feat-new-provider-anthropic-format-s

Comment @coderabbitai help to get the list of available commands and usage tips.


Copilot AI left a comment


Pull request overview

This PR enables hub-based streaming for the AnthropicMessagesFormat by bridging OpenAI-style chat.completion.chunk SSE chunks into Anthropic streaming events, and updates tests/docs to reflect the new streaming behavior.

Changes:

  • Implement hub streaming bridge logic for AnthropicMessagesFormat using a state machine that emits Anthropic stream event lifecycles (message/content blocks) and stop reasons.
  • Update gateway integration test to validate end-to-end hub SSE streaming → Anthropic stream events and final usage delivery.
  • Update internal documentation to remove the previous limitation note about Anthropic hub streaming being unsupported.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
src/gateway/gateway.rs Reworks the messages streaming test to exercise hub SSE streaming and assert Anthropic stream events + usage.
src/gateway/formats/anthropic_messages.rs Adds a bridge state + state machine to convert hub ChatCompletionChunk streams into AnthropicStreamEvents (text + tool_use).
docs/internals/llm-gateway.md Removes outdated statement that AnthropicMessagesFormat rejects non-native hub streaming.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
src/gateway/formats/anthropic_messages.rs (1)

833-840: Avoid duplicating finish-reason mapping logic.

openai_finish_reason_to_anthropic_stream mirrors openai_finish_reason_to_anthropic; centralizing this mapping will prevent drift.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/gateway/formats/anthropic_messages.rs` around lines 833 - 840, The
mapping logic in openai_finish_reason_to_anthropic_stream duplicates
openai_finish_reason_to_anthropic; refactor by centralizing the mapping into a
single function (e.g., keep openai_finish_reason_to_anthropic as the canonical
mapper) and have openai_finish_reason_to_anthropic_stream call that function and
convert/output the result as needed, updating any callers to use the centralized
mapper to avoid drift between openai_finish_reason_to_anthropic and
openai_finish_reason_to_anthropic_stream.
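The centralization the reviewer suggests can be sketched as follows. The mapping values below follow the public OpenAI ("stop", "length", "tool_calls") and Anthropic ("end_turn", "max_tokens", "tool_use") API conventions; the exact signatures and fallback behavior in the crate may differ:

```rust
// Hypothetical sketch: keep one canonical finish-reason mapper and have the
// stream variant delegate to it, so the two paths cannot drift.

fn openai_finish_reason_to_anthropic(reason: &str) -> &'static str {
    match reason {
        "stop" => "end_turn",
        "length" => "max_tokens",
        "tool_calls" => "tool_use",
        // Conservative fallback; the real code may reject unknown reasons.
        _ => "end_turn",
    }
}

/// Stream variant: delegates instead of duplicating the match arms.
fn openai_finish_reason_to_anthropic_stream(reason: &str) -> String {
    openai_finish_reason_to_anthropic(reason).to_string()
}
```

With this shape, a new finish reason only ever needs to be added in one place.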
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/gateway/formats/anthropic_messages.rs`:
- Around line 27-37: Add a rustdoc comment (///) above the public struct
AnthropicBridgeState that briefly explains its role tracking streaming
conversion state for Anthropic messages—e.g., that it records whether a message
has started, the current block index/type/open state, stop reason, token counts,
and tool->block mappings during incremental parsing/streaming. Keep the comment
short (one or two sentences) and mention that the struct is used internally to
manage streaming message assembly and token accounting.
- Around line 728-775: On the first tool delta path (inside the loop over
choice.delta.tool_calls) validate the tool_call.type before emitting a ToolUse
block: check tool_call.type.as_deref() is present and equals the expected
tool-call kind (reject/return GatewayError::Bridge for unsupported/missing
types), then only proceed to call close_current_block, insert into
state.tool_block_map, set current block state and push the
AnthropicStreamEvent::ContentBlockStart with AnthropicContentBlock::ToolUse;
update the code around where tool_id, function and tool_name are read (the
tool_call branch that calls close_current_block and events.push) to perform this
type check first and fail fast for non-tool types.

---

Nitpick comments:
In `@src/gateway/formats/anthropic_messages.rs`:
- Around line 833-840: The mapping logic in
openai_finish_reason_to_anthropic_stream duplicates
openai_finish_reason_to_anthropic; refactor by centralizing the mapping into a
single function (e.g., keep openai_finish_reason_to_anthropic as the canonical
mapper) and have openai_finish_reason_to_anthropic_stream call that function and
convert/output the result as needed, updating any callers to use the centralized
mapper to avoid drift between openai_finish_reason_to_anthropic and
openai_finish_reason_to_anthropic_stream.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: abe234f8-5e62-4812-8097-c0671f003f05

📥 Commits

Reviewing files that changed from the base of the PR and between 96d45f1 and e058c5a.

📒 Files selected for processing (3)
  • docs/internals/llm-gateway.md
  • src/gateway/formats/anthropic_messages.rs
  • src/gateway/gateway.rs
💤 Files with no reviewable changes (1)
  • docs/internals/llm-gateway.md


Copilot AI left a comment


Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/gateway/formats/anthropic_messages.rs`:
- Around line 69-73: The current check only rejects top-level req.cache_control
but still allows per-message cache_control which later gets dropped during
Anthropic→OpenAI conversion; update the validation (where req is inspected) to
also iterate req.messages and return Err(GatewayError::Bridge(...)) if any
message.cache_control.is_some() for non-system roles (i.e., reject per-block
cache_control on user/assistant messages), referencing req.messages,
message.cache_control and the BridgeContext conversion path so the request fails
fast instead of silently dropping caching directives.
- Around line 652-690: The delta handler recomputes input/total tokens from
sparse DeltaUsage, which zeroes omitted cached-token fields and undercounts when
merged; update anthropic_delta_usage_to_common_usage (or adjust
update_native_usage_from_event) so it doesn't rebuild totals from missing
fields: only sum input_tokens/cache_* when those Option values are Some (use
Option::zip/and_then to produce input_tokens and total_tokens only if all
components present), or compute input/total by consulting the existing state
(state.usage) when cache_creation_input_tokens/cache_read_input_tokens are None;
ensure message deltas only supply the fields they contain and avoid replacing
totals with zeros before calling state.usage.merge.

In `@src/gateway/providers/anthropic/transform.rs`:
- Around line 605-622: The function apply_anthropic_delta_usage_to_stream_state
currently zeroes omitted cache counters by using
usage.cache_creation_input_tokens.unwrap_or(0) and unwrap_or(0) on cache_read
when computing state.input_tokens; instead preserve previously-seen cache
counters from state when DeltaUsage omits them. Update the input_tokens branch
in apply_anthropic_delta_usage_to_stream_state to add usage.input_tokens plus
state.cache_creation_input_tokens.unwrap_or(0) and
state.cache_read_input_tokens.unwrap_or(0) (not usage.unwrap_or(0)), and
continue to only overwrite cache_creation_input_tokens and
cache_read_input_tokens when usage provides Some(...) so existing counters are
retained otherwise.
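The fix described above hinges on treating omitted Option fields as "unchanged" rather than zero. A simplified sketch, with stand-in types for the real ChatStreamState and DeltaUsage:

```rust
// Sketch: when a sparse DeltaUsage omits the cache counters, fall back to
// the counters already recorded in the stream state instead of zeroing them.

#[derive(Default)]
pub struct StreamState {
    pub input_tokens: u64,
    pub cache_creation_input_tokens: Option<u64>,
    pub cache_read_input_tokens: Option<u64>,
}

pub struct DeltaUsage {
    pub input_tokens: Option<u64>,
    pub cache_creation_input_tokens: Option<u64>,
    pub cache_read_input_tokens: Option<u64>,
}

pub fn apply_delta_usage(state: &mut StreamState, usage: &DeltaUsage) {
    // Only overwrite the cache counters when the delta actually carries them.
    if usage.cache_creation_input_tokens.is_some() {
        state.cache_creation_input_tokens = usage.cache_creation_input_tokens;
    }
    if usage.cache_read_input_tokens.is_some() {
        state.cache_read_input_tokens = usage.cache_read_input_tokens;
    }
    // Recompute from the (possibly previously seen) state counters, never
    // from unwrap_or(0) on the sparse delta itself.
    if let Some(input) = usage.input_tokens {
        state.input_tokens = input
            + state.cache_creation_input_tokens.unwrap_or(0)
            + state.cache_read_input_tokens.unwrap_or(0);
    }
}
```

With this shape, a delta that reports only input_tokens no longer erases cache counts seen in an earlier MessageStart.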
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 009f91fe-04d5-495e-b05e-ad7d481293dd

📥 Commits

Reviewing files that changed from the base of the PR and between 0cffc53 and e79201d.

📒 Files selected for processing (4)
  • src/gateway/formats/anthropic_messages.rs
  • src/gateway/gateway.rs
  • src/gateway/providers/anthropic/transform.rs
  • src/gateway/types/anthropic.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/gateway/gateway.rs
  • src/gateway/types/anthropic.rs

Copilot AI review requested due to automatic review settings April 7, 2026 15:13

Copilot AI left a comment


Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@bzp2010 bzp2010 merged commit f925e44 into main Apr 7, 2026
10 checks passed
@bzp2010 bzp2010 deleted the bzp/feat-new-provider-anthropic-format-s branch April 7, 2026 17:44