Overview
Implement AnthropicModelInterface — the real provider implementation of the ModelInterface trait that makes actual Anthropic API calls. Also implements AnthropicCacheProvider as a natural companion since both are Anthropic-specific and share the same API client.

This is a leaf-node implementation — it only depends on the ModelInterface and CacheProvider traits being defined (#1, #25).

When to Build

Implement before ContextManager (#7). Accurate token counting stops being an approximation and starts being a correctness issue when compaction decisions, cache block hashing, and budget tracking all depend on it. The bytes / 4 heuristic in ReplayModelInterface must be replaced with the real Anthropic token counting API before #7 is implemented.

Also unblocks:
AnthropicModelInterface
Constructor
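As a rough illustration of the shape this could take, the sketch below assumes the constructor receives an API key and model identifier and builds a shared reqwest client; the field names, defaults, and timeout are assumptions for illustration, not part of this spec.

```rust
use std::time::Duration;

pub struct AnthropicModelInterface {
    api_key: String,
    model: String,
    base_url: String,
    http: reqwest::Client,
}

impl AnthropicModelInterface {
    // Hypothetical constructor shape; the actual parameter set is up to the implementation.
    pub fn new(api_key: impl Into<String>, model: impl Into<String>) -> Self {
        Self {
            api_key: api_key.into(),
            model: model.into(),
            base_url: "https://api.anthropic.com".to_string(),
            http: reqwest::Client::builder()
                .timeout(Duration::from_secs(120))
                .build()
                .expect("building the HTTP client should not fail"),
        }
    }
}
```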
ModelInterface implementation
call(request)
ModelRequest → Anthropic Messages API JSON body, POSTed to https://api.anthropic.com/v1/messages
ModelResponse built from the reply, with TokenUsage taken from the response usage field
Maps failures to ModelError variants:
ModelError::RateLimited { retry_after }
ModelError::RateLimited { retry_after: None }
ModelError::Timeout
ModelError::ProviderError { code, message }
Retries RateLimited and Timeout with exponential backoff (handled internally, caller never sees these)
ContextLimitExceeded checked before the API call using count_tokens
BudgetExceeded checked before the API call using request.params.max_tokens

call_streaming(request, on_token)
Sends the same request with stream: true and handles the SSE events (see the dispatch sketch after this list):
message_start → extract input token count
content_block_start { type: "text" } → begin text block
content_block_start { type: "thinking" } → begin thinking block
content_block_start { type: "tool_use" } → begin tool use block
content_block_delta { type: "text_delta" } → fire StreamEvent::TextDelta
content_block_delta { type: "thinking_delta" } → fire StreamEvent::ThinkingDelta
content_block_delta { type: "input_json_delta" } → fire StreamEvent::ToolCallDelta
content_block_stop → close current block
message_delta { stop_reason, usage } → extract output tokens
message_stop → fire StreamEvent::Done, return complete ModelResponse

count_tokens(request)
Calls https://api.anthropic.com/v1/messages/count_tokens
Replaces the bytes / 4 heuristic used in ReplayModelInterface

provider()
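A minimal sketch of that event dispatch, operating on already-parsed SSE payloads. The StreamEvent variants and callback shape are inferred from the names used in this issue; block bookkeeping, usage accounting, and the SSE transport itself are elided.

```rust
use serde_json::Value;

#[derive(Debug)]
pub enum StreamEvent {
    TextDelta(String),
    ThinkingDelta(String),
    ToolCallDelta(String),
    Done,
}

// Dispatch one parsed SSE payload to the streaming callback.
fn dispatch_sse_event(event: &Value, on_token: &mut dyn FnMut(StreamEvent)) {
    match event["type"].as_str() {
        Some("content_block_delta") => {
            let delta = &event["delta"];
            match delta["type"].as_str() {
                Some("text_delta") => on_token(StreamEvent::TextDelta(
                    delta["text"].as_str().unwrap_or_default().to_string(),
                )),
                Some("thinking_delta") => on_token(StreamEvent::ThinkingDelta(
                    delta["thinking"].as_str().unwrap_or_default().to_string(),
                )),
                Some("input_json_delta") => on_token(StreamEvent::ToolCallDelta(
                    delta["partial_json"].as_str().unwrap_or_default().to_string(),
                )),
                _ => {}
            }
        }
        Some("message_stop") => on_token(StreamEvent::Done),
        // message_start, content_block_start/stop, and message_delta update internal
        // state (current block, token counts) rather than firing a delta event.
        _ => {}
    }
}
```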
Implements CacheProvider for the Anthropic prefix caching API.

annotate(context)
Sets cache_control: { type: "ephemeral" } on the last message/block in each cache block

parse_cache_stats(response)
Reads the response usage field:
cache_read_input_tokens → CacheStats.cache_read_tokens
cache_creation_input_tokens → CacheStats.cache_write_tokens
Compute costs using Anthropic's published cache pricing for the model

supports_caching() → true
provider_name() → "anthropic"
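To make the two methods above concrete, here is a small sketch of the cache-control marker and the usage parsing. The CacheStats struct and the JSON handling are assumptions based on the field names in this issue, not a prescribed implementation.

```rust
use serde_json::{json, Value};

#[derive(Debug, Default)]
pub struct CacheStats {
    pub cache_read_tokens: u64,
    pub cache_write_tokens: u64,
}

// Stamp the ephemeral cache_control marker onto a content block (the last
// block of a cache block, as described above).
fn mark_cache_boundary(block: &mut Value) {
    block["cache_control"] = json!({ "type": "ephemeral" });
}

// Pull cache read/write counts out of the response usage object.
fn parse_cache_stats(response: &Value) -> CacheStats {
    let usage = &response["usage"];
    CacheStats {
        cache_read_tokens: usage["cache_read_input_tokens"].as_u64().unwrap_or(0),
        cache_write_tokens: usage["cache_creation_input_tokens"].as_u64().unwrap_or(0),
    }
}
```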
Implementation Notes

All four languages must implement both AnthropicModelInterface and AnthropicCacheProvider
HTTP client: use the language's standard async HTTP library (reqwest in Rust, fetch in TypeScript, httpx in Python, net/http in Go)
API key must never appear in logs or traces — redact in ObservabilityProvider spans
The AnthropicModelInterface is not a mock — integration tests that use it make real API calls and are tagged accordingly (Level 3 tests per Decision: Testing Strategy #20, excluded from default CI)
Retry logic lives inside the implementation, not in the harness loop — the caller never sees RateLimited from a transient 429 (see the backoff sketch after this list)
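One way the internal retry policy could look, assuming the Rust variant of the ModelError enum named above; the attempt count, base delay, and field types are illustrative, not mandated by this issue.

```rust
use std::time::Duration;

// Minimal mirror of the error variants named in this issue; field types are assumed.
#[derive(Debug)]
pub enum ModelError {
    RateLimited { retry_after: Option<Duration> },
    Timeout,
    ProviderError { code: u16, message: String },
}

// Retry a request closure on transient errors with exponential backoff,
// so callers never observe RateLimited or Timeout from a transient failure.
async fn with_backoff<T, Fut>(mut attempt: impl FnMut() -> Fut) -> Result<T, ModelError>
where
    Fut: std::future::Future<Output = Result<T, ModelError>>,
{
    let mut delay = Duration::from_millis(500);
    for _ in 0..5 {
        match attempt().await {
            // Honor the server's Retry-After when present, otherwise back off exponentially.
            Err(ModelError::RateLimited { retry_after }) => {
                tokio::time::sleep(retry_after.unwrap_or(delay)).await;
                delay *= 2;
            }
            Err(ModelError::Timeout) => {
                tokio::time::sleep(delay).await;
                delay *= 2;
            }
            other => return other,
        }
    }
    // Final attempt: whatever happens now is surfaced to the caller.
    attempt().await
}
```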
Test Structure

Unit tests (no real API calls):
Request serialization: verify ModelRequest translates to correct Anthropic JSON body
Response deserialization: verify Anthropic JSON response translates to correct ModelResponse
Error mapping: verify each HTTP error code maps to the correct ModelError variant
Cache annotation: verify AnthropicCacheProvider.annotate() inserts markers in the right positions
Cache stats parsing: verify parse_cache_stats() extracts correct values from mock response JSON (example after this list)
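For instance, the cache-stats parsing test can be written entirely against a mock response body, reusing the parse_cache_stats sketch above; the field values here are arbitrary.

```rust
#[cfg(test)]
mod tests {
    use serde_json::json;

    #[test]
    fn parse_cache_stats_reads_usage_fields() {
        // Mock Messages API response body with cache usage counters.
        let response = json!({
            "usage": {
                "input_tokens": 12,
                "output_tokens": 34,
                "cache_read_input_tokens": 1024,
                "cache_creation_input_tokens": 2048
            }
        });
        let stats = super::parse_cache_stats(&response);
        assert_eq!(stats.cache_read_tokens, 1024);
        assert_eq!(stats.cache_write_tokens, 2048);
    }
}
```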
Integration tests (real API, Level 3, tagged #[ignore] / skipped by default):

call() — verifies the response shape and token usage fields
call_streaming() — verifies all SSE event types are handled
count_tokens() — verifies it returns a non-zero count (sketch after this list)
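A sketch of one such Level 3 test: #[ignore] keeps it out of default CI runs, and it only executes via `cargo test -- --ignored` with a real key in the environment. The constructor arguments, model name, and the simple_text_request helper are illustrative assumptions.

```rust
#[tokio::test]
#[ignore = "Level 3: makes a real Anthropic API call"]
async fn count_tokens_returns_nonzero_count() {
    let api_key = std::env::var("ANTHROPIC_API_KEY").expect("ANTHROPIC_API_KEY must be set");
    let model = AnthropicModelInterface::new(api_key, "claude-sonnet-4-5");
    let request = simple_text_request("Say hello."); // hypothetical test helper
    let tokens = model.count_tokens(&request).await.expect("count_tokens call failed");
    assert!(tokens > 0);
}
```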
Checklist

Rust: AnthropicModelInterface + AnthropicCacheProvider
TypeScript: AnthropicModelInterface + AnthropicCacheProvider
Python: AnthropicModelInterface + AnthropicCacheProvider
Go: AnthropicModelInterface + AnthropicCacheProvider
count_tokens heuristic replaced in ReplayModelInterface
fixtures/model_responses/model_interface/basic_text.jsonl regenerated by recording against the real API

Related Issues