v0.5.0
- Native SSE streaming for Anthropic & OpenAI: the single-shot fallback is
gone. Text/thinking deltas stream as partial chunks, tool-call arguments are
reassembled across fragments, and the final chunk carries the stop reason and
usage. All three providers now share one chunk contract.
- Multimodal input for Anthropic & OpenAI: inline images map to base64
`image` blocks / `data:` URI `image_url` parts, inline PDFs to Anthropic
`document` blocks, `https://` file references to URL sources. Unsupported
parts are dropped with a warning, never silently.
- Anthropic prompt caching: the same
`ContextCacheConfig` that drives
Gemini's explicit cache becomes a `cache_control` breakpoint on the system
block (or last tool); `ttl_seconds >= 3600` selects the 1-hour tier. Cache
activity surfaces via `cache_metadata` and `cached_content_token_count`.
- Reasoning-model fix:
`max_output_tokens` is sent as
`max_completion_tokens` for o-series / gpt-5 models (which reject the
deprecated `max_tokens`), keeping older OpenAI-compatible servers on the
legacy field.
- `examples/compat_check`: live wire-compatibility smoke test against the
real Anthropic and OpenAI APIs (generation, streaming, tools, structured
output, images, caching). Run it with your keys; exits non-zero on failure.
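
Two of the request-side mappings above (the token-limit field and the cache tier) can be illustrated with a small language-neutral sketch, written here in Python. It is not the library's implementation. The function names, the `REASONING_PREFIXES` list, and the prefix-based model detection are all assumptions made for illustration. The only details taken from these notes are the field names `max_completion_tokens` / `max_tokens`, the `cache_control` breakpoint, and the `ttl_seconds >= 3600` threshold.

```python
# Hypothetical sketch of two request mappings described in these notes.
# Assumption: reasoning models are recognised by name prefix; the actual
# detection rule used by the release is not shown here.
REASONING_PREFIXES = ("o1", "o3", "o4", "gpt-5")


def token_limit_field(model: str) -> str:
    """Pick the OpenAI request field that should carry max_output_tokens."""
    if model.startswith(REASONING_PREFIXES):
        return "max_completion_tokens"
    # Older OpenAI-compatible servers stay on the legacy field.
    return "max_tokens"


def cache_control(ttl_seconds: int) -> dict:
    """Build an Anthropic cache_control marker from a TTL in seconds."""
    marker = {"type": "ephemeral"}  # default (short) cache tier
    if ttl_seconds >= 3600:
        marker["ttl"] = "1h"  # 1-hour tier
    return marker


def system_block(text: str, ttl_seconds: int) -> list:
    """Attach the cache breakpoint to a single system text block."""
    return [{"type": "text", "text": text,
             "cache_control": cache_control(ttl_seconds)}]


if __name__ == "__main__":
    print(token_limit_field("o3-mini"))  # max_completion_tokens
    print(token_limit_field("gpt-4o"))   # max_tokens
    print(cache_control(300))            # {'type': 'ephemeral'}
    print(cache_control(3600))           # {'type': 'ephemeral', 'ttl': '1h'}
```

A prefix check keeps the sketch short. A real client would more likely consult a model capability table, since OpenAI-compatible servers may reuse similar model names.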
Full Changelog: v0.4.0...v0.5.0