Add provider-specific prompt caching support in owrap.ai.js (OpenAI + Anthropic) #1844

Merged: nmaguiar merged 9 commits into t8 from replicate-1841-t8 on May 22, 2026
@nmaguiar
Collaborator

This PR extends GPT provider handling in js/owrap.ai.js to properly surface prompt-cache token accounting and enable Anthropic prompt caching controls without changing default behavior. Gemini/Ollama behavior remains unchanged aside from documentation clarity.

- OpenAI: capture cache-aware usage stats
  - Extended _captureStats to include nested usage fields when present:
    - usage.prompt_tokens_details.cached_tokens → tokens.cached
    - usage.prompt_tokens_details.audio_tokens → tokens.audio
    - usage.completion_tokens_details.reasoning_tokens → tokens.reasoning

- Anthropic: opt-in prompt caching support
  - Added aOptions.promptCaching (default false).
  - When enabled, both _request and _requestStream send:
    - anthropic-beta: prompt-caching-2024-07-31
  - Extended _captureStats with:
    - usage.cache_creation_input_tokens → tokens.cacheCreation
    - usage.cache_read_input_tokens → tokens.cacheRead

- Anthropic: cache boundary hints in payload
  - When promptCaching is enabled:
    - system is emitted as content blocks with cache_control: { type: "ephemeral" }.
    - The last cacheable user message/block is marked with cache_control: { type: "ephemeral" } (while avoiding tool_result-only blocks).

- ODoc updates (ow.ai.gpt + ``)
  - Documented promptCaching option for Anthropic.
  - Documented that `getLastStats()` now includes OpenAI cached-token accounting and Anthropic cache read/creation token counters.
  - Added note that Gemini may implicitly cache large system instructions.

- Focused AI tests
  - Added tests for:
    - OpenAI cached/audio/reasoning token capture.
    - Anthropic prompt-caching beta header behavior.
    - Anthropic `cache_control` payload shaping and cache stats extraction.

```javascript
if (isMap(aResponse.usage.prompt_tokens_details)) {
  if (isDef(aResponse.usage.prompt_tokens_details.cached_tokens))
    tokens.cached = aResponse.usage.prompt_tokens_details.cached_tokens;
}
if (isMap(aResponse.usage.completion_tokens_details)) {
  if (isDef(aResponse.usage.completion_tokens_details.reasoning_tokens))
    tokens.reasoning = aResponse.usage.completion_tokens_details.reasoning_tokens;
}
```
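
For readers who want a concrete picture of the Anthropic side, the opt-in behavior described in this PR could be sketched roughly as below. This is not the code from js/owrap.ai.js: the helper names `buildAnthropicRequest` and `captureAnthropicStats`, their argument shapes, and the choice to fall back to an earlier user message when the latest one holds only tool_result blocks are all assumptions made for illustration. Only the header value, the `cache_control` shape, and the usage field names come from the description above.

```javascript
// Illustrative sketch only; helper names and argument shapes are hypothetical.

// A content block is treated as cacheable unless it is a tool_result block.
function isCacheable(block) {
  return block !== null && typeof block === "object" && block.type !== "tool_result";
}

// Builds the extra headers and the payload fields affected by prompt caching.
function buildAnthropicRequest(system, messages, options) {
  var promptCaching = Boolean(options && options.promptCaching); // default: false
  var headers = {};
  var body = {
    system: system,
    // Shallow-copy each message so the caller's objects are not mutated.
    messages: messages.map(function (m) { return Object.assign({}, m); })
  };
  if (!promptCaching) return { headers: headers, body: body };

  // Opt-in beta header named in the PR description.
  headers["anthropic-beta"] = "prompt-caching-2024-07-31";

  // Emit a string system prompt as a content block carrying a cache marker.
  if (typeof system === "string") {
    body.system = [{ type: "text", text: system, cache_control: { type: "ephemeral" } }];
  }

  // Scan backwards for the last user message that has a cacheable block.
  for (var i = body.messages.length - 1; i >= 0; i--) {
    var msg = body.messages[i];
    if (msg.role !== "user") continue;
    var blocks = typeof msg.content === "string"
      ? [{ type: "text", text: msg.content }]
      : msg.content.map(function (b) { return Object.assign({}, b); });
    var j = blocks.length - 1;
    while (j >= 0 && !isCacheable(blocks[j])) j--;
    if (j < 0) continue; // tool_result-only message: leave untouched (assumed fallback)
    blocks[j].cache_control = { type: "ephemeral" };
    msg.content = blocks;
    break;
  }
  return { headers: headers, body: body };
}

// Maps Anthropic cache usage counters onto the stats names listed above.
function captureAnthropicStats(usage) {
  var tokens = {};
  if (usage !== null && typeof usage === "object") {
    if (usage.cache_creation_input_tokens !== undefined)
      tokens.cacheCreation = usage.cache_creation_input_tokens;
    if (usage.cache_read_input_tokens !== undefined)
      tokens.cacheRead = usage.cache_read_input_tokens;
  }
  return tokens;
}
```

With `promptCaching` left unset, the sketch returns no extra headers and passes `system` through unchanged, which mirrors the "without changing default behavior" goal stated above.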

nmaguiar and others added 9 commits May 16, 2026 16:48
Update dependencies and enhance documentation for OpenAF
Agent-Logs-Url: https://github.com/OpenAF/openaf/sessions/62a5d58d-75bd-4f94-a820-746c4b047a40

Co-authored-by: nmaguiar <11761746+nmaguiar@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@nmaguiar nmaguiar self-assigned this May 22, 2026
@nmaguiar nmaguiar merged commit 042fac6 into t8 May 22, 2026
1 check passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Continuous Enhancement May 22, 2026
@nmaguiar nmaguiar deleted the replicate-1841-t8 branch May 22, 2026 01:27
2 participants