feat: cache tool definitions and add prompt_caching config toggle #78

Merged

emal-avala merged 1 commit into main from feat/provider-prompt-caching on Apr 6, 2026

Conversation

@emal-avala
Member

Summary

Completes prompt caching support by adding cache_control to tool definitions and exposing a user-facing config toggle.

What was already implemented (this PR does NOT change):

  • System prompt cached with cache_control: { type: "ephemeral" }
  • Conversation history breakpoints via messages_to_api_params_cached()
  • Usage struct tracks cache_read_input_tokens and cache_creation_input_tokens
  • /cost command shows cache hit percentage per model
  • CacheTracker detects cache breaks via fingerprinting
  • anthropic-beta: prompt-caching-2024-07-31 header

What this PR adds:

  • Tool definition caching: cache_control: { type: "ephemeral" } on the last tool in the tools array, so the API caches the entire prefix (system + tools) as one block. With 32 tools (~15K tokens), this saves ~$0.003/turn on hits.
  • Config toggle: features.prompt_caching (default: true) in config.toml. Users on providers without caching support can disable it to avoid unknown fields in requests.
  • Wire config into request: query/mod.rs now reads the feature flag instead of hardcoding enable_caching: true.
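The core mechanic described above — marking only the last tool definition so the API treats everything up to and including it as one cacheable prefix — can be sketched as follows. The types here are illustrative stand-ins, not the actual structs in anthropic.rs:

```rust
// Illustrative sketch: mark only the LAST tool definition with an
// ephemeral cache_control so the provider caches the whole prefix
// (system prompt + all tools) as a single block.
#[derive(Debug)]
struct ToolDefinition {
    name: String,
    // Serialized as cache_control: { type: "ephemeral" } when Some.
    cache_control: Option<&'static str>,
}

fn apply_tool_caching(tools: &mut [ToolDefinition], enable_caching: bool) {
    if enable_caching {
        if let Some(last) = tools.last_mut() {
            last.cache_control = Some("ephemeral");
        }
    }
}

fn main() {
    let mut tools = vec![
        ToolDefinition { name: "read_file".into(), cache_control: None },
        ToolDefinition { name: "bash".into(), cache_control: None },
    ];
    apply_tool_caching(&mut tools, true);
    // Only the final tool carries the marker.
    println!("{:?}", tools.last().unwrap().cache_control); // Some("ephemeral")
}
```

Placing the marker on the last tool rather than every tool matters: each cache_control entry is a breakpoint, and one breakpoint at the end of the tools array covers the entire preceding prefix.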

Files changed

| File | Change |
| --- | --- |
| crates/lib/src/llm/anthropic.rs | Add cache_control to last tool definition |
| crates/lib/src/llm/client.rs | Same change in the legacy client path |
| crates/lib/src/config/schema.rs | Add prompt_caching feature flag |
| crates/lib/src/query/mod.rs | Wire feature flag into ProviderRequest |

Implements roadmap item 7.13.

Test plan

  • cargo fmt --all -- --check — clean
  • cargo clippy -- -D warnings — zero warnings
  • cargo test — all tests pass
  • Manual: verify tool definitions include cache_control in API request (debug log)
  • Manual: verify prompt_caching = false in config.toml disables all cache_control markers
  • Manual: verify /cost shows cache hit rate improvement after tools caching
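The toggle exercised in the manual tests above would look like this in config.toml (the features.prompt_caching key comes from the PR description; surrounding schema details are assumed):

```toml
[features]
# Default is true. Set to false for providers that reject unknown
# cache_control fields in requests.
prompt_caching = false
```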

🤖 Generated with Claude Code

Prompt caching for system prompt and conversation history was already
implemented. This adds the missing piece: cache_control on the last
tool definition in the tools array, which lets the API cache the
entire prefix (system prompt + tools) as a single block.

Changes:
- anthropic.rs: Add cache_control: {type: "ephemeral"} to the last
  tool definition when enable_caching is true
- client.rs: Same change in the legacy client path
- schema.rs: Add `prompt_caching` feature flag (default: true) so
  users can disable caching for providers that don't support it
- query/mod.rs: Wire feature flag into ProviderRequest instead of
  hardcoding enable_caching: true

With 32 tool definitions (~15K tokens), this saves ~$0.003/turn on
cache hits. Over a 50-turn session, that's ~$0.15 saved on tools
alone, on top of the existing system prompt and history caching.
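The query/mod.rs change amounts to threading a config value into the request instead of hardcoding it. A minimal sketch, with illustrative struct layouts (ProviderRequest and the feature flag are named in the PR; the actual field shapes are assumptions):

```rust
// Illustrative config and request types; real definitions live in
// schema.rs and the provider layer.
struct Features {
    prompt_caching: bool,
}

impl Default for Features {
    fn default() -> Self {
        // Matches the PR: caching is on unless explicitly disabled.
        Features { prompt_caching: true }
    }
}

struct ProviderRequest {
    enable_caching: bool,
}

fn build_request(features: &Features) -> ProviderRequest {
    // Before this PR: ProviderRequest { enable_caching: true }
    ProviderRequest { enable_caching: features.prompt_caching }
}

fn main() {
    let req = build_request(&Features::default());
    println!("{}", req.enable_caching); // true
}
```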
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@emal-avala emal-avala merged commit 367b759 into main Apr 6, 2026
13 of 14 checks passed
@emal-avala emal-avala deleted the feat/provider-prompt-caching branch April 6, 2026 10:01
emal-avala added a commit that referenced this pull request Apr 6, 2026
PR #78 squash merge accidentally removed the 15 feature specs added by
PR #76. This restores them and applies accuracy corrections from an
audit against the actual codebase:

- 7.13 Provider Prompt Caching: marked Done — system prompt caching,
  message breakpoints, cache tracking, cost display, tool caching,
  and config toggle all implemented
- 7.14 Local LLM Auto-Discovery: marked Partially Done — Ollama
  detection already exists in setup.rs, only LM Studio/llama.cpp
  remaining
- 7.15 Conversation Branching: marked Partially Done — /fork command
  and /resume exist, advanced branching (named branches, checkout,
  merge) still needed
- Updated Contributing section to reflect completed items
