[FEATURE]: Anthropic (and others) caching improvement #5416

@ormandj

Description

Feature hasn't been suggested before.

  • I have verified this feature I'm about to request hasn't been suggested before.

Describe the enhancement you want to request

I recently started using opencode and noticed that my token usage was noticeably higher than with the same general workflow in Claude Code. After a little research, I determined that the cache model and prompt structure being used were suboptimal for Claude-based models.
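For background, Anthropic's prompt caching works by marking a breakpoint in the request with `cache_control`; everything up to and including the marked block is eligible for caching, so large, stable content (system instructions, tool definitions) should come first. Below is a minimal sketch of the request shape (the model id and instruction text are placeholders, not values from the PR):

```ts
// Sketch of an Anthropic Messages API request with a cache breakpoint.
// Subsequent requests that share the exact prefix up to the breakpoint
// are billed at the cheaper cache-read rate.
const LARGE_STATIC_INSTRUCTIONS = "...long, stable agent instructions...";

const request = {
  model: "claude-sonnet-4-20250514", // placeholder model id
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: LARGE_STATIC_INSTRUCTIONS, // stable content goes first
      cache_control: { type: "ephemeral" }, // cache everything up to here
    },
  ],
  messages: [{ role: "user", content: "Review this diff..." }],
};
```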

I've submitted a PR that attempts to address this and allows configuration at both the provider and per-agent level. Some of my workstreams run for long periods with gaps in between runs for certain types of agents (for example, review agents may not run frequently, but when they do, they use a large amount of static context for their instructions), so allowing the TTL to be overridden at the agent level made sense to me; a sketch of that configuration shape follows below.
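As an illustration of the provider-default-plus-agent-override idea (the option names here are illustrative, not the exact ones from the PR), Anthropic currently offers a short 5-minute cache TTL and a longer 1-hour TTL, so the config might look roughly like:

```ts
// Illustrative shape only; the PR's actual option names may differ.
interface CacheConfig {
  ttl?: "5m" | "1h"; // Anthropic's current short and extended cache TTLs
}

interface AgentConfig {
  // Per-agent override, e.g. "1h" for a rarely-run review agent that
  // carries a lot of static instruction context between runs.
  cache?: CacheConfig;
}

interface ProviderConfig {
  cache?: CacheConfig; // provider-wide default
  agents?: Record<string, AgentConfig>;
}

const config: ProviderConfig = {
  cache: { ttl: "5m" },
  agents: {
    review: { cache: { ttl: "1h" } },
  },
};
```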

I did some basic testing with the patch, and it made a significant difference in cached vs. non-cached usage, which, given Claude's pricing, can make a huge difference in the cost of using these LLMs. Unfortunately, the minimum cacheable prompt size isn't available programmatically, so I had to build a lookup table for the various models. Basic performance/cache testing results are in the PR.
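The lookup table could look something like the sketch below. The thresholds reflect Anthropic's documented minimum cacheable prompt lengths at the time of writing; the PR's actual table and model-id matching may differ:

```ts
// Minimum cacheable prompt lengths (in tokens) per model family.
// Prompts below the threshold are sent uncached even if a
// cache_control breakpoint is present.
const MIN_CACHEABLE_TOKENS: Record<string, number> = {
  "claude-3-5-sonnet": 1024,
  "claude-3-opus": 1024,
  "claude-3-5-haiku": 2048,
  "claude-3-haiku": 2048,
};

// Resolve the threshold by longest matching model-id prefix; fall back
// conservatively to 2048 tokens for unknown models.
function minCacheableTokens(modelId: string): number {
  const match = Object.keys(MIN_CACHEABLE_TOKENS)
    .filter((prefix) => modelId.startsWith(prefix))
    .sort((a, b) => b.length - a.length)[0];
  return match ? MIN_CACHEABLE_TOKENS[match] : 2048;
}
```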

I tried to create the PR in a way that wouldn't negatively impact other models/providers, but could also serve as a starting point for other models/providers that have specific cache implementation requirements. Later, this model could be extended to configuration beyond caching.
