
[FEATURE]: ~90% of compaction cost is avoidable cache miss #25120

@lloydzhou

Description


Feature hasn't been suggested before.

  • I have verified this feature I'm about to request hasn't been suggested before.

Describe the enhancement you want to request

Why compaction matters

opencode runs long multi-turn agent sessions. As conversation history grows, we periodically compact it — asking the LLM to summarize old messages and replacing them with a concise summary. This is essential for staying within the context window.

The paradox

Compaction is supposed to save tokens. But to summarize the history, we have to re-send that same history to the LLM. The compaction request itself becomes one of the largest requests in the entire session — often 40K–50K tokens of input.

That would be acceptable if the provider's prompt cache helped. But it doesn't.

Why the cache misses

opencode's normal chat requests share a common prefix — system prompt, tool definitions — that hits the provider's prompt cache across turns. This is already working well.

The compaction request, however, builds a completely different structure:

  • No system prompt (compaction agent has none)
  • No tools (empty tool set)
  • Different message history (prior compaction messages filtered out)
  • Different serialization (media stripped, tool output truncated)

Because the request shares zero prefix overlap with normal chat requests, every token is charged at full input price. The cache that was built up over dozens of turns is completely wasted.
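The mismatch above can be made concrete with a small sketch (names and message shapes are illustrative, not opencode's actual API): because provider prompt caches match on an exact serialized prefix, a compaction request that starts with a different first message reuses nothing.

```typescript
type Part = { role: string; content: string };

// A normal chat turn: system prompt + tools + history.
const chatRequest: Part[] = [
  { role: "system", content: "You are opencode..." },  // system prompt
  { role: "system", content: "<tool definitions>" },   // tool defs
  { role: "user", content: "fix the failing test" },   // history...
];

// The compaction request: no system prompt, no tools, re-serialized history.
const compactionRequest: Part[] = [
  { role: "user", content: "fix the failing test (media stripped)" },
  { role: "user", content: "Summarize the conversation above." },
];

// Count how many leading messages are byte-identical — this is the most
// a prefix-based prompt cache could ever reuse.
function sharedPrefix(a: Part[], b: Part[]): number {
  let n = 0;
  while (
    n < a.length && n < b.length &&
    a[n].role === b[n].role && a[n].content === b[n].content
  ) n++;
  return n;
}

console.log(sharedPrefix(chatRequest, compactionRequest)); // → 0
```

With zero shared prefix, every one of the ~45K input tokens is billed at the full rate.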

The cost in numbers

Compacting ~45K tokens of history with Claude Sonnet 4 ($3.00/MTok input, $0.30/MTok cached):

| Segment | Tokens | Full price | If cached |
| --- | --- | --- | --- |
| System prompt | ~2K | $0.006 | $0.0006 |
| Tools | ~3K | $0.009 | $0.0009 |
| Dropped messages | ~40K | $0.120 | $0.012 |
| Summary instruction | ~200 | $0.0006 | $0.0006 (not cached) |
| **Total** | ~45.2K | $0.136 | $0.014 (~90% saved) |

The difference is ~10x. Over a long session with multiple compactions, this adds up — especially ironic for an operation whose entire purpose is cost reduction.
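A back-of-envelope check of the table, using the published Sonnet 4 rates and the token estimates above (segment names and counts are this issue's estimates, not measured values):

```typescript
const FULL = 3.0 / 1_000_000;   // $ per input token
const CACHED = 0.3 / 1_000_000; // $ per cached input token

const segments = [
  { name: "system prompt", tokens: 2_000, cacheable: true },
  { name: "tools", tokens: 3_000, cacheable: true },
  { name: "dropped messages", tokens: 40_000, cacheable: true },
  { name: "summary instruction", tokens: 200, cacheable: false },
];

// Cost if every token is billed at the full input rate.
const full = segments.reduce((sum, s) => sum + s.tokens * FULL, 0);

// Cost if the cacheable prefix is served from the prompt cache.
const cached = segments.reduce(
  (sum, s) => sum + s.tokens * (s.cacheable ? CACHED : FULL), 0);

console.log(full.toFixed(3), cached.toFixed(3)); // 0.136 0.014
```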

The fix is straightforward

The compaction request should share the same prefix as normal chat requests. Instead of building a separate request structure, keep the system prompt, tool definitions, and message serialization identical — only append a short "please summarize" instruction at the end.

Normal chat:   [system] + [tools] + [old messages] + [kept messages] + [user input]
Compaction:    [system] + [tools] + [old messages] + [summary prompt]
                                          ↑ cache hit up to here

The prefix [system] + [tools] + [old messages] was already sent in previous turns. The provider serves it from cache at 10% of the normal price. Only the ~200-token summary instruction is new.
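A minimal sketch of the proposed shape (function and type names are hypothetical, not opencode's actual code): both builders take the same already-serialized prefix, and compaction only appends the summary ask.

```typescript
type Msg = { role: "system" | "user" | "assistant"; content: string };

// Normal chat turn: shared prefix + the new user input.
function buildChatRequest(prefix: Msg[], userInput: string): Msg[] {
  return [...prefix, { role: "user", content: userInput }];
}

// Compaction turn: the *identical* prefix, so the provider serves it from
// cache; only the ~200-token summary instruction is new input.
function buildCompactionRequest(prefix: Msg[]): Msg[] {
  return [...prefix, {
    role: "user",
    content: "Summarize the conversation above into a concise briefing.",
  }];
}
```

The key invariant is that the prefix is reused byte-for-byte — any re-serialization (stripping media, truncating tool output) must happen before the prefix is first sent, or the cache key changes and the hit is lost.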

This is the same approach described in the bash-agent wiki on Cache-Aligned Summarization, where it cuts compaction cost by ~90% in practice.


Labels

  • core: Anything pertaining to core functionality of the application (opencode server stuff)
  • discussion: Used for feature requests, proposals, ideas, etc. Open discussion
  • perf: Indicates a performance issue or need for optimization
