fix(core): apply deferred summaries when count reaches cutoff by bug-ops · Pull Request #1306 · bug-ops/zeph

bug-ops · 2026-03-06T22:34:38Z

Problem

Deferred tool pair summaries (introduced in #1303) were never applied in practice, causing tool outputs to accumulate as [pruned] content instead of being replaced by summaries.

Root cause: prepare_context calls recompute_prompt_tokens() at the end of every turn, resetting cached_prompt_tokens to the actual post-pruning value. Since pruning keeps the actual token count low, the token-based trigger (cached_prompt_tokens > budget * 0.70) was never satisfied. Deferred summaries accumulated indefinitely in message metadata without ever being applied.
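The root cause can be illustrated with a minimal sketch. The function names `token_trigger_sketch` and `turns_triggered` and the numbers used are hypothetical, not taken from zeph-core; only the 70% threshold and the `cached_prompt_tokens` name come from the description above.

```rust
/// Hypothetical stand-in for the token-based trigger described above.
/// Integer form of `cached_prompt_tokens > budget * 0.70`.
fn token_trigger_sketch(cached_prompt_tokens: u64, budget: u64) -> bool {
    cached_prompt_tokens * 10 > budget * 7
}

/// Counts how many of `turns` turns would fire the trigger, assuming
/// pruning holds the recomputed token count at `post_pruning_tokens`
/// at the end of every turn (an assumption made for illustration).
fn turns_triggered(turns: usize, post_pruning_tokens: u64, budget: u64) -> usize {
    (0..turns)
        .filter(|_| token_trigger_sketch(post_pruning_tokens, budget))
        .count()
}
```

With an assumed budget of 100,000 tokens and a post-pruning count of 20,000, the trigger fires on none of the turns, no matter how many deferred summaries are waiting.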

Observable symptoms (visible in --debug-dump output):

- Request sizes shrink while message count grows (pruning active, no summarization)
- All old tool results contain [pruned]; no [tool summary] messages appear
- Message count grows without bound

Fix

Add a count-based fallback in maybe_apply_deferred_summaries: apply all pending deferred summaries when pending >= tool_call_cutoff (default 6), regardless of token count.

Rationale: once prune_stale_tool_outputs has already replaced content with [pruned], the cache prefix is already invalidated — there is no benefit to deferring summary application further.
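The count-based fallback might look roughly like the sketch below. This is not the actual `maybe_apply_deferred_summaries` implementation: the `Message` struct, its fields, and `apply_deferred_sketch` are hypothetical, and only the trigger conditions (70% of budget, or pending >= `tool_call_cutoff`) follow the description above.

```rust
/// Hypothetical message model with only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    content: String,
    /// Summary computed earlier but not yet applied.
    deferred_summary: Option<String>,
}

/// Applies every pending deferred summary when either trigger fires.
/// Returns the number of summaries applied.
fn apply_deferred_sketch(
    messages: &mut [Message],
    cached_prompt_tokens: u64,
    budget: u64,
    tool_call_cutoff: usize,
) -> usize {
    let pending = messages
        .iter()
        .filter(|m| m.deferred_summary.is_some())
        .count();

    // Token-based trigger: integer form of `cached > budget * 0.70`.
    let token_pressure = cached_prompt_tokens * 10 > budget * 7;
    // Count-based fallback: fires regardless of token count.
    let count_reached = pending >= tool_call_cutoff;

    if pending == 0 || !(token_pressure || count_reached) {
        return 0;
    }

    for message in messages.iter_mut() {
        // `take()` clears the pending summary so it is applied only once.
        if let Some(summary) = message.deferred_summary.take() {
            message.content = format!("[tool summary] {}", summary);
        }
    }
    pending
}
```

In this sketch, six pending summaries are applied even when the cached token count is far below the threshold, while five pending summaries are left deferred.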

Test plan

- New test tier0_count_trigger_fires_without_budget_pressure: verifies that 6 deferred summaries are applied when cached_prompt_tokens is well below the 70% budget threshold
- Existing test tier0_does_not_set_compacted_this_turn unchanged
- cargo nextest run --workspace --features full --lib --bins: 4436 passed

Deferred tool pair summaries were never applied in practice. After prepare_context calls recompute_prompt_tokens() at the end of each turn, cached_prompt_tokens reflects the actual post-pruning value (low), so should_apply_deferred (70% of budget) was never triggered. Content was silently replaced with [pruned] instead of [tool summary].

Add a count-based fallback: apply deferred summaries when the number of pending summaries reaches tool_call_cutoff (default 6). Once prune_stale_tool_outputs has already invalidated the cache prefix, there is no benefit to further deferring application.

Add regression test: tier0_count_trigger_fires_without_budget_pressure

github-actions bot added the documentation, rust, core, bug, and size/M labels Mar 6, 2026

bug-ops merged commit 3c8a3db into main Mar 6, 2026
25 checks passed

bug-ops deleted the fix/deferred-summary-count-trigger branch March 6, 2026 22:47

bug-ops merged 1 commit into main from fix/deferred-summary-count-trigger
