Add body_after_prefix auto-compact token limit scope #22870
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 973806b1cb
let baseline = sess
    .auto_compact_window_prefix_input_tokens()
    .await
    .unwrap_or(active_context_tokens);
Count first-turn growth after a carried prefix
When body_after_prefix starts a new compaction window after compaction or resume, this fallback makes the current active context the baseline whenever no prefix has been recorded yet. That means any large user/developer input added after the carried prefix but before the first model response is budgeted as zero, and the later ensure_auto_compact_window_prefix_input_tokens() call records an input_tokens baseline that already includes that growth, so a small scoped limit can be bypassed until the full context-window cap is hit.
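The effect described in this comment can be illustrated with a minimal sketch. The function names and the `window_start_tokens` parameter are hypothetical and not taken from the PR; the second function is only one assumed way to avoid the problem, not the author's fix.

```rust
/// Growth as computed with the fallback from the quoted snippet: when no
/// prefix has been recorded yet, the current active context becomes the
/// baseline, so everything added so far is budgeted as zero.
pub fn growth_with_fallback(recorded_prefix: Option<u64>, active_context_tokens: u64) -> u64 {
    let baseline = recorded_prefix.unwrap_or(active_context_tokens);
    active_context_tokens.saturating_sub(baseline)
}

/// Hypothetical alternative: fall back to the token count captured when the
/// compaction window started (an assumed value, not a field from the PR), so
/// input added before the first model response still counts as growth.
pub fn growth_with_window_start(
    recorded_prefix: Option<u64>,
    window_start_tokens: u64,
    active_context_tokens: u64,
) -> u64 {
    let baseline = recorded_prefix.unwrap_or(window_start_tokens);
    active_context_tokens.saturating_sub(baseline)
}
```

With a 40,000-token carried prefix and 50,000 tokens of new input before the first response, the first function reports zero growth while the second reports 50,000. Once a prefix has been recorded, both agree.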
/// Controls whether `model_auto_compact_token_limit` applies to the full
/// active context or only tokens after the carried compaction-window prefix.
pub model_auto_compact_token_limit_scope: AutoCompactTokenLimitScope,
Add compaction scope to guardian reuse key
Adding this behavior-affecting Config field without also including it in GuardianReviewSessionReuseKey::from_spawn_config lets cached guardian review sessions be reused when the parent config changes only between total and body_after_prefix. In that scenario the spawned review session keeps the stale auto-compaction behavior even though build_guardian_review_session_config clones the updated config for each review request, so the new field should participate in the reuse key just like model_auto_compact_token_limit.
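A minimal sketch of the suggestion, assuming the reuse key is a plain struct compared by equality. `SpawnConfig`, `ReuseKey`, and `Scope` are simplified stand-ins invented for illustration; the real `Config` and `GuardianReviewSessionReuseKey` carry many more fields.

```rust
/// Simplified stand-in for the scope enum named in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Scope {
    Total,
    BodyAfterPrefix,
}

/// Hypothetical subset of the spawn config that affects compaction.
#[derive(Debug, Clone)]
pub struct SpawnConfig {
    pub model_auto_compact_token_limit: Option<i64>,
    pub model_auto_compact_token_limit_scope: Scope,
}

/// Hypothetical reuse key: two configs may share a cached session only
/// when their keys compare equal.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ReuseKey {
    auto_compact_token_limit: Option<i64>,
    auto_compact_token_limit_scope: Scope,
}

impl ReuseKey {
    pub fn from_spawn_config(config: &SpawnConfig) -> Self {
        Self {
            auto_compact_token_limit: config.model_auto_compact_token_limit,
            // Including the scope means a change from total to
            // body_after_prefix produces a different key.
            auto_compact_token_limit_scope: config.model_auto_compact_token_limit_scope,
        }
    }
}
```

Under this sketch, two configs that differ only in scope yield unequal keys, so a cached session built for one would not be reused for the other.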
let token_status = auto_compact_token_status(sess.as_ref(), turn_context.as_ref()).await;
let should_run = token_status.token_limit_reached
Reserve previous-model compaction for downshifts
In body_after_prefix mode during a switch to a smaller-context model, token_status.token_limit_reached can be true solely because the scoped growth budget is exhausted while the full active context still fits the new model. This sends the pre-sampling compaction through previous_model_turn_context, so the request goes to the old model/provider instead of the newly requested one; the previous-model path should stay limited to actual new-window overflow/downshift cases.
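The routing the reviewer asks for can be sketched as follows. All names here (`TokenStatus`, `CompactionRoute`, `pre_sampling_route`) are hypothetical, and sending scoped-budget compaction to the current model is an assumption about the intended behavior rather than something the PR states.

```rust
/// Hypothetical snapshot of token accounting before sampling.
#[derive(Debug, Clone, Copy)]
pub struct TokenStatus {
    /// True when the scoped growth budget is exhausted.
    pub scoped_limit_reached: bool,
    /// Total tokens in the active context, carried prefix included.
    pub active_context_tokens: u64,
}

/// Where a pre-sampling compaction request should be sent, if anywhere.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompactionRoute {
    Skip,
    CurrentModel,
    PreviousModel,
}

pub fn pre_sampling_route(
    status: TokenStatus,
    model_switched: bool,
    new_model_context_window: u64,
) -> CompactionRoute {
    let overflows_new_window = status.active_context_tokens >= new_model_context_window;
    if model_switched && overflows_new_window {
        // Real downshift: the context no longer fits the new model, so the
        // previous model has to do the compaction.
        CompactionRoute::PreviousModel
    } else if status.scoped_limit_reached || overflows_new_window {
        // Budget exhausted but the context still fits (or no switch
        // happened): assumed to compact with the requested model.
        CompactionRoute::CurrentModel
    } else {
        CompactionRoute::Skip
    }
}
```

The point of the separation is that an exhausted scoped budget alone never selects the previous model; only an actual overflow of the new model's window does.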
Why
`model_auto_compact_token_limit` has only been able to budget the full active context. That makes it hard to set a small "growth since compaction" budget for sessions that preserve a large carried window prefix: the preserved prefix can consume the whole budget and force immediate repeated compaction.
This PR adds an opt-in `body_after_prefix` scope so callers can apply `model_auto_compact_token_limit` to sampled output and later growth after the current carried prefix, while still forcing compaction before the full model context window is exhausted.
What changed
- Adds `AutoCompactTokenLimitScope` with the existing `total` behavior as the default and a new `body_after_prefix` mode: config_types.rs.
- Threads `model_auto_compact_token_limit_scope` through config loading, `Config`, `core-api`, and app-server v2 schema/TypeScript generation.
- Records the prefix input tokens for the `body_after_prefix` compaction window and uses it as the baseline when deciding whether the scoped auto-compaction budget is exhausted: turn.rs.
- Keeps the full context-window limit in force under `body_after_prefix`, so scoped budgeting cannot let the active context overrun the usable window.
Verification
Added compact-suite coverage for the two key behaviors:
- `body_after_prefix` does not re-compact just because the carried prefix is larger than the scoped budget, and it still compacts when the total active context reaches the configured context window: compact.rs.
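The two scopes and the hard cap described above can be summarized in a small sketch. This is an illustration under stated assumptions, not the code from config_types.rs or turn.rs: the enum mirrors the name used in the PR, while `should_auto_compact` and its parameters are hypothetical.

```rust
/// Illustrative mirror of the scope enum; `Total` is the default, matching
/// the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum AutoCompactTokenLimitScope {
    #[default]
    Total,
    BodyAfterPrefix,
}

/// Hypothetical decision function: returns true when auto-compaction
/// should run for the given token counts.
pub fn should_auto_compact(
    scope: AutoCompactTokenLimitScope,
    active_context_tokens: u64,
    prefix_input_tokens: u64,
    token_limit: u64,
    context_window: u64,
) -> bool {
    // The full context window is a hard cap in both scopes.
    if active_context_tokens >= context_window {
        return true;
    }
    let budgeted_tokens = match scope {
        // Existing behavior: the whole active context counts.
        AutoCompactTokenLimitScope::Total => active_context_tokens,
        // New behavior: only growth after the carried prefix counts.
        AutoCompactTokenLimitScope::BodyAfterPrefix => {
            active_context_tokens.saturating_sub(prefix_input_tokens)
        }
    };
    budgeted_tokens >= token_limit
}
```

For example, with a 40,000-token carried prefix, a 10,000-token limit, and 45,000 active tokens, the `Total` scope compacts immediately while `BodyAfterPrefix` does not, because only 5,000 tokens of growth have accumulated.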