Add body_after_prefix auto-compact token limit scope#22870

Open
jif-oai wants to merge 1 commit into main from jif/more-compat-mech

Conversation

@jif-oai
Collaborator

@jif-oai commented May 15, 2026

Why

model_auto_compact_token_limit has only been able to budget the full active context. That makes it hard to set a small "growth since compaction" budget for sessions that preserve a large carried window prefix: the preserved prefix can consume the whole budget and force immediate repeated compaction.

This PR adds an opt-in body_after_prefix scope so callers can apply model_auto_compact_token_limit to sampled output and later growth after the current carried prefix, while still forcing compaction before the full model context window is exhausted.

What changed

  • Adds AutoCompactTokenLimitScope with the existing total behavior as the default and a new body_after_prefix mode: config_types.rs.
  • Threads model_auto_compact_token_limit_scope through config loading, Config, core-api, and app-server v2 schema/TypeScript generation.
  • Records the first observed input-token count for a body_after_prefix compaction window and uses it as the baseline when deciding whether the scoped auto-compaction budget is exhausted: turn.rs.
  • Keeps a hard context-window cap in body_after_prefix, so scoped budgeting cannot let the active context overrun the usable window.
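As a hedged sketch of the pieces above (the enum name matches the PR; the helper function and its parameters are hypothetical), the scoped budget check might look like:

```rust
/// Mirrors the PR's AutoCompactTokenLimitScope: `Total` keeps the existing
/// behavior and remains the default; `BodyAfterPrefix` is the new opt-in mode.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub enum AutoCompactTokenLimitScope {
    #[default]
    Total,
    BodyAfterPrefix,
}

/// Hypothetical helper: is the auto-compact token budget exhausted?
pub fn token_limit_reached(
    scope: AutoCompactTokenLimitScope,
    limit: u64,
    active_context_tokens: u64,
    window_prefix_tokens: Option<u64>,
) -> bool {
    match scope {
        AutoCompactTokenLimitScope::Total => active_context_tokens >= limit,
        AutoCompactTokenLimitScope::BodyAfterPrefix => {
            // Budget only growth past the recorded prefix baseline; fall back
            // to the current active context when no baseline exists yet.
            let baseline = window_prefix_tokens.unwrap_or(active_context_tokens);
            active_context_tokens.saturating_sub(baseline) >= limit
        }
    }
}
```

Note this sketch covers only the scoped budget; the separate hard context-window cap described in the last bullet is what prevents body_after_prefix from overrunning the usable window.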

Verification

Added compact-suite coverage for the two key behaviors: body_after_prefix does not re-compact just because the carried prefix is larger than the scoped budget, and it still compacts when the total active context reaches the configured context window: compact.rs.
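The two tested behaviors can be condensed into a small decision sketch (names are hypothetical; the real checks live in turn.rs and are exercised in compact.rs):

```rust
/// Hypothetical condensation of the two behaviors under test in
/// body_after_prefix mode: the carried prefix alone never triggers
/// compaction, but the hard context-window cap always does.
fn should_auto_compact(
    active_context_tokens: u64,
    prefix_tokens: u64,
    scoped_limit: u64,
    context_window: u64,
) -> bool {
    let growth_past_prefix = active_context_tokens.saturating_sub(prefix_tokens);
    growth_past_prefix >= scoped_limit || active_context_tokens >= context_window
}
```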

@jif-oai requested a review from a team as a code owner May 15, 2026 17:13
@jif-oai changed the title from "feat: add body_after_prefix" to "Add body_after_prefix auto-compact token limit scope" May 15, 2026
@jif-oai
Collaborator Author

jif-oai commented May 15, 2026

@codex review

Contributor

@chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 973806b1cb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +759 to +762
let baseline = sess
.auto_compact_window_prefix_input_tokens()
.await
.unwrap_or(active_context_tokens);

P2: Count first-turn growth after a carried prefix

When body_after_prefix starts a new compaction window after compaction or resume, this fallback makes the current active context the baseline whenever no prefix has been recorded yet. That means any large user/developer input added after the carried prefix but before the first model response is budgeted as zero, and the later ensure_auto_compact_window_prefix_input_tokens() call records an input_tokens baseline that already includes that growth, so a small scoped limit can be bypassed until the full context-window cap is hit.

Useful? React with 👍 / 👎.
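The gap described above can be sketched numerically (hypothetical helper; the real code is the `unwrap_or` fallback quoted in the diff):

```rust
/// Hypothetical reproduction of the fallback: with no recorded prefix, the
/// baseline tracks the current active context, so growth that arrived before
/// the first model response is budgeted as zero.
fn scoped_growth(active_context_tokens: u64, recorded_prefix_tokens: Option<u64>) -> u64 {
    let baseline = recorded_prefix_tokens.unwrap_or(active_context_tokens);
    active_context_tokens.saturating_sub(baseline)
}
```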


/// Controls whether `model_auto_compact_token_limit` applies to the full
/// active context or only tokens after the carried compaction-window prefix.
pub model_auto_compact_token_limit_scope: AutoCompactTokenLimitScope,

P2: Add compaction scope to guardian reuse key

Adding this behavior-affecting Config field without also including it in GuardianReviewSessionReuseKey::from_spawn_config lets cached guardian review sessions be reused when the parent config changes only between total and body_after_prefix. In that scenario the spawned review session keeps the stale auto-compaction behavior even though build_guardian_review_session_config clones the updated config for each review request, so the new field should participate in the reuse key just like model_auto_compact_token_limit.
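A hedged sketch of the suggested fix (the real field set lives in GuardianReviewSessionReuseKey::from_spawn_config; the struct shown here is illustrative):

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, Hash)]
enum AutoCompactTokenLimitScope {
    #[default]
    Total,
    BodyAfterPrefix,
}

/// Illustrative reuse key: both the limit and its scope participate, so a
/// cached guardian review session is not reused across a scope-only change.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct GuardianReviewSessionReuseKey {
    model_auto_compact_token_limit: Option<i64>,
    model_auto_compact_token_limit_scope: AutoCompactTokenLimitScope,
    // ...other behavior-affecting fields elided
}
```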


Comment on lines +841 to +842
let token_status = auto_compact_token_status(sess.as_ref(), turn_context.as_ref()).await;
let should_run = token_status.token_limit_reached

P2: Reserve previous-model compaction for downshifts

In body_after_prefix mode during a switch to a smaller-context model, token_status.token_limit_reached can be true solely because the scoped growth budget is exhausted while the full active context still fits the new model. This sends the pre-sampling compaction through previous_model_turn_context, so the request goes to the old model/provider instead of the newly requested one; the previous-model path should stay limited to actual new-window overflow/downshift cases.
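The routing concern can be sketched as follows (all names hypothetical; the real logic threads through previous_model_turn_context): only genuine overflow of the new model's window should use the previous model's path, while a scoped-budget hit that still fits the new window should compact via the newly requested model.

```rust
#[derive(Debug, PartialEq)]
enum CompactionPath {
    PreviousModel,
    CurrentModel,
    NoCompaction,
}

/// Hypothetical pre-sampling routing during a switch to a smaller-context
/// model: reserve the previous-model path for real window overflow.
fn compaction_path(
    scoped_limit_reached: bool,
    active_context_tokens: u64,
    new_model_window: u64,
) -> CompactionPath {
    if active_context_tokens >= new_model_window {
        // Genuine downshift overflow: the old model must do the compaction.
        CompactionPath::PreviousModel
    } else if scoped_limit_reached {
        // Scoped growth budget exhausted but the context fits the new model:
        // compact via the newly requested model/provider.
        CompactionPath::CurrentModel
    } else {
        CompactionPath::NoCompaction
    }
}
```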

