Add the utility to truncate by tokens #6746
Conversation
@codex review this
💡 Codex Review
codex-rs/core/src/truncate.rs, lines 114 to 117 in d1d0644:

```rust
if omitted_text_items > 0 {
    out.push(FunctionCallOutputContentItem::InputText {
        text: format!("[omitted {omitted_text_items} text items ...]"),
    });
```
The new token budget helper for structured tool output can still exceed the caller-provided limit. In `truncate_function_output_items_to_token_limit`, once we run out of `remaining_tokens` we increment `omitted_text_items`, but after the loop we always append the `[omitted …]` summary string (lines 114-117). That summary itself consumes several tokens, yet we add it even when `remaining_tokens` is zero, so the returned `content_items` can exceed the configured cap.

A concrete regression: if a user sets `context_manager_function_output_max_tokens = 0`, every text item hits the `remaining_tokens == 0` branch, and we still return a single `[omitted …]` span, so the history now stores tokens even though the limit was zero. The same happens whenever the budget is exhausted exactly; the summary pushes us over.

Please gate the summary on having remaining budget (or drop it when the budget is zero) so the function truly enforces the requested token cap.
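For illustration, a minimal sketch of that gating, with stand-in types for the crate-private ones; `approx_token_count`, `push_omitted_summary`, and the 4-bytes-per-token heuristic are assumptions, not names from this PR:

```rust
// Stand-in for FunctionCallOutputContentItem from the PR.
enum Item {
    InputText { text: String },
}

// Hypothetical estimator; the real crate may count tokens differently.
fn approx_token_count(s: &str) -> usize {
    s.len().div_ceil(4)
}

// Append the "[omitted ...]" summary only if it fits the leftover budget,
// so a zero (or exactly exhausted) limit adds no extra tokens.
fn push_omitted_summary(out: &mut Vec<Item>, omitted_text_items: usize, remaining_tokens: usize) {
    if omitted_text_items == 0 {
        return;
    }
    let text = format!("[omitted {omitted_text_items} text items ...]");
    if approx_token_count(&text) <= remaining_tokens {
        out.push(Item::InputText { text });
    }
}

fn main() {
    let mut out = Vec::new();
    push_omitted_summary(&mut out, 3, 0); // zero budget: nothing is appended
    assert!(out.is_empty());
}
```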
@codex review this
```rust
#[test]
fn record_items_truncates_function_call_output_content() {
    let model = "gpt-5.1-codex";
```
Can we remove all dynamic model lookups in tests that don't care about a specific model?
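For illustration, one way to honor that, assuming such tests only need a fixed budget; `TEST_MAX_BYTES` and `truncate_to_limit` are hypothetical names, not from this PR:

```rust
// Hypothetical fixed budget: no model-family lookup in the test.
const TEST_MAX_BYTES: usize = 1_024;

// Illustrative helper operating on the pinned budget. Assumes ASCII input
// for brevity; real code must truncate on a char boundary.
fn truncate_to_limit(s: &str) -> &str {
    &s[..s.len().min(TEST_MAX_BYTES)]
}

#[test]
fn truncates_output_without_naming_a_model() {
    let long_output = "x".repeat(TEST_MAX_BYTES * 2);
    assert_eq!(truncate_to_limit(&long_output).len(), TEST_MAX_BYTES);
}
```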
💡 Codex Review
Here are some automated review suggestions for this pull request.
```rust
TruncationPolicy::Bytes(family_bytes) => {
    if let Some(token_limit) = config_token_limit {
        Self::Bytes(approx_bytes_for_tokens(token_limit))
    } else {
        Self::Bytes(approx_bytes_for_tokens(family_bytes))
    }
```
Stop inflating byte truncation budgets
When we hit the `Bytes` branch we already have a byte budget coming from `model_family.truncation_policy`. For example, the default family for GPT‑5.1 sets `TruncationPolicy::Bytes(10_000)` (see codex-rs/core/src/model_family.rs:95,192), carried over from the previous `MODEL_FORMAT_MAX_BYTES = 10 * 1024` limit. Running that value through `approx_bytes_for_tokens` multiplies it by four (~40 KiB), so byte-based families will now keep roughly 4× more tool output in history before truncating. That is a regression: by default, GPT‑5.1 will no longer truncate until ~40 KiB, re-introducing the context bloat this change was meant to prevent. We should just return `Self::Bytes(family_bytes)` in the `None` case.
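A sketch of the suggested fix, with stand-in definitions for `TruncationPolicy` and `approx_bytes_for_tokens` (the real crate's variants and conversion may differ):

```rust
#[derive(Debug, PartialEq)]
enum TruncationPolicy {
    Bytes(usize),
    Tokens(usize),
}

// Stand-in conversion: ~4 bytes per token.
fn approx_bytes_for_tokens(tokens: usize) -> usize {
    tokens * 4
}

impl TruncationPolicy {
    fn resolve(family: TruncationPolicy, config_token_limit: Option<usize>) -> Self {
        match family {
            TruncationPolicy::Bytes(family_bytes) => match config_token_limit {
                // A user-provided token limit still needs conversion to bytes.
                Some(token_limit) => Self::Bytes(approx_bytes_for_tokens(token_limit)),
                // The family budget is already in bytes: return it unchanged
                // instead of inflating it ~4x via approx_bytes_for_tokens.
                None => Self::Bytes(family_bytes),
            },
            other => other,
        }
    }
}

fn main() {
    // The family byte budget passes through unchanged when there is no override.
    assert_eq!(
        TruncationPolicy::resolve(TruncationPolicy::Bytes(10_000), None),
        TruncationPolicy::Bytes(10_000)
    );
}
```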
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
@codex review this
Use `…calls_output_max_tokens` with tokens as the main budget unit, but truncate based on the model family by introducing `TruncationPolicy`; `truncate_text` acts as a router for truncation based on the mode (a sketch follows below).

In next PRs:
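A hedged sketch of that routing, assuming `Bytes`/`Tokens` variants on `TruncationPolicy` and a flat 4-bytes-per-token stand-in for real token counting:

```rust
enum TruncationPolicy {
    Bytes(usize),
    Tokens(usize),
}

// Router: resolve the policy to a byte budget, then truncate once.
fn truncate_text(text: &str, policy: &TruncationPolicy) -> String {
    let max_bytes = match policy {
        TruncationPolicy::Bytes(b) => *b,
        // Stand-in conversion; the real implementation may use a tokenizer.
        TruncationPolicy::Tokens(t) => t * 4,
    };
    if text.len() <= max_bytes {
        return text.to_string();
    }
    // Back up to a char boundary so we never split a UTF-8 sequence.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    format!("{}...", &text[..end])
}

fn main() {
    let policy = TruncationPolicy::Tokens(2); // ~8 bytes under the stand-in
    println!("{}", truncate_text("hello truncation", &policy));
}
```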