Conversation

@aibrahim-oai (Collaborator) commented Nov 16, 2025

  • This PR puts us on the path to truncating by tokens. This path will initially be used by unified exec and the context manager (responsible mainly for MCP calls).
  • Exposes a new config, calls_output_max_tokens.
  • Uses tokens as the main budget unit, but truncates based on the model family by introducing TruncationPolicy (see the sketch after the next list).
  • Introduces truncate_text as a router that dispatches truncation based on the mode.

In next PRs:

  • Remove truncate_with_line_bytes_budget.
  • Add the ability for the model to override the token budget.
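
A minimal sketch of how these pieces might fit together. The Tokens variant, the 4-bytes-per-token heuristic, and the truncate_to_bytes helper are assumptions for illustration; only TruncationPolicy, Bytes, truncate_text, and approx_bytes_for_tokens appear in this PR's discussion:

enum TruncationPolicy {
    /// Budget expressed in bytes (legacy model families).
    Bytes(usize),
    /// Budget expressed in tokens (the new path).
    Tokens(usize),
}

/// Rough heuristic: roughly 4 bytes per token.
fn approx_bytes_for_tokens(tokens: usize) -> usize {
    tokens * 4
}

/// Router: dispatch truncation based on the policy's mode.
fn truncate_text(text: &str, policy: &TruncationPolicy) -> String {
    match policy {
        TruncationPolicy::Bytes(max_bytes) => truncate_to_bytes(text, *max_bytes),
        // In this sketch the token budget is approximated in bytes rather
        // than running a real tokenizer.
        TruncationPolicy::Tokens(max_tokens) => {
            truncate_to_bytes(text, approx_bytes_for_tokens(*max_tokens))
        }
    }
}

fn truncate_to_bytes(text: &str, max_bytes: usize) -> String {
    if text.len() <= max_bytes {
        return text.to_string();
    }
    // Back off to a char boundary so a UTF-8 sequence is never split.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    format!("{}…", &text[..end])
}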

@aibrahim-oai (Collaborator, Author)

@codex review this

aibrahim-oai changed the title from "Truncate by tokens" to "Add the utility to truncate by tokens" Nov 17, 2025
@chatgpt-codex-connector (Contributor, bot) left a comment

💡 Codex Review

if omitted_text_items > 0 {
    out.push(FunctionCallOutputContentItem::InputText {
        text: format!("[omitted {omitted_text_items} text items ...]"),
    });

P1: Respect token budget for tool content_items

The new token budget helper for structured tool output is still able to exceed the caller-provided limit. In truncate_function_output_items_to_token_limit, once we run out of remaining_tokens we increment omitted_text_items, but after the loop we always append the [omitted …] summary string (lines 114-117). That summary itself consumes several tokens yet we add it even when remaining_tokens is zero, so the returned content_items can exceed the configured cap. A concrete regression: if a user sets context_manager_function_output_max_tokens = 0, every text item hits the remaining_tokens == 0 branch, and we still return a single [omitted …] span—so the history now stores tokens even though the limit was zero. The same happens whenever the budget is exhausted exactly; the summary pushes us over. Please gate the summary on having remaining budget (or drop it when the budget is zero) so the function truly enforces the requested token cap.
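
A minimal, self-contained sketch of the suggested gating. The function shape, the stub enum, and the approx_token_count heuristic are assumptions; only the [omitted …] summary string and the budget-gating idea come from the review above:

// Stub mirroring the real codex-rs type, for a self-contained example.
enum FunctionCallOutputContentItem {
    InputText { text: String },
}

/// Rough heuristic: roughly 4 bytes per token.
fn approx_token_count(s: &str) -> u64 {
    (s.len() as u64).div_ceil(4)
}

fn push_omission_summary(
    out: &mut Vec<FunctionCallOutputContentItem>,
    omitted_text_items: usize,
    remaining_tokens: &mut u64,
) {
    if omitted_text_items == 0 {
        return;
    }
    let summary = format!("[omitted {omitted_text_items} text items ...]");
    let cost = approx_token_count(&summary);
    // Gate the summary on remaining budget: with a zero or exactly
    // exhausted budget it is dropped, so the configured cap truly holds.
    if cost <= *remaining_tokens {
        *remaining_tokens -= cost;
        out.push(FunctionCallOutputContentItem::InputText { text: summary });
    }
}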

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@aibrahim-oai (Collaborator, Author)

@codex review this


#[test]
fn record_items_truncates_function_call_output_content() {
    let model = "gpt-5.1-codex";
A collaborator commented:

can we remove all dynamic model lookups in tests that don't care about a specific model?
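
A hedged sketch of one way to do that; this ModelFamily and its fields are illustrative stand-ins, not the real codex-rs types:

// A fixed fixture for tests that only need *some* model, avoiding a
// dynamic lookup of a real model name like "gpt-5.1-codex".
struct ModelFamily {
    name: &'static str,
    truncation_bytes: usize,
}

fn test_model_family() -> ModelFamily {
    ModelFamily {
        name: "test-model",
        truncation_bytes: 10_000,
    }
}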

@chatgpt-codex-connector (Contributor, bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines 24 to 29
TruncationPolicy::Bytes(family_bytes) => {
    if let Some(token_limit) = config_token_limit {
        Self::Bytes(approx_bytes_for_tokens(token_limit))
    } else {
        Self::Bytes(approx_bytes_for_tokens(family_bytes))
    }

P1: Stop inflating byte truncation budgets

When we hit the Bytes branch we already have a byte budget coming from model_family.truncation_policy — e.g. the default family for GPT‑5.1 sets TruncationPolicy::Bytes(10_000) (see codex-rs/core/src/model_family.rs:95,192), which was carried over from the previous MODEL_FORMAT_MAX_BYTES = 10 * 1024 limit. Running that value through approx_bytes_for_tokens multiplies it by four (~40 KiB), so byte-based families will now keep ~4× more tool output in history before truncating. That’s a regression: by default GPT‑5.1 will no longer truncate until ~40 KiB, re-introducing the context bloat this change was meant to prevent. We should just return Self::Bytes(family_bytes) in the None case.
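
A sketch of the suggested fix, replacing only the else branch of the arm quoted above:

} else {
    // family_bytes is already a byte budget (e.g. 10_000 for the
    // GPT-5.1 family); re-running it through approx_bytes_for_tokens
    // would inflate it roughly 4x.
    Self::Bytes(family_bytes)
}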

@aibrahim-oai (Collaborator, Author)

@codex review

@chatgpt-codex-connector (Contributor, bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@aibrahim-oai (Collaborator, Author)

@codex review this

aibrahim-oai merged commit 3de8790 into main Nov 18, 2025
25 checks passed
aibrahim-oai deleted the truncate-by-tokens branch November 18, 2025 19:36
github-actions bot locked and limited conversation to collaborators Nov 18, 2025