Description
What feature would you like to see?
Recently, Codex has been enhanced with a configurable tool_output_token_limit.
Hey everyone! Thanks for waiting and for the feedback. In v0.60.0, we pumped the default for Codex models to 10k tokens. Almost 4x the previous limit. You can also configure it using the tool_output_token_limit config value. Please let us know how this works for you.
by @aibrahim-oai in #6426
Although the increased defaults only apply to gpt-5.1-codex models, the tool_output_token_limit config also applies to gpt-5.1 (non-codex) models.
This leads to the issue: the system prompt for gpt-5.1 (non-codex) models hard-codes outmoded output limits via prompting:
codex/codex-rs/core/gpt_5_1_prompt.md
Line 322 in 13c0919
- Read files in chunks with a max chunk size of 250 lines. Do not use python scripts to attempt to output larger chunks of a file. Command line output will be truncated after 10 kilobytes or 256 lines of output, regardless of the command used.
This hard-coding also applies to non-5.1, non-codex models:
Line 300 in 13c0919
- Read files in chunks with a max chunk size of 250 lines. Do not use python scripts to attempt to output larger chunks of a file. Command line output will be truncated after 10 kilobytes or 256 lines of output, regardless of the command used.
How this affects behavior
Consider the scenario where config.toml has the following settings:
model = "gpt-5.1"
tool_output_token_limit = 25000
Despite this configuration, gpt-5.1 (non-codex) insists on behaving as if tool output is constrained to 256 lines/10 KB (~2,500 tokens). It refuses instructions to read files in full: if you ask gpt-5.1 to read a file in full, it instead spends its time devising a strategy to read that file in 250-line chunks. Even when you manage to coerce the model into reading an entire file, the summarized thinking traces reveal that the model believes the output was truncated.
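For reference on where the ~2,500-token figure comes from, here is a rough back-of-envelope estimate, assuming roughly 4 bytes per token (a common heuristic, not an exact tokenizer value):

# Rough estimate of the token budget implied by the hard-coded 10 KB cap.
# Assumption: ~4 bytes per token on average (heuristic, not an exact tokenizer value).
hard_coded_bytes = 10 * 1024
bytes_per_token = 4
implied_tokens = hard_coded_bytes // bytes_per_token
print(implied_tokens)  # 2560 -- roughly 10x below a configured tool_output_token_limit of 25000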
This leads gpt-5.1 (non-codex) models into the same problematic behavior described in the original issue: too many tool calls, model confusion, extended task execution time, and lower-quality output.
The gpt-*-codex models do not have these instructions and, thus, do not face these problems.
Proposed Solution
Remove the outmoded shell instruction from the prompt's shell guidelines:
When using the shell, you must adhere to the following guidelines:
- When searching for text or files, prefer using `rg` or `rg --files` respectively because `rg` is much faster than alternatives like `grep`. (If the `rg` command is not found, then use alternatives.)
- Read files in chunks with a max chunk size of 250 lines. Do not use python scripts to attempt to output larger chunks of a file. Command line output will be truncated after 10 kilobytes or 256 lines of output, regardless of the command used.

Specifically, delete the second bullet above. If chunked file reading is still desirable, then please replace the line with a softer prompt that does not incorrectly suggest a 256-line/10 KB truncation threshold. Furthermore, we should be able to prompt gpt-5.1 (non-codex) to read files in full (an illustrative wording follows).
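For illustration only, not the exact wording proposed here, such a softer replacement bullet could read along these lines:
- Reading very large files in chunks can be practical, but do not assume a fixed 256-line or 10-kilobyte truncation threshold; output limits follow the configured tool output limit, and when the user asks for a file in full, read it in full.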
Additional information
No response