[codex] Preserve raw code-mode exec output by default#23564
Conversation
|
@codex review this |
|
Codex Review: Didn't find any major issues. Keep them coming! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
| "# | ||
| ); | ||
|
|
||
| assert_code_mode_exec_output_snapshot( |
| let req = second_mock.single_request(); | ||
| let items = custom_tool_output_items(&req, "call-1"); | ||
| assert_eq!(items.len(), 2); | ||
| assert_regex_match( |
There was a problem hiding this comment.
can we keep this style of assertion? for all new tests?
As in capture exactly what the model will see
| original_token_count: self.original_token_count, | ||
| output: self.truncated_output(), | ||
| output: match self.max_output_tokens { | ||
| Some(_) => self.truncated_output(), |
There was a problem hiding this comment.
should we
- have upper bound even for exec
- allow exec to request higher limit than the model defaut?
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 48c903e48a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| output: match self.max_output_tokens { | ||
| Some(max_tokens) => self.truncated_output(max_tokens), | ||
| None => String::from_utf8_lossy(&self.raw_output).to_string(), |
There was a problem hiding this comment.
Preserve truncation notice for oversized raw exec output
When a code-mode exec omits max_output_tokens, this now returns raw_output verbatim, but the bytes are still sourced from the unified-exec HeadTailBuffer, which caps retained output at 1 MiB and drops the middle for a fast/short-lived command that emits more than that before the collector drains it. In that scenario code mode receives a head+tail concatenation with no …truncated… marker, so scripts can parse or summarize corrupted data as if it were complete; the previous formatted path at least surfaced that the result was truncated.
Useful? React with 👍 / 👎.
| let (_test, second_mock) = run_code_mode_turn_with_config( | ||
| &server, | ||
| "use exec_command from code mode", | ||
| r#"// @exec: {"max_output_tokens": 20000} |
There was a problem hiding this comment.
can we add a test for exec_command compaction high but working?
Why
Code mode can use nested unified exec calls as data sources. When those calls omit
max_output_tokens, code mode should receive raw command output so the script can parse or summarize it itself. When code mode does providemax_output_tokens, that explicit nested budget should be respected, including values above the default unified exec limit, rather than being capped before code mode sees the result.What
max_output_tokensasNoneunless explicitly supplied.truncated_outputtakes the caller-selected token budget.Testing
just fmtgit diff --check