Spill large hook outputs from context#21069
Conversation
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0ed973a25e
if let Err(err) = fs::write(path.as_ref(), &text).await {
Bound spilled hook output retention
Every oversized hook output is written to a fresh UUID file, but this change adds no retention, quota, or cleanup for $CODEX_HOME/hook_outputs. A noisy or malicious hook that emits large text on each turn will leak disk indefinitely and can fill the user's home directory; add a size/age cap or cleanup path.
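A retention pass along the lines this comment asks for could be a small sweep over the spill directory. This is a hypothetical sketch only — the PR adds no cleanup, and the function name and age-based policy here are illustrative assumptions, built around the `hook_outputs/<thread_id>/<uuid>.txt` layout the PR describes:

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// Hypothetical cleanup pass (not part of this PR): delete spilled hook
/// outputs older than `max_age` under `root`, assuming the
/// `hook_outputs/<thread_id>/<uuid>.txt` layout described above.
fn prune_hook_outputs(root: &Path, max_age: Duration) -> io::Result<usize> {
    let mut removed = 0;
    let now = SystemTime::now();
    for thread_dir in fs::read_dir(root)? {
        let thread_dir = thread_dir?.path();
        if !thread_dir.is_dir() {
            continue;
        }
        for entry in fs::read_dir(&thread_dir)? {
            let entry = entry?;
            let modified = entry.metadata()?.modified()?;
            // Files whose mtime is in the future count as age zero.
            if now.duration_since(modified).unwrap_or_default() > max_age {
                fs::remove_file(entry.path())?;
                removed += 1;
            }
        }
    }
    Ok(removed)
}
```

A size-based quota would need an extra pass that sums file lengths and evicts oldest-first; the age-based variant above is just the simplest shape of the idea.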
since we're currently leaving rollouts unbounded, and other paths like imagegen artifacts have no cleanup, I think we can punt on this
eternal-openai
left a comment
probably more sturdy to write these files to a nice cross-OS /tmp location? tmp files are often cleaned up automatically by the OS. this solves several problems here:
- your codex home directory growing indefinitely, especially if there are lots of hook runs
- ephemeral threads shouldn't really persist anything outside of memory, but it seems fine to stash some data in tmp that'll be deleted fairly quickly
downside of this is that the spilled data could be lost if you come back to the thread after an OS reboot, but that seems okay in this case
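The cross-OS temp-dir idea from this comment could look something like the sketch below; `std::env::temp_dir()` resolves the platform temp directory, and the `codex-hook-outputs` subdirectory name here is an assumption for illustration, not anything the PR uses:

```rust
use std::env;
use std::path::PathBuf;

/// Illustrative alternative (not what the PR does): place spill files under
/// the OS temp dir rather than CODEX_HOME. The `codex-hook-outputs` name
/// and the `<thread_id>/<file_id>.txt` shape are assumptions.
fn spill_path(thread_id: &str, file_id: &str) -> PathBuf {
    env::temp_dir()
        .join("codex-hook-outputs")
        .join(thread_id)
        .join(format!("{file_id}.txt"))
}
```

As the comment notes, the OS may reclaim these files (e.g. on reboot), which is the trade-off being accepted.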
Why
Large hook outputs can enter model-visible context through hook-specific paths such as additionalContext and Stop continuation prompts. Without a dedicated cap, one hook can inject a large blob directly into conversation history instead of leaving a bounded preview for the model and preserving the full text elsewhere.
What
- Caps hook output injected into context at a 2,500-token budget, preserving the full output on disk and leaving a head/tail preview plus the saved path in context
- Spilled outputs are written to CODEX_HOME/hook_outputs/<thread_id>/<uuid>.txt
- Applies to additionalContext, feedback_message, and Stop continuation fragments
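The head/tail preview described above can be sketched roughly as follows. This is a simplified illustration, not the PR's implementation: the real change budgets 2,500 tokens, while this sketch substitutes a character budget (and a made-up truncation marker) to stay self-contained:

```rust
/// Simplified head/tail preview (illustrative only). The actual change uses
/// a token budget; this sketch approximates it with a character budget and
/// an assumed marker format pointing at the saved spill file.
fn head_tail_preview(text: &str, budget_chars: usize, saved_path: &str) -> String {
    let chars: Vec<char> = text.chars().collect();
    if chars.len() <= budget_chars {
        return text.to_string();
    }
    let half = budget_chars / 2;
    let head: String = chars[..half].iter().collect();
    let tail: String = chars[chars.len() - half..].iter().collect();
    format!("{head}\n[... output truncated; full text saved to {saved_path} ...]\n{tail}")
}
```

The point of keeping both head and tail is that hook output often puts the interesting part (an error, a summary) at either end, while the full text remains recoverable from the saved path.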