Description
What version of Codex is running?
0.39.0
Which model were you using?
codex5-thinking
What platform is your computer?
macOS
What steps can reproduce the bug?
TL;DR: I feed a series of git diff chunks via my CLI (gitbut) and Codex kills the stream with “Your input exceeds the context window.” Claude Code has an autocompact mode: when tokens run out, it compresses context and keeps going. Codex doesn’t — I’m forced to manually slice/summarize. Feature request: built-in auto-compact + auto-continue.
What happened
I was doing a front-end review and sent diffs one after another:
git diff -- 'frontend/src/app/(auth)/register/page.tsx'
git diff -- 'frontend/src/services/api.ts'
git diff -- 'frontend/src/services/seoApi.ts'
Then the stream died with:
your input and try again.; retrying 3/5 in 795ms…
your input and try again.; retrying 4/5 in 1.698s…
your input and try again.; retrying 5/5 in 3.44s…
■ stream disconnected before completion: Your input exceeds the context window of this model. Please adjust your input and try again.
Expectation vs reality
Expected: when nearing the context limit, the model would automatically compress prior messages/diffs (semantic summarization) and continue without hand-holding.
Got: hard stop. I have to manually chunk diffs, tell it where to resume, and restate context.
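To make the expectation concrete, here is a minimal sketch of what client-side compaction could look like. This is my illustration, not how Codex or Claude Code actually implement it; the 200k budget, the message-list shape, and the `summarize` placeholder are all assumptions.

```python
# Minimal auto-compact sketch (an assumed design, not Codex's real behavior).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_BUDGET = 200_000  # assumed context window size

def summarize(messages: list[str]) -> str:
    # Placeholder: a real tool would make a cheap model call here to
    # semantically condense the dropped messages.
    return "SUMMARY: " + " | ".join(m[:80].replace("\n", " ") for m in messages)

def compact(history: list[str], keep_recent: int = 4) -> list[str]:
    """If the history is over budget, fold everything except the most
    recent `keep_recent` messages into a single summary message."""
    used = sum(len(enc.encode(m)) for m in history)
    if used <= CONTEXT_BUDGET * 0.8:  # leave headroom for the reply
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent if old else history
```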
Why this matters
Big PRs, refactors, API migrations… that’s normal life. Code models shine here if they maintain a “sliding window” and don’t drop the thread.
What Claude Code does
Claude Code has an autocompact behavior: when tokens run out, it condenses the history and keeps working. That removes most of the friction on long diffs.
Feature request for Codex (OpenAI)
Auto-compact / Auto-continue: compress conversation state (instructions, conclusions, invariants) as the limit approaches, then keep going.
Pre-flight token estimate: warn and pre-split input before streaming dies (see the sketch after this list).
Priority-aware sliding window: pin critical artifacts (architecture decisions, type contracts), aggressively compress raw diff chatter.
Better error UX: not just “exceeds context window,” but “compressed X→Y tokens; continue? summarize last N messages?”
APIs for tools like gitbut: let the model request “next batch” or “send summary of previous chunk” without human intervention.
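As a rough illustration of the pre-flight estimate, here is a sketch using OpenAI's tiktoken tokenizer. The 200k window and 20% headroom are illustrative assumptions, not Codex's real limits.

```python
# Hedged sketch of a pre-flight token check before sending a diff.
import subprocess
import tiktoken

CONTEXT_BUDGET = 200_000   # assumed context window; adjust per model
HEADROOM = 0.8             # keep 20% free for the system prompt and reply

enc = tiktoken.get_encoding("cl100k_base")

def fits(text: str) -> bool:
    """True if `text` is safely under the token budget."""
    return len(enc.encode(text)) < CONTEXT_BUDGET * HEADROOM

diff = subprocess.run(
    ["git", "diff", "--", "frontend/src/services/api.ts"],
    capture_output=True, text=True, check=True,
).stdout

if not fits(diff):
    print("diff too large: split per-file or summarize before sending")
```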
Questions for the community
Any good open-source middlewares that auto-compact history for Codex?
Favorite diff-batching strategies for mono-repos/front-end heavy changes?
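On the batching question, one strategy that has worked for me (an illustration, nothing official) is greedy per-file packing under a token budget:

```python
# Greedy per-file diff batching: pack per-file diffs into batches that
# each stay under a token budget. The budget number is an assumption.
import subprocess
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
BATCH_BUDGET = 20_000  # tokens per batch; tune to taste

files = subprocess.run(
    ["git", "diff", "--name-only"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

batches, current, used = [], [], 0
for path in files:
    diff = subprocess.run(
        ["git", "diff", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    cost = len(enc.encode(diff))
    if current and used + cost > BATCH_BUDGET:
        batches.append("\n".join(current))
        current, used = [], 0
    # Note: a single oversized file still lands in its own batch; those
    # need a second pass (e.g. split by hunk, or shrink context with -U0).
    current.append(diff)
    used += cost
if current:
    batches.append("\n".join(current))

print(f"{len(files)} files -> {len(batches)} batches")
```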
If you’ve hit the same wall, an upvote/comment would help get this on the team’s radar. Thanks!
What is the expected behavior?
No response
What do you see instead?
No response
Additional information
No response