
Codex chokes on big diffs — please add auto-compact/auto-continue like Claude Code #3967

@ArtAndrew

Description


What version of Codex is running?

0.39.0

Which model were you using?

codex5-thinking

What platform is your computer?

macOS

What steps can reproduce the bug?

TL;DR: I feed a series of git diff chunks through my CLI (gitbut), and Codex kills the stream with "Your input exceeds the context window." Claude Code has an auto-compact mode: when tokens run out, it compresses the context and keeps going. Codex doesn't, so I'm forced to slice and summarize by hand. Feature request: built-in auto-compact + auto-continue.

What happened

I was doing a front-end review and sent diffs one after another:

```
git diff -- 'frontend/src/app/(auth)/register/page.tsx'
git diff -- 'frontend/src/services/api.ts'
git diff -- 'frontend/src/services/seoApi.ts'
```

Then the stream died with:

```
⚠️ stream error: stream disconnected before completion: Your input exceeds the context window of this model. Please adjust your input and try again.; retrying 3/5 in 795ms…

⚠️ stream error: stream disconnected before completion: Your input exceeds the context window of this model. Please adjust your input and try again.; retrying 4/5 in 1.698s…

⚠️ stream error: stream disconnected before completion: Your input exceeds the context window of this model. Please adjust your input and try again.; retrying 5/5 in 3.44s…

■ stream disconnected before completion: Your input exceeds the context window of this model. Please adjust your input and try again.
```

Expectation vs reality

Expected: when nearing the context limit, the model would automatically compress prior messages/diffs (semantic summarization) and continue without hand-holding.
Got: hard stop. I have to manually chunk diffs, tell it where to resume, and restate context.
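
To make "auto-compact" concrete, here's a minimal Python sketch of the loop I'd expect. Nothing in it is a real Codex API: `count_tokens`, `summarize`, `send_chunk`, and `CONTEXT_BUDGET` are all placeholder assumptions.

```python
CONTEXT_BUDGET = 200_000  # assumed context window, in tokens


def count_tokens(text: str) -> int:
    # Placeholder: a real implementation would use the model's tokenizer.
    # A crude words-to-tokens ratio is enough for a sketch.
    return int(len(text.split()) * 1.3)


def summarize(messages: list[str]) -> str:
    # Placeholder: in practice, a cheap model call that condenses old
    # turns into instructions, conclusions, and invariants.
    return f"[summary of {len(messages)} earlier messages]"


def compact(history: list[str], budget: int = CONTEXT_BUDGET) -> list[str]:
    """Fold the oldest turns into a summary until the history fits."""
    while sum(count_tokens(m) for m in history) > budget and len(history) > 2:
        # Merge the two oldest turns into one summary; keep recent turns verbatim.
        history = [summarize(history[:2])] + history[2:]
    return history


def send_chunk(history: list[str], chunk: str) -> list[str]:
    history.append(chunk)
    history = compact(history)
    # ...stream the request here; on a context-window error, compact
    # further and retry, instead of retrying the same oversized payload.
    return history
```

The key point is the retry path: today Codex retries the identical oversized input five times and dies; compacting between retries would let it recover on its own.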

Why this matters
Big PRs, refactors, API migrations… that's everyday work. Code models shine here precisely when they maintain a "sliding window" and don't drop the thread.

What Claude Code does
Claude Code has an auto-compact behavior: when tokens run out, it condenses the history and keeps working. That removes 90% of the friction on long diffs.

Feature request for Codex (OpenAI)

- Auto-compact / auto-continue: compress conversation state (instructions, conclusions, invariants) as the limit approaches, then keep going.
- Pre-flight token estimate: warn and pre-split input before the stream dies (see the sketch after this list).
- Priority-aware sliding window: pin critical artifacts (architecture decisions, type contracts); aggressively compress raw diff chatter.
- Better error UX: not just "exceeds context window," but "compressed X→Y tokens; continue? summarize last N messages?"
- APIs for tools like gitbut: let the model request "next batch" or "send a summary of the previous chunk" without human intervention.
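
For the pre-flight estimate, here's a sketch of what a client could already do today. Assumptions: tiktoken's `o200k_base` encoding and the 200k budget are my guesses, not documented Codex values.

```python
import tiktoken

BUDGET = 200_000  # assumed window size
_enc = tiktoken.get_encoding("o200k_base")  # assumed close enough to the real tokenizer


def token_count(text: str) -> int:
    return len(_enc.encode(text))


def preflight(history: list[str], next_chunk: str) -> bool:
    """Warn before streaming instead of letting the stream die mid-flight."""
    used = sum(token_count(m) for m in history)
    needed = token_count(next_chunk)
    if used + needed > BUDGET:
        print(f"would exceed budget: {used} used + {needed} new > {BUDGET}; splitting…")
        return False
    return True
```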

Questions for the community

- Any good open-source middlewares that auto-compact history for Codex?
- Favorite diff-batching strategies for mono-repos / front-end-heavy changes?
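
To seed the second question: the simplest strategy I've hand-rolled is one diff per file, greedily packed into batches under a per-request budget. A sketch (the budget and the chars-per-token heuristic are arbitrary):

```python
import subprocess

BATCH_BUDGET = 40_000  # assumed per-request token budget


def rough_tokens(text: str) -> int:
    return len(text) // 4  # crude ~4-chars-per-token heuristic


def changed_files() -> list[str]:
    out = subprocess.run(["git", "diff", "--name-only"],
                         capture_output=True, text=True, check=True)
    return [f for f in out.stdout.splitlines() if f]


def batches() -> list[list[str]]:
    packed: list[list[str]] = []
    current: list[str] = []
    used = 0
    for path in changed_files():
        diff = subprocess.run(["git", "diff", "--", path],
                              capture_output=True, text=True).stdout
        cost = rough_tokens(diff)
        if current and used + cost > BATCH_BUDGET:
            packed.append(current)
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        packed.append(current)
    return packed
```

It breaks down when a single file's diff exceeds the budget, which is exactly where a model-side "send me the next hunk" API would help.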

If you’ve hit the same wall, an upvote/comment would help get this on the team’s radar. Thanks!

What is the expected behavior?

No response

What do you see instead?

No response

Additional information

No response
