Codex VS Code follow-up turn fails with context_length_exceeded after needs_follow_up=true #28816

Description

@amikad09

What version of the IDE extension are you using?

0.140.0-alpha.2

What subscription do you have?

ChatGPT Plus

Which IDE are you using?

VS Code

What platform is your computer?

Linux 6.12.74+deb13+1-amd64 x86_64 unknown

What issue are you seeing?

Codex stops with this user-facing message:

Codex ran out of room in the model's context window. Start a new thread or clear earlier history before retrying.

The corresponding websocket failure in the logs is:

event.kind=response.failed
error.code=context_length_exceeded
error.message="Your input exceeds the context window of this model. Please adjust your input and try again."

The confusing part is the state immediately before the failure. The post-sampling token usage logs repeatedly show the turn entering a follow-up/continuation state, while the local context-limit flags are still false:

thread_id=019ed7f3-2509-76d3-8975-a5b51ab0411b
turn_id=019ed7fa-e2a2-7a62-a617-0e780a2b06ce
model=gpt-5.5
reasoning_effort=xhigh
total_usage_tokens=282204
estimated_token_count=Some(304934)
auto_compact_scope_tokens=282204
auto_compact_scope_limit=334800
full_context_window_limit=None
full_context_window_limit_reached=false
token_limit_reached=false
model_needs_follow_up=true
has_pending_input=false
needs_follow_up=true

There are several similar rows leading up to the failure:

total_usage_tokens=237867, estimated_token_count=Some(257051), needs_follow_up=true
total_usage_tokens=242488, estimated_token_count=Some(261711), needs_follow_up=true
total_usage_tokens=257185, estimated_token_count=Some(276508), needs_follow_up=true
total_usage_tokens=270057, estimated_token_count=Some(290746), needs_follow_up=true
total_usage_tokens=274825, estimated_token_count=Some(297507), needs_follow_up=true
total_usage_tokens=282204, estimated_token_count=Some(304934), needs_follow_up=true

All of those rows also report:

full_context_window_limit_reached=false
token_limit_reached=false
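
For reference, the arithmetic behind these rows can be checked with a short script. This is only a sketch: it assumes that auto_compact_scope_limit=334800, which is logged for the last row, also applied to the earlier rows. The function name headroom_report is illustrative and is not part of Codex.

```python
# (total_usage_tokens, estimated_token_count) pairs copied from the rows above.
ROWS = [
    (237867, 257051),
    (242488, 261711),
    (257185, 276508),
    (270057, 290746),
    (274825, 297507),
    (282204, 304934),
]

# Assumption: the limit logged with the last row applies to every row.
AUTO_COMPACT_SCOPE_LIMIT = 334800


def headroom_report(rows, limit):
    """Return, per row, the estimate/usage gap and the remaining headroom."""
    report = []
    for total, estimated in rows:
        report.append({
            "total": total,
            "estimated": estimated,
            "gap": estimated - total,                # estimate minus reported usage
            "headroom_total": limit - total,         # room left by reported usage
            "headroom_estimated": limit - estimated, # room left by the estimate
        })
    return report


for row in headroom_report(ROWS, AUTO_COMPACT_SCOPE_LIMIT):
    print(row)
```

On the last row the local estimate is 22,730 tokens above total_usage_tokens and still 29,866 tokens below auto_compact_scope_limit, which is consistent with the limit flags staying false.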

Then the same turn fails with:

event.kind=response.failed
error.code=context_length_exceeded
error.message="Your input exceeds the context window of this model. Please adjust your input and try again."

So the failure pattern looks like this:

post sampling token usage:
  model_needs_follow_up=true
  needs_follow_up=true
  token_limit_reached=false
  full_context_window_limit_reached=false

then:
  event.kind=response.failed
  error.code=context_length_exceeded

This looks like the follow-up / remote compaction continuation request is still too large when it is sent, even though the local turn accounting did not mark the token/context limit as reached.

I’m not sure whether the root cause is remote compaction, follow-up handling, token estimation, or the UI error message, but the logs make the current behavior hard to reason about: local accounting says the limit was not reached, but the next request fails with backend context_length_exceeded.
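
To make the mismatch easier to spot in exported logs, here is a minimal sketch that scans key=value lines and reports each context_length_exceeded failure that was preceded by needs_follow_up=true with both limit flags still false. It assumes the log text has been exported as plain lines in the same key=value shape as the excerpts in this report; the actual storage format in logs_2.sqlite may differ, and the helper names are illustrative.

```python
import re

# Matches key=value pairs; a value is either a quoted string or a bare token.
PAIR_RE = re.compile(r'([\w.\-]+)=("[^"]*"|\S+)')


def parse_pairs(line):
    """Parse one log line into a dict, dropping quotes and trailing commas."""
    return {k: v.strip('"').rstrip(',') for k, v in PAIR_RE.findall(line)}


def find_unflagged_failures(lines):
    """Return the accumulated state before each failure that no local flag predicted."""
    state = {}
    hits = []
    for line in lines:
        pairs = parse_pairs(line)
        if pairs.get("error.code") == "context_length_exceeded":
            if (state.get("needs_follow_up") == "true"
                    and state.get("token_limit_reached") == "false"
                    and state.get("full_context_window_limit_reached") == "false"):
                hits.append(dict(state))
            state = {}  # start fresh after each failure
        else:
            state.update(pairs)  # later rows overwrite earlier values
    return hits


SAMPLE = """\
total_usage_tokens=282204, estimated_token_count=Some(304934), needs_follow_up=true
full_context_window_limit_reached=false
token_limit_reached=false
event.kind=response.failed
error.code=context_length_exceeded
""".splitlines()

for hit in find_unflagged_failures(SAMPLE):
    print(hit["total_usage_tokens"], hit["estimated_token_count"])
```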

What steps can reproduce the bug?

I can reproduce this with long-running Codex VS Code sessions using gpt-5.5 with reasoning_effort=xhigh, especially during repo-heavy tasks where Codex reads many files and runs shell commands.

My config:

model = "gpt-5.5"
model_reasoning_effort = "xhigh"
service_tier = "default"

[projects."/srv/repsnord/app"]
trust_level = "trusted"

[projects."/"]
trust_level = "trusted"

[projects."/srv"]
trust_level = "trusted"

Typical reproduction flow:

1. Open Codex in VS Code.
2. Use model gpt-5.5 with reasoning_effort=xhigh.
3. Start a repo-heavy task that reads rules/docs/source files and runs shell commands.
4. Let the turn continue until Codex starts producing automatic follow-up/continuation behavior.
5. Codex eventually stops with: “Codex ran out of room in the model's context window.”
6. Inspect ~/.codex/logs_2.sqlite.
7. The logs show repeated model_needs_follow_up=true / needs_follow_up=true rows followed by response.failed context_length_exceeded.
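
For step 6, a minimal sketch of one way to search the log database without knowing its schema. The table and column layout of logs_2.sqlite is not documented in this report, so the script discovers it through sqlite_master and PRAGMA table_info instead of assuming names. The function name find_rows_containing is illustrative.

```python
import sqlite3
from pathlib import Path


def find_rows_containing(db_path, needle, limit=20):
    """Scan every column of every table for `needle`.

    Returns (table, column, value) tuples, at most `limit` per column.
    The database is opened read-only so the logs are never modified.
    """
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        hits = []
        for table in tables:
            # Column names come from the database itself, not from assumptions.
            columns = [row[1] for row in conn.execute(
                f'PRAGMA table_info("{table}")')]
            for column in columns:
                query = (
                    f'SELECT CAST("{column}" AS TEXT) FROM "{table}" '
                    f'WHERE instr(CAST("{column}" AS TEXT), ?) > 0 LIMIT ?'
                )
                for (value,) in conn.execute(query, (needle, limit)):
                    hits.append((table, column, value))
        return hits
    finally:
        conn.close()


# Example call (the path is the one named in step 6):
# for table, column, value in find_rows_containing(
#         "~/.codex/logs_2.sqlite", "context_length_exceeded"):
#     print(table, column, value[:200])
```

Scanning every column is slow on large databases, but it avoids guessing the schema; once the relevant table is known, a targeted query is the better choice.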

Example thread:

thread_id=019ed7f3-2509-76d3-8975-a5b51ab0411b
turn_id=019ed7fa-e2a2-7a62-a617-0e780a2b06ce
app.version=0.140.0-alpha.2
originator=codex_vscode
model=gpt-5.5
reasoning_effort=xhigh

I also saw the same failure pattern in another thread:

thread_id=019ed7dd-d86c-7270-a147-404cdf301f0f
model=gpt-5.5
app.version=0.140.0-alpha.2
event.kind=response.failed
error.code=context_length_exceeded

What is the expected behavior?

If Codex logs:

model_needs_follow_up=true
needs_follow_up=true

then I would expect one of these outcomes:

  1. Codex successfully performs the follow-up turn after compaction.
  2. Codex compacts further before sending the follow-up request.
  3. Codex stops earlier with a specific message that the follow-up/compaction request could not fit.
  4. Codex logs the local context-limit flags consistently before the backend rejects the request.
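
As an illustration of outcomes 1 to 3, here is a minimal sketch of a pre-flight check that could run before a follow-up request is sent. It is not Codex's implementation: every name in it (Action, FollowUpState, decide_follow_up, reserve_tokens, max_compaction_attempts) is hypothetical, and it assumes the client knows the backend's real context window, which the logs above report as full_context_window_limit=None.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SEND = "send"                            # outcome 1: follow-up fits
    COMPACT_FIRST = "compact_first"          # outcome 2: compact further, then re-check
    STOP_WITH_MESSAGE = "stop_with_message"  # outcome 3: stop with a specific message


@dataclass
class FollowUpState:
    estimated_token_count: int
    context_window_limit: int         # hypothetical: the backend's actual limit
    reserve_tokens: int = 0           # hypothetical margin for the model's output
    compaction_attempts: int = 0
    max_compaction_attempts: int = 2  # hypothetical cap to avoid looping forever


def decide_follow_up(state: FollowUpState) -> Action:
    """Decide what to do before sending the follow-up request."""
    projected = state.estimated_token_count + state.reserve_tokens
    if projected <= state.context_window_limit:
        return Action.SEND
    if state.compaction_attempts < state.max_compaction_attempts:
        return Action.COMPACT_FIRST
    return Action.STOP_WITH_MESSAGE


# 304934 is the last logged estimate; the 300000 limit is a made-up value
# chosen only to exercise the over-limit branches.
print(decide_follow_up(FollowUpState(304934, 300000)))
print(decide_follow_up(FollowUpState(304934, 300000, compaction_attempts=2)))
```

A check like this only helps if the limit it compares against matches what the backend enforces, which is exactly the part the current logs leave unclear.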

Right now the local post-sampling state says:

full_context_window_limit_reached=false
token_limit_reached=false

but the next request fails with:

error.code=context_length_exceeded

That makes it difficult to tell whether the failure is caused by model context, remote compaction, follow-up handling, token estimation, or just the UI message.

Additional information

I generated a local evidence bundle from:

~/.codex/logs_2.sqlite
~/.codex/state_5.sqlite

Safe summary:

Codex VS Code 0.140.0-alpha.2
model=gpt-5.5
reasoning_effort=xhigh
originator=codex_vscode
wire_api=responses
transport=responses_websocket
x-codex-beta-features includes remote_compaction_v2

The most relevant local log pattern is:

model_needs_follow_up=true
needs_follow_up=true
token_limit_reached=false
full_context_window_limit_reached=false

followed by:

event.kind=response.failed
error.code=context_length_exceeded

I can provide more redacted excerpts if needed.

Labels: bug (Something isn't working), context (Issues related to context management, including compaction), extension (Issues related to the VS Code extension)
