--remote resume fails on large saved sessions after large thread/resume response #19837

@mib00038

What version of Codex CLI is running?

v0.125.0

What subscription do you have?

PRO

Which model were you using?

gpt-5.5

What platform is your computer?

Ubuntu 24.04.4 LTS

What terminal emulator and version are you using (if applicable)?

Terminal

What issue are you seeing?

codex --remote <ws> resume <thread-id> fails on large saved sessions.
The same session resumes with plain codex resume <thread-id>.

Observed user-facing error:

Error: Failed to resume session from /home/<user>/.codex/sessions/2026/04/23/rollout-2026-04-23T19-13-22-<thread-id>.jsonl

Stack backtrace:
   0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>
   5: <unknown>
   6: <unknown>
   7: <unknown>

The visible output does not include a caused-by chain, and
codex-tui.log does not show the underlying failure reason.

Environment:

  • codex-cli 0.125.0
  • Ubuntu 24.04, x86_64
  • Isolated CODEX_HOME copied from the user's auth/config only
  • codex --remote ws://127.0.0.1:<port> resume <thread-id> through a
    WebSocket proxy to a stock Codex app-server child

What was observed on the wire for the smallest failing fixture:

  • The visible TUI opens the WebSocket and completes the upgrade.
  • It sends normal startup RPCs (initialize, account/thread/model reads).
  • It sends thread/resume.
  • The app-server side returns a thread/resume result.
  • The response is large: the WebSocket frame length observed by strace
    was about 16 MiB for a 56 MiB JSONL fixture.
  • The visible TUI exits with Failed to resume session... just after it
    begins receiving that large response.
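
The frame length mentioned above can be read from the first bytes of a captured server-to-client frame. The following is a minimal sketch, assuming the standard RFC 6455 header layout; the function name `parse_ws_frame_header` is illustrative and not part of Codex, and the example bytes are constructed by hand rather than taken from the actual capture.

```python
import struct


def parse_ws_frame_header(data: bytes) -> dict:
    """Decode the header of one WebSocket frame (RFC 6455, section 5.2).

    `data` must start at a frame boundary, e.g. the first bytes of a
    read() captured by strace right after the previous frame ended.
    """
    if len(data) < 2:
        raise ValueError("need at least 2 bytes of frame header")
    b0, b1 = data[0], data[1]
    fin = bool(b0 & 0x80)        # final fragment of a message
    opcode = b0 & 0x0F           # 0x1 text, 0x2 binary, 0x0 continuation
    masked = bool(b1 & 0x80)     # set on client-to-server frames
    length = b1 & 0x7F
    offset = 2
    if length == 126:            # 16-bit extended payload length
        if len(data) < offset + 2:
            raise ValueError("truncated 16-bit length")
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:          # 64-bit extended payload length
        if len(data) < offset + 8:
            raise ValueError("truncated 64-bit length")
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8
    if masked:
        offset += 4              # masking key follows the length field
    return {
        "fin": fin,
        "opcode": opcode,
        "masked": masked,
        "payload_len": length,
        "header_len": offset,
    }


# Hand-built header for an unmasked text frame carrying exactly 16 MiB.
example = bytes([0x81, 0x7F]) + (16 * 1024 * 1024).to_bytes(8, "big")
print(parse_ws_frame_header(example)["payload_len"])
```

If `fin` is false, the message continues in later frames, so the size of a single frame is only a lower bound on the size of the full thread/resume response.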

This does not look like a bad saved-session file:

  • Plain non-remote codex resume <thread-id> works on the original large
    session.
  • Plain non-remote codex resume <thread-id> also stayed alive on the
    generated 56 MiB fixture until manually interrupted.

What steps can reproduce the bug?

  1. Prepare a Codex saved-session JSONL large enough to produce a large
    remote thread/resume response. In our bisect, 52 MiB passed and
    56 MiB failed.

  2. Start a Codex app-server reachable over WebSocket. In our setup a
    WebSocket proxy sits in front of a stock Codex app-server child.

  3. Run:

    codex --remote ws://127.0.0.1:<port> resume <thread-id>

  4. Observe that the TUI exits with Failed to resume session from ....

  5. Run:

    codex resume <thread-id>

  6. Observe that the same saved session opens without the remote path.
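
For step 1, one way to derive smaller fixtures from a large session file is to copy whole JSONL lines until a size budget is reached. This is only a sketch: the issue does not say how the fixtures were generated, the helper name `truncate_jsonl` is hypothetical, and whether a session cut this way remains a valid resume target is an assumption that has to be checked against plain codex resume.

```python
from pathlib import Path


def truncate_jsonl(src: Path, dst: Path, target_bytes: int) -> int:
    """Copy complete lines from src to dst without exceeding target_bytes.

    Lines are never split, so every record in dst is a complete record
    from src. Returns the number of bytes written.
    """
    written = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for line in fin:
            if written + len(line) > target_bytes:
                break
            fout.write(line)
            written += len(line)
    return written


# Example (paths are placeholders):
# truncate_jsonl(Path("full.jsonl"), Path("fixture-56MiB.jsonl"), 56 * 1024 * 1024)
```

Work on copies inside an isolated CODEX_HOME, as in the environment described above, so the original session file is never modified.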

Measured fixture results:

JSONL size                                               Result
-------------------------------------------------------  ------
4 MiB                                                    Pass
16 MiB                                                   Pass
32 MiB                                                   Pass
48 MiB                                                   Pass
52 MiB                                                   Pass
56 MiB                                                   Fail
64 MiB                                                   Fail
70.8 MiB (full fixture, cli_version edited to 0.125.0)   Fail

The original session recorded cli_version: 0.122.0; editing the copied
fixture header to 0.125.0 did not change the failure, so this does not
appear to be caused only by the recorded version stamp.
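
The pass/fail boundary between 52 MiB and 56 MiB could be narrowed further with a simple bisection over fixture size. The sketch below assumes the failure is monotonic in size (every size above the threshold fails), which the table is consistent with but does not prove. `bisect_threshold` and the `passes` callback are hypothetical names; a real `passes` would build a fixture of the given size and run the remote resume against it.

```python
from typing import Callable, Tuple


def bisect_threshold(
    passes: Callable[[int], bool], lo: int, hi: int, step: int = 1
) -> Tuple[int, int]:
    """Narrow (largest passing size, smallest failing size) to within `step`.

    Requires passes(lo) to be True and passes(hi) to be False.
    """
    if not passes(lo):
        raise ValueError("lower bound must pass")
    if passes(hi):
        raise ValueError("upper bound must fail")
    while hi - lo > step:
        mid = (lo + hi) // 2
        if passes(mid):
            lo = mid   # threshold is above mid
        else:
            hi = mid   # threshold is at or below mid
    return lo, hi


# Stand-in predicate with an arbitrary cutoff, for illustration only.
print(bisect_threshold(lambda n: n < 1000, 0, 4096))  # (999, 1000)
```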

What is the expected behavior?

codex --remote <ws> resume <thread-id> should handle large saved
sessions on parity with plain codex resume <thread-id>.

If a large remote resume cannot be supported, the TUI should report a
specific cause, such as a response-size limit, timeout, decode failure,
or transport close reason. The current output hides the actionable cause.
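
To illustrate the shape of output being asked for, here is a small Python sketch that prints an error together with its chain of underlying causes. It is not Codex code, and the inner message used in the example is a made-up cause, not a diagnosis of this bug.

```python
def format_cause_chain(exc: BaseException) -> str:
    """Render an exception followed by its explicit causes (`raise ... from ...`)."""
    lines = [f"Error: {exc}"]
    seen = {id(exc)}
    cause = exc.__cause__
    index = 0
    while cause is not None and id(cause) not in seen:
        if index == 0:
            lines.append("")
            lines.append("Caused by:")
        lines.append(f"    {index}: {cause}")
        seen.add(id(cause))   # guard against accidental cycles
        cause = cause.__cause__
        index += 1
    return "\n".join(lines)


try:
    try:
        # Hypothetical low-level failure, for illustration only.
        raise ValueError("message exceeds size limit")
    except ValueError as inner:
        raise RuntimeError("Failed to resume session") from inner
except RuntimeError as outer:
    print(format_cause_chain(outer))
```

With output of this kind, the user-facing error would name the layer that failed rather than only the top-level message.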

Additional information

Operator workaround:

codex resume <thread-id>

That bypasses --remote, so it is usable for manual recovery but does
not work for tools that need the remote app-server transport.

Labels: TUI, bug