Several open Codex issues appear to be different symptoms of the same larger reliability problem:
Codex Desktop allows local session/turn state to become unbounded, then places that state on hot UI, resume, context assembly, IPC, and active-turn ownership paths. Once a thread is long-running, tool-heavy, image-heavy, repeatedly compacted, or has experienced renderer/restart/interrupt recovery, the app can freeze, become extremely slow, overfill context, lose trace/progress visibility, fail Stop/cancel, or continue backend work while the UI appears stuck in Thinking.
I do not think every linked issue is the same single code bug. The clearer framing is one parent/meta bug with several fix surfaces:
unbounded persisted rollout/session records;
full-history hydration into app-server/client/renderer state;
context replay/compaction retaining large tool or image payloads;
renderer/app-server turn ownership and trace-stream rehydration losing authority;
insufficient diagnostics, because very different waits all collapse into Thinking, Working, or Reconnecting.
Evidence from related open reports
1. Large local rollout/history hydration makes Desktop slow, frozen, or unrecoverable
App UI gets extremely slow and laggy during long sessions #11984 is the long-running umbrella report for Desktop UI lag during long sessions. Later comments tie the problem to full history hydration, renderer CPU/RSS, and large IPC/app-server payloads rather than only DOM rendering.
4. Active turn ownership and renderer/session state can desynchronize
Codex Desktop accepts prompt but UI stays stuck in Thinking; Stop fails and turn can become invisible after restart #24287 reports the dangerous control-plane symptom: prompt accepted, UI stuck in Thinking, Stop fails or is misleading, progress traces disappear, and backend usage can continue decreasing while no activity is visible. It also reports multi-window rehydration making already-visible traces disappear and state disagreements between the goal bar, prompt box, and chat/trace area.
Codex Desktop: composer submit times out after stale conversation state accumulates; restart clears it #23644 reports composer submit timing out after stale conversation state accumulated over several days. Local app-state snapshots included pending_request_count=20, thread_count_active=15, thread_count_streaming_owner=6, thread_count_streaming_without_active_runtime=13, item_count_total_loaded=22550, and about 201 MB of estimated delta bytes. Restart cleared the issue.
5. Transport/first-output stalls are adjacent and currently indistinguishable in the UI
Some reports may not share the same local-history root cause, but they matter because the UI collapses them into the same stuck state and recovery paths can interact with renderer/session state:
Codex Desktop: gpt-5.5 xhigh turn stalled 30m before first output, then resumed normally #24260 reports a gpt-5.5 xhigh turn that was accepted immediately but took 30m38s to produce its first persisted reasoning item. Later comments include responses_http idle spans of hundreds of seconds and a packet-capture case where a reused connection was reset but recovery waited for the 300s stream idle timeout.
These may require separate transport/request watchdog fixes, but the Desktop issue should still distinguish them from renderer detachment, post-tool continuation stalls, and full-history hydration.
Suspected root cause family
The common design problem seems to be that "thread state" is doing too many jobs at once:
durable audit log;
UI transcript;
context replay source;
app-server resume payload;
renderer live state;
trace/progress stream state;
tool output/image artifact store;
recovery source after restart or renderer reload.
When those roles are all served by unbounded JSONL and large in-memory turn arrays, one oversized or inconsistent thread can poison many surfaces.
Requested fixes / invariants
Bound persisted session records
Cap persisted function_call_output, custom tool output, event_msg payloads, and InputText/image fields before they enter rollout JSONL.
Do not persist raw data:image / base64 image payloads in normal rollout records or compacted.payload.replacement_history; use file/blob references, hashes, or placeholders.
Avoid duplicating the same image/tool payload across both response_item and event_msg.
Add rollout size warnings and automatic safe repair/export paths.
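The bounding rules above can be sketched as a filter applied to each record before it is written. This is a minimal illustration, not Codex's actual writer: the cap value, the placeholder format, and the helper names (`bound_text`, `bound_record`, `to_rollout_line`) are all assumptions, and the record shape is only loosely modelled on the field names mentioned in this report.

```python
import hashlib
import json
import re

MAX_OUTPUT_CHARS = 4000  # hypothetical cap, not a real Codex setting
DATA_IMAGE_RE = re.compile(r"data:image/[a-zA-Z0-9.+-]+;base64,[A-Za-z0-9+/=]+")


def _image_placeholder(match: re.Match) -> str:
    """Replace an inline data URI with a short hash-based placeholder."""
    payload = match.group(0)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    return f"[image sha256:{digest} chars:{len(payload)}]"


def bound_text(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Strip inline images first, then truncate whatever is still too long."""
    text = DATA_IMAGE_RE.sub(_image_placeholder, text)
    if len(text) <= limit:
        return text
    dropped = len(text) - limit
    return text[:limit] + f"[truncated {dropped} chars]"


def bound_record(value):
    """Recursively bound every string inside a JSON-like record."""
    if isinstance(value, str):
        return bound_text(value)
    if isinstance(value, list):
        return [bound_record(v) for v in value]
    if isinstance(value, dict):
        return {k: bound_record(v) for k, v in value.items()}
    return value


def to_rollout_line(record: dict) -> str:
    """Serialize one bounded record as a single JSONL line."""
    return json.dumps(bound_record(record), separators=(",", ":"))
```

Hashing the payload rather than dropping it keeps duplicates detectable, so the same image referenced from both response_item and event_msg would map to the same placeholder.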
Make thread loading lazy and paged
thread/read, thread/resume, thread/turns/list, stream-state snapshots, and sidebar state should have hard byte/count caps.
Opening a thread should return metadata and a recent bounded tail, for example the most recent 100-200 messages or a small byte cap.
Older turns, heavy tool outputs, and images should load only on scroll/expand.
Transcript rendering should be virtualized, but virtualization alone is insufficient if full history is still hydrated into renderer/app-server state.
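A bounded tail read might look like the following sketch. It assumes the rollout is a newline-delimited JSON file; the function name `read_tail` and both default caps are hypothetical. The point is that the cost of opening a thread depends on the caps, not on the file size.

```python
import json
import os


def read_tail(path: str, max_records: int = 200, max_bytes: int = 1_000_000) -> list:
    """Return at most max_records records parsed from the last max_bytes of a JSONL file."""
    size = os.path.getsize(path)
    start = max(0, size - max_bytes)
    with open(path, "rb") as f:
        f.seek(start)
        chunk = f.read()
    lines = chunk.split(b"\n")
    if start > 0:
        # The first piece may be a partial line that began before the window,
        # so it is dropped conservatively.
        lines = lines[1:]
    records = []
    for raw in lines:
        if not raw.strip():
            continue
        try:
            records.append(json.loads(raw))
        except ValueError:
            # Skip a malformed or undecodable line instead of failing the thread.
            continue
    return records[-max_records:]
```

A single line larger than `max_bytes` yields no records in this sketch; a real loader would need a placeholder for such oversized items so the user can still see that something exists there.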
Separate context assembly from durable transcript history
Context compaction should summarize or reference old heavy content instead of replaying raw tool outputs/images.
Approval transcript or retry transcript injection should be bounded and deduplicated.
Context/token accounting should expose whether growth came from visible user text, hidden tool output, compaction replacement history, approval transcript injection, or old transcript replay.
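The accounting request above amounts to tagging each context item with its origin and summing per origin. The sketch below assumes each item already carries a `source` label and a `tokens` count; both field names and the label values are hypothetical.

```python
from collections import Counter


def attribute_context(items: list) -> list:
    """Sum token counts per source and return (source, tokens) pairs, largest first."""
    totals = Counter()
    for item in items:
        # Items without a label are grouped under "unknown" rather than dropped.
        totals[item.get("source", "unknown")] += item.get("tokens", 0)
    return totals.most_common()
```

Example use, with made-up numbers:

```python
breakdown = attribute_context([
    {"source": "tool_output", "tokens": 40000},
    {"source": "user_text", "tokens": 500},
    {"source": "tool_output", "tokens": 37000},
])
# breakdown[0] is ("tool_output", 77000)
```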
Make active-turn ownership durable and recoverable
Persist a turn record and its originating thread before streaming begins.
Persist accepted user prompts before upstream/network submission.
On renderer reload/restart, reattach by durable turn id and event cursor, not just in-memory renderer state.
Stop/cancel should resolve to an explicit state: cancelled, still running remotely, already completed, failed to cancel, or unknown/detached.
If backend active turns exist without renderer ownership, show a recovery banner and reattach option.
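One way to express the explicit Stop/cancel outcomes listed above is an enum resolved from the durable turn record plus the backend's reply. This is a sketch under assumptions: `TurnRecord`, its `status` values, and the three-valued `cancel_acked` argument are invented for illustration and do not describe Codex internals.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StopResult(Enum):
    CANCELLED = "cancelled"
    STILL_RUNNING_REMOTELY = "still running remotely"
    ALREADY_COMPLETED = "already completed"
    FAILED_TO_CANCEL = "failed to cancel"
    UNKNOWN_DETACHED = "unknown/detached"


@dataclass
class TurnRecord:
    turn_id: str
    thread_id: str
    status: str  # hypothetical values: "running", "completed", "failed", "interrupted"
    event_cursor: int = 0


TERMINAL_STATUSES = ("completed", "failed", "interrupted")


def resolve_stop(record: Optional[TurnRecord], cancel_acked: Optional[bool]) -> StopResult:
    """Map a Stop request to one explicit outcome.

    cancel_acked: True if the backend confirmed the cancel, False if it
    refused, None if no reply arrived.
    """
    if record is None:
        # No durable record to reattach to, so the UI cannot claim anything.
        return StopResult.UNKNOWN_DETACHED
    if record.status in TERMINAL_STATUSES:
        return StopResult.ALREADY_COMPLETED
    if cancel_acked is True:
        return StopResult.CANCELLED
    if cancel_acked is False:
        return StopResult.FAILED_TO_CANCEL
    # The turn is recorded as running and the backend did not answer.
    return StopResult.STILL_RUNNING_REMOTELY
```

Because the function takes the persisted record rather than renderer state, the same resolution would work after a renderer reload.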
Make resume/reconciliation authoritative
A terminal backend turn must not rehydrate as markedStreaming=true.
If a renderer receives item deltas for an unknown conversation, it should buffer briefly and force a thread/turn re-read instead of dropping state and spamming Item not found in turn state.
On resume, reconcile the full item id set for the active/recent turn before applying live deltas.
Orphan task_started turns should be repaired as interrupted/failed during resume rather than crashing LocalConversationPage.
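The orphan-turn repair could be sketched as a pass over the event list that closes any task_started left without a terminal event. The event shape (`type`, `turn_id`, `status`, `synthetic`) is an assumption, as is the premise that turns within one thread do not overlap.

```python
def _synthetic_terminal(turn_id):
    """Build a terminal event marking a turn as interrupted (hypothetical shape)."""
    return {
        "type": "task_complete",
        "turn_id": turn_id,
        "status": "interrupted",
        "synthetic": True,
    }


def repair_orphan_turns(events: list) -> list:
    """Return a new list in which every task_started has a matching terminal event."""
    repaired = []
    turn_open = False
    open_turn_id = None
    for event in events:
        kind = event.get("type")
        if kind == "task_started":
            if turn_open:
                # The previous turn never finished: close it before the next starts.
                repaired.append(_synthetic_terminal(open_turn_id))
            turn_open = True
            open_turn_id = event.get("turn_id")
        elif kind == "task_complete":
            turn_open = False
        repaired.append(event)
    if turn_open:
        # The log ends inside a turn, for example after a crash or force quit.
        repaired.append(_synthetic_terminal(open_turn_id))
    return repaired
```

The input list is left untouched, so the repaired view can be used for resume while the original log stays available for export or diagnosis.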
Improve diagnostics
Add phase-specific timing/log events for: request accepted, upstream request sent, response headers received, first byte, first Responses event, first assistant/reasoning/tool item, tool result returned, post-tool continuation requested, context compaction started/completed, renderer attached/detached, and Stop/cancel result.
Preserve enough local diagnostics to tell whether a stuck turn is model/backend stall, transport retry, context compaction, app-server hydration, renderer detachment, or post-tool continuation loss.
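With phase events like those listed above, a stuck turn can be labelled by the last phase it reached instead of a generic Thinking. The phase names, labels, and threshold below are hypothetical. The sketch only shows that the last recorded phase plus an idle time is enough to tell several of these waits apart.

```python
WAITING_FOR = {
    "request_accepted": "upstream request not sent",
    "upstream_request_sent": "waiting for response headers",
    "response_headers_received": "waiting for first byte",
    "first_byte": "waiting for first Responses event",
    "first_responses_event": "waiting for first assistant/reasoning/tool item",
    "tool_result_returned": "post-tool continuation not requested",
    "post_tool_continuation_requested": "waiting for post-tool output",
    "context_compaction_started": "context compaction in progress",
    "renderer_detached": "renderer detached from active turn",
}


def describe_stall(events: list, now: float, threshold_s: float = 30.0):
    """Describe what a turn is waiting for, or return None if it is not stalled.

    events: (phase, timestamp_seconds) pairs in arrival order.
    """
    if not events:
        return "no phase events recorded"
    phase, timestamp = events[-1]
    idle = now - timestamp
    if idle < threshold_s:
        return None
    label = WAITING_FOR.get(phase, f"idle after {phase}")
    return f"{label} ({idle:.0f}s since {phase})"
```

For example, a turn whose last event is `upstream_request_sent` and a turn whose last event is `tool_result_returned` would get different labels, even though both currently show the same stuck state.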
Individual reports often look like separate bugs because the immediate symptom differs: slow thread switching, V8 string crash, image-heavy freeze, compaction memory spike, stuck Thinking, invisible active turn, failed Stop, composer timeout, missing tool traces, or CLI/extension first-output stalls.
But the recurring evidence points to one architectural boundary that needs a coordinated fix: Codex should treat durable history, model context, renderer transcript, live turn stream, and recovery state as separate bounded contracts. Until those contracts are bounded and authoritative, fixes to only one surface, such as rendering virtualization or a single timeout tweak, are likely to leave other variants open.
What steps can reproduce the bug?
User-visible failure modes
Desktop becomes slow or unresponsive when opening a long thread, switching threads, sending a new prompt in an old thread, or after repeated context compactions.
The app can become unrecoverable on launch if the most recent session auto-resumes an oversized rollout.
A short follow-up in an old session can freeze the UI even though a fresh session still works.
Tool outputs can return successfully, but the assistant continuation never resumes.
A prompt can be accepted and backend work can continue, while the Desktop UI stays stuck in Thinking and no progress traces appear.
Stop/cancel can become unavailable, misleading, or ineffective because the UI has lost the active turn reference.
After restart or renderer reload, prompts/traces/tool calls can be missing, stale, or only partially recovered.
The same root state bloat can also drive fast context growth and usage drain through retained tool output, compaction history, and replayed approval/diagnostic transcripts.
What is the expected behavior?
Expected behavior
Opening or switching to a thread should load metadata plus a bounded recent tail, not the entire transcript/tool/image history.
Sending a short prompt in an old thread should not synchronously parse, render, serialize, or replay hundreds of MB of session data.
Context compaction should reduce active prompt pressure and should not retain raw large tool outputs, image data, or repeated transcript-injection blocks.
Rollout JSONL should not store raw image bytes or unbounded tool output inline when references, caps, summaries, or external artifacts would work.
If one thread is too large or malformed, Desktop should fail that thread safely while leaving the rest of the app usable.
If a prompt is accepted, the originating thread should durably show the user prompt, trace/progress stream, and Stop/cancel state.
If the renderer loses ownership of an active backend turn, it should show an explicit recovery/reattach state rather than generic Thinking.
Terminal backend states should be authoritative: completed, failed, interrupted, or cancelled turns should not rehydrate as streaming.
Tool-returned, waiting-for-first-output, reconnecting, context-compacting, renderer-detached, and post-tool-continuation-stalled states should be distinguishable in UI and logs.
What version of the Codex App are you using (From “About Codex” dialog)?
26.527.60818
What subscription do you have?
pro
What platform is your computer?
Microsoft Windows NT 10.0.19045.0
What issue are you seeing?
Codex Desktop meta-bug: unbounded session/turn state causes freezes, context bloat, and lost active-turn control
Suggested labels
bug, app, session, context, tool-calls, performance, app-server
Summary
Several open Codex issues appear to be different symptoms of the same larger reliability problem:
Codex Desktop allows local session/turn state to become unbounded, then places that state on hot UI, resume, context assembly, IPC, and active-turn ownership paths. Once a thread is long-running, tool-heavy, image-heavy, repeatedly compacted, or has experienced renderer/restart/interrupt recovery, the app can freeze, become extremely slow, overfill context, lose trace/progress visibility, fail Stop/cancel, or continue backend work while the UI appears stuck in Thinking.
I do not think every linked issue is the same single code bug. The clearer framing is one parent/meta bug with several fix surfaces:
unbounded persisted rollout/session records;
full-history hydration into app-server/client/renderer state;
context replay/compaction retaining large tool or image payloads;
renderer/app-server turn ownership and trace-stream rehydration losing authority;
insufficient diagnostics, because very different waits all collapse into Thinking, Working, or Reconnecting.
Evidence from related open reports
1. Large local rollout/history hydration makes Desktop slow, frozen, or unrecoverable
[...] 455 MiB, 507 MiB, 561 MiB, 584 MiB, and 1867 MiB; operations such as resume/read/unsubscribe could take tens of seconds, with worst cases around 82s.
[...] 718.6 MB rollout JSONL with 3,305 lines. After force quit, every launch auto-resumed the oversized session and froze. Renderer CPU was around 140% of one core and working-set memory grew by about 1.177 GB in 10s.
[...] 500 MB or larger. The strongest freeze occurred not merely when opening a large conversation, but when sending a new prompt in an already-open large thread.
[...] 5s, and pruning old local JSONL session files is reported as a workaround.
2. Hard size limits and inline payloads expose the same design failure
RangeError: Invalid string length when loading sessions whose rollout JSONL exceeds V8's max string length #22004 reports a reproducible Electron main-process crash, RangeError: Invalid string length, when loading sessions whose rollout JSONL exceeds V8's max string length. Reported sizes include 506.8 MB, 786.7 MB, 963.5 MB, 1050 MB, and 1601 MB. Related comments report an 825 MB rollout producing a dropped 405 MB IPC payload, and a macOS 1.37 GB rollout where app-server grew to 6-8 GB RSS.
[...] 512 MB crash threshold. A 182.8 MB image-heavy rollout had 41,361 lines, a max JSONL line of 6,850,815 chars, and many inline image_url/payload.images records. A newer repro was only 102.44 MB, but almost the whole file was data:image payloads; a compacted record alone was 24.207 MB with 16 image references.
[...] 137,293 / 258,400 tokens after a small number of visible interactions, and an earlier diagnostic reached 234,757 / 258,400, then re-inflated after compaction. Several retained tool outputs were around 37K-40K characters each.
3. Context compaction/replay can amplify the same state bloat
[...] context_compacted events and approval transcript injection. The report found repeated approval assessment/TRANSCRIPT DELTA blocks containing prior tool calls, tool outputs, retry reasons, and planned actions.
[...] 815,104; the final token-count event reached 258,400 / 258,400.
4. Active turn ownership and renderer/session state can desynchronize
Codex Desktop accepts prompt but UI stays stuck in Thinking; Stop fails and turn can become invisible after restart #24287 reports the dangerous control-plane symptom: prompt accepted, UI stuck in Thinking, Stop fails or is misleading, progress traces disappear, and backend usage can continue decreasing while no activity is visible. It also reports multi-window rehydration making already-visible traces disappear and state disagreements between the goal bar, prompt box, and chat/trace area.
[...] pwd and rg --files returned successfully almost instantly, then no assistant message, no new tool call, and no task completion occurred for about 6m59s until manual interrupt.
[...] markedStreaming=true, Received turn/started for unknown conversation, and 5,466 Item not found in turn state errors in one Desktop log.
Codex Desktop: composer submit times out after stale conversation state accumulates; restart clears it #23644 reports composer submit timing out after stale conversation state accumulated over several days. Local app-state snapshots included pending_request_count=20, thread_count_active=15, thread_count_streaming_owner=6, thread_count_streaming_without_active_runtime=13, item_count_total_loaded=22550, and about 201 MB of estimated delta bytes. Restart cleared the issue.
[...] task_started without task_complete poisoning reopen/resume. The rollout parsed cleanly, but balancing the orphan turn with a synthetic completion made the lifecycle valid again. This suggests interrupted/failed turns need durable terminal state or tolerant resume logic.
[...] task_started without task_complete, tool outputs returned without assistant continuation, and long non-image sessions with multiple compactions and unfinished turns.
5. Transport/first-output stalls are adjacent and currently indistinguishable in the UI
Some reports may not share the same local-history root cause, but they matter because the UI collapses them into the same stuck state and recovery paths can interact with renderer/session state:
Codex Desktop: gpt-5.5 xhigh turn stalled 30m before first output, then resumed normally #24260 reports a gpt-5.5 xhigh turn accepted immediately, then 30m38s before the first persisted reasoning item. Later comments include responses_http idle spans of hundreds of seconds and a packet-capture case where a reused connection was reset but recovery waited for the 300s stream idle timeout.
[...] Thinking for minutes even for simple prompts.
[...] Working hangs and reconnect retries; interrupting and resubmitting the same prompt can run normally.
These may require separate transport/request watchdog fixes, but the Desktop issue should still distinguish them from renderer detachment, post-tool continuation stalls, and full-history hydration.
Related issues
Large history / full hydration:
Hard size limits / image or tool payloads:
RangeError: Invalid string length when loading sessions whose rollout JSONL exceeds V8's max string length #22004
Context compaction / context replay / usage amplification:
Turn lifecycle / UI ownership / trace and Stop desync:
Adjacent transport / first-output stall reports that should be distinguished in diagnostics:
Potentially adjacent Desktop lifecycle reports:
Why this should be tracked as a meta issue
Individual reports often look like separate bugs because the immediate symptom differs: slow thread switching, V8 string crash, image-heavy freeze, compaction memory spike, stuck Thinking, invisible active turn, failed Stop, composer timeout, missing tool traces, or CLI/extension first-output stalls.
But the recurring evidence points to one architectural boundary that needs a coordinated fix: Codex should treat durable history, model context, renderer transcript, live turn stream, and recovery state as separate bounded contracts. Until those contracts are bounded and authoritative, fixes to only one surface, such as rendering virtualization or a single timeout tweak, are likely to leave other variants open.
Additional information
No response