What version of the Codex App are you using (From "About Codex" dialog)?
26.506.31421 (bundle version 2620)
What subscription do you have?
Pro
What platform is your computer?
Darwin 25.3.0 arm64 arm
macOS 26.3 (25D125), Apple Silicon
What issue are you seeing?
The Codex Desktop app appears to have a recent regression where old and tool-heavy sessions can quickly become unusable:
- Today I hit two UI freezes / unresponsive states in Codex Desktop and had to kill the app process and restart.
- After the first kill/restart, I sent only one message in each of two older sessions, and the app became unresponsive again.
- In a fresh new session, the UI has not frozen yet, but the context meter still grows very quickly. After only a small number of visible user interactions, the session already reached about half of the model context window.
This does not look like a simple user-configuration issue. The same general configuration had been used for a long time without obvious UI freezes. The current app bundle on this machine was updated/modified locally on 2026-05-09, and the issue became obvious recently.
Local token counters confirm the context meter behavior:
- Current fresh session id: 019e150c-336e-7220-a03d-bc2e3187603c
- Current fresh session reached last_input_tokens = 137293
- Model context window: 258400
- Context usage: about 53.1%
- The thread only had a few visible interactions.
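For reference, the usage percentage follows directly from the two counters. This is a minimal sketch of the arithmetic only; the function name is illustrative and is not taken from Codex.

```python
def context_usage_percent(last_input_tokens: int, context_window: int) -> float:
    """Return the share of the context window in use, as a percentage."""
    return 100.0 * last_input_tokens / context_window


# Values reported for the fresh session in this issue.
fresh_usage = context_usage_percent(137293, 258400)
print(f"{fresh_usage:.1f}%")  # prints 53.1%
```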
An earlier same-topic diagnostic session showed a more severe version:
- Session id: 019e14f1-6997-7f52-b854-97813060dee7
- Peak observed last_input_tokens = 234757
- Model context window: 258400
- After compaction, it dropped to about 30045, then quickly re-inflated to 171023
- Total accumulated input tokens reached 5610822 during a short diagnostic thread
In both cases, the direct local evidence points to large tool outputs being retained in the transcript and replayed into later turns. In the current session, the largest retained tool outputs were approximately:
40153 characters
39910 characters
39893 characters
39440 characters
37293 characters
Those outputs were not pasted by the user; they were shell/tool results shown during diagnosis. Once retained, they appear to drive rapid context growth and may also contribute to the Electron renderer becoming unresponsive when older sessions are resumed or displayed.
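One way to find the largest retained strings in a JSONL session file is sketched below. It assumes nothing about the Codex session schema: it walks every JSON value on every line and ranks string lengths. The helper names are hypothetical, and this is not necessarily how the numbers above were collected.

```python
import json
from typing import Iterable, Iterator, List, Tuple


def iter_strings(value) -> Iterator[str]:
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for child in value.values():
            yield from iter_strings(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_strings(child)


def largest_strings(jsonl_lines: Iterable[str], top_n: int = 5) -> List[Tuple[int, int]]:
    """Return (length, line_number) pairs for the longest strings, largest first."""
    sizes = []
    for line_no, line in enumerate(jsonl_lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        for text in iter_strings(record):
            sizes.append((len(text), line_no))
    return sorted(sizes, reverse=True)[:top_n]


# Small in-memory demo; a real run would pass an open file object instead.
demo = ['{"a": "xx", "b": {"c": ["yyyy"]}}', "not json", '{"d": "z"}']
print(largest_strings(demo, top_n=2))  # prints [(4, 1), (2, 1)]
```

Passing an open session file (for example `largest_strings(open(path))`, where `path` is wherever the session JSONL lives locally) should give the line numbers of the heaviest records without relying on field names.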
What steps can reproduce the bug?
I do not have a minimal public repo yet, but the local reproduction pattern is:
- Use Codex Desktop 26.506.31421 on macOS / Apple Silicon.
- Open or create a session using gpt-5.5.
- Run several diagnostic shell commands that produce moderately large outputs, e.g. log searches, tail, rg, process listings, or JSONL session inspection. Some outputs around 37K-40K characters are enough to make the effect visible.
- Continue the conversation with a few short user messages.
- Observe that last_input_tokens and the UI context meter rise very quickly, even when the user-visible conversation is short.
- In older / already-large sessions, after killing and restarting Codex Desktop, send one message in the old session. The UI may freeze / become unresponsive again.
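Until a minimal repo exists, a synthetic command output of roughly the same size may be enough to exercise the steps above. The script below is only a stand-in for the real diagnostic commands; the line text and function name are made up for illustration, and it has not been confirmed to trigger the freeze.

```python
def synthetic_output(target_chars: int = 40_000) -> str:
    """Build newline-separated filler text of roughly target_chars characters."""
    lines = []
    total = 0
    index = 0
    while total < target_chars:
        line = f"{index:06d} synthetic diagnostic log line for context-growth testing"
        lines.append(line)
        total += len(line) + 1  # +1 accounts for the newline separator
        index += 1
    return "\n".join(lines)


if __name__ == "__main__":
    # Run this as a shell command inside a session to produce one large tool output.
    print(synthetic_output())
```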
Observed token progression in a fresh session:
27541 / 258400
42797 / 258400
60516 / 258400
68688 / 258400
106885 / 258400
115264 / 258400
137293 / 258400
Observed token progression in the previous same-topic session:
199962 / 258400
234757 / 258400
compaction or reset to about 30045 / 258400
144346 / 258400
171023 / 258400
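The per-turn growth is easier to see as differences between consecutive samples. The sketch below uses only the numbers listed above; the function name is illustrative.

```python
from typing import List


def turn_deltas(samples: List[int]) -> List[int]:
    """Return the change in last_input_tokens between consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


fresh = [27541, 42797, 60516, 68688, 106885, 115264, 137293]
previous = [199962, 234757, 30045, 144346, 171023]

print(turn_deltas(fresh))     # prints [15256, 17719, 8172, 38197, 8379, 22029]
print(turn_deltas(previous))  # prints [34795, -204712, 114301, 26677]
```

The single negative value in the second list is the compaction or reset; the jump of 114301 tokens immediately after it is the re-inflation described earlier.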
What is the expected behavior?
- Tool output should be aggressively bounded, summarized, or excluded from later prompt replay when it is too large.
- Context compaction should keep the active prompt budget under control after tool-heavy turns.
- Opening or sending a short message in an older session should not freeze the Desktop app.
- The UI context meter should not jump to half or near-full context after only a few short visible user interactions unless the app can clearly explain what hidden retained content is being counted.
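As an illustration of what "bounded" could mean in the first point, the sketch below keeps the head and tail of an oversized output and replaces the middle with a short marker. It is one possible approach under an assumed character budget, not a description of how Codex handles tool output; the function name and default limit are hypothetical.

```python
def bound_tool_output(text: str, max_chars: int = 8_000) -> str:
    """Keep the first and last parts of text and replace the middle with a marker.

    The result holds at most max_chars characters of original text plus the marker.
    """
    if len(text) <= max_chars:
        return text
    head_len = max_chars // 2
    tail_len = max_chars - head_len
    omitted = len(text) - head_len - tail_len
    marker = f"\n[... {omitted} characters omitted ...]\n"
    # Slice the tail from an explicit start index so tail_len == 0 is handled safely.
    return text[:head_len] + marker + text[len(text) - tail_len:]


# Example with the largest output size reported in this issue.
bounded = bound_tool_output("a" * 40153, max_chars=8_000)
print(len(bounded) < 8_100)  # prints True
```

Keeping both ends tends to preserve the command header and the final error or summary lines, which are usually the most useful parts of a long log.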
Additional information
Related but not identical issues found before filing:
This report is specifically for Codex Desktop 26.506.31421 on macOS, with two observed UI freezes today and numeric evidence that tool output retention/replay can push a new thread to ~53% of the context window after only a few visible interactions.
No Crashpad dump was found locally, which is consistent with a UI hang / unresponsive renderer rather than a clean crash.