Summary
Codex Desktop appears to consume 5-hour/weekly usage while idle (~2% every few moments) when background memory generation is enabled. Local logs show a real gpt-5.4 Responses sampling request running from the memory subsystem (cwd=~/.codex/memories) while I was not actively using Codex. After changing only generate_memories = false and reloading the app, the idle usage drain stopped in my observation window and no further sampling requests appeared in local logs.
This looks closely related to #19105 and possibly #19123, but I am opening a new issue because #19105 is closed and I have before/after local diagnostics that may help narrow the behavior.
Environment
- Product: Codex Desktop app
- App version: 26.422.30944
- Platform: macOS
- Plan: ChatGPT Pro
- Model observed in the background request: gpt-5.4
Relevant config before mitigation:
[features]
memories = true
[memories]
generate_memories = true
use_memories = true
max_rollouts_per_startup = 6
min_rollout_idle_hours = 12
Current mitigation:
[features]
memories = true
[memories]
generate_memories = false
use_memories = true
max_rollouts_per_startup = 6
min_rollout_idle_hours = 12
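For anyone reproducing the mitigation, a quick way to confirm the two flags that matter is to grep them out of the config. This report does not show where the Codex config file lives, so the sketch below runs against a scratch copy of the mitigation block rather than assuming a path:

```shell
# Write the mitigation block to a scratch file and grep the two memory flags.
# In practice, point grep at your actual Codex config file instead; its
# location is not shown in this report, so no path is assumed here.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[memories]
generate_memories = false
use_memories = true
EOF
grep -E 'generate_memories|use_memories' "$cfg"
```

The expected output is the two flag lines, with generate_memories = false confirming writes are off while use_memories = true keeps reads enabled.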
What happened
While Codex Desktop was open but idle, I watched my 5-hour limit continue dropping even though I was not initiating work in Codex.
Local logs showed a background model request from the memory subsystem during that period:
start_local: <redacted>
end_local: <redacted, about 1.5 minutes later>
model: gpt-5.4
cwd: ~/.codex/memories
log context: run_sampling_request, responses_websocket, api.path="responses"
This appears to be an actual model sampling request, not UI polling or thread-list activity.
After setting generate_memories = false and reloading Codex Desktop, I checked local logs for new run_sampling_request / responses_websocket entries after reload and saw no new sampling requests during the idle observation window. My 5-hour limit also stopped dropping during that same window.
Local diagnostics used
I used queries like this against ~/.codex/logs_2.sqlite:
sqlite3 -readonly ~/.codex/logs_2.sqlite "
select
datetime(min(ts),'unixepoch','localtime') start_local,
datetime(max(ts),'unixepoch','localtime') end_local,
count(*) log_rows,
substr(feedback_log_body, instr(feedback_log_body,'model='), 20) model,
substr(feedback_log_body, instr(feedback_log_body,'cwd='), 90) cwd
from logs
where feedback_log_body like '%run_sampling_request%'
group by model, cwd
order by min(ts);
"
And, after reload/mitigation, checked for new sampling requests after a cutoff timestamp:
sqlite3 -readonly ~/.codex/logs_2.sqlite "
select
datetime(min(ts),'unixepoch','localtime') start_local,
datetime(max(ts),'unixepoch','localtime') end_local,
count(*) log_rows,
substr(feedback_log_body, instr(feedback_log_body,'model='), 20) model,
substr(feedback_log_body, instr(feedback_log_body,'cwd='), 90) cwd
from logs
where ts >= <cutoff_epoch>
and feedback_log_body like '%run_sampling_request%'
group by model, cwd
order by min(ts);
"
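One way to fill in the <cutoff_epoch> placeholder is to capture the Unix epoch at the moment of the reload, since the logs table stores ts as unixepoch (the queries above convert it with datetime(..., 'unixepoch', 'localtime')):

```shell
# Capture the reload moment as a Unix epoch and substitute the printed value
# for <cutoff_epoch> in the follow-up query.
cutoff_epoch=$(date +%s)
echo "cutoff_epoch=$cutoff_epoch"
```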
Result after mitigation (generate_memories = false, plus restarting the Codex app and app server): no rows during the idle observation window.
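As a sketch of what the "no rows" result looks like, here is the same cutoff filter run against a throwaway database with a minimal assumed schema (just ts and feedback_log_body; the real file is ~/.codex/logs_2.sqlite and its schema may differ). One pre-cutoff sampling row is inserted so the filter has something to exclude:

```shell
# Build a tiny stand-in for logs_2.sqlite with one sampling row from
# *before* the cutoff. Column names match the queries in this report;
# the full schema of the real DB is not known and is assumed here.
db=$(mktemp)
sqlite3 "$db" "create table logs (ts integer, feedback_log_body text);
  insert into logs values
    (1700000000, 'run_sampling_request model=gpt-5.4 cwd=~/.codex/memories');"

# With a cutoff later than that row, the post-mitigation filter matches
# nothing, so the count is 0.
sqlite3 -readonly "$db" \
  "select count(*) from logs
   where ts >= 1800000000
     and feedback_log_body like '%run_sampling_request%';"
# prints 0
```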
I can share more detailed sanitized log excerpts, state DB rows, timestamps, or thread/job IDs if useful.
Expected behavior (knowing this is experimental)
If memory generation is expected to consume significant usage, Codex should make that visible and controllable:
- Show when background memory generation/consolidation is running.
- Show the model/job type and estimated/actual usage impact if possible.
- Document whether generate_memories = false is the intended way to preserve memory reads while disabling background memory writes.
- Avoid silently consuming a meaningful amount of scarce 5-hour/weekly usage while the user is not actively using Codex.
Actual behavior
With memories = true, generate_memories = true, and use_memories = true, Codex Desktop appears to run background memory work from ~/.codex/memories using gpt-5.4, and the 5-hour limit can drop while the app appears idle.
Changing only generate_memories = false appears to stop the idle drain in my current observation window while keeping use_memories = true.
Why this matters
I understand that memory generation is experimental and consumes tokens. The problem is the control and visibility boundary: from the user's perspective, Codex Desktop is idle, yet 5-hour/weekly usage can still be consumed by background memory work. That makes it hard to safely leave the app open unless background generation can be explicitly disabled while preserving memory reads.
I hope the details here give someone an idea of how to balance memory generation, perhaps by using a more efficient model while still experimenting. At least for now, with generate_memories = false, I don't have to worry, and behavior seems to be back to normal.