Summary
CodexWrapper.Session.send/3 tries to thread session continuity across turns, but against codex-cli ≥ 0.119.0 it silently never updates session_id — so the second and subsequent turns are always dispatched as fresh sessions instead of resuming the previous one.
Discovered while fixing the separate (but related) dialyzer issue on the chore/add-dialyxir branch; see the commit "fix: Session.stream/3 returns the session unchanged". That fix was intentionally scoped around this bug, so it is filed separately here to get a dedicated runtime fix.
Root cause
Session.execute_turn/3 (first-turn branch) calls extract_session_id/1 on the result events:
defp extract_session_id(events) do
  Enum.find_value(events, fn event ->
    JsonLineEvent.get(event, "session_id")
  end)
end
But Codex 0.119+ does not emit a "session_id" field anywhere in its stream-json output. It emits the thread identifier as "thread_id" in the first thread.started event:
{"type":"thread.started","thread_id":"019d799c-09f1-7ea0-8e5a-7b42a640c103"}
{"type":"turn.started"}
{"type":"item.completed","item":{"id":"item_0","type":"agent_message","text":"pong"}}
{"type":"turn.completed","usage":{"input_tokens":13957,"output_tokens":17}}
So extract_session_id/1 always returns nil, and:
new_session_id = extract_session_id(events) || session.session_id
falls through to session.session_id, which is also nil on the first turn. The returned session has session_id: nil. The next call to Session.send/3 hits the session_id: nil branch of execute_turn/3 again and dispatches a fresh Exec command (not an ExecResume), losing all conversation history.
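The failure can be reproduced offline against the sample stream above, without invoking the CLI. This is a minimal sketch, assuming the events are decoded into plain maps with string keys; Map.get/2 stands in for JsonLineEvent.get/2, whose exact behavior is not shown in this issue, and the session is modeled as a bare map.

```elixir
# Decoded events from the sample stream, written as plain maps.
# Assumption: Map.get/2 is a stand-in for JsonLineEvent.get/2.
events = [
  %{"type" => "thread.started", "thread_id" => "019d799c-09f1-7ea0-8e5a-7b42a640c103"},
  %{"type" => "turn.started"},
  %{
    "type" => "item.completed",
    "item" => %{"id" => "item_0", "type" => "agent_message", "text" => "pong"}
  },
  %{"type" => "turn.completed", "usage" => %{"input_tokens" => 13957, "output_tokens" => 17}}
]

# Current lookup: no event carries a "session_id" key, so the scan yields nil.
by_session_id = Enum.find_value(events, fn event -> Map.get(event, "session_id") end)

# The same scan on the key that Codex 0.119+ emits finds the identifier.
by_thread_id = Enum.find_value(events, fn event -> Map.get(event, "thread_id") end)

# First-turn session state (modeled as a map): the fallback is nil as well.
session = %{session_id: nil}
new_session_id = by_session_id || session.session_id

IO.inspect(by_session_id, label: "session_id lookup")
IO.inspect(by_thread_id, label: "thread_id lookup")
IO.inspect(new_session_id, label: "new_session_id")
```

Under these assumptions the first and third values are nil and only the thread_id lookup returns the identifier, which matches the behavior described above.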
Repro
config = CodexWrapper.Config.new([])
session = CodexWrapper.Session.new(config, sandbox: :read_only, skip_git_repo_check: true)
{:ok, session, _} = CodexWrapper.Session.send(session, "Remember the number 42. Respond with 'ok'.")
IO.inspect(session.session_id) # => nil (should be the thread_id)
{:ok, _, result} = CodexWrapper.Session.send(session, "What number did I ask you to remember?")
IO.inspect(result.stdout)
# => Codex has no memory of the previous turn; it starts fresh.
Suggested fix
Update extract_session_id/1 to look for "thread_id" instead of (or in addition to) "session_id". Specifically, prefer thread_id from the thread.started event, since that's where Codex emits it:
defp extract_session_id(events) do
  Enum.find_value(events, fn event ->
    JsonLineEvent.get(event, "thread_id") || JsonLineEvent.get(event, "session_id")
  end)
end
Keeping the session_id fallback preserves compatibility with older Codex versions that may have used a different field name. The || order matters: thread_id is checked first because that is the key modern Codex emits.
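A stricter variant is possible if the preference for the thread.started event should hold across the whole stream rather than per event. The sketch below is only an illustration: ThreadIdSketch is a hypothetical module, events are assumed to be plain maps with string keys, and the legacy event shape in the usage lines is invented for the example.

```elixir
defmodule ThreadIdSketch do
  # Hypothetical module; events are assumed to be decoded maps with string keys.

  # Prefer the id on the thread.started event, then any thread_id,
  # then the legacy session_id key.
  def extract_session_id(events) do
    from_thread_started(events) ||
      first_value(events, "thread_id") ||
      first_value(events, "session_id")
  end

  # Only a thread.started event with a binary thread_id qualifies here.
  defp from_thread_started(events) do
    Enum.find_value(events, fn
      %{"type" => "thread.started", "thread_id" => id} when is_binary(id) -> id
      _other -> nil
    end)
  end

  # First truthy value stored under `key` in any event, or nil.
  defp first_value(events, key) do
    Enum.find_value(events, fn event -> Map.get(event, key) end)
  end
end

modern = [%{"type" => "thread.started", "thread_id" => "t-1"}, %{"type" => "turn.started"}]
# Invented legacy shape, used only to exercise the fallback.
legacy = [%{"type" => "init", "session_id" => "s-1"}]

IO.inspect(ThreadIdSketch.extract_session_id(modern), label: "modern")
IO.inspect(ThreadIdSketch.extract_session_id(legacy), label: "legacy")
```

The single-pass version picks the first event that has either key, so a session_id on an earlier event would win over a thread_id on a later one. The three-pass version avoids that at the cost of scanning the list up to three times, which is unlikely to matter for a single turn's events.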
Workaround / background
A GenAgent backend adapter (gen_agent_codex) built on top of this package bypasses CodexWrapper.Session entirely and tracks thread_id itself: it calls Exec.execute_json/2 on the first turn and ExecResume.execute_json/2 on subsequent turns, capturing thread_id from the thread.started event. That pattern works against 0.119+ and could be adapted here; the more minimal option is to fix the field-name lookup as above.
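The dispatch pattern described above can be sketched independently of the real modules. This is not the gen_agent_codex code: TurnDispatchSketch is a hypothetical module, the two injected functions stand in for the Exec and ExecResume calls (whose arguments are not shown in this issue), and the stubbed events, including the "resumed" key, exist only to make the example runnable.

```elixir
defmodule TurnDispatchSketch do
  # Hypothetical module. `first_turn` and `resume` are injected stand-ins for
  # the Exec and ExecResume calls; each is assumed to return {:ok, events}.

  def send_turn(state, prompt, first_turn, resume) do
    result =
      case state.thread_id do
        # No id captured yet: start a fresh thread.
        nil -> first_turn.(prompt)
        # An id is known: resume that thread.
        id -> resume.(id, prompt)
      end

    with {:ok, events} <- result do
      # Keep the existing id if this turn's events carry no thread_id.
      thread_id = Enum.find_value(events, &Map.get(&1, "thread_id")) || state.thread_id
      {:ok, %{state | thread_id: thread_id}, events}
    end
  end
end

# Stubbed runners; the event shapes are illustrative only.
first_turn = fn _prompt -> {:ok, [%{"type" => "thread.started", "thread_id" => "t-1"}]} end
resume = fn id, _prompt -> {:ok, [%{"type" => "turn.completed", "resumed" => id}]} end

{:ok, state, _events} = TurnDispatchSketch.send_turn(%{thread_id: nil}, "hi", first_turn, resume)
{:ok, state, events} = TurnDispatchSketch.send_turn(state, "again", first_turn, resume)

IO.inspect(state.thread_id, label: "thread_id after turn 2")
IO.inspect(events, label: "turn 2 events")
```

Whether a resumed turn emits thread.started again is not stated here, so the sketch keeps the previously captured id whenever a turn's events do not carry one.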
Test coverage gap
There are no tests in test/codex_wrapper/session_test.exs that exercise an actual multi-turn roundtrip against a real CLI. A live integration test that sends turn 1 → inspects captured session_id → sends turn 2 → verifies the CLI actually resumed would have caught this.