Session.send/3 multi-turn broken on Codex 0.119+: looks for session_id, CLI emits thread_id

## Summary

`CodexWrapper.Session.send/3` tries to thread session continuity across turns, but against `codex-cli ≥ 0.119.0` it never sets `session_id`, and it raises no error. As a result, every turn after the first is dispatched as a fresh session instead of resuming the previous one.

Discovered while fixing the separate (but related) dialyzer issue on the `chore/add-dialyxir` branch; see the commit `fix: Session.stream/3 returns the session unchanged`. That fix was intentionally scoped around this bug, so I'm filing it separately to get a dedicated runtime fix.

## Root cause

`Session.execute_turn/3` (first-turn branch) calls `extract_session_id/1` on the result events:

```elixir
defp extract_session_id(events) do
  Enum.find_value(events, fn event ->
    JsonLineEvent.get(event, "session_id")
  end)
end
```

But Codex 0.119+ does not emit a `"session_id"` field anywhere in its stream-json output. It emits the thread identifier as `"thread_id"` in the first `thread.started` event:

```json
{"type":"thread.started","thread_id":"019d799c-09f1-7ea0-8e5a-7b42a640c103"}
{"type":"turn.started"}
{"type":"item.completed","item":{"id":"item_0","type":"agent_message","text":"pong"}}
{"type":"turn.completed","usage":{"input_tokens":13957,"output_tokens":17}}
```

So `extract_session_id/1` always returns `nil`, and:

```elixir
new_session_id = extract_session_id(events) || session.session_id
```

falls through to `session.session_id`, which is also `nil` on the first turn. The returned session has `session_id: nil`. The next call to `Session.send/3` hits the `session_id: nil` branch of `execute_turn/3` again and dispatches a fresh `Exec` command (not an `ExecResume`), losing all conversation history.
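
The lookup failure can be reproduced without the CLI. The sketch below is only an illustration: it models the events as plain decoded maps copied from the stream above and uses `Map.get/2` as a stand-in for `JsonLineEvent.get/2`, whose exact behaviour is assumed here rather than taken from the package.

```elixir
# Events as decoded maps, copied from the stream-json sample above.
events = [
  %{"type" => "thread.started", "thread_id" => "019d799c-09f1-7ea0-8e5a-7b42a640c103"},
  %{"type" => "turn.started"},
  %{
    "type" => "item.completed",
    "item" => %{"id" => "item_0", "type" => "agent_message", "text" => "pong"}
  },
  %{"type" => "turn.completed", "usage" => %{"input_tokens" => 13957, "output_tokens" => 17}}
]

# Stand-in for the current lookup. Map.get/2 replaces JsonLineEvent.get/2 (assumption).
lookup = fn events, key -> Enum.find_value(events, &Map.get(&1, key)) end

# No event carries "session_id", so the current lookup finds nothing.
old_id = lookup.(events, "session_id")

# The same traversal keyed on "thread_id" finds the identifier.
new_id = lookup.(events, "thread_id")

IO.inspect(old_id, label: "session_id lookup")
# prints: session_id lookup: nil
IO.inspect(new_id, label: "thread_id lookup")
# prints: thread_id lookup: "019d799c-09f1-7ea0-8e5a-7b42a640c103"
```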

## Repro

```elixir
config = CodexWrapper.Config.new([])
session = CodexWrapper.Session.new(config, sandbox: :read_only, skip_git_repo_check: true)

{:ok, session, _} = CodexWrapper.Session.send(session, "Remember the number 42. Respond with 'ok'.")
IO.inspect(session.session_id)  # => nil (should be the thread_id)

{:ok, _, result} = CodexWrapper.Session.send(session, "What number did I ask you to remember?")
IO.inspect(result.stdout)
# => Codex has no memory of the previous turn; it starts fresh.
```

## Suggested fix

Update `extract_session_id/1` to look for `"thread_id"` instead of, or in addition to, `"session_id"`. Prefer `thread_id`, since that is the field Codex emits in the `thread.started` event:

```elixir
defp extract_session_id(events) do
  Enum.find_value(events, fn event ->
    JsonLineEvent.get(event, "thread_id") || JsonLineEvent.get(event, "session_id")
  end)
end
```

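Keeping the `session_id` fallback preserves compatibility with older Codex versions that may have used a different field name. The `||` order means `thread_id` wins when a single event carries both fields. Across events, `Enum.find_value/2` still returns the value from the first event that has either field.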
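
If a stricter preference is wanted, one possible variant checks the `thread.started` event first and only then falls back to any event with either field. This is a sketch under assumptions, not the package's code: `ThreadIdSketch` and its functions are hypothetical names, and events are modelled as plain decoded maps rather than `JsonLineEvent` structs.

```elixir
defmodule ThreadIdSketch do
  @moduledoc false

  # Hypothetical helper, not part of CodexWrapper. Events are plain decoded maps.
  # Order of preference:
  #   1. "thread_id" on a "thread.started" event
  #   2. "thread_id" on any event
  #   3. "session_id" on any event (assumed legacy field name)
  def extract_thread_id(events) do
    from_thread_started(events) ||
      first_with_key(events, "thread_id") ||
      first_with_key(events, "session_id")
  end

  defp from_thread_started(events) do
    Enum.find_value(events, fn
      %{"type" => "thread.started", "thread_id" => id} when is_binary(id) -> id
      _other -> nil
    end)
  end

  defp first_with_key(events, key) do
    Enum.find_value(events, &Map.get(&1, key))
  end
end
```

With this variant, a `session_id` on an earlier event cannot shadow the `thread_id` from `thread.started`, which the single-pass version above would allow.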

## Workaround / background

A `GenAgent` backend adapter (`gen_agent_codex`) built on top of this package bypasses `CodexWrapper.Session` entirely and tracks the `thread_id` itself: it calls `Exec.execute_json/2` on the first turn and `ExecResume.execute_json/2` on subsequent turns, capturing `thread_id` from the `thread.started` event. That pattern works against 0.119+ and could be adapted here. The more minimal option is to fix the field-name lookup as above.
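
The shape of that pattern, reduced to its decision logic, might look like the following. It is a hypothetical illustration only: `TurnDispatchSketch` is not code from `gen_agent_codex`, and the tagged tuples stand in for building and running real `Exec` / `ExecResume` commands.

```elixir
defmodule TurnDispatchSketch do
  @moduledoc false

  # Hypothetical: choose the command kind from the thread id captured so far.
  # nil means no thread exists yet, so start fresh. Otherwise resume it.
  def next_command(nil, prompt), do: {:exec, prompt}

  def next_command(thread_id, prompt) when is_binary(thread_id),
    do: {:exec_resume, thread_id, prompt}

  # Keep the current id unless the events contain a "thread.started" with a new one.
  def capture_thread_id(current, events) do
    Enum.find_value(events, current, fn
      %{"type" => "thread.started", "thread_id" => id} -> id
      _other -> nil
    end)
  end
end

# First turn: no thread id yet, so a fresh exec is chosen.
first = TurnDispatchSketch.next_command(nil, "Remember the number 42.")

# After the first turn, capture the id from the emitted events.
thread_id =
  TurnDispatchSketch.capture_thread_id(nil, [
    %{"type" => "thread.started", "thread_id" => "t-1"},
    %{"type" => "turn.started"}
  ])

# Second turn: the captured id selects a resume.
second = TurnDispatchSketch.next_command(thread_id, "What number?")
```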

## Test coverage gap

There are no tests in `test/codex_wrapper/session_test.exs` that exercise an actual multi-turn roundtrip against a real CLI. A live integration test that sends turn 1 → inspects captured session_id → sends turn 2 → verifies the CLI actually resumed would have caught this.
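
A live test along those lines could be sketched as follows. It reuses only the calls from the repro above. The module name, the `:integration` tag, and the assumption that the model's reply contains "42" are illustrative choices, and it requires a working `codex` CLI on the machine running it.

```elixir
defmodule CodexWrapper.SessionIntegrationTest do
  use ExUnit.Case, async: false

  # Assumes test_helper.exs excludes this tag by default, e.g.
  # ExUnit.start(exclude: [:integration]), so it only runs via
  # `mix test --include integration`.
  @moduletag :integration

  test "second turn resumes the thread started by the first" do
    config = CodexWrapper.Config.new([])
    session = CodexWrapper.Session.new(config, sandbox: :read_only, skip_git_repo_check: true)

    {:ok, session, _result} =
      CodexWrapper.Session.send(session, "Remember the number 42. Respond with 'ok'.")

    # Turn 1 must have captured the identifier emitted by the CLI.
    assert is_binary(session.session_id)

    {:ok, _session, result} =
      CodexWrapper.Session.send(session, "What number did I ask you to remember?")

    # If the CLI actually resumed, the earlier turn is still in context.
    assert result.stdout =~ "42"
  end
end
```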
