
Session-level model client #10664

Merged
pakrym-oai merged 7 commits into main from pakrym/session_level_client
Feb 5, 2026

Conversation

@pakrym-oai
Collaborator

@pakrym-oai pakrym-oai commented Feb 4, 2026

Make ModelClient a session-scoped object.
Move session-level state onto the client, and make per-turn state explicit on the corresponding methods.
Stop taking a huge Config object; pass in only the values that are actually needed.

@pakrym-oai pakrym-oai changed the title from "Restore client session with stream state encapsulation" to "Session-level model client" Feb 4, 2026
pakrym-oai and others added 5 commits February 4, 2026 13:03
Add rustdoc for ModelClient/ModelClientSession describing ownership, the
sticky-routing contract (x-codex-turn-state) and session-scoped
WebSocket fallback behavior.

Document why x-codex-beta-features is precomputed at session
creation and add brief call-site notes about reusing
ModelClientSession across retries.

No functional changes.
@joshka-oai
Collaborator

joshka-oai commented Feb 4, 2026

Problem

Opening the Responses WebSocket connection at turn start adds avoidable latency to the first model
request. Today, the model client is created/dropped per turn and mixes session- and turn-scoped
state, which makes it hard to (a) pre-establish a connection on session creation and (b) reason
about which knobs are “sticky” across turns.

Mental model

  • ModelClient is session-scoped: it owns stable provider/auth/conversation configuration and
    session-wide transport fallback state.
  • ModelClientSession is turn-scoped: it streams one (or more) Responses requests within a single
    turn and caches per-turn state (sticky routing token, incremental append tracking, and a lazily
    established WebSocket connection).
  • Per-turn settings (model selection, reasoning controls, web search eligibility, turn metadata,
    telemetry context) are passed explicitly at the call site.
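
The ownership split above can be sketched roughly as follows. This is a simplified illustration, not the actual codex-core code: the `SessionState` struct, its fields, the `requests_sent` counter, and the string returned by `stream` are all assumptions made to keep the example self-contained. Only the type names `ModelClient` and `ModelClientSession` come from the PR.

```rust
use std::sync::Arc;

// Illustrative only: these fields are assumptions, not the real codex-core layout.
#[derive(Debug)]
struct SessionState {
    provider: String,
    conversation_id: String,
}

/// Session-scoped: cheap to clone, shared by every turn in the session.
#[derive(Clone, Debug)]
struct ModelClient {
    state: Arc<SessionState>,
}

/// Turn-scoped: created at turn start, dropped at turn end.
#[derive(Debug)]
struct ModelClientSession {
    client: ModelClient,
    requests_sent: u32, // hypothetical stand-in for per-turn caches
}

impl ModelClient {
    fn new(provider: &str, conversation_id: &str) -> Self {
        Self {
            state: Arc::new(SessionState {
                provider: provider.to_string(),
                conversation_id: conversation_id.to_string(),
            }),
        }
    }

    /// Each turn gets a fresh session that starts with empty per-turn state.
    fn new_session(&self) -> ModelClientSession {
        ModelClientSession { client: self.clone(), requests_sent: 0 }
    }
}

impl ModelClientSession {
    /// Per-turn settings (here only `model`) are arguments, not hidden config.
    fn stream(&mut self, model: &str, prompt: &str) -> String {
        self.requests_sent += 1;
        let state = &self.client.state;
        format!(
            "{}/{} #{} {}: {}",
            state.provider, state.conversation_id, self.requests_sent, model, prompt
        )
    }
}
```

In this sketch a retry within a turn reuses the same `ModelClientSession` (the counter keeps increasing), while a new turn calls `new_session()` again and starts from zero.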

Non-goals

  • No behavioral change to request/response semantics or retry logic.
  • This PR does not implement preconnection yet; it restructures ownership so a follow-up can open
    the connection at session creation and reuse it when the first turn begins.

Tradeoffs

  • More explicit parameters on streaming/unary calls (less hidden coupling to Config, more
    call-site verbosity).
  • Session-scoped feature gating means toggles captured at session creation won’t change mid-session
    (if we ever support that).
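
A small sketch of what "captured at session creation" means in practice. The header name `x-codex-beta-features` is from this PR; the `SessionClient` type, the comma-joined value format, and the feature names are hypothetical and chosen only for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical session-scoped client that snapshots feature toggles once.
struct SessionClient {
    // Precomputed value for the x-codex-beta-features header, if any.
    beta_features_header: Option<String>,
}

impl SessionClient {
    fn new(enabled: &BTreeSet<&str>) -> Self {
        // Assumption: the header is a comma-joined list; the real format may differ.
        let beta_features_header = if enabled.is_empty() {
            None
        } else {
            Some(enabled.iter().copied().collect::<Vec<&str>>().join(","))
        };
        Self { beta_features_header }
    }

    /// Every request in the session sees the same snapshot.
    fn beta_features_header(&self) -> Option<&str> {
        self.beta_features_header.as_deref()
    }
}
```

Because the value is an owned snapshot, toggling a feature after `SessionClient::new` has no effect on that session, which is exactly the staleness tradeoff described above.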

Architecture

  • Store a session-scoped ModelClient in SessionServices; create a fresh ModelClientSession per
    turn.
  • Move transport fallback state into ModelClient session state (removing TransportManager).
  • Thread per-turn values into ModelClientSession::stream(...) and into unary helpers (compaction,
    memory trace summarize).
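
One way to picture transport fallback state living on the session-scoped client: a shared flag that any turn can set and every later turn observes. This is a hedged sketch, not the PR's implementation. The `Transport` enum, the `AtomicBool` flag, the method names, and the choice of HTTP as the fallback target are assumptions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Transport {
    WebSocket,
    Http, // assumed fallback transport for this sketch
}

/// Session-scoped client: owns the session-wide fallback flag.
struct ModelClient {
    websocket_disabled: Arc<AtomicBool>,
}

/// Turn-scoped session: shares the flag with its parent client.
struct ModelClientSession {
    websocket_disabled: Arc<AtomicBool>,
}

/// Stand-in for the real SessionServices, which holds many other services.
struct SessionServices {
    model_client: ModelClient,
}

impl SessionServices {
    fn new() -> Self {
        let model_client = ModelClient {
            websocket_disabled: Arc::new(AtomicBool::new(false)),
        };
        Self { model_client }
    }

    /// Called once per turn: the client is reused, the session is fresh.
    fn begin_turn(&self) -> ModelClientSession {
        self.model_client.new_session()
    }
}

impl ModelClient {
    fn new_session(&self) -> ModelClientSession {
        ModelClientSession {
            websocket_disabled: Arc::clone(&self.websocket_disabled),
        }
    }
}

impl ModelClientSession {
    fn transport(&self) -> Transport {
        if self.websocket_disabled.load(Ordering::Relaxed) {
            Transport::Http
        } else {
            Transport::WebSocket
        }
    }

    /// Once a turn gives up on WebSocket, later turns skip it too.
    fn report_websocket_failure(&self) {
        self.websocket_disabled.store(true, Ordering::Relaxed);
    }
}
```

Keeping the flag behind an `Arc` on the client means no separate manager object is needed to carry the decision from one turn to the next.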

Observability

  • Streaming/unary calls take an OtelManager so telemetry stays tied to the active turn/model.
  • Preserve sticky routing via x-codex-turn-state (cached inside ModelClientSession).
  • Preserve timing metrics header on the WebSocket handshake when enabled.
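
The sticky-routing behavior can be sketched like this. The header name `x-codex-turn-state` is from this PR; the `TurnSession` type, its methods, and the "first token wins for the rest of the turn" rule are assumptions for illustration.

```rust
use std::collections::HashMap;

const TURN_STATE_HEADER: &str = "x-codex-turn-state";

/// Hypothetical turn-scoped session that caches the routing token.
#[derive(Default)]
struct TurnSession {
    turn_state: Option<String>,
}

impl TurnSession {
    /// Headers to attach to the next request in this turn.
    fn request_headers(&self) -> HashMap<String, String> {
        let mut headers = HashMap::new();
        if let Some(token) = &self.turn_state {
            headers.insert(TURN_STATE_HEADER.to_string(), token.clone());
        }
        headers
    }

    /// Record the token from a response. Assumption: the first token sticks.
    fn observe_response(&mut self, headers: &HashMap<String, String>) {
        if self.turn_state.is_none() {
            if let Some(token) = headers.get(TURN_STATE_HEADER) {
                self.turn_state = Some(token.clone());
            }
        }
    }
}
```

Because the token lives on the turn-scoped object, retries within a turn replay it automatically, and a new turn starts without one.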

Tests

  • Core suite updated for the new ModelClient / ModelClientSession shape.
  • cargo test -p codex-core (note: some integration tests spawn workspace binaries and require
    CARGO_BIN_EXE_codex + CARGO_BIN_EXE_test_stdio_server when running under Cargo).

Doc Changes

  • codex-rs/core/src/client.rs:1 adds module/type/method rustdoc clarifying session vs turn scope, sticky routing contract, and session-wide WebSocket fallback.
  • codex-rs/core/src/codex.rs:681 documents why x-codex-beta-features is precomputed for the session-scoped client.
  • codex-rs/core/src/codex.rs:3577 and codex-rs/core/src/compact.rs:94 add small call-site notes about reusing a single ModelClientSession across retries within a turn.
  • codex-rs/core/src/memory_trace.rs:34 and codex-rs/core/src/state/service.rs:38 add brief docs about explicit per-turn inputs and session-scoped ownership.

Risks / Inconsistencies

  • Session-scoped capture of feature-derived settings (including the beta-features header) can go stale if features are ever meant to change after session creation.
  • WebSocket handshake headers currently carry per-turn concerns (web_search_eligible, turn_metadata); preconnection before turn start will need a clear default/contract.
  • The per-turn parameter list is long; a follow-up “turn request context” struct may improve ergonomics once the design settles.
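
As a rough idea of what the suggested "turn request context" struct might look like: the sketch below groups a few of the per-turn values named in this PR into one type. The struct name, field names, field types, and both function names are hypothetical. No such struct exists in this PR.

```rust
/// Hypothetical bundle of per-turn inputs (not part of this PR).
#[derive(Debug, Clone, PartialEq)]
struct TurnRequestContext {
    model: String,
    reasoning_effort: Option<String>,
    web_search_eligible: bool,
    turn_metadata: Option<String>,
}

/// Current shape: every per-turn value is its own argument.
fn stream_with_long_list(
    prompt: &str,
    model: &str,
    reasoning_effort: Option<&str>,
    web_search_eligible: bool,
    turn_metadata: Option<&str>,
) -> String {
    // Returns a description instead of performing a real request.
    format!(
        "model={} effort={} web_search={} metadata={} prompt={}",
        model,
        reasoning_effort.unwrap_or("default"),
        web_search_eligible,
        turn_metadata.unwrap_or("none"),
        prompt
    )
}

/// Possible follow-up shape: one context argument carries the per-turn values.
fn stream_with_context(prompt: &str, ctx: &TurnRequestContext) -> String {
    stream_with_long_list(
        prompt,
        &ctx.model,
        ctx.reasoning_effort.as_deref(),
        ctx.web_search_eligible,
        ctx.turn_metadata.as_deref(),
    )
}
```

The per-turn values stay explicit at the call site either way; the struct only changes how many arguments each call has to spell out.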

Tests

  • cd codex-rs && just fmt
  • cd codex-rs && cargo build -p codex-cli --bin codex
  • cd codex-rs && cargo build -p codex-rmcp-client --bin test_stdio_server
  • cd codex-rs && CARGO_BIN_EXE_codex=$PWD/target/debug/codex CARGO_BIN_EXE_test_stdio_server=$PWD/target/debug/test_stdio_server cargo test -p codex-core

Collaborator

@joshka-oai joshka-oai left a comment


LGTM

@pakrym-oai pakrym-oai merged commit 0e8d359 into main Feb 5, 2026
32 checks passed
@pakrym-oai pakrym-oai deleted the pakrym/session_level_client branch February 5, 2026 00:58
@github-actions github-actions bot locked and limited conversation to collaborators Feb 5, 2026
