[rollout_trace] Trace sessions and multi-agent edges#18879
[rollout_trace] Trace sessions and multi-agent edges#18879cassirer-openai wants to merge 1 commit intocodex/rollout-trace-tool-code-modefrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0689ef66b5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if let Some(rollout_trace) = &self.services.rollout_trace { | ||
| rollout_trace.record_agent_result_interaction( | ||
| self.conversation_id.to_string(), | ||
| turn_context.sub_id.clone(), | ||
| parent_thread_id.to_string(), | ||
| child_agent_path.as_str(), | ||
| &message, | ||
| &status, | ||
| ); | ||
| } | ||
| if let Err(err) = self | ||
| .services | ||
| .agent_control |
There was a problem hiding this comment.
Record agent-result edge only after parent delivery succeeds
forward_child_completion_to_parent appends record_agent_result_interaction before send_inter_agent_communication. If the send fails (e.g., parent thread already gone), the trace still records a successful child→parent interaction that never happened, producing incorrect rollout edges.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
I do not review those edge cases for now
| let inherited_rollout_trace = self | ||
| .inherited_rollout_trace_for_source(&session_source) | ||
| .await; |
There was a problem hiding this comment.
Avoid inheriting trace recorder for resumed child sessions
spawn_thread_with_source now always inherits the parent recorder for ThreadSpawn sources. Resumed child threads keep the same thread_id; when they reattach to the same recorder they emit another ThreadStarted for that ID, and reducer replay treats duplicate thread starts as an error. This can make rollout traces unreplayable after close/resume cycles.
Useful? React with 👍 / 👎.
d22b579 to
47d822c
Compare
0689ef6 to
e3bfbd5
Compare
| msg: EventMsg::ShutdownComplete, | ||
| }; | ||
| sess.send_event_raw(event).await; | ||
| if let Some(rollout_trace) = &sess.services.rollout_trace { |
| if let Some(rollout_trace) = &self.services.rollout_trace { | ||
| rollout_trace.record_agent_result_interaction( | ||
| self.conversation_id.to_string(), | ||
| turn_context.sub_id.clone(), | ||
| parent_thread_id.to_string(), | ||
| child_agent_path.as_str(), | ||
| &message, | ||
| &status, | ||
| ); | ||
| } | ||
| if let Err(err) = self | ||
| .services | ||
| .agent_control |
There was a problem hiding this comment.
I do not review those edge cases for now
| /// Child agent sessions share their root recorder, so ending a child thread | ||
| /// must not close the whole rollout. Only the root thread's shutdown emits | ||
| /// `RolloutEnded`. | ||
| pub(crate) fn record_thread_ended(&self, thread_id: AgentThreadId, status: RolloutStatus) { |
There was a problem hiding this comment.
same comment as in the previous PR
| } | ||
| } | ||
|
|
||
| fn execution_status_for_exec_status(status: &ExecCommandStatus) -> ExecutionStatus { |
There was a problem hiding this comment.
those 2 functions are sus... clear sign of duplication. Not sure if we can avoid it though
| EventMsg::CollabCloseEnd(event) => { | ||
| self.tool_runtime_ended_payload(&event.call_id, ExecutionStatus::Completed, event) | ||
| } | ||
| _ => None, |
There was a problem hiding this comment.
Instead of this, can you put the full list? I know this is not fun but at least compilation will break if we add something and don't add it explicitly here. It's a good signal for maintenance
## Summary Adds the standalone `codex-rollout-trace` crate, which defines the raw trace event format, replay/reduction model, writer, and reducer logic for reconstructing model-visible conversation/runtime state from recorded rollout data. The crate-level design is documented in [`codex-rs/rollout-trace/README.md`](https://github.com/openai/codex/blob/codex/rollout-trace-crate/codex-rs/rollout-trace/README.md). ## Stack This is PR 1/5 in the rollout trace stack. - [#18876](#18876): Add rollout trace crate - [#18877](#18877): Record core session rollout traces - [#18878](#18878): Trace tool and code-mode boundaries - [#18879](#18879): Trace sessions and multi-agent edges - [#18880](#18880): Add debug trace reduction command ## Review Notes This PR intentionally does not wire tracing into live Codex execution. It establishes the data model and reducer contract first, with crate-local tests covering conversation reconstruction, compaction boundaries, tool/session edges, and code-cell lifecycle reduction. Later PRs emit into this model. The README is the best entry point for reviewing the intended trace format and reduction semantics before diving into the reducer modules.
Summary
Adds the remaining session and multi-agent edge wiring needed to reconstruct rollout relationships across spawned agents, resumed sessions, and parent/child message delivery.
Stack
This is PR 4/5 in the rollout trace stack.
Review Notes
This is the stack layer that makes traces useful for multi-threaded agent workflows. The main invariant is that reconstructed relationships should come from durable rollout data rather than transient in-memory manager state wherever possible.
The PR is intentionally small relative to the preceding layers: it uses the recorder and reducer contracts already established by the stack and only adds the session/agent relationship events needed by later debug reduction.