Add explicit codex exec events#4177
Conversation
- Handle ExecCommandBegin/End events in EventProcessorWithJsonOutput - Track and report command executions as ConversationItems with proper status - Add tests for command execution, session, agent and error event handling
…nd update tests - Handle PatchApplyBegin and PatchApplyEnd events, mapping patch changes and status to ConversationEvents with FileChangeItem and PatchApplyStatus. - Update ConversationItemDetails::PatchApply to FileChange, FileUpdateItem to FileChangeItem. - Add and extend tests for patch apply success and failure cases. - Update exec_events definitions to support new FileChange event details and PatchApplyStatus.
…_with_json_output tests
…redEvent - Update EventProcessor::print_config_summary to accept a SessionConfiguredEvent reference. - Always call print_config_summary after SessionConfiguredEvent is constructed. - event_processor_with_json_output now emits SessionConfigured event in print_config_summary. - Rename tests/event_processor_with_json_output.rs to codex-rs/exec/tests/event_processor_with_json_output.rs. - Update all implementors to match new trait signature.
…lic, re-export module, add tests for event_processor_with_json_output
# Conflicts: # codex-rs/exec/src/lib.rs
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting
|
Example of an output for Details |
| @@ -0,0 +1,146 @@ | |||
| use serde::Deserialize; | |||
There was a problem hiding this comment.
I suppose now is the time to decide whether we want to go with snake_case or camelCase throughout?
There was a problem hiding this comment.
OpenAI REST APIs use snake_case.
There was a problem hiding this comment.
I don't think we are beholden to that if we think that camelCase is more appropriate for our users. It feels more JSON-y.
There was a problem hiding this comment.
I think APIs from the same company should look similar. If it was good for the mainline API should be good enough for us.
| } else { | ||
| conversation_manager.new_conversation(config).await? | ||
| conversation_manager | ||
| .new_conversation(config.clone()) |
There was a problem hiding this comment.
Why did config.clone() come into play?
There was a problem hiding this comment.
config was moved into these call before but now we need it for event_processor.print_config_summary
There was a problem hiding this comment.
Perhaps we should Arc it instead?
There was a problem hiding this comment.
Hm, am I missing something? conversation_manager.new_conversation doesn't take an Arc
…hread-safe and event-ID unique - Use AtomicU64 for next_event_id to ensure unique, thread-safe item IDs. - Change all item IDs from "itm_" to "item_". - Track running_commands and running_patch_applies in HashMaps keyed by call_id for concurrency safety. - Remove single-command/patch tracking, switching to per-call_id logic. - Update tests to expect new ID prefix "item_".
| "prompt": prompt, | ||
| fn print_config_summary(&mut self, _: &Config, _: &str, ev: &SessionConfiguredEvent) { | ||
| self.process_event(Event { | ||
| id: "".to_string(), |
There was a problem hiding this comment.
This event id doesn't matter for the logic.
| if let Some(output_file) = self.last_message_path.as_deref() { | ||
| handle_last_message(last_agent_message.as_deref(), output_file); | ||
| } | ||
| return CodexStatus::InitiateShutdown; |
There was a problem hiding this comment.
I would use else instead of return, but that's me.
| @@ -0,0 +1,146 @@ | |||
| use serde::Deserialize; | |||
There was a problem hiding this comment.
I don't think we are beholden to that if we think that camelCase is more appropriate for our users. It feels more JSON-y.
| } else { | ||
| conversation_manager.new_conversation(config).await? | ||
| conversation_manager | ||
| .new_conversation(config.clone()) |
There was a problem hiding this comment.
Perhaps we should Arc it instead?
| }), | ||
| ); | ||
| let out_end = ep.collect_conversation_events(&end); | ||
| assert_eq!(out_end.len(), 1); |
There was a problem hiding this comment.
| let out_end = ep.collect_conversation_events(&end); | ||
| assert_eq!(out_end.len(), 1); | ||
|
|
||
| // Validate structure without relying on HashMap iteration order |
There was a problem hiding this comment.
I see...Maybe it's worth using IndexMap so things are predictable.
Co-authored-by: Michael Bolin <mbolin@openai.com>
event_processor_with_json_output: log error on JSON serialization failure instead of silently ignoring
…essor Split event_processor_with_json_output into two implementations: existing one for --json, and new event_processor_with_experimental_json_output providing improved, testable JSONL output for --experimental-json. Add mutually exclusive --json and --experimental-json CLI flags and update wiring accordingly. Deprecation message is shown when --json is used. Tests updated to use experimental implementation.
…event_processor_with_json_output.rs test
This pull request add a new experimental format of JSON output.
You can try it using
codex exec --experimental-json.Design takes a lot of inspiration from Responses API items and stream format.
Session and items
Each invocation of
codex execstarts or resumes a session.Session contains multiple high-level item types:
Events
Session and items are going through their life cycles which is represented by events.
Session is
session.createdorsession.resumedItems are
item.added,item.updated,item.completed,item.require_approval(or other item types likeitem.output_deltawhen we need streaming).So a typical session can look like:
Details
The idea is to give users fully formatted items they can use directly in their rendering/application logic and avoid having them building up items manually based on events (unless they want to for streaming).
This PR implements only the
item.completedpayload for some event types, more event types and item types to come.