💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 737b94d88d
    review_id,
    request,
    retry_reason,
    GuardianApprovalRequestSource::DelegatedSubagent,
Propagate the correct approval source when spawning reviews
spawn_approval_request_review always passes GuardianApprovalRequestSource::DelegatedSubagent, but this helper is also used for main-turn reviews (e.g., session::request_permissions_for_cwd). Those main-turn reviews will now be misclassified in guardian-review analytics, corrupting the approval_request_source field and downstream metrics.
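One way to address this is to have each call site state the source explicitly instead of hard-coding it in the helper. The following is a minimal sketch under stated assumptions: the types are simplified stand-ins, the `MainTurn` variant name and the `SpawnedReview` struct are hypothetical, and the real `spawn_approval_request_review` takes more arguments than shown.

```rust
// Simplified stand-in for the enum named in the review comment.
// `MainTurn` is a hypothetical variant name used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuardianApprovalRequestSource {
    MainTurn,
    DelegatedSubagent,
}

// Hypothetical record of what the helper would hand to analytics.
#[derive(Debug, PartialEq, Eq)]
pub struct SpawnedReview {
    pub review_id: String,
    pub source: GuardianApprovalRequestSource,
}

// Sketch: the caller supplies the source, so main-turn callers are not
// silently reported as delegated subagents.
pub fn spawn_approval_request_review(
    review_id: &str,
    source: GuardianApprovalRequestSource,
) -> SpawnedReview {
    SpawnedReview {
        review_id: review_id.to_string(),
        source,
    }
}
```

With this shape, a main-turn caller such as `session::request_permissions_for_cwd` would pass its own source value, and only the delegated path would pass `GuardianApprovalRequestSource::DelegatedSubagent`.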
    return (
        GuardianReviewSessionOutcome::PromptBuildFailed(err),
        false,
Classify submit failures as session errors
This error path maps all pre-wait failures to PromptBuildFailed, including review_session.codex.submit(...) failures from the guardian runtime/session. Those are later emitted as GuardianReviewFailureReason::PromptBuildError, which mislabels real session/runtime faults and skews failure analytics.
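The distinction the comment asks for can be sketched by tagging each pre-wait failure with the stage that produced it and mapping the stages to different outcomes. This is an illustrative sketch only: `PreWaitError`, the `SessionFailed` variant, and the `SessionError` reason are hypothetical names, and error payloads are reduced to strings.

```rust
// Hypothetical: records which pre-wait stage failed.
#[derive(Debug, PartialEq, Eq)]
pub enum PreWaitError {
    PromptBuild(String),
    Submit(String),
}

// Simplified stand-in; `SessionFailed` is a hypothetical variant.
#[derive(Debug, PartialEq, Eq)]
pub enum GuardianReviewSessionOutcome {
    PromptBuildFailed(String),
    SessionFailed(String),
}

// Simplified stand-in; `SessionError` is a hypothetical variant.
#[derive(Debug, PartialEq, Eq)]
pub enum GuardianReviewFailureReason {
    PromptBuildError,
    SessionError,
}

// Map each stage to its own outcome instead of collapsing both into
// PromptBuildFailed.
pub fn classify_pre_wait_failure(err: PreWaitError) -> GuardianReviewSessionOutcome {
    match err {
        PreWaitError::PromptBuild(msg) => GuardianReviewSessionOutcome::PromptBuildFailed(msg),
        PreWaitError::Submit(msg) => GuardianReviewSessionOutcome::SessionFailed(msg),
    }
}

// Derive the analytics failure reason from the outcome.
pub fn failure_reason(outcome: &GuardianReviewSessionOutcome) -> GuardianReviewFailureReason {
    match outcome {
        GuardianReviewSessionOutcome::PromptBuildFailed(_) => {
            GuardianReviewFailureReason::PromptBuildError
        }
        GuardianReviewSessionOutcome::SessionFailed(_) => {
            GuardianReviewFailureReason::SessionError
        }
    }
}
```

Under this sketch, a failure from `review_session.codex.submit(...)` would be wrapped as `PreWaitError::Submit` at the call site and would no longer surface as a prompt-build error.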
## Why

Guardian approvals now run as review sessions, but Codex analytics did not have a terminal event for those reviews. That made it hard to measure approval outcomes, failure modes, Guardian session reuse, model metadata, token usage, and timing separately from the parent turn.

## What changed

Adds `codex_guardian_review` analytics emission for Guardian approval reviews.

The event is emitted from the Guardian review path with review identity, target item id, approval request source, a PII-minimized reviewed-action shape, terminal decision/status, failure reason, Guardian assessment fields, Guardian session metadata, token usage, and timing metadata.

The reviewed-action payload intentionally omits high-risk fields such as shell commands, working directories, argv, file paths, network targets/hosts, rationale, retry reason, and permission justifications. It also classifies prompt-build failures separately from Guardian session/runtime failures so fail-closed cases are distinguishable in analytics.

## Verification

- Guardian review analytics tests cover terminal success, timeout/cancel/fail-closed paths, session metadata, and token usage plumbing.
- `cargo clippy -p codex-core --lib --tests -- -D warnings`

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17693).
* #17696
* #17695
* __->__ #17693
Why
The Guardian review event needs to report whether the action shown to Guardian was truncated. That field should come from the same truncation path used to build the Guardian prompt, rather than being inferred after the fact.
What changed
Plumbs truncation metadata through Guardian action formatting, prompt construction, review session execution, and analytics emission.
`guardian_truncate_text` now reports both the rendered text and whether it inserted the truncation marker, and `reviewed_action_truncated` is set from that prompt-building result. This keeps the analytics field aligned with the model-visible reviewed action while preserving the existing Guardian prompt behavior.
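The idea can be illustrated with a small sketch in which the truncation helper returns the flag alongside the text, and the prompt builder forwards it unchanged. Everything here beyond the two names from the description is an assumption: the character-based limit, the marker string, the `GuardianPrompt` struct, and `build_guardian_prompt` are hypothetical, and the real `guardian_truncate_text` may truncate differently.

```rust
// Hypothetical marker text; the real marker is not shown in this PR description.
pub const TRUNCATION_MARKER: &str = "\n[... truncated ...]";

// Sketch: return the rendered text together with a flag that is true only
// when the marker was actually inserted.
pub fn guardian_truncate_text(text: &str, max_chars: usize) -> (String, bool) {
    if text.chars().count() <= max_chars {
        return (text.to_string(), false);
    }
    // Truncate on char boundaries so multi-byte characters are never split.
    let kept: String = text.chars().take(max_chars).collect();
    (format!("{}{}", kept, TRUNCATION_MARKER), true)
}

// Hypothetical prompt-building result that carries the flag forward.
#[derive(Debug, PartialEq, Eq)]
pub struct GuardianPrompt {
    pub text: String,
    pub reviewed_action_truncated: bool,
}

// The analytics flag is taken from the same call that produced the
// model-visible text, so the two cannot disagree.
pub fn build_guardian_prompt(action: &str, max_chars: usize) -> GuardianPrompt {
    let (text, truncated) = guardian_truncate_text(action, max_chars);
    GuardianPrompt {
        text,
        reviewed_action_truncated: truncated,
    }
}
```

Returning the flag from the truncation call itself avoids re-deriving it later, for example by searching the prompt for the marker, which could misfire if the reviewed action happened to contain the marker text.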
Verification