chore(sync): merge main into kontext-dev#33
Conversation
## Why `shell_zsh_fork` already provides stronger guarantees around which executables receive elevated permissions. To reuse that machinery from unified exec without pushing Unix-specific escalation details through generic runtime code, the escalation bootstrap and session lifetime handling need a cleaner boundary. That boundary also needs to be safe for long-lived sessions: when an intercepted shell session is closed or pruned, any in-flight approval workers and any already-approved escalated child they spawned must be torn down with the session, and the inherited escalation socket must not leak into unrelated subprocesses. ## What Changed - Extracted a reusable `EscalationSession` and `EscalateServer::start_session(...)` in `shell-escalation` so callers can get the wrapper/socket env overlay and keep the escalation server alive without immediately running a one-shot command. - Documented that `EscalationSession::env()` and `ShellCommandExecutor::run(...)` exchange only that env overlay, which callers must merge into their own base shell environment. - Clarified the prepared-exec helper boundary in `core` by naming the new helper APIs around `ExecRequest`, while keeping the legacy `execute_env(...)` entrypoints as thin compatibility wrappers for existing callers that still use the older naming. - Added a small post-spawn hook on the prepared execution path so the parent copy of the inheritable escalation socket is closed immediately after both the existing one-shot shell-command spawn and the unified-exec spawn. - Made session teardown explicit with session-scoped cancellation: dropping an `EscalationSession` or canceling its parent request now stops intercept workers, and the server-spawned escalated child uses `kill_on_drop(true)` so teardown cannot orphan an already-approved child. - Added `UnifiedExecBackendConfig` plumbing through `ToolsConfig`, a `shell::zsh_fork_backend` facade, and an opaque unified-exec spawn-lifecycle hook so unified exec can prepare a wrapped `zsh -c/-lc` request without storing `EscalationSession` directly in generic process/runtime code. - Kept the existing `shell_command` zsh-fork behavior intact on top of the new bootstrap path. Tool selection is unchanged in this PR: when `shell_zsh_fork` is enabled, `ShellCommand` still wins over `exec_command`. ## Verification - `cargo test -p codex-shell-escalation` - includes coverage for `start_session_exposes_wrapper_env_overlay` - includes coverage for `exec_closes_parent_socket_after_shell_spawn` - includes coverage for `dropping_session_aborts_intercept_workers_and_kills_spawned_child` - `cargo test -p codex-core shell_zsh_fork_prefers_shell_command_over_unified_exec` - `cargo test -p codex-core --test all shell_zsh_fork_prompts_for_skill_script_execution` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13392). * openai#13432 * __->__ openai#13392
…tream-main-20260305100117
…-20260305100117 chore(sync): merge upstream/main into main
…into-kontext-dev-20260305100329
There was a problem hiding this comment.
Verification certificate:
- Verdict: REQUEST_CHANGES
- Depth: full
- Confidence: 0.75
- Certificate valid: yes
- Scenarios traced: 1
- Verification bounds:
- Changed files and explicitly traced call paths
Inline findings posted:
- [LOW][correctness] Correct Handling of Inherited File Descriptors (
unknown:1) - [LOW][maintainability] Improved Lifecycle Management for Escalation Server (
unknown:1)
Reviewer summary:
- This pull request, despite its title suggesting a simple sync, introduces a significant refactoring and feature enhancement related to shell command execution on Unix-like systems. The core changes involve integrating the
zsh-forkshell escalation backend with theunified_exectool, which was previously only available for theshell_commandtool. - A new
EscalationSessionstruct is introduced to robustly manage the lifecycle of the escalation server, its background task, and the associated socket file descriptor. This improves resource management and correctness through itsDropimplementation. - A generic
after_spawnhook has been plumbed through the process execution stack. This hook is used by thezsh-forkbackend to correctly close the parent process's copy of the inherited socket file descriptor immediately after the child process is spawned, which is a critical pattern for correctness in this scenario. - The changes are architecturally sound, with platform-specific logic being well-encapsulated. The integration with
unified_execallows PTY-based sessions to benefit from the fine-grained command interception provided by thezsh-forkmechanism.
Scores and diagnostics
- Architecture: 7.0/10
- Organization: 7.0/10
- Cleanliness: 7.0/10
- Overall: 7.0/10
- Candidate findings: 2
- Published inline: 2
- Published summary-only: 0
- Suppressed findings: 0
- Verification verdict: REQUEST_CHANGES
Limitations
-
This review is based on static analysis of the provided diff and does not include runtime testing.
-
The verification assumes that the underlying operating system's
execveand file descriptor inheritance semantics behave as expected. -
The effectiveness of the
zsh-forkcommand interception itself is outside the scope of this review, which focuses on the integration and lifecycle management aspects. -
Model:
gemini-3.1-pro-preview -
Omitted files: 0
Generated at 2026-03-05T09:08:38.827Z.
| pub async fn execute_exec_request_with_after_spawn( | ||
| exec_request: ExecRequest, | ||
| stdout_stream: Option<StdoutStream>, | ||
| after_spawn: Option<Box<dyn FnOnce() + Send>>, | ||
| ) -> crate::error::Result<ExecToolCallOutput> { | ||
| let effective_policy = exec_request.sandbox_policy.clone(); | ||
| execute_exec_request(exec_request, &effective_policy, stdout_stream, after_spawn).await | ||
| } |
There was a problem hiding this comment.
🔴 Single-use helper execute_exec_request_with_after_spawn violates AGENTS.md rule
AGENTS.md states: "Do not create small helper methods that are referenced only once." The function execute_exec_request_with_after_spawn at codex-rs/core/src/sandboxing/mod.rs:428-435 is a 7-line wrapper around execute_exec_request and is called exactly once, at codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs:768. The caller should directly invoke execute_exec_request (as codex-rs/core/src/tasks/user_shell.rs:180 already does) instead of going through this single-use wrapper.
Prompt for agents
Remove the single-use helper function `execute_exec_request_with_after_spawn` from codex-rs/core/src/sandboxing/mod.rs (lines 428-435). In its single call site at codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs line 768, replace the call with a direct invocation of `execute_exec_request`, inlining the sandbox_policy clone. For example, change:
let result = crate::sandboxing::execute_exec_request_with_after_spawn(
exec_request,
None,
after_spawn,
).await?;
to:
let effective_policy = exec_request.sandbox_policy.clone();
let result = crate::exec::execute_exec_request(
exec_request,
&effective_policy,
None,
after_spawn,
).await?;
This matches the pattern already used at codex-rs/core/src/tasks/user_shell.rs:180.
Was this helpful? React with 👍 or 👎 to provide feedback.
Hard-cutover sync stage 2: merge updated main into kontext-dev using a real merge commit (no squash).