[sandboxing] Prevent macOS fs-helper startup hangs#24956
Draft
JaviSoto wants to merge 1 commit into
Draft
Conversation
7a93538 to
2628c0b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
:minimalplatform policy into native runtime permissions and filesystem defaults.Root Cause
The fs helper used by
apply_patchis launched with a split filesystem sandbox policy. For full-read/workspace-write profiles, Codex correctly did not need the additional readable filesystem roots fromrestricted_read_only_platform_defaults.sbpl, but it incorrectly suppressed that entire policy unit. That file also contained native macOS startup permissions required before the helper reaches Rustmain, including notification and logging IPC.The exact semantic bug is that current Codex accepts
:minimal, butFileSystemSandboxPolicy::include_platform_defaults()returnsfalsewhenever the same policy has:root = read. In other words, full disk read correctly subsumes:minimal's extra filesystem roots, but Codex also erases the unrelated native runtime capability that:minimalcarries. The legacy full-read bridge has the same collapse becauseReadOnlyAccess::FullAccesscannot represent the runtime half of:minimal.As a result, a full-read fs helper can launch without the native runtime permissions implied by its
:minimalrequirement. The policy composition bug was introduced across the split-policy work in #13440 and #13448: filesystem access and native runtime capability were treated as the same semantic bit.This change models those concerns independently.
:minimalnow always enables its native runtime policy; it only enables the additional filesystem defaults when full disk read has not already made them redundant.Fixes #19020.
Observed Failure
In affected Studio threads, the model emits
apply_patch, but no tool result orpatch_apply_endfollows. Corresponding--codex-run-as-fs-helperchildren never reach Rust execution; samples stop in macOS initialization:A standalone C executable that only calls
getpwuid_r(getuid(), ...), run throughsandbox-execwith the notification-center lookup denied, reproduces the same hang on Studio. The same denied probe completes 5,000 launches locally, while allowing the lookup on Studio completes 5,000 launches. Cross-running the differing packaged Codex helpers also rules out the helper binary or Codex Remote transport as a prerequisite.The shipped current Nightly
codex sandboxcommand reproduces the same host split without the exec-server or fs-helper wrapper. On MBP,codex sandbox -c 'sandbox_mode="workspace-write"' -- getpwuid_probepassed 2,000 guarded launches. On Studio, the same exact Codex sandbox command timed out on guarded launch1; its probe child sampled ingetpwuid_r -> notify_register_check -> bootstrap_look_up2 -> mach_msg2_trapafter launchd logged the deniedcom.apple.system.notification_centerlookup.The reduction is mechanical: on Studio, the exact current Codex sandbox path with a restricted permission profile containing
":minimal" = "read"and":project_roots" = "write"passed 1,000 guarded launches. The same current binary with":root" = "read",":minimal" = "read", and":project_roots" = "write"timed out on guarded launch239in the same startup stack. That proves the failure is not "missing:minimalin config"; it is current Codex semantically dropping:minimalwhen full read is present.Host-Specific Boundary
This PR fixes the Codex policy defect; it does not claim to identify the underlying Studio-only macOS failure mode.
Both Macs run macOS
26.5 (25F71). On Studio, direct deniedbootstrap_look_up, direct deniednotify_register_check, and a one-process sequence of the four DirectoryService cache-notification names all return their expected failure values immediately. Pre-initializing Libnotify beforegetpwuid_rdoes not prevent the intermittent failure: a guarded loop still timed out under the denied policy.Therefore the prior theory that Studio generically fails to complete sandbox-denied bootstrap calls is disproved. The remaining platform-level observation is narrower: Codex's missing-runtime policy exposes an intermittent Studio-specific failure inside the
getpwuid_r/ Libinfo startup path, after the sandbox denial is logged and before the caller completes. Apple Libinfo documents those cache-registration failures as ignorable, so a blocked user lookup is unexpected platform behavior, but its Studio-specific trigger remains unknown.Validation
:minimalnative runtime permissions and for keeping those permissions out of the base policy.just fmtjust fix -p codex-protocol -p codex-sandboxing -p codex-exec-serverjust test -p codex-protocolpassed (226tests plus bench smoke).just test -p codex-exec-serverpassed (186tests plus bench smoke).bazelisk build //codex-rs/sandboxing:sandboxingpassed, covering the newinclude_str!policy resource under Bazel.codex-sandboxingregression passes. The fulljust test -p codex-sandboxingremains nonzero only for two existing shell-stderr spelling assertions that failed before this correction as well (bash: pathversusbash: line 1: path).