codex: support hooks in config.toml and requirements.toml #18893
eternal-openai merged 19 commits into main from
Conversation
| #[derive(Clone, Copy)] | ||
| struct HookHandlerSource<'a> { | ||
| path: &'a AbsolutePathBuf, | ||
| is_managed: bool, |
Do we use `is_managed` anywhere?
This will be used later when we implement `allow_managed_hooks_only` -- for now it's a placeholder.
bolinfest left a comment
Posting the Codex review findings requested by mbolin.
| warnings, | ||
| display_order, | ||
| HookHandlerSource { | ||
| path: &source_path, |
[P1] Managed hooks resolve relative commands in the workspace
Requirements-managed hooks validate managed_dir, but command execution still sets the child cwd to the active workspace and passes the raw shell command through unchanged. A managed policy using a relative command such as python3 pre.py will run whatever pre.py exists in the workspace instead of the MDM-managed directory, so an untrusted project can hijack or bypass the managed hook. Reject relative commands for managed handlers or run managed hooks from managed_dir, and add a regression test with a relative managed command plus a conflicting workspace script.
Anchored here because command_runner.rs is not part of this PR diff; the execution issue is at codex-rs/hooks/src/engine/command_runner.rs:33-35.
Thanks! Okay so:
- it's expected that the hook runs in the cwd of wherever codex is, that part is normal
- it's true that the system resolves hooks from the project dir too
So the potential problem here is like this:
- admin ships a managed hook
- the managed hook command is relative
- the workspace contains a shadow script with that name
So my judgement here is: valid, could use hardening, but probably not a P1. The spec appears to "recommend" absolute paths for admins but does not enforce it. Agreed that we can do better.
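A minimal hardening sketch in that direction -- rejecting relative program paths for managed handlers. `validate_managed_command` is a hypothetical helper name, and splitting on whitespace is a stand-in for real shell-command parsing:

```rust
use std::path::Path;

// Hypothetical hardening sketch: reject managed hook commands whose program
// path is relative, so a workspace script cannot shadow the managed one.
// Assumes the first whitespace-separated token is the program path; real
// shell parsing (quoting, env vars) is more involved.
fn validate_managed_command(command: &str) -> Result<(), String> {
    let program = command
        .split_whitespace()
        .next()
        .ok_or_else(|| "empty managed hook command".to_string())?;
    if Path::new(program).is_absolute() {
        Ok(())
    } else {
        Err(format!(
            "managed hook command `{program}` must use an absolute path"
        ))
    }
}

fn main() {
    assert!(validate_managed_command("/usr/bin/python3 pre.py").is_ok());
    assert!(validate_managed_command("python3 pre.py").is_err());
}
```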
Interesting... so you can't ship a hook like `python3 ./myfile.py` if you want to "include" `myfile.py` as part of the hook?
For in-repo hooks, that maybe seems reasonable, but not from a plugin?
Generally, the spec provides magic variables for this command that let you reference things like the plugin root or the project root. We don't have these implemented currently, so in the docs I have been recommending people do something like `/usr/bin/python3 "$(git rev-parse --show-toplevel)/.codex/hooks/pre_tool_use_policy.py"` to find the correct path for the project. I'd like to get these variables implemented. As discussed, managed hooks should generally use absolute paths.
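For instance, the recommended workaround might be embedded in a handler entry like this (a sketch: only the `command` and `timeout` keys come from this PR's handler shape, and the surrounding table layout is elided):

```toml
# Sketch: resolve the project root at run time until magic variables exist.
command = '/usr/bin/python3 "$(git rev-parse --show-toplevel)/.codex/hooks/pre_tool_use_policy.py"'
timeout = 30
```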
| ) | ||
| .expect("config layer stack"); | ||
|
|
||
| let engine = ClaudeHooksEngine::new( |
[P2] Requirements hooks can be silently disabled
Sessions only enable the hook engine from config.features.enabled(Feature::CodexHooks). Since CodexHooks defaults off, a requirements.toml or cloud requirements payload containing hooks alone produces managed_hooks but then ClaudeHooksEngine returns before discovery. The managed policy silently becomes a no-op unless admins also know to pin the codex_hooks feature. Either imply enablement when managed hooks are configured or reject/report this configuration, and add a core integration test that loads requirements hooks through a real session.
This test constructs the engine with enabled = true, so it does not cover the product session path at codex-rs/core/src/session/session.rs:643-646.
Should be okay:
- we are making hooks default on with this release
- requirements.toml can also enable hooks if needed
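As a sketch of that second point, a requirements payload that both enables the feature and ships managed hooks might look like this. The `features` table and `codex_hooks` key are assumptions from this thread, while `managed_dir` matches this PR's test fixture:

```toml
# Illustrative requirements.toml fragment -- table and key names are assumptions.
[features]
codex_hooks = true

[managed_hooks]
managed_dir = "/enterprise/hooks"
windows_managed_dir = 'C:\enterprise\hooks'
```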
| #[serde(rename = "command")] | ||
| Command { | ||
| command: String, | ||
| #[serde(default, rename = "timeout", alias = "timeoutSec")] |
[P3] Schema exposes timeout instead of timeoutSec
The new config schema is generated from serde(rename = "timeout"), so schema consumers see timeout as the property even though the added TOML tests and app-server API use timeoutSec and deserialization accepts it as the existing hook shape. This will make valid configs look invalid in editor/schema tooling. Make timeoutSec the canonical serialized/schema name with timeout as an alias, or provide a custom schema that includes both.
Yeah, `timeout` is the correct schema, though `timeoutSec` is currently a valid alternative. The solution is probably to remove `timeoutSec` entirely and just stick to `timeout`.
Fixed -- removed `timeoutSec` from the config-side hook schema and normalized the config examples/tests on `timeout`.
|
|
||
| let handle = std::thread::Builder::new() | ||
| .name(name.to_string()) | ||
| .stack_size(TEST_STACK_SIZE_BYTES) |
I would argue this is a bad code smell. Please read #13429.
resolved by your windows patch, thanks!
| match handle.join() { | ||
| Ok(result) => result, | ||
| Err(_) => Err(anyhow::anyhow!("{name} thread panicked")), | ||
| } |
`map_err()` is a bit more canonical, and I would include the error in the message and propagate with `?` if necessary:
| match handle.join() { | |
| Ok(result) => result, | |
| Err(_) => Err(anyhow::anyhow!("{name} thread panicked")), | |
| } | |
| handle.join().map_err(|_| anyhow::anyhow!("{name} thread panicked")) | |
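A self-contained sketch of the `map_err` shape, using a plain `String` error instead of `anyhow` so it compiles on its own. `run_named` is a hypothetical helper; note that `join()`'s panic payload is `Box<dyn Any + Send>`, which is not `Display`, so the message only records the thread name:

```rust
use std::thread;

// Hypothetical helper: spawn a named thread and convert both spawn failure
// and a panic in the thread into a String error via map_err.
fn run_named<F>(name: &str, f: F) -> Result<i32, String>
where
    F: FnOnce() -> i32 + Send + 'static,
{
    let handle = thread::Builder::new()
        .name(name.to_string())
        .spawn(f)
        .map_err(|err| format!("failed to spawn {name}: {err}"))?;
    // The panic payload is Box<dyn Any + Send>, so it is discarded here.
    handle.join().map_err(|_| format!("{name} thread panicked"))
}

fn main() {
    assert_eq!(run_named("ok", || 42), Ok(42));
    assert!(run_named("boom", || panic!("oops")).is_err());
}
```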
resolved by your windows patch, thanks!
| #[test] | ||
| fn deserialize_managed_hooks_requirements() -> Result<()> { | ||
| let toml_str = r#" | ||
| managed_dir = "/enterprise/hooks" |
FYI, I would left-align the contents of this and the other strings so it's easier to copy-paste into a real .toml file for testing.
| pub fn is_empty(&self) -> bool { | ||
| self.pre_tool_use.is_empty() | ||
| && self.permission_request.is_empty() | ||
| && self.post_tool_use.is_empty() | ||
| && self.session_start.is_empty() | ||
| && self.user_prompt_submit.is_empty() | ||
| && self.stop.is_empty() | ||
| } |
I think this is a slightly better way to do this:
| pub fn is_empty(&self) -> bool { | |
| self.pre_tool_use.is_empty() | |
| && self.permission_request.is_empty() | |
| && self.post_tool_use.is_empty() | |
| && self.session_start.is_empty() | |
| && self.user_prompt_submit.is_empty() | |
| && self.stop.is_empty() | |
| } | |
| pub fn is_empty(&self) -> bool { | |
| let Self { | |
| pre_tool_use, | |
| permission_request, | |
| post_tool_use, | |
| session_start, | |
| user_prompt_submit, | |
| stop, | |
| } = self; | |
| pre_tool_use.is_empty() | |
| && permission_request.is_empty() | |
| && post_tool_use.is_empty() | |
| && session_start.is_empty() | |
| && user_prompt_submit.is_empty() | |
| && stop.is_empty() | |
| } |
The reason is that if another field is added to HookEventsToml, you want to be sure to update this function. When you write it this way, if you add a new field, this function will fail to compile because not all the fields will be listed in the destructuring. This ensures you handle the new field in this function.
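A minimal illustration of the trick on a stand-in struct (not the real `HookEventsToml`): adding a field to `Events` makes `is_empty` fail to compile until the new field is listed in the destructuring.

```rust
// Stand-in struct to demonstrate the exhaustive-destructuring trick.
struct Events {
    pre_tool_use: Vec<String>,
    post_tool_use: Vec<String>,
}

impl Events {
    fn is_empty(&self) -> bool {
        // `..` is deliberately omitted: adding a field to Events without
        // handling it here becomes a compile error.
        let Self {
            pre_tool_use,
            post_tool_use,
        } = self;
        pre_tool_use.is_empty() && post_tool_use.is_empty()
    }
}

fn main() {
    let empty = Events { pre_tool_use: vec![], post_tool_use: vec![] };
    let full = Events { pre_tool_use: vec!["h".to_string()], post_tool_use: vec![] };
    assert!(empty.is_empty());
    assert!(!full.is_empty());
}
```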
| } | ||
|
|
||
| pub fn handler_count(&self) -> usize { | ||
| [ |
Using the same destructuring trick as above also ensures all fields are handled.
| } | ||
|
|
||
| pub fn into_matcher_groups(self) -> [(HookEventName, Vec<MatcherGroup>); 6] { | ||
| [ |
Why not return a HashMap that uses HookEventName as a key?
I think the idea is to preserve ordering
|
|
||
| impl ManagedHooksRequirementsToml { | ||
| pub fn is_empty(&self) -> bool { | ||
| self.managed_dir.is_none() && self.windows_managed_dir.is_none() && self.hooks.is_empty() |
destructure self here too
| ) | ||
| .expect("hook events TOML should deserialize"); | ||
|
|
||
| assert_eq!(parsed.pre_tool_use.len(), 1); |
I would favor doing one assert_eq! on parsed.pre_tool_use (or on parsed itself).
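Sketched with stand-in types (`Parsed` here is hypothetical, not the real parsed hook struct), the one-assert style compares the whole value, so a failure prints both full structures:

```rust
// Stand-in for a deserialized hook config; deriving PartialEq and Debug
// is what makes a single whole-value assert_eq! possible.
#[derive(Debug, PartialEq)]
struct Parsed {
    pre_tool_use: Vec<String>,
}

fn main() {
    let parsed = Parsed { pre_tool_use: vec!["check".to_string()] };
    // One assertion over the whole struct instead of field-by-field checks.
    assert_eq!(
        parsed,
        Parsed { pre_tool_use: vec!["check".to_string()] }
    );
}
```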
| } | ||
| } | ||
|
|
||
| #[cfg(test)] |
Prefer putting these in hooks_tests.rs and then referencing it here. Look how we do it in codex-core.
| .join("pre_tool_use_toml_hook_log.jsonl") | ||
| .as_path(), | ||
| )?; | ||
| assert_eq!(json_hook_inputs.len(), 1); |
Again, if it's not too hard, `assert_eq!(vec![ ... ], toml_hook_inputs)` so you assert everything at once, and `pretty_assertions` is helpful in a failure.
| let source = requirement_source | ||
| .map(ToString::to_string) | ||
| .unwrap_or_else(|| "managed requirements".to_string()); | ||
| let Some(source_path) = managed_hooks.managed_dir_for_current_platform() else { | ||
| warnings.push(format!( | ||
| "skipping managed hooks from {source}: no managed hook directory is configured for this platform" | ||
| )); | ||
| return None; | ||
| }; | ||
| if !source_path.is_absolute() { | ||
| warnings.push(format!( | ||
| "skipping managed hooks from {source}: managed hook directory {} is not absolute", | ||
| source_path.display() | ||
| )); | ||
| return None; | ||
| } | ||
| if !source_path.exists() { | ||
| warnings.push(format!( | ||
| "skipping managed hooks from {source}: managed hook directory {} does not exist", | ||
| source_path.display() | ||
| )); | ||
| return None; | ||
| } | ||
| if !source_path.is_dir() { | ||
| warnings.push(format!( | ||
| "skipping managed hooks from {source}: managed hook directory {} is not a directory", | ||
| source_path.display() | ||
| )); | ||
| return None; | ||
| } | ||
|
|
||
| AbsolutePathBuf::from_absolute_path(source_path) | ||
| .inspect_err(|err| { | ||
| warnings.push(format!( | ||
| "skipping managed hooks from {source}: could not normalize managed hook directory {}: {err}", | ||
| source_path.display() | ||
| )); | ||
| }) | ||
| .ok() |
If you want to elide return:
| let source = requirement_source | |
| .map(ToString::to_string) | |
| .unwrap_or_else(|| "managed requirements".to_string()); | |
| let Some(source_path) = managed_hooks.managed_dir_for_current_platform() else { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: no managed hook directory is configured for this platform" | |
| )); | |
| return None; | |
| }; | |
| if !source_path.is_absolute() { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: managed hook directory {} is not absolute", | |
| source_path.display() | |
| )); | |
| return None; | |
| } | |
| if !source_path.exists() { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: managed hook directory {} does not exist", | |
| source_path.display() | |
| )); | |
| return None; | |
| } | |
| if !source_path.is_dir() { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: managed hook directory {} is not a directory", | |
| source_path.display() | |
| )); | |
| return None; | |
| } | |
| AbsolutePathBuf::from_absolute_path(source_path) | |
| .inspect_err(|err| { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: could not normalize managed hook directory {}: {err}", | |
| source_path.display() | |
| )); | |
| }) | |
| .ok() | |
| let source = requirement_source | |
| .map(ToString::to_string) | |
| .unwrap_or_else(|| "managed requirements".to_string()); | |
| let Some(source_path) = managed_hooks.managed_dir_for_current_platform() else { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: no managed hook directory is configured for this platform" | |
| )); | |
| return None; | |
| }; | |
| if !source_path.is_absolute() { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: managed hook directory {} is not absolute", | |
| source_path.display() | |
| )); | |
| None | |
| } else if !source_path.exists() { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: managed hook directory {} does not exist", | |
| source_path.display() | |
| )); | |
| None | |
| } else if !source_path.is_dir() { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: managed hook directory {} is not a directory", | |
| source_path.display() | |
| )); | |
| None | |
| } else { | |
| AbsolutePathBuf::from_absolute_path(source_path) | |
| .inspect_err(|err| { | |
| warnings.push(format!( | |
| "skipping managed hooks from {source}: could not normalize managed hook directory {}: {err}", | |
| source_path.display() | |
| )); | |
| }) | |
| .ok() | |
| } |
| return None; | ||
| } | ||
|
|
||
| let contents = match fs::read_to_string(source_path.as_path()) { |
@starr-openai @pakrym does this have to go through the Environment file reader?
| } | ||
| } | ||
|
|
||
| #[cfg(test)] |
Let's put this in mod_tests.rs or something?
| } | ||
|
|
||
| fn managed_hooks_for_current_platform<P: Into<std::path::PathBuf>>( | ||
| managed_dir: P, |
You can drop the `<P: Into<std::path::PathBuf>>` this way:
| managed_dir: P, | |
| managed_dir: impl AsRef<Path>, |
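A small self-contained sketch of the `impl AsRef<Path>` form; `managed_hooks_dir` is a hypothetical function, not the one in this PR:

```rust
use std::path::{Path, PathBuf};

// impl AsRef<Path> accepts &str, String, &Path, and PathBuf callers
// without declaring a named type parameter.
fn managed_hooks_dir(managed_dir: impl AsRef<Path>) -> PathBuf {
    managed_dir.as_ref().join("hooks")
}

fn main() {
    assert_eq!(managed_hooks_dir("/enterprise"), PathBuf::from("/enterprise/hooks"));
    assert_eq!(
        managed_hooks_dir(PathBuf::from("/enterprise")),
        PathBuf::from("/enterprise/hooks")
    );
}
```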
## Why

While debugging the Windows stack overflows we saw in [#13429](#13429) and then again in [#18893](#18893), I hit another overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`. That test drives the legacy multi-agent spawn / close / resume path. The behavior was fine, but several thin async wrappers were still inlining much larger `AgentControl` futures into their callers, which was enough to overflow the default Windows stack.

## What

- Box the thin `AgentControl` wrappers around `spawn_agent_internal`, `resume_single_agent_from_rollout`, and `shutdown_agent_tree`.
- Box the corresponding legacy `multi_agents` handler calls in `spawn`, `resume_agent`, and `close_agent`.
- Keep behavior unchanged while reducing future size on this call path so the Windows test no longer overflows its stack.

## Testing

- `cargo test -p codex-core --lib tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed -- --exact --nocapture`
- `cargo test -p codex-core` (this still hit unrelated local integration-test failures because `codex.exe` / `test_stdio_server.exe` were not present in this shell; the relevant unit tests passed)
This reverts commit b159d96.
## Summary

Set `RUST_MIN_STACK=8388608` for Rust test entry points so libtest-spawned test threads get an 8 MiB stack. The Windows BuildBuddy failure on #18893 showed `//codex-rs/tui:tui-unit-tests` exiting with a stack overflow in a `#[tokio::test]` even though later test binaries in the shard printed successful summaries. Default `#[tokio::test]` uses a current-thread Tokio runtime, which means the async test body is driven on libtest's std-spawned test thread. Increasing the test thread stack addresses that failure mode directly.

To date, we have been fixing these stack-pressure problems with localized future-size reductions, such as #13429, and by adding `Box::pin()` in specific async wrapper chains. This gives us a baseline test-runner stack size instead of continuing to patch individual tests only after CI finds another large async future.

## What changed

- Added `common --test_env=RUST_MIN_STACK=8388608` in `.bazelrc` so Bazel test actions receive the env var through Bazel's cache-keyed test environment path.
- Set the same `RUST_MIN_STACK` value for Cargo/nextest CI entry points and `just test`.
- Annotated the existing Windows Bazel linker stack reserve as 8 MiB so it stays aligned with the libtest thread stack size.

## Testing

- `just --list`
- parsed `.github/workflows/rust-ci.yml` and `.github/workflows/rust-ci-full.yml` with Ruby's YAML loader
- compared `bazel aquery` `TestRunner` action keys before/after explicit `--test_env=RUST_MIN_STACK=...` and after moving the Bazel env to `.bazelrc`
- `bazel test //codex-rs/tui:tui-unit-tests --test_output=errors` - failed locally on the existing sandbox-specific status snapshot permission mismatch, but loaded the Starlark changes and ran the TUI test shards
Summary
Support the existing hooks schema in inline TOML so hooks can be configured from both `config.toml` and enterprise-managed `requirements.toml` without requiring a separate `hooks.json` payload. This gives enterprise admins a way to ship managed hook policy through the existing requirements channel while still leaving script delivery to MDM or other device-management tooling, and it keeps `hooks.json` working unchanged for existing users.

This also lays the groundwork for follow-on managed filtering work such as #15937, while continuing to respect project trust gating from #14718. It does not implement `allow_managed_hooks_only` itself.

NOTE: yes, it's a bit unfortunate that the TOML isn't formatted as closely as usual to our default styling. This is because we're trying to stay compatible with the spec for plugins/hooks that we'll need to support, and the main use case here is embedding into requirements.toml.
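A hedged sketch of what inline hooks in `config.toml` might look like; the `type` tag, the event-table name, and the matcher-group nesting are illustrative assumptions, with only `command` and `timeout` taken from this PR's handler shape:

```toml
# Illustrative only -- not the verified schema.
[[hooks.pre_tool_use]]
type = "command"
command = "/usr/bin/python3 /absolute/path/to/pre_tool_use_policy.py"
timeout = 30
```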
What changed
- Moved the hooks schema from `codex-rs/hooks` into `codex-rs/config` so the same schema can power `hooks.json`, inline `config.toml` hooks, and managed `requirements.toml` hooks
- Added `hooks` support to both `ConfigToml` and `ConfigRequirementsToml`, including requirements-side `managed_dir`/`windows_managed_dir`
- Managed hook policy is `Constrained`, so it is merged atomically and cannot drift across requirement sources
- Discovery prefers `hooks.json`, then per-layer inline TOML hooks, with a warning when a single layer defines both representations
- Updated `/debug-config`
- Tests across `codex-rs/config`, `codex-rs/hooks`, `codex-rs/core/src/config_loader/tests.rs`, and `codex-rs/core/tests/suite/hooks.rs`
- `cargo test -p codex-config`
- `cargo test -p codex-hooks`
- `cargo test -p codex-app-server config_api`

Documentation
Companion updates are needed in the developers website repo for: