Added support for live updates to skills and AGENTS#9985
Added support for live updates to skills and AGENTS#9985etraut-openai wants to merge 22 commits intomainfrom
Conversation
Add a centralized FileWatcher in codex-core (using notify) that watches: * AGENTS.md / AGENTS.override.md and project AGENTS search dirs * Skill roots from the config layer stack (recursive) Send `AgentsChanged` and `SkillsChanged` events when relevant file system changes are detected On `SkillsChanged`: * Invalidate the skills cache immediately in ThreadManager * Emit EventMsg::SkillsUpdateAvailable to active sessions * Broadcast a new app-server notification: skills/list/updated On `AgentsChanged`: * Set a per-session agents_changed flag scoped to relevant watch dirs * On the next user turn, recompute skills + user instructions and inject an updated UserInstructions item into the event stream Wire the watcher through ThreadManager -> Codex::spawn -> SessionServices, including sub-agent sessions. Refactor project_doc discovery to expose project_doc_search_dirs for shared watch-dir computation. Add SkillsListUpdatedNotification to the app-server protocol and gate broadcast until after initialize. Testing: I did a bunch of manual testing of both AGENTS and skills updates. They work surprisingly well. Skill addition and removal are both handled seamlessly in both the UI and in the behavior of the agent on the next turn. Changes to skill details were picked up and followed on the next turn. Additions and modifications to AGENTS.md were followed well. The only failure that I found was when I deleted a major block of instructions from AGENTS.md. In that case, the model still followed those instructions from the older AGENTS.md because it was still in the context window.
xl-openai
left a comment
There was a problem hiding this comment.
Thanks for making this change! The skills part LGTM!
codex-rs/core/src/codex.rs
Outdated
| ); | ||
| } | ||
| let state = SessionState::new(session_configuration.clone()); | ||
| let agents_watch_dirs = Self::build_agents_watch_dirs(&config); |
There was a problem hiding this comment.
Aren't agents.md cwd-specific? Can we create the watcher only once for the entire session?
There was a problem hiding this comment.
This function is taking cwd into consideration. It builds a list of all directories from the cwd up to the root of the repo and watches for changes in those directories. This mirrors the same logic used when enumerating the AGENTS files.
| let enabled_skills = skills_outcome.enabled_skills(); | ||
| let user_instructions = get_user_instructions(&config, Some(&enabled_skills)).await; | ||
| let mut state = self.state.lock().await; | ||
| state.session_configuration.user_instructions = user_instructions.clone(); |
There was a problem hiding this comment.
Should we let the model know which file names changed instead? And let it decide which ones it cares about?
There was a problem hiding this comment.
The current logic can inadvertently create huge messages and flood the context. We've seen very large skills + agents combinations.
If we inject the entire thing for every change we'll exhaust the context in no time.
There was a problem hiding this comment.
I don't think it will work to only tell it which file names changed. We need to update the names and summaries of the skills. Both the name and summary are bounded in size (64 characters and 512 characters, respectively). We are not including the full skill file here.
We do include the full agents contents. I agree that's a concern in terms of size. But I don't see a way around that. We don't initially tell the model the file paths of the AGENTS files; we simply read the contents of all of these files, concatenate them, and then truncate to 32K. If we were to later tell it about file paths, it would have no way of correlating those paths with the original AGENTS text. I suppose we could modify the way we handle AGENTS, but that's a pretty big change and would probably be OOD for the model. Let me know if you have any other suggestions.
There was a problem hiding this comment.
Note that we're reinjecting only on turn boundaries. This acts as a "debounce" mechanism in the case where multiple file changes occur in quick succession. But I understand your concern about potential storms of file watcher events. Since we deliver a SkillsListUpdatedNotification app server notification each time, it could swap an app server client. I've added an additional one-second throttling/debounce mechanism.
There was a problem hiding this comment.
Needs tests.
I think re-injecting full Agents.md and skills is too much of a risk/context bloat. We should consider only injecting updated file names for both skills and agents along with some short prompt and letting the model decide.
I'd also prefer if we had a very aggressive throttle on file change events. If there is any bug in file_watcher it'll flood the session (some editors perform save operations that cause multiple file notifications, rename, create, delete etc).
And we should consider shipping this behind a feature first.
# Conflicts: # codex-rs/core/config.schema.json
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c8b6f27de5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
codex review |
|
@pakrym-oai, I've made the following changes since the last review:
I didn't make changes to the injection strategy (see responses above). Let me know if you have other thoughts here. |
This comment was marked as off-topic.
This comment was marked as off-topic.
|
|
||
| // Filter out synthetic user-instructions messages so truncation counts | ||
| // only real user turns. | ||
| items.retain(|item| match item { |
There was a problem hiding this comment.
Why did we start needing this?
There was a problem hiding this comment.
This is required for some of the existing tests to accommodate the new skills & AGENTS injection.
# Conflicts: # codex-rs/app-server/src/message_processor.rs # codex-rs/cloud-requirements/src/lib.rs # codex-rs/core/src/codex.rs
# Conflicts: # codex-rs/app-server/src/message_processor.rs
# Conflicts: # codex-rs/core/Cargo.toml # codex-rs/core/src/codex.rs
Add a centralized FileWatcher in codex-core (using notify) that watches skill roots from the config layer stack (recursive) Send `SkillsChanged` events when relevant file system changes are detected On `SkillsChanged`: * Invalidate the skills cache immediately in ThreadManager * Emit EventMsg::SkillsUpdateAvailable to active sessions * Broadcast a new app-server notification: skills/list/updated Add SkillsListUpdatedNotification to the app-server protocol and gate broadcast until after initialize. This change does not inject new events into the event stream. That means the agent will not know about new skills, so it won't be able to implicitly invoke new skills. It also won't know about changes to existing skills, so if it has already read the contents of a modified skill, it will not honor the new behavior. I plan to address these limitations in a follow-on PR modeled after #9985. Injection of new skills (and AGENTS) was deemed to risky at this point, hence the need to split the feature into two stages. Testing: * In addition to automated tests, I did manual testing to confirm that newly-created skills, deleted skills, and renamed skills are reflected in the TUI skill picker menu. Also confirmed that modifications to behaviors for explicitly-invoked skills are honored.
Add a centralized FileWatcher in codex-core (using notify) that watches skill roots from the config layer stack (recursive) Send `SkillsChanged` events when relevant file system changes are detected On `SkillsChanged`: * Invalidate the skills cache immediately in ThreadManager * Emit EventMsg::SkillsUpdateAvailable to active sessions ~~* Broadcast a new app-server notification: SkillsListUpdatedNotification~~ This change does not inject new items into the event stream. That means the agent will not know about new skills, so it won't be able to implicitly invoke new skills. It also won't know about changes to existing skills, so if it has already read the contents of a modified skill, it will not honor the new behavior. This change also does not detect modifications to AGENTS.md. I plan to address these limitations in a follow-on PR modeled after #9985. Injection of new skills and AGENTS was deemed to risky, hence the need to split the feature into two stages. The changes in this PR were designed to easily accommodate the second stage once we have some other foundational changes in place. Testing: In addition to automated tests, I did manual testing to confirm that newly-created skills, deleted skills, and renamed skills are reflected in the TUI skill picker menu. Also confirmed that modifications to behaviors for explicitly-invoked skills are honored. --------- Co-authored-by: Xin Lin <xl@openai.com>
Overview
Add a centralized FileWatcher in codex-core (using notify) that watches:
AGENTS.md/AGENTS.override.mdand projectAGENTSsearch dirsSend
AgentsChangedandSkillsChangedevents when relevant file system changes are detectedOn
SkillsChanged:On
AgentsChanged:Wire the watcher through
ThreadManager->Codex::spawn->SessionServices, including sub-agent sessions.Add
SkillsListUpdatedNotificationto the app-server protocol and gate broadcast until after initialize.Testing
I did a bunch of manual testing of both AGENTS and skills updates. They work surprisingly well. Skill addition and removal are both handled seamlessly in both the UI and in the behavior of the agent on the next turn. Changes to skill details were picked up and followed on the next turn. Additions and modifications to AGENTS.md were followed well. The only failure that I found was when I deleted a major block of instructions from AGENTS.md. In that case, the model still followed those instructions from the older AGENTS.md because it was still in the context window.