Skip to content

Added support for live updates to skills and AGENTS#9985

Open
etraut-openai wants to merge 22 commits intomainfrom
etraut/live_skill_update
Open

Added support for live updates to skills and AGENTS#9985
etraut-openai wants to merge 22 commits intomainfrom
etraut/live_skill_update

Conversation

@etraut-openai
Copy link
Collaborator

@etraut-openai etraut-openai commented Jan 27, 2026

Overview

Add a centralized FileWatcher in codex-core (using notify) that watches:

  • AGENTS.md / AGENTS.override.md and project AGENTS search dirs
  • Skill roots from the config layer stack (recursive)

Send AgentsChanged and SkillsChanged events when relevant file system changes are detected

On SkillsChanged:

  • Invalidate the skills cache immediately in ThreadManager
  • Emit EventMsg::SkillsUpdateAvailable to active sessions
  • Broadcast a new app-server notification: skills/list/updated

On AgentsChanged:

  • Set a per-session agents_changed flag scoped to relevant watch dirs
  • On the next user turn, recompute skills + user instructions and inject an updated UserInstructions item into the event stream

Wire the watcher through ThreadManager -> Codex::spawn -> SessionServices, including sub-agent sessions.

Add SkillsListUpdatedNotification to the app-server protocol and gate broadcast until after initialize.

Testing

I did a bunch of manual testing of both AGENTS and skills updates. They work surprisingly well. Skill addition and removal are both handled seamlessly in both the UI and in the behavior of the agent on the next turn. Changes to skill details were picked up and followed on the next turn. Additions and modifications to AGENTS.md were followed well. The only failure that I found was when I deleted a major block of instructions from AGENTS.md. In that case, the model still followed those instructions from the older AGENTS.md because it was still in the context window.

Add a centralized FileWatcher in codex-core (using notify) that watches:
* AGENTS.md / AGENTS.override.md and project AGENTS search dirs
* Skill roots from the config layer stack (recursive)

Send `AgentsChanged` and `SkillsChanged` events when relevant file system changes are detected

On `SkillsChanged`:
* Invalidate the skills cache immediately in ThreadManager
* Emit EventMsg::SkillsUpdateAvailable to active sessions
* Broadcast a new app-server notification: skills/list/updated

On `AgentsChanged`:
* Set a per-session agents_changed flag scoped to relevant watch dirs
* On the next user turn, recompute skills + user instructions and inject an updated UserInstructions item into the event stream

Wire the watcher through ThreadManager -> Codex::spawn -> SessionServices, including sub-agent sessions.

Refactor project_doc discovery to expose project_doc_search_dirs for shared watch-dir computation.

Add SkillsListUpdatedNotification to the app-server protocol and gate broadcast until after initialize.

Testing:

I did a bunch of manual testing of both AGENTS and skills updates. They work surprisingly well. Skill addition and removal are both handled seamlessly in both the UI and in the behavior of the agent on the next turn. Changes to skill details were picked up and followed on the next turn. Additions and modifications to AGENTS.md were followed well. The only failure that I found was when I deleted a major block of instructions from AGENTS.md. In that case, the model still followed those instructions from the older AGENTS.md because it was still in the context window.
Copy link
Collaborator

@xl-openai xl-openai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making this change! The skills part LGTM!

);
}
let state = SessionState::new(session_configuration.clone());
let agents_watch_dirs = Self::build_agents_watch_dirs(&config);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't agents.md cwd-specific? Can we create the watcher only once for the entire session?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is taking cwd into consideration. It builds a list of all directories from the cwd up to the root of the repo and watches for changes in those directories. This mirrors the same logic used when enumerating the AGENTS files.

let enabled_skills = skills_outcome.enabled_skills();
let user_instructions = get_user_instructions(&config, Some(&enabled_skills)).await;
let mut state = self.state.lock().await;
state.session_configuration.user_instructions = user_instructions.clone();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we let the model know which file names changed instead? And let it decide which ones it cares about?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current logic can inadvertently create huge messages and flood the context. We've seen very large skills + agents combinations.

If we inject the entire thing for every change we'll exhaust the context in no time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it will work to only tell it which file names changed. We need to update the names and summaries of the skills. Both the name and summary are bounded in size (64 characters and 512 characters, respectively). We are not including the full skill file here.

We do include the full agents contents. I agree that's a concern in terms of size. But I don't see a way around that. We don't initially tell the model the file paths of the AGENTS files; we simply read the contents of all of these files, concatenate them, and then truncate to 32K. If we were to later tell it about file paths, it would have no way of correlating those paths with the original AGENTS text. I suppose we could modify the way we handle AGENTS, but that's a pretty big change and would probably be OOD for the model. Let me know if you have any other suggestions.

Copy link
Collaborator Author

@etraut-openai etraut-openai Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that we're reinjecting only on turn boundaries. This acts as a "debounce" mechanism in the case where multiple file changes occur in quick succession. But I understand your concern about potential storms of file watcher events. Since we deliver a SkillsListUpdatedNotification app server notification each time, it could swap an app server client. I've added an additional one-second throttling/debounce mechanism.

Copy link
Collaborator

@pakrym-oai pakrym-oai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs tests.

I think re-injecting full Agents.md and skills is too much of a risk/context bloat. We should consider only injecting updated file names for both skills and agents along with some short prompt and letting the model decide.

I'd also prefer if we had a very aggressive throttle on file change events. If there is any bug in file_watcher it'll flood the session (some editors perform save operations that cause multiple file notifications, rename, create, delete etc).

And we should consider shipping this behind a feature first.

@etraut-openai
Copy link
Collaborator Author

@codex review

Copy link
Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c8b6f27de5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@etraut-openai
Copy link
Collaborator Author

codex review

@etraut-openai
Copy link
Collaborator Author

@pakrym-oai, I've made the following changes since the last review:

  1. Added experimental feature flags live_agents_reload and live_skills_reload; they default to false for now
  2. Added an "event coalescer" mechanism for file watcher events that limits them to one per second
  3. Added integration tests
  4. Added support for fallback filenames (a config option that allows users to specify alternative names for AGENTS.md)
  5. Modified AGENTS.md for this repo to remove instructions about updating docs, since docs are no longer stored in this repo

I didn't make changes to the injection strategy (see responses above). Let me know if you have other thoughts here.

@xiaotang-12-ops

This comment was marked as off-topic.


// Filter out synthetic user-instructions messages so truncation counts
// only real user turns.
items.retain(|item| match item {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we start needing this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is required for some of the existing tests to accommodate the new skills & AGENTS injection.

# Conflicts:
#	codex-rs/app-server/src/message_processor.rs
#	codex-rs/cloud-requirements/src/lib.rs
#	codex-rs/core/src/codex.rs
# Conflicts:
#	codex-rs/app-server/src/message_processor.rs
# Conflicts:
#	codex-rs/core/Cargo.toml
#	codex-rs/core/src/codex.rs
etraut-openai added a commit that referenced this pull request Feb 3, 2026
Add a centralized FileWatcher in codex-core (using notify) that watches skill roots from the config layer stack (recursive)

Send `SkillsChanged` events when relevant file system changes are detected

On `SkillsChanged`:
* Invalidate the skills cache immediately in ThreadManager
* Emit EventMsg::SkillsUpdateAvailable to active sessions
* Broadcast a new app-server notification: skills/list/updated

Add SkillsListUpdatedNotification to the app-server protocol and gate broadcast until after initialize.

This change does not inject new events into the event stream. That means the agent will not know about new skills, so it won't be able to implicitly invoke new skills. It also won't know about changes to existing skills, so if it has already read the contents of a modified skill, it will not honor the new behavior.

I plan to address these limitations in a follow-on PR modeled after #9985. Injection of new skills (and AGENTS) was deemed to risky at this point, hence the need to split the feature into two stages.

Testing:
* In addition to automated tests, I did manual testing to confirm that newly-created skills, deleted skills, and renamed skills are reflected in the TUI skill picker menu. Also confirmed that modifications to behaviors for explicitly-invoked skills are honored.
xl-openai added a commit that referenced this pull request Feb 4, 2026
Add a centralized FileWatcher in codex-core (using notify) that watches
skill roots from the config layer stack (recursive)

Send `SkillsChanged` events when relevant file system changes are
detected

On `SkillsChanged`:
* Invalidate the skills cache immediately in ThreadManager
* Emit EventMsg::SkillsUpdateAvailable to active sessions
~~* Broadcast a new app-server notification:
SkillsListUpdatedNotification~~

This change does not inject new items into the event stream. That means
the agent will not know about new skills, so it won't be able to
implicitly invoke new skills. It also won't know about changes to
existing skills, so if it has already read the contents of a modified
skill, it will not honor the new behavior.

This change also does not detect modifications to AGENTS.md.

I plan to address these limitations in a follow-on PR modeled after
#9985. Injection of new skills and AGENTS was deemed to risky, hence the
need to split the feature into two stages. The changes in this PR were
designed to easily accommodate the second stage once we have some other
foundational changes in place.

Testing: In addition to automated tests, I did manual testing to confirm
that newly-created skills, deleted skills, and renamed skills are
reflected in the TUI skill picker menu. Also confirmed that
modifications to behaviors for explicitly-invoked skills are honored.

---------

Co-authored-by: Xin Lin <xl@openai.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants