Skip to content

fix(app-server): stop skills watcher on shutdown#23557

Closed
fcoury-oai wants to merge 1 commit into
mainfrom
fcoury/fix-slow-shutdown-skills-watcher
Closed

fix(app-server): stop skills watcher on shutdown#23557
fcoury-oai wants to merge 1 commit into
mainfrom
fcoury/fix-slow-shutdown-skills-watcher

Conversation

@fcoury-oai
Copy link
Copy Markdown
Contributor

@fcoury-oai fcoury-oai commented May 19, 2026

Why

Pressing Ctrl+C in the TUI could make Codex take about 5 seconds to exit.

Shutdown tracing showed that the delay was the in-process app-server waiting for its outbound task to finish until the shutdown timeout elapsed. The remaining outbound sender owner was consistently skills_watcher.event_loop.

The app-server-owned watcher task captured Arc<OutgoingMessageSender> and waited indefinitely on file-watch events, so shutdown could not close the outbound channel promptly.

What Changed

  • Add an explicit shutdown signal to SkillsWatcher.
  • Make the watcher event loop exit when either a file-watch event arrives or shutdown is requested.
  • Call skills_watcher.shutdown() from MessageProcessor::clear_runtime_references() during app-server teardown.

This lets the watcher task drop its OutgoingMessageSender clone during shutdown, so the outbound channel closes without waiting for the 5 second timeout.

How to Test

  1. Start Codex TUI.
  2. Press Ctrl+C.
  3. Confirm Codex exits promptly instead of pausing for about 5 seconds.

Focused regression check from the shutdown investigation:

  • Before this fix, 3/3 sampled shutdowns timed out in app_server.runtime.wait_outbound_handle, app_server_runtime.wait_shutdown_ack, and app_server_client.wait_shutdown_response.
  • With the same fix applied on the investigation branch, 2/2 sampled shutdowns had no shutdown timeouts and app_server.runtime.wait_outbound_handle completed in 0ms.

Local validation:

  • just fmt
  • git diff --check

Local lint note:

  • just argument-comment-lint did not reach this diff because local Bazel analysis failed in external llvm / BuildBuddy setup before the linter ran.

@fcoury-oai
Copy link
Copy Markdown
Contributor Author

Superseded by #23578.

@fcoury-oai fcoury-oai closed this May 19, 2026
@fcoury-oai fcoury-oai deleted the fcoury/fix-slow-shutdown-skills-watcher branch May 19, 2026 23:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant