Terminate stdio MCP servers on shutdown to avoid process leaks#19753
Terminate stdio MCP servers on shutdown to avoid process leaks#19753etraut-openai merged 6 commits intomainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 67e318148b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1133a8e4c2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
- openai/codex#19773 permissions: require profiles in TUI thread state (merge-after-nits) - openai/codex#19755 Add Responses stream lifecycle diagnostics (merge-after-nits) - openai/codex#19753 Terminate stdio MCP servers on shutdown (merge-as-is) - BerriAI/litellm#26581 fix(semantic-filter): run for anthropic_messages call type (merge-as-is)
Why
Several bug reports describe thread shutdown (including subagent threads) leaving stdio MCP server processes behind. These reports all point at the same lifecycle gap: Codex launches stdio MCP servers, but the session-level shutdown path does not explicitly close MCP clients or terminate the server process tree.
Fixes #12491
Fixes #12976
Fixes #18881
Fixes #19469
History
This is best understood as a regression/coverage gap in MCP session lifecycle management, not as stdio MCP cleanup being absent all along. #10710 added process-group cleanup for stdio MCP servers, but that cleanup only runs when the
RmcpClient/transport is dropped. The older reports (#12491 and #12976) came after that cleanup existed, which suggests the remaining problem was that some higher-level shutdown paths kept the MCP manager alive or replaced it without explicitly draining clients. The newer reports (#18881 and #19469) exposed the same family around manager replacement and shutdown.What changed
codex-rmcp-clientso local MCP servers terminate their process group and executor-backed MCP servers call the executor process terminator.RmcpClient::shutdown()and manager-level MCP shutdown draining so session shutdown, channel-close fallback, MCP refresh, and connector probing stop owned MCP clients.Verification
cargo test -p codex-rmcp-clientcargo test -p codex-mcpjust fix -p codex-rmcp-clientjust fix -p codex-mcpjust fix -p codex-coreManual before/after validation with a temporary repro script:
HEAD^(fed0a8f4fa): reproduced the leak with surviving MCP server and child PIDs,survivors=[77583, 77592],leaked=true.67e318148b): verified both MCP processes were gone after interruptingcodex exec,survivors=[],leaked=false.