Stabilize plugin MCP fixture tests#19452
Merged
dylan-hurd-oai merged 2 commits intomainfrom Apr 28, 2026
Merged
Conversation
xl-openai
approved these changes
Apr 24, 2026
pakrym-oai
approved these changes
Apr 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
Recent
mainCI had repeated flakes in the plugin fixture tests:codex-core::all suite::plugins::explicit_plugin_mentions_inject_plugin_guidancefailed in runs 24909500958, 24908076251, 24906197645, and 24898949647.codex-core::all suite::plugins::plugin_mcp_tools_are_listedfailed in runs 24909500958, 24908076251, and 24898949647.The failures were in the same plugin/MCP fixture family: assertions expected sample plugin guidance or tool inventory, but the test could observe the session before the sample MCP server had finished startup.
Root Cause
explicit_plugin_mentions_inject_plugin_guidancesubmitted the user turn immediately after constructing the session. MCP startup is asynchronous, so on a slower or busier CI runner the prompt could be built before the sample plugin MCP server had reported its tools. That made the test depend on scheduler timing rather than the fixture being ready.plugin_mcp_tools_are_listedalready needed the same readiness condition, but its wait logic was local to that test.What Changed
wait_for_sample_mcp_readyhelper for the plugin fixture tests.McpStartupCompletebefore submitting the explicit plugin mention turn.Why This Should Be Reliable
The tests now wait for the explicit readiness signal from the sample MCP server before asserting guidance or tools derived from that server. This removes the startup race while still exercising the real fixture path, so the assertions should only run after the plugin inventory is deterministic.
Verification
cargo test -p codex-core --test all plugins::