Skip to content

Stabilize standalone user shell context test#23614

Open
starr-openai wants to merge 1 commit into
mainfrom
starr/fullci-10-user-shell-reference-context-20260520
Open

Stabilize standalone user shell context test#23614
starr-openai wants to merge 1 commit into
mainfrom
starr/fullci-10-user-shell-reference-context-20260520

Conversation

@starr-openai
Copy link
Copy Markdown
Contributor

Why

rust-ci-full run 26140060917 failed codex-core::session::tests::run_user_shell_command_does_not_set_reference_context_item on windows-x64-shard-2 after both nextest attempts timed out waiting for TurnComplete.

That wait is not part of the invariant this test protects. The regression would happen when the standalone shell task is scheduled and seeds the regular-turn baseline; the test only needs to assert that reference_context_item remains unset after run_user_shell_command() returns.

What Changed

  • Stop waiting for the spawned shell task to finish in run_user_shell_command_does_not_set_reference_context_item.
  • Keep the reference-context assertion and abort the background task before the test exits.

Verification

  • Source-grounded against the failing Windows CI log from run 26140060917 / job 76885730325.
  • git diff --check

@starr-openai starr-openai requested a review from a team as a code owner May 20, 2026 04:28
Copy link
Copy Markdown
Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef68474214

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 7433 to 7436
assert!(
session.reference_context_item().await.is_none(),
"standalone shell tasks should not mutate previous context"
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Wait for the shell task before asserting context state

With this change the assertion runs immediately after run_user_shell_command() returns, but the standalone shell work is executed in the task spawned by spawn_task; run_user_shell_command() can return before UserShellCommandTask::run has executed the code path that would mutate reference_context_item. In that interleaving, a regression that sets the baseline during the background shell task would still pass because the task is then aborted after the early assertion. Please wait for a deterministic signal that the spawned task has reached or completed the relevant shell execution path before checking the context state.

Useful? React with 👍 / 👎.

@starr-openai starr-openai force-pushed the starr/fullci-10-user-shell-reference-context-20260520 branch from ef68474 to d2c9132 Compare May 20, 2026 04:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant