test: increase core-all-test shard count to 16 #19727

Merged
bolinfest merged 1 commit into main from pr19727
Apr 26, 2026

Conversation

bolinfest (Collaborator) commented Apr 26, 2026

Summary

Increase core-all-test's Bazel shard count from 8 to 16.

Why

#19609 restored bazel.yml to a 30-minute timeout and increased app-server-all-test's shard count because the bigger timeout risk was not just a cold Windows build. The more common problem was a long rust_test() shard failing and getting retried multiple times.

Recent main runs show that //codex-rs/core:core-all-test still has the same shape of problem on Windows:

  • Run 24943931330 reported //codex-rs/core:core-all-test as flaky after first-attempt failures in shard 5/8 and shard 8/8.
  • Those retries were driven by suite::cli_stream::responses_mode_stream_cli_supports_openai_base_url_config_override and suite::pending_input::steered_user_input_waits_when_tool_output_triggers_compact_before_next_request.
  • The failed shard attempts in that run took 272.61s and 259.27s before retrying, which is exactly the sort of wall-clock cost that burns through the 30-minute budget.
  • Run 24966332583 also retried //codex-rs/tui:tui-unit-tests after app::tests::update_memory_settings_updates_current_thread_memory_mode failed once on Windows.
  • Run 24965527138 and its linked BuildBuddy invocation show the other half of the problem: when Windows cache reuse is weak, the bazel test //... step can already consume 24m11s on its own, leaving very little headroom for flaky retries.
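To make the budget pressure concrete, the figures in the bullets above can be put side by side. This is only an illustrative sketch: it combines numbers from different runs and treats the 30-minute timeout as a single budget for the `bazel test //...` step, both of which are simplifying assumptions rather than measurements of any one run.

```python
# Figures quoted in the bullets above. Combining numbers from different runs
# is an illustrative assumption, not a measurement of any single run.
TIMEOUT_S = 30 * 60                    # 30-minute workflow timeout
BAZEL_TEST_S = 24 * 60 + 11            # 24m11s `bazel test //...` step with weak cache reuse
FAILED_ATTEMPTS_S = [272.61, 259.27]   # failed first attempts in shards 5/8 and 8/8

# Time left over for retries if the test step alone takes 24m11s.
headroom_s = TIMEOUT_S - BAZEL_TEST_S

for attempt_s in FAILED_ATTEMPTS_S:
    share = attempt_s / headroom_s
    print(f"failed attempt of {attempt_s:.2f}s = {share:.0%} of {headroom_s}s headroom")
```

Under these assumptions the headroom is 349 seconds, so a single failed attempt of the size seen in run 24943931330 would use roughly three quarters of it.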

Increasing core-all-test to 16 shards does not fix the flaky tests, but each shard then runs roughly half as many tests, so retrying a single failed shard costs less wall-clock time. That matches the mitigation we already applied to app-server-all-test in #19609.
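As a rough estimate of the effect, the failed-attempt durations from run 24943931330 can be scaled by the change in shard count. The helper below is hypothetical and assumes tests and runtime split evenly across shards. Fixed per-shard overhead such as process startup is not reduced by sharding, so real durations will likely be somewhat higher.

```python
def estimated_attempt_seconds(observed_s: float, old_shards: int, new_shards: int) -> float:
    """Scale an observed shard duration, assuming runtime splits evenly across shards."""
    return observed_s * old_shards / new_shards

# Failed first-attempt durations observed with 8 shards.
for observed in (272.61, 259.27):
    estimate = estimated_attempt_seconds(observed, old_shards=8, new_shards=16)
    print(f"{observed:.2f}s at 8 shards -> ~{estimate:.1f}s at 16 shards")
```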

What Changed

  • Update codex-rs/core/BUILD.bazel so core-all-test uses 16 shards instead of 8.
  • Leave core-unit-tests unchanged.

Follow-up Work

This change is meant to buy back CI headroom while we fix the flaky tests themselves in subsequent commits. The recent Windows retries that look worth addressing directly include:

  • suite::cli_stream::responses_mode_stream_cli_supports_openai_base_url_config_override
  • suite::pending_input::steered_user_input_waits_when_tool_output_triggers_compact_before_next_request
  • app::tests::update_memory_settings_updates_current_thread_memory_mode

Verification

  • Compared core-all-test's current sharding against the app-server-all-test precedent in #19609.
  • Inspected recent main Bazel workflow logs and the linked BuildBuddy invocation to confirm that Windows retries on long shards are still consuming a meaningful fraction of the 30-minute timeout budget.
  • Did not run local tests for this change because it only adjusts Bazel sharding metadata.

bolinfest requested a review from a team as a code owner April 26, 2026 22:41
bolinfest changed the title from "test: increase core-all-test shard count" to "test: increase core-all-test shard count to 16" Apr 26, 2026
bolinfest enabled auto-merge (squash) April 26, 2026 23:01
bolinfest merged commit 4c58e64 into main Apr 26, 2026
39 of 50 checks passed
bolinfest deleted the pr19727 branch April 26, 2026 23:10
github-actions bot locked and limited conversation to collaborators Apr 26, 2026

2 participants