test: increase core-all-test shard count to 16 by bolinfest · Pull Request #19727 · openai/codex

bolinfest · 2026-04-26T22:41:51Z

Summary

Increase core-all-test's Bazel shard count from 8 to 16.

Why

#19609 restored bazel.yml to a 30-minute timeout and increased app-server-all-test's shard count because the bigger timeout risk was not just a cold Windows build. The more common problem was a long rust_test() shard failing and getting retried multiple times.

Recent main runs show that //codex-rs/core:core-all-test still has the same shape of problem on Windows:

Run 24943931330 reported //codex-rs/core:core-all-test as flaky after first-attempt failures in shard 5/8 and shard 8/8.
Those retries were driven by suite::cli_stream::responses_mode_stream_cli_supports_openai_base_url_config_override and suite::pending_input::steered_user_input_waits_when_tool_output_triggers_compact_before_next_request.
The failed shard attempts in that run took 272.61s and 259.27s before retrying, which is exactly the sort of wall-clock cost that burns through the 30-minute budget.
Run 24966332583 also retried //codex-rs/tui:tui-unit-tests after app::tests::update_memory_settings_updates_current_thread_memory_mode failed once on Windows.
Run 24965527138 and its linked BuildBuddy invocation show the other half of the problem: when Windows cache reuse is weak, the bazel test //... step can already consume 24m11s on its own, leaving under six minutes of the 30-minute timeout for flaky retries.
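
For a rough sense of scale, the sketch below combines the figures quoted above. It is only illustrative: the 24m11s step and the failed shard attempts come from different runs, and it assumes a failed attempt's duration lands entirely on the critical path. The constant and function names are made up for this example.

```python
# Figures quoted in this PR description (note: taken from different runs).
TIMEOUT_S = 30 * 60                   # bazel.yml timeout restored in #19609
BAZEL_TEST_S = 24 * 60 + 11           # `bazel test //...` step in run 24965527138
FAILED_ATTEMPTS_S = [272.61, 259.27]  # failed shard attempts in run 24943931330


def headroom_seconds(timeout_s: float, step_s: float) -> float:
    """Time left in the job budget once the test step has finished."""
    return timeout_s - step_s


headroom = headroom_seconds(TIMEOUT_S, BAZEL_TEST_S)  # 349 seconds

for attempt in FAILED_ATTEMPTS_S:
    # Assumption: the whole failed attempt is added to the job's wall-clock time.
    share = attempt / headroom
    print(f"{attempt:.2f}s failed attempt = {share:.0%} of {headroom}s headroom")
```

Under those assumptions, a single failed attempt of that length would use roughly three quarters of the remaining budget before the retry even starts.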

Increasing core-all-test to 16 shards does not fix the flaky tests, but it does reduce the wall-clock cost when a single shard has to be retried. That matches the mitigation we already applied to app-server-all-test in #19609.
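
As a back-of-the-envelope model of why more shards help: assume test time is spread evenly across shards (real shards are uneven, so the numbers are illustrative only) and that a retry reruns only the failed shard. The function names are hypothetical, and the 272.61s input is one of the failed attempts from run 24943931330 treated as a typical 8-shard duration.

```python
def shard_seconds(total_test_seconds: float, shard_count: int) -> float:
    """Duration of one shard, assuming tests split evenly across shards."""
    return total_test_seconds / shard_count


def retry_cost_seconds(total_test_seconds: float, shard_count: int, retries: int = 1) -> float:
    """Extra wall-clock time spent rerunning a single failed shard."""
    return retries * shard_seconds(total_test_seconds, shard_count)


# Assumption: a 272.61s shard at 8 shards implies about 8 * 272.61s of total test time.
estimated_total = 8 * 272.61

for shards in (8, 16):
    cost = retry_cost_seconds(estimated_total, shards)
    print(f"{shards} shards: one retry costs about {cost:.1f}s")
```

In this model, doubling the shard count halves the cost of each retry (about 272.6s down to about 136.3s), while the number of flaky failures stays the same.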

What Changed

Update codex-rs/core/BUILD.bazel so core-all-test uses 16 shards instead of 8.
Leave core-unit-tests unchanged.

Follow-up Work

This change is meant to buy back CI headroom while we fix the flaky tests themselves in subsequent commits. The recent Windows retries that look worth addressing directly include:

suite::cli_stream::responses_mode_stream_cli_supports_openai_base_url_config_override
suite::pending_input::steered_user_input_waits_when_tool_output_triggers_compact_before_next_request
app::tests::update_memory_settings_updates_current_thread_memory_mode

Verification

Compared core-all-test's current sharding against the app-server-all-test precedent in #19609.
Inspected recent main Bazel workflow logs and the linked BuildBuddy invocation to confirm that Windows retries on long shards are still consuming a meaningful fraction of the 30-minute timeout budget.
Did not run local tests for this change because it only adjusts Bazel sharding metadata.

bolinfest requested a review from a team as a code owner April 26, 2026 22:41

bolinfest changed the title from "test: increase core-all-test shard count" to "test: increase core-all-test shard count to 16" Apr 26, 2026

bolinfest requested review from aibrahim-oai and pakrym-oai April 26, 2026 22:49

pakrym-oai approved these changes Apr 26, 2026

test: increase core-all-test shard count

0bd83bc

bolinfest force-pushed the pr19727 branch from cc4b12b to 0bd83bc April 26, 2026 22:59

bolinfest enabled auto-merge (squash) April 26, 2026 23:01

bolinfest merged commit 4c58e64 into main Apr 26, 2026
39 of 50 checks passed

bolinfest deleted the pr19727 branch April 26, 2026 23:10

github-actions Bot locked and limited conversation to collaborators Apr 26, 2026
