fix: deflake //rs/state_layout:state_layout_test #9203

Merged
basvandijk merged 2 commits into master from ai/deflake-state_layout_test-2026-03-05
Mar 6, 2026

Conversation

@basvandijk
Collaborator

@basvandijk basvandijk commented Mar 5, 2026

Root Cause

The checkpoints_files_are_removed_after_flushing_removal_channel test creates 20 checkpoints with 500 dummy files each (10,000 files total), then removes 19 of them (9,500 file deletions) through the async removal channel. On busy CI machines with limited I/O bandwidth, this excessive file I/O causes the entire test binary to exceed its timeout.

The 500 files per checkpoint were intended to create a backlog in the checkpoint removal channel, so that we can make some assertions while the backlog is still clearing (namely that the checkpoint is no longer in the list of verified checkpoints) and other assertions after it has cleared (namely that the files are deleted from disk).
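For readers unfamiliar with the pattern, here is a minimal sketch of the two-phase assertion described above. It is not the actual state_layout code: `Layout`, `RemovalRequest`, `add_checkpoint`, `remove_checkpoint`, `flush`, and `demo` are all hypothetical names, and it assumes removal requests are processed in order by a single background thread.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical request type; the real removal channel is not shown in this PR.
enum RemovalRequest {
    /// Delete this checkpoint directory in the background.
    Remove(PathBuf),
    /// Reply once every earlier request has been processed.
    Flush(Sender<()>),
}

/// Hypothetical stand-in for the layout object under test.
struct Layout {
    verified: Vec<PathBuf>,
    tx: Sender<RemovalRequest>,
}

impl Layout {
    fn new() -> Self {
        let (tx, rx) = channel();
        // The worker exits when the sender is dropped together with `Layout`.
        thread::spawn(move || {
            for req in rx {
                match req {
                    RemovalRequest::Remove(dir) => {
                        let _ = fs::remove_dir_all(&dir);
                    }
                    RemovalRequest::Flush(done) => {
                        let _ = done.send(());
                    }
                }
            }
        });
        Layout { verified: Vec::new(), tx }
    }

    /// Creates a checkpoint directory filled with `num_files` dummy files.
    fn add_checkpoint(&mut self, dir: &Path, num_files: usize) -> std::io::Result<()> {
        fs::create_dir_all(dir)?;
        for i in 0..num_files {
            fs::write(dir.join(format!("dummy_{i}")), b"x")?;
        }
        self.verified.push(dir.to_path_buf());
        Ok(())
    }

    /// Unlists the checkpoint synchronously; the files are deleted asynchronously.
    fn remove_checkpoint(&mut self, dir: &Path) {
        self.verified.retain(|p| p.as_path() != dir);
        self.tx
            .send(RemovalRequest::Remove(dir.to_path_buf()))
            .expect("removal worker is alive");
    }

    /// Blocks until the removal backlog queued so far has been processed.
    fn flush(&self) {
        let (done_tx, done_rx) = channel();
        self.tx
            .send(RemovalRequest::Flush(done_tx))
            .expect("removal worker is alive");
        done_rx.recv().expect("removal worker replied");
    }
}

/// Returns (unlisted right after removal, files gone after flush).
fn demo(files_per_checkpoint: usize) -> std::io::Result<(bool, bool)> {
    let root = std::env::temp_dir().join(format!("removal_sketch_{}", std::process::id()));
    let dir = root.join("checkpoint_1");
    let mut layout = Layout::new();
    layout.add_checkpoint(&dir, files_per_checkpoint)?;
    layout.remove_checkpoint(&dir);
    // Phase 1: holds regardless of how much backlog remains.
    let unlisted = !layout.verified.contains(&dir);
    layout.flush();
    // Phase 2: only guaranteed once the backlog has been flushed.
    let deleted = !dir.exists();
    let _ = fs::remove_dir_all(&root);
    Ok((unlisted, deleted))
}
```

In this sketch the first check does not depend on the backlog at all, so the dummy files only matter for making the "still clearing" window observable. That is why the file count can be tuned for I/O cost.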

Fix

Reduce the dummy file count from 500 to 50 per checkpoint. This drops total file I/O from ~10,000 to ~1,000 files. It strikes a balance between keeping the backlog large enough that it has definitely not cleared before the first assertions run and keeping the overall I/O load low. If this reduction is not enough, we can consider adding artificial blockers to the backlog, but that does not seem necessary for this test at the moment.
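The file counts can be checked with a small helper. This is only an arithmetic sketch: `file_io` is a hypothetical name, and it assumes the test still creates 20 checkpoints and removes 19 of them after the change.

```rust
/// Returns (files created, files deleted) for a given dummy file count per checkpoint.
fn file_io(checkpoints: u64, removed: u64, files_per_checkpoint: u64) -> (u64, u64) {
    (
        checkpoints * files_per_checkpoint,
        removed * files_per_checkpoint,
    )
}
```

With 500 files per checkpoint, `file_io(20, 19, 500)` gives 10,000 files created and 9,500 deleted. With 50 it gives 1,000 created and 950 deleted, a tenfold reduction.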


This PR was created following the steps in .claude/skills/fix-flaky-tests/SKILL.md.

@github-actions github-actions bot added the fix label Mar 5, 2026
@basvandijk basvandijk marked this pull request as ready for review March 5, 2026 14:47
@basvandijk basvandijk requested a review from a team as a code owner March 5, 2026 14:47
@basvandijk basvandijk enabled auto-merge March 5, 2026 17:41
@basvandijk basvandijk added this pull request to the merge queue Mar 6, 2026
Merged via the queue into master with commit 79faae8 Mar 6, 2026
40 checks passed
@basvandijk basvandijk deleted the ai/deflake-state_layout_test-2026-03-05 branch March 6, 2026 10:28