Summary
Three distinct tests failed against committed code (Timer and Post Merge Action builds on main) in the 24-hour window ending 2026-05-02T10:00Z. This report documents the failures, historical flake rates, and local reproduction results.
Failing Tests
1. ConcurrentSeqNoVersioningIT.testSeqNoCASLinearizability
| Field | Value |
| --- | --- |
| Recent builds | 75573, 75596, 75597, 75606 |
| Error | java.lang.AssertionError: Must be linearizable |
| Seed (build 75597) | 4009CBE1E08681CF |
| Reproduced locally | ✅ Yes, deterministic with seed |
| First failure | 2024-10-03 (build 48853) |
| Total unique builds affected | 105 |
| Pattern | Chronic flake, significantly worsening. Jumped from 3–6 builds/month to 35 builds in April 2026, coinciding with the CI runner migration to m7a.8xlarge (~April 15, 2026). Already 5 builds affected in the first 2 days of May. |
2. RecoveryWhileUnderLoadIT.testRecoverWhileUnderLoadAllocateReplicasRelocatePrimariesTest
| Field | Value |
| --- | --- |
| Recent build | 75596 |
| Error | java.lang.RuntimeException: java.util.NoSuchElementException: No value present at OpenSearchIntegTestCase.waitForReplication |
| Seed (build 75596) | 1F84DD846AE1E50D |
| Reproduced locally | ❌ No, seed is not a reliable reproducer (expected for timing-dependent cluster tests) |
| First failure | 2024-04-03 (build 36357) |
| Total unique builds affected | 248 |
| Pattern | Chronic flake, worsening trend in 2026. Had a major spike in mid-2025 (77 builds in June 2025), subsided, then re-escalated: 13 builds in Feb 2026, 29 in March, 28 in April. The non-reproducibility with the seed is consistent with timing-dependent behavior in cluster disruption/recovery tests. |
3. NRTReplicationEngineTests.testGetSegmentInfosSnapshotPreservesFilesUntilRelease
| Field | Value |
| --- | --- |
| Recent build | 75566 |
| Error | java.lang.AssertionError: expected:<2> but was:<1> |
| Seed (build 75566) | 58B8E8BD126D852E |
| Reproduced locally | ✅ Yes, deterministic with seed |
| First failure | 2024-03-26 (build 35979) |
| Total unique builds affected | 58 |
| Pattern | Chronic low-level flake, stable to slightly worsening. Consistently 1–4 builds/month for most of its history. Saw a bump to 9 builds in Oct 2025 and 7 in April 2026, but otherwise steady. The deterministic seed reproduction suggests a test-logic or assertion bug rather than a timing issue. |
Summary Table
Sorted by total unique builds affected (descending):
| Test | Builds Affected | First Failure | Reproduced | Trend |
| --- | --- | --- | --- | --- |
| RecoveryWhileUnderLoadIT.testRecoverWhileUnderLoadAllocateReplicasRelocatePrimariesTest | 248 | 2024-04-03 | ❌ | Worsening |
| ConcurrentSeqNoVersioningIT.testSeqNoCASLinearizability | 105 | 2024-10-03 | ✅ | Significantly worsening |
| NRTReplicationEngineTests.testGetSegmentInfosSnapshotPreservesFilesUntilRelease | 58 | 2024-03-26 | ✅ | Stable / slightly worsening |
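
For anyone repeating the local reproduction attempts, the following is a minimal sketch that assembles a seed-pinned Gradle command for each failing test. The test names and seeds are copied from this report. The Gradle task assigned to each test and the wildcard form of the `--tests` filter are assumptions about the project layout, and `reproduce_command` is a hypothetical helper, so check the "REPRODUCE WITH" line in the actual build log before relying on the output.

```python
import shlex

# (test, seed, gradle_task). Tests and seeds come from the report above.
# The Gradle task per test is an ASSUMPTION about the project layout.
FAILURES = [
    ("ConcurrentSeqNoVersioningIT.testSeqNoCASLinearizability",
     "4009CBE1E08681CF", ":server:internalClusterTest"),
    ("RecoveryWhileUnderLoadIT.testRecoverWhileUnderLoadAllocateReplicasRelocatePrimariesTest",
     "1F84DD846AE1E50D", ":server:internalClusterTest"),
    ("NRTReplicationEngineTests.testGetSegmentInfosSnapshotPreservesFilesUntilRelease",
     "58B8E8BD126D852E", ":server:test"),
]


def reproduce_command(test, seed, task):
    """Build a shell-safe Gradle command line that pins the test seed."""
    args = [
        "./gradlew",
        task,
        "--tests", f"*{test}",      # wildcard stands in for the package name
        f"-Dtests.seed={seed}",     # seed property of the randomized test runner
    ]
    # Quote each argument so the wildcard is not expanded by the shell.
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    for test, seed, task in FAILURES:
        print(reproduce_command(test, seed, task))
```

Pinning the seed is only expected to help for the two tests marked ✅ above; for the timing-dependent test the same command may pass on most runs.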
Notes
- The April 2026 spike in ConcurrentSeqNoVersioningIT strongly correlates with the CI runner migration from m5.8xlarge to m7a.8xlarge (~April 15, 2026). Faster CPUs likely amplify the race condition that this test is sensitive to.
- RecoveryWhileUnderLoadIT is a cluster disruption test whose failure depends on wall-clock timing, not just the random seed. The seed controls which disruption is chosen but not when packets arrive relative to application threads.
- NRTReplicationEngineTests reproduces deterministically with the seed, suggesting the root cause is in test logic or an assertion that doesn't match the actual contract of the code under test.
- This report documents failures only. No code changes were made. Root cause investigation is needed for each test.
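
The distinction drawn above between what a seed fixes and what it leaves open can be illustrated with a small analogy. This sketch is not the OpenSearch test framework: the disruption labels, `pick_disruption`, and `observed_interleaving` are all hypothetical names chosen for illustration. It shows that a seeded random choice repeats exactly, while the order in which threads finish is left to the scheduler and may differ between runs.

```python
import random
import threading

# Hypothetical disruption labels, for illustration only.
DISRUPTIONS = ["partition", "slow_network", "node_restart"]


def pick_disruption(seed):
    """Seeded part: the same seed yields the same choice on every run."""
    return random.Random(seed).choice(DISRUPTIONS)


def observed_interleaving(workers=4):
    """Unseeded part: thread completion order is decided by the scheduler."""
    order = []
    lock = threading.Lock()

    def work(worker_id):
        sum(range(10_000))  # a little CPU work before recording completion
        with lock:
            order.append(worker_id)

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order


if __name__ == "__main__":
    seed = int("1F84DD846AE1E50D", 16)  # seed from build 75596
    print(pick_disruption(seed), pick_disruption(seed))  # always identical
    print(observed_interleaving())  # a permutation of 0..3; order may vary
```

A seed-pinned rerun therefore replays the first function faithfully but gives no control over the second, which is consistent with the failed reproduction of RecoveryWhileUnderLoadIT.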