Flaky test report: committed-code failures on 2026-05-02 #253

@andrross

Summary

Three distinct tests failed against committed code (Timer and Post Merge Action builds on main) in the 24-hour window ending 2026-05-02T10:00Z. This report documents the failures, historical flake rates, and local reproduction results.

Failing Tests

1. ConcurrentSeqNoVersioningIT.testSeqNoCASLinearizability

| Field | Value |
|---|---|
| Recent builds | 75573, 75596, 75597, 75606 |
| Error | `java.lang.AssertionError: Must be linearizable` |
| Seed (build 75597) | `4009CBE1E08681CF` |
| Reproduced locally | ✅ Yes, deterministic with seed |
| First failure | 2024-10-03 (build 48853) |
| Total unique builds affected | 105 |
| Pattern | Chronic flake, significantly worsening. Jumped from 3–6 builds/month to 35 builds in April 2026, coinciding with the CI runner migration to m7a.8xlarge (~April 15, 2026). Already 5 builds affected in the first 2 days of May. |

2. RecoveryWhileUnderLoadIT.testRecoverWhileUnderLoadAllocateReplicasRelocatePrimariesTest

| Field | Value |
|---|---|
| Recent build | 75596 |
| Error | `java.lang.RuntimeException: java.util.NoSuchElementException: No value present` at `OpenSearchIntegTestCase.waitForReplication` |
| Seed (build 75596) | `1F84DD846AE1E50D` |
| Reproduced locally | ❌ No, the seed is not a reliable reproducer (expected for timing-dependent cluster tests) |
| First failure | 2024-04-03 (build 36357) |
| Total unique builds affected | 248 |
| Pattern | Chronic flake, worsening trend in 2026. Had a major spike in mid-2025 (77 builds in June 2025), subsided, then re-escalated: 13 builds in Feb 2026, 29 in March, 28 in April. The non-reproducibility with the seed is consistent with timing-dependent behavior in cluster disruption/recovery tests. |

3. NRTReplicationEngineTests.testGetSegmentInfosSnapshotPreservesFilesUntilRelease

| Field | Value |
|---|---|
| Recent build | 75566 |
| Error | `java.lang.AssertionError: expected:<2> but was:<1>` |
| Seed (build 75566) | `58B8E8BD126D852E` |
| Reproduced locally | ✅ Yes, deterministic with seed |
| First failure | 2024-03-26 (build 35979) |
| Total unique builds affected | 58 |
| Pattern | Chronic low-level flake, stable to slightly worsening. Consistently 1–4 builds/month for most of its history. Saw a bump to 9 builds in Oct 2025 and 7 in April 2026, but otherwise steady. The deterministic seed reproduction suggests a test-logic or assertion bug rather than a timing issue. |

Summary Table

Sorted by total unique builds affected (descending):

| Test | Builds Affected | First Failure | Reproduced | Trend |
|---|---|---|---|---|
| RecoveryWhileUnderLoadIT.testRecoverWhileUnderLoadAllocateReplicasRelocatePrimariesTest | 248 | 2024-04-03 | ❌ No | Worsening |
| ConcurrentSeqNoVersioningIT.testSeqNoCASLinearizability | 105 | 2024-10-03 | ✅ Yes | Significantly worsening |
| NRTReplicationEngineTests.testGetSegmentInfosSnapshotPreservesFilesUntilRelease | 58 | 2024-03-26 | ✅ Yes | Stable / slightly worsening |

Notes

  • The April 2026 spike in ConcurrentSeqNoVersioningIT coincides with the CI runner migration from m5.8xlarge to m7a.8xlarge (~April 15, 2026). Faster CPUs likely amplify the race condition that this test is sensitive to, although this has not been confirmed.
  • RecoveryWhileUnderLoadIT is a cluster disruption test whose failure depends on wall-clock timing, not just the random seed. The seed controls which disruption is chosen but not when packets arrive relative to application threads.
  • NRTReplicationEngineTests reproduces deterministically with the seed, suggesting the root cause is in test logic or an assertion that doesn't match the actual contract of the code under test.
  • This report documents failures only. No code changes were made. Root cause investigation is needed for each test.
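
The distinction drawn in the notes above, between what a seed pins down and what it leaves to the scheduler, can be illustrated with a small standalone sketch. This is not the test framework's actual seeding logic: `java.util.Random` stands in for the randomized-testing infrastructure, and the class name `SeedDemo`, the disruption names, and the helper methods are all hypothetical. It only shows that seeded choices repeat exactly while the interleaving of concurrent threads does not have to.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SeedDemo {
    // Hypothetical disruption names, for illustration only.
    static final String[] DISRUPTIONS = {"partition", "restart-node", "relocate-primary", "delay-network"};

    // Parse a hex seed in the form printed in the report into a long.
    static long parseSeed(String hex) {
        return Long.parseUnsignedLong(hex, 16);
    }

    // Seeded choices: the same seed always yields the same sequence.
    static List<String> plan(String hexSeed, int steps) {
        Random random = new Random(parseSeed(hexSeed));
        List<String> plan = new ArrayList<>();
        for (int i = 0; i < steps; i++) {
            plan.add(DISRUPTIONS[random.nextInt(DISRUPTIONS.length)]);
        }
        return plan;
    }

    // Two threads append to a shared log. The relative order of the A and B
    // entries is decided by the scheduler, not by any seed.
    static List<String> interleaving(int perThread) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Thread a = new Thread(() -> { for (int i = 0; i < perThread; i++) log.add("A" + i); });
        Thread b = new Thread(() -> { for (int i = 0; i < perThread; i++) log.add("B" + i); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        String seed = "1F84DD846AE1E50D"; // seed from build 75596
        // Same seed, same plan: this comparison is always true.
        System.out.println(plan(seed, 5).equals(plan(seed, 5)));
        // The order printed here may differ from run to run.
        System.out.println(interleaving(3));
    }
}
```

Under these assumptions, a failure that depends only on `plan` would reproduce with the seed, as reported for ConcurrentSeqNoVersioningIT and NRTReplicationEngineTests, while a failure that depends on the order seen in `interleaving` may not, which matches the RecoveryWhileUnderLoadIT result.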
