Flaky test report: committed-code failures on 2026-05-07
Summary
Analysis of gradle-check failures against committed code (Timer and Post Merge Action builds) in the past 24 hours. 6 distinct tests failed across 5 builds.
Summary Table
| # | Test | Builds Affected (All-Time) | First Seen | Pattern | Build Link |
|---|---|---|---|---|---|
| 1 | MixedClusterClientYamlTestSuiteIT 310_match_bool_prefix/multi_match multiple fields partial term | 365 | 2024-03-25 | Stable/chronic | 76064 |
| 2 | MixedClusterClientYamlTestSuiteIT 310_match_bool_prefix/multi_match multiple fields complete term | 352 | 2024-03-25 | Stable/chronic | 76064 |
| 3 | IngestFromKinesisIT testKinesisIngestion | 255 | 2025-03-24 | Worsening (spike Mar 2026) | 76083 |
| 4 | MixedClusterClientYamlTestSuiteIT cluster.health/10_basic/cluster health with closed index | 202 | 2024-03-25 | Stable/chronic | 76076 |
| 5 | FlightMetricsTests testComprehensiveMetrics | 71 | 2025-07-25 | Stable (~6/month) | 76138 |
| 6 | EhcacheDiskCacheManagerTests testCreateAndCloseCacheConcurrently | 29 | 2025-03-05 | Worsening (Apr-May 2026) | 76071 |
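As a cross-check on the headline counts, the rows of the summary table can be tallied directly. This is a minimal sketch: the tuples are transcribed from the table, and the shortened test labels are my own abbreviations rather than names used by the CI tooling.

```python
# Rows transcribed from the summary table:
# (abbreviated test label, builds affected all-time, build number of this failure).
rows = [
    ("310_match_bool_prefix partial term", 365, 76064),
    ("310_match_bool_prefix complete term", 352, 76064),
    ("testKinesisIngestion", 255, 76083),
    ("cluster health with closed index", 202, 76076),
    ("testComprehensiveMetrics", 71, 76138),
    ("testCreateAndCloseCacheConcurrently", 29, 76071),
]


def tally(table_rows):
    """Return (number of distinct tests, number of distinct builds)."""
    tests = {label for label, _, _ in table_rows}
    builds = {build for _, _, build in table_rows}
    return len(tests), len(builds)


distinct_tests, distinct_builds = tally(rows)
print(distinct_tests, distinct_builds)  # prints: 6 5
```

Two of the six tests share build 76064, which is why the build count is one lower than the test count.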
Detailed Findings
1. MixedClusterClientYamlTestSuiteIT - 310_match_bool_prefix/multi_match multiple fields partial term
- Build: 76064 (Post Merge Action)
- Seed: AD6EEB0DC58E72AE
- Error: hits.hits.0._id: expected String [4] but was String [1]
- Reproduced locally: N/A — BWC test requires multi-version cluster infrastructure
- First seen: 2024-03-25
- Total unique builds affected: 365
- Pattern: Chronic flake. Major spike in Sep 2024 (137 builds), otherwise steady at 3-16 builds/month. Stable and ongoing — no improvement or worsening trend in recent months.
2. MixedClusterClientYamlTestSuiteIT - 310_match_bool_prefix/multi_match multiple fields complete term
- Build: 76064 (Post Merge Action)
- Seed: AD6EEB0DC58E72AE
- Error: hits.hits.0._id: expected String [4] but was String [1]
- Reproduced locally: N/A — BWC test requires multi-version cluster infrastructure
- First seen: 2024-03-25
- Total unique builds affected: 352
- Pattern: Nearly identical to the partial term variant. Chronic flake with the same spike in Sep 2024 (133 builds). The two tests fail together in nearly all affected builds (352 vs. 365 builds in total).
3. IngestFromKinesisIT - testKinesisIngestion
- Build: 76083 (Post Merge Action)
- Seed: E0E189648EF687DD
- Error: ResourceInUseException: Stream test already exists (test cleanup/setup race condition with the embedded Kinesis mock)
- Reproduced locally: No (passed with seed E0E189648EF687DD)
- First seen: 2025-03-24
- Total unique builds affected: 255
- Pattern: Worsening. Large spikes in Mar 2025 (51 builds), Sep 2025 (111 builds), and Mar 2026 (53 builds). The error indicates a resource cleanup race that is timing-dependent and not seed-reproducible.
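The collision described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeStreamService` is a hypothetical in-memory stand-in, not the embedded Kinesis mock or any AWS API, and per-run unique names are one common mitigation, not necessarily the fix this test needs.

```python
import uuid


class ResourceInUseError(Exception):
    """Stand-in for the ResourceInUseException reported by CI."""


class FakeStreamService:
    """Hypothetical in-memory service; it only tracks stream names."""

    def __init__(self):
        self.streams = set()

    def create_stream(self, name):
        # Creating a stream whose name is still registered fails,
        # mirroring the "Stream test already exists" error.
        if name in self.streams:
            raise ResourceInUseError(f"Stream {name} already exists")
        self.streams.add(name)

    def delete_stream(self, name):
        self.streams.discard(name)


def unique_stream_name(prefix="test"):
    """Append a random suffix so leftover streams from earlier runs cannot collide."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


service = FakeStreamService()
service.create_stream("test")  # an earlier run whose cleanup has not finished
try:
    service.create_stream("test")  # a fixed name collides with the leftover stream
except ResourceInUseError as err:
    print("collision:", err)

service.create_stream(unique_stream_name())  # a unique name avoids the leftover
```

Whether the real test can vary its stream name depends on how the fixture is wired, which this report does not cover.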
4. MixedClusterClientYamlTestSuiteIT - cluster.health/10_basic/cluster health with closed index
- Build: 76076 (Timer, main)
- Seed: 467A7A407AF287D8
- Error: expected [2xx] status code but api [cluster.health] returned [408 Request Timeout] (cluster health timed out with status:red and 51 unassigned shards)
- Reproduced locally: N/A — BWC test requires multi-version cluster infrastructure
- First seen: 2024-03-25
- Total unique builds affected: 202
- Pattern: Chronic flake. Spike in Sep 2024 (58 builds), otherwise 1-14 builds/month. Uptick in Apr 2026 (10 builds) may correlate with the m7a.8xlarge runner migration (faster CPUs can change cluster formation timing).
5. FlightMetricsTests - testComprehensiveMetrics
- Build: 76138 (Post Merge Action)
- Seed: 2E69A50FFD89D2D0
- Error: BindTransportException: Failed to bind to [/0:0:0:0:0:0:0:1%lo, /127.0.0.1]:PortsRange{portRange='25401'} (port conflict)
- Reproduced locally: No (passed with seed 2E69A50FFD89D2D0)
- First seen: 2025-07-25
- Total unique builds affected: 71
- Pattern: Stable at ~5-11 builds/month since inception. Port binding failures are inherently environment-dependent and not seed-reproducible. Slight uptick in Apr 2026 (11 builds).
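The reason a single-port range such as 25401 is environment-dependent can be shown at the socket level. The sketch below uses only the Python standard library and is not the test's actual transport code: it holds a port, shows that a second bind to that fixed port fails, and contrasts this with asking the OS for a free port (port 0).

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind to port 0 so the OS chooses a free port; returns the listening socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(1)
    return sock


def try_bind_fixed(port, host="127.0.0.1"):
    """Try to bind one fixed port; True on success, False if the bind is refused."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError:
        return False
    finally:
        sock.close()


holder = bind_ephemeral()              # some other process or test holding a port
taken_port = holder.getsockname()[1]
print("fixed-port bind succeeded:", try_bind_fixed(taken_port))
holder.close()
```

A fixed port fails whenever anything else on the runner already holds it, regardless of the test seed, which matches the observation that the failure did not reproduce locally.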
6. EhcacheDiskCacheManagerTests - testCreateAndCloseCacheConcurrently
- Build: 76071 (Post Merge Action)
- Seed: DF0F4253E4345108
- Error: Suite timeout exceeded (>= 1200000 msec); the test hung until the 20-minute suite timeout
- Reproduced locally: No (passed with seed DF0F4253E4345108)
- First seen: 2025-03-05
- Total unique builds affected: 29
- Pattern: Worsening. Dormant in Feb-Mar 2026, then 7 builds affected in Apr 2026 and another 7 in the first 7 days of May 2026. The timeout suggests a deadlock or livelock in concurrent cache creation/close that manifests only under specific thread scheduling, which is consistent with the m7a.8xlarge runner migration amplifying latent races.
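To put the partial May figure on the same footing as April, the 7 builds seen in the first 7 days can be extrapolated to a full month. This is a rough sketch that assumes a constant daily failure rate, which such a small sample cannot confirm.

```python
import calendar


def projected_month_total(observed, days_elapsed, year, month):
    """Linearly extrapolate a partial-month count to the whole month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return observed * days_in_month / days_elapsed


april_total = 7
may_projection = projected_month_total(observed=7, days_elapsed=7, year=2026, month=5)
print(may_projection)                 # prints: 31.0
print(may_projection / april_total)   # ratio of projected May to April
```

Under that assumption May would end at roughly 31 affected builds, several times the April total, which supports the "worsening" label.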
Reproduction Summary
| Test | Seed | Reproduced? |
|---|---|---|
| MixedClusterClientYamlTestSuiteIT (partial term) | AD6EEB0DC58E72AE | N/A (BWC infra required) |
| MixedClusterClientYamlTestSuiteIT (complete term) | AD6EEB0DC58E72AE | N/A (BWC infra required) |
| MixedClusterClientYamlTestSuiteIT (cluster.health) | 467A7A407AF287D8 | N/A (BWC infra required) |
| EhcacheDiskCacheManagerTests | DF0F4253E4345108 | No |
| IngestFromKinesisIT | E0E189648EF687DD | No |
| FlightMetricsTests | 2E69A50FFD89D2D0 | No |
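None of the three locally runnable tests reproduced with their CI seeds, which is consistent with those failures being timing- or environment-dependent rather than seed-deterministic. The three BWC tests were not attempted locally, so nothing can be concluded about their seed sensitivity.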
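For anyone repeating the local runs, a small helper can assemble the Gradle invocation from a seed. This is a sketch, not the exact command used for this report: the Gradle project path and the package name in the usage line are placeholders I made up, and only the `--tests`, `-Dtests.seed`, and `-Dtests.iters` options are taken from the OpenSearch testing conventions.

```python
import shlex


def repro_command(project, test_class, method, seed, iterations=None):
    """Build a gradlew command line that reruns one test method with a fixed seed."""
    args = [
        "./gradlew",
        f"{project}:test",
        "--tests",
        f"{test_class}.{method}",
        f"-Dtests.seed={seed}",
    ]
    if iterations is not None:
        # Repeat the test several times in one run to raise the odds of a timing failure.
        args.append(f"-Dtests.iters={iterations}")
    return " ".join(shlex.quote(arg) for arg in args)


# ":hypothetical-module" and "org.example" are placeholders, not real project paths.
print(repro_command(":hypothetical-module", "org.example.FlightMetricsTests",
                    "testComprehensiveMetrics", "2E69A50FFD89D2D0", iterations=20))
```

Because these failures look timing-dependent, repeating the test many times is more likely to surface them than a single seeded run.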
Notes
- The EhcacheDiskCacheManagerTests.classMethod failure is a secondary artifact of the suite timeout caused by testCreateAndCloseCacheConcurrently hanging; it is not a separate flaky test.
- The MixedClusterClientYamlTestSuiteIT 310_match_bool_prefix tests (partial and complete term) fail together in nearly all affected builds and appear to share the same root cause (document scoring/ordering non-determinism in mixed-version clusters).
- Data source: gradle-check-* index at metrics.opensearch.org, queried 2026-05-07.