fix: stabilise stream tests on JDK 25 nightly (timeout scaling, element counts) #2869

Merged
He-Pin merged 4 commits into main from fix-nightly-failures
Apr 18, 2026

@He-Pin He-Pin commented Apr 18, 2026

Summary

Fixes four categories of stream-test flakiness observed on JDK 25 nightly CI (failing for 30+ consecutive days). See #2870 for root-cause analysis; see #2871 for the configuration-level mitigation.

Failing tests fixed

| Test | Root cause | Fix |
| --- | --- | --- |
| HubSpec – "must work with long streams" (×4 variants) | 20 K elements × FJP FIFO delay = 60+ s timeout | Reduced element counts; remove throttle in favour of Thread.sleep(1) |
| AggregateWithTimeBoundaryAndSimulatedTimeSpec | interval = 1.milli races with scheduler | Changed to interval = 1.second |
| TCK stochastic_spec103_mustSignalOnMethodsSequentially | Hardcoded 1 s timeout ignores timefactor | Reads pekko.test.timefactor system property |
| FlowMapAsyncPartitionedSpec – "must ignore null-completed futures" | Random.nextInt(10) can produce 0, skipping null path | Shift to Random.nextInt(10) + 1 |

Files changed

  • stream-tests/…/HubSpec.scala — PatienceConfig scaled by timefactor; element counts reduced; throttle → sleep
  • stream-tests/…/AggregateWithBoundarySpec.scala — interval 1.milli → 1.second (2 locations)
  • stream-tests-tck/…/Timeouts.scala — read pekko.test.timefactor JVM property
  • stream-tests/…/FlowMapAsyncPartitionedSpec.scala — always exercise null-future path

He-Pin and others added 2 commits April 18, 2026 23:30
Motivation:
Nightly CI (JDK 25, TIMEFACTOR=3) has been failing consistently for 30+
days due to ForkJoinPool scheduling changes in JDK 25 causing slower
throughput and higher scheduler overhead.  Four root causes were found:

1. HubSpec.patience used a hard-coded Span(60, Seconds) that was never
   scaled by the test-timefactor, so the 60 s budget was exhausted on
   JDK 25 (needs 180 s with TIMEFACTOR=3).

2. AggregateWithTimeBoundaryAndSimulatedTimeSpec used interval = 1.milli
   with ExplicitlyTriggeredScheduler, which fired up to 400 000 timer
   callbacks per test-run (timePasses(400.seconds) × 1 ms steps), each
   requiring a scheduler lock acquisition on JDK 25.

3. TCK Timeouts (defaultTimeoutMillis / defaultNoSignalsTimeoutMillis)
   were hard-coded to 800 ms / 200 ms and never read the
   pekko.test.timefactor JVM property, causing
   stochastic_spec103_mustSignalOnMethodsSequentially to fail on JDK 25.

4. FlowMapAsyncPartitionedSpec."ignore null-completed futures" built the
   shouldBeNull set from Random.nextInt(10), which produces values 0-9.
   Because elements are 1-10, the value 0 can never match any element,
   so the set could be {0} – meaning no element ever returned null and
   the assertion was a non-deterministic no-op that failed on
   JDK 17 / Scala 3.3.x in CI.

Modification:
- HubSpec: multiply the 60 s base by testKitSettings.TestTimeFactor so
  CI with TIMEFACTOR=3 gets 180 s and TIMEFACTOR=2 gets 120 s.
- AggregateWithTimeBoundaryAndSimulatedTimeSpec: change interval from
  1.milli to 1.second in the gap and duration tests, reducing timer
  firings from ~400 000 to ~400 (still sufficient to trigger boundaries).
- TCK Timeouts: read pekko.test.timefactor from JVM system properties
  and scale defaultTimeoutMillis / defaultNoSignalsTimeoutMillis.
- FlowMapAsyncPartitionedSpec: replace the random shouldBeNull set with
  the fixed Set(2, 5, 8), whose values are all in the 1-10 element range,
  ensuring null filtering is actually exercised deterministically.
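
The timefactor scaling described above for the HubSpec patience and the TCK timeouts can be sketched roughly as follows. This is a standalone illustration, not the code from this PR: `TimeFactorSketch`, `parseTimeFactor`, and `scaled` are hypothetical names, and falling back to 1.0 for a missing or malformed property is an assumption. Only the property name `pekko.test.timefactor` and the 60 s, 800 ms, and 200 ms base values come from the commit message.

```scala
import scala.concurrent.duration._

object TimeFactorSketch {
  // Hypothetical helper: parse a time factor from a raw property value.
  // Assumption: fall back to 1.0 when the value is missing, malformed,
  // non-finite, or not positive.
  def parseTimeFactor(raw: Option[String]): Double =
    raw
      .flatMap(_.trim.toDoubleOption)
      .filter(f => f > 0.0 && java.lang.Double.isFinite(f))
      .getOrElse(1.0)

  // Reads the JVM system property named in the commit message.
  def timeFactor: Double =
    parseTimeFactor(sys.props.get("pekko.test.timefactor"))

  // Scale a base timeout by the factor, rounded to whole milliseconds.
  def scaled(base: FiniteDuration, factor: Double): FiniteDuration =
    (base.toMillis * factor).round.millis

  def main(args: Array[String]): Unit = {
    val factor = timeFactor
    println(s"patience          = ${scaled(60.seconds, factor)}")
    println(s"default timeout   = ${scaled(800.millis, factor)}")
    println(s"no-signal timeout = ${scaled(200.millis, factor)}")
  }
}
```

Running this with `-Dpekko.test.timefactor=3` would scale the 60 s base to the 180 s budget mentioned above; without the property the base values are used unchanged.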

Result:
All four previously-failing test categories should pass on the next
nightly run across JDK 17/21/25 × Scala 2.13/3.3.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Motivation:
On JDK 25, ForkJoinPool scheduling changes cause increased actor dispatch
latency. The original 20K-element long-stream tests reliably time out on
JDK 25 CI (timefactor=3 → 180 s patience).

Modification:
- 'long streams' (buffer=16): 20K → 2K elements (2×1K sources)
- 'buffer size is 1': 20K → 200 elements (2×100 sources); bufferSize=1
  requires one actor round-trip per element, so count must stay small
- 'consumer is slower': 2K → 400 elements; burst=200 covers first 200
  elements with no scheduler ticks, keeping wall-clock time low
- 'producer is slower': 2K → 400 elements; burst=200 on the throttled
  source (200 elements) means zero scheduler ticks needed, eliminating
  ForkJoinPool starvation risk on JDK 25

Result:
All four tests now complete in under 100 ms on a loaded JDK 25 machine
(burst=200 absorbs all throttled elements instantly; no timer callbacks
are scheduled). Full HubSpec (48 tests) passes with timefactor=3.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@pjfanning pjfanning left a comment

Lgtm

  maxDuration = None,
  currentTimeMs = schedulerTimeMs,
- interval = 1.milli)
+ interval = 1.second)

I would prefer this to be a bit smaller, e.g. 500 millis or 250 millis, but the main thing is to get the tests passing.
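
To put the interval choices discussed here in perspective, the arithmetic from the commit message (`timePasses(400.seconds)` divided by the timer interval) can be worked through for each candidate value. This is a rough sketch under a simplifying assumption, namely that the timer fires exactly once per interval of simulated time. `TimerFiringsSketch` and `firings` are hypothetical names and do not appear in the PR.

```scala
import scala.concurrent.duration._

object TimerFiringsSketch {
  // Approximate number of timer firings when `simulated` time is advanced
  // and the timer fires once per `interval` (assumed non-zero).
  def firings(simulated: FiniteDuration, interval: FiniteDuration): Long =
    simulated.toNanos / interval.toNanos

  def main(args: Array[String]): Unit = {
    // 400 seconds comes from timePasses(400.seconds) in the commit message.
    val simulated = 400.seconds
    for (interval <- Seq(1.milli, 250.millis, 500.millis, 1.second))
      println(s"interval=$interval -> ${firings(simulated, interval)} firings")
  }
}
```

Under that assumption, 250 ms and 500 ms intervals would give 1 600 and 800 firings respectively, both far below the 400 000 produced by the original 1 ms interval.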

Motivation:
The earlier nightly fixes solved the immediate JDK 25 failures, but two tradeoffs
needed refinement. The mapAsyncPartitioned null test lost its randomness, and the
HubSpec long-stream fixes needed to preserve as much coverage as possible while
remaining stable under JDK 25 scheduling changes.

Modification:
- Restore randomness in FlowMapAsyncPartitionedSpec while shifting generated null
  candidates from 0..9 to 1..10 so the null path is always exercised.
- Keep HubSpec patience scaled by test timefactor with a higher 120 s base.
- Set plain MergeHub long-stream coverage to 2K elements and bufferSize=1 coverage
  to 200 elements based on measured JDK 25 limits.
- Replace throttle-based slower-consumer/slower-producer timing with deterministic
  Thread.sleep-based slow paths, keeping those tests at 2K elements without relying
  on timer callbacks that are unstable on JDK 25.

Result:
HubSpec passes end-to-end with pekko.test.timefactor=3, and the null-completed
futures test keeps its random coverage without silently skipping the null branch.
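
The range shift for the null candidates can be illustrated with a small standalone sketch. It is not the test code from FlowMapAsyncPartitionedSpec: `NullCandidatesSketch`, `oldCandidate`, and `newCandidate` are hypothetical names, and how many candidates the real test draws is not shown here. The sketch only demonstrates that `Random.nextInt(10)` yields 0 to 9, while adding 1 yields 1 to 10, matching the element range given in the commit message.

```scala
import scala.util.Random

object NullCandidatesSketch {
  // Element range stated in the commit message.
  val elements: Range.Inclusive = 1 to 10

  // Old shape: candidates in 0..9, so 0 can be drawn but never matches an element.
  def oldCandidate(rnd: Random): Int = rnd.nextInt(10)

  // New shape: candidates in 1..10, so every candidate is a real element.
  def newCandidate(rnd: Random): Int = rnd.nextInt(10) + 1

  def main(args: Array[String]): Unit = {
    val rnd = new Random()
    val olds = Seq.fill(1000)(oldCandidate(rnd))
    val news = Seq.fill(1000)(newCandidate(rnd))
    println(s"old range observed: ${olds.min}..${olds.max}")
    println(s"new range observed: ${news.min}..${news.max}")
    println(s"all new candidates are elements: ${news.forall(n => elements.contains(n))}")
  }
}
```

Because every shifted candidate falls inside the element range, at least one element always takes the null path, which keeps the randomness without the silent no-op case.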

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@He-Pin He-Pin force-pushed the fix-nightly-failures branch from 9669db0 to 2e3279a Compare April 18, 2026 18:31
@He-Pin He-Pin changed the title from "fix: scale stream test timeouts by timefactor to fix nightly CI on JDK 25" to "fix: stabilise stream tests on JDK 25 nightly (timeout scaling, element counts)" Apr 18, 2026
Motivation:
JDK 25 ForkJoinPool scheduling regression (JDK-8300995) causes slower
task scheduling under load. timefactor=3 was insufficient for some
long-running stream tests.

Modification:
Raise timefactor to 4 for JDK ≥ 25 in the nightly-builds workflow,
updating the comment to also reference #2870.

Result:
Wider timeout budget on JDK 25 reduces spurious test failures caused
by scheduling jitter rather than correctness issues.

References: #2870, #2573

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@He-Pin He-Pin merged commit b23b4c7 into main Apr 18, 2026
9 checks passed
@He-Pin He-Pin deleted the fix-nightly-failures branch April 18, 2026 19:02
@He-Pin He-Pin added the "flaky" label (Related to flaky tests) Apr 18, 2026
@He-Pin He-Pin added this to the 2.0.0-M2 milestone Apr 18, 2026