optimize: FlattenMerge avoids substream materialization for value-presented sources #2977

Open
He-Pin wants to merge 5 commits into apache:main from He-Pin:optimize-flatten-merge-avoid-materialization

@He-Pin He-Pin commented May 17, 2026

Motivation

FlattenMerge previously only fast-pathed SingleSource. Every other
inner source — Source.empty, Source(List), Source.fromJavaStream,
Source.future(Future.successful(...)), range / iterator / repeat
sources — paid the cost of materializing a SubSinkInlet plus a full
subFusingMaterializer.materialize(...) round-trip. FlattenConcat
already eliminates this via TraversalBuilder.getValuePresentedSource
(introduced in 1.2.0); FlattenMerge should benefit too.

This is especially important because the single-argument
flatMapConcat(f) is implemented as via(new FlattenMerge(1)), so all
default flatMapConcat users go through FlattenMerge and, until now,
missed out on the value-presented optimization.

Modification

  • Switch FlattenMerge.addSource from getSingleSource to
    getValuePresentedSource, dispatching on SingleSource,
    IterableSource, IteratorSource, RangeSource, RepeatSource,
    JavaStreamSource, FutureSource, FailedSource, and empty sources
    in place — mirroring FlattenConcat.
  • Introduce an InflightSource[T] family in a new FlattenMerge
    companion to occupy a breadth slot for multi-element value-presented
    sources without materializing a substream. Tracked via a new
    pendingInflightSources counter so activeSources still bounds the
    breadth budget correctly.
  • Preserve merge semantics: after each push from a multi-element
    inflight source, re-enqueue it so other concurrent sources keep
    interleaving (rather than draining one source first as FlattenConcat
    does).
  • Completed Futures / FailedSource are folded directly: success
    pushes or queues a single element; failure calls failStage.
  • Pending Futures register a callback via getAsyncCallback and occupy
    a slot until completion.
  • Empty inner sources are discarded in place and consume no slot.

FlattenMerge is @InternalApi private[pekko] so this is purely an
internal performance change; MiMa is clean.
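
The re-enqueue rule in the list above can be modelled without any stream machinery. The following is a minimal sketch in plain Scala, not the stage's implementation: `InflightMergeSketch` and `merge` are hypothetical names, inner sources are represented as plain iterators, and downstream demand is assumed to be unbounded. The real stage is demand-driven and makes no promise about the exact output order.

```scala
import scala.collection.mutable

// Hypothetical model of the breadth-bounded, re-enqueueing merge.
object InflightMergeSketch {

  def merge[T](sources: Seq[Iterator[T]], breadth: Int): List[T] = {
    require(breadth >= 1, "breadth must be positive")
    val upstream = sources.iterator
    val active = mutable.Queue.empty[Iterator[T]]
    val out = List.newBuilder[T]

    // Admit sources while the breadth budget allows it. Empty sources are
    // dropped immediately and never occupy a slot.
    def fill(): Unit =
      while (active.size < breadth && upstream.hasNext) {
        val src = upstream.next()
        if (src.hasNext) active.enqueue(src)
      }

    fill()
    while (active.nonEmpty) {
      val src = active.dequeue()
      out += src.next()
      // Re-enqueue at the back so the other active sources get their turn;
      // a concat-style stage would keep draining `src` instead.
      if (src.hasNext) active.enqueue(src)
      fill()
    }
    out.result()
  }
}
```

With `breadth = 2` the two non-empty sources in `Seq(Iterator(1, 2, 3), Iterator.empty, Iterator(10, 20))` alternate, while `breadth = 1` degenerates to concatenation order, which is the property the single-argument flatMapConcat path relies on.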

Result

  • flatMapMerge(breadth, ...) and the default flatMapConcat(...)
    (which routes through FlattenMerge(1)) skip substream materialization
    for value-presented inner sources, cutting per-source GC and stage
    overhead.
  • Behaviour is unchanged for non-value-presented sources (regular
    Sources built from graphs, lazy futures, etc.) — they still
    materialize a SubSinkInlet as before.

Tests

  • All existing FlowFlattenMergeSpec tests pass.
  • All existing FlowFlatMapConcatParallelismSpec tests pass (covers the
    FlattenMerge(1) path used by single-arg flatMapConcat).
  • New tests in FlowFlattenMergeSpec mirror
    FlowFlatMapConcatParallelismSpec:
    • work with value presented sources with breadth: {1,2,4,8,16,32,64,128}
      — covers empty / single / iterable / completed-future / lazy-future /
      delayed-future inner sources.
    • work with generated value presented sources with breadth: ...
      randomised mix of value-presented sources at scale.
    • work with value presented failed sources — failure inside the
      optimized path.
    • avoid pre-materialization for value-presented sources and
      not materialize value-presented sources — assert the fast path
      actually fires.

Local commands run:

  • sbt "stream-tests/testOnly org.apache.pekko.stream.scaladsl.FlowFlattenMergeSpec" → 37/37 pass
  • sbt "stream-tests/testOnly *FlatMap* *Flatten* *PrefixAndTail* *Concat*Spec" → 232/232 pass
  • sbt scalafmtAll headerCreateAll
  • sbt "stream/mimaReportBinaryIssues" → no issues

References

  • Mirrors the value-presented-source optimization added to
    FlattenConcat in 1.2.0 (TraversalBuilder.getValuePresentedSource).

He-Pin added 2 commits May 17, 2026 18:09
…sented sources

Motivation:
FlattenMerge previously only fast-pathed `SingleSource`. For all other
inner sources -- `Source.empty`, `Source(List)`, `Source.fromJavaStream`,
`Source.future(Future.successful(...))`, range/iterator/repeat sources --
each one paid the cost of materializing a `SubSinkInlet` and
`subFusingMaterializer.materialize(...)`. FlattenConcat already does this
optimization via `TraversalBuilder.getValuePresentedSource`; FlattenMerge
should benefit too, especially because the single-arg `flatMapConcat(f)`
internally uses `FlattenMerge(1)` and so depends on FlattenMerge for its
hot path.

Modification:
- Generalize FlattenMerge to dispatch on `getValuePresentedSource`
  (instead of `getSingleSource`) and consume `SingleSource`,
  `IterableSource`, `IteratorSource`, `RangeSource`, `RepeatSource`,
  `JavaStreamSource`, `FutureSource`, `FailedSource`, and empty sources
  in-place without materialization, mirroring FlattenConcat.
- Add an `InflightSource[T]` family inside the new `FlattenMerge`
  companion to occupy a breadth slot for multi-element value-presented
  sources. Track them via a new `pendingInflightSources` counter so
  `activeSources` correctly bounds the breadth budget.
- Preserve merge semantics: when an inflight source still has more
  elements after a push, re-enqueue it so other concurrent sources keep
  interleaving (instead of draining one source first, which is the
  concat behaviour).
- Fold completed `Future`s and `FailedSource` directly: success pushes
  or queues a single element, failure calls `failStage`.
- Pending `Future`s register a callback via `getAsyncCallback` and
  occupy a breadth slot until completion.
- Empty inner sources are discarded in place (no slot consumed).

Result:
- `flatMapMerge(breadth, ...)` and the default `flatMapConcat(...)`
  (which routes through `FlattenMerge(1)`) skip substream materialization
  for value-presented inner sources, reducing per-source GC and stage
  overhead.
- All existing FlattenMerge / flatMapConcat tests pass; new tests cover
  empty / single / iterable / range / java stream / completed and
  delayed future / failed inner sources across breadth = 1..128.
- Internal API only (`@InternalApi private[pekko]`); MiMa is clean.

References:
- The optimization mirrors FlattenConcat's value-presented-source
  handling introduced in 1.2.0.
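
The three-way treatment of futures described in this commit (completed with a value, completed with a failure, still pending) comes down to inspecting `Future.value` once. A minimal sketch of that decision, assuming hypothetical names (`FutureFoldSketch`, `Outcome`, `classify`); in the real stage the three branches would push or queue the element, call `failStage`, or register an async callback, none of which is modelled here:

```scala
import scala.concurrent.Future
import scala.util.{ Failure, Success }

// Hypothetical model of how a future-backed inner source could be classified.
object FutureFoldSketch {
  sealed trait Outcome[+T]
  final case class Emit[T](value: T) extends Outcome[T]      // already completed: single element
  final case class Fail(cause: Throwable) extends Outcome[Nothing] // already failed: fail the stage
  case object OccupySlot extends Outcome[Nothing]             // pending: hold a breadth slot

  def classify[T](future: Future[T]): Outcome[T] =
    future.value match {
      case Some(Success(value)) => Emit(value)
      case Some(Failure(cause)) => Fail(cause)
      case None                 => OccupySlot
    }
}
```

Only the `OccupySlot` case needs to count against the breadth budget for any length of time, because the other two are resolved at the moment the inner source arrives.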
…enMerge

Motivation:
After the previous commit, FlattenMerge grew its own copy of the
`InflightSource[T]` hierarchy (Iterator/Range/Repeat/CompletedFuture/
PendingFuture) duplicating what FlattenConcat already had in its
companion object. Two near-identical families across two files is a
maintenance hazard: any future tweak to the value-presented optimization
(e.g. adding a new source type, fixing a Java-stream cleanup leak) would
have to be mirrored, and the families had already drifted in small ways
(e.g. `tryPull`/`cancel`/`materialize` declared abstract in concat with
no-op overrides on every subclass; concat used `isClosed = true` for the
completed-future variant while merge used `!_hasNext`).

Modification:
- Extract the common `InflightSource[T]` base and the five value-presented
  subclasses (Iterator/Range/Repeat/CompletedFuture/PendingFuture) into
  a new `pekko.stream.impl.fusing.InflightSources` package-private object.
- Promote `tryPull` / `cancel` / `materialize` from abstract to concrete
  no-op defaults, so the value-presented subclasses no longer carry empty
  overrides. Stages that wrap a real `SubSinkInlet` (only FlattenConcat's
  `attachAndMaterializeSource` does this) override what they need.
- Align `InflightCompletedFutureSource.isClosed` to FlattenConcat's
  `true` semantics — behaviorally equivalent in both stages, but more
  faithful to the source being a one-shot cached value.
- Drop the `sealed` modifier on `InflightSource` so FlattenConcat's
  attached-substream anonymous subclass can still extend it from another
  file in the same package.
- Remove the duplicate definitions from FlattenConcat's companion (now
  unused, drop the empty companion entirely) and from FlattenMerge's
  companion. Both stages import from the shared object instead.

Result:
- Net -176 lines of duplication; one canonical home for the
  optimization's data types.
- Future additions (e.g. extending the optimization to other stream-of-
  streams stages such as `MergeMany`-style operators) only need to
  reference `InflightSources`.
- All FlattenConcat / FlattenMerge / flatMapConcat parallelism tests
  remain green; MiMa is clean (`@InternalApi private[fusing]`).
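
The shape of the shared base type after this refactoring might look roughly like the sketch below. It is an illustration only: the member names `tryPull`, `cancel`, `materialize` and `isClosed` come from the commit message, but the exact signatures, the `hasNext`/`next` pair, and the object name `InflightSourcesSketch` are assumptions, and only two of the subclasses are shown.

```scala
// Hypothetical sketch of a shared base with concrete no-op defaults.
object InflightSourcesSketch {

  abstract class InflightSource[T] {
    def hasNext: Boolean
    def next(): T
    def isClosed: Boolean
    // Concrete no-op defaults: value-presented subclasses have no substream
    // to pull from, cancel, or materialize, so they inherit these unchanged.
    def tryPull(): Unit = ()
    def cancel(cause: Throwable): Unit = ()
    def materialize(): Unit = ()
  }

  final class InflightIteratorSource[T](iterator: Iterator[T]) extends InflightSource[T] {
    override def hasNext: Boolean = iterator.hasNext
    override def next(): T = iterator.next()
    override def isClosed: Boolean = !iterator.hasNext
  }

  // One-shot cached value: reported as closed from the start, mirroring the
  // `isClosed = true` semantics the commit aligns on.
  final class InflightCompletedFutureSource[T](value: T) extends InflightSource[T] {
    private var emitted = false
    override def hasNext: Boolean = !emitted
    override def next(): T = {
      if (emitted) throw new NoSuchElementException("value already emitted")
      emitted = true
      value
    }
    override def isClosed: Boolean = true
  }
}
```

A wrapper around a real substream would override the three no-op methods, which is why the base class cannot be `sealed` if that wrapper lives in another file.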
@He-Pin He-Pin force-pushed the optimize-flatten-merge-avoid-materialization branch from 8f297aa to 81c6aa4 Compare May 17, 2026 11:26
@He-Pin He-Pin added the t:stream Pekko Streams label May 17, 2026
@He-Pin He-Pin added this to the 2.0.0-M3 milestone May 17, 2026
@He-Pin He-Pin requested a review from pjfanning May 17, 2026 11:39

He-Pin commented May 17, 2026

I will do a follow-up optimization around the ++ operation once this gets merged.

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comment thread stream/src/main/scala/org/apache/pekko/stream/impl/fusing/StreamOfStreams.scala Outdated
He-Pin added a commit to He-Pin/incubator-pekko that referenced this pull request May 18, 2026
Motivation:
Copilot review on PR apache#2977 flagged two issues: (1) Java streams obtained
via Source.fromJavaStream were converted to Scala iterators and fed into
InflightIteratorSource, dropping the BaseStream close contract and leaking
onClose handlers and underlying resources; (2) the test purporting to
verify "no pre-materialization for value-presented sources" actually
counted lazySingle materializations and was misleading because Source.lazySingle
is itself non-VP.

Modification:
- Add InflightJavaStreamSource in InflightSources.scala that wraps the
  BaseStream directly, eagerly closes empty streams, closes on exhaustion,
  and closes on cancel.
- Wire the new wrapper through FlattenConcat.addJavaStreamSource and
  FlattenMerge.addInflightJavaStreamSource so both stages honor the
  close contract on the value-presented fast path.
- Cancel queued inflight sources in FlattenMerge.postStop so JavaStream
  resources held in the queue (not yet promoted to active SubSinkInlets)
  are released on stage termination.
- Replace the misleading test with one that mixes value-presented and
  non-VP inner sources via lazySingle().buffer() and asserts the counter
  equals only the non-VP count, proving the VP fast path skips
  materialization.
- Add two regression tests for the close contract: exhaustion of
  finite Java streams and downstream cancel against infinite ones.

Result:
Java streams routed through the VP fast path now close deterministically
on exhaustion, cancel, and stage termination. The materialization-skip
property is demonstrated by a meaningful counter test rather than a
tautology. All 39 FlowFlattenMergeSpec tests pass.
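
The close contract described in this commit (close eagerly when empty, close on exhaustion, close on cancel) can be illustrated with a small wrapper over `java.util.stream.BaseStream`. This is a sketch under stated assumptions, not the code of `InflightJavaStreamSource`: the names `JavaStreamCloseSketch` and `ClosingStreamSource` are hypothetical, and the wrapper is not thread-safe.

```scala
import java.util.stream.BaseStream

// Hypothetical wrapper that keeps the stream itself so close() can be honored.
object JavaStreamCloseSketch {

  final class ClosingStreamSource[T](stream: BaseStream[T, _]) {
    private val iterator = stream.iterator()
    private var closed = false

    // Eagerly close a stream that has nothing to emit.
    if (!iterator.hasNext) close()

    def hasNext: Boolean = !closed && iterator.hasNext

    def next(): T = {
      val elem = iterator.next()
      // Close as soon as the last element has been handed out.
      if (!iterator.hasNext) close()
      elem
    }

    // Downstream cancel or stage termination: release resources now.
    def cancel(): Unit = close()

    def isClosed: Boolean = closed

    // Idempotent, so onClose handlers run at most once through this wrapper.
    private def close(): Unit =
      if (!closed) {
        closed = true
        stream.close()
      }
  }
}
```

Converting the stream to a Scala iterator up front loses the reference needed for `stream.close()`, which is the leak the review flagged; holding the `BaseStream` is what makes all three close paths possible.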
@He-Pin He-Pin force-pushed the optimize-flatten-merge-avoid-materialization branch from 12ffef2 to 5f731d5 Compare May 18, 2026 03:58
Motivation:
PR apache#2977 CI failed on Scala 3.3.7 with a Type Mismatch in FlattenConcat
and FlattenMerge: the dispatch pattern matched JavaStreamSource[T, _] and
forwarded the value to a helper requiring [S <: BaseStream[T, S]]. Scala 2
implicitly skolemized the existential, but Scala 3 does not, causing the
stream module to fail to compile on Scala 3.

Modification:
- Drop the recursive S type parameter from InflightJavaStreamSource;
  internally only iterator() and close() are invoked, both of which work
  on BaseStream[T, _].
- Drop the matching S type parameter from FlattenConcat.addJavaStreamSource
  and FlattenMerge.addInflightJavaStreamSource, accepting JavaStreamSource[T, _]
  directly so the existential never needs to be opened.

Result:
Scala 3.3.7 cross-compile is clean, Scala 2.13 still compiles, MiMa is
green, scalafmt/headerCheck pass, and FlowFlattenMergeSpec (39/39) plus
FlowFlattenConcatSpec / FlowFlatMapConcatSpec all pass on both Scala
versions.
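
The fix amounts to never naming the recursive `S` parameter. A helper that only calls `iterator()` and `close()` can accept the wildcard type directly, as in the following sketch; `WildcardStreamSketch` and `drainAndClose` are hypothetical names used for illustration, not helpers from the PR.

```scala
import java.util.stream.BaseStream

object WildcardStreamSketch {

  // Accepts BaseStream[T, _] as-is. A signature such as
  //   def drainAndClose[T, S <: BaseStream[T, S]](stream: S): List[T]
  // would force the caller to supply a concrete S, which a value matched
  // against a wildcard pattern does not provide.
  def drainAndClose[T](stream: BaseStream[T, _]): List[T] = {
    val builder = List.newBuilder[T]
    try {
      val it = stream.iterator()
      while (it.hasNext) builder += it.next()
    } finally stream.close()
    builder.result()
  }
}
```

Because the second type argument is never opened, the same signature is expected to compile on both Scala 2.13 and Scala 3.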
Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Comment thread stream/src/main/scala/org/apache/pekko/stream/impl/fusing/InflightSources.scala Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

He-Pin commented May 18, 2026

Hopefully this improves the performance of pekko-http somewhat.
