
Add flatMap method for collapsing streams into one #4227

Merged
46 commits merged into line:main on Apr 9, 2024

Conversation

sleeplesslord (Contributor)

Motivation:
StreamMessage has map, but it is inconvenient to use it when the mapping function itself returns a StreamMessage. This PR adds flatMap which allows applying a function that returns a StreamMessage without ending up with nested StreamMessages.
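For illustration, a minimal sketch of the difference (StreamMessage.of() and collect() already exist; flatMap is the method this PR adds):

import com.linecorp.armeria.common.stream.StreamMessage;

class FlatMapExample {
    public static void main(String[] args) {
        final StreamMessage<Integer> ids = StreamMessage.of(1, 2, 3);

        // With map, a mapping function that itself returns a StreamMessage
        // leaves you with a nested stream of streams:
        final StreamMessage<StreamMessage<Integer>> nested =
                ids.map(id -> StreamMessage.of(id * 10, id * 10 + 1));

        // With the flatMap added in this PR, the inner streams are collapsed
        // into a single stream of their elements:
        final StreamMessage<Integer> flattened =
                ids.flatMap(id -> StreamMessage.of(id * 10, id * 10 + 1));

        flattened.collect().join().forEach(System.out::println);
    }
}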

Modifications:

  • Add FlatMapStreamMessage, which uses a function that returns a StreamMessage to modify the stream.
  • Add StreamMessage.flatMap() to allow easily creating a FlatMapStreamMessage from an existing stream.

Result:

codecov bot commented on Apr 26, 2022

Codecov Report

Attention: Patch coverage is 76.92308%, with 48 lines in your changes missing coverage. Please review.

Project coverage is 74.09%. Comparing base (8194436) to head (b7ac1c4).
Report is 45 commits behind head on main.

❗ The current head b7ac1c4 differs from the pull request's most recent head 3b35003. Consider uploading reports for commit 3b35003 to get more accurate results.

Files                                                   Patch %   Lines
...rp/armeria/common/stream/FlatMapStreamMessage.java   76.84%    30 Missing and 17 partials ⚠️
.../linecorp/armeria/common/stream/StreamMessage.java   80.00%    0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #4227       +/-   ##
===========================================
+ Coverage        0   74.09%   +74.09%     
- Complexity      0    20908    +20908     
===========================================
  Files           0     1810     +1810     
  Lines           0    77148    +77148     
  Branches        0     9867     +9867     
===========================================
+ Hits            0    57161    +57161     
- Misses          0    15328    +15328     
- Partials        0     4659     +4659     



@Override
public CompletableFuture<Void> whenComplete() {
return source.whenComplete();
Contributor:

Should we complete this stream when all nested streams are completed?
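To illustrate what that could look like, a rough sketch (hypothetical names; assumes all callbacks run on the stream's executor, as discussed later in this thread):

import java.util.concurrent.CompletableFuture;

final class CompletionTracker {
    private final CompletableFuture<Void> completionFuture = new CompletableFuture<>();
    private int activeChildren;
    private boolean upstreamDone;

    CompletableFuture<Void> whenComplete() {
        return completionFuture;
    }

    void onChildSubscribed() {
        activeChildren++;
    }

    void onChildComplete() {
        activeChildren--;
        maybeComplete();
    }

    void onUpstreamComplete() {
        upstreamDone = true;
        maybeComplete();
    }

    // Complete only once the upstream is done AND no nested stream is still running.
    private void maybeComplete() {
        if (upstreamDone && activeChildren == 0) {
            completionFuture.complete(null);
        }
    }
}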


@Override
public long demand() {
return source.demand();
Contributor:

I don't see how we can provide the current demand with source.demand(). How about tracking the demand manually rather than delegating it?
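A minimal sketch of manual tracking (hypothetical field and method names; LongMath is Guava's, already used in this class):

import com.google.common.math.LongMath;

final class DemandTracking {
    private long demand;

    // What demand() would return instead of source.demand().
    long demand() {
        return demand;
    }

    // Called from Subscription#request(n); Reactive Streams allows
    // request(Long.MAX_VALUE), so add with saturation.
    void onRequest(long n) {
        demand = LongMath.saturatedAdd(demand, n);
    }

    // Called whenever an element is actually handed to the downstream.
    void onEmit() {
        if (demand != Long.MAX_VALUE) {
            demand--;
        }
    }
}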


@Override
public void abort() {
source.abort();
Contributor:

Should we also abort the nested StreamMessages in progress?

Contributor Author:

That sounds like a good idea. If I call source.abort() here, will that propagate to the Subscription so that I can just cancel my nested subscriptions in onError?


private void handleRequest(long n) {
requestedByDownstream = LongMath.saturatedAdd(requestedByDownstream, n);
upstream.request(maxConcurrency - sourceSubscriptions.size());
Contributor:

Are there any cases where sourceSubscriptions.size() is greater than maxConcurrency?

Contributor Author:

It should never happen.

Member:

Let's add:

if (maxConcurrency - sourceSubscriptions.size() != 0) {...}

so that we don't request with 0.

Member:

Shouldn't we call flush before request, so that we don't call upstream.request(...) when we already have enough elements to hand to the downstream?
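Putting the two suggestions together, the method could look roughly like this (a sketch only, reusing the field names visible in this diff; flush() is the rename of dump() suggested elsewhere in this review):

private void handleRequest(long n) {
    requestedByDownstream = LongMath.saturatedAdd(requestedByDownstream, n);

    // Hand any buffered elements to the downstream first; they may already
    // satisfy the new demand without touching the upstream.
    flush();

    // Only ask the upstream for more when there is room for new nested
    // streams, so we never call request(0).
    final long room = maxConcurrency - sourceSubscriptions.size();
    if (room > 0) {
        upstream.request(room);
    }
}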

.limit(available).forEach(sub -> sub.request(1));
}

private void dump() {
Contributor:

How about naming it flush()? I found this behavior hard to understand with the name dump().

this.maxConcurrency = maxConcurrency;

sourceSubscriptions = new HashSet<>();
buffer = new LinkedList<>();
Contributor:

How about using ArrayDeque for a smaller memory footprint?

}

final StreamMessage<U> newStreamMessage = function.apply(item);
newStreamMessage.subscribe(new FlatMapSubscriber<>(this));
Contributor:

Suggested change
newStreamMessage.subscribe(new FlatMapSubscriber<>(this));
newStreamMessage.subscribe(new FlatMapSubscriber<>(this), executor, options);

Contributor Author:

Is it OK to just pass the executor here? I don't think there are any options that need to be applied.

Member:

@ikhoon Gentle ping

Contributor Author:

I tried passing the executor here and I'm running into an issue that I'm not sure how to address.

This code calls subscribe, which calls onSubscribe on FlatMapSubscriber. That will call subscribeChild on FlatMapAggregatingSubscriber, which will add the subscriber to sourceSubscriptions and then call request on FlatMapSubscriber. Then, if newStreamMessage is already completed, it will call onComplete, which leads to the subscription being removed from sourceSubscriptions.

Essentially, in one call chain we go from onSubscribe to sub.request(1) all the way to onComplete, and since that removes the subscription from sourceSubscriptions, I get a ConcurrentModificationException because the call to sub.request(1) is done inside a forEach.

I guess I should just collect the subs into a new collection before calling forEach to avoid that? I'm not sure if there's a better approach.
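For reference, a self-contained illustration of why copying helps, using plain JDK collections (the remove call stands in for sub.request(1) eventually reaching onComplete):

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SnapshotBeforeForEach {
    public static void main(String[] args) {
        final Set<String> subscriptions = new HashSet<>(List.of("a", "b", "c"));

        // Unsafe: the callback removes from the set being iterated, so the
        // forEach throws ConcurrentModificationException.
        // subscriptions.forEach(subscriptions::remove);

        // Safe: iterate over a copy; removals triggered by the callback no
        // longer interfere with the traversal.
        new ArrayList<>(subscriptions).forEach(subscriptions::remove);

        System.out.println(subscriptions.isEmpty()); // true
    }
}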

Member:

I see. How about always using the executor to remove the child?
executor.execute(() -> handleCompleteChild(child));

Member:

If that works, we can remove all the eventLoop.inEventLoop() checks because we only use the executor.
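In other words, the pattern throughout the class could go from check-and-branch to always scheduling (sketch, using the same executor/handleCompleteChild names as above):

// Before: every entry point checks whether it is already on the executor.
if (executor.inEventLoop()) {
    handleCompleteChild(child);
} else {
    executor.execute(() -> handleCompleteChild(child));
}

// After: always go through the executor, so every onX callback runs on the
// same thread and the inEventLoop() branches can be removed.
executor.execute(() -> handleCompleteChild(child));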

}
canceled = true;

downstream.onError(cause);
Contributor:

Ditto: Cancel all inner StreamMessages?

Comment on lines 234 to 235
final Optional<Long> requested = sourceSubscriptions.stream().map(FlatMapSubscriber::getRequested)
.reduce(LongMath::saturatedAdd);
Contributor:

Suggested change
final Optional<Long> requested = sourceSubscriptions.stream().map(FlatMapSubscriber::getRequested)
.reduce(LongMath::saturatedAdd);
final long requested = sourceSubscriptions.stream().mapToLong(FlatMapSubscriber::getRequested).sum();

Contributor Author:

Will that handle overflow in case getRequested returns Long.MAX_VALUE?
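For reference, mapToLong(...).sum() wraps around on overflow, whereas reducing with LongMath.saturatedAdd clamps at Long.MAX_VALUE; a quick standalone check:

import com.google.common.math.LongMath;
import java.util.stream.LongStream;

class OverflowCheck {
    public static void main(String[] args) {
        final long[] requested = { Long.MAX_VALUE, 1L };

        // Plain sum wraps around to a negative value.
        System.out.println(LongStream.of(requested).sum());                              // -9223372036854775808

        // Saturated addition stays pinned at Long.MAX_VALUE.
        System.out.println(LongStream.of(requested).reduce(0L, LongMath::saturatedAdd)); // 9223372036854775807
    }
}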

return;
}

final long available = getAvailableBufferSpace();
Contributor:

If available is Long.MAX_VALUE, we don't need backpressure with buffering.

trustin (Member) commented Jun 8, 2022:

Gentle ping, @sleeplesslord 🙇

sleeplesslord (Contributor Author):

Hey, sorry for letting this PR go stale for so long. I was addressing the comments but then lost track of things 😅 I think I've covered most of the feedback now.

@minwoox minwoox added this to the 1.22.0 milestone Dec 19, 2022
* As per
* <a href="https://github.com/reactive-streams/reactive-streams-jvm#2.13">
* Reactive Streams Specification 2.13</a>, the specified {@link Function} should not return
* a {@code null} value
Member:

Suggested change
* a {@code null} value
* a {@code null} value.

this.source = (StreamMessage<T>) source;
this.function = (Function<T, StreamMessage<U>>) function;
this.maxConcurrency = maxConcurrency;
completionFuture = new CompletableFuture<>();
Member:

nit: We usually use EventLoopCheckingFuture for this.


.filter(sub -> sub.getRequested()
== 0)
.limit(available)
.collect(Collectors.toList());
minwoox (Member) commented Dec 24, 2022:

nit:

Suggested change
.collect(Collectors.toList());
.collect(toImmutableList());

Yeah, creating a new list also works. 😄
So, could you remove if (executor.inEventLoop()) {..} else {...} from this class where possible? 😄
e.g. completeChild, onNextChild?

Contributor Author:

Is that because the FlatMapSubscriber will run with the same executor, so we know that completeChild etc. will always run on the same executor?

Member:

Yes, exactly. 😄
All onX methods in FlatMapAggregatingSubscriber and FlatMapSubscriber are now called by the executor, so we don't have to check it. 😄

minwoox (Member) left a comment:

Thanks! Left some more comments. 😉


jrhee17 (Contributor) left a comment:

Almost there 🙏

@ikhoon ikhoon modified the milestones: 1.23.0, 1.24.0 Mar 15, 2023
@ikhoon ikhoon modified the milestones: 1.24.0, 1.25.0 Jun 2, 2023
@ikhoon ikhoon modified the milestones: 1.25.0, 1.26.0 Jul 31, 2023
@minwoox minwoox modified the milestones: 1.26.0, 1.27.0 Oct 11, 2023
@ikhoon ikhoon modified the milestones: 1.27.0, 1.28.0 Jan 5, 2024
jrhee17 (Contributor) left a comment:

I've added a small commit which:

  • Addresses a ConcurrentModificationException thrown when the forEach call inside #requestAllAvailable calls childSubscribers#remove in #completeChild. childSubscribers is copied to mitigate this issue.
  • FlatMapStreamMessage could be closed while the inner FlatMapSubscriber is not. This is possible if abort is called before FlatMapSubscriber#onSubscribe is invoked. I've changed it so that FlatMapSubscriber is added to childSubscribers immediately.
  • Miscellaneous possible NPEs were dealt with.
  • Flaky tests were dealt with.

Let me know if any changes don't make sense.

Lastly, sorry this PR took so long 😅 Looks good to me for merging 👍

minwoox (Member) commented Apr 1, 2024:

Thanks! I also put in some miscellaneous changes.

minwoox (Member) left a comment:

Thanks a lot, @sleeplesslord and @jrhee17!

ikhoon (Contributor) left a comment:

Looks great! 🚀 💯

@ikhoon ikhoon merged commit 211a9e5 into line:main Apr 9, 2024
15 of 16 checks passed
sleeplesslord (Contributor Author):

Thanks for the fixes and the merge, happy to see this make it in 🙏

@sleeplesslord sleeplesslord deleted the flatmap branch April 9, 2024 11:52

Successfully merging this pull request may close these issues.

Add StreamMessage.flatMap()