High memory consumption using Grouped operator in Akka Streams #25623
Comments
That's an interesting finding. Wouldn't an appropriate solution be to create a new We use
@dembol would you be able to create a pull request?
Sure, it would be nice to start contributing :) I'll investigate ByteStringBuilder#clear too and prepare a PR this week.
That also seems to be a candidate for an upstream fix in Scala itself. Might make sense to check if that is a known issue for
@dembol Are you still planning to contribute to this?
I updated it in scala/scala#10019
@patriknw I think once Akka has updated to Scala 2.12.16, this issue can be closed.
If it is fixed upstream, I think we can close it already now, since Akka does not affect the Scala patch version of the consuming project?
I agree. Fixed by scala/scala#10019, due for Scala 2.12.16. Thanks for following up on that, @hepin1989!
(Not adding a milestone since nothing seems appropriate, technically it's a "will not fix (here)".)
I’ve noticed a serious problem with high memory consumption when using streams terminated by a SinkRef with a `grouped` operator somewhere in the stream topology. The `grouped` operator is backed by a VectorBuilder, which is cleared when the `onPush` or `onUpstreamFinish` callbacks are executed. Unfortunately, clearing the builder in Scala 2.12 means allocating a new root node while still keeping hard references to the obsolete nodes on the higher levels of the tree, which therefore cannot be garbage collected. I see that some additional cleaning has been added in Scala 2.13, which will eliminate the problem in the future: scala/scala@845b0f0#diff-59f3462485b74027de4fd5e9febcc81bR627

The situation can be very dangerous when somebody wants to use the `grouped` operator with a high number of grouped elements, connect it to the sink produced by `StreamRefs.sourceRef()`, and send the materialized `SourceRef` over the network. Why is it so dangerous? `SourceRefImpl` and `SinkRefImpl` start watching their partners, which causes a WATCH SystemMessage to be sent with the watchee and watcher remote paths. Those paths are deserialized by `SystemMessageSerializer.deserializeSystemMessage`, which in turn calls `RemoteActorRefProvider.resolveActorRef`, which uses `ActorRefResolveThreadLocalCache` backed by `LruBoundedCache`. If we have a `grouped` stage with lots of heavy elements, we may observe OutOfMemoryErrors within a short time if we materialize many such streams.

Here are screenshots from the profiler: [screenshots not preserved]
The quick fix is to use a modified `grouped` operator that uses a List instead of a Vector (https://gist.github.com/dembol/b69d205ca35af7ec19453e66affbb10c), or to use the `grouped` operator with fewer than 32 elements.
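
For illustration, the List-based idea can be sketched in plain Scala without any Akka dependency. This is not the code from the gist and not Akka's `Grouped` stage: `ListGrouper`, `groupAll` and `ListGrouperDemo` are hypothetical names, and the `push`/`finish` methods only loosely mirror the `onPush`/`onUpstreamFinish` callbacks. The point it tries to show is that an immutable List buffer can simply be replaced with `Nil` after each group, so the grouper itself keeps no reference to elements it has already emitted.

```scala
// Hypothetical sketch: models only the buffering logic of a grouping stage.
// It is an assumption for illustration, not Akka's or the gist's implementation.
final class ListGrouper[T](n: Int) {
  require(n > 0, "n must be greater than 0")

  private var buf: List[T] = Nil // newest element first
  private var left: Int = n      // elements still missing from the current group

  /** Adds one element; returns a full group once n elements have arrived. */
  def push(elem: T): Option[List[T]] = {
    buf = elem :: buf
    left -= 1
    if (left == 0) Some(drain()) else None
  }

  /** Call when upstream finishes; returns the final partial group, if any. */
  def finish(): Option[List[T]] =
    if (left < n) Some(drain()) else None

  private def drain(): List[T] = {
    val group = buf.reverse // restore arrival order
    buf = Nil               // the grouper no longer references emitted elements
    left = n
    group
  }
}

object ListGrouperDemo {
  /** Groups all elements, including a trailing partial group. */
  def groupAll[T](elems: Iterable[T], n: Int): List[List[T]] = {
    val grouper = new ListGrouper[T](n)
    val full = elems.iterator.flatMap(e => grouper.push(e).toList).toList
    full ++ grouper.finish().toList
  }

  def main(args: Array[String]): Unit =
    println(groupAll(1 to 7, 3)) // prints List(List(1, 2, 3), List(4, 5, 6), List(7))
}
```

In a real stream the same buffering would have to live inside a custom stage, as the gist does; the sketch above only shows why replacing the buffer avoids holding on to old groups.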