Split Brain messages still occupies memory after recovery #10325

sjursky opened this issue Apr 12, 2017 · 2 comments

@sjursky sjursky commented Apr 12, 2017

We have a Hazelcast v3.8 cluster with around 80 nodes distributed over 4 servers (2x24, 2x16).
The cluster usually breaks down during server-by-server application redeployment and recovers after a while.

When we analyzed our last heap dump (~97% full), we found ~66k instances of com.hazelcast.internal.cluster.impl.SplitBrainJoinMessage occupying 616 MB. That's a lot.

Each message occupied about 9 KB, and most of that (8 KB) is the memberAddresses ArrayList<com.hazelcast.nio.Address> of 80 items (one per node).

Those messages cannot be garbage-collected because they are referenced from the thread com.hazelcast.internal.partition.impl.MigrationThread.

We also noticed this behavior without a server restart. After 26 hours of server uptime there were 18k SplitBrainJoinMessage instances taking 167 MB on the heap. No restart was done, only some heavy tests, and the logs contain no Hazelcast errors other than "MonitorInvocationsTask/BroadcastOperationControlTask delayed" right before the heap dump was created.
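As a rough cross-check of the figures from both heap dumps, the sketch below multiplies the message count by the per-message size, derives the per-message size implied by the reported totals, and estimates the bytes per Address entry. The class and method names are hypothetical helpers (not part of Hazelcast), the counts are the rounded values quoted above, and 1 MB is assumed to be 1024 x 1024 bytes.

```java
// Sketch only: HeapEstimate and its methods are hypothetical helpers,
// not Hazelcast API. Inputs are the rounded figures from the two heap dumps.
public class HeapEstimate {

    private static final double BYTES_PER_KB = 1024.0;
    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    /** Estimated total in MB for `count` messages of `kbPerMessage` KB each. */
    static double estimatedTotalMb(long count, double kbPerMessage) {
        return count * kbPerMessage * BYTES_PER_KB / BYTES_PER_MB;
    }

    /** Average size per message in KB implied by a reported total in MB. */
    static double impliedKbPerMessage(long count, double totalMb) {
        return totalMb * BYTES_PER_MB / count / BYTES_PER_KB;
    }

    /** Average bytes per list entry when `listKb` KB holds `entries` items. */
    static double bytesPerEntry(double listKb, int entries) {
        return listKb * BYTES_PER_KB / entries;
    }

    public static void main(String[] args) {
        // Dump 1: ~66k messages, 616 MB reported. Dump 2: ~18k messages, 167 MB reported.
        System.out.printf("66k x 9 KB = %.0f MB%n", estimatedTotalMb(66_000, 9));
        System.out.printf("18k x 9 KB = %.0f MB%n", estimatedTotalMb(18_000, 9));
        System.out.printf("implied KB per message: %.1f and %.1f%n",
                impliedKbPerMessage(66_000, 616), impliedKbPerMessage(18_000, 167));
        // memberAddresses: 8 KB for 80 Address entries.
        System.out.printf("bytes per Address entry: %.1f%n", bytesPerEntry(8, 80));
    }
}
```

Under these assumptions both dumps imply roughly the same per-message size, which suggests the retained memory grows in proportion to the number of accumulated messages rather than to message size.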

What can we do to debug this, or to clean up those messages from the heap?
If you need more information please let me know.

Thanks in advance.


Hazelcast 3.8
Cluster of 80 members
0 clients; each web application acts as a cluster member.
Hazelcast-related JVM params: -DscMulticastAddress= -DscMulticastPort=50000 -DscIpTtl=1
OS: Oracle Linux 6.1

@mdogan mdogan commented Apr 12, 2017

Hi @sjursky,

Can you please attach a screenshot from the heap dump analysis that shows how MigrationThread keeps references to SplitBrainJoinMessages? I'm still struggling to figure out how; a concrete sample will make it easier for us.

@sjursky sjursky commented Apr 12, 2017

I used Memory Analyzer's "merge paths to GC roots" view.

I will try to track down the MigrationThread reference.
For now, the screenshots from the heap dump with 18k messages show com.hazelcast.spi.impl.operationexecutor.impl.PartitionOperationThread,

and those from the heap dump with 66k messages show java.lang.Thread hz._hzInstance_1_UNIUS_APP.MulticastThread.

