Hazelcast 3.5.x Memory Leak using IMap #6317

Closed
jocelynlepage opened this Issue Sep 29, 2015 · 9 comments

@jocelynlepage

We are experiencing memory leaks when using IMaps.

Based on our observations, the leak seems related to the fact that we're regularly destroying maps.
Based on heap analysis, the culprit seems to be in or around com.hazelcast.map.impl.MapServiceContextImpl.

Setup:

  • Hazelcast 3.5.2, JDK 1.7/1.8, reproduced on Linux, Windows 7 and Mac OS X 10.9.5
  • 2 nodes with embedded Hazelcast (i.e. not using separate client JVMs)

Scenario:

  • One of our nodes creates a new IMap roughly once per minute and puts entries into it at regular intervals (a few hundred to a few thousand per minute).
  • A few seconds or minutes later, the other node removes data from the map and, if the map is empty, destroys it.
  • We notice steady growth in heap usage, which seems related to Hazelcast's internal IMap management.

We ran a couple of variants, with the following observations:

  • Hazelcast versions 3.5.1 and 3.5.2 exhibit the same memory leak behavior.
  • We tested with JMX/ManCenter both enabled and disabled; the results did not change.
  • We disabled map "rotation" (i.e. a single map is used instead) and did not notice the memory leak.
  • We also tested with version 3.4.6, where the problem is NOT present, so it seems to have been introduced in 3.5.x.

We wanted to update from 3.4.x because our production system is affected by another memory leak, documented in #4888.

Please find a small project on GitHub which demonstrates the problem: https://github.com/jocelynlepage/hz-map-leak
This project includes a couple of heap dump files (produced with jmap) provided for analysis.

Thanks,
Jocelyn

@jerrinot jerrinot added this to the 3.6 milestone Sep 29, 2015
@jerrinot jerrinot added Team: Core VERIFIED and removed PENDING labels Sep 29, 2015
@jerrinot
Contributor

@jocelynlepage: Thank you for reporting the issue and for a great reproducer!

I managed to reproduce it. I can see that approximately half of the partitions on each member are cleaned, but the other half still retain data. I reckon it's related to backups.

@jerrinot
Contributor

Heapdump: http://54.87.52.100/~jara/dumps/issue6317/java_pid14221.0001_.hprof.gz
This is from 3.5.2.

For whoever will fix it: Please ping me so I can delete the dump.

@jerrinot
Contributor

I believe it's a side-effect of #5649

Because of this check, MapPartitionDestroyOperation is executed on partition owners only, so the backup replicas are not touched at all. The MapPartitionDestroyOperation should implement the BackupAwareOperation interface.

The question is how to backport it into the 3.x branch, as we cannot simply add a new MapPartitionDestroyBackupOperation due to compatibility in patch-level releases.
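
To make the diagnosis above concrete, here is a minimal toy model of the leak. It is not Hazelcast code: the `Member` class, the partition count of 4, the owner/backup assignment, and the `run` helper are all assumptions made purely for illustration. It only sketches why destroying map containers on partition owners alone leaves one container per partition behind on the backup replicas for every map that is created and destroyed.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only, not Hazelcast code: 2 members, 4 partitions, 1 backup per partition. */
public class DestroyLeakSketch {

    static final int PARTITIONS = 4;

    /** Hypothetical stand-in for a member's per-partition map containers. */
    static final class Member {
        // partitionId -> (mapName -> entries of that map in this partition)
        final Map<Integer, Map<String, Map<Integer, String>>> containers = new HashMap<>();

        void put(int partitionId, String mapName, int key, String value) {
            containers.computeIfAbsent(partitionId, p -> new HashMap<>())
                      .computeIfAbsent(mapName, n -> new HashMap<>())
                      .put(key, value);
        }

        void destroyContainer(int partitionId, String mapName) {
            Map<String, Map<Integer, String>> byName = containers.get(partitionId);
            if (byName != null) {
                byName.remove(mapName);
            }
        }

        int retainedContainers() {
            int count = 0;
            for (Map<String, Map<Integer, String>> byName : containers.values()) {
                count += byName.size();
            }
            return count;
        }
    }

    /** Creates, fills, and destroys `rounds` maps; returns the containers still held afterwards. */
    static int run(int rounds, boolean destroyOnBackups) {
        Member[] members = { new Member(), new Member() };
        for (int round = 0; round < rounds; round++) {
            String mapName = "map-" + round;
            for (int key = 0; key < 100; key++) {
                int p = key % PARTITIONS;
                members[p % 2].put(p, mapName, key, "v");        // owner replica
                members[(p + 1) % 2].put(p, mapName, key, "v");  // backup replica
            }
            for (int p = 0; p < PARTITIONS; p++) {
                members[p % 2].destroyContainer(p, mapName);     // owner: always destroyed
                if (destroyOnBackups) {
                    members[(p + 1) % 2].destroyContainer(p, mapName);
                }
            }
        }
        return members[0].retainedContainers() + members[1].retainedContainers();
    }

    public static void main(String[] args) {
        System.out.println("owner-only destroy: " + run(10, false)); // 40
        System.out.println("owners and backups: " + run(10, true));  // 0
    }
}
```

In this model, each destroyed map leaves 4 backup containers behind when only owners are cleaned, so the retained count grows linearly with the number of rotated maps. That is consistent with the observation that roughly half of the partitions on each member still hold data.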

@jerrinot jerrinot added a commit to jerrinot/hazelcast that referenced this issue Oct 1, 2015
@jerrinot jerrinot Fixes #6317
MapPartitionDestroyOperation is now BackupAware to destroy containers on replicas
bff44c1
@jerrinot jerrinot added a commit to jerrinot/hazelcast that referenced this issue Oct 2, 2015
@jerrinot jerrinot Fixes #6317
MapPartitionDestroyOperation is now BackupAware to destroy containers on replicas
f852df8
@bwzhang2011

@jerrinot, is there any update on this issue? Does it only occur in the specific scenario described above?

@jerrinot jerrinot added a commit to jerrinot/hazelcast that referenced this issue Oct 2, 2015
@jerrinot jerrinot Execute the MapPartitionDestroyOperation on all local partitions even…
… they are just backups replicas

Fixes #6317
It's not a direct backport of #6347 as we cannot add a new operation into a maintenance branch.
ee0f127
@jerrinot
Contributor
jerrinot commented Oct 2, 2015

@bwzhang2011: PRs fixing the leak in both master and maintenance branches are pending. This fix will be part of Hazelcast 3.5.3 which should be released soon.

@jerrinot jerrinot added a commit to jerrinot/hazelcast that referenced this issue Oct 2, 2015
@jerrinot jerrinot Execute the MapPartitionDestroyOperation on all local partitions even…
… they are just backups replicas

Fixes #6317
It's not a direct backport of #6347 as we cannot add a new operation into a maintenance branch.
8d56553
@jerrinot jerrinot closed this in #6347 Oct 2, 2015
@bwzhang2011

@jerrinot, thanks for the feedback, the quick fix, and the backport plan. Looking forward to the 3.5.3 release and to more issues being fixed afterwards.

@jocelynlepage

Thanks guys for the quick turnaround.
Looking forward to updating to 3.5.3 once it is available.

@jerrinot jerrinot added a commit to jerrinot/hazelcast that referenced this issue Oct 8, 2015
@jerrinot jerrinot Execute the MapPartitionDestroyOperation on all local partitions even…
… they are just backups replicas

Fixes #6317
It's not a direct backport of #6347 as we cannot add a new operation into a maintenance branch.
e33565d
@jerrinot jerrinot added a commit to jerrinot/hazelcast that referenced this issue Oct 8, 2015
@jerrinot jerrinot Execute the MapPartitionDestroyOperation on all local partitions even…
… they are just backups replicas

Fixes #6317
It's not a direct backport of #6347 as we cannot add a new operation into a maintenance branch.
(cherry picked from commit e33565d)
4b63f8a
@jerrinot jerrinot added a commit that referenced this issue Oct 8, 2015
@jerrinot jerrinot Execute the MapPartitionDestroyOperation on all local partitions even…
… they are just backups replicas

Fixes #6317
It's not a direct backport of #6347 as we cannot add a new operation into a maintenance branch.
(cherry picked from commit e33565d)
2e98b8b
@tombujok tombujok added a commit to tombujok/hazelcast that referenced this issue Oct 14, 2015
@jerrinot @tombujok jerrinot + tombujok Fixes #6317
MapPartitionDestroyOperation is now BackupAware to destroy containers on replicas
68023f5
@IsuraD
IsuraD commented Feb 22, 2016

Hi jerrinot,

Is this issue fixed in 3.5.3?

@jerrinot
Contributor

@IsuraD: Hi, it is. See #6400
