New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak in GelfChunkAggregator #1488

Merged
merged 1 commit into from Oct 16, 2015

Conversation

Projects
None yet
2 participants
@bernd
Member

bernd commented Oct 15, 2015

The memory leak showed up because of a problem in the ChunkEntry#compareTo() method.

If two ChunkEntry objects had the same firstTimestamp value, the compareTo() method returned a wrong result because only the timestamp has been taken into account. This made sortedEvictionSet.remove(entry) not remove the entry from the set in some cases.

The GelfChunkAggregator class maintains a chunks ConcurrentMap and a sortedEvictionSet ConcurrentSkipListSet with ChunkEntry objects. The ChunkEvictionTask gets the first entry from the sortedEvictionSet and basically removes it from the chunks map and then tries to remove it from the sortedEvictionSet as well.

The removal from the chunks map works, the removal from the sortedEvictionSet does not work. (due to the compareTo() bug)

During the next scheduled execution of ChunkEvictionTask#run(), the same entry is returned from the sortedEvictionSet. This entry is not in the chunks map anymore so chunks.remove(entry) returns null. The return value is used to remove the entry from the sortedEvictionSet. This throws a NullPointerException because the return value was null. This NPE was ignored and not logged.

This change fixes the ChunkEntry#compareTo() method to compare the ID if both timestamps are equal. We also now log any error that happens during a ChunkEvictionTask#run() execution.

This issue was found, debugged and reported in great detail with a patch to fix it by @lightpriest and @onyxmaster. Thank you!

See #1462 for details.

Fixes #1462

Fix memory leak in GelfChunkAggregator
The memory leak showed up because of a problem in the
ChunkEntry#compareTo() method.

If two ChunkEntry objects had the same firstTimestamp value, the compareTo()
method returned a wrong result because only the timestamp has been taken
into account. This made sortedEvictionSet.remove(entry) not remove the
entry from the set in some cases.

The GelfChunkAggregator class maintains a chunks ConcurrentMap and a
sortedEvictionSet ConcurrentSkipListSet with ChunkEntry objects. The
ChunkEvictionTask gets the first entry from the sortedEvictionSet and
basically removes it from the chunks map and then tries to remove it
from the sortedEvictionSet as well.

The removal from the chunks map works, the removal from the
sortedEvictionSet does not work. (due to the compareTo() bug)

During the next scheduled execution of ChunkEvictionTask#run(), the same
entry is returned from the sortedEvictionSet. This entry is not in the
chunks map anymore so chunks.remove(entry) returns null. The return
value is used to remove the entry from the sortedEvictionSet. This
throws a NullPointerException because the return value was null.
This NPE was ignored and not logged.

This change fixes the ChunkEntry#compareTo() method to compare the ID if
both timestamps are equal. We also now log any error that happens during a
ChunkEvictionTask#run() execution.

This issue was found, debugged and reported in great detail with a patch
to fix it by @lightpriest and @onyxmaster. Thank you!

See #1462 for details.

Fixes #1462

@bernd bernd added this to the 1.2.2 milestone Oct 15, 2015

@kroepke kroepke self-assigned this Oct 16, 2015

@kroepke

This comment has been minimized.

Member

kroepke commented Oct 16, 2015

lgtm!

kroepke added a commit that referenced this pull request Oct 16, 2015

Merge pull request #1488 from Graylog2/issue-1462
Fix memory leak in GelfChunkAggregator

@kroepke kroepke merged commit 6bd8345 into 1.2 Oct 16, 2015

3 checks passed

ci Jenkins build graylog2-server-integration-pr 285 has succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@kroepke kroepke deleted the issue-1462 branch Oct 16, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment