This repository has been archived by the owner on Mar 31, 2023. It is now read-only.

GEODE-9854: Orphaned .drf file causing memory leak (apache#7145)

Issue:
When Oplog files are compacted, a .drf file can be left behind because it contains deletes of
entries in previous .crfs. The .crf file is deleted, but the orphaned .drf is not deleted until all
previous .crf files (.crfs with a smaller id) are deleted.

The problem is that the compacted Oplog object representing the orphaned .drf file holds
a structure in memory (Oplog.regionMap) whose contents are no longer useful
after compaction but still occupy memory. In addition, there is a race
condition in the code that creates .krf files which, depending on the execution order,
can make the problem more severe: it can leave the pendingKrfTags structure
in the regionMap, and that structure can take up a significant amount of memory. The
pendingKrfTags HashMap is actually empty, but it still consumes memory because it was
populated previously and its capacity is not reduced when it is cleared.
This race condition usually happens when a new Oplog is rolled and the previous Oplog
is immediately marked as eligible for compaction. Compaction and .krf creation start at
about the same time, and the compactor cancels .krf creation if it runs first.
The pendingKrfTags structure is normally cleared when the .krf file is created, but because
compaction canceled the .krf creation, the pendingKrfTags structure remains in memory
until the Oplog representing the orphaned .drf file is deleted.
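
The point about an empty HashMap still consuming memory can be illustrated with a
minimal sketch. The class, field types, and method names below are hypothetical and are
not taken from the Geode sources; the only fact relied on is that java.util.HashMap.clear()
removes the entries but does not shrink the internal bucket array.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder standing in for the per-region info kept in Oplog.regionMap.
// Key and value types are placeholders chosen for this sketch only.
class PendingTagsSketch {
  private Map<Long, String> pendingKrfTags = new HashMap<>();

  // Must not be called after release(); the sketch does not guard against that.
  void record(long key, String tag) {
    pendingKrfTags.put(key, tag);
  }

  // Removes every entry, but the bucket array that grew while the map
  // was populated stays allocated for as long as the map is reachable.
  void clearInPlace() {
    pendingKrfTags.clear();
  }

  // Drops the reference, so the map and its bucket array become
  // eligible for garbage collection.
  void release() {
    pendingKrfTags = null;
  }

  int size() {
    return pendingKrfTags == null ? 0 : pendingKrfTags.size();
  }

  boolean isReleased() {
    return pendingKrfTags == null;
  }
}
```

In this sketch, clearInPlace() corresponds to the state described above (an empty but
still allocated map), while release() corresponds to dropping the structure entirely.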

Solution:
Clear the Oplog's regionMap data structure when the Oplog is compacted (currently it is
deleted only when the Oplog is destroyed).

* Introduced an inner static class RegionMap in Oplog
* RegionMap.get() always returns an empty map if the RegionMap was closed before
* When closing a disk region, skip adding a drf-only oplog to the unrecovered
map and also don't try to remove it from regionMap (it was already
removed during compaction).

* The following test cases are introduced:

1. Recovery of a single region after the cache is closed and then recreated
(testCompactorRegionMapDeletedForOnlyDrfOplogAfterCompactionAndRecoveryAfterCacheClosed)

2. Recovery of a single region after the region is closed and then recreated
(testCompactorRegionMapDeletedForOnlyDrfOplogAfterCompactionAndRecoveryAfterRegionClose)
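
The RegionMap wrapper described in the solution can be sketched roughly as follows.
This is an illustration under assumptions, not the code from this commit: OplogSketch,
onCompacted(), isClosed(), the AtomicReference field, and the Long/String types are all
hypothetical. Only the idea comes from the commit message: an inner static class whose
get() returns an empty map once it has been closed.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for Oplog; only the region map handling is sketched.
class OplogSketch {

  // Wrapper that owns the per-region data and can release it early.
  static class RegionMap {
    // Holds the backing map until close() swaps in null.
    private final AtomicReference<Map<Long, String>> ref =
        new AtomicReference<>(new ConcurrentHashMap<>());

    // Returns the live map, or an immutable empty map once closed.
    Map<Long, String> get() {
      Map<Long, String> map = ref.get();
      return map != null ? map : Collections.emptyMap();
    }

    // Drops the backing map so it becomes eligible for garbage collection.
    void close() {
      ref.set(null);
    }

    boolean isClosed() {
      return ref.get() == null;
    }
  }

  private final RegionMap regionMap = new RegionMap();

  RegionMap getRegionMap() {
    return regionMap;
  }

  // Assumed hook: invoked once compaction has finished and only the .drf remains.
  void onCompacted() {
    regionMap.close();
  }
}
```

Returning an empty map rather than null means callers that only read or iterate the map
keep working after compaction, while the memory held by the old entries is released.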

Co-authored-by: Alberto Gomez <alberto.gomez@est.tech>
jvarenina and albertogpz committed Dec 9, 2021
1 parent c65f048 commit 324ed89
Showing 2 changed files with 457 additions and 20 deletions.
