Remove GC candidates found to still be referenced#3738
Conversation
570f69b to
8ce0dac
Compare
* Switch to using the GcCandidate class for Gc file reference deletions * Enables deletion mutations to only delete unique deletion candidate entries. * Adds tests to GarbageCollectorIT to ensure that new candidates persist when old candidates are deleted. * Removes InUse candidates from the metadata table * Modifies the Garbage Collector to remove GcCandidates that map to file references currently used by tablets. * Consolidate deletion path for GcCandidates
core/src/main/java/org/apache/accumulo/core/gc/GcCandidate.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/apache/accumulo/core/gc/GcCandidate.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| @Test | ||
| public void testDeletingInUseReferenceCandidates() throws Exception { |
There was a problem hiding this comment.
If they are not covered elsewhere, would be nice to cover a few more things in this test.
- test directory candidates w/ the in use case
- test multiple candidates w/ the in use case where some are in use and some are not
- verify the algorithm passes the expected GcCandidateType and timestamp when it does delete a candidate
Thinking that TestGCE.addCandidate could be modified to return the added candidate so would always know the expected timestamp for the in use case. TestGCE may need be modified to track more info to verify the expected GcCandidateType and timestamp when deleting a candidate
There was a problem hiding this comment.
Test cases now cover directory candidates, and have covered mixed candidates in the metadata of valid and inUse. types.
See the assertRemoved vs assertCandidateRemoved methods for validation of if we've deleted the file vs just removed the candidate.
addCandidate was modified to return the candidate which cut down on new object creation calls and tracking timestamps.
There was a problem hiding this comment.
Those test changes look good. Played around locally with tracking the candidate type to see if that was workable and it seemed it is. The following changes are where I tried that. I only asserted it in one random place, not sure how many more places the type could be asserted.
diff --git a/server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectionTest.java b/server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectionTest.java
index 1515910287..6e866610e3 100644
--- a/server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectionTest.java
+++ b/server/gc/src/test/java/org/apache/accumulo/gc/GarbageCollectionTest.java
@@ -19,6 +19,7 @@
package org.apache.accumulo.gc;
import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
@@ -55,7 +56,7 @@ public class GarbageCollectionTest {
static class TestGCE implements GarbageCollectionEnvironment {
TreeSet<GcCandidate> candidates = new TreeSet<>();
- TreeSet<GcCandidate> deletedCandidates = new TreeSet<>();
+ TreeMap<GcCandidate, GcCandidateType> deletedCandidates = new TreeMap<>();
ArrayList<String> blips = new ArrayList<>();
Map<String,Reference> references = new TreeMap<>();
HashSet<TableId> tableIds = new HashSet<>();
@@ -123,7 +124,7 @@ public class GarbageCollectionTest {
if (type.equals(GcCandidateType.INUSE) && this.level.equals(Ample.DataLevel.ROOT)) {
return;
}
- deletedCandidates.addAll(refCandidates);
+ refCandidates.forEach(gcCandidate -> deletedCandidates.put(gcCandidate, type));
this.candidates.removeAll(refCandidates);
}
@@ -226,12 +227,20 @@ public class GarbageCollectionTest {
private void assertCandidateRemoved(TestGCE gce, GcCandidate... gcCandidates) {
for (GcCandidate gcCandidate : gcCandidates) {
- assertTrue(gce.deletedCandidates.remove(gcCandidate));
+ assertNotNull(gce.deletedCandidates.remove(gcCandidate));
}
assertEquals(0, gce.deletedCandidates.size(),
"Deleted Candidates not empty: " + gce.deleteInUseRefs);
}
+ private void assertCandidateRemoved(TestGCE gce, GcCandidateType gcCandidateType, GcCandidate... gcCandidates) {
+ for (GcCandidate gcCandidate : gcCandidates) {
+ assertEquals(gcCandidateType, gce.deletedCandidates.remove(gcCandidate));
+ }
+ assertEquals(0, gce.deletedCandidates.size(),
+ "Deleted Candidates not empty: " + gce.deleteInUseRefs);
+ }
+
// This test was created to help track down a ConcurrentModificationException error that was
// occurring with the unit tests once the GC was refactored to use a single iterator for the
// collect process. This was a minimal test case that would cause the exception to occur.
@@ -1012,7 +1021,7 @@ public class GarbageCollectionTest {
var removedCandidate = gce.addCandidate("6/t-0/F003.rf");
gca.collect(gce);
- assertCandidateRemoved(gce, removedCandidate);
+ assertCandidateRemoved(gce, GcCandidateType.INUSE, removedCandidate);
assertRemoved(gce);
// Check and make sure the InUse directory candidates are not removed.
assertEquals(1, gce.candidates.size());
There was a problem hiding this comment.
Added the GcCandidateType to the standard assertCandidateRemoved method and ensured the proper type was passed for each test.
I also added the assertNoCandidatesRemoved(TestGCE gce) method so that a unused GcCandidateType did not need to be added for validating an empty candidates TreeMap
There was a problem hiding this comment.
Looks like you were always able to assert the type, that is nice. I was not sure if that would be possible.
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectionAlgorithm.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| context.getAmple().deleteGcCandidates(level, processedDeletes); | ||
| deleteGcCandidates(processedDeletes, GcCandidateType.VALID); |
There was a problem hiding this comment.
Is per candidate logging happening anywhere that includes the GcCandidate type?
There was a problem hiding this comment.
I've added a logging statement in the deleteGcCandidates class.
It's technically not per-candidate as we assign a type to a collection of candidates.
test/src/main/java/org/apache/accumulo/test/functional/GarbageCollectorIT.java
Show resolved
Hide resolved
* Modifies the log messages to provide INUSE dir context * Removes the INUSE dir candidates from being added as they will not be recreated and candidate removal could cause empty subdirs to persist.
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectionAlgorithm.java
Show resolved
Hide resolved
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectionAlgorithm.java
Show resolved
Hide resolved
Directory deletion candidates are not recreated unlike file candidates. So if a directory candidate is found to be InUse, it should not be removed from the metadata location. * Updates logging messages * Moves property check up to the GC algorithm level. * Fixes incorrect javadoc return message * Adds test for InUse Root Candidates
* Adds GcCandidateType validation when checking deleted candidates. * Added method for asserting no candidates have been removed.
The current GC behavior does not remove InUse candidates and would spend subsequent runs processing and discarding the same InUse Candidates in each batch size.
This is inefficient and can lead to unnecessarily long GC run times in high-load system environments.
This change adds the ability to remove Garbage Collection Candidates that still have matching tablet references.
A new class called GcCandidate and a GcCandidateType Enum have been added. The new GcCandidate type values are "VALID, INUSE, and INVALID".
A new property has been added called
GC_REMOVE_IN_USE_CANDIDATESwhich controls if the garbage collector deletes the InUse candidates.This property is disabled by default to ensure that existing behavior is not changed.
Additional tests have been added to the GarbageCollectorIT and GarbageCollectionTest classes.
This PR contains the relevant changes for the 2.1.x branch .
Supporting removal of InUse candidates for the Root table breaks version compatibility for 2.1.x, so that work has been targeted for a 3.1.x release.
Once the 3.1.x work is completed #3693 can be closed.