HDDS-8940. Fix for missing SST files on optimized snapDiff code path #5465
hemantk-12 merged 5 commits into apache:master
if (sameFiles.contains(node.getFileName()) ||
    differentFiles.contains(node.getFileName())) {
  LOG.debug("Skipping known processed SST: {}", node.getFileName());
for (CompactionNode nextNodes : successors) {
nit: variable node could be better than nextNodes & successors. Probably sourceNode & sourceNodes
I think successors is OK because these are the current node's successors. Guava calls them successors as well.
Also, they are target nodes of the current node, not source nodes.
nextNode is fine as well because it will get added to the next level's node list.
If you want, I can change it to successor/successors or nextNode/nextNodes.
Since this is a DAG, "successor" doesn't really sound good. But that is fine.
swamirishi left a comment
Thanks for the patch @hemantk-12. Overall the patch looks good apart from a few nitpicky comments here and there.
...ksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotUtils.java
swamirishi left a comment
Thanks for fixing the issues @hemantk-12 LGTM
Thanks for the review @swamirishi
…f code path (apache#5465) (cherry picked from commit 3d7e786) Change-Id: If3f9d0b350d5cd57bdbc5d863e4f0fa7526ff2e7
What changes were proposed in this pull request?
Problem:
SSTFilteringService and the SST pruning service work independently, and each tries to save space by deleting unnecessary SST files. SSTFilteringService deletes files that don't belong to the snapshotted bucket, and the SST pruning service deletes files that are not required for diff calculations. The compaction DAG, on the other hand, is global at the Ozone level and is not aware of either cleanup. The problem arises when calculating the delta files for two snapshots and the traversal reaches this condition. Graph traversal adds a node because it is not present in the toSnapshot (it might have been deleted by SSTFilteringService), and the node later gets added to the diff files because of this condition.
Before returning delta files to SnapshotDiffManager, we look for the files in either the active DB dir or the SST backup dir. The active DB dir doesn't have these files because they were compacted, and the SST backup dir doesn't have them because the SST pruning service removed them.
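The two-directory lookup described above can be sketched roughly as follows. This is only an illustration: the class name `SstFileLookupSketch`, the `resolve` method, and the flat directory layout are assumptions for the example, not the actual Ozone code. An empty result corresponds to the "missing SST file" case this PR addresses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Illustrative sketch only: names and directory layout are hypothetical.
class SstFileLookupSketch {

  /**
   * Looks for an SST file first in the active DB dir, then in the SST backup dir.
   * Returns empty if neither directory contains the file.
   */
  static Optional<Path> resolve(String sstFileName, Path activeDbDir, Path sstBackupDir) {
    Path inActiveDb = activeDbDir.resolve(sstFileName);
    if (Files.exists(inActiveDb)) {
      return Optional.of(inActiveDb);
    }
    Path inBackup = sstBackupDir.resolve(sstFileName);
    if (Files.exists(inBackup)) {
      return Optional.of(inBackup);
    }
    // Compacted away from the active DB and pruned from the backup dir.
    return Optional.empty();
  }

  public static void main(String[] args) throws IOException {
    Path activeDb = Files.createTempDirectory("active-db");
    Path backup = Files.createTempDirectory("sst-backup");
    Files.createFile(backup.resolve("000007.sst"));

    System.out.println(resolve("000007.sst", activeDb, backup).isPresent());
    System.out.println(resolve("000008.sst", activeDb, backup).isPresent());
  }
}
```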
Detailed explanation and example.
Fix:
This PR proposes keeping the key range in each DAG node and using it to return early while traversing.
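A minimal sketch of the idea, under stated assumptions: each node stores the smallest and largest key of its SST file, and the traversal skips any node whose range cannot contain keys with the bucket prefix. The names `KeyRangeTraversalSketch`, `Node`, `mayContainPrefix`, and `collectRelevantFiles` are hypothetical and are not taken from the patch. The sketch also compares keys with `String.compareTo`, which may not match RocksDB's byte-wise ordering for every key, and it assumes that skipping a node also skips the branch below it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: names are hypothetical, not the actual
// RocksDBCheckpointDiffer code.
class KeyRangeTraversalSketch {

  /** DAG node for one SST file, carrying the smallest and largest key in the file. */
  static final class Node {
    final String fileName;
    final String startKey;
    final String endKey;
    final List<Node> successors = new ArrayList<>();

    Node(String fileName, String startKey, String endKey) {
      this.fileName = fileName;
      this.startKey = startKey;
      this.endKey = endKey;
    }
  }

  private static String truncate(String key, int length) {
    return key.length() <= length ? key : key.substring(0, length);
  }

  /** Returns true if a key starting with prefix could lie in [startKey, endKey]. */
  static boolean mayContainPrefix(Node node, String prefix) {
    // Compare only the leading prefix-length portion of each bound.
    String low = truncate(node.startKey, prefix.length());
    String high = truncate(node.endKey, prefix.length());
    return low.compareTo(prefix) <= 0 && high.compareTo(prefix) >= 0;
  }

  /** Breadth-first walk that returns early on nodes outside the bucket's key range. */
  static Set<String> collectRelevantFiles(Node source, String bucketPrefix) {
    Set<String> relevant = new LinkedHashSet<>();
    Set<String> visited = new HashSet<>();
    Deque<Node> queue = new ArrayDeque<>();
    queue.add(source);
    while (!queue.isEmpty()) {
      Node node = queue.poll();
      if (!visited.add(node.fileName)) {
        continue; // already processed
      }
      if (!mayContainPrefix(node, bucketPrefix)) {
        continue; // key range cannot hold this bucket's keys: skip node and branch
      }
      relevant.add(node.fileName);
      queue.addAll(node.successors);
    }
    return relevant;
  }

  public static void main(String[] args) {
    Node root = new Node("000010.sst", "/vol/bucket1/a", "/vol/bucket2/z");
    Node inBucket = new Node("000007.sst", "/vol/bucket1/a", "/vol/bucket1/m");
    Node otherBucket = new Node("000008.sst", "/vol/bucket2/a", "/vol/bucket2/z");
    root.successors.add(inBucket);
    root.successors.add(otherBucket);
    System.out.println(collectRelevantFiles(root, "/vol/bucket1/"));
  }
}
```

With the range check in place, a file that holds only another bucket's keys never enters the result, so it is never looked up after the filtering or pruning services have deleted it.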
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8940
How was this patch tested?