HDDS-8940. Fix for missing SST files on optimized snapDiff code path by hemantk-12 · Pull Request #5465 · apache/ozone

hemantk-12 · 2023-10-18T20:28:21Z

What changes were proposed in this pull request?

Problem:
SSTFilteringService and the SST pruning service work independently, and each tries to reclaim space by deleting unnecessary SST files. SSTFilteringService deletes files that don't belong to the snapshotted bucket, and the SST pruning service deletes files that are not required for diff calculations. The compaction DAG, on the other hand, is global at the Ozone level and is not aware of either cleanup. The problem arises when calculating the delta files for two snapshots and the traversal reaches this condition: graph traversal adds a node because it is not present in the toSnapshot (it might have been deleted by SSTFilteringService), and the node later gets added to the diff files because of this condition.
Before returning the delta files to SnapshotDiffManager, we look for the files in either the active DB dir or the SST backup dir. The active DB dir doesn't have these files because they were compacted, and the SST backup dir doesn't have them because of the SST pruning service.
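
To make the failing lookup concrete, here is a minimal sketch of a two-directory search. It is not the code in RocksDBCheckpointDiffer: the class name `SstLookupSketch`, the method `locate`, and the file names are hypothetical, and the temp directories only stand in for the active DB dir and the SST backup dir.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

public class SstLookupSketch {

  /**
   * Hypothetical helper: looks for an SST file first in the active DB dir,
   * then in the SST backup dir. Returns empty if neither has it.
   */
  static Optional<Path> locate(String sstFileName, Path activeDbDir, Path sstBackupDir) {
    Path inActive = activeDbDir.resolve(sstFileName);
    if (Files.exists(inActive)) {
      return Optional.of(inActive);
    }
    Path inBackup = sstBackupDir.resolve(sstFileName);
    if (Files.exists(inBackup)) {
      return Optional.of(inBackup);
    }
    return Optional.empty();
  }

  /** Builds two temp dirs and reports {found 000005.sst, found 000006.sst}. */
  static boolean[] demo() {
    try {
      Path active = Files.createTempDirectory("active-db");
      Path backup = Files.createTempDirectory("sst-backup");
      // 000005.sst survives in the backup dir; 000006.sst was compacted
      // out of the active DB and pruned from the backup dir.
      Files.createFile(backup.resolve("000005.sst"));
      return new boolean[] {
          locate("000005.sst", active, backup).isPresent(),
          locate("000006.sst", active, backup).isPresent()
      };
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] found = demo();
    System.out.println("000005.sst found: " + found[0]); // true
    System.out.println("000006.sst found: " + found[1]); // false
  }
}
```

In this sketch, a file that the traversal reports but that neither directory still holds comes back empty, which corresponds to the missing-SST situation described above.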

Detailed explanation and example.

Fix:
This PR proposes keeping the key range in each DAG node and using it to return early while traversing.
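
A simplified sketch of the idea, not the actual patch: each node carries the smallest and largest key of its SST file, and the traversal skips a node whose range cannot overlap the bucket's key prefix. The class `KeyRangeTraversalSketch`, the `Node` fields, the method names, the file names, and the keys are all hypothetical, and plain `String` comparison is assumed to match the key ordering (true for ASCII keys).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class KeyRangeTraversalSketch {

  /** Hypothetical stand-in for a compaction DAG node that records its key range. */
  static final class Node {
    final String fileName;
    final String startKey;
    final String endKey;

    Node(String fileName, String startKey, String endKey) {
      this.fileName = fileName;
      this.startKey = startKey;
      this.endKey = endKey;
    }

    /** True if [startKey, endKey] could hold a key that starts with the prefix. */
    boolean mayContainPrefix(String prefix) {
      boolean endsBeforePrefix = endKey.compareTo(prefix) < 0;
      boolean startsAfterPrefix =
          startKey.compareTo(prefix) > 0 && !startKey.startsWith(prefix);
      return !endsBeforePrefix && !startsAfterPrefix;
    }
  }

  /** Breadth-first walk over successors; returns files that differ between snapshots. */
  static List<String> collectDiffFiles(Map<String, Node> nodes,
      Map<String, List<String>> successors, List<String> fromSnapshotFiles,
      Set<String> toSnapshotFiles, String bucketPrefix) {
    List<String> diff = new ArrayList<>();
    Set<String> visited = new HashSet<>();
    Deque<String> queue = new ArrayDeque<>(fromSnapshotFiles);
    while (!queue.isEmpty()) {
      String name = queue.poll();
      if (!visited.add(name)) {
        continue; // already processed
      }
      if (!nodes.get(name).mayContainPrefix(bucketPrefix)) {
        continue; // early return: no key of this bucket can be in the file
      }
      if (toSnapshotFiles.contains(name)) {
        continue; // same file in both snapshots
      }
      List<String> next = successors.getOrDefault(name, List.of());
      if (next.isEmpty()) {
        diff.add(name); // end of the chain and not in toSnapshot
      } else {
        queue.addAll(next);
      }
    }
    return diff;
  }

  /** Sample DAG: 000010.sst has three successors, one of them from another bucket. */
  static List<String> sampleDiff(String bucketPrefix) {
    Map<String, Node> nodes = Map.of(
        "000010.sst", new Node("000010.sst", "/vol1/bucket1/a", "/vol1/bucket2/z"),
        "000005.sst", new Node("000005.sst", "/vol1/bucket1/a", "/vol1/bucket1/m"),
        "000006.sst", new Node("000006.sst", "/vol1/bucket2/a", "/vol1/bucket2/z"),
        "000007.sst", new Node("000007.sst", "/vol1/bucket1/n", "/vol1/bucket1/z"));
    Map<String, List<String>> successors = Map.of(
        "000010.sst", List.of("000005.sst", "000006.sst", "000007.sst"));
    return collectDiffFiles(nodes, successors, List.of("000010.sst"),
        Set.of("000005.sst"), bucketPrefix);
  }

  public static void main(String[] args) {
    System.out.println(sampleDiff("/vol1/bucket1/")); // [000007.sst]
    System.out.println(sampleDiff("/vol1/"));         // [000006.sst, 000007.sst]
  }
}
```

With the bucket prefix, `000006.sst` (which holds only keys of another bucket and may already have been deleted by filtering) is skipped before it can reach the diff list. With the wider `/vol1/` prefix the range check no longer excludes it, which mirrors the behavior the PR description identifies as the source of the missing files.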

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8940

How was this patch tested?

Existing unit tests.
New tests are in progress.

swamirishi · 2023-10-20T00:19:58Z

...ksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java

-            if (sameFiles.contains(node.getFileName()) ||
-                differentFiles.contains(node.getFileName())) {
-              LOG.debug("Skipping known processed SST: {}", node.getFileName());
+          for (CompactionNode nextNodes : successors) {


nit: variable node could be better than nextNodes & successors. Probably sourceNode & sourceNodes

I think successors is OK because they are the current node's successors. Guava also calls them successors.
Another thing: they are target nodes of the current node, not source nodes.

NextNode is fine as well because it will get added to next level's node list.

If you want I can change it to successor/successors or nextNode/nextNodes.

Since this is a DAG, successor doesn't really sound good. But that is fine.

swamirishi

Thanks for the patch @hemantk-12. Overall the patch looks good apart from a few nitpicky comments here and there.

...ksdb-checkpoint-differ/src/main/java/org/apache/ozone/rocksdiff/RocksDBCheckpointDiffer.java

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotUtils.java

swamirishi

Thanks for fixing the issues @hemantk-12 LGTM

hemantk-12 · 2023-10-27T22:33:40Z

Thanks for the review @swamirishi

hemantk-12 added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Oct 18, 2023

hemantk-12 requested review from aswinshakil, smengcl and swamirishi October 19, 2023 22:32

swamirishi reviewed Oct 20, 2023

swamirishi requested changes Oct 20, 2023

hemantk-12 marked this pull request as ready for review October 24, 2023 00:28

hemantk-12 added 4 commits October 25, 2023 15:02

Fix for missing SSt files

5c36ae1

Fixed typos

6023fe8

Added testcases

8bd9044

Addresed review comments

c51733a

hemantk-12 force-pushed the HDDS-8940 branch from 7ad8236 to c51733a Compare October 25, 2023 22:03

Fixed the bug

2c44a97

swamirishi approved these changes Oct 27, 2023

hemantk-12 merged commit 3d7e786 into apache:master Oct 27, 2023

ibrusentsev pushed a commit to ibrusentsev/ozone that referenced this pull request Nov 14, 2023

HDDS-8940. Fix for missing SST files on optimized snapDiff code path (a…

aca8777

…pache#5465)

hemantk-12 mentioned this pull request Jan 24, 2024

HDDS-9824. Provide CLI that scans keys for a matching container and lists keys #5724

Closed

jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Feb 1, 2024

CDPD-58290. HDDS-8940. Fix for missing SST files on optimized snapDif…

d1ebdf3

…f code path (apache#5465) (cherry picked from commit 3d7e786) Change-Id: If3f9d0b350d5cd57bdbc5d863e4f0fa7526ff2e7

hemantk-12 deleted the HDDS-8940 branch October 28, 2024 18:42

hemantk-12 merged 5 commits into apache:master from hemantk-12:HDDS-8940
