Skip to content

HDDS-8940. Fix for missing SST files on optimized snapDiff code path#5465

Merged
hemantk-12 merged 5 commits intoapache:masterfrom
hemantk-12:HDDS-8940
Oct 27, 2023
Merged

HDDS-8940. Fix for missing SST files on optimized snapDiff code path#5465
hemantk-12 merged 5 commits intoapache:masterfrom
hemantk-12:HDDS-8940

Conversation

@hemantk-12
Copy link
Contributor

What changes were proposed in this pull request?

Problem:
The problem is that SSTFilteringService and SST pruning service work independently and try to optimize the space by deleting unnecessary SST files. SSTFilteringService deletes some files which don't belongs to the snapshotted bucket and SST prune service deletes the file which are not required for diff calculations. On the other hand compaction DAG is global at Ozone level and is kind a not aware of the above two clean ups. Problem arises when calculating the delta files for two snapshots and traversal reaches to this condition. Graph traversal adds a node because it is not present in the toSnapshot (because it might be deleted by SSTFilteringService) and later gets added to diff file because of this condition.
Before returning delta files to SnapshotDiffManager, we look for the files in either active DB dir and SST backup dir. Active DB dir doesn't have these files because they were compacted and SST backup dir doesn't have because of SST pruning service.

Detailed explanation and example.

Fix:
In this PR, it is proposed to keep key range in the DAG node and use that to early return while traversing.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8940

How was this patch tested?

  • Existing unit tests.
  • New tests are in progress.

@hemantk-12 hemantk-12 added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Oct 18, 2023
if (sameFiles.contains(node.getFileName()) ||
differentFiles.contains(node.getFileName())) {
LOG.debug("Skipping known processed SST: {}", node.getFileName());
for (CompactionNode nextNodes : successors) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: variable node could be better than nextNodes & successors. Probably sourceNode & sourceNodes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think successors is OK because it is current node's successors. Also Guava calls it successors.
Another thing they are target nodes of current node, not source nodes.

NextNode is fine as well because it will get added to next level's node list.

If you want I can change it to successor/successors or nextNode/nextNodes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a dag successor doesn't really sound good. But that is fine though

Copy link
Contributor

@swamirishi swamirishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @hemantk-12. Overall the patch looks good apart from a few nitpicky comments here and there.

@hemantk-12 hemantk-12 marked this pull request as ready for review October 24, 2023 00:28
Copy link
Contributor

@swamirishi swamirishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the issues @hemantk-12 LGTM

@hemantk-12 hemantk-12 merged commit 3d7e786 into apache:master Oct 27, 2023
@hemantk-12
Copy link
Contributor Author

Thanks for the review @swamirishi

jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Feb 1, 2024
…f code path (apache#5465)

(cherry picked from commit 3d7e786)
Change-Id: If3f9d0b350d5cd57bdbc5d863e4f0fa7526ff2e7
@hemantk-12 hemantk-12 deleted the HDDS-8940 branch October 28, 2024 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants