Conversation

@ivandika3
Contributor

@ivandika3 ivandika3 commented May 11, 2023

What changes were proposed in this pull request?

While setting up an OM performance test for the Recon OM full snapshot, I removed the Recon DB directory before restarting Recon to trigger a full snapshot (essentially bootstrapping a new Recon).

However, after the OM DB was successfully downloaded, Recon heap usage increased significantly during the ContainerKeyMapperTask reprocess for a large keys table (our cluster has around 350 million keys).

The issue was caused by the in-memory maps that store all the OM keys during the reprocess. This is a regression introduced in HDDS-6783. In essence, this patch reverts the HDDS-6783 implementation ONLY for ContainerKeyMapperTask#reprocess.

  • ContainerKeyMapperTask#process should not increase heap memory significantly, since the number of delta updates is already limited by the Recon configurations.
  • HDDS-6783 aims for atomicity during Recon OM task updates. However, since ContainerKeyMapperTask#reprocess truncates the entire Recon Container DB before it starts and then rebuilds it, I think this change is acceptable.

After the patch is applied, the Recon heap size stays stable during the full snapshot.

Any suggestion for a better approach is welcome.
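For illustration, the periodic-flush approach described above can be sketched roughly as follows. All names here are hypothetical stand-ins, not the actual Recon code (the real task writes batches to RocksDB rather than a plain map); the point is only the pattern of flushing and clearing the in-memory map once it reaches a threshold, so heap usage stays bounded instead of growing with the full keys table.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch (hypothetical names) of a bounded-memory reprocess:
// accumulate entries in a map, but flush to the DB and clear the map
// whenever it reaches a threshold, instead of holding every key in
// memory for the whole reprocess.
public class FlushSketch {
  static final int FLUSH_THRESHOLD = 3; // tiny value, for demonstration only

  // Stand-in for the Recon container DB.
  static final Map<String, Integer> db = new HashMap<>();
  static int flushCount = 0;

  static void flushAndCommit(Map<String, Integer> containerKeyMap) {
    db.putAll(containerKeyMap);  // stand-in for the batched DB write
    containerKeyMap.clear();     // release the in-memory entries
    flushCount++;
  }

  public static void main(String[] args) {
    Map<String, Integer> containerKeyMap = new HashMap<>();
    for (int i = 0; i < 10; i++) {
      containerKeyMap.put("container-" + i, i);
      if (containerKeyMap.size() >= FLUSH_THRESHOLD) {
        flushAndCommit(containerKeyMap);
      }
    }
    // Final flush for any remaining entries.
    if (!containerKeyMap.isEmpty()) {
      flushAndCommit(containerKeyMap);
    }
    System.out.println(db.size() + " entries, " + flushCount + " flushes");
  }
}
```

With these numbers, the 10 entries reach the DB via 4 flushes, and at no point does the in-memory map hold more than FLUSH_THRESHOLD entries.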

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8580

How was this patch tested?

Manual test.

Attached is Recon heap memory before and after the patch.

[image: Recon heap memory usage before and after the patch]

@ivandika3
Contributor Author

@dombizita @smengcl Could you help take a look?

Contributor

@ashishkumar50 ashishkumar50 left a comment


Hi @ivandika3, thanks for working on this. Please find my comments inline.

private boolean flushAndCommitContainerKeyMapToDB(
    Map<ContainerKeyPrefix, Integer> containerKeyMap) {
  try {
    writeToTheDB(containerKeyMap, Collections.emptyMap(),
Contributor


I think we should also flush the "containerKeyCountMap" and reduce the memory footprint further if we are introducing this flush mechanism, so you could refactor the code a bit and pass "containerKeyCountMap" instead of an empty map as the second argument.

Contributor Author


Thank you for the suggestion. I have included containerKeyCountMap in the flush and removed the last writeToDB call in the 'reprocess' function. I have also moved the flush logic to an outer scope to prevent inaccurate container key info when there are multiple bucket layouts.
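A rough sketch of that refactor, with hypothetical names standing in for the real Recon types: both maps are committed in one call and cleared together, so neither the key map nor the per-container key-count map accumulates across the whole reprocess.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of flushing both maps together, instead of
// passing an empty map for the count argument. Names are stand-ins,
// not the actual Recon method signatures.
public class FlushBothSketch {
  static final Map<String, Integer> keyDb = new HashMap<>();
  static final Map<Long, Long> countDb = new HashMap<>();

  static void flushAndCommitToDB(Map<String, Integer> containerKeyMap,
                                 Map<Long, Long> containerKeyCountMap) {
    // Both maps are written in one batch, then cleared so the
    // in-memory footprint is released on every flush.
    keyDb.putAll(containerKeyMap);
    countDb.putAll(containerKeyCountMap);
    containerKeyMap.clear();
    containerKeyCountMap.clear();
  }

  public static void main(String[] args) {
    Map<String, Integer> keys = new HashMap<>();
    Map<Long, Long> counts = new HashMap<>();
    keys.put("container-1-key", 1);
    counts.put(1L, 5L);
    flushAndCommitToDB(keys, counts);
    System.out.println(keyDb.size() + " " + countDb.size() + " " + keys.size());
  }
}
```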

Contributor


Thanks @ivandika3 for addressing the comments. Patch LGTM. +1

Contributor Author


@devmadhuu Thank you for the review.

Contributor

@devmadhuu devmadhuu left a comment


Added a few comments, please check.

@smengcl smengcl requested a review from dombizita May 13, 2023 08:42
@ivandika3
Contributor Author

CI failure does not seem to be related.

@ivandika3
Contributor Author

Hi @dombizita, could you help review this?

Contributor

@dombizita dombizita left a comment


thanks for working on this @ivandika3, I only had one question about a method (to make sure I understand it correctly) and one about a comment change; besides that, it looks good to me!

@ivandika3 ivandika3 requested a review from dombizita May 19, 2023 01:11
@ferhui
Contributor

ferhui commented May 23, 2023

@ashishkumar50 do you have other comments? Plan to merge this PR. Thanks

@dombizita
Contributor

@ashishkumar50 do you have other comments? Plan to merge this PR. Thanks

I'm waiting to get a reply on this comment.

cc @devmadhuu @ivandika3

Contributor

@ashishkumar50 ashishkumar50 left a comment


@ivandika3, Thanks for updating the patch, LGTM +1.

@dombizita
Contributor

I just wanted to merge this PR, but there is a merge conflict; can you resolve it, @ivandika3?

@ivandika3
Contributor Author

Hi @dombizita, I have resolved the conflict. Thank you.

@ivandika3 ivandika3 requested a review from dombizita May 31, 2023 01:30
@dombizita dombizita merged commit a2aa1d4 into apache:master Jun 3, 2023
@dombizita
Contributor

thanks for the patch @ivandika3! thanks for the review @devmadhuu @ashishkumar50!

@ivandika3 ivandika3 deleted the HDDS-8580 branch June 3, 2023 09:39
@whbing
Contributor

whbing commented Jun 4, 2023

The mvn build failed in https://github.com/apache/ozone/actions/runs/5162988533, and it is also failing on the master branch now.
Is it related to this PR? Thanks!

Error:  /home/runner/work/ozone/ozone/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/ContainerKeyMapperTask.java:[472,49] 
org.apache.hadoop.ozone.recon.api.types.ContainerKeyPrefix is abstract; cannot be instantiated

@ivandika3
Contributor Author

ivandika3 commented Jun 4, 2023

Hi @whbing, there is an incompatibility due to the refactoring done in HDDS-8733. I have raised #4826 with a fix for the incompatibility. @dombizita @szetszwo Could you help resolve this? Sorry for the inconvenience.

@dombizita
Contributor

thanks for finding this @whbing, #4826 is merged. thanks for the quick fix @ivandika3!

@ivandika3 ivandika3 self-assigned this Apr 23, 2024